Monday, June 28, 2010

SharePoint 2010 Databases

Introduction

One of the first things that will come up with any SharePoint 2010 is what databases are needed to support SharePoint 2010. With every SharePoint deployment there are two people who you need to become your best friends, the system admin and database admin. Now both these guys/gals have very distinctive views of the world – and at times we can find them down right annoying J However they do what they do because they have gone through lessons we as solution developers do not.

In this blog I am going to go over the databases that are needed for SharePoint 2010. The number and types of databases that are needed to support SharePoint 2010 have changed from SharePoint 2007. As you are about to see when I say more databases, I really mean more databases. Many of the maintenance, sustainment, governance, etc. challenges a SharePoint engagement suffer from tend to take this for granted or think that is can be resolved later – and by then it is too late.

I would highly recommend you understand this along with the new service architecture and logical topology of SharePoint 2010.

External Databases

I figured I put this first because it is an important. I have always said that understanding the databases of SharePoint is not good enough. Once you bring in data from an external line of business systems, the databases become part of SharePoint from a user's perspective. So capacity planning, continuity management, etc. need to be part of your SharePoint governance plan.

Configuration Database (SharePoint 2010 Foundation)

This database is responsible for managing data associated to the all the SharePoint databases in the farm, all IIS web sites, trusted solutions, WSP packages, site templates, all web application and farm settings.

From a size perspective this database will be small and there can only be one per farm. This database has a strong dependency to the Central Administration Content Database and they must reside on the same database instance.

Central Administration Content Database (SharePoint 2010 Foundation)

This is the content database from for the Central Administration web site. It will not grove very large and has a strong dependency on the Configuration Database (i.e. they must be located on the same instance). Only one of these databases will be created per farm.

Content Database (SharePoint 2010 Foundation)

This is the database(s) that is responsible for storing all content stored in SharePoint websites. This would include lists, documents, web part properties, audit logs, sandboxed solutions, etc. It will also store data for Office Web Applications (Excel, Access, OneNote, InfoPath, etc.)

A content database can store data for multiple site collections however data within a specific site can only be store in on content database. There will potentially be numerous content databases based on the design of your SharePoint environment.

Microsoft strongly recommends that content databases size should be limited to 200GB. Supporting content databases with terabytes of data is supported for large single repositories of data like a Records Center. If you have gone over 200GB of data in a content database, you have not done your planning nor put the governance in to manage your environment. I personally would recommend making dedicated content databases per site collection and for an enterprise deployment of SharePoint there should be multiple site collections, not just one big one.

Content databases can be located anywhere; there are no dependencies other than being accessible to the SharePoint farm. For very large sites, you may even created dedicated instances to support performance.

Usage Database (SharePoint 2010 Foundation)

This is a new database which is dedicated to supporting the new Usage and Health Data Collection Service Application service. This database stores all of the health monitoring and usage data collected and the data within it is temporary. This database needs to support heavy write operations because data will be continually written to it. The health monitoring service will later take this data, aggregate it and then store it in the Web Analytics Reporting Database.

This database can get very large relative to the size of the amount of content you have stored in SharePoint as well as how many reports you have running. It will never be as large as the content database(s) because the actual data will not be store in it but it will store information about all data in all content databases across the entire farm. There can only be one of these databases per farm.

Business Data Connectivity (SharePoint 2010 Foundation)

This is the database that is used to support BCS services. All it stores is external content types and associated metadata. This database will remain small because it does not store any data from the external systems. The only thing this database will need to support if heavy read operations because on the usage of BCS within SharePoint.

Application Registry Database (SharePoint 2010 Foundation)

This database stores data required to support backwards compatibility for Business Data Connectivity (BDC) from SharePoint 2007. This database is only used during the upgrade process and can actually be deleted after the upgrade is complete.

Subscription Settings Database (SharePoint 2010 Foundation)

This is a new database for SharePoint 2010 and supports the Subscription Settings Service. This database is used to support the new partitioning feature for SharePoint 2010. If you did not know, SharePoint data can now be partitioned by service subscription. This is will be used if you are providing hosted, centrally managed services and you want to make sure one service subscriber cannot access data of another service subscriber. This way services can be shared in a farm but the data can be protected. This database needs to support heavy read operations for hosted services that are highly utilized.

This database is not big and will not be created by default. The SharePoint administrator will create this database using PowerShell.

Search Administration Database (SharePoint 2010 Standard)

This database is used to support SharePoint 2010 Search service. It contains all the configuration information associated to search and Access Control List (ACL) which is used for securing content that is indexed. This data bases is neither small nor big. An instance of this database can be created for each search service that is running.

Crawl Database (SharePoint 2010 Standard)

This is another database that is used to support SharePoint 2010 Search service. This database will store the state of the crawled data and the crawl history.

This database can grow to be very large based on the amount of content that you are indexing. More crawl databases can always be added into the farm to scale out. This database must support heavy read operations and it is recommended to run on SQL Server Enterprise Edition.

Property Database (SharePoint 2010 Standard)

This is the third database that is used to support SharePoint 2010 Search service. This database will store information associated to crawled data (i.e. properties, history, and crawl queries). This database can become large but not as big as the Crawl Database. It recommended for very large SharePoint environments that this database be put in a different database server; separate from the crawl database. This database must support heavy write operations and it recommended to run on SQL Server Enterprise Edition.

Web Analytics Staging Database (SharePoint 2010 Standard)

This database store temporary usage data collected from the Usage Database. The data comes to this database in an un-aggregated format and web analytics service will take this data, process it, aggregate it and then sent it to the Web Analytics Reporting Database. This database will be cleaned out every 24 hours but then refilled with new data that has been collected.

Web Analytics Reporting Database (SharePoint 2010 Standard)

This is new database for SharePoint 2010 used to support the Web Analytics Service. This database stores all the aggregated analytics data collected across the SharePoint 2010 farm. This is the database the usage reports run against and there will only be one of these databases per farm.

This database can grow to become very large relative to the amount of data stored in the entire farm. This database will only have analytics data; it will not have any actual data from the content databases. By default, data will be stored in here for up to 25 months.

State Database (SharePoint 2010 Standard)

The state service is used to support storing temporary data across HTTP request. This database is utilized by InfoPath Form Services, Visio Services, Exchange, Chart Web Part, etc. (). The space required for this database is driven by the usage of the services that utilize of this database. Multiple state databases can be added through PowerShell commands.

Profile Database (SharePoint 2010 Standard)

This is a database used by the User Profile service and is used to store profile data. This database will not become very big and the size will be based on amount data be stored about each user. The database needs to support heavy read operations to get user data which is access commonly (user permissions are not store here; they would be in the content database).

Synchronization Database (SharePoint 2010 Standard)

This is another database used by the User Profile service. Its purpose is to store the configuration of the service that brings user profile data into SharePoint. It is also used to stage data that is being synchronized from directory services like Active Directory. The size of this database will be relative to the number of users and groups that are being synchronized. This database needs to support both heavy reading and writing when the synchronization service is running.

Social Tagging Database (SharePoint 2010 Standard)

This is the third database used by the User Profile service. It is used for storing social tags and notes created by users for content in SharePoint. The size of this database is completely based on the utilization of social networking services. This database will experience mostly heavy read operations.

Managed Metadata Service Database (SharePoint 2010 Standard)

This is the database used to support the new Managed Metadata Service will stored centralized content types that can be used across the farm. This database will not get very big. If managed metadata is used a lot, this database will need to support heavy read operations.

Secure Store Database (SharePoint 2010 Standard)

This is used by the secure store service which is the new SharePoint 2010 service to support Single Sign-On. It stores user credentials and passwords. This database will be small. It is recommended that this database have limited access and potentially even in a different location from the other databases.

Word Automation Services Database (SharePoint 2010 Enterprise)

This database is used by the Word Automation service and stores all pending and completed document conversions. The database will not get very large and has processes to ensure that it does not get too large.

PerformancePoint Database (SharePoint 2010 Enterprise)

This is another small database used to support PerformancePoint. It will store temporary objects and settings needed to support dashboards.

FAST Search Administration Database (SharePoint 2010 FAST)

This stores all configurations associated to groups, keywords, synonyms, term entity, inclusions, exclusions, spell check, best bets, search schema, etc. This will be a small database but must support heavy read operations to support both indexing and querying of data.

References

1 comment:

F. L. said...

Interesting overview !