Saturday, July 31, 2010

DomainOperationException Overriding Custom Exceptions

Issue

We ran into a rather silly issue with a recent Silverlight deployment. In our RIA Services it is not uncommon for us to throw custom exceptions from the RIA side and then have our Silverlight application display them to the user.

For instance, we would throw a custom exception if there was an issue with the login, and we would expect "User and password is invalid" to be displayed. However, we were getting "Load operation failed for query 'Login'. Exception of type 'System.ServiceModel.DomainServices.Client.DomainOperationException' was thrown."

What we noticed is that any time we threw a custom exception, it was being returned as a DomainOperationException instead of the custom exception type. This drove us nuts because it was working in each of our development environments but not on the integration machine we were using for testing. We did some searches on DomainOperationException but really could not find much that helped. We did find this blog (http://weblogs.asp.net/fredriknormen/archive/2009/12/08/wcf-ria-services-exception-handling.aspx), however it was poorly written.

Solution

Well the solution was really simple. All we had to do was make the following modification to our web.config where the RIA Services were hosted.

<customErrors mode="Off"/>
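For context, the customErrors element belongs inside system.web. Here is a minimal sketch of the placement, assuming an otherwise standard ASP.NET web.config; the surrounding elements are the usual ASP.NET structure and are not copied from our project:

```xml
<configuration>
  <system.web>
    <!-- Return the original exception message to the client instead of a generic error. -->
    <customErrors mode="Off"/>
  </system.web>
</configuration>
```

This likely also explains why it worked on our development machines: the default mode, RemoteOnly, shows full error details to local requests only.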

Saturday, July 10, 2010

SharePoint LINQ Query Error

Issue

If you are using the new Microsoft.SharePoint.Linq API for SharePoint 2010 web parts you may run into the following problem. Here is some code:

TeamSiteDataContext context = new TeamSiteDataContext(SPContext.Current.Web.Url);
EntityList<Task> tasks = context.GetList<Task>("Tasks");

DateTime? queryDate = new DateTime(2010, 7, 16);

var results = from tsk in tasks
where tsk.DueDate == queryDate
select tsk;

spGridView.DataSource = results;
spGridView.DataBind();

All I am doing is running a LINQ query for tasks. This code will give the following error:

One or more field types are not installed properly. Go to the list settings page to delete these fields.

Solution

While researching this, I found that if I removed the DueDate condition, the error would go away, but that was not an acceptable solution. I did some more research and could not find anything specific to SharePoint 2010; everything I found described this as an issue you could hit when writing SPQuery in older versions of SharePoint.

The ultimate solution was to change the second line to the following:

EntityList<TasksTask> tasks = context.GetList<TasksTask>("Tasks");

The reason is that I used SPMetal to generate the proxy to SharePoint 2010. The generated code contains both a Task class and a TasksTask class. Task maps to the generic content type, while TasksTask maps to the content type of the task list. When I used the class for the content type mapped to the list I was querying against, the error went away.

Tuesday, July 6, 2010

SharePoint 2010 Free Development Hands On Labs

Introduction

Now that you have a development environment you may need to spin up on SharePoint 2010 development. I personally learn best by doing hands on labs and tweaking them and extending them; this is the best way to really learn.

Whether or not you are experienced with SharePoint development, this is the best way to learn. I liked these labs because, as an experienced SharePoint developer, I found they really showed me many of the new features I had been hearing about.

If you need a local SharePoint 2010 development environment, please see my detailed blog on how to build one.

Part 1 - SharePoint 2010: Professional Developer Evaluation Guide and Walkthroughs

First I would read this entire document if you want to learn about the new development features of SharePoint 2010.

This has a couple awesome labs to get started on:

  • Lab 1 - shows how to deploy a web part using the new SharePoint 2010 project template and using the new LINQ to SharePoint API.
  • Lab 2 – Deploying WSP solutions in a sandbox.
  • Lab 3 – Shows some advanced stuff on how to create custom workflow activity events in SharePoint Designer 2010 and then pull them into a Visual Studio 2010 project. Awesome example of creating a workflow outside the context of a list item, which has been a huge limitation for SharePoint 2007.
  • Lab 4 – Shows the new Javascript API for SharePoint.
  • Lab 5 – Shows the new BCS feature and how to build up an external content type in Visual Studio 2010.
  • Lab 6 – Shows you how to create a Silverlight project and reference that Silverlight application in a web part.

Part 2 – SharePoint Getting Started Hands On Labs

There are videos and you have the ability to do the labs virtually online.

You can also just download all of the hands on labs and build them in your own development environment.

  • Lab 1 - Build a simple web part.
  • Lab 2 - Intro into all of the new development features.
  • Lab 3 – Third part of building up web parts.
  • Lab 4 – Server side APIs
  • Lab 5 – SharePoint 2010 client side APIs – a must.
  • Lab 6 – BCS integration.
  • Lab 7 – SharePoint 2010 workflows.
  • Lab 8 – Silverlight integration.
  • Lab 9 – Sandboxed solutions.
  • Lab 10 – Working with the new dialog and ribbon controls for SharePoint 2010.

Conclusion

If you can complete these – I think you will be well on your way to supporting any SharePoint 2010 development task thrown at you.

How to Create a Local SharePoint 2010 Development Environment


Introduction


One challenge a lot of people have when wanting to learn SharePoint 2010 is having an environment to do it in. Up to this point, I have been reliant on getting access to SharePoint 2010 running in a hosted environment. The biggest hindrance to running SharePoint 2010 is the dependency on 64 bit and on having sufficient memory (4GB for development; 8GB for production).


The following are the steps I went through to create a SharePoint 2010 environment that is running locally on my laptop.


Once you have the environment created, read this to get free development hands on labs.


Host Environment


This is the configuration I have for my laptop.



  • Host OS - Currently on my laptop I am running Windows 7 64 bit. I had thought about using Windows Server 2008 on my laptop but decided against it for the long-term.

  • Memory - I was finally able to upgrade my laptop memory from 4GB to 8GB. That was needed so I could provide the minimum of 4GB.

Guest OS


Now I am lucky that I have an MSDN subscription so obtaining all this was very easy <g>


The following is the configuration you will need to do prior to installing SharePoint.



  • Hardware/Software Requirements – I first reviewed Hardware and software requirements (SharePoint Server 2010).

  • OS – I created a virtual environment and installed Windows Server 2008 R2 on it.

  • Configure OS Server 2008 R2 – I made sure I got the latest updates.

  • Server Roles – Next I added the Web Server (IIS) and Application Server roles.

  • Desktop Feature - You will need to turn on the Desktop Experience Feature so Office can save to SharePoint.

  • Static IP Address – Next you will need to apply a static IP address to the VM for the next step to create a Domain Controller. Otherwise you will get an error saying "This computer has dynamically assigned IP address(es)" when installing the domain controller (http://orbitalrobot.com/blog/Lists/Posts/Post.aspx?ID=3).

  • Create a Domain Controller – Next you will need to add a Domain Controller. I thought I could get away without having a Domain Controller, however you will need it later when configuring SharePoint 2010. This is because the service accounts need to belong to a domain. Read these simple instructions to set up a Domain Controller (Setting Up Your First Domain Controller With Windows Server 2008). I called my domain SharePoint.local.

  • SMTP Service – You will need this for incoming and outgoing email on your development environment. You will need to incorporate this into the solutions you build. To install and configure the SMTP Service use the guidance here - Configure incoming e-mail (SharePoint Server 2010).

  • SQL Server 2008 R2 – Next I installed SQL Server 2008 R2. Note: I made sure SSRS was installed and configured it to run in SharePoint Integrated Mode right off the bat.

  • SSRS Configuration – This was not well documented as part of the installation process for SharePoint. I found this - How to: Configure SharePoint Integration on a Stand-alone Server – and specifically focused on making sure I had done everything that needs to be done before installing SharePoint 2010. If you are not familiar with Reporting Services configuration you should also check this out - How to: Configure a URL (Reporting Services Configuration). I would recommend creating a new web site in IIS on port 8080 and configuring Reporting Services to run from there, because when you later configure SharePoint 2010 it will take port 80 on IIS.

  • SharePoint 2010 Prerequisites – Read Hardware and software requirements (SharePoint Server 2010), which has a list of prerequisites you need to install. Several of these will be installed for you when you run the SharePoint 2010 Prerequisites tool. What I did was install several of the prerequisites myself. What I am going to try next time is to see if I can skip this step and let the SharePoint 2010 Prerequisites tool do it all for me.

Install SharePoint 2010 Bits



  • Run SharePoint 2010 Prerequisites Tool – Next I ran the SharePoint 2010 prerequisites tool using the following instructions as guidance - Deploy a single server with SQL Server (SharePoint Server 2010). I ended up getting an error first time saying "The tool was unable to install SQL 2008 R2 Reporting Services SharePoint 2010 Add-in". I just reran the Prerequisites Tool again and the error took care of itself.

  • Create Server Farm Account – You will need this during the install. You do not need to give it any permissions; the installation wizard will grant the needed permissions. Read the additional references below about all the service accounts that you may need.

  • Install SharePoint 2010 – Next I installed SharePoint 2010 bits using the following instructions as guidance - Deploy a single server with SQL Server (SharePoint Server 2010). It is very similar to SharePoint 2007.

Additional References:



Additional Software


The following is additional software that you will need for your new SharePoint 2010 Development environment!!!!



  • Visual Studio 2010

  • SharePoint Designer 2010

  • SharePoint Workspace 2010

SharePoint 2010 Configuration Wizard


Now comes the fun part of configuring SharePoint 2010, which was significantly quicker than SharePoint 2007.



  • Create SharePoint 2010 Service Account – You will need a service account when going through the SharePoint 2010 Configuration Wizard. I just created a new account on my domain and gave it no permissions; the Configuration Wizard took care of that for me.

  • Run SharePoint 2010 Configuration Wizard - Next I needed to run the SharePoint 2010 Configuration Wizard. I followed the guidance in - Deploy a single server with SQL Server (SharePoint Server 2010). I used the new service account and since I had an Enterprise license, I went ahead and installed all the application services so I can play with them. One of the steps in this is to also run the Farm Configuration Wizard in Central Admin. I pretty much turned on all the services that I could.

Now that you have completed this, you can open Central Admin. One thing you will notice right off the bat is a new red status bar sitting right in the middle of the screen listing issues or errors. I had a couple, but none really worth noting because they are not issues in a development environment. For instance, I am not going to get all wrapped around the axle on making sure service accounts are not shared across services in my local development environment…


Configure SharePoint 2010 Wizard (Highly Recommended)


Once you have run the configuration wizard, even though you have used the new Farm Configuration Wizard in Central Admin to initialize a ton of services, there are still some steps you need to do. The following are highly recommended configurations you should do for your development environment.



Configure SharePoint 2010 Wizard (Optional)


Much of the configuration below is optional because the Farm Configuration Wizard took care of most of it. You may have to mess with these configurations when you are developing a solution so here are some references you should skim over to get familiar with things.



Additional Resources:



Thursday, July 1, 2010

SharePoint 2010 Physical Topology

Introduction

In this blog I am going to build off what has been discussed thus far in this series. Whenever I start working with a new SharePoint client, the first thing that comes up is what needs to be installed to get going. Well, that answer is always based on business requirements, however I am rather quickly forced into a corner to show them how it works :)

To be able to define the physical topology you need to know:

  • SharePoint service architecture
  • SharePoint logical architecture
  • Availability, capacity, and continuity requirements

Once you have a good idea of that, you will be in a good position to draw out sketches of the Physical Topology for SharePoint you need. For SharePoint 2010, things have changed a lot and the logical architecture is significantly more flexible and scalable. I would say the game has changed in lots of regards.

In this blog I am going to focus on understanding some of the decision logic I would go through when architecting a new SharePoint 2010 farm.

Getting Started

I am going to keep this as simple as possible. The first place to get started is to understand your business requirements so you can come up with a list of services that you need. You can read my previous post on SharePoint service architecture.

Next, once you have an understanding of what services are needed, you will need to come up with a logical architecture for SharePoint. In this series I went over a lot of the design decisions for coming up with that logical architecture for SharePoint.

Here is a refresher on some of those logical architectures. You may come up with a diagram as simple as the one below, which is for a Single Farm Single Service Group where services are shared across farms.

[Diagram: Single Farm, Single Service Group]

You may come up with something like below which is a Single Farm with Multiple Service Groups. The service groups provide dedicated services to site collections.

[Diagram: Single Farm with Multiple Service Groups]

You may come up with something more complicated which is a Multi Farm environment with centrally hosted services and localized dedicated services.

[Diagram: Multi Farm with centrally hosted services and localized dedicated services]

You may come up with a Multi-farm environment with application services that are partitioned, so services are centralized but data is not shared.

[Diagram: Multi-farm with partitioned application services]

Or you may come up with a hosted partitioned farm where farms subscribe to centralized services.

[Diagram: Hosted partitioned farm]

To get more details I recommend you read the Logical Architecture posting of this series.

Selection of a logical architecture should be driven by business requirements. The logical architectures I presented above are by no means the only ones; you can come up with tons of different permutations. What I did want to demonstrate is that I have made no assumptions yet about how many machines would be needed to support the logical architectures. In a perfect world this would always be true, however sometimes the physical architecture affects the logical architecture.

Physical Topology

When we start talking physical topology we need to identify how we are going to configure the environment. There are three types of machines you will have in your farm:

  1. Web tier
  2. Application tier
  3. Database tier

There were a couple of things I left out when going over the logical architecture:

  • I did not discuss web or database tiers. I focused only on understanding the application services.
  • I did not discuss how many services would be configured on a single application server or across multiple application servers. Understanding how many servers are needed will be driven by capacity and availability planning, which I have also discussed in this series. So it is really important to understand the demographics of the users, their profile, their expectations and how they plan to use SharePoint for both the short and long term.
  • The service groups that I defined in the logical architecture do not drive what services will be configured onto what machines. What we need to focus on is identifying services that have common performance characteristics and configure them onto dedicated server resources (physical or virtual).

There is some basic guidance that Microsoft provides that will help you get started which I am going to go over.

Limited Deployments

This is described as either:

  • One-server farm with all the tiers installed on one server.
  • Two-server farm where the web and application tiers are installed on one server and the database is installed on an existing SQL machine.

[Diagram: Limited deployment topologies]

The one-server farm is described as an evaluation environment for under 100 users, while the two-server farm would support up to 10,000 users. Personally, I would never expect to see either of these configurations used except for demonstration, proof of concept or development environments. I would not recommend using them for production environments.

Small Farm Topology Deployments

This farm would be scaled out a little more to support more users or provide dedicated servers for services. Web servers, search service and other application services will be distributed across multiple machines. Some potential configurations are:

  • Three-server configuration where there are two web servers that are load balanced. One web server will have the query service configured (which searches an index) and one will be dedicated to all other application services. The third machine is the SQL machine.
  • Four-server configuration where there are two load balanced web servers with the query service running on each. The third server is dedicated to hosting application services and the fourth is used for SQL Server.
  • Five-server configuration, which is similar to the four-server farm except that the index service is separated onto its own application server.
  • Six-server configuration, which is the same as the five-server farm except that a dedicated SQL database is created for the search service to better support crawling.

[Diagram: Small farm topologies]

Microsoft states that two WFEs can support 10,000 to 20,000 users. This configuration can also support searching up to 10M documents.

The five and six-server farm configurations are very common out there today. They are a highly recommended starting place for a company division or for a dedicated solution farm because:

  • There is redundancy for the WFEs – if one goes down users can still access the site.
  • There is performance improvement for users because there are two load balanced WFEs.
  • The application services are dedicated to a machine.
  • Search indexing which can be an expensive operation is dedicated onto a separate application server.
  • Separating the search query and index services provides both improved search performance and redundancy (i.e. users can still search if the index server is down).

The three and four-server farms I really do not see as viable configurations. At minimum you need to consider a five or six-server farm for production environments.

Medium/Large Farm Topology Deployments

There is no definition which says that exactly X number of servers means you have a medium or large farm. I always start with a five-server farm configuration, and from there I scale out servers based on capacity requirements and the logical architecture that I created earlier.

[Diagram: Medium and large farm topologies]

Some observations:

  • Adding more web servers will support more users. A rule of thumb is every web server will support an additional 10,000 users.
  • Based on capacity and availability requirements, add more application servers and spread your services across those servers. These servers can be dedicated to supporting specific service applications. The logical architecture you created will not have a one-to-one mapping of services to server. In some cases, you may have to create multiple servers to support a single service from the logical architecture. Review the blog in this series on capacity planning for more information.
  • There is a potential that you may want to scale content databases onto their own dedicated database server. You may have to consider adding more databases to better support search.
  • You may consider creating dedicated WFEs for search, so when content is crawled, the WFEs that users use are not affected.
  • Scaling out search is very important. SharePoint Search is more scalable and if you are considering using FAST, even more servers will have to be considered. I am not going to dive deeply into scaling search in this blog.
  • It is highly recommended to plan out server groups with service applications that have similar requirements. You can consider centrally hosting these servers and making them available to other farms.
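The web server rule of thumb above can be sketched as a quick calculation. This is only a rough starting estimate, assuming 10,000 users per web server and a floor of two load balanced web servers for redundancy; the class and method names are made up for illustration:

```csharp
using System;

// Hypothetical helper; the names are illustrative only.
public static class FarmSizing
{
    // Rule of thumb from the text: roughly 10,000 users per web server.
    public const int UsersPerWebServer = 10000;

    // Assumption: never go below two load balanced web servers, for redundancy.
    public const int MinimumWebServers = 2;

    // Returns a first-pass estimate of the number of web servers for a user count.
    public static int EstimateWebServers(int expectedUsers)
    {
        if (expectedUsers < 0)
            throw new ArgumentOutOfRangeException("expectedUsers");

        // Round up so a partially filled server still counts as a whole server.
        int byCapacity = (int)Math.Ceiling(expectedUsers / (double)UsersPerWebServer);
        return Math.Max(MinimumWebServers, byCapacity);
    }
}
```

For example, FarmSizing.EstimateWebServers(45000) returns 5. Real capacity planning still needs to account for user profiles and usage patterns, as discussed earlier in this series.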

Extranet/Internet Farm Topology Deployments

Extranet and Internet farms will require you to have an understanding of how you are going to secure SharePoint with a DMZ. This model by Microsoft is the best place to start. The diagrams for SharePoint 2010 are identical to the ones for SharePoint 2007. I usually recommend implementing the Split Back-to-Back server configuration. Even though the configuration may be more complex, all of the application and database servers will reside behind the firewall and only the WFEs will be placed in the DMZ.

References

Hopefully this will help you get started with architecting your SharePoint 2010 environments.

Monday, June 28, 2010

SharePoint 2010 Databases

Introduction

One of the first things that will come up with any SharePoint 2010 deployment is what databases are needed to support SharePoint 2010. With every SharePoint deployment there are two people who need to become your best friends: the system admin and the database admin. Now both these guys/gals have very distinctive views of the world – and at times we can find them downright annoying :) However, they do what they do because they have gone through lessons we as solution developers have not.

In this blog I am going to go over the databases that are needed for SharePoint 2010. The number and types of databases that are needed to support SharePoint 2010 have changed from SharePoint 2007. As you are about to see, when I say more databases, I really mean more databases. Many of the maintenance, sustainment, governance, etc. challenges a SharePoint engagement suffers from arise because teams take the databases for granted or think they can be dealt with later – and by then it is too late.

I would highly recommend you understand this along with the new service architecture and logical topology of SharePoint 2010.

External Databases

I figured I would put this first because it is important. I have always said that understanding the databases of SharePoint is not good enough. Once you bring in data from external line of business systems, those databases become part of SharePoint from a user's perspective. So capacity planning, continuity management, etc. for them need to be part of your SharePoint governance plan.

Configuration Database (SharePoint 2010 Foundation)

This database is responsible for managing data associated with all the SharePoint databases in the farm, all IIS web sites, trusted solutions, WSP packages, site templates, and all web application and farm settings.

From a size perspective this database will be small and there can only be one per farm. This database has a strong dependency on the Central Administration Content Database and they must reside on the same database instance.

Central Administration Content Database (SharePoint 2010 Foundation)

This is the content database for the Central Administration web site. It will not grow very large and has a strong dependency on the Configuration Database (i.e. they must be located on the same instance). Only one of these databases will be created per farm.

Content Database (SharePoint 2010 Foundation)

This is the database(s) that is responsible for storing all content stored in SharePoint websites. This would include lists, documents, web part properties, audit logs, sandboxed solutions, etc. It will also store data for Office Web Applications (Excel, Access, OneNote, InfoPath, etc.)

A content database can store data for multiple site collections, however the data for a specific site collection can only be stored in one content database. There will potentially be numerous content databases based on the design of your SharePoint environment.

Microsoft strongly recommends that content database size be limited to 200GB. Content databases with terabytes of data are supported for large single repositories of data like a Records Center. If you have gone over 200GB of data in a content database, you have not done your planning nor put the governance in place to manage your environment. I personally would recommend making dedicated content databases per site collection, and for an enterprise deployment of SharePoint there should be multiple site collections, not just one big one.

Content databases can be located anywhere; there are no dependencies other than being accessible to the SharePoint farm. For very large sites, you may even create dedicated instances to support performance.
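As a rough planning aid, the 200GB guideline can be turned into a lower bound on the number of content databases. This is a minimal sketch with made-up class and method names; because a site collection cannot span content databases, the real number may well be higher:

```csharp
using System;

// Hypothetical helper; the names are illustrative only.
public static class ContentDatabasePlanning
{
    // Recommended ceiling from the text: 200 GB per content database.
    public const double RecommendedMaxSizeGb = 200.0;

    // Returns the minimum number of content databases needed to keep
    // each one at or under the recommended size.
    public static int MinimumDatabases(double totalContentGb)
    {
        if (totalContentGb < 0)
            throw new ArgumentOutOfRangeException("totalContentGb");

        // Round up, and always plan for at least one content database.
        int byCapacity = (int)Math.Ceiling(totalContentGb / RecommendedMaxSizeGb);
        return Math.Max(1, byCapacity);
    }
}
```

For example, ContentDatabasePlanning.MinimumDatabases(750) returns 4.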

Usage Database (SharePoint 2010 Foundation)

This is a new database which is dedicated to supporting the new Usage and Health Data Collection Service Application. This database stores all of the health monitoring and usage data collected, and the data within it is temporary. This database needs to support heavy write operations because data will be continually written to it. The health monitoring service will later take this data, aggregate it and then store it in the Web Analytics Reporting Database.

This database can get very large relative to the amount of content you have stored in SharePoint as well as how many reports you have running. It will never be as large as the content database(s) because the actual data will not be stored in it, but it will store information about all data in all content databases across the entire farm. There can only be one of these databases per farm.

Business Data Connectivity (SharePoint 2010 Foundation)

This is the database that is used to support BCS services. All it stores is external content types and associated metadata. This database will remain small because it does not store any data from the external systems. The only thing this database will need to support is heavy read operations, depending on the usage of BCS within SharePoint.

Application Registry Database (SharePoint 2010 Foundation)

This database stores data required to support backwards compatibility for Business Data Connectivity (BDC) from SharePoint 2007. This database is only used during the upgrade process and can actually be deleted after the upgrade is complete.

Subscription Settings Database (SharePoint 2010 Foundation)

This is a new database for SharePoint 2010 and supports the Subscription Settings Service. This database is used to support the new partitioning feature for SharePoint 2010. If you did not know, SharePoint data can now be partitioned by service subscription. This will be used if you are providing hosted, centrally managed services and you want to make sure one service subscriber cannot access the data of another service subscriber. This way services can be shared in a farm but the data can be protected. This database needs to support heavy read operations for hosted services that are highly utilized.

This database is not big and will not be created by default. The SharePoint administrator will create this database using PowerShell.

Search Administration Database (SharePoint 2010 Standard)

This database is used to support the SharePoint 2010 Search service. It contains all the configuration information associated with search and the Access Control List (ACL) used for securing content that is indexed. This database is neither small nor big. An instance of this database can be created for each search service that is running.

Crawl Database (SharePoint 2010 Standard)

This is another database that is used to support SharePoint 2010 Search service. This database will store the state of the crawled data and the crawl history.

This database can grow to be very large based on the amount of content that you are indexing. More crawl databases can always be added into the farm to scale out. This database must support heavy read operations and it is recommended to run on SQL Server Enterprise Edition.

Property Database (SharePoint 2010 Standard)

This is the third database that is used to support the SharePoint 2010 Search service. This database will store information associated with crawled data (i.e. properties, history, and crawl queries). This database can become large but not as big as the Crawl Database. For very large SharePoint environments, it is recommended that this database be put on a different database server, separate from the crawl database. This database must support heavy write operations and it is recommended to run it on SQL Server Enterprise Edition.

Web Analytics Staging Database (SharePoint 2010 Standard)

This database stores temporary usage data collected from the Usage Database. The data comes to this database in an un-aggregated format, and the web analytics service will take this data, process it, aggregate it and then send it to the Web Analytics Reporting Database. This database will be cleaned out every 24 hours but then refilled with new data that has been collected.

Web Analytics Reporting Database (SharePoint 2010 Standard)

This is a new database for SharePoint 2010 used to support the Web Analytics Service. This database stores all the aggregated analytics data collected across the SharePoint 2010 farm. This is the database the usage reports run against and there will only be one of these databases per farm.

This database can grow to become very large relative to the amount of data stored in the entire farm. This database will only have analytics data; it will not have any actual data from the content databases. By default, data will be stored in here for up to 25 months.

State Database (SharePoint 2010 Standard)

The state service is used to support storing temporary data across HTTP requests. This database is utilized by InfoPath Form Services, Visio Services, Exchange, Chart Web Part, etc. The space required for this database is driven by the usage of the services that utilize it. Multiple state databases can be added through PowerShell commands.

Profile Database (SharePoint 2010 Standard)

This is a database used by the User Profile service and is used to store profile data. This database will not become very big, and the size will be based on the amount of data being stored about each user. The database needs to support heavy read operations to get user data, which is accessed commonly (user permissions are not stored here; they would be in the content database).

Synchronization Database (SharePoint 2010 Standard)

This is another database used by the User Profile service. Its purpose is to store the configuration of the service that brings user profile data into SharePoint. It is also used to stage data that is being synchronized from directory services like Active Directory. The size of this database will be relative to the number of users and groups that are being synchronized. This database needs to support both heavy reading and writing when the synchronization service is running.

Social Tagging Database (SharePoint 2010 Standard)

This is the third database used by the User Profile service. It is used for storing social tags and notes created by users for content in SharePoint. The size of this database is completely based on the utilization of social networking services. This database will experience mostly heavy read operations.

Managed Metadata Service Database (SharePoint 2010 Standard)

This database supports the new Managed Metadata Service, which stores centralized content types that can be used across the farm. This database will not get very big. If managed metadata is used a lot, this database will need to support heavy read operations.

Secure Store Database (SharePoint 2010 Standard)

This database is used by the Secure Store service, which is the new SharePoint 2010 service that supports Single Sign-On. It stores user credentials and passwords. This database will be small. It is recommended that this database have limited access and potentially even be hosted in a different location from the other databases.

Word Automation Services Database (SharePoint 2010 Enterprise)

This database is used by the Word Automation service and stores all pending and completed document conversions. The database will not get very large and has processes to ensure that it does not get too large.

PerformancePoint Database (SharePoint 2010 Enterprise)

This is another small database used to support PerformancePoint. It will store temporary objects and settings needed to support dashboards.

FAST Search Administration Database (SharePoint 2010 FAST)

This stores all configurations associated with groups, keywords, synonyms, term entity, inclusions, exclusions, spell check, best bets, search schema, etc. This will be a small database, but it must support heavy read operations for both indexing and querying of data.

Saturday, June 26, 2010

SharePoint 2010 Cache

Introduction

When architecting your SharePoint 2010 solution, you need to consider how you will leverage caching to make your applications faster and more scalable. I think a lot of people (myself included) forget about caching strategies and how they can benefit from them. Each strategy has different pros and cons, and correctly picking one based on the business requirements is important for success.

I found a detailed whitepaper called SharePoint Server Caches Overview, Advanced details on the SharePoint BLOB, Output, and Object Caches which goes over the topic. You will need to download SharePointServerCachesPerformance.docx.

The following are my summary notes from the whitepaper.

Really, the purpose of caching in SharePoint is to reduce the number of calls to SQL Server so that you can quickly return results to users while lowering SQL Server utilization. The downside is that there can be a lag in showing the user the latest and greatest content. Once the cache is created, it is maintained locally on the SharePoint WFEs. There are three caching strategies you need to be aware of for SharePoint 2010: BLOB cache, output cache and object cache.

BLOB Cache

BLOB cache helps improve performance by storing requested files on the WFE server so that they do not need to be retrieved from SQL Server every time. There are two basic ways files can be stored in SharePoint: they can be placed directly on the server (like in the layouts directory on the WFE) or they can be stored in a SharePoint library. Placing the files on the SharePoint server is quicker than retrieving a file out of SharePoint; however, only administrators can update the files. BLOB caching specifically solves the issue of retrieving files from the document library by caching them on the WFE. This gives you the best of both worlds: files are centrally managed in SharePoint and load faster.

  • BLOB cache should be used when frequently accessed pages have JavaScript, CSS, image files, and large rich media files that can be cached on the WFEs. However, BLOB caching is not useful if the files are not frequently accessed or if they are modified frequently.
  • Another advantage of BLOB cache is that it reduces the time to reload web pages. This is because cache control headers can be added to the HTTP responses for the cached files on the WFE. This pushes the cached files on the WFE down to the user's browser cache, which results in even fewer HTTP requests to the WFE itself.
  • BLOB cache is particularly helpful for caching large files that would otherwise come out of SQL Server. The whitepaper goes into the details, but there is no disk buffering when serving up larger files (5MB), which results in low latency. SharePoint is optimized for serving up smaller chunk sizes (100KB).
  • When using BLOB cache, HTTP range requests are supported, which allows the browser to request pieces of the file to cache locally instead of the entire file. Media players that run on the client benefit when this is supported.
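
To make the range request point concrete, here is a minimal Python sketch of how a server could turn an HTTP Range header such as "bytes=0-1023" into byte offsets of a cached file. The function name is hypothetical, it only handles a single range, and it is not how SharePoint implements this internally.

```python
def parse_byte_range(header: str, file_size: int) -> tuple[int, int]:
    """Parse a single 'bytes=start-end' Range header into inclusive offsets."""
    unit, _, spec = header.partition("=")
    if unit.strip() != "bytes" or "," in spec:
        raise ValueError("only a single bytes range is handled in this sketch")
    start_text, _, end_text = spec.strip().partition("-")
    if start_text == "":
        # Suffix form: 'bytes=-500' means the last 500 bytes of the file.
        length = int(end_text)
        return max(file_size - length, 0), file_size - 1
    start = int(start_text)
    # An open-ended range ('bytes=4000-') runs to the end of the file.
    end = int(end_text) if end_text else file_size - 1
    return start, min(end, file_size - 1)
```

A media player seeking into a large video would send a header like this, and only the requested slice of the cached file would need to be returned.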

Let's take a deeper look at BLOB caching:

  • There is a performance overhead to initially build the BLOB cache; it is around five times more expensive. One reason is that the permissions and metadata associated with the file need to be brought over to the cache to ensure security is still maintained.
  • BLOB cache is created by each web application on each WFE machine in the farm. This means each virtual site in IIS (which maps to a SharePoint web application) will have its own BLOB cache. BLOB cache cannot be reused across web gardens (or zones).
  • It is possible to configure the files that can be placed into the BLOB cache. There is a file with a list of extensions which can be modified based on your business requirements.
  • BLOB cache can handle multiple concurrent requests even when the requested file has not been cached yet. The example given was if a link to a large video file is mailed out you want to make sure when everyone starts clicking that link, the server does not get flooded requesting the same large file. With BLOB cache on, even if the file has not been fully cached, the video file will only be retrieved once per WFE from the SQL Server.
  • An interesting thing to know is that BLOB files are stored on the WFE in folders that match their location in SharePoint. There is a 260 character limitation on file paths in Windows, so if your URLs are longer than that, there will be problems building your cache. It is recommended to keep relative URLs smaller than 160 characters.
  • You will need to plan for RAM utilization when using BLOB caching. The BLOB cache index will use 800 bytes of RAM per entry.
  • BLOB cache is a persistent cache because it is periodically written to file on the WFE. This means an IIS recycle or shutdown will not lose the built cache. If the BLOB cache is very large, there is a lag before the cache becomes available again once the IIS operation is complete.
  • Items in the cache are invalidated based on a polling mechanism that checks SQL Server every 5 seconds. This interval can be changed. Invalidated items are not added back to the cache until they are requested again.
  • BLOB cache also has a configurable size limit to keep the cache from growing at an uncontrolled rate. If the max is exceeded, the least-used files will be removed until the cache is 70% below the max size. If this threshold is exceeded often, performance overhead is incurred, and increasing the max size is recommended.
  • It is possible to manually flush the BLOB cache forcing it to be rebuilt.
  • BLOB cache is optimized for returning files anonymously because the file can be immediately returned without making any SQL Server round trips. This can be done by marking the site as anonymous or storing the files in a library that has AllowEveryoneViewItems set to true.
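
Two of the planning figures above (800 bytes of RAM per index entry and the 160 character guideline for relative URLs) lend themselves to a quick back-of-the-envelope check. The following is a minimal Python sketch; the function names and sample values are hypothetical and nothing here calls SharePoint.

```python
# Rough planning helpers based on the figures quoted in the notes above.
# All names are illustrative only.

BYTES_PER_INDEX_ENTRY = 800      # RAM used per BLOB cache index entry
MAX_RELATIVE_URL_LENGTH = 160    # relative URLs should stay below this length


def blob_index_ram_mb(cached_file_count: int) -> float:
    """Estimate the RAM used by the BLOB cache index, in megabytes."""
    return cached_file_count * BYTES_PER_INDEX_ENTRY / (1024 * 1024)


def urls_over_limit(relative_urls: list[str]) -> list[str]:
    """Return the relative URLs that reach or exceed the recommended length."""
    return [u for u in relative_urls if len(u) >= MAX_RELATIVE_URL_LENGTH]
```

For example, under these assumptions one million cached files would need roughly 763 MB of RAM for the index alone.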

Configuring and Managing BLOB Cache:

  • As mentioned earlier, BLOB cache is enabled per IIS site or SharePoint web application. All you need to do is go to the web.config and set the enabled attribute on the BlobCache node to true. There are several other configurations available for tuning the BLOB cache; I recommend reading the whitepaper for those details.
  • For the application pool you will need to increase the startup and shutdown time limits. It is recommended to set them to 300 seconds, which allows enough time to initialize or serialize the cache on startup or shutdown. Note this does not mean it takes 300 seconds to perform the operation; it prevents IIS from terminating the application until 300 seconds have elapsed so the cache is not lost.
  • It is recommended to keep all content to be cached in a specified list and make sure the site containing that list is stable. This is because frequent changes to the site or list will invalidate all the cached files.
  • To flush the cache, you can perform a simple IIS reset, use the SharePoint API, or disable the cache, delete the folder that contains the cache and then re-enable the cache in the web.config.

In summary use BLOB Cache:

  • If there is a high read to write ratio, BLOB caching should be used. For instance, you would want to cache a site logo that is used on every page request, but not a collaboration Word document that is actively updated.
  • It is optimized for supporting large files which can significantly reduce bandwidth between the WFEs and SQL Server.
  • It is optimized to support cache control headers so that clients can cache small files which can reduce overall number of hits to the WFEs.
  • If there is anonymous access, there can be dramatic improvements because permissions do not have to be validated for cached files.
  • Client applications that use range requests can optimize load times to access large files.

Output Cache

The second caching option you have with SharePoint 2010 is the ASP.NET Output Cache. This is an in-memory cache that saves rendered ASPX pages. Using Output cache improves performance in two ways: first, it reduces the number of SQL calls; second, it reduces the workload on the WFE because pages do not need to be re-rendered. Along those lines, if the pages are anonymous, no SQL check needs to be done at all to present the cached pages. Microsoft testing showed a ninefold improvement in throughput compared to rendering the page every time it was requested.

The only catch for using Output Cache is that it can only be used in conjunction with Publishing pages. It cannot be used with a collaboration site. Output cache is configured on a per site collection basis using cache profiles. A cache profile is the set of settings and parameters that control how pages will be cached. One example of a rule that can be captured in a cache profile is to not cache when the requestor is a user who can edit pages, to ensure they see the latest version of content. The cache profile also specifies rules for when a page is invalidated so that when the next request is made, it comes from the database.

There are two options for cache invalidation:

  • Time to live (TTL) – A basic rule that retains the page until a length of time has been exceeded. Microsoft testing found that TTL cache invalidation performed well when the site content changes frequently.
  • Check for Changed – A rule that states all pages using the profile are invalidated if there is any site change or the TTL has passed. This is best used for sites where changes do not occur often.
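
The difference between the two invalidation options can be sketched in a few lines of Python. This is only an illustration of the rules as described above; the CachedPage type and the site change counter are hypothetical and do not correspond to SharePoint or ASP.NET types.

```python
from dataclasses import dataclass


@dataclass
class CachedPage:
    rendered_at: float    # time the page was rendered, in seconds
    site_version: int     # hypothetical site change counter seen at render time


def is_valid(page: CachedPage, now: float, ttl_seconds: float,
             current_site_version: int, check_for_changes: bool) -> bool:
    """Decide whether a cached page may still be served."""
    if now - page.rendered_at > ttl_seconds:
        return False      # TTL exceeded: invalid under both options
    if check_for_changes and current_site_version != page.site_version:
        return False      # any site change invalidates the page
    return True
```

With TTL only, a page keeps being served after a site change until the time limit passes; with check for changes turned on, the same page is dropped as soon as the site changes.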

One of the main considerations for Output cache is the memory needed to support it. For each rendered page, 2 × (size of the page) + 32KB is needed to store the rendered page in memory. Depending on your cache profile, you may store multiple different versions of the same page. You may create different cache versions based on the user's role, the type of browser they use, or the page layout type. For each version, a separate cache entry is made for the same page. Invalidation can also vary by profile, so it would actually be possible to create a rule that says specific types of publishing pages become invalid every 10 minutes while other types become invalid every 24 hours.
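
The memory formula above is easy to turn into a quick estimate. A minimal Python sketch, assuming every cached version of a page is about the same size; the function name is hypothetical.

```python
KB = 1024


def output_cache_bytes(page_size_bytes: int, versions: int = 1) -> int:
    """Memory for one page: each cached version costs 2 x page size + 32 KB."""
    return versions * (2 * page_size_bytes + 32 * KB)
```

For example, a 100 KB page cached in four variations (say, by role or browser) would need about 928 KB of memory under this formula.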

To configure Output Cache:

  • To configure and set up Output cache and profiles read here.
  • Next you need to go to the site collection where publishing has been turned on and in the collection settings page turn on the output cache.
  • On the page you can set up the Output cache profiles – read the whitepaper for details on those configurations.

In summary, Output Cache should not be used with sites that have a low read to write ratio because frequent changes to content make it hard to keep the cache fresh. It is therefore important to understand how critical it is for users to see the most current content. Another consideration is how dynamic the content is and whether per-user content has to be supported. Supporting a per-user cache requires more space to store the cache. Likewise, supporting lots of cache variations requires more memory.

Object Cache

Object cache is the third caching option we have for SharePoint 2010. Object cache stores metadata about SharePoint Server objects (like SPWeb, SPSite, SPList, etc.) on the WFEs. When a page is rendered, if there is data that needs to be retrieved through these objects, SQL Server will not be hit. Features of SharePoint that use Object cache are publishing, the content query web part, navigation, the search query box and metadata navigation. These features are specifically written to use the Object cache API instead of the SharePoint API directly. Developers writing custom functionality can also tap into the Object cache API.

The Object cache algorithm for determining what to cache is complicated because user permissions have to be accounted for. Obviously we want to make sure security trimming is respected, but it would be completely inefficient to create an Object cache for each and every user who comes to the web site. At a high level, there are accounts (Portal Super Accounts) which can be created and assigned standard permission levels. The cache is built based on these accounts, and then the ACL of the current user is applied so that the user only sees the data from the Object cache they have permission to. Along with this, there is the Object cache multiplier, which is set on a site collection. This multiplier controls the number of rows that will be cached. Increasing the multiplier increases the number of items returned from cache at the expense of using more RAM to store more data.

Object cache cannot be disabled. Object cache configuration is very similar to Output cache. Object cache can check the site for changes every time there is a request or check for updates periodically. The only difference is that Object cache does not become completely invalidated when any change is made to the site. For instance, if a list item were to change, only that list item would be invalidated and all of the other list items would remain in Object cache. Also, dependencies between cached items are maintained, so if the list itself were deleted, all the list items for that list in the cache would become invalidated as well.
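
The granular invalidation described above can be pictured with a small Python sketch. This is a toy model under my own assumptions (string keys, a simple parent/child map), not the actual Object cache implementation or its API.

```python
class ObjectCacheSketch:
    """Toy cache where invalidating a parent also invalidates its dependents."""

    def __init__(self):
        self._items = {}       # key -> cached value
        self._children = {}    # parent key -> set of dependent keys

    def put(self, key, value, parent=None):
        self._items[key] = value
        if parent is not None:
            self._children.setdefault(parent, set()).add(key)

    def get(self, key):
        return self._items.get(key)

    def invalidate(self, key):
        """Remove one entry and, recursively, everything that depends on it."""
        self._items.pop(key, None)
        for child in self._children.pop(key, set()):
            self.invalidate(child)
```

Invalidating a single list item leaves its siblings in the cache, while invalidating the list removes every item registered as depending on it.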

The first step in configuring Object Cache is setting up the Portal Super Accounts, which control how the cache is built. Please read the whitepaper for more information about how to set up these accounts. Some configurations can be easily accessed in the site collection administration page. There are also some configurations that can be applied to the web.config which are new to SharePoint 2010.

Finally, you need to plan for the sizing of the Object Cache. According to Microsoft, with a small number of site collections (fewer than 50) the object cache should have little to no memory issues, but with more than that you may need to do some planning. Microsoft's recommendation is to plan 500KB of RAM per site collection that has Object Caching turned on.
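
As a quick illustration of that sizing guideline, here is a minimal Python sketch that multiplies the number of site collections by the 500KB figure. The function name and the default parameter are my own; adjust the per-collection value if your own measurements differ.

```python
def object_cache_ram_mb(site_collections: int, kb_per_collection: int = 500) -> float:
    """Estimate Object cache RAM in megabytes at a flat cost per site collection."""
    return site_collections * kb_per_collection / 1024
```

Under this rule, 50 site collections come to roughly 24 MB, and 200 site collections to just under 98 MB.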