Tuesday, February 1, 2011

Customize SPMetal Generated Code for SharePoint to LINQ

A common question comes up with SharePoint 2010 developers: if they are using the SPMetal tool to generate entities (class definitions) for objects in SharePoint, can they modify the generated code? The first response is often no, that is not possible, because when you rerun the SPMetal command and replace your existing code with the newly generated code, you will lose your changes.

Well, there is a solution and it is extremely simple. I actually got the idea from all the coding experience I had building customizations into RIA Services with Silverlight. All the SPMetal tool is doing is generating proxy code, and notice that all of the code is generated as partial classes!!! Now the sky is the limit.

So in the example below, here is a custom content type I created called Anonymous Comment. Below is the generated code, and you can see it is a partial class. Now all you need to do is add another partial class called AnonymousComment to your project and add your custom methods. When you use the AnonymousComment class in other custom code, it will be treated as one class definition.

Another neat thing you can see is that there are partial methods for OnLoaded, OnValidate and OnCreated. You can implement those methods if you want and they will be called by the framework. One unfortunate thing you cannot do is modify the behavior of the getters and setters on the generated properties, but you can handle that in your custom partial class.

/// <summary>
/// The item that will capture an anonymous comment.
/// </summary>
[Microsoft.SharePoint.Linq.ContentTypeAttribute(Name="Anonymous Comment", Id="0x0100E3A8FCEBECB140D2812D4F3DE177EFEC")]
public partial class AnonymousComment : Item {

    private string _comment;

    private System.Nullable<System.DateTime> _created;

    private string _response;

    private System.Nullable<System.DateTime> _responseDate;

    private System.Nullable<Category0> _category;

    private System.Nullable<Status> _status;

    #region Extensibility Method Definitions
    partial void OnLoaded();
    partial void OnValidate();
    partial void OnCreated();
    #endregion

    public AnonymousComment() {
        this.OnCreated();
    }

    [Microsoft.SharePoint.Linq.ColumnAttribute(Name="AnonymousComment", Storage="_comment", FieldType="Note")]
    public string Comment {
        get {
            return this._comment;
        }
        set {
            if ((value != this._comment)) {
                this.OnPropertyChanging("Comment", this._comment);
                this._comment = value;
                this.OnPropertyChanged("Comment");
            }
        }
    }

    [Microsoft.SharePoint.Linq.ColumnAttribute(Name="Created", Storage="_created", ReadOnly=true, FieldType="DateTime")]
    public System.Nullable<System.DateTime> Created {
        get {
            return this._created;
        }
        set {
            if ((value != this._created)) {
                this.OnPropertyChanging("Created", this._created);
                this._created = value;
                this.OnPropertyChanged("Created");
            }
        }
    }

    [Microsoft.SharePoint.Linq.ColumnAttribute(Name="AnonymousCommentResponse", Storage="_response", FieldType="Note")]
    public string Response {
        get {
            return this._response;
        }
        set {
            if ((value != this._response)) {
                this.OnPropertyChanging("Response", this._response);
                this._response = value;
                this.OnPropertyChanged("Response");
            }
        }
    }

    [Microsoft.SharePoint.Linq.ColumnAttribute(Name="AnonymousCommentResponseDate", Storage="_responseDate", FieldType="DateTime")]
    public System.Nullable<System.DateTime> ResponseDate {
        get {
            return this._responseDate;
        }
        set {
            if ((value != this._responseDate)) {
                this.OnPropertyChanging("ResponseDate", this._responseDate);
                this._responseDate = value;
                this.OnPropertyChanged("ResponseDate");
            }
        }
    }

    [Microsoft.SharePoint.Linq.ColumnAttribute(Name="AnonymousCommentCategory", Storage="_category", FieldType="Choice")]
    public System.Nullable<Category0> Category {
        get {
            return this._category;
        }
        set {
            if ((value != this._category)) {
                this.OnPropertyChanging("Category", this._category);
                this._category = value;
                this.OnPropertyChanged("Category");
            }
        }
    }

    [Microsoft.SharePoint.Linq.ColumnAttribute(Name="AnonymousCommentStatus", Storage="_status", FieldType="Choice")]
    public System.Nullable<Status> Status {
        get {
            return this._status;
        }
        set {
            if ((value != this._status)) {
                this.OnPropertyChanging("Status", this._status);
                this._status = value;
                this.OnPropertyChanged("Status");
            }
        }
    }
}
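To make this concrete, here is a sketch of the kind of companion partial class you could add in a separate file. The `OnValidate` body and the `HasResponse` helper are my own illustrative additions, not anything SPMetal generates; only the class name and the generated property/partial-method names come from the code above.

```csharp
using System;

// AnonymousComment.Custom.cs - lives in its own file so rerunning SPMetal
// never overwrites it. The namespace and class name must match the
// generated partial class exactly.
public partial class AnonymousComment
{
    // Implement one of the generated partial methods; LINQ to SharePoint
    // calls it for you, so this runs whenever the item is validated.
    partial void OnValidate()
    {
        if (string.IsNullOrEmpty(this.Comment))
        {
            throw new InvalidOperationException("A comment is required.");
        }
    }

    // A custom helper (hypothetical) that layers logic on top of a
    // generated property, since you cannot change its getter directly.
    public bool HasResponse
    {
        get { return !string.IsNullOrEmpty(this.Response); }
    }
}
```

Because both halves compile into one class, any code that uses AnonymousComment sees the generated properties and your custom members side by side.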

SharePoint 2010 Server Topology Examples

I figured I would write a short blog about SharePoint logical / physical architecture, YET AGAIN!!!

Well, I actually received some advice that made me want to tweak my message a little. If you read my earlier blog series on the SharePoint 2010 service architecture, you will know how much SharePoint 2010 has changed. You may have also read my series on the new SharePoint 2010 Search architecture and found out there have been tons of interesting changes there too.

I attended a great presentation by Shane Young (SharePoint MVP) which drove me to write this blog. What I realized is that I have to stop architecting as if this were SharePoint 2007 and take advantage of SharePoint 2010. Heck, I should know better, given all the new architecture features I have been writing about all this time.

In many SharePoint 2007 implementations, the one service much of our planning revolved around was SharePoint Search and the SSP. Now all of those limitations are gone.

I am commonly asked how many servers are needed to get started with SharePoint 2010. That is a LOADED question, because such things as SQL high availability, network, service utilization, enterprise features, etc. all play into an optimum configuration of SharePoint 2010. There really is no one size fits all. However, there is always a good place to start the discussion, and Shane Young provided the two recommendations I liked best.

The Three Server Farm

This one is a pretty fun one when you think about it. There are a ton of SharePoint 2007 environments out there with three SharePoint servers and a dedicated SQL Server. In those cases there are two web front ends and one application server. The application server runs the index service and maybe something like Excel Services, while the search query service runs on the web front ends. In the SharePoint 2007 days this gave some good redundancy, but with SharePoint 2010 we can do better.

Now one approach you can consider in SharePoint 2010 is just making all three servers the same. Rather novel idea right?

clip_image001

Observations:

  • You now have web traffic distributed across three web front ends.
  • All three servers can support SharePoint 2010 services traffic such as Excel, Visio, Access, etc.
  • The search crawl service should be pretty spiffy, because you can configure all three servers to perform full crawls during off hours and incremental crawls during the day, depending on how fresh you need to keep search.
  • All three servers can be made redundant from a search perspective by having index mirrors.
  • Search results will actually be quicker because you will have three index partitions that can be searched across simultaneously.

Now I had a colleague point out that this does not fly well because “this does not maximize the separation of services architecture from the web front ends”. I completely understand that standpoint, but I really do not think it should disqualify this option. Even though everything is running on each machine, the SharePoint roles / services will continue to run independently in different IIS websites, in separate application pools and with different accounts. There is no reason why you cannot continue to configure them as if they were running on different machines. This is probably a good thing too, because someday you will have to scale SharePoint to meet demand. Remember, you can always scale out by adding boxes or by separating services onto their own dedicated machines. All I can say is configure SharePoint 2010 based on your requirements, adhere to best practices, plan as much as you can and do the right configuration for your company.

The Four Server Farm

Now let’s look at a four server SharePoint farm. There will be cases where the three server farm recommendation will not make sense, for whatever valid reason. In this configuration we go a little more “traditional”. In this case, Shane recommends two web front ends and two application servers.

clip_image002

Observations:

  • We have two dedicated web front ends. Add more if you need to.
  • Application services will run on the same application servers as search. Hopefully they will not interfere with full crawls; if you require dedicated resources for the application services, just add more application servers.
  • The two application servers give us the redundancy we need for search. You will have two crawlers and two index partitions with mirrors to start with.
  • One little trick I was told for this configuration: configure the application servers where the search crawl component is running to also be web front ends (WFE), but do not load balance them with the other WFEs. Then add the SharePoint site URLs to be crawled as entries in the hosts file on the application servers, pointing back to the server itself and not to the load balanced URL. What this effectively does is trick the crawler into crawling the WFE on the application server. This gives performance improvements twofold. First, the load balanced WFEs where users access content will have no performance issues caused by the crawler. Second, network load will be reduced because nothing will go across the wire while indexing is occurring.
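As a sketch of that hosts-file trick, the entries on each crawling application server would look something like the following (the server IP and site URLs are made up for illustration):

```
# C:\Windows\System32\drivers\etc\hosts on the application server.
# Point the crawled site URLs at this server's own local WFE role
# instead of the load-balanced VIP, so crawl traffic never leaves the box.
10.0.0.21    intranet.contoso.com
10.0.0.21    mysites.contoso.com
```

Here 10.0.0.21 would be the application server's own IP address; repeat the same entries (with its own IP) on each application server running a crawl component.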

Additional Notes:

  • High Availability – SQL is a big player in this discussion and you need to read this blog.
  • FAST for SharePoint 2010 - Now if you are going to consider using FAST for SharePoint, I would probably vote for the three server architecture recommendation as a starting off point. Here is a blog series on FAST for SharePoint 2010 if you are interested.

Closing

As I have said, there is no one size fits all. This is just a place to start and based on your plans you will scale up the SharePoint 2010 architecture as needed.

SharePoint 2010 Health Analyzer and RSS Feeds

In Central Administration of SharePoint 2010 you are probably familiar with the Health Analyzer. It was around in SharePoint 2007, but now in SharePoint 2010 you see a health status bar right on the Central Administration homepage.

clip_image002

So a colleague and I were joking around: why not put an RSS feed in Outlook on the list that stores the health reports, so you get automatic notification when there is a health issue?

Guess what, that is not a bad idea. Now I will get a notification if one of my rules fails in production. This is not a replacement for SCOM, just something simple you can do to monitor your SharePoint Health Analyzer rules. Now I know many folks could debate some of the rules, but at the end of the day most of them are good best practices you may want to adhere to in a production environment.
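For reference, the feed is just the standard SharePoint list RSS feed on the Health Analyzer reports list in Central Administration. The host name, port and list GUID below are placeholders; check your own Central Administration URL and the list's View RSS Feed option for the real values:

```
# The Health Analyzer reports list on the Central Administration site:
http://centraladmin:2010/Lists/HealthReports/AllItems.aspx

# The list RSS feed you would subscribe to in Outlook:
http://centraladmin:2010/_layouts/listfeed.aspx?List={your-list-guid}
```

The easiest way to grab the correct feed URL is to open the list itself and use the RSS Feed button on the ribbon.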

clip_image004

Health Analyzer Rules

If you are not familiar with the Health Analyzer rules, they can be managed in Central Administration. Below is a screenshot of some of the actual rule definitions.

clip_image006

Here is an actual rule definition. You can see there is a schedule for when the rule is evaluated. Also notice there is a Run Now button which will execute the rule manually. You even have the ability to disable rules which do not apply to your governance policies, so that they will not continually appear in your RSS feed.

clip_image008

Monday, January 17, 2011

Part 6 - FAST for SharePoint 2010 References

References

Below is a full list of all the resources I used for this series on FAST for SharePoint.

Background References

FAST for SharePoint Planning and Architecture References

FAST for SharePoint Deployment References

Part 5 - FAST for SharePoint 2010 Service Configuration

In part five of this series, we are going to focus on the SharePoint side of FAST farm configuration.

Configuration of SharePoint Services

After the FAST farm has been set up, you will need to configure the SharePoint 2010 farm to communicate with it. As I mentioned earlier in this blog series, this is done through two services:

  • FAST Search Connector service – also commonly referred to as the Content SSA.
  • FAST Search Query service

Before configuring the search components in SharePoint, there is a file that will help. Go to the server where the FAST Administration component is installed. There will be a file under C:\FASTSearch called install_info.txt. The values within this file will be used in the configuration of the SharePoint services that communicate with the FAST for SharePoint farm.

FAST Search Connector service

To set up a new FAST Search Connector service, go to Central Administration as normal and create a new service application. Add a new Search Service Application and, in the FAST Service Application section, select the FAST Search Connector option.

Next you need to specify a search service account – which in this case will be a SharePoint managed service account. As well, you need to specify an application pool.

image

Next we need to configure some settings specific to FAST, and this is where information from the install_info.txt file will assist us.

As I mentioned earlier, the job of the FAST Search Connector service is to send information to the FAST Content Distributor component which will subsequently feed documents to the various processing components on the FAST farm. As you see below, there is a Content Distributor server name.

As well, there is a Content Collection Name. The term “collection” is a carryover from the FAST ESP product. A collection is a logical grouping of searchable documents in FAST. If you would like to read more about what a FAST collection is, read my old blog here. You only have the ability to specify one collection name. The default collection name is “sp”. You may ask why you should care about this. Well, you may have multiple SharePoint farms feeding content to the same FAST farm, and you may want to logically organize the content being fed (i.e. intranet content versus extranet content).
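If you do decide to use additional collections, they are managed from the FAST Search PowerShell shell on the FAST admin server, not from Central Administration. A minimal sketch (the collection name "extranet" is my own example):

```
# Run in the "Microsoft FAST Search Server 2010 for SharePoint" shell.

# List the existing content collections (the default is "sp"):
Get-FASTSearchContentCollection

# Create a separate collection, e.g. to keep extranet content apart:
New-FASTSearchContentCollection -Name "extranet"
```

You would then point a second Content SSA at the new collection name so its content stays logically separated in the FAST index.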

image

Once the FAST Search Connector service has been created, go take a look at the service. You will notice right off the bat that it is almost identical to the out of the box search administration, except that only the crawling-related links on the left are available. Everything is about crawling.

All you need to do from here is add content sources, set up some crawl schedules and you are off to the races.

image

Below is the topology of the FAST Search Connector service (which is only the topology of the SharePoint side, not the FAST side). There is nothing of real interest other than the fact that there is an administration component. The crawl component is not the same as the SharePoint Search one; it just supports the feeding of content and is not crawling in the traditional sense. There are no SQL performance considerations you have to account for here.

image

FAST Search Query service

Next you need to configure the FAST Search Query service. As I mentioned earlier in this blog, this service has two purposes: to send queries to FAST and retrieve the results, and to perform the People Search. To support the People Search, this service includes both indexing and querying.

Just like before, add a New Search Service Application but this time select FAST Search Query. Again you will need to specify a service account; you can probably just reuse the one you created for the connector service.

image

Next you will need to configure two application pools.

image

Finally, there are four configuration settings for FAST. All of the information you need to enter here is again located in the install_info.txt file we pulled off the FAST admin server.

image

When it is all done, you will have a service that looks like the following. It looks identical to the out of the box SharePoint search. You will notice that there are both Crawling and Query and Results links on the left. Remember, there are Crawling links because People Search is part of this service. You will need to configure everything like you normally would for a search service.

image

Finally, here again is the search topology for this FAST Search Query service. You can see that several databases and other components have been created, all to support the crawling and querying.

image

Once both of these services have been created, you will see the two FAST services in the Service Application list in Central Administration.

image

Conclusion

I really hope that this blog series helps you get started in understanding what a FAST for SharePoint deployment would be like before you go off and do one.

Part 4 - FAST for SharePoint 2010 Farm Configuration

Part four of this series is going to focus on configuration of FAST for SharePoint.

Configuration of the FAST for SharePoint Farm

Now the next question is how to configure this FAST farm, because in many instances you know how to do SharePoint but FAST is a foreign concept. Well, the FAST for SharePoint 2010 farm is not configured through SharePoint 2010 Central Administration. At a high level, there is an XML file that you need to create that captures which FAST for SharePoint components are configured on which server in the FAST farm. This XML file basically drives the entire configuration of the FAST farm.

In this part of the series I am going to give an introduction and pointers to information on how to configure your FAST for SharePoint 2010 Farm.

The best resource to begin your understanding of a FAST for SharePoint deployment is “Deployment guide for FAST Search Server 2010 for SharePoint”, located at http://go.microsoft.com/fwlink/?LinkId=204984. This whitepaper pretty much has it all. This blog is really just supplementary, with some added details.

Understanding this document will go a long ways in understanding the architecture of FAST for SharePoint. At a high level:

  1. There are several service accounts, firewall configurations, IP address work, Windows updates, anti-virus and proxy settings you need to take care of before you can even start the installation.
  2. Next I would make sure you have SharePoint 2010 installed, and then make sure that the servers that will host the FAST for SharePoint farm have access to the same SQL environment.
  3. Next you need to install FAST for SharePoint onto each server in the FAST farm. There is a prerequisites installer that you need to run to make sure all the required components are on the machine.
  4. Then you need to configure the FAST for SharePoint farm. There are two options: stand-alone or multiple server farm. The stand-alone option is as simple as it sounds; you just need to go through the configuration steps.
  5. The multiple server farm configuration entails the creation of the deployment.xml file that I referred to earlier. After installing the FAST for SharePoint bits on each server in the farm, you will go through a configuration process which uses the deployment.xml file to activate components on that server. We will take a deeper look into that shortly.
  6. Next we need to create the FAST Search Connector service in SharePoint Central Administration, which will send SharePoint content to FAST. You will need to configure SSL for the communication between the two.
  7. Finally, you will need to set up the FAST Search Query service in SharePoint Central Administration, which crawls people information and calls the FAST for SharePoint query servers. There is an extra step in this configuration which requires claims authentication to support the call to the FAST query servers.
  8. There are several other steps in this whitepaper that lay out how to get the SharePoint search centers set up and how to test your installation.

For the remaining parts of this blog, I am going to focus on steps 5, 6 and 7.

Deployment XML File

As I mentioned, the deployment.xml file is the key to deploying the FAST for SharePoint farm. There is no nice GUI that will show you the farm and allow you to configure it. In step five you will install the FAST for SharePoint bits on each server of the FAST farm. Then, on the machine that will host the admin component, you go through the configuration process, which uses the deployment.xml file. Then you go to each other server in the FAST farm and configure it using the same deployment.xml file.

Now you see the importance of this file and why I wanted to focus on it. I was able to find some good references and examples.

First let’s talk about the deployment.xml file. You need to read this - http://technet.microsoft.com/en-us/library/ff354931.aspx - it may be a boring read, but it shows how everything works. Below is a listing of the components that I covered earlier in this blog, with a mapping to the XML nodes.

  • Administration component - <admin>
  • Document Processing component - <document-processor>
  • Content Distributor component - <content-distributor>
  • Indexing Dispatcher component - <indexing-dispatcher>
  • Web Crawler component - <crawler>
  • Web Analyzer component - <webanalyzer>
  • Indexing component - <searchengine>
  • Query Matching component - <searchengine>
  • Query Processing component - <query>

All of the tags described in the deployment.xml reference are important, and several do not map directly to components. It was not easy at first to gain an understanding of how this works. The best way is to open up the deployment.xml reference and then review some examples of deployment.xml files to really understand how it works.

The following are several places where I found examples of the deployment.xml file:

To save myself time, I am going to pull one of the examples from the FAST Search Server 2010 for SharePoint Capacity Planning document here. I picked this one because it really shows a medium size FAST farm that is scaled out. I am not saying this is the best configuration to start with either – read the FAST Search Server 2010 for SharePoint Capacity Planning document to find a base farm that best meets your needs and tweak as needed. In many cases a simple three server farm configuration is the best place to start (i.e. the small server farm deployment which was shown earlier in this series). But that would be no fun to talk about :)

So here is a picture of a medium farm. As you can see:

  • This farm has two rows and three columns.
  • On the first row, document processing components have been spread across multiple servers to provide good content consumption.
  • The first row is also really dedicated to indexing, as the Indexing components are turned on there along with Query Matching components.
  • The second row is dedicated to searching, as there are Query Matching components and Query Processing components.
  • Note that components such as the Content Distributor and Indexing Dispatcher have been turned on in the first row.
  • Looking at the very first server in the farm, you may ask why query and document processor components are there too. The justification is that they will not be actively referenced by the FAST farm, but they are available if maintenance is being done on the FAST farm.

Now honestly, given all the information I have provided thus far in this blog series, there are a few things I would potentially change in this configuration to ensure that I had a highly available FAST farm. I will get into those changes shortly.

image

Here is the deployment.xml file for this farm configuration above:

<?xml version="1.0" encoding="utf-8" ?>
<deployment version="14" modifiedBy="contoso\user"
modifiedTime="2009-03-14T14:39:17+01:00" comment="M4"
xmlns="http://www.microsoft.com/enterprisesearch"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.microsoft.com/enterprisesearch deployment.xsd">

<instanceid>M4</instanceid>

<connector-databaseconnectionstring>
[<![CDATA[jdbc:sqlserver://sqlbox.contoso.com\sql:1433;DatabaseName=M4.jdbc]]>
</connector-databaseconnectionstring>

<host name="fs4sp1.contoso.com">
<admin />
<query />
<webanalyzer server="true" link-processing="true" lookup-db="true" max-targets="4"/>
<document-processor processes="12" />
</host>

<host name="fs4sp2.contoso.com">
<content-distributor />
<searchengine row="0" column="0" />
<document-processor processes="12" />
</host>

<host name="fs4sp3.contoso.com">
<content-distributor />
<indexing-dispatcher />
<searchengine row="0" column="1" />
<document-processor processes="12" />
</host>

<host name="fs4sp4.contoso.com">
<indexing-dispatcher />
<searchengine row="0" column="2" />
<document-processor processes="12" />
</host>

<host name="fs4sp5.contoso.com">
<query />
<searchengine row="1" column="0" />
</host>

<host name="fs4sp6.contoso.com">
<query />
<searchengine row="1" column="1" />
</host>

<host name="fs4sp7.contoso.com">
<query />
<searchengine row="1" column="2" />
</host>

<searchcluster>
<row id="0" index="primary" search="true" />
<row id="1" index="none" search="true" />
</searchcluster>

</deployment>


Now looking at that deployment.xml file, here are some notes that will help you better understand it:

  • The <deployment> tag is a wrapper tag. It has some attributes for you to manage the version, last modified date, etc. I highly recommend you use these attributes for configuration management reasons.
  • The <instanceid> tag can be used by SCOM.
  • The <connector-databaseconnectionstring> tag is the location where you specify JDBC connection strings if that connector is being used.
  • The <host> tag represents a specific server in the farm. Within the <host> tag is where you identify all of the components that will be turned on for a specific server. This is where <admin>, <document-processor>, <content-distributor>, <indexing-dispatcher>, <crawler>, <webanalyzer>, <searchengine> and <query> are defined. You can read the specification for these tags as they are pretty straightforward in their configuration.
  • The <searchengine> tag is the most important component tag when it comes to understanding how many Indexing and Query Matching servers there are. In the <searchengine> tag you specify the row number and column number; however, you do not specify here whether the Indexing and/or Query Matching components are turned on. That is done in correlation with the <searchcluster> and <row> tags.
  • The <searchcluster> tag is an important tag that is used to wrap the <row> tags. Earlier in this series I presented a diagram that discusses columns and rows. These <row> tags define the number of Indexing and Query Matching component rows that are being used in the farm configuration.
  • A <row> tag can be defined as primary, secondary or none using the index attribute. There can only be one primary index row. Marking the <row> tag as primary indicates it is the primary row of servers with Indexing components. Marking the <row> tag as secondary means the row is a redundant index row. Finally, marking the <row> tag with the value “none” means there are no Indexing components on that row.
  • The <row> tag also has an attribute called search, which can be true or false. If marked true, the Query Matching component is deployed on that row.
  • So here are some examples of the <row> tag which I find critical to understanding the deployment. There are more permutations, but these are the ones you will run up against the most:
    • <row id="0" index="primary" search="true" /> - this is a primary row with the Indexing component, and there are also Query Matching components on the row.
    • <row id="0" index="primary" search="false" /> - this row is completely dedicated to indexing.
    • <row id="1" index="secondary" search="true" /> - this second row has redundant Indexing components, and the primary purpose of this row is Query Matching.
    • <row id="2" index="none" search="true" /> - this row is completely dedicated to Query Matching components only.
  • The <row> tag is tightly correlated back to the <searchengine> tag. Having a <searchengine> tag defined for a <host> indicates that server will have Indexing and/or Query Matching components. As you can see, the <row> tag actually controls which one is active on that server.

As I mentioned earlier, there are some changes I would make to this farm to ensure there is some redundancy. For instance, I would change <row id="1" index="none" search="true" /> to <row id="1" index="secondary" search="true" />. The reason is that if an index server were to fail, the query server in that column would take over as the index server until the failed server is fixed. Now, you could experience some query performance issues while this is going on if high volumes of content are being consumed.
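That change amounts to a one-attribute edit in the <searchcluster> section of the deployment.xml shown earlier:

```xml
<searchcluster>
  <row id="0" index="primary" search="true" />
  <!-- was index="none"; "secondary" keeps a redundant copy of the index
       on row 1 while it continues to serve queries -->
  <row id="1" index="secondary" search="true" />
</searchcluster>
```

After editing the file, the configuration would need to be reapplied across the farm, since every server is configured from the same deployment.xml.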

The way the redundancy works is that the FIXML documents (the searchable representation of every document in the index) are replicated to the second row. However, an index is not actually built from the FIXML documents up front. When an indexing component fails, a new index is built using the FIXML documents on the redundant index row.

Once you start to get the hang of it, this deployment.xml file is not too hard.

When you have completed an installation on a specific machine, run the following command:

nctrl status

This will return the status of all the FAST components that are running on that machine.
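A few other nctrl commands come in handy when working on a node. This is a sketch; the component name in the last line is an example, so check the names in your own nctrl status output:

```
nctrl status                 # show the state of all components on this node
nctrl stop                   # stop all components on this node
nctrl start                  # start all components on this node
nctrl restart procserver_1   # restart a single component by name
```

These are run from the FAST Search PowerShell shell on the server in question.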


image

Another thing you can do is go into the services console and check out the FAST services.

image

In the next part of the series I am going to focus on the SharePoint side of the configuration.

Part 3 - FAST for SharePoint 2010 Physical Architecture

Part three of this series will now focus on the physical architecture of FAST for SharePoint farm deployment.

Physical Architecture / Topology

Now I am going to take what we have learned about the logical architecture and scaling and apply it to some farm scenarios. From what I know, scaling a FAST for SharePoint 2010 farm is similar to scaling a SharePoint 2010 farm. We basically need to analyze the requirements and then scale the FAST components appropriately to meet those requirements. Some of the immediate questions that will come up are:

  • How much content needs to be searchable?
  • What is the total number of items that need to be searched?
  • What is the format, size and any other interesting characteristic of the items to be searched?
  • What sort of availability is needed to support search?
  • How fresh must the search results be and how often is content changed?
  • How many users will be performing searches concurrently?

This is probably just a starting point. I actually wrote a set of questions that need to be asked here.

Up to this point we have learned about the FAST for SharePoint components such as:

  • Administration component
  • Document Processing component
  • Content Distributor component
  • Indexing Dispatcher component
  • Web Crawler component
  • Web Analyzer component
  • Indexing component
  • Query Matching component
  • Query Processing component

We have also learned about how they scale. Now honestly, it is pretty much impossible for me to give you all the examples out there on how to scale FAST. So I am going to pick a few (which I shamelessly already have pictures for) and discuss those :)

Minimum Deployment

The following is a very basic, non-redundant implementation of SharePoint 2010 and FAST for SharePoint 2010. It is never really recommended to install both on the same machine unless you are creating a local development environment. In both of these scenarios there really is no scale, other than SharePoint and FAST being on different machines.

In the case of FAST, all of the core components such as admin, document processing, content distributor, indexing dispatcher, web analyzer, indexing, query matching and query processing are deployed to a single machine.

image

It is interesting to note here that SQL is not heavily utilized by FAST for SharePoint. FAST only needs to access configuration information from SQL. This is different than the out of the box SharePoint 2010 Search which heavily utilizes SQL as part of the indexing and querying process.

Small Deployment

The following is considered to be a small deployment of FAST for SharePoint. You will notice there are both a SharePoint farm and a FAST farm. I am not going to discuss the SharePoint farm; you can read my blog on how to scale SharePoint 2010 farms. However, let’s look at the FAST for SharePoint 2010 farm. As you can see, there are three boxes.

image

The first box:

  • Has the Administrative component which can only be installed on one machine in the farm.
  • Has the Content Distributor which is responsible for routing documents to document processors.
  • Has a Document Processor component with 12 processes running. The more processes you have running, the more content can be consumed. The number of processes that can be supported is limited by the available cores on the server.
  • Has an instance of the Web Analyzer running.

The second box:

  • Has an additional document processing component.
  • Has the index dispatcher component which is responsible for sending processed content to be indexed.
  • Has the indexing component, query matching and query processing components installed.

The third box:

  • Has redundant indexing component, query matching and query processing components.

In the end, this is a really simple / standard implementation that can handle in the range of 10+ million items and manage about 10 queries per second.

Medium Deployment

This is a medium sized FAST for SharePoint deployment. This should be able to index roughly 40 million items and continue to support 10 queries per second.

image

Observations:

  • The first two FAST servers have all the administration, content distributor, and web analyzer components. They also have document processing components.
  • The following six servers have been configured using the column and row patterns I introduced earlier.
  • There are three index columns to partition the content, allowing for a higher volume of content than the previous deployment.
  • On the first row we have additional document processing components. This will improve consumption and processing of content, plus it adds redundancy.
  • As well, on the first row the indexing dispatcher and indexing components have been installed.
  • On the second row are secondary indexing components, which are redundant copies of the indexes on the first row. It is not exactly clear in the diagram, but the query matching and query processing components are also configured there. This way there are dedicated machines to support the querying of content.

Large Deployment

This final farm is very similar to the previous one, except that some of the administration components have been scaled out to a third server and there are now a total of six index columns, to support up to 100 million searchable items.

image

As I said before, these are by no means the only configurations of the FAST for SharePoint farm. The components can be scaled however best meets your requirements.

References