Monday, January 17, 2011

Part 1 - FAST for SharePoint 2010 Features and Proposition

Introduction

FAST for SharePoint 2010 (FS4SP) is one of the newest and coolest features of SharePoint 2010. FAST was a major acquisition for Microsoft, and it is rated as one of the top Enterprise Search engines in Gartner's Magic Quadrant. With the two combined, Microsoft really has a best-of-breed platform for managing content and building enterprise solutions.

FAST ESP 5.3 was the last version of the product before it was integrated with SharePoint 2010. The legacy product is still available in different formats from Microsoft because they want to continue to support existing clients, and there are scenarios where FAST ESP 5.3 is a better fit. It has been rebranded as FAST Search for Internet Sites (FSIS) and FAST Search for Internal Applications (FSIA). I am not going to go into either of these. If you want an understanding of FAST ESP 5.3, I have written a blog series about it here which goes into the architecture.

After writing that series over a year ago, I have been wondering how FAST is integrated into SharePoint 2010. I really wanted to understand this from an architecture perspective so I can provide good guidance on how to set up SharePoint 2010 to use FAST. When I set out to do this, it took me a while to find the information, but I was able to piece it together. I have listed all the references I have used at the end of the series. What you will soon find out is that the architecture of FAST did not really change.

Features

The following are some of the additional features of FS4SP that are added on top of SharePoint Search you should know about right off the bat:

  • Advanced Content Processing – Extract and create metadata by using content within the documents. This will improve search results, relevancy, sorting and refinement. Plus this reduces the human workload to actually create the metadata.
  • Advanced Sorting – Provides ability to sort results based on any managed properties or rank profiles that are available to the current user.
  • Business Intelligence Indexing Connector – Supports the ability to index such things as Excel workbooks, SSRS reports, etc. So for instance there may be data within the report, a title in a pie chart, etc. which will now appear in a search result.
  • Contextual Search – Have the ability to customize search results and refinement options based on the user profile or the audience that is performing the search. This is a major feature when it comes to building relevant search results to the user.
  • Tunable Relevance with Multiple Rank Profiles – Ability to create rank profiles that incorporate things such as freshness, authority and quality to provide more relevant results. This is important because these rankings are not part of the index; they are applied at query time, which is what makes them tunable. You can identify authoritative pages, and you can identify documents for promotion (and tie that promotion to the user context).
  • Deep Refinement – SharePoint Search only provides shallow refinement, which computes refiners from just the first 50 results of the original query. FAST provides deep refinement, which is based on statistical aggregation of managed property values across the entire result set. Because FS4SP provides a deep refiner, it can show the exact count of documents for each refiner value.
  • Extensible Search Platform – Build complex search solutions, search driven applications, etc. Personally this is where I see lots of opportunity to use search in ways you have not thought about before.
  • Extreme Scale Search – The ability to search millions upon millions of documents with sub-second response times. There is no practical scale boundary when it comes to FAST.
  • Rich Web Indexing – Ability to index dynamic HTML and JavaScript content with a custom indexing connector.
  • Similar Results – When results are returned, there will be a link called Similar Results which the user can click. Basically when a user clicks on the link the search is re-defined and re-executed to include documents that are similar to the result.
  • Result Collapsing – FS4SP can perform a checksum on data within the index and will collapse results into a single result where possible. Now documents that are popular and stored in many places across the organization will not be shown multiple times. The user has the ability to click a link to see where all of the duplicates are stored.
  • Thumbnails and Previews – Review content quickly with thumbnail and preview images in search results. One example is the PowerPoint previewer. In reality this feature is given to us through Office Web Apps; however, it is bundled as part of the FAST solution offering.
  • Two Way Synonyms – SharePoint Search supports the usage of synonyms, but they are only applied to keywords in the query. FS4SP also supports the ability to apply synonyms to the documents themselves, so if the query has a keyword with a synonym but the document only contains the synonym, that document will still be returned in the result set.
  • Managed Property – With FAST, you have the ability to create metadata mappings and create rules for that metadata and how it will be used to provide a better search experience. Specifically this identifies if the property can be used for sorting (i.e. can the property be sorted in the search results), filtering (i.e. can the property be used to filter), as a refiner (i.e. can be used with deep refiners), priority (i.e. used in ranking algorithm against other documents), and dynamic/static summaries (i.e. dynamic summary will display a hit-highlighted summary of the managed property in the result).
  • Property Extraction – FAST will identify key information like people names, company names, geographic names/location, etc. within a document and then use that data to provide more relevant search results.
  • Rank Profiles – A feature of FS4SP that controls how relevancy is calculated for all items that are indexed. Rank Profiles can be aligned to a type of user to provide a more relevant search experience, or you can even let users select a rank profile based on what they are trying to accomplish. Rank Profiles in FS4SP take into account factors such as freshness, proximity, authority, query authority, context and managed properties.
  • Visual Best Bets – Provide the ability to return rich, editorialized results based on keywords. This can also be tied contextually to the type of user viewing the best bet.
  • Linguistics – Supports language variations so that users can find relevant information. What this helps with is finding relevant results based on words and phrases that may not be identical between the query and the indexed item. Features such as character normalization, normalization of stemming variations and suggested spelling corrections are part of the linguistic processing. Linguistic processing is applied both during item processing (before the document is indexed) and to the queries that are submitted by the user.
  • Multiple Language Support – FAST supports well over 80 languages - http://technet.microsoft.com/en-us/library/ff793350.aspx
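
To make the Deep Refinement item in the list above a bit more concrete, here is a minimal Python sketch of the difference between a shallow and a deep refiner. This is not FS4SP code and it does not use any FAST or SharePoint API; the function names, the `filetype` property and the sample result set are all made up for illustration. The only number taken from the feature description is the 50-result cutoff for shallow refinement.

```python
from collections import Counter

def shallow_refiner(results, prop, limit=50):
    """Count values of `prop` over only the first `limit` ranked results."""
    return Counter(doc[prop] for doc in results[:limit] if prop in doc)

def deep_refiner(results, prop):
    """Count values of `prop` over the entire result set."""
    return Counter(doc[prop] for doc in results if prop in doc)

# Hypothetical ranked result set: 120 hits, where the top 50 all happen
# to be Word documents.
results = (
    [{"title": f"Spec {i}", "filetype": "docx"} for i in range(50)]
    + [{"title": f"Deck {i}", "filetype": "pptx"} for i in range(40)]
    + [{"title": f"Sheet {i}", "filetype": "xlsx"} for i in range(30)]
)

shallow = shallow_refiner(results, "filetype")
deep = deep_refiner(results, "filetype")

print(dict(shallow))  # {'docx': 50}
print(dict(deep))     # {'docx': 50, 'pptx': 40, 'xlsx': 30}
```

In this made-up example the shallow refiner never offers "pptx" or "xlsx" as refinement options because those documents rank below the cutoff, while the deep refiner sees them and reports exact counts for every value.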

Value Proposition

I will be honest: all those features sound great, but taking a step back, you really need to ask yourself how a powerful search engine will help your organization. I actually wrote about this topic from my perspective when I was starting to learn about Enterprise Search - http://www.astaticstate.com/2009/11/why-is-fast-enterprise-search-important.html. I highly recommend that you review this posting to understand why Enterprise Search is so important. Upon reading it, you will see how important having FAST is, whether you have massive amounts of content or you are just a medium sized organization with 1 million items. At the end of the day:

  • Users are bombarded with information that is located in numerous locations.
  • Users are not aware, educated or trained on how to find that data.
  • In many instances documents do not have good descriptive data making them hard to find.
  • Companies have challenges with retention of people and technology. They are looking for ways to keep that data available to the organization.
  • Companies have challenges onboarding new personnel.
  • Companies have challenges communicating across geographical boundaries where information is not shared.
  • Users are challenged with finding the right information from authoritative experts.
  • Organizations have challenges with customer relationships, satisfaction and retention.
  • Companies are not enabled to bring the offline office community online. Documents are not recognized in the way they are actually used within the organization.
  • Companies have challenges with effectively sharing or selling data they have.
  • Organizations have challenges with knowledge management as information is managed in silos.
  • Organizations are not responsive to their employees and customers.
  • Companies and organizations have challenges bridging together business processes that span across geographies, organizational hierarchies and technologies.

Really the list can go on and on. The utilization of FAST in SharePoint, regardless of how big or small the content source(s) is, will ultimately improve a company’s ability to react to these challenges.
