Monday, November 2, 2009

What is a FAST Enterprise Search Project Part 2

Introduction

In my previous post (Why is FAST Enterprise Search Important) I discussed why an Enterprise Search project is important. In this post I will discuss what is needed for a successful Enterprise Search project. This should give you enough information to anticipate what an Enterprise Search project will require.

What is an Enterprise Search Project?

A few years ago I had to make the transition from custom application developer to application server consultant working with Microsoft products. Project plans for implementing SharePoint, K2 or BizTalk were really not much different, other than having several new tasks associated with the configuration, integration, sustainability and maintenance of the new application server. Even so, application server projects still have lots of custom artifacts and components that have to be developed. This is also the case with FAST.

When posed the question of what an Enterprise Search project is, I did not know where to start at first. I wanted to draw from my past experience. I also knew that Enterprise Search projects can be complex, but I did not understand what a search project would entail.

Content Processing and Transformation

Enterprise Search within an organization has many complexities. First, we have to be able to index content wherever it may be (in a custom database, 3rd party enterprise application server, file share, mainframe, etc.). Custom code may have to be written to bring this content over to FAST so that it can be indexed. Knowing this, a comprehensive analysis must be completed to understand all the content/data that is spread across the organization. A common mistake is for a company to index bad data and end up with the old "garbage in, garbage out" issues. There must be plans for handling both good and bad data, formatting unstructured data, making data relevant, normalizing data (removing duplicates), etc. We will need to understand the entire life-cycle of that data and how it can be effectively pulled or pushed into the FAST Search index. This is very similar to a data warehouse project; however, the context is a little different.

An Enterprise Search project is also very similar to a complex ETL project because you will have to create several transformation processes/workflows. The processes must transform the content into a document that can be recognized by the FAST index. FAST refers to anything in the index as a document, even if the index item comes from a database. A document for FAST is a unique piece of data with metadata which gives it relevancy. FAST provides several out of the box connectors that do this transformation, and it provides an API for writing custom ones. In many cases you may have to build or extend connectors. Just as important as the ETL pre-processing, there are post-processing routines that must be executed before the search results are passed back to the user interface layer. More relevancy rules or aggregation of search results may be incorporated here. I was happy to hear that the FAST team also draws comparisons to an ETL project when discussing what an Enterprise Search project is.
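To make the transformation idea a little more concrete, here is a minimal sketch in Python of turning database rows into documents and removing duplicates before indexing. This is not the FAST connector API. The `Document` class, the `row_to_document` and `drop_duplicates` functions, and the row field names (`id`, `title`, `description`) are all names I made up for illustration.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical index document: a unique id, body text, and metadata."""
    doc_id: str
    body: str
    metadata: dict = field(default_factory=dict)


def row_to_document(source: str, row: dict) -> Document:
    """Turn one database row into a document (field names are assumptions)."""
    # Collapse runs of whitespace so formatting noise does not reach the index.
    body = " ".join(str(row.get("description", "")).split())
    return Document(
        doc_id=f"{source}:{row['id']}",
        body=body,
        metadata={"source": source, "title": str(row.get("title", "")).strip()},
    )


def drop_duplicates(docs):
    """Keep the first document for each distinct body, compared case-insensitively."""
    seen = set()
    unique = []
    for doc in docs:
        digest = hashlib.sha256(doc.body.lower().encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique


rows = [
    {"id": 1, "title": "Pump manual", "description": "Install  guide for pump"},
    {"id": 2, "title": "Pump manual (copy)", "description": "install guide for PUMP"},
    {"id": 3, "title": "Valve spec", "description": "Valve pressure ratings"},
]
documents = drop_duplicates([row_to_document("crm", r) for r in rows])
print([d.doc_id for d in documents])
```

In a real project the duplicate rule would come out of the content analysis; an exact match on normalized text is only the simplest possible choice.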

User Interface

Most Enterprise Search platforms like FAST do not have a traditional GUI; they are search engines that can be plugged into new or existing platforms. FAST does provide several controls that can be integrated into any UI platform, but in many cases you will be extending them or building completely new controls. FAST provides a rich API that is accessible from languages such as .NET, Java and C++.

User Profile

An important element of a FAST Enterprise Search project is understanding the profile of the user who is performing the search. Things such as their current location, where they sit within the organization, what specialties they have, what past searches they have done, who they have worked for or currently work for, and past or future projects, tasks or initiatives they have supported can all be used to give a more relevant search result. This requires integration with systems that can infer these relationships and pass this information along with the query to the FAST Query and Results server, which will return a relevant result.
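As a rough illustration of passing profile information along with the query, the Python sketch below bundles the user's search terms with a few profile attributes. The `UserProfile` class, the `build_query_request` function and the shape of the request dictionary are my own assumptions; they do not reflect the actual FAST query API.

```python
from dataclasses import dataclass, field


@dataclass
class UserProfile:
    """Hypothetical profile assembled from HR, directory, and project systems."""
    user_id: str
    location: str
    department: str
    specialties: list = field(default_factory=list)


def build_query_request(terms: str, profile: UserProfile) -> dict:
    """Bundle the query with profile hints the search layer could use for boosting."""
    return {
        "query": terms.strip(),
        "context": {
            "location": profile.location,
            "department": profile.department,
            # De-duplicate and sort so the request is stable for a given profile.
            "specialties": sorted(set(profile.specialties)),
        },
    }


profile = UserProfile("jdoe", "Dallas", "Engineering", ["hydraulics", "pumps", "pumps"])
request = build_query_request("  pump maintenance ", profile)
print(request)
```

How the context is then used (boosting, filtering, or re-ranking) is a relevancy design decision for the project.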

Security

The profile is also important for incorporating security. FAST has numerous ways in which documents can be securely exposed to the end user. For instance, there is an Access Control List (ACL) which is part of the document instance in the search index. The ACL is populated during the indexing of content, and this may require customizations to set the ACL appropriately. In addition, more customizations may be added to do real-time authorization, ensuring that documents returned from the index have not since been removed from the user's visibility. Another consideration is to partition indexes based on boundaries such as internet, extranet and intranet. There are several more considerations, so time must be set aside in the plan to ensure that content is secured properly.
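The two layers described above, an ACL stored with each document plus a real-time authorization check, can be sketched in Python as follows. This is a simplified illustration and not how FAST implements security. The `visible_results` function, the `acl` field and the `still_authorized` callback are hypothetical names.

```python
def visible_results(results, user_groups, still_authorized=None):
    """Filter results by ACL overlap, then by an optional real-time check."""
    groups = set(user_groups)
    allowed = []
    for doc in results:
        # Index-time ACL: the user must share at least one group with the document.
        if not groups & set(doc["acl"]):
            continue
        # Real-time check: catches access that was revoked after indexing.
        if still_authorized is not None and not still_authorized(doc["id"]):
            continue
        allowed.append(doc)
    return allowed


results = [
    {"id": "hr-1", "acl": ["hr"]},
    {"id": "eng-7", "acl": ["engineering", "managers"]},
    {"id": "eng-9", "acl": ["engineering"]},
]
revoked = {"eng-9"}  # access removed in the source system since the last crawl
shown = visible_results(results, ["engineering"], lambda doc_id: doc_id not in revoked)
print([d["id"] for d in shown])
```

The real-time check costs a call to the source system for each result, so it is usually worth measuring its effect on query latency before applying it everywhere.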

Installation and Configuration

A major portion of the project plan needs to be devoted to the installation and configuration of the FAST server. There are several important things that need to be accounted for when doing this. For instance: how many queries will be executed concurrently, what are the peak usage scenarios, how much content will be indexed, what sort of complexities/exceptions are there in the indexing process, and what is the anticipated growth? All of this must be known for us to properly scale the FAST server and design the custom components.
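The answers to these questions feed simple back-of-the-envelope calculations like the Python sketch below. The growth rate, per-node throughput and headroom figures are placeholders I chose for the example, not FAST sizing guidance, and the function names are hypothetical.

```python
import math


def projected_documents(current_docs, annual_growth_rate, years):
    """Compound the current document count forward by a yearly growth rate."""
    return math.ceil(current_docs * (1 + annual_growth_rate) ** years)


def nodes_needed(peak_qps, qps_per_node, headroom=0.25):
    """Query nodes required to serve peak load with spare capacity on top."""
    return math.ceil(peak_qps * (1 + headroom) / qps_per_node)


# Placeholder inputs: 1M documents growing 50% a year, 120 queries/sec at peak,
# and an assumed 40 queries/sec per node measured in a load test.
print(projected_documents(1_000_000, 0.5, 2))
print(nodes_needed(120, 40))
```

The per-node throughput in particular should come from load testing with your own content and queries.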

Testing

With all of the custom transformation and GUI components needed to support the Enterprise Search implementation, there will need to be a focus on system integration testing, system application testing, and user acceptance testing. There will be specific tests for search to ensure that indexing, query performance and result relevancy are accurate and within acceptable ranges. This is nothing new, but we need to be sure that a proportionate amount of time is incorporated into the plan to ensure that a quality solution is put in place.
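One way to make "result relevancy within acceptable ranges" testable is to score the top results of known queries against a list of documents that business users have judged relevant. The Python sketch below uses precision at k, a standard information retrieval metric that I am swapping in as an example; the document ids and the 0.6 threshold are made up.

```python
def precision_at_k(returned_ids, relevant_ids, k):
    """Fraction of the top k returned documents that were judged relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top = returned_ids[:k]
    hits = sum(1 for doc_id in top if doc_id in relevant_ids)
    return hits / k


# Hypothetical test case: what the engine returned versus what users judged relevant.
returned = ["a", "b", "c", "d"]
relevant = {"a", "c", "x"}
score = precision_at_k(returned, relevant, k=3)
print(round(score, 3))

# An acceptance test would compare the score with an agreed threshold.
ACCEPTABLE_PRECISION = 0.6  # placeholder value
print(score >= ACCEPTABLE_PRECISION)
```

A set of such judged queries can be rerun after every relevancy change to catch regressions.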

Sustainment and Governance

Sustainment also needs to be part of the plan, and it is commonly neglected. Too often the plan is focused on the short-term end result while the long-term management is not incorporated into the solution. What sort of organizational management changes are required to support and maintain the search implementation? What sort of configuration management business processes will need to be introduced to continually tune the index and relevancy model based on usage? What sort of new roles and responsibilities need to be incorporated into employee performance (from both a systems and business user perspective)? How is the enterprise taxonomy going to be maintained? What sort of key performance metrics and reporting are needed to consistently evaluate the success of the project? What is the process for incorporating change back into the solution (which is extremely important for Enterprise Search)? If questions like these are not incorporated into the early design of the project, there will be long-term challenges with the adoption and integration of the Enterprise Search investment.

Closing

As you can see, the key to a successful Enterprise Search project is to understand the needs of the business and how the solution will be supported. Many of the tasks that were discussed are very standard; we just needed to put them in context.
