Showing posts with label SharePoint Architecture. Show all posts

Monday, September 7, 2015

New SharePoint and Office 365 Hybrid Search

I am super excited to see the new Hybrid Cloud Search coming. Why? A ton of reasons. The biggest for me are:

  • Simplified On-Premises Deployment – If you have ever had to manage SharePoint on-premises, you know the search servers require the most care and management. No more search crashing on you because you did not give it enough resources. No more having to worry about timer jobs and full index crawls.
  • No More Having to Run Search Servers On-Premises – Moving this entire search workload to the cloud will make your on-premises deployments significantly smaller and easier to manage.

Up to this point, the only way to achieve a hybrid search experience was to set up query federation between SharePoint 2013 on-premises and SharePoint Online. This really did not make your SharePoint on-premises deployments any easier. Now we can do away with that configuration.

Here are some specific technical considerations.

  • You now have the ability with your SharePoint 2013 and 2016 on-premises farms to get the entire SharePoint search experience from the cloud. This is being termed as SharePoint Server with Cloud Search service application.
  • Scenario: If you have not leveraged SharePoint Online yet, you can use this as an opportunity to remove all your SharePoint on-premises search servers and use the cloud.
  • Scenario: If you have moved some or a large portion of your SharePoint on-premises content to the cloud, you now have the ability to get a unified hybrid search, and that entire search experience is delivered by the cloud.
  • Scenario: If you have a specific reason (e.g. legal, compliance) to keep some search servers on-premises, you can still do that and have query federation (picture below) between your on-premises SharePoint and SharePoint Online.
  • Hybrid search can go across the following content sources: SharePoint Server 2007, 2010 or 2013, File shares, BCS connectors*. This basically means that users in SharePoint Online can retrieve search results from all these locations.
  • I recommend that you get your SharePoint on-premises farms to SharePoint 2013 or higher to take full advantage of these features.
  • I recommend you seriously consider using search from the cloud. Why? With the amount of data organizations are acquiring and maintaining, using an elastic cloud to deliver those search results will only make your life easier :)
  • Delve Search experience will also be supported as part of this new search experience. So Delve can find files not just in SharePoint Online, but also in SharePoint on-premises.
  • eDiscovery and new SharePoint DLP capabilities will be able to go across SharePoint Online, SharePoint on-premises and OneDrive for Business. This is a big deal.
  • Configuration with Azure AD is required. You will do this anyways as an Enterprise Office 365 customer.
  • If you need custom IFilters, BCS connectors or Partner connectors, those will remain on-premises.
  • Cloud will not support: Site collection level schema mapping, Custom security trimming, Custom entity extraction and Content enrichment web service.
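
To make the scenarios above a little more concrete, here is a minimal Python sketch of how I would walk through the choice. The class, the function and the labels it returns are names I made up for illustration; none of this is a Microsoft API, and a real decision will involve more factors than one flag.

```python
from dataclasses import dataclass


@dataclass
class SearchPlan:
    topology: str        # label invented for this sketch
    index_location: str  # where the search index ends up living


def plan_search(must_keep_index_on_premises: bool) -> SearchPlan:
    """Map the scenarios in this post to a rough plan (illustrative only)."""
    if must_keep_index_on_premises:
        # Legal or compliance reasons: keep the on-premises search servers
        # and federate queries between the farm and SharePoint Online.
        return SearchPlan("query federation", "on-premises and Office 365")
    # Otherwise the whole search experience is delivered from the cloud,
    # whether or not any content has moved to SharePoint Online yet.
    return SearchPlan("cloud hybrid search", "Office 365")
```

Calling `plan_search(False)` covers both of the first two scenarios, since in either case the search experience comes from the cloud.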

Here is a high level logical architecture diagram

[Figure: high-level logical architecture diagrams]

The point I want to get across is that this makes your on-premises deployment so much simpler. It is hard to argue that your SharePoint on-premises life will not be made better.


References

https://blogs.office.com/2015/08/24/announcing-availability-of-sharepoint-server-2016-it-preview-and-cloud-hybrid-search/

https://channel9.msdn.com/Events/Ignite/2015/BRK3134

Wednesday, September 26, 2012

SharePoint 2013 Technical Diagrams Notes

Introduction

SharePoint 2013 Preview Technical Diagrams are now available here - http://technet.microsoft.com/en-us/library/cc263199(v=office.15).aspx

Ever since Microsoft started publishing these technical diagrams with SharePoint 2007, I have recommended that architects become very familiar with them. I always start here when trying to understand a new major product release for SharePoint. If you search my blog, you will see that I have directly referenced these diagrams when building SharePoint strategies for customers.

The following is a high-level review of the new architecture changes available with SharePoint 2013.

Corporate Portal Diagrams

I reviewed the two new Corporate Portal Diagrams for SharePoint 2013. From a logical architecture perspective, these diagrams do not have any major changes from the SharePoint 2010 versions. These diagrams accurately capture how organizations should build web applications, site collections, application pools, SharePoint services, etc. to support major business initiatives. The diagrams are still a must read for people who are new or who need a refresher to understand how they should be segregating content and business functions across SharePoint.

Extranet Diagram

The new diagram for SharePoint 2013 extranet architecture closely resembles the corporate portal diagrams; however, it does not reveal much of the information organizations need when deciding how to deploy an extranet. Looking back at the SharePoint 2010 Extranet Topologies diagrams (http://technet.microsoft.com/en-us/library/cc263199.aspx), I find those diagrams to be much more helpful, and the information contained there still holds true with SharePoint 2013. I would recommend reviewing both of them together.

Services in SharePoint 2013 Diagram

I admit this has always been one of my favorite diagrams. When this was released in SharePoint 2010, it captured a fundamental change in how SharePoint services are configured and delivered. This new architecture was created to support Microsoft’s ability to deliver SharePoint Online as a SaaS solution.

I reviewed this diagram and nothing has significantly changed with regard to sharing services across farms, the logical architecture of services, service groups and service deployment.

In the services table there are a few new services that have been added.

  • Access Services – Do not be confused by this. Yes, there was Access Services in SharePoint 2010. At this early point, I know that Access Services for SharePoint 2013 has been changed to be more focused on utilizing the new App Architecture. As such, Access Services for SharePoint 2013 is pretty different. Access Services solutions created in SharePoint 2010 will still be supported moving forward; however, they will run in a different service.
  • App Management Services – This is a new service that will be used specifically for supporting the new internal catalog or the public SharePoint store. Remember that in SharePoint 2013, everything is an app; EVERYTHING. Even everyday SharePoint lists are now called apps. Once you get over the name change, you will find it makes complete sense: Microsoft has just aligned what it does with how business users talk about technology.
  • Machine Translation Service – This is a new one and as of right now, I do not have much information on the purpose of this service other than the description which says it performs automated machine translation.
  • Work Management Services – This service provides task aggregation across management systems including SharePoint, Exchange and Project Server. This is huge from a user perspective. One single place to see all of your tasks. No more building content query web parts to find all tasks; this effectively does this plus goes outside the SharePoint boundary to find more tasks. This is a very exciting service.
  • Office Web App Services – Is called out in here as a service that no longer runs inside of SharePoint Server. Why? Microsoft’s strategy is to provide Office Web App Services to enterprise applications other than just SharePoint, and it strategically made sense to move it out of SharePoint.

In the rest of this diagram there are architecture diagrams for how to architect service groups across farms, none of which have changed from SharePoint 2010. If you are not familiar with this stuff, this is a must read and I recommend reading my old posting on it here.

Mobile Architecture Diagram

There is a brand new mobile architecture diagram provided, and obviously this is driven by Microsoft’s focus on being a “services and devices” company. This is a pretty simple architecture that basically describes some things you need to think about if you are going to support mobile for your users and discusses some of the mobile capabilities. This can serve as a launch point for you to begin to dive deeper into how you will support mobile for your organization. The following are some high-level observations I had when reading this the first time:

Extranet – If you are not thinking extranet, you need to, so your mobile users can access content when they are on a mobile device. There are some diagrams which will get you started thinking about it, and additionally about how you can use Unified Access Gateway (UAG) as a reverse proxy to help with that.

Mobile Device Management (MDM) – One interesting thing brought up in this diagram is how do you manage mobile devices? If you need something simple, you can leverage Exchange ActiveSync for remote device wipe, password enforcement, etc. If you are looking for application level MDM there are additional solutions out in the marketplace today that provide even more capabilities.

Application Architecture – The new SharePoint 2013 mobile architecture is introduced. They break it down basically into two logical layers: mobile and SharePoint. Some key points are:

  • Automatic Mobile Browser Redirection – Is a new capability that can be used to optimize the mobile experience based on the connecting device. This Feature must be active on the site and will be activated by default on numerous site templates. First there is the Classic View, which is used to provide backwards compatibility for mobile devices and will have a SharePoint 2010 mobile browser experience. Then there is the Contemporary View, which is geared to support HTML5 browsers. The Contemporary View has several enhanced features for navigation of SharePoint sites. Additionally, Full Site View is available so the SharePoint site page can be viewed as if it were on a desktop browser or a tablet device.
  • Office Hub for Windows Phone – Is an application for Windows phone devices that provides enhanced capabilities to access SharePoint content from multiple places in one spot. It also leverages mobile Office.
  • Location – There is a new geo-location field type that is available in a SharePoint List. This can make a list location aware to capture latitude and longitude which can be used with map applications. For instance, if a user enters in data on their mobile device, it will capture where it was done from and then can be displayed on a map. Here is some more information about this - http://technet.microsoft.com/en-us/library/fp161355(v=office.15).aspx
  • Push Notifications – There is a new capability to allow notifications to be sent from a SharePoint site to registered applications running on a mobile device. The nice thing about this is that Windows Phone Apps can receive notifications without having to poll. Here is some additional reading on the topic - http://msdn.microsoft.com/en-us/library/jj163784(v=office.15).aspx
  • Device Channels – This is a really important new capability, as device channels allow you to deliver a publishing site geared specifically to support different types of remote devices. Basically the site can be mapped to multiple master pages and style sheets, and you can even control what content you want to make available to specific devices. Here is an overview on the new device channels - http://technet.microsoft.com/en-us/library/fp161351(v=office.15).aspx
  • Office Web Apps – As mentioned earlier in this posting, Office Web Apps is now a separate standalone server which does not run inside of the SharePoint boundary. Office Web Apps has been improved a lot to support mobile devices. There are Word, Excel and PowerPoint Mobile Viewers.
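
To show how I think about device channels, here is a minimal Python sketch of the idea: each channel carries a list of user agent substrings, the channels are checked in order, and the first match decides which master page is used. The channel names, master page file names and the case-insensitive matching are my own assumptions for illustration; this is not SharePoint code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DeviceChannel:
    name: str
    inclusion_rules: List[str]  # user agent substrings for this channel
    master_page: str            # hypothetical master page file name


def resolve_master_page(user_agent: str,
                        channels: List[DeviceChannel],
                        default_master_page: str) -> str:
    """Return the master page of the first channel whose rule matches."""
    ua = user_agent.lower()
    for channel in channels:  # order matters: first match wins
        if any(rule.lower() in ua for rule in channel.inclusion_rules):
            return channel.master_page
    # No channel matched, so fall back to the default experience.
    return default_master_page


channels = [
    DeviceChannel("WindowsPhone", ["Windows Phone"], "phone.master"),
    DeviceChannel("Tablet", ["iPad"], "tablet.master"),
]
page = resolve_master_page("Mozilla/5.0 (iPad; CPU OS 6_0)", channels, "seattle.master")
```

The ordering is the design choice worth noticing: a phone user agent that also contains a more generic substring should hit the more specific channel first.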

SharePoint 2013 Upgrade Process Diagrams

There are two upgrade diagrams that have been provided. Here are some high points I walked away with:

  • Must be on SharePoint 2010 – To upgrade, you must be on SharePoint 2010 technologies. This means if you are on SharePoint 2003 or 2007, you will need to upgrade to SharePoint 2010 first before you can get to SharePoint 2013. There are Microsoft migration partners that have solutions to assist with this. As I have said many times, this is the big value proposition for using SharePoint Online, as this is handled for customers.
  • Database Attach Upgrade – Is the only supported method for upgrading. There is no more “in-place” upgrade option. Frankly that is fine because most customers went down the database-attach upgrade path anyway.
  • Preparation – much of the preparation activities that we have discussed in the past with SharePoint 2010 hold true with SharePoint 2013. There is a bunch of information you are responsible for gathering.
  • Manual Configuration Settings – In the preparation phase it is recommended that you get an understanding of all the custom configurations you may have done, because not all of them are going to be migrated. This is because not all databases are upgraded. So many custom configurations in Central Administration such as alternate access mappings, timer job tweaks, managed paths, incoming/outgoing email settings, certificates, etc. will need to be documented and reconfigured in the new farm.
  • Databases That Can Be Upgraded – There is a set of databases that can be upgraded. They are Content, BDC, Managed Metadata, PerformancePoint, Secure Store, Search and User Profile databases.
  • Customizations – This is an important task that needs to be completed. I have seen many cases where good software organizations have not implemented a strong configuration management process, and the result is that an organization may not know about all the custom code that has been implemented. There are numerous ways to find all of it: running PowerShell commands, doing system directory diffs, checking web.config, etc.
  • Upgrade Health Checks – There are some new features that are available to site collection administrators that will show you a health check of a site collection before actually upgrading the site collection.
  • Evaluation Site Collection – Site Collection Administrators also have the ability to request the site collection be copied into a new site collection to evaluate how the upgrade will affect any customizations they may have. This is helpful so you can remediate issues before you actually perform the upgrade. This is also nice because your site collection will run in a SharePoint 2010 mode until you are ready to actually upgrade it.
  • Testing – Just like for SharePoint 2010, the best way to prepare for a migration is to build up your new SP 2013 farm and then do multiple practice runs of that upgrade into the new production environment. An entire process is defined in one of the diagrams and is a great place to start.
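
The "must be on SharePoint 2010" rule is easy to reason about as a chain of hops. Here is a small Python sketch that lists the versions you would have to pass through, assuming each release can only be upgraded from the one immediately before it. The function and the version list are mine, purely for illustration.

```python
from typing import List

# Major releases in order; each hop is assumed to start from the prior one.
UPGRADE_ORDER = ["2003", "2007", "2010", "2013"]


def upgrade_path(current_version: str, target: str = "2013") -> List[str]:
    """Return the versions to upgrade through, ending at the target."""
    start = UPGRADE_ORDER.index(current_version)
    end = UPGRADE_ORDER.index(target)
    if start > end:
        raise ValueError("cannot upgrade backwards")
    return UPGRADE_ORDER[start + 1:end + 1]


hops_from_2007 = upgrade_path("2007")
```

For a farm on 2007 this gives two hops, 2010 and then 2013, which is exactly why the partner migration tools and SharePoint Online look attractive.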

SharePoint 2013 Search Diagrams

If you are a reader of my blog, you know I wrote some long postings about the search architecture for both SharePoint 2010 Search (here) and FAST for SharePoint 2010 (here). I am not going to do a deep dive into all these search components and roles because they are basically covered there. As many people now know, the FAST search engine is now the core search engine for SharePoint. It will just be referred to as SharePoint Search. Now you will be able to leverage a very powerful search engine out of the box. However, many of the advanced enterprise features of search will only be available in the SharePoint Enterprise edition. I am also really excited about this for SharePoint Online because it can leverage FAST too. SharePoint Online will not be able to do Enterprise Search of line of business systems, but a Search Farm (which is FAST underneath the hood) can be configured on-premises and SharePoint Online can invoke that search and provide the search results in the cloud; pretty exciting.

I highly recommend taking the time to review both of these diagrams. They explain how each of the components interacts with the others. Additionally there is a diagram that goes into how to scale the server farms for the amount of content you will need to index. There is a great, new table in there that shows you how scaling will work. To be honest, the folks who are really serious about search will say it is an art and a table does not always communicate how you will do it. It always comes down to how many items, the types of data sources, custom transformations, query latency, index latency, etc.

SharePoint 2013 App Overview Diagram

This is an area I plan to explore a lot more this coming year and write about on this blog. Why? This is something we have been waiting on for a long time with SharePoint development. There are several ways to look at this. SharePoint Features, which we have been writing for years, are Apps. This is a name change to better communicate our technology to the end users who have to use SharePoint. However the new SharePoint App architecture is way more than that.

I have seen so many things over the years.

  • I think one of the biggest challenges people would run into is developing great SharePoint solutions only to find out they incorporated some dependency they should not have, they wrote some high-end code that should not be running in the SharePoint layer, they cannot leverage their solution outside the SharePoint boundary, etc. We want to resolve those problems by helping developers to deploy solutions in way that will keep their SharePoint environment nimble.
  • Plus we want to provide third-party vendors quicker access to customers. We want to help customers to quickly acquire third-party solutions.
  • Additionally we want to allow customers to leverage commodity based SharePoint Online. As you may know, SharePoint Online has restrictions on high-end custom development, and if that code were to run in another location while being highly integrated with SharePoint Online, that is a huge win.

I will think of many more reasons this year on why this is so great :)

Now we can achieve this through the new SharePoint App architecture. The old SharePoint Solution architecture where you create a WSP is still around. Nothing has changed there. This is used to create deployment packages and in many cases is used to deploy code that requires full trust. SharePoint Solution packages will continue to be used by third-party vendors or developed internally with such tools as Visual Studio. You can still create Sandbox Solutions which run in a more secure runtime and can be deployed in SharePoint Online.

Now the new Apps framework for SharePoint 2013 is packaged up in a file called .APP. It is composed of many of the same types of files: AppManifest, embedded Solution.wsp, etc. Once an app is loaded into SharePoint, it is accessible through the App Catalog in SharePoint. This App Catalog can be controlled at an organizational level.

Remember the big point with Apps is that the custom solution you are writing may or may not actually run in SharePoint. Full trust code is not supported. Your custom solution code itself may run in a different SharePoint farm, on an IIS server as ASP.net pages, as ASP.net pages running in Windows Azure, etc. So how does SharePoint access these solutions running outside the SharePoint context? In simple terms we have an IFrame (with some extensions) through which the external solution is made available. OAuth provides the secure connection for accessing SharePoint objects from a remote location. We will additionally use a new extended and robust event model and a remote client SharePoint library to write integrated, remote code.
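
To give a feel for what "remote code plus OAuth" looks like, here is a minimal Python sketch that only builds (and does not send) a request for the lists in a site, passing an OAuth access token as a bearer token. I am using the `/_api/web/lists` REST endpoint as one example of remote access; the site URL and the token value are placeholders I made up, and how the app obtains the token is out of scope here.

```python
import urllib.request


def build_lists_request(site_url: str, access_token: str) -> urllib.request.Request:
    """Build a GET request for the lists in a site, authorized via OAuth."""
    endpoint = site_url.rstrip("/") + "/_api/web/lists"
    return urllib.request.Request(
        endpoint,
        headers={
            # The access token issued to the app travels as a bearer token.
            "Authorization": "Bearer " + access_token,
            # Ask for JSON rather than the default Atom/XML response.
            "Accept": "application/json;odata=verbose",
        },
        method="GET",
    )


# Placeholder values for illustration only.
request = build_lists_request("https://contoso.sharepoint.com/sites/dev/", "PLACEHOLDER_TOKEN")
```

The point of the sketch is that nothing in it runs on a SharePoint server: the code could live on any web server as long as it holds a valid token.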

Why is this so great? We are going to ensure that custom applications and solutions that are being developed with SharePoint are isolated. No more writing a bunch of code and services that should not be running on SharePoint servers. It is great that you can do whatever you want with SharePoint; however, this will drive better solution management.

So you may be asking where does this get deployed? There are many different options for hosting.

  • SharePoint Hosted – This means the app and all the resources run in SharePoint. Remember that server-side code is not supported; however, you can write applications with SharePoint’s JavaScript libraries and such.
  • Windows Azure Autohosted – This is a model that is only supported in SharePoint Online. In this case you can write an App package that will have code for Azure and SQL Azure embedded into it. When the application is deployed, the Azure solution will be automatically deployed for you. You do not have to go to Azure and set anything up at all; it is all handled for you behind the scenes.
  • Provider-Hosted – This is the third model where custom code and solutions are hosted in a separate server in your organization, hosted in Azure, hosted in different SharePoint servers, etc.
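
Here is how I would boil the three hosting options down to a quick decision, as a minimal Python sketch. The function name, the three flags and the order of the checks are my own simplification of the notes above, not an official decision tree.

```python
def pick_hosting_model(needs_server_side_code: bool,
                       targets_sharepoint_online: bool,
                       wants_auto_provisioned_azure: bool) -> str:
    """Suggest a hosting model from the three options described above."""
    if not needs_server_side_code:
        # Client-side code only (JavaScript libraries) can live in SharePoint.
        return "SharePoint-hosted"
    if targets_sharepoint_online and wants_auto_provisioned_azure:
        # Autohosted is only supported with SharePoint Online.
        return "Windows Azure autohosted"
    # Everything else runs on infrastructure you provide and manage.
    return "Provider-hosted"
```

For example, `pick_hosting_model(True, False, True)` falls through to provider-hosted, because the autohosted model is not available to an on-premises farm.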

Once an App package is installed, it can be managed and monitored through the catalog. End users have the ability to select an app to run in their sites (much in the same way as turning on a Feature). If and when an app is updated, the user can decide how they want to upgrade to the new app.

Again I really plan to go much deeper on this in my blog but for right now, these are just some introduction notes and ramblings on how excited I am about this new capability :)

Back Up and Recovery Diagram

There is a new diagram that goes into the details of doing your own back-up and recovery for SharePoint 2013. I know many people have become accustomed to using third-party vendors for supporting these operations and I still believe these vendors will continue to provide features above and beyond what is out of the box. However, if you are a do it yourself sort of person, this is a great diagram to review.

Not much has changed in regards to developing back-up procedures for both the SQL Servers and the SharePoint Servers. There are tons of scenarios covered in here, and I recommend reading this if it is important to you.

SharePoint 2013 Database Diagram

Finally the database server diagram has been updated. This is a really really really important diagram to review if you are managing on-premises servers. It goes over all the SharePoint databases, plus provides sizing and scaling guidance. Great information.

Tuesday, March 13, 2012

SQL Server 2012 Brings New Features and Capabilities for SharePoint 2010

Introduction
If you are a SharePoint customer or architect, you really should be looking at the new SQL 2012 release. As we all know, SharePoint success is highly contingent on SQL Server. You need a strong SQL Server deployment to ensure there is good performance, high availability, back up/recovery, etc. Additionally Microsoft delivers their Business Intelligence (BI) stack through SharePoint and there have been several new features and capabilities added there as well.
For SQL 2012 there are three areas where there have been major capability additions:
  1. Mission Critical Confidence – This is the capability to more easily deliver high-availability with lower total cost of ownership.
  2. Breakthrough Insight – New and expanded capabilities for Business Intelligence.
  3. Cloud on Your Terms – Additional capabilities to create SQL databases in either the private or public cloud (Office 365).
In this blog I am going to cover some of the new SQL 2012 capabilities, specifically focusing on how they can improve your SharePoint 2010 implementation. Please note that there are a lot of new capabilities for SQL Server 2012 which I have not covered such as data warehousing, resource management, full text searching, auditing, Big Data (Hadoop), etc. Here is a good reference to quickly spin up on all the new capabilities (“What's New in SQL Server 2012 Whitepaper” located here).
Mission Critical Confidence
AlwaysOn
First and foremost is the new high-availability solution in SQL Server 2012 called AlwaysOn. One of the biggest challenges that organizations face when deploying SharePoint 2010 is providing a solution architecture that will meet the SLAs of their business users. Business processes, technical processes and Governance need to be put in place to ensure that SharePoint 2010 will be up as much as possible.
In the past with SQL Server 2008 R2, you employed such solutions as clustering, mirroring, log shipping and replication (my previous blog on this topic). However this could require a lot of planning, configuration, and management. With SQL 2012 AlwaysOn, new configuration wizards and tuning tools are provided that make set-up and configuration of High-Availability extremely simple.
The concept of Availability Groups has been added, which specifically makes configuration of Database Mirroring easier. An Availability Group is a logical group of databases that fail over together. Through the configuration wizard, you can determine if you need such things as automatic or manual failover, set-up of primary and multiple secondary instances, synchronous or asynchronous data movement, etc.
Availability Groups remove the need for shared disk storage (SAN or NAS) for deployment of a failover cluster instance. Note that AlwaysOn Failover Cluster Instances support multiple-site clustering across subnets, which subsequently enables cross datacenter failover.
This feature is very useful when setting up your SharePoint 2010 (or 2007) farm because High Availability is one of the most important requirements when setting up a mission critical SharePoint environment.
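To picture what "a group of databases that fail over together" means, here is a toy Python model. It is only a sketch of the concept: the class, the instance names and the database names are made up, and it says nothing about how SQL Server actually implements failover.
```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AvailabilityGroup:
    name: str
    databases: List[str]   # databases that move as one unit
    primary: str           # instance currently serving the databases
    secondaries: List[str] = field(default_factory=list)

    def fail_over(self, new_primary: str) -> None:
        """Promote a secondary; every database in the group moves with it."""
        if new_primary not in self.secondaries:
            raise ValueError(new_primary + " is not a secondary of " + self.name)
        self.secondaries.remove(new_primary)
        self.secondaries.append(self.primary)  # old primary becomes a secondary
        self.primary = new_primary

    def hosting_instance(self, database: str) -> str:
        """All databases in the group are served by the current primary."""
        if database not in self.databases:
            raise KeyError(database)
        return self.primary


group = AvailabilityGroup("SP_AG", ["WSS_Content", "SP_Config"], "SQL01", ["SQL02"])
group.fail_over("SQL02")
```
After the failover, both the content and the configuration database report the same new primary, which is the property that matters for a SharePoint farm.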
Recovery Advisor
Database Recovery Advisor provides many new features and capabilities for backing up and restoring databases through SQL Server Management Studio. The new Recovery Advisor streamlines the restore process. One such feature is a visual timeline that shows the backup history and all the available points from which you can restore.
There is also a new capability called Split File Backup which allows you to split a backup into multiple files. This allows for quicker backups and restores because the files can be written and restored across disks running in parallel.
This is very helpful for reducing the time it takes to work with large SharePoint 2010 (or 2007) Content Databases that have grown significantly.
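One way to express a backup that is split across files is a T-SQL BACKUP DATABASE statement with several DISK targets. Here is a minimal Python sketch that only builds that statement as text; the helper name, the database name and the file paths are examples I made up, and I am not claiming this is what the tooling generates behind the scenes.
```python
from typing import List


def striped_backup_statement(database: str, file_paths: List[str]) -> str:
    """Build a BACKUP DATABASE statement that writes to several files."""
    if not file_paths:
        raise ValueError("at least one backup file is required")
    # Escape closing brackets and single quotes so the names stay literal.
    safe_db = database.replace("]", "]]")
    targets = ",\n   ".join(
        "DISK = N'" + path.replace("'", "''") + "'" for path in file_paths
    )
    return "BACKUP DATABASE [" + safe_db + "]\nTO " + targets + ";"


statement = striped_backup_statement(
    "WSS_Content",
    ["E:\\Backups\\WSS_Content_1.bak", "F:\\Backups\\WSS_Content_2.bak"],
)
```
Putting each file on a different physical disk is what lets the writes run in parallel; two files on the same spindle will not buy you much.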
Breakthrough Insight
As you probably already know SharePoint 2010 is where Microsoft’s Business Intelligence (BI) stack is delivered. This is a combination of solutions such as Excel Services, PerformancePoint, Visio Services, Chart Web Parts (Dundas), PowerPivot, and SQL Reporting Services (SSRS). With the release of SQL 2012 a new solution called Power View is now provided, PowerPivot can now be done on SharePoint 2010 server side and there is significantly enhanced integration with SSRS.
Power View
Power View is a new, highly interactive data exploration and reporting tool that allows business users to visually explore data in an ad-hoc fashion. End users can create reports and dashboards very quickly, create shapes and graphs with a few clicks, create animations, highlight data based on rules and drill down through relationships, and it performs very well with large datasets. The design environment is very similar to Office. It can be published through SharePoint 2010 as shown in the diagram below.
image
Note that Power View requires SQL Reporting Services (SSRS) to be integrated with SharePoint 2010, and an instance of SQL Server 2012 Analysis Services (SSAS) or PowerPivot must be available.
PowerPivot
Up to this point PowerPivot was a solution that was available through Excel client only via an Add-in. This is a really powerful, end user friendly capability that allows for data analytics. There are actually several new capabilities that have been introduced that I recommend you read up on (in references below).
Specifically to SharePoint 2010, PowerPivot for SharePoint is available as an add-in and can run server side in conjunction with Excel Services. This now allows end users the ability to publish PowerPivot reports through the browser allowing end users easy access to this data.
image
As well, in SharePoint 2010 Central Administration there is a new PowerPivot Management Dashboard that provides several reports on performance of reports.
image
SQL Reporting Services (SSRS)
Personally the enhancements to SSRS are very exciting to me, as I have had to configure SSRS with SharePoint 2007 and 2010 in production environments in the past. If you have done it before, you may recall it was a tedious task to work through the SSRS Configuration Wizard and get Kerberos Authentication set up correctly. Now with SQL Server 2012, configuration of SSRS with SharePoint 2010 is completely done through Central Administration. Additionally there is a new service in SharePoint 2010 that runs SSRS; it supports WCF/Claims authentication, integrated ULS logging, built-in load balancing across SharePoint servers and report performance improvements, and there are PowerShell commands for management.
image
Additionally there is an SSRS alerting capability that allows end users to subscribe to alerts that are associated with an SSRS report. Users can create conditions for any report, and when they are met a notification will be sent to them.
image
One last change: you may know that SSRS reports can generate Word or Excel documents. Up to now, the file formats generated were either .doc or .xls, which means they were not Open XML document renditions. With SQL Server 2012, SSRS will now generate .docx and .xlsx file types.
Cloud on Your Terms
With SQL Server 2012, there are a bunch of new capabilities to support even tighter integration with SQL Azure. You will have a unified database development experience between SQL 2012 on-prem and SQL Azure. The reason why I bring this up is because it is very common that when creating a solution inside SharePoint, you will have complex data management requirements that can be better supported with a SQL Server database instead of a SharePoint list.
Now we have the ability to more quickly and efficiently create custom databases in SQL Azure which can be utilized with SharePoint 2010. Plus this is really good for working with Office 365 SharePoint Online because we can make a cloud to cloud connection to work with advanced data structures.
References

Wednesday, January 4, 2012

Architecture Considerations for Moving SharePoint to Office 365 and SharePoint Online

Introduction
Why are customers looking to come to the Office 365 cloud? Customers want an environment that can scale on-demand. Customers do not want to be in the business of managing and patching software. Customers want a solution that gives better business continuity and service level agreements to their users. Customers want solutions that are better governed and that will force them to do better governance. Customers want to be in the business of building solutions (i.e. an airplane company should focus on building planes; not writing enterprise applications from the ground up).
When I talk with customers about SharePoint Online, this is where they want to move to.
In this blog I am going to talk about how an organization should be looking at moving to SharePoint Online.
  1. In the first half I am going to analyze SharePoint Architecture and Governance and review what potential blockers people see when moving to the cloud.
  2. In the second half I am going to have a more detailed discussion around how to architecturally plan to move to SharePoint Online.
Why the decision is not always straight forward?
Coming to SharePoint Online in Office 365 may not sometimes be the most straight forward decision for organizations that have made a large investment in SharePoint. Going to SharePoint Online is really easy for organizations that are in it for document management, intranets, OOB features, SharePoint Designer, etc. It gets harder for organizations that have had poor governance of their on-premise environments, made signification investments with custom code, utilize features not available in cloud, have a heavy reliance on third party solutions, etc.
There is a solution to these issues but it requires an organization to take a step back and look at what they have.
Poor governance is probably one of the largest challenges organizations have faced with SharePoint. These challenges could have been avoided, though, with good forethought and planning. There is not always a one-size-fits-all solution either; SharePoint is a powerful platform that can be used to implement a very broad set of business requirements.
SharePoint Architecture Foundations
The following is a Venn diagram that I always draw on the whiteboard with customers.
image
I typically say that:
  • Information Architecture – Is content types, taxonomy, topology, site collections, sites, libraries, lists, and solutions that drive the management of this content. The information architecture should be driven by business requirements.
  • Logical Architecture – Is the architecture of SharePoint services, web apps, databases, etc. to support the information architecture.
  • Physical Architecture – Is the configuration, deployment, farms, etc. that actually host the SharePoint logical architecture.
I admit this is open to interpretation but let’s use this as a foundation for discussion.
Here are some of the biggest problems organizations have that come up when I discuss the diagram above:
  • Started with Physical Architecture – This is the first mistake that many seem to make. Smart IT people are so concerned about how many servers to buy and configure that they forget to actually create the environment needed to support their business requirements. What usually happens is the environment is not scaled or organized to support the actual utilization; this is where the trouble starts. This pretty much goes away if you go to the cloud.
  • No business requirements and software design best practices – One of the biggest, most consistent challenges I see many organizations have with SharePoint is no business requirements. I typically just see organizations create very light documents (a bullet list) of requirements and just start building. I very rarely see organizations actually follow software design best practices. There are no use cases, data models, ERDs, UI wireframes, data dictionaries, architecture documents discussing what SharePoint elements they plan to use, etc. Why? Because SharePoint is such an easy platform to start building with. I personally believe SharePoint is one of the best solution platforms on the market but sometimes organizations need to architect a solution first. This does not go away with SharePoint Online.
  • No Governance – Organizations usually forget to put Governance plans in place to manage content and solutions running in SharePoint. A solution can be anything from a group of team sites to a complex .NET application in SharePoint. I have even seen organizations create Governance plans but then not manage nor adhere to them. An analogy would be someone organizing a ton of boxes in their basement really well (with labels too). However as time goes on, they just keep on stuffing and stuffing the room with more boxes until entry to the room is blocked. A good SharePoint environment needs care and feeding. The information, logical and physical architectures need policies, procedures, service level agreements, etc. put into place to manage them. With SharePoint Online several of these Governance woes are removed because the Office 365 cloud manages the environment. Still, Governance needs to be put in place to manage content and solutions deployed in Office 365.
  • Over Regulation – I have also seen the term “SharePoint Governance” used as a crutch. I have seen organizations lock down SharePoint too much. My response to that is: was there a requirement to lock it down? If so, the right thing was done. On a publishing intranet or Internet site, for example, no user should have rights to do whatever they want; there should be a locked-down, controlled publishing and branding process. However it is perfectly acceptable to have areas of SharePoint that are pure collaboration. Your information architecture and governance plans will drive business users to put content and solutions in the appropriate areas.
There are more things to think about but I am getting a little preachy :)
A Not So Unusual Situation
Here is a very common scenario that backs up what I just described. It is very common for an organization to start a SharePoint environment like the following. As you can see they start with a top level SharePoint site. They create some department level sites and some sub sites for their intranet. All in all this is a pretty good start; right?
image
However within a few months (after some very heavy business and user adoption) we have the following.
image
We see such things as:
  • Team sites starting to sprawl underneath some of the sub-sites within a department. The challenge is being able to support collaboration in what was meant to initially be a publishing site.
  • Navigation, presentation, branding and user experience is not consistent.
  • We see custom applications either built from scratch or third party solutions purchased and embedded in sub-sites of a department. The challenge is should these custom solutions be hooked directly here?
  • Department level project sites are created. The challenge is that as project sites grow and take on new responsibilities for the business, they need to be elevated and made accessible like other sites.
There are tons more. And I have had customers tell me this is not an issue with “SharePoint”. They currently have these same exact challenges with their legacy intranet, portals, etc. So why do organizations keep on having these challenges? I usually point to a lack of attention to Governance and Information Architecture.
So How Should You Be Thinking About This?
What if content and solutions that are to be deployed are delivered within a framework? Not a “novel” idea either. Hopefully we can make it as simple as we can. Instead of just adding more and more sites and applications; have rules that drive where we put things.
image
For example:
  • The Intranet site should remain a dedicated publishing site collection. This would be geared towards business users having read-only access with only a small set of content owners. Content is delivered in a consistent and clean fashion so that every site is the same giving the user a unified user experience.
  • Create a separate site collection for team sites with appropriate service level agreements. In the team site collection, sites can be dynamically generated from pre-defined site templates. Users are given the ability to do pure collaboration, standing up lists, libraries, etc. They should be able to share information to complete everyday business tasks. There would also be an expectation that this area is not uniform and that users can do what they want with these sites. Retention policies can be put in place to discard these sites if they have not been used for a pre-defined period of time.
  • Create another site collection for business project sites. These have different support SLAs and Governance. These sites are a little bit more formal in nature with only a subset of people that would have access to manage them. Maybe there are designers from the IT department who have responsibility for building and supporting them. There could be policies and automated procedures to move content out of them to other site collections. Retention policies on these sites would be completely different than team sites and these sites may only be deleted when the project is over (or never deleted).
  • For custom applications, instead of embedding the application into a sub-site, place them into their own dedicated site collections. Navigation links can be made to the application from other site collections. This would give significantly more flexibility to move it or provide it more resources.
  • Instead of creating one massive site collection for all scanned documents or records, create multiple site collections and then route content to them based on business rules.
This list I provided is not meant to be an all-encompassing list of possibilities and/or solutions. Hopefully you will see that you need to start vertically and horizontally partitioning solutions and data based on such simple things as business rules, security access, data characteristics, etc. Knowing this will help you identify the types of site collections, sites, features, content types, managed metadata, site templates, permission levels, etc. that you need to configure SharePoint with.
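To make the idea of rule-driven placement a bit more concrete, here is a minimal Python sketch. Everything in it is an assumption made up for illustration: the content kinds, the `ContentItem` type, the `target_site_collection` function and the site collection paths are hypothetical and are not SharePoint APIs. The only point is that where something lives gets decided by explicit rules instead of by whoever creates the next sub-site.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ContentItem:
    kind: str                          # hypothetical kinds, see the rules below
    owner: str                         # department or application name
    record_year: Optional[int] = None  # only used for records


def target_site_collection(item: ContentItem) -> str:
    """Return the (hypothetical) site collection path an item belongs in."""
    if item.kind == "published-page":
        return "/"                     # dedicated publishing intranet
    if item.kind == "team-doc":
        return "/sites/teams"          # pure collaboration area
    if item.kind == "project-doc":
        return "/sites/projects"       # more formal, different SLA
    if item.kind == "application":
        return f"/sites/app-{item.owner}"  # one site collection per application
    if item.kind == "record":
        if item.record_year is None:
            raise ValueError("records need a year to be routed")
        # Spread records over several site collections instead of one massive one.
        return f"/sites/records-{item.record_year}"
    raise ValueError(f"no placement rule for kind {item.kind!r}")
```

An item with no matching rule raises an error on purpose: in this sketch, content without a home is a governance question to answer, not something to drop into the nearest site.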
The net result is you would take the previous diagram and start creating management boundaries based on the characteristics of the solutions and data managed in SharePoint. This is why Information Architecture is so critical for an Enterprise Content Management system like SharePoint.
Putting everything into a single site collection really puts an organization into a tough spot to scale with the business. Microsoft has been publishing great technical diagrams for SharePoint 2010 (http://technet.microsoft.com/en-us/library/cc263199.aspx) and SharePoint 2007 (http://technet.microsoft.com/en-us/library/cc263199(office.12).aspx). Please review and get to know these diagrams well because they accurately tell everything I am talking about here.
Now some people counter this whole thing with why does SharePoint not give this to me? Why does SharePoint not auto-govern itself? SharePoint absolutely comes with a ton of features and capabilities that support a good governance model like Features, site templates, sandbox, managed metadata service, permission levels, and the list goes on. All of the configuration settings are there and organizations just need to set them as appropriate for their business.
I promise; we are coming back to the SharePoint Online cloud but we need to finish setting the stage.
How You Should Change Your Thinking
The following is a diagram a colleague (http://blogs.msdn.com/b/edhild/) and I commonly discuss with customers. I am shamelessly using it because this simple diagram really helps customers with a basic understanding of Information Architecture with SharePoint. Plus it supports everything that I just discussed. This is what we call the “Arch of SharePoint Data”. This is not an all-encompassing list; however it does show the different extremes.
image
At one extreme there is publishing, which covers sites that are managed by a small group of users and read by a large community of read-only users. At the other extreme is something like MySites, which is my personal area to manage information and data. In between these two extremes are tons of different types of sites: department sites, project sites, organization sites, custom application sites, team sites, etc. There are too many to even try to draw. Each one of these types of sites has different security, retention, data usage, transaction management, business rules, automation, etc.
I tell customers that it is perfectly OK to have a team sites area that is for pure collaboration, where business users can spin up a site, do some work on it for two months and then move on. Some people call this the “wild wild west” and that is OK. Just do not allow pure collaboration in your publishing area, which is probably one of the most common mistakes :)
Coming full circle, you can see this is all about solution and data management. An Information Architecture is going to tell you the data characteristics. This will drive you to put content in one area versus another.
This is the beauty of SharePoint. It allows you to create and manage business workloads based on your specific business and mission. Once you have the Information Architecture nailed down, you can determine the Logical Architecture (services) you need and then the Physical Architecture needed to support this. By doing this, you know that both your Logical and Physical Architectures will be driven off real business requirements.
Where Does this All Fit with SharePoint Online?
Once you build an Information Architecture you will see that many of these partitions can be moved to the cloud. Intranets, team sites, project sites, my sites, light weight custom solutions, etc. can all be moved to the cloud. If you look at the “Arch of SharePoint Data” diagram, depending on your scenarios, there is a good chance that 80% to 90% of your solutions can be moved to the SharePoint Online cloud. For some organizations, they will be able to move everything up to the cloud. For some organizations they will have a hybrid.
Regardless, this re-architecture should be a primary task in your SharePoint 2007 to SharePoint 2010 migrations anyway. Going through this exercise will put you in a position to move pieces to the cloud and, with time, more and more.
So what are the Gaps with SharePoint Online?
If you want to know the exact answer it is clearly spelled out and completely available for you. Read either the Multi-tenant SharePoint Online Service Description (http://www.microsoft.com/download/en/details.aspx?id=13602) or the Dedicated SharePoint Service Description (http://www.microsoft.com/download/en/details.aspx?id=18128).
However really understanding these gaps depends on the perspective and approach you are taking with SharePoint. As I say a lot, “SharePoint means a lot of different things to different people”. I have seen customers extremely happy with using SharePoint with out of the box features and SharePoint Designer. I have also seen other customers use SharePoint as a full application development platform, writing thousands of lines of code. Knowing what SharePoint means to you will dictate your approach to the SharePoint Online cloud. The approach that I outlined in the first part of this blog will really help you with that decision process for moving to the cloud.
When you read the SharePoint Online service descriptions for both multitenant and dedicated you will see that the gaps are really small. However there are some you must be aware of as they will affect your decisions on how to move to the cloud. The big ones that most people bring up are full trust code, business intelligence, FAST search, PowerShell, and Central Administration. This list will quickly change and become outdated because more and more features will be released with time. However let me address each one as it stands today:
  • Full trust code is usually one of the first challenges. Right now, the only way to deploy custom code to the SharePoint Online Multitenant cloud is using a Sandbox solution. For the SharePoint Online Dedicated cloud, full trust code is supported but it must adhere to a strict set of rules (which are publicly available) and the code will only be deployed within set windows. Why such restrictions and strong governance? Well, for all the obvious reasons that would come up if you were tasked with having to run an extremely large SharePoint environment on-premise. What has been one of the biggest issues with SharePoint 2007 Governance? It was developers writing complex code on SharePoint and disregarding the fact that an error they write may take down other sites in the farm (like the content query web part that retrieved too much data on the home page of the intranet <g>). We need to ensure that there is strong security and good performing code and that there is no possibility that Company A can take down Company B. The only way to achieve that service level agreement is to have an environment for running governed code. The net effect is that there will be limitations but you will have that guaranteed uptime. My golden rule is that all SharePoint development (on-premise and cloud) should begin with Sandbox solutions, and only when the customizations cannot run in the Sandbox should you build them as full trust solutions. Doing this will ensure you have the agility to move to the SharePoint Online cloud when you are ready. If you really need to do complex operations and manage data structures, consider using services in Windows Azure integrated through Business Connectivity Services (BCS), Silverlight, etc. But still that may not always suffice and that is why SharePoint Hybrid implementations will be commonplace for organizations with mature SharePoint deployments. I will cover this in more detail shortly.
  • Business Intelligence today in Office 365 is Excel and Visio Services. Other SharePoint business intelligence services like PerformancePoint, SSRS, PowerPivot and Chart Parts are not available right now. Another limitation today is that Excel and Visio Services only utilize data that is within the SharePoint Online context; they are not able to reach outside (i.e. to back-end databases). More and more capability will be released through the Office 365 cloud; for the time being Silverlight and Windows Azure can be used to satisfy these requirements.
  • FAST Search is not available in the SharePoint Online cloud right now. It is possible to integrate a local FAST farm with SharePoint Online Dedicated; but not Multitenant. Still take comfort in the fact that SharePoint 2010 search made significant jumps forward from the SharePoint 2007 offering and provides a very strong search experience. With time more and more advanced search capabilities will be released in the cloud.
  • PowerShell currently is not fully available with SharePoint Online. There are a lot of PowerShell commands available for Exchange Online and user subscription management; however the full set of SharePoint PowerShell commands is not available today.
  • Central Administration Site is not available and this should be expected by anyone who understands cloud architecture. There are administration screens available for some operations that you would normally perform in Central Admin; however they are limited and do not give you the same granular control. Why? Well this is the cloud. Customers want to come to the environment so they do not need to be in the business of managing every little configuration of SharePoint.
And the reality is that with more and more releases of SharePoint Online this gap is going to continue to close. Still, even if it were to completely close, there will be perfectly valid reasons why some SharePoint may remain on-premise, thus creating a SharePoint Hybrid environment.
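The "Sandbox first" golden rule from the full trust discussion can be written down as a small decision function. This is only a sketch under my own assumptions: the requirement labels, the `OFFLOADABLE` set and the `choose_deployment` function are invented for illustration and are not part of any SharePoint tooling. The input is the set of requirements you have already found cannot run in the Sandbox.

```python
# Requirements that (in this sketch) can be moved out of SharePoint and into
# a Windows Azure service reached through BCS or Silverlight. Illustrative only.
OFFLOADABLE = {
    "external-data",
    "complex-computation",
    "line-of-business-integration",
}


def choose_deployment(blocked_in_sandbox: set) -> str:
    """Pick a deployment target given the requirements the Sandbox cannot meet."""
    if not blocked_in_sandbox:
        # Everything runs in the Sandbox, so it can move to SharePoint Online.
        return "sandbox solution"
    if blocked_in_sandbox <= OFFLOADABLE:
        # Keep the SharePoint piece sandboxed and offload the rest.
        return "sandbox solution + Windows Azure service"
    # Something genuinely needs full trust: on-premise or Dedicated (hybrid).
    return "full trust solution"
```

For example, `choose_deployment({"external-data"})` lands on the Sandbox plus Windows Azure option, while a requirement outside the offloadable set falls through to full trust.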
What is the SharePoint Hybrid Architecture?
SharePoint Hybrid is as simple as it sounds; it is some SharePoint delivered through the cloud and some SharePoint delivered through on-premise. What will drive you to have SharePoint on-premise? All of that has pretty much been covered to this point. Anything from a specific feature to a business policy may keep some SharePoint on-premise. However using the approach I laid out, you will be able to significantly reduce your footprint of SharePoint on-premise and gain the advantages of the cloud.
The great thing about using SharePoint on-premise is that it is the same software being run in the cloud. This means it is very easy to deliver a consistent user experience between on-premise and the cloud. Branding, navigation, security groups and single sign-on can be configured in such a way that the user can go between these two environments and not know it.
What Type of Cloud is SharePoint Online?
One other thing I want to discuss is what type of cloud SharePoint Online is. I really like this picture; it really spells it out for people who are not fully aware of the multiple delivery models for cloud computing.
image
Starting on the left side, you see on-premises and this is how most organizations run SharePoint. You must own the entire solution, all the way up the computing stack. Next is Infrastructure as a Service (IaaS), which is a cloud environment that is managed for you all the way up to the virtualization layer. The company is responsible for everything else, including the management of the operating systems. From a SharePoint perspective that means all the software installation, configuration, management, patching, adding new servers to meet demand, load balancing, etc. needs to be managed by you. IaaS gets you out of the business of hosting virtual SharePoint servers.
Platform as a Service (PaaS) is running the environment all the way up to the application and data tiers. Windows Azure is a PaaS cloud. In this cloud you build custom applications, data models and run them through a highly available environment.
Software as a Service (SaaS) is the entire stack delivered in the cloud and this is the delivery model for SharePoint Online. You do not have to install software, manage patches, availability, etc. This environment gets you out of the business of managing software and into the business of building solutions. SaaS does not provide the granular level of control of the SharePoint environment (which we have discussed).
The reality is that organizations and companies need to save costs and SaaS is the cloud delivery framework they want for the long-run. IaaS can run SharePoint 2010 and can be leveraged as a replacement for on-premise complex SharePoint computing. Still it is well recognized by industry that companies want more SaaS solutions.
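The delivery models described above boil down to "how far up the stack does the cloud provider manage things for you". Here is a minimal Python sketch of that idea. The layer names in `STACK` are my assumption of a typical computing stack (the picture itself is not reproduced in the text), and `customer_managed` is a hypothetical helper, so treat the exact layers as illustrative.

```python
# Layers of the computing stack, bottom to top (assumed, illustrative names).
STACK = [
    "networking",
    "storage",
    "servers",
    "virtualization",
    "operating system",
    "middleware",
    "runtime",
    "data",
    "applications",
]

# How many layers, counted from the bottom, the provider manages per model.
PROVIDER_MANAGED_LAYERS = {
    "on-premises": 0,  # you own the entire stack
    "IaaS": 4,         # managed up to and including virtualization
    "PaaS": 7,         # you bring the data and the applications
    "SaaS": 9,         # the entire stack is delivered for you
}


def customer_managed(model: str) -> list:
    """Return the layers the customer still has to manage for a delivery model."""
    return STACK[PROVIDER_MANAGED_LAYERS[model]:]
```

Calling `customer_managed("IaaS")` starts at the operating system, which matches the point that on IaaS all SharePoint installation, patching and scaling is still yours to manage.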
Supplement SharePoint Online with Windows Azure
Windows Azure (PaaS) and SharePoint Online (SaaS) can be used together to deliver end-to-end cloud solutions. Companies that have mature SharePoint deployments commonly have:
  • Code that runs in full-trust
  • Complex data to manage
  • A requirement to do back-end systems integration.
One thing I have been talking with customers about is offloading that code out of SharePoint and into the Windows Azure (PaaS) cloud. The Windows Azure cloud allows you to develop and deliver custom code, services, complex data (SQL Azure) and connected back-end integration (AppFabric). These complex operations can be connected through Business Connectivity Services (BCS) in SharePoint Online.
This makes a lot of sense too when you take a step back. I have already said that you should develop SharePoint code for the Sandbox first and go outside the Sandbox only when there is good reason to. Here is a similar question. At what point do you know you should be developing in SharePoint? I fully recognize there is a gray area here.
I say good software development patterns and practices should drive that decision. This is why Windows Azure is so interesting with Office 365, because we can move complex code that cannot run in the SharePoint Sandbox to the Windows Azure cloud. I recognize this is not a perfect rule because some code needs to run in SharePoint as full trust. However this should be part of your design analysis to reduce your SharePoint on-premise footprint.
There are lots of different ways Windows Azure can be utilized with SharePoint Online. There may be situations where you need:
  • To work with data in SharePoint but you have complex relationships in the data model that are not right for SharePoint lists. Use SQL Azure to manage that data and build services and connect via BCS or Silverlight.
  • To integrate with line of business applications on premise, use custom services deployed in Windows Azure or AppFabric. Again BCS or Silverlight can connect to Windows Azure, which is the conduit to the line of business applications.
  • To use custom web services to perform complex computations and logic. Again offload that to Windows Azure.
  • As well, reverse the direction. There may be solutions and applications delivered in Windows Azure that can utilize SharePoint Online services. There are SharePoint web services, REST services for data, the SharePoint Client API, JavaScript APIs and Silverlight APIs that can be used as integration points. A simple scenario could be a web page deployed in Windows Azure that needs to manage documents. Instead of building that up from scratch in Windows Azure, just connect through the SharePoint Online APIs and deliver SharePoint Online services through that custom web page.
At the end of the day both SharePoint Online and Windows Azure can be used to complement each other in the delivery of enterprise business solutions through the cloud.
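As a small illustration of the "reverse the direction" scenario, here is a Python sketch of how a custom application might build a request URL for the SharePoint 2010 REST data service (ListData.svc). It only builds the URL; authentication and the actual HTTP call are left out on purpose. The site URL, the list name and the `list_data_url` helper are assumptions for the example, and you should confirm that the REST endpoint is available for your own site before relying on it.

```python
from urllib.parse import quote, urlencode


def list_data_url(site_url, list_name, top=None, select=None):
    """Build a ListData.svc URL for a list, with optional OData query options."""
    base = f"{site_url.rstrip('/')}/_vti_bin/ListData.svc/{quote(list_name)}"
    options = {}
    if select:
        options["$select"] = ",".join(select)  # only return these fields
    if top is not None:
        options["$top"] = str(top)             # limit the number of items
    if not options:
        return base
    # Keep "$" and "," readable instead of percent-encoding them.
    return base + "?" + urlencode(options, safe="$,")


# Hypothetical site and list, for illustration only.
url = list_data_url(
    "https://contoso.sharepoint.com/sites/docs",
    "SharedDocuments",
    top=10,
    select=["Name", "Modified"],
)
```

A web page running in Windows Azure could issue an authenticated GET against a URL like this and render the results, rather than building its own document store.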
image
Conclusions
Why did I go through all of this? The answer is simple: to give SharePoint Architects ideas on how to move forward with SharePoint Online. There really should be no blockers as long as you take a realistic look at your Information Architecture and assess how you use SharePoint. There will be lots of stuff which can clearly be moved to the cloud, there will be some stuff where it is not appropriate (hybrid) and then there is that gray area. However I really hope the approach I put forth will help you with your thinking on how to significantly reduce, if not completely remove, your SharePoint on-premise footprint.
Thanks
Special thanks to Chris Geier and Stephen Cawood for providing me feedback and advice as I wrote this.

Tuesday, November 1, 2011

SharePoint 2010 Content Database Sizing

A lot of people already know this but on July 14, 2011 it was announced that Content Database Sizing has been updated http://technet.microsoft.com/en-us/library/cc262787.aspx#ContentDB.

There is still a recommendation to keep content databases at 200GB. Why? This is still a good recommendation because if you need to back up and restore a database very quickly, you do not have to move around large back-up files. Plus if you have an Information Architecture that drives content to specific site collections (with dedicated content databases), you will be much more agile in responding to requirement changes, upgrades, etc. One big massive content database is an indication of poor planning and governance.

The first new recommendation in this article is that 4TB of data can be stored in a content database. There are some parameters around this recommendation that you should read.

The second new recommendation is that there is no explicit recommendation on sizing for document archiving scenarios. However there are some very specific recommendations made in here to support that scenario – so review http://technet.microsoft.com/en-us/library/cc262787.aspx#ContentDB. This is really important for Records Management solutions. On this specific point, it would also be good to review the “Extremely large-scale content archive” section in http://technet.microsoft.com/en-us/library/cc263028.aspx. When going down this path you will need to have Remote Blob Storage and FAST to support this solution architecture.
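To keep the three sizing cases straight, here is a minimal Python sketch that turns the numbers above into a simple check. The 200GB and 4TB thresholds come from the guidance discussed in this post; the `sizing_guidance` function, its wording and the choice of 4 × 1024 GB for 4TB are my own assumptions, and the sketch is no substitute for reading the TechNet article and its conditions.

```python
GENERAL_GUIDANCE_GB = 200      # general recommendation for a content database
UPPER_GUIDANCE_GB = 4 * 1024   # 4TB, allowed only under the documented conditions


def sizing_guidance(size_gb: float, document_archive: bool = False) -> str:
    """Classify a content database size against the guidance in this post."""
    if size_gb <= GENERAL_GUIDANCE_GB:
        return "within the general 200GB recommendation"
    if size_gb <= UPPER_GUIDANCE_GB:
        return "above 200GB: allowed up to 4TB only under the documented conditions"
    if document_archive:
        return "document archive: no explicit size limit, review the archive guidance"
    return "beyond 4TB: split content across more content databases"
```

Running something like this over a list of database sizes gives a quick first pass on which databases deserve a closer look.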

Finally SQL Server tuning is fundamental to your success for managing large content databases. Here is a blog that I wrote on the topic - http://www.astaticstate.com/2010/12/sharepoint-2010-high-availability-with.html

Saturday, August 20, 2011

SharePoint 2010 Architecture Introduction

Introduction
I recently had a client ask me about how to get started on understanding the SharePoint 2010 architecture and how they should deploy. Unfortunately the answer is that it depends on your business requirements.
Gartner recognizes the SharePoint platform as a best of breed across all major workloads like web portal, enterprise content management (document management, web content management, records management, etc.), business intelligence, workplace social computing, search/enterprise search, and as an application development platform. Knowing this, the SharePoint platform delivers a single platform that is managed together, helping agencies consolidate costs in people, process and technology. Plus SharePoint is tightly aligned to Office and Lync (instant messaging, sharing, meeting, and phone solution).
Now how you scale SharePoint 2010 will depend on what you implement. Plus with many agencies, there is never just one SharePoint farm. There will be multiple SharePoint farms which will be configured to support the business requirements.
References
If you are trying to get an initial understanding of the SharePoint 2010 architecture, here are some good references:
· SharePoint 2010 Architecture - http://msdn.microsoft.com/en-us/library/gg552610.aspx - this is a good starting place if you are not familiar with SharePoint.
· SharePoint 2010 Technical Diagrams - http://technet.microsoft.com/en-us/library/cc263199.aspx - the big picture of both physical and logical architecture.
· Hardware and Software Requirements - http://technet.microsoft.com/en-us/library/cc262485.aspx - I suggest reading this right off the bat.
Performance, scaling, business continuity topics always come up when starting to learn about SharePoint 2010. Here is a good place to start.
· Performance and capacity technical case studies (SharePoint Server 2010) http://technet.microsoft.com/en-us/library/cc261716.aspx - More good case studies.
· SharePoint 2010 Performance and Capacity whitepapers - http://technet.microsoft.com/en-us/library/ff608068.aspx - Whitepapers on specific workloads.
· SharePoint 2010 Capacity boundaries - http://technet.microsoft.com/en-us/library/cc262787.aspx - This is pretty detailed discussion on testing.
Now if you are familiar with SharePoint 2007 architecture:
· I have written a multiple part series on SharePoint 2010 architecture here- http://www.astaticstate.com/2010/01/sharepoint-2010-service-architecture.html
· I have another blog on scaling SQL Server because this is a critical component to SharePoint - http://www.astaticstate.com/2010/12/sharepoint-2010-high-availability-with.html.
· Here is a series on SharePoint 2010 Search - http://www.astaticstate.com/2010/12/sharepoint-2010-search-architecture.html
· Here is a series on FAST for SharePoint 2010 - http://www.astaticstate.com/2011/01/part-1-fast-for-sharepoint-2010.html
Office365
Finally it is also IMPORTANT to know, when reviewing all these architectures, that SharePoint 2010 is the only portal technology on the market that has a software as a service (SaaS) cloud offering, called Office365. This ultimately means a major reduction in the hardware and software that must be installed and managed, better service level agreements to your users, quicker deployment of solutions, better ability to scale, better ability to support telework and external collaboration, and the list really just goes on. Be in the business of creating business solutions.
I say the best way to learn about SharePoint Online Service is read the service level agreements which I have linked to here - http://www.astaticstate.com/2011/07/office365-slas.html

Tuesday, February 1, 2011

SharePoint 2010 Server Topology Examples

I figured I would write a short blog about SharePoint, logical / physical architecture, YET AGAIN!!!

Well I actually received some advice that made me want to tweak my message a little. If you read my earlier blog series on SharePoint 2010 Service architecture you will know how much SharePoint 2010 has changed. Then you may have also read my series on the new SharePoint 2010 Search architecture and found out there have been tons of interesting changes here too.

I had a great presentation from Shane Young (SharePoint MVP) which drove me to write this blog. What I realized is I have to stop thinking about architecting in SharePoint 2007 and take advantage of SharePoint 2010. Heck I should know them given all the new architecture features I have been writing about all this time.

In many SharePoint 2007 implementations, the one service we had much of our planning around was SharePoint Search and the SSP. Now all of those limitations are gone.

I am commonly asked how many servers I need to get started with SharePoint 2010. Now that is a LOADED question because such things as SQL high availability, network, service utilization, enterprise features, etc. all play into an optimum configuration of SharePoint 2010. There really is no one-size-fits-all. However there is always a good place to start the discussion. Shane Young provided two recommendations which I liked the best.

The Three Server Farm

This is a pretty fun one when you think about it. There are a ton of SharePoint 2007 environments out there where there are 3 SharePoint servers with dedicated SQL Servers. In those cases there are two web front ends and one application server. The one application server runs the index service, maybe something like Excel, and the search query services are running on the web front ends. In the SharePoint 2007 days, this gave some good redundancy but with SharePoint 2010 we can do better.

Now one approach you can consider in SharePoint 2010 is just making all three servers the same. Rather novel idea right?

clip_image001

Observations:

  • You now have web traffic distributed across three web front ends.
  • All three servers can support SharePoint 2010 services traffic such as Excel, Visio, Access, etc.
  • Search crawler service should be pretty spiffy because you can configure all three servers to perform full crawls during off hours, and incremental crawls depending on how fresh you need to keep search.
  • All three servers can be made redundant from a search perspective by having index mirrors.
  • Search results will actually be quicker because you will have three index partitions that can be searched across simultaneously.

Now I had a colleague point out that this does not fly well because “this does not maximize the separation of services architecture from the web front ends”. I get and completely understand that standpoint but I really do not think that should be a consideration for disqualifying this option. Even though everything is running on each machine, the SharePoint roles / services will continue to run independently in different IIS websites, in separate application pools and with different accounts. There is no reason why you cannot continue to configure them as if they were running on different machines. This is probably a good thing too because someday you will have to scale SharePoint up to meet demand. Remember you can always scale up by adding boxes or separating services onto their own dedicated machines. All I can say is configure SharePoint 2010 based on your requirements, adhere to best practices, plan as much as you can and do the right configuration for your company.

The Four Server Farm

Now let’s look at a four server SharePoint farm. There will be cases where the three server farm recommendation will not make sense for whatever valid reason. In this configuration we go a little more “traditional”. In this case, Shane recommends two web front ends and then two application servers.

clip_image002

Observations:

  • We have two dedicated web front ends. Add more if you need to.
  • Application services will run on the same application servers as search. Hopefully they will not interfere with full crawls unless you require dedicated resources for the application services. If you need to, just add more application servers.
  • The two application servers give us the redundancy we need for search. You will have two crawlers and two index partitions with mirrors to start with.
  • One little trick I was told for this configuration is you should also configure the application servers where the Search crawling component is running to also be web front ends (WFE). Do not load balance them with the other WFEs. Then add the SharePoint site URLs to be searched as entries in the hosts file on each application server so they point back to that server and not to the load balanced URL. What this effectively does is trick the crawler into crawling the WFE on the application server. This will give performance improvements twofold. First, the load balanced WFEs where users access content will have no performance issues associated with the crawler crawling content. Second, the load on the network will be reduced because nothing will go across the wire when indexing is occurring.
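
To make the hosts file trick a little more concrete, here is a minimal Python sketch that just builds the lines you would add to the hosts file on an application server (on Windows this file normally lives at C:\Windows\System32\drivers\etc\hosts). The IP address, the host names and the function name are all made up for illustration; substitute the values from your own farm.

```python
def build_hosts_entries(server_ip, site_hostnames):
    """Return hosts-file lines mapping each SharePoint host name to this server."""
    entries = []
    for hostname in site_hostnames:
        # hosts-file format: IP address, whitespace, host name
        entries.append(f"{server_ip}\t{hostname}")
    return entries


# Hypothetical values for illustration only.
app_server_ip = "10.0.0.21"
crawled_sites = ["intranet.contoso.local", "teams.contoso.local"]

for line in build_hosts_entries(app_server_ip, crawled_sites):
    print(line)
```

Each application server running a crawl component would get its own copy of these entries, using its own IP address, so that the crawler resolves the site URLs to the local WFE rather than the load balancer.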

Additional Notes:

  • High Availability – SQL is a big player in this discussion and you need to read this blog.
  • FAST for SharePoint 2010 - Now if you are going to consider using FAST for SharePoint, I would probably vote for the three server architecture recommendation as a starting off point. Here is a blog series on FAST for SharePoint 2010 if you are interested.

Closing

As I have said, there is no one size fits all. This is just a place to start and based on your plans you will scale up the SharePoint 2010 architecture as needed.

Monday, January 17, 2011

Part 3 - FAST for SharePoint 2010 Physical Architecture

Part three of this series will now focus on the physical architecture of FAST for SharePoint farm deployment.

Physical Architecture / Topology

Now I am going to take what we have learned about the logical architecture and scaling and apply it to some farm scenarios. From what I know, scaling a FAST for SharePoint 2010 farm is similar to scaling a SharePoint 2010 farm. We basically need to analyze the requirements and then scale the FAST components appropriately to meet those requirements. Some of the immediate questions that will come up are:

  • How much content needs to be searchable?
  • What is the total number of items that need to be searched?
  • What is the format, size and any other interesting characteristic of the items to be searched?
  • What sort of availability is needed to support search?
  • How fresh must the search results be and how often is content changed?
  • How many users will be performing searches concurrently?

This is probably just a starting point. I actually wrote a set of questions that need to be asked here.

Up to this point we learned about the FAST for SharePoint components such as:

  • Administration component
  • Document Processing component
  • Content Distributor component
  • Indexing Dispatcher component
  • Web Crawler component
  • Web Analyzer component
  • Indexing component
  • Query Matching component
  • Query Processing component

We have also learned about how they scale. Now honestly, it is pretty much impossible for me to give you all the examples out there on how to scale FAST. So I am going to pick a few (which I shamelessly already have pictures for) and discuss those :)

Minimum Deployment

The following is a very basic, non-redundant implementation of SharePoint 2010 and FAST for SharePoint 2010. It is never really recommended to install both on the same machine unless you are creating a local development environment. In both of these scenarios there really is no scale other than SharePoint and FAST being on different machines.

In the case of FAST, all of the core components (admin, document processing, content distributor, index dispatcher, web analyzer, indexing, query matching and query processing) are deployed to a single machine.

image

It is interesting to note here that SQL is not heavily utilized by FAST for SharePoint. FAST only needs to access configuration information from SQL. This is different than the out of the box SharePoint 2010 Search which heavily utilizes SQL as part of the indexing and querying process.

Small Deployment

The following is considered to be a small deployment of FAST for SharePoint. You will notice there is both a SharePoint and FAST farm. I am not going to discuss the SharePoint Farm; you can read my blog on how to scale SharePoint 2010 farms. However let’s look at the FAST for SharePoint 2010 Farm. As you can see there are three boxes.

image

The first box:

  • Has the Administrative component which can only be installed on one machine in the farm.
  • Has the Content Distributor which is responsible for routing documents to document processors.
  • Has a Document Processor component with 12 processes running. The more processes you have running, the more content can be consumed. The number of processes that can be supported on the machine is limited by the available cores on the server.
  • Has an instance of the Web Analyzer running.

The second box:

  • Has an additional document processing component.
  • Has the index dispatcher component which is responsible for sending processed content to be indexed.
  • Has the indexing component, query matching and query processing components installed.

The third box:

  • Has redundant indexing, query matching and query processing components.

In the end this is a really simple / standard implementation that can handle in the range of 10+ million items and manage about 10 queries per second.

Medium Deployment

This is a medium sized FAST for SharePoint deployment. This should be able to index roughly 40 million items and continue to support 10 queries per second.

image

Observations:

  • The first two FAST servers have all the administration, content distributor, and web analyzer components. They also have document processing components.
  • The following six servers have been configured using the column and row patterns I introduced earlier.
  • There are three index columns to partition the content to allow for a higher volume of content than the previous deployment.
  • On the first row we have additional document processing components. This will improve consumption and processing of content. Plus it adds redundancy.
  • Also on the first row, index dispatching and indexing components have been installed.
  • On the second row are secondary indexing components which are redundant copies of the indexes on the first row. It is not exactly clear in the diagram but the query matching and query processing components are configured on this row as well. This way there are dedicated machines to support the querying of content.

Large Deployment

This final farm is very similar to the previous other than some of the administration components have been scaled to a third server and there are now a total of six index columns to support up to 100 million searchable items.

image

As I said before, these are by no means the only three configurations of a FAST for SharePoint farm. The components can be scaled however best meets the requirements.

Part 2 - FAST for SharePoint 2010 Logical Architecture

This is part two in this FAST for SharePoint series and it will focus on the logical architecture.

Logical Architecture

If you are a reader of my blog, you will notice I always start out with understanding the features (previous section), then understand the logical architecture and then finally understand the physical architecture. I have always said the most common mistake to make is to jump right to the physical architecture because everyone wants to know how many “boxes” they need. However both the features and logical architecture heavily influence the physical architecture.

If you are an experienced SharePoint professional, right off the bat you want to know, how are the FAST services made available to SharePoint? Hopefully the below diagram will clear it up.

image

Here is the reference to the above diagram.

If you have been reading up on the new SharePoint 2010 Service Architecture you will know that it is much more scalable than the prior version. Basically there are two services that have to be configured when using FAST for SharePoint:

  • FAST Search Connector service – In the above diagram, this is the FAST Content SSA. This service is responsible for feeding content to the FAST for SharePoint farm. Architecturally, it is performing the same function as a connector application would in the FAST ESP 5.3 product. This service is configured in Central Administration and you configure it like the out of the box search (i.e. set up content sources, rules, etc.). The configured content sources will be used to feed the content to FAST for indexing. One thing to point out is that all the data for the content sources will flow through this connector whether it is in SharePoint, a file share, an Exchange public folder, etc. All content will be fed to the FAST Content Distributors which will subsequently send the content for processing.
  • FAST Search Query service – In the above diagram, this is the FAST Query SSA. This service has two responsibilities. First and foremost it is responsible for forwarding all search queries to the FAST farm query service. The FAST farm query service has the responsibility of building the search results and then returning them to the SharePoint FAST Search Query service. Second, the SharePoint FAST Search Query service is responsible for performing all people searches. This service actually performs the indexing of user profile information and also performs the querying of the data in the user profile index. In summary, when a query is made, content data will be retrieved from FAST while user profile data will be returned from this service.

Now let’s take this one step further by understanding the roles and responsibilities. In the diagram below you can see how it aligns with the previous diagram:

  • On the far right we see the FAST Search Connector service which is configured to feed content from other locations.
  • On the bottom far left we see the FAST Search Query service which will receive queries from SharePoint web parts and then perform a query for content in FAST as well as in the user profile index it has built.

In the middle you can see several services that are part of the FAST for SharePoint Search farm.

image

Here is some background on these services. If you read up on my FAST ESP 5.3 series, you will see that the architecture is not really different. Starting on the far right within the “FAST Search Server 2010 for SharePoint Farm” area:

  • FAST Indexing Connectors – This is where additional FAST Indexing Connectors reside, like the JDBC and Lotus Notes connectors. These connectors are conceptually no different than the FAST Search Connector running in SharePoint. They are responsible for feeding content to FAST. The only difference is these are configured as part of the FAST server deployment configuration file and they are not configurable in SharePoint Central Administration.
  • Item Processing – This is where a lot of the secret sauce of FAST for SharePoint 2010 lives. There are a couple of sub-components which are not shown in this picture that you should be aware of. First there are Content Distributor(s) which have the responsibility for directing content to the correct Document Processing Pipeline for processing. The Document Processing Pipeline does much of the FAST magic such as entity extraction, linguistic processing, and content normalization.
  • Indexing – This is the component that manages the content that is produced by the Document Processing Pipelines. We will talk about scaling later, but there is a good chance that there will be multiple Index nodes in the FAST for SharePoint farm. Multiple index nodes are needed to support searching large volumes of content quickly. The concept is very similar to the SharePoint out of the box search where you create Index Partitions to allow for quicker querying of large volumes of content. Another sub-component you should be aware of is called the Index Dispatcher which is responsible for routing a searchable item to a location on the index.
  • Query Matching – This component is responsible for actually searching and retrieving items from the index. It will build the result set from the Index node that it is associated to. This is another component that can be scaled to assist with improving performance. It will also do things such as return a summary of content for an item, highlight the search terms and support both shallow and deep refiners. This component is the same as the Search Node in the FAST ESP 5.3 product and is similar to the Query Component in the SharePoint 2010 out of the box search.
  • Query Processing – This component will perform both pre and post query processing. This is also the component that the SharePoint FAST Search Query service connects to. For pre-query processing, such things as language parsing, linguistic and security processing will be applied. For post-query processing, operations such as result merging from multiple index nodes, formatting and duplicate removal will be performed.
  • FAST Search Authorization (FSA) – This component works with the query processing component to ensure that the user performing the search only sees content they have permission to access.
  • Web Link Analysis – This is also referred to as the Web Analyzer. This component provides various different features to improve the relevancy. For instance if a piece of content is linked to a lot by other content sources, that piece of content will be considered more relevant than others. Also this component does click-through analysis from the search results. The logic being if the content is clicked on a lot in the search results it may be more relevant than other content.
  • Web Crawler – This component is not shown in the diagram above but you will run across it in your FAST for SharePoint research. This is a component that will crawl and feed web content to FAST. This component does not have to be used if SharePoint FAST Search Connector service has been configured to crawl websites in SharePoint Central Administration.
  • Administration Component – This is the component that manages the FAST farm. The interesting thing about this one is that it cannot be scaled or made redundant. However, if this component were to fail, the FAST server will continue to run in the configuration it is currently running in.

Scaling the FAST for SharePoint 2010 Architecture

In this section I am going to give an introduction into how FAST for SharePoint 2010 components can be scaled.

The diagram below comes from this MSDN article. In the previous diagram discussion I introduced the two components called Indexing and Query Matching. If you have been doing research on FAST you have probably heard about creating columns and rows to support better querying and indexing. This applies directly to both the Indexing and Query Matching components.

An entire index can be made up of multiple index columns. Breaking the index into multiple columns is commonly referred to as partitioning. Searches are performed across the search rows, which in turn search all of the index columns. The Query Processing component that I introduced earlier (which is different than Query Matching) is responsible for merging all of the search results, from all of the index columns, together into a single result set. It is not depicted in this diagram but there will be Query Processing components for each column. This is conceptually very similar to the new out of the box features of SharePoint 2010 Search.
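
To illustrate the merge step conceptually, here is a minimal Python sketch. It is not how FAST is implemented; it simply assumes each index column hands back its own hits as (score, document id) pairs already sorted from best to worst, and merges them into a single ranked list. The function name, the scores and the document ids are all hypothetical.

```python
import heapq
from itertools import islice


def merge_column_results(column_results, limit=10):
    """Merge per-column hit lists into one ranked result set.

    Each inner list must already be sorted by descending score.
    """
    merged = heapq.merge(*column_results, key=lambda hit: hit[0], reverse=True)
    # Only keep the top `limit` hits across all columns.
    return list(islice(merged, limit))


# Hypothetical hits from three index columns: (score, document id).
column_a = [(0.90, "a1"), (0.40, "a2")]
column_b = [(0.80, "b1"), (0.10, "b2")]
column_c = [(0.95, "c1")]

top_hits = merge_column_results([column_a, column_b, column_c], limit=3)
print(top_hits)
```

The point is that no single column knows the overall ranking; something has to sit above the columns and combine their partial answers, which is the job described for Query Processing.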

image

The point of this diagram is to show how FAST search is scaled; scaling the index and query nodes is the most common case. Here are some basic rules you should know right off the bat:

  • When there are more Index Columns – more content volume can be managed and better search performance can be realized.
  • When there are more Index Rows – there is better fault tolerance as there is a primary Indexing node and back-ups. Should the primary fail; the back-up will kick in.
  • Adding more Search Columns – will provide better search performance against a large volume of content.
  • Adding more Search Rows – will provide not just fault tolerance but also better performance as queries can be load-balanced across the search rows.
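
As a rough way to picture these rules, here is a small Python sketch that models the grid of columns and rows. It assumes exactly one server per cell of the grid, which is a simplification I am making for illustration; the class and method names are hypothetical and are not part of any FAST API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchGrid:
    """Hypothetical model of a FAST column/row grid (not a FAST API)."""
    columns: int  # index partitions; more columns means more content capacity
    rows: int     # copies of each column; more rows means redundancy and throughput

    def __post_init__(self):
        if self.columns < 1 or self.rows < 1:
            raise ValueError("a grid needs at least one column and one row")

    def server_count(self) -> int:
        # Assumption: exactly one server per (row, column) cell.
        return self.columns * self.rows

    def backups_per_column(self) -> int:
        # Every row beyond the first holds a backup copy of each column.
        return self.rows - 1


grid = SearchGrid(columns=3, rows=2)
print(grid.server_count(), grid.backups_per_column())
```

With three columns and two rows this works out to six servers, with one backup copy of every column.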

The information you just learned will help you understand how many servers of FAST you may potentially need.

Now I know the next question you may be asking is “when do I need to add a new index column”? Well, the guidance I have found is that it depends. I have talked with FAST folks who say 10 to 25 million items can be managed per index column. How many columns you need is driven by things such as:

  • How much data is being consumed?
  • How often is data fed?
  • How quickly does the data need to be made available?
  • How many concurrent queries will be made by users?
  • Etc.
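
If you just want a back-of-the-envelope number, the 10 to 25 million items per column guidance can be turned into a range. The Python sketch below does only that arithmetic; it ignores feeding rate, freshness and query load, so treat the result as a starting point for discussion and not a sizing recommendation. The function names are my own.

```python
import math


def estimate_index_columns(total_items, items_per_column):
    """Columns needed if each column holds at most `items_per_column` items."""
    if total_items < 0 or items_per_column <= 0:
        raise ValueError("item counts must be positive")
    # Even an empty index needs one column.
    return max(1, math.ceil(total_items / items_per_column))


def column_range(total_items, low_capacity=10_000_000, high_capacity=25_000_000):
    """Return (fewest, most) columns using the 10 to 25 million per column guidance."""
    fewest = estimate_index_columns(total_items, high_capacity)
    most = estimate_index_columns(total_items, low_capacity)
    return fewest, most


print(column_range(40_000_000))
print(column_range(100_000_000))
```

For 40 million items this gives somewhere between 2 and 4 columns, and for 100 million items somewhere between 4 and 10. Where you land inside that range depends on the drivers listed above.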

Now the Indexing and Query Matching components are not the only components that can be scaled. Pretty much every component, with the exception of Administration, can be scaled:

  • FAST Indexing Connectors – Multiple indexing connectors can be added to better support feeding of JDBC and Lotus Notes data.
  • Item Processing – The Document Processing Pipeline I referred to earlier can be scaled. You will want to add more item processors to support the rate at which content is fed into FAST. This is important if you are trying to reduce index latency, which is the amount of time it takes to make content searchable in the index.
  • Content Distributor – I earlier mentioned this as part of the Item Processing component. Adding multiple Content Distributors will add fault tolerance but there will only ever be one primary.
  • Query Processing – If you recall this is the component that does the pre and post processing around making a call to the Query Matching node(s). More Query Processing components can be added to better support queries per second and to provide fault tolerance.
  • Web Link Analysis – Can have multiple instances added to reduce the amount of time needed to complete the analysis.

Hopefully this has provided you a good introduction into how FAST for SharePoint 2010 can be scaled.