Monday, 18 April 2016

SharePoint 2016 Cloud Hybrid Search

SharePoint 2016 has an awesome new feature for Cloud Hybrid search and I had the opportunity to try it out. There is one search index in Office 365 to rule them all!

You can search both on-premises and SharePoint Online content from one index in Office 365, and Office 365 ranks the results together in one result block. The content metadata is encrypted when it's transferred to the search index in Office 365, so the on-premises content remains secure.

Search is configured in Office 365, except for crawling, which is done on SharePoint Server. This post highlights my experience setting it up and testing it in SharePoint Server 2016. The image below shows the building blocks for cloud hybrid search. The key new component is the cloud Search Service Application.

Building blocks of cloud hybrid search

Here are a few things you will need to get started:

  • Farm with SharePoint 2016 or SharePoint 2013 starting with the August 2015 PU
  • Active Directory domain administrator credentials
  • Office 365 tenant URL and tenant administrator credentials
  • Directory Synchronization server with Azure AD Connect tool installed
  • PowerShell scripts CreateCloudSSA and Onboard-HybridSearch provided here
  • Office Online Server (OOS) for previews

The picture below shows the various servers and workloads, their roles, and whether each one is on premises or in the cloud.

Servers and roles

Below are the steps I used to get cloud hybrid search going:

Directory Synchronization:

We stood up a Windows Server 2012 server, which should be a member of the same Windows Server Active Directory forest as the SharePoint Server farm. Microsoft recommends that directory synchronization run on its own server due to resource requirements. Download and install the Azure AD Connect tool from here on the server. Launch the Azure AD Connect tool and configure it as prompted. For the synchronization type, Microsoft recommends identity with password synchronization. This will start the synchronization right away. Your screen should look like the picture below if all goes well. By default the sync occurs every 3 hours, but you can change the interval if needed from the Azure AD Sync Scheduler task in Task Scheduler on the sync server.
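If the Azure AD Connect build you installed is 1.1 or later, the sync is driven by a built-in scheduler instead of a Task Scheduler task, and it can be inspected and triggered from PowerShell. The snippet below is a minimal sketch under that assumption; it is not part of the setup I described above, and on older builds the ADSync cmdlets shown here will not be available.

```powershell
# Run on the directory synchronization server in an elevated PowerShell session.
# Assumption: Azure AD Connect 1.1 or later, which ships the ADSync module.
Import-Module ADSync

# Show the scheduler settings, including the current sync interval.
Get-ADSyncScheduler

# Start a delta sync now instead of waiting for the next scheduled run.
Start-ADSyncSyncCycle -PolicyType Delta
```

Running a delta sync by hand is handy right after adding test users on premises, so they show up in Office 365 before you start crawling.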



Create Cloud Search Service Application:

From your SharePoint server, run the CreateCloudSSA PowerShell script. This script provisions a cloud Search Service Application with the cloud index switch turned on. When I ran this script the first time, I got an error saying the SharePoint farm license was not recognized. If you run (Get-SPFarm).Products and get no output, you have hit this problem. To fix it, run the cmdlet below, which refreshes the farm license.

Set-SPFarmConfig -InstalledProductsRefresh

This has been fixed in a later release, but I wanted to mention it just in case.
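The license check and the fix can be combined into one small script. This is a sketch rather than anything from the CreateCloudSSA script itself; the last line assumes the script has already run and relies on the CloudIndex property of the search service application to confirm that the cloud index switch is on.

```powershell
# Run in the SharePoint Management Shell, or load the snap-in first.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# An empty Products list is the symptom of the "license not recognized" error.
$products = @((Get-SPFarm).Products | Where-Object { $_ })
if ($products.Count -eq 0) {
    Write-Host "No installed products reported; refreshing the farm license."
    Set-SPFarmConfig -InstalledProductsRefresh
}

# After CreateCloudSSA has run, the cloud SSA should report CloudIndex = True.
Get-SPEnterpriseSearchServiceApplication | Select-Object Name, CloudIndex
```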

Install hybrid onboarding prerequisites:

Before we can complete the onboarding, download and install these prerequisites on your SharePoint server.

  • Microsoft Online Services Sign-In Assistant found here
  • Microsoft Azure AD PowerShell found here
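Before running the onboarding script, it can save time to confirm that both prerequisites actually work on the SharePoint server. The sketch below assumes the Azure AD PowerShell download installs the MSOnline module; it simply signs in with the tenant administrator account and reads back the tenant details.

```powershell
# Confirm the Azure AD PowerShell (MSOnline) module is installed on this server.
if (-not (Get-Module -ListAvailable -Name MSOnline)) {
    throw "MSOnline module not found; install the prerequisites first."
}
Import-Module MSOnline

# Prompt for the Office 365 tenant administrator credentials and sign in.
$cred = Get-Credential
Connect-MsolService -Credential $cred

# A successful call here shows the sign-in components and credentials work.
Get-MsolCompanyInformation | Select-Object DisplayName, InitialDomain
```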

Onboarding cloud hybrid search:

To onboard hybrid search, server-to-server authentication needs to be set up, which allows servers to request and access resources from one another on behalf of users. The Onboard-HybridSearch PowerShell script sets up the necessary server-to-server authentication and configures trust between SharePoint Server and the Office 365 tenant. Once the script finishes successfully, everything is set up to send metadata to the Office 365 index.

Create Content Source and Crawl content

Next up, we set up our content sources in the cloud Search Service Application to crawl on-premises content. I set up one content source for a SharePoint site and one for a file share. File shares work great with cloud hybrid search. Once the content sources are set up, start a full crawl and, when it completes, review the crawl log for any errors. If everything crawled as expected, the content should now be in the Office 365 search index. I couldn't wait any longer to test the search out!
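I created the content sources through Central Administration, but the same steps can be scripted. The following is a minimal sketch, assuming the farm has exactly one cloud Search Service Application; the content source names, the site URL and the file share path are hypothetical placeholders.

```powershell
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Pick the cloud SSA (the one with the cloud index switch turned on).
$ssa = Get-SPEnterpriseSearchServiceApplication | Where-Object { $_.CloudIndex }

# Hypothetical names and start addresses; replace them with your own.
$site = New-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa `
    -Type SharePoint -Name "On-prem intranet" `
    -StartAddresses "http://intranet.contoso.local"
$share = New-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa `
    -Type File -Name "On-prem file share" `
    -StartAddresses "\\fileserver\share"

# Start a full crawl of both content sources.
foreach ($cs in $site, $share) {
    $cs.StartFullCrawl()
}

# Poll once a minute until every content source is idle again.
do {
    Start-Sleep -Seconds 60
    $sources = Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa
    $sources | Select-Object Name, CrawlStatus
} while ($sources | Where-Object { $_.CrawlStatus -ne "Idle" })
```

The script only tells you when the crawl has finished; the crawl log is still the place to check for errors.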

Search for your on premises content

Log on to the search center of your tenant, enter isexternalcontent:1, and hit search along with some drumrolls!! The results should display on-premises content. The new managed property isexternalcontent has the value 1 for on-premises content and 0 otherwise.


Search Previews

Now that we have search results, we need to get previews working for on-premises Office documents in the results. SharePoint 2016 does not support Office Web Apps Server (OWA); I found this out the hard way by configuring SharePoint with OWA, and it did not work. So Office Online Server (OOS) is the one to use for SharePoint 2016. Once you have OOS stood up, follow the steps in this article to configure SharePoint to use your OOS server.
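On the SharePoint side, that configuration comes down to creating a WOPI binding to the OOS server. The lines below are a rough sketch of that step, not a replacement for the article; the host name oos.contoso.local is a hypothetical placeholder, and the sketch assumes OOS is published over HTTPS.

```powershell
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Hypothetical OOS host name; replace it with your server or farm address.
New-SPWOPIBinding -ServerName "oos.contoso.local"

# The zone must match how users reach OOS (internal-https is the default).
Get-SPWOPIZone

# For a lab-only OOS farm served over HTTP, the binding would instead need:
#   New-SPWOPIBinding -ServerName "oos.contoso.local" -AllowHTTP
#   Set-SPWOPIZone -Zone "internal-http"

# List the bindings that were created, to confirm the Office file types are covered.
Get-SPWOPIBinding | Select-Object Application, Extension, Action, ServerName
```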

If everything is configured right, search results will display previews of Office documents when you hover over a result item. One important note: previews work only if you are connected and logged on to the source of the on-premises content.

Search Index limits

Per the latest guidance from Microsoft in this article, the Office 365 index can hold one million items from on-premises SharePoint for each 1 TB of pooled storage. Past this limit, throttling will occur to protect the Office 365 tenant.
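To get a feel for what that means for a given tenant, the arithmetic is simple enough to script. This is only an illustration: the storage figure is made up, and counting whole terabytes only is my assumption, since the guidance quoted above does not say how a partial terabyte is treated.

```powershell
# Hypothetical figure: the tenant's pooled storage in TB.
$pooledStorageTB = 3.5

# One million on-premises items per 1 TB of pooled storage.
# Assumption: only whole terabytes count toward the quota.
$itemsPerTB = 1000000
$quota = [math]::Floor($pooledStorageTB) * $itemsPerTB

"On-premises item quota: {0:N0}" -f $quota
```

Comparing this number with the item count the crawl log reports gives an early warning before throttling kicks in.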

SharePoint 2010

A reminder: the cloud Search Service Application is available only in SharePoint Server 2016 and in SharePoint Server 2013 starting with the August 2015 PU, not in SharePoint 2010. But fear not, the cloud Search Service Application can be published to a SharePoint 2010 farm from a farm where it exists, should you need it there. I have not tested this but have been told it works.

Use case scenarios

Some scenarios where cloud hybrid search becomes applicable:

  • A customer is migrating to Office 365 from an on-premises or hybrid implementation and wants users to search everything from one place, which is Office 365. This is the most common use case for cloud hybrid search.
  • Web parts and apps in SharePoint Online need to expose on-premises content
  • The Delve Me page is used to show on-premises content activity

Features unsupported with cloud hybrid search

Below are some notable features not available in cloud hybrid search:

  • Content enrichment service
  • Custom entity extraction

People Search

In Office 365, by default, all people in the User Profile application are indexed in the Office 365 index.

If people are additionally crawled from the cloud Search Service Application, a duplicate set of people content is created in the Office 365 index. This will confuse users, as searching for a person will return multiple results.

Two ways to handle this: 

  • Make the Office 365 User Profile service the primary source of people information and let Office 365 search take care of the indexing and presentation. With this approach you do not need to crawl people on premises.
  • If you crawl the on-premises people profiles in addition to Office 365 crawling the tenant profile store, duplicate results will come up. However, you can use query transformation to decide which results to display, or even let end users choose between the different result sources at query time.

So, there you have it: cloud hybrid search is one of the coolest new features and, once implemented, will provide immediate value to your customers.