Enabling “Fuzzy Search” for SharePoint Server 2010

I like a challenge, and last week one of our consultants asked me for a hard one: implementing “Fuzzy Search” on SharePoint 2010.  I had never heard of it before, so it took quite a bit of research to figure out what it was and how to implement it.  “Fuzzy Search” is a search that returns results that were not literally asked for in the query.  For instance, searching for “Jon” should return results for “John”, “Jon” and perhaps even “Jean”.

The first clue from the consultant was “I think it needs the Speech platform installed on the server”.  I immediately dismissed this as it sounded pretty unlikely.  Why on earth would SharePoint Server 2010 require a speech server install?

Soon enough I found a SharePoint 2010 cmdlet called “New-SPEnterpriseSearchLanguageResourcePhrase”.  Looking at the existing language resource phrases with Get-SPEnterpriseSearchLanguageResourcePhrase showed that each one holds a word and an alias for that word, which is what allows a “Fuzzy Search” to return results based on an alias name.
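
To see what those existing phrases look like, something along these lines should list a few of them.  This is only a sketch: it assumes the farm has a single Search service application and that the built-in en-US nickname entries are present, and the limit of 20 rows is arbitrary.

```powershell
# Load the SharePoint snap-in if it is not already loaded.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Assumption: there is exactly one Search service application in the farm.
$ssa = Get-SPEnterpriseSearchServiceApplication

# Show the first 20 nickname phrases registered for en-US (20 is an arbitrary limit).
Get-SPEnterpriseSearchLanguageResourcePhrase -SearchApplication $ssa -Type Nickname -Language "en-US" |
    Select-Object -First 20
```

Swapping “en-US” for your own culture is a quick way to check whether any nickname phrases exist for it yet.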

However, even after I had uploaded a whole heap of new aliases for my culture, “en-AU”, I was still not getting any results.  Out of desperation I installed the Speech Server platform binaries mentioned by my colleague, and after a quick restart everything started working!

Here are the steps to get it working:

  1. Download the Microsoft Speech Server Run Time from here: http://www.microsoft.com/downloads/en/details.aspx?displaylang=en&FamilyID=f704cd64-1dbf-47a7-ba49-27c5843a12d5 .  Pick the language that matches your users' browser culture (“en-AU” in my case), as this appears to work off browser settings, not server settings.
  2. Install this onto the server that does the crawling and searching.  If in doubt, put it on all of your app servers.
  3. Restart the application server.
  4. Grab the file “C:\Program Files\Microsoft Office Servers\14.0\Bin\languageresources.txt” and copy it to “C:\Windows\Temp”.  The source path may differ depending on your SharePoint 2010 installation.
  5. I’d recommend using Excel to edit the file and strip out any languages you do not want.  I would recommend retaining “en-US”.
  6. Remove any columns except for the original word and its alias.
  7. Insert the headers “Name” for the original word and “Nickname” for the alias, then save the file as “C:\Windows\Temp\AU-LanguageResources.csv”, which is the file the script in the next step imports.
  8. Run the following PowerShell script:
Add-PSSnapin Microsoft.SharePoint.PowerShell
# Assumes the farm has a single Search service application
$ssa = Get-SPEnterpriseSearchServiceApplication
$langinfo = Import-Csv C:\Windows\Temp\AU-LanguageResources.csv
foreach ($line in $langinfo) {
    # Register each word/alias pair as a nickname phrase for the en-AU culture
    New-SPEnterpriseSearchLanguageResourcePhrase -Name $line.Name -Language "en-AU" -Type "Nickname" -Mapping $line.Nickname -SearchApplication $ssa
}
# Rebuild the query suggestions, then start a full crawl of the local sites
Start-SPTimerJob "Prepare query suggestions"
(Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa -Identity "Local SharePoint Sites").StartFullCrawl()

This will likely take a couple of hours to go through, but the end result will be that your search will now be nice and fuzzy!  You will need to replace the culture (en-AU) specified with your own.  Cultures such as en-US and de-DE already exist in the database, so there is no point re-adding these.
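
Since the import and crawl take a while, it may be worth checking the CSV before kicking them off.  The snippet below is a rough sanity check, not part of the original procedure: it only uses standard PowerShell cmdlets, reads the same file as the script above, and reports how many rows are missing one of the two columns.

```powershell
# Same file the import script reads; adjust if you saved it elsewhere.
$csvPath = "C:\Windows\Temp\AU-LanguageResources.csv"

# Wrap in @() so a single-row file still yields an array with a Count.
$rows = @(Import-Csv $csvPath)

# Rows with an empty Name or Nickname would likely make the import cmdlet fail.
$bad = @($rows | Where-Object { -not $_.Name -or -not $_.Nickname })

"{0} rows read, {1} with a missing Name or Nickname" -f $rows.Count, $bad.Count

# Eyeball the first few pairs to confirm the headers were applied correctly.
$rows | Select-Object -Property Name, Nickname -First 5
```

If the second number is not zero, or the preview columns come back empty, the headers or the column clean-up in Excel probably need another look.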

I have to admit this is very cool!  I was really thrown by the requirement for the Speech Server platform; my guess is that it has something in its API that SharePoint Search needs in order to process aliases.  I was also quite surprised that I could not find anything out there about this particular operation; even the cmdlet seems to be seldom used.

Update 9th Feb 2011 : If this is not working for you, it is worth noting that the culture selection seems to work off the user's browser culture, not the server culture.  If users with several different cultures use your search, it could be worth doing this for every culture that may apply.

Update 11th Oct 2011 : Updated import-txt to be import-csv, which actually exists.  Lesson learned, do not write powershell scripts from memory. 🙂

5 Responses to Enabling “Fuzzy Search” for SharePoint Server 2010

  1. Claire says:

    Amazing! I knew you could do fault-tolerant searches with public-facing websites, but wasn’t aware SharePoint allowed it as well. Thanks!

  2. Pingback: Enabling “Fuzzy Search” for SharePoint Server 2010 | Trivial Information

  3. Sezai Komur says:

    Freaking awesome mate!

  4. Very interesting article! The speech server run time for en-US is one of the prerequisites that gets installed before setup is run. The other ones need to be downloaded indeed.

  5. Steve says:

    Excellent article. I found this handy for identifying the language codes and then setting a filter in Excel to delete the unwanted languages.

    http://msdn.microsoft.com/en-us/goglobal/bb964664
