Synonyms Support with Azure Search in Sitecore SXA Search Component
If you have tried to enable synonym support with Azure Search in Sitecore, you have probably seen this article: https://blogs.perficient.com/2019/11/12/programmatically-creating-synonym-maps-in-azure-search-with-sitecore/. It explains how to use Sitecore to store synonyms and create the synonym map in the Azure index.
However, after creating the synonym maps in Azure, you may find that they don't work with the SXA Search component. In this post, I will explain why that happens and how to resolve it in Sitecore SXA, which should also help you understand how this works outside of SXA.
Assume that you have successfully set up a synonym map on the sxacontent field for "home resident". When you search for "home" using the SXA Search component, Sitecore generates a search query like the one below:
&search=sxacontent:(/.*home.*/)&$filter=(latestversion_1 and (path_1/any(t:t eq 'ed5230a06c234cd6a88c73daee6d5b5b')) and searchable)&queryType=full&$skip=0&$top=5
The problem is that Azure Search does not expand Synonyms on wildcard queries:
Synonym expansions do not apply to wildcard search terms; prefix, fuzzy, and regex terms aren't expanded.
The suggestion is to combine the wildcard with a simple fixed query, which means the above query should look like this:
&search=((sxacontent:(/.*home.*/) OR sxacontent:(home)))&$filter=(latestversion_1 and (path_1/any(t:t eq 'ed5230a06c234cd6a88c73daee6d5b5b')) and searchable)&queryType=full&$skip=0&$top=5
- Based on the information I collected, a regular wildcard query (sxacontent:("*home*")) also works. However, it produces different results than the regex wildcard, so it's not discussed here.
- If queryType is changed to simple, the original query also works. That setting is not easy to change, so it's not considered here.
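To make the target shape concrete, here is a minimal C# sketch that builds the combined search clause for a single term: the regex wildcard clause, ORed with a plain-term clause that Azure Search can expand synonyms on. The names SynonymQuerySketch and BuildSearchClause are hypothetical and are not part of Sitecore or any Azure SDK. The sketch also assumes the term contains no Lucene special characters, since it does no escaping.

```csharp
using System;

public static class SynonymQuerySketch
{
    // Hypothetical helper, for illustration only.
    // Assumption: `term` has no Lucene special characters (nothing is escaped here).
    public static string BuildSearchClause(string field, string term)
    {
        // Regex wildcard clause: matches the term anywhere, but is never synonym-expanded.
        string regexClause = $"{field}:(/.*{term}.*/)";

        // Plain-term clause: a fixed term that synonym expansion can apply to.
        string plainClause = $"{field}:({term})";

        // Combine both with OR, wrapped the same way as the query shown above.
        return $"(({regexClause} OR {plainClause}))";
    }
}
```

Calling SynonymQuerySketch.BuildSearchClause("sxacontent", "home") returns ((sxacontent:(/.*home.*/) OR sxacontent:(home))), which is the value of the search parameter in the query above.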
Now we know what kind of query needs to be generated. The next step is to make Sitecore SXA generate it.
The place to modify is the "ContentPredicate" method in Sitecore.XA.Foundation.Search.Services.SearchService. The out-of-the-box code looks like this:
protected virtual Expression<Func<ContentPage, bool>> ContentPredicate(string content)
{
    Expression<Func<ContentPage, bool>> expression = PredicateBuilder.True<ContentPage>();
    foreach (string item in content.Split().TrimAndRemoveEmpty())
    {
        string t = item;
        expression = expression.And((ContentPage i) => i.AggregatedContent.Contains(t));
    }
    return expression;
}
A logical solution is to just add "|| i.AggregatedContent == t". The tricky part is that AggregatedContent maps to the sxacontent field in the index, which is a string array. When you use a string comparison on it, Sitecore builds that part of the query as a filter instead of what we wanted. Here is an example of what the query looks like if you change the line to "expression = expression.And((ContentPage i) => i.AggregatedContent.Contains(t) || i.AggregatedContent == t);":
&$filter=(latestversion_1 and (search.ismatchscoring('sxacontent:(/.*home.*/)', null, 'full', null) or (sxacontent/any(t:t eq 'home'))) and (path_1/any(t:t eq 'ed5230a06c234cd6a88c73daee6d5b5b')) and searchable)&$top=2147483647
This causes an exception in Azure Search because sxacontent is not defined as a filterable field. Even if you make sxacontent filterable, synonyms still don't work, because Azure Search doesn't expand them in filters.
There are a number of possible solutions:
One: If, instead of the sxacontent field, you use a custom index field that is a plain string type, the code above works (with a different field name, of course).
Two: You can override Sitecore.ContentSearch.Azure.Query.SearchQueryBuilder.Contains to always use both the regex wildcard and the normal wildcard, combined with an OR. Since SearchQueryBuilder is not directly exposed in DI, replacing it takes a fair amount of work, so this is not an easy solution.
An example of how to do this can be found in this support patch: https://github.com/SitecoreSupport/Sitecore.Support.147386/releases/tag/220.127.116.11. Check which classes in the repo have to be duplicated, and note that they may not exactly match the version of Sitecore you have.
Three: Use "MatchWildcard" in LINQ. The code looks like this:
public class SearchService : Sitecore.XA.Foundation.Search.Services.SearchService
{
    protected override Expression<Func<ContentPage, bool>> ContentPredicate(string content)
    {
        var expression = PredicateBuilder.True<ContentPage>();
        foreach (var t in content.Split().TrimAndRemoveEmpty())
            expression = expression.And((ContentPage i) => i.AggregatedContent.Contains(t) || i.AggregatedContent.MatchWildcard(t));
        return expression;
    }
}
The code above generates exactly the query we wanted. It may seem a bit odd and even feel like a "bug". But look at the official documentation on this page (https://doc.sitecore.com/developers/92/sitecore-experience-manager/en/linq-to-sitecore.html):
results = queryable.Where(i => i.Template.MatchWildcard("H?li*m"));
The intention of MatchWildcard is to accept an expression that contains custom wildcards and pass it to the query as it is. That's why it works for our case. To be honest, this may not be the best fit for the method. However, compared to the effort solution two requires to override the Sitecore Azure provider, it saves a huge amount of work and has less potential impact on the site. Of the three solutions here, this is the one I prefer.
The solution has been tested with Sitecore 9.2.0 and SXA 1.9 only. I also checked the code in Sitecore 9.3, and in theory it should work there too, but it is best to double-check.