When users go to your application and use search to find data, you want to give them a relevant response quickly. The definition of “relevant” depends largely on the nature of your application and your users. Azure Search provides building blocks for defining search relevance in your application. The feature is called a “scoring profile”. If you haven’t looked at scoring profiles before, you can learn more about them here.
In the results that come back from a search query, all hits are scored and sorted from highest to lowest score (by default; you can also override the sort order of results). A scoring profile lets you boost the search scores of documents based on numeric values in those documents (e.g. favor items with a higher star rating, or with a higher profit margin), based on dates (e.g. favor newer documents over older ones), and based on distance to a reference point (e.g. favor items that are geographically closer to a reference location that’s passed as part of the query).
There’s one scenario that can’t be modeled in the current implementation of scoring profiles: boosting based on personalized data. Let’s say you have customers that purchase items from you regularly. For each customer, you track the 3 or 4 brands they buy most often. Now you’d like to boost documents in search results when those documents represent products of the preferred brands. Note that this is contextual; each user would have a different set of top-K brands they prefer.
In our experimental API (2014-10-20-Preview), we’re introducing a new scoring function called “tag” to handle this scenario.
Tag boosting
The new “tag” scoring function is used as follows:
1. As part of your index definition, create a scoring profile with a tag scoring function where you indicate which field in your document contains a tag or label list (typically this is used with a string collection field). In the example above, each product would have one or more tags indicating the brand of the product.
2. At query time, each request also takes a list of tags or labels which correspond to the context you’re trying to establish. Continuing with the example, you’d load the list of top brands for this particular customer (e.g. perhaps maintained as part of the customer profile record).
3. During scoring, Azure Search will give a boost to documents that have tags in common with the input query.
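To get a feel for the effect of step 3, here is a toy model in Python. It is only a sketch: it assumes the boost is a simple multiplier applied when a document shares at least one tag with the query, which is not necessarily how Azure Search combines scores internally. The function name, the sample documents and their base scores are all made up for illustration.

```python
def boosted_score(base_score, doc_tags, query_tags, boost=2.0):
    """Toy model (an assumption, not the service's formula):
    multiply the text-match score by `boost` when any tag overlaps."""
    if set(doc_tags) & set(query_tags):
        return base_score * boost
    return base_score


# Hypothetical hits with their plain text-match scores.
docs = [
    {"id": "1", "score": 1.0, "brandTags": ["brandA"]},
    {"id": "2", "score": 1.2, "brandTags": ["brandC"]},
]

# Tags loaded for the current customer at query time.
customer_brands = ["brandA", "brandB"]

ranked = sorted(
    docs,
    key=lambda d: boosted_score(d["score"], d["brandTags"], customer_brands),
    reverse=True,
)
print([d["id"] for d in ranked])
```

Without the boost, document 2 would rank first on text score alone; with it, document 1 moves ahead because it carries one of the customer's preferred brands.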
To make this concrete, let’s walk through these steps with a sample index definition. Let’s say our index looks like this:
{
  "name": "products",
  "fields": [
    { "name": "id", "type": "Edm.String", "key": true },
    { "name": "name", "type": "Edm.String" },
    { "name": "brandTags", "type": "Collection(Edm.String)", "searchable": false }
  ],
  "scoringProfiles": [
    {
      "name": "personalized",
      "functions": [
        {
          "type": "tag",
          "boost": 2,
          "fieldName": "brandTags",
          "tag": { "tagsParameter": "brands" }
        }
      ]
    }
  ]
}
Above you can see the index definition with 3 fields (id, name and brandTags) and a scoring profile with a single scoring function. With this in place, we can index documents the usual way and then start issuing queries.
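As a rough illustration of what indexing documents for this index could look like, the Python sketch below builds a JSON batch with the three fields defined above. The sample products are invented, and the body shape (a "value" array of actions marked with "@search.action") reflects the Azure Search document indexing request as I understand it; check it against the docs for the API version you target.

```python
import json


def make_upload_batch(products):
    """Build a JSON indexing body from (id, name, brands) tuples.

    Assumed shape: {"value": [{"@search.action": "upload", ...}, ...]}
    """
    actions = [
        {
            "@search.action": "upload",
            "id": product_id,
            "name": name,
            # One or more brand tags per product, matching the brandTags field.
            "brandTags": list(brands),
        }
        for product_id, name, brands in products
    ]
    return json.dumps({"value": actions}, indent=2)


# Hypothetical sample products.
batch = make_upload_batch([
    ("1", "Trail running shoe", ["brandA"]),
    ("2", "Rain jacket", ["brandC"]),
])
print(batch)
```

The resulting string is what you would send as the request body when uploading documents to the products index.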
To issue a query, we include all the usual search options you would use in your application, select the scoring profile by name, and pass the parameter we defined in that profile (“brands”):
https://…/indexes/products/docs?search=&scoringProfile=personalized&scoringParameter=brands:brandA,brandB&api-version=2014-10-20-Preview
In this request, documents are scored on the quality of text matches as usual, but those tagged brandA or brandB also get an extra boost, making it more likely that our customer will see what she’s looking for.
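In an application you would typically assemble this query string from the customer's stored brand list. The Python sketch below shows one way to do that. The helper name and the base URL are hypothetical placeholders (the service host is not given here), and besides scoringParameter it passes scoringProfile and api-version, which the service needs in order to pick the profile and the preview API version.

```python
from urllib.parse import urlencode

API_VERSION = "2014-10-20-Preview"


def build_tag_query(base_url, index, search_text, profile, parameter, tags):
    """Build a search URL that applies a tag scoring profile.

    `base_url` is a placeholder for your own service endpoint.
    """
    params = [
        ("search", search_text),
        ("scoringProfile", profile),
        # Format: <parameter name>:<comma-separated tag list>
        ("scoringParameter", f"{parameter}:{','.join(tags)}"),
        ("api-version", API_VERSION),
    ]
    # Keep ':' and ',' readable instead of percent-encoding them.
    return f"{base_url}/indexes/{index}/docs?{urlencode(params, safe=':,')}"


# Hypothetical customer profile lookup result.
top_brands = ["brandA", "brandB"]
url = build_tag_query(
    "https://myservice.example", "products", "shoes",
    "personalized", "brands", top_brands,
)
print(url)
```

This simple version assumes tag values contain no commas, since the comma separates tags in the parameter value.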
Scenarios
As described above, the tag scoring function is applicable to cases where you want to personalize search ranking.
In ecommerce applications, you could use your customers’ purchase history to produce tags for each of them, use machine learning or clustering techniques to group and tag them based on what they’ve put in their shopping carts, or even tag them manually. Identifying and tagging customers can start simple and get more sophisticated over time; there’s no need to get fancy to get started.
You can also use tag boosting to give different results based on role. For example, if you are indexing documentation about several topics and you know each user’s affinity to certain topics, you could use tag boosting to make documents on each user’s preferred topics more visible. Similarly, in a line-of-business application used by different departments, you can use the user’s department as a query tag and boost the documents (e.g. customer records, marketing campaigns, contact information, whatever you’re indexing) most related to their particular area of work, assuming each document has an “area” or “areas” field that captures the affinity of each document to one or more areas.
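The department scenario uses the same pieces as the brand example, just with a different field and parameter. As a sketch, the Python helper below generates a tag scoring profile in the same shape as the "personalized" profile shown earlier. The helper, the profile name "byDepartment" and the parameter name "departments" are hypothetical; "areas" is the field suggested above.

```python
def tag_scoring_profile(profile_name, field_name, parameter_name, boost=2):
    """Return a scoring profile dict with a single tag scoring function."""
    return {
        "name": profile_name,
        "functions": [
            {
                "type": "tag",
                "boost": boost,
                "fieldName": field_name,
                "tag": {"tagsParameter": parameter_name},
            }
        ],
    }


# Profile to add under "scoringProfiles" in the index definition.
profile = tag_scoring_profile("byDepartment", "areas", "departments")

# Value for the scoringParameter query option for a user in marketing.
user_departments = ["marketing"]
scoring_parameter = "departments:" + ",".join(user_departments)
print(profile)
print(scoring_parameter)
```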
When can I use it?
This is already available in all services. It’s only accessible using the experimental 2014-10-20-Preview API version since we’re still collecting feedback on the approach. You can ask questions and send us feedback using the Azure Search forums.
You can find the exact details for tag scoring functions in the 2014-10-20-Preview docs here.
Liam Cavanagh can be contacted at his blog or through twitter.