Michal Komoch
Elasticsearch

How to Personalize Search Results with Elasticsearch

A few years ago, a simple search engine was sufficient: a few search criteria on top of a relational database were enough. But no solution stays universal forever. As data volumes grew, new search tools appeared, and one of their undeniable advantages is the ability to influence how results are scored and sorted. I’d like to provide a few examples of how we can manipulate search results with Elasticsearch.

Searching for data was, is and will be an indispensable part of almost every web application. In the past few years, the approach to searching has changed. As web applications developed and large data sets moved to online services, the amount of data increased rapidly. Existing search tools and solutions became insufficient: the good old MySQL-based search engine was no longer fast or flexible enough.

Let’s see what those new search tools can do.

A search engine used for searching through job applications would be a great example here. Let’s say that we have a web application with multiple companies. Each company stores job applications (CVs) of users applying for a job. Almost everyone knows how hard it is to get a good employee that fits our requirements.

Our search engine needs to have multiple search criteria. This sounds quite easy, but what if every company wants to search for applicants in its own, personalized manner? There are many ways to do this. One of them is to add “weights” to certain search parameters. Going further, we can attach these weights to a certain company and give their company admins permissions to manage them.

Elasticsearch is one of the search tools that help you build a search engine with personalized weights. It lets us create an advanced search engine with little effort and, what’s more, influence the results. Beyond the search engine itself, Elasticsearch also gives us scalability and makes the application easier to extend when new functionality is added in the future.

Examples are based on this technology stack:

– Elasticsearch 2.3.5
– Symfony 2.8 (FOSElasticaBundle included)

In the examples, I will rely on two attributes attached to user applications: employment type and work type.

Let’s say we’ve got two employment types:

1 – Permanent job
2 – Temporary job

and two work types:

1 – Full time
2 – Part time

I filled Elasticsearch with some sample data:

RECORD ONE (applicant):

– Employment type = 1
– Work type = 1

RECORD TWO (applicant):

– Employment type = 2
– Work type = 1

RECORD THREE (applicant):

– Employment type = 1
– Work type = 2

Introduction

A basic search query without weights, and its result, look like this:

[Image: basic Elasticsearch query without weights and its result]
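As a rough sketch of what such an unweighted query could look like, the snippet below builds a bool query with one term clause per search criterion. The field names are taken from the weights configuration used later in this article; the helper name `buildBasicQuery`, the use of `should` clauses and the searched values are my assumptions, not the exact query from the original example.

```php
<?php
// Hypothetical helper: builds a plain bool query with no boosts at all.
function buildBasicQuery(array $criteria)
{
    $should = [];
    foreach ($criteria as $field => $value) {
        // One term clause per search criterion; every clause counts equally.
        $should[] = ['term' => [$field => $value]];
    }

    return ['query' => ['bool' => ['should' => $should]]];
}

$basicBody = buildBasicQuery([
    'employment_types.id' => 1, // permanent job
    'work_types.id'       => 1, // full time
]);

// Print the request body that would be sent to Elasticsearch.
echo json_encode($basicBody, JSON_PRETTY_PRINT), PHP_EOL;
```

With a query of this shape, a record matching both clauses would be expected to score higher than a record matching only one of them.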

Let’s keep these baseline scores in mind:

1 → 2.71 (candidate 1)
2 → 0.567 (candidate 2)
3 → 0.567 (candidate 3)

Example 1: Personalize search weights

Weights configuration:

similarity_tresholds:
    work_types.id:                    1.0
    employment_types.id:              2.0
    industries.id:                    2.0
    experiences.company.industry.id:  2.0
    schools.education_level.id:       2.0
    salary:                           2.0
    skills:                           2.0
    language_skills:                  2.0
    job_titles.job_title.name:        2.0

In this case, the weights are defined globally in one of the Symfony2 parameters files. But nothing stands in the way of moving this configuration to a database and attaching it to specific companies.
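One possible way to combine both sources is to treat the global configuration as defaults and let each company override individual entries. The sketch below assumes the company weights have already been loaded from the database into an array; the function name `resolveWeights` and the sample override are hypothetical.

```php
<?php
// Hypothetical sketch: company-specific weights override the global defaults.
function resolveWeights(array $globalWeights, array $companyWeights)
{
    // array_replace keeps every global key and overwrites the ones
    // that the company has redefined.
    return array_replace($globalWeights, $companyWeights);
}

// Global defaults, as in the parameters file above (shortened).
$globalWeights = [
    'work_types.id'       => 1.0,
    'employment_types.id' => 2.0,
];

// Assumed to come from a database row belonging to the current company.
$companyWeights = ['work_types.id' => 3.0];

$weights = resolveWeights($globalWeights, $companyWeights);
// $weights is now ['work_types.id' => 3.0, 'employment_types.id' => 2.0]
```

The resulting array can then be passed to the constructor of the class that applies the boosts.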

And what does it look like in PHP?

We pass the declared weights in through the constructor. Then, in the class responsible for manipulating weights, a method checks whether a weight has been defined for a given search criterion.

If so, it sets that weight as the boost on the corresponding part of the Elasticsearch query.

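// Assumed import: BoolQuery as shipped with Elastica (used by FOSElasticaBundle).
use Elastica\Query\BoolQuery;

/** @var array Map of filter name => boost weight, e.g. ['employment_types.id' => 2.0] */
private $filtersToBoost;

public function __construct(array $filtersToBoost = [])
{
    $this->filtersToBoost = $filtersToBoost;
}

/**
 * Sets the configured boost on the sub-query if a weight exists for this filter.
 */
private function applyBoostIfExists($filterName, BoolQuery $boolQuery)
{
    if (array_key_exists($filterName, $this->filtersToBoost)) {
        $boolQuery->setBoost($this->filtersToBoost[$filterName]);
    }

    return $boolQuery;
}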


As a result, we get a simple Elasticsearch query in which the work type parameter has a weight of 1 and the employment type parameter has a weight of 2. This means that the employment type parameter is twice as important as the work type parameter. In practice, it looks like this:

[Image: Elasticsearch query with boosted search parameters]
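To make the shape of such a query more concrete, here is a minimal sketch that mirrors the boost logic with plain PHP arrays instead of Elastica objects. Each criterion is wrapped in its own bool query, and the `boost` key is set only when a weight is configured for that field. The helper name `buildBoostedQuery` and the exact nesting are assumptions; the original query may be structured differently.

```php
<?php
// Hypothetical helper: wraps every criterion in its own bool query and
// attaches a boost when a weight is configured for that field.
function buildBoostedQuery(array $criteria, array $weights)
{
    $should = [];
    foreach ($criteria as $field => $value) {
        $clause = ['bool' => ['must' => [['term' => [$field => $value]]]]];

        // Same check as applyBoostIfExists(): boost only known filters.
        if (array_key_exists($field, $weights)) {
            $clause['bool']['boost'] = $weights[$field];
        }

        $should[] = $clause;
    }

    return ['query' => ['bool' => ['should' => $should]]];
}

$boostedBody = buildBoostedQuery(
    ['employment_types.id' => 1, 'work_types.id' => 1],
    ['employment_types.id' => 2.0, 'work_types.id' => 1.0]
);

echo json_encode($boostedBody, JSON_PRETTY_PRINT), PHP_EOL;
```

A criterion without a configured weight is left without a `boost` key, so Elasticsearch applies its default boost to it.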

As a result, we get:

1 → 2.57 (candidate 1) user with employment type = 1 – score boosted
2 → 0.79 (candidate 3) user with employment type = 1 – score boosted
3 → 0.33 (candidate 2)

As we can see, candidates with employment type = 1 are scored higher. This example shows how we can manage search weights in a simple way.

Example 2: Personalize results with score functions

Search results in Elasticsearch are sorted by their “score” value. If personalizing weights isn’t enough or doesn’t fit our needs, we can multiply a record’s score by the boost_factor parameter.

Let’s say we’d like the scores of records with work type = 1 to be doubled.

The Elasticsearch query would look like this:

[Image: Elasticsearch query with a score function]
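A query of this kind can be sketched with a function_score wrapper around the base query. The snippet below uses a filtered function with the `weight` key, which I am assuming fills the role this article assigns to boost_factor; the helper name `buildWeightedFunctionScore` and the base query are hypothetical as well.

```php
<?php
// Hypothetical helper: multiplies the score of records matching a filter.
function buildWeightedFunctionScore(array $baseQuery, $field, $value, $factor)
{
    return [
        'query' => [
            'function_score' => [
                // The original relevance query whose scores we want to adjust.
                'query' => $baseQuery,
                'functions' => [
                    [
                        // Only records matching this filter get the factor.
                        'filter' => ['term' => [$field => $value]],
                        'weight' => $factor,
                    ],
                ],
                // Multiply the query score by the function result.
                'boost_mode' => 'multiply',
            ],
        ],
    ];
}

$baseQuery = ['bool' => ['should' => [
    ['term' => ['employment_types.id' => 1]],
    ['term' => ['work_types.id' => 1]],
]]];

$functionScoreBody = buildWeightedFunctionScore($baseQuery, 'work_types.id', 1, 2);

echo json_encode($functionScoreBody, JSON_PRETTY_PRINT), PHP_EOL;
```

The intent is that records with work type = 1 have their score doubled, while the remaining records keep the score produced by the base query.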

We get:
1 → 4.06 (candidate 1)
2 → 1.13 (candidate 2)
3 → 0.56 (candidate 3)

Example 3: Score script

An extension of the above functionality is an inline script. One of the languages in which we can write that script is “Groovy”. With that solution, we can personalize results based on record data we have stored in Elasticsearch.

If we want users who are looking for a permanent job (1) and full-time work (1) at the top of the list, we can use a score script to boost those records 4 times.

An example of an Elasticsearch query:

[Image: Elasticsearch query with a Groovy score script]
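As an illustration only, a score script for this case could be attached through function_score as in the sketch below. The Groovy expression returns the multiplier for records that match both conditions and 1 for all others. The helper name `buildScriptScoreQuery`, the parameter names and the assumption that both fields are single-valued numbers are mine; depending on the cluster settings, inline Groovy scripts may also need to be enabled first.

```php
<?php
// Hypothetical helper: boosts records matching both employment and work type.
function buildScriptScoreQuery(array $baseQuery, $employmentType, $workType, $factor)
{
    // Groovy source: return the factor when both fields match, otherwise 1.
    // Values are passed via params so the script text itself stays constant.
    $script = "(doc['employment_types.id'].value == employmentType"
        . " && doc['work_types.id'].value == workType) ? factor : 1";

    return [
        'query' => [
            'function_score' => [
                'query' => $baseQuery,
                'functions' => [
                    [
                        'script_score' => [
                            'script' => [
                                'lang'   => 'groovy',
                                'inline' => $script,
                                'params' => [
                                    'employmentType' => $employmentType,
                                    'workType'       => $workType,
                                    'factor'         => $factor,
                                ],
                            ],
                        ],
                    ],
                ],
                // Multiply the query score by the value the script returns.
                'boost_mode' => 'multiply',
            ],
        ],
    ];
}

$scriptBody = buildScriptScoreQuery(
    ['bool' => ['should' => [
        ['term' => ['employment_types.id' => 1]],
        ['term' => ['work_types.id' => 1]],
    ]]],
    1, // permanent job
    1, // full time
    4  // boost matching records 4 times
);

echo json_encode($scriptBody, JSON_PRETTY_PRINT), PHP_EOL;
```

Because the values travel in `params`, the same script text can be reused for different companies or users by changing only the parameters.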

As a result, we get:

1 → 6.77 (candidate 1) – a user who has employment type = permanent job and work type = full time
2 → 0.64 (candidate 2)
3 → 0.56 (candidate 3)

Conclusion: Good search tools give you power

The personalized search solutions presented here are common and simple. I wanted to show how developers can easily create advanced search engines based on customer needs. The examples above are a great starting point for more complicated conditions.

As someone said: “The sky’s the limit!”


  • This is a great writeup! How does this scale?
    1) Can you provide a different script_score function for each query so that each individual user could have a different personalized results (maybe changing their values but also changing which fields are considered since the personalization for one person might be interested in 3 things [A,B,C] whereas for another person there might be 4 things [A,D,E,F])
    2) Does the Groovy script have to get evaluated for each query, which would be slower because it has to compile?

    Thanks!

    • Michał Kómoch

      Hi Greg! Thanks for your comment.
      I am using Elasticsearch for a complex search engine, but unfortunately we have only 300 000 records, so it is hard to talk about scale in that case.

      Answer 1:
      Yes, you can provide a different script score for every query. It depends on how you would like to create those scripts.
      – If you use Java, PHP or another language to build your application, then you can “build a proper script score” based on personalized “user data” and attach it to the query just before you make the final call.
      – If you are able to put all “relations/conditions” into one BIG script score, then you can write it in a separate file and just attach it to your query.
      It all depends on the approach.

      Answer 2:
      I have never heard of a way to call (or not call) the Groovy script based on search criteria (if I understand your question correctly). But as I mentioned above, while building your query you can decide whether or not to use the script.

      Hope it helps;-)
