Improving Magento search results with Elasticsearch

Why is search so important?

We all understand how important search is to the internet: imagine trying to find anything without search engines such as Google or Bing. It follows that if you want people to find your store, you need good search engine optimisation; it is critical to an eCommerce store’s success. But once a customer has arrived at your site, how do they continue to discover other products? Customers are used to searching, so it is important that your search delivers the results they expect. A poor search on your site could send them back to the search engine and potentially to a competitor.

Default search, disappointing results

Unfortunately, Magento’s default search tends to provide disappointing results. It can vary from not matching anything to matching far too much. Customers come away either thinking you don’t stock the item they are looking for, or tired of scrolling through page after page of irrelevant results.

So why is Magento’s default search so poor? Full text search is a very specialised technology. By default Magento’s search uses MySQL, which is a general purpose database and does not have the features required to produce a good quality search.

Magento Enterprise and Solr

Magento Enterprise has solved the search problem by replacing MySQL with Apache Solr as its full text search engine. Because Enterprise uses Solr, it can give you much greater control over how the search is performed, allowing you to tune the results and ensure your customers find the items they were looking for.

Why ElasticSearch?

So if Magento Enterprise’s search is so great, why look at ElasticSearch? First, not everyone has Magento Enterprise, and we at iWeb wanted to develop a new search which could be used by any of our clients who wanted it. Secondly, whilst Solr is an excellent full text search engine, we felt that ElasticSearch had a number of significant advantages.

Both ElasticSearch and Solr are full text search engines built on top of Lucene. Lucene is a search library specialising in text analysis and, unlike MySQL, it is designed to work with text and is able to analyse natural language. Lucene is also fiendishly fast.

Because ElasticSearch and Solr are built on the same full text library, they have similar search performance, but ElasticSearch is a newer search engine which has been designed from the beginning to operate in the cloud. It also has a modern API which makes managing and configuring the search engine easy, allowing us to build much more sophisticated integrations with Magento.

ElasticSearch at iWeb

At iWeb we have extensive experience of full text search with Apache Solr. We have been using it on a variety of websites for nearly 10 years.

Our initial experience of ElasticSearch was as a log analysis tool, and it was here that we could see one of the great advantages of ElasticSearch: it was able to index thousands of lines of log files per hour.

This is a key difference between our experience of Solr and ElasticSearch: ElasticSearch’s ability to index new data and serve searches simultaneously. This means we can keep the search up to date with real time data, such as stock, without disrupting the search.

Features

Here at iWeb we have written a Magento extension to take advantage of the power of ElasticSearch. It adds an improved autocomplete, a new custom index, and fuzzy spell checking.

Autocomplete

A key feature of ElasticSearch is its near realtime autocompletion. ElasticSearch can search faster than a customer can type, which makes it perfect for a “search as you type” autocomplete. The problem is that Magento is very heavy: even when it only has to respond to small Ajax requests, it can still take hundreds of milliseconds. This is okay for full page loads, but you notice it as you type in a search box, where there is a pause between pressing a key and seeing the suggestions.

Our solution was to bypass Magento altogether and talk to ElasticSearch directly. The responses are more than ten times faster, and the search can easily keep up with the customer as they type. We also generate a thumbnail and index the URL of each product, so that the customer can see a preview of the product as they type and can go straight to the product if they click on it.
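As a rough illustration of what talking to ElasticSearch directly can look like, here is a minimal Python sketch. It is not the code from our extension: the URL, the index name (products) and the field names (name, url, thumbnail) are all assumptions made up for the example, and match_phrase_prefix is just one of several ElasticSearch query types that suit search as you type.

```python
import json
from urllib import request

# Assumed endpoint and index name, purely for illustration.
ES_URL = "http://localhost:9200/products/_search"


def build_autocomplete_query(typed: str, size: int = 5) -> dict:
    """Build a small search-as-you-type request body for the text typed so far."""
    return {
        "size": size,
        # Only return what the suggestion list needs (hypothetical field names).
        "_source": ["name", "url", "thumbnail"],
        # Treat the last word typed as a prefix of a longer word.
        "query": {"match_phrase_prefix": {"name": typed}},
    }


def fetch_suggestions(typed: str) -> list:
    """Send the query straight to ElasticSearch, without going through Magento."""
    body = json.dumps(build_autocomplete_query(typed)).encode("utf-8")
    req = request.Request(
        ES_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with request.urlopen(req, timeout=1) as resp:
        hits = json.load(resp)["hits"]["hits"]
    return [hit["_source"] for hit in hits]
```

Because the thumbnail and URL are stored alongside the product name, a single small response contains everything needed to draw the suggestion and link to the product.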

Custom Search Indexer

Magento’s current search is built from the full text index in the Magento admin. This index takes all of the product data and converts it into a form that can be indexed by MySQL. The data is combined into a single text field per product. The problem with this is that the context of the text is lost: we don’t know whether the text is part of the description, the title, or an attribute. It is important to know the context if you want to have a successful search.

iWeb have created a custom indexer, in fact we created two, to index the product data. We ensure the context of the data is preserved when it is inserted into ElasticSearch. This enables us to tune a search so that a term matching an SKU scores higher than the same term found only in the description.
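To show why keeping fields separate matters, here is a minimal Python sketch of a query that weights fields differently. The field names and boost values are assumptions chosen for the example, not the settings used by our module; the field^boost notation itself is standard ElasticSearch multi_match syntax.

```python
def build_product_query(terms: str) -> dict:
    """Build a query that searches several fields with different weights."""
    return {
        "query": {
            "multi_match": {
                "query": terms,
                # Hypothetical fields and boosts: an SKU match counts for far
                # more than a name match, which counts for more than a match
                # in the description.
                "fields": ["sku^10", "name^3", "description"],
            }
        }
    }
```

None of this is possible when all of the product data has been merged into a single text field, because there is nothing left to attach the different weights to.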

Why two indexers?

iWeb have created not one but two indexers: an attribute indexer and a product indexer. As you would expect, the product indexer indexes all of the products on your site. The attribute indexer was created so that we can maximise the features of ElasticSearch with minimum index time.

Performing a full product index can take a while: Magento has to analyse the data of all of your products and insert it into ElasticSearch. Many of ElasticSearch’s features are applied at index time, when the data is first inserted: ElasticSearch analyses the data and stores the result. Our module allows you to alter how fields are analysed, but the problem is that ElasticSearch won’t re-analyse items which have already been indexed, and running a full reindex of your products takes time and can slow down your website.

The solution to this problem is the attribute indexer, which allows you to change how ElasticSearch analyses the data without having to reindex all of the products. The attribute indexer synchronises your changes to attribute data (for example, making a new field searchable) with ElasticSearch and then tells ElasticSearch to rebuild the index based on the data it already has. Running the attribute indexer is significantly faster than running the product indexer, so you can make changes to how your data is indexed without having to run a full reindex of your data.

Fuzzy Search

As well as providing the standard “AND” and “OR” search, the iWeb ElasticSearch module introduces a “FUZZY” operator. The problem with the standard “AND” and “OR” is that they don’t always give good results. An “AND” search requires all of the terms in a search to match, so the results won’t include a product if even one term is missing. Potential product matches are excluded because the search is too exact. The “OR” search has the opposite problem: a product will be matched even if only one term matches. If it is a common term, you could end up with hundreds of results.

ElasticSearch goes some way to improving this, because it is able to rank search results much better than the default search, so even if you have hundreds of matches the most suitable ones should be at the top.

The “FUZZY” search strikes a balance between the two extremes: instead of saying all or any of the terms need to match, we can say the majority of the terms need to match. So if you type a search with three terms, a product will need to match at least two of them. Products matching all three terms still score the highest and appear at the top, but the other results are still good quality.
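The majority rule can be sketched in a few lines of Python. This is a simplified illustration rather than our module’s code: it matches whole words only, treats “majority” as more than half of the terms, and ranks purely by the number of matching terms, all of which are assumptions for the example. In ElasticSearch itself a rule like this is usually expressed with the minimum_should_match parameter.

```python
def required_matches(term_count: int) -> int:
    """Number of terms that must match: a strict majority (assumed rule)."""
    return term_count // 2 + 1


def search(query: str, products: list) -> list:
    """Return products matching a majority of the query terms, best first."""
    terms = query.lower().split()
    needed = required_matches(len(terms))
    scored = []
    for name in products:
        words = set(name.lower().split())
        hits = sum(1 for term in terms if term in words)
        if hits >= needed:
            scored.append((hits, name))
    # Most matching terms first; ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored]


catalogue = [
    "Red Canvas Shoes",
    "Red Leather Shoes",
    "Red Leather Belt",
    "Blue Canvas Hat",
]
results = search("red leather shoes", catalogue)
```

Here the product matching all three terms comes first, the two products matching two terms follow, and the hat, which matches none of them, is left out.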

The “FUZZY” search is also able to deal with spelling mistakes. It calculates the Levenshtein edit distance: the number of single-character changes that need to be made to one term to make it the same as another. For example, Lvoe -> Love is an edit distance of one, swapping the o and the v, because a swap of two adjacent characters is counted as a single change. So if somebody searched for “Lvoe” they would get results matching “Love”.
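For readers who want to see the calculation, here is a minimal Python sketch of an edit distance that counts a swap of two adjacent characters as one change. It is an illustration of the idea only, not the code ElasticSearch or our module runs. With transpositions switched off it becomes the classic Levenshtein distance, under which Lvoe -> Love would cost two changes instead of one.

```python
def edit_distance(a: str, b: str, transpositions: bool = True) -> int:
    """Count the single-character edits needed to turn a into b."""
    rows, cols = len(a) + 1, len(b) + 1
    # d[i][j] is the distance between the first i chars of a and first j of b.
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete every character
    for j in range(cols):
        d[0][j] = j  # insert every character
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or no change)
            )
            # Optionally treat swapping two adjacent characters as one edit.
            if (transpositions and i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]


swapped = edit_distance("lvoe", "love")
classic = edit_distance("lvoe", "love", transpositions=False)
```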

Future Features

We are constantly improving our search offering, responding to store owners’ feedback to improve the quality of the search results. We are currently working on analysing sales data so that the search can perform conversion calculations and boost products which convert well for particular search terms. We are also looking at giving the store owner the ability to boost products they wish to promote.

Thanks for reading. Contact us about our search and how we can use it to improve your website.
