A/B Testing
Algolia has in some ways become its own client. It has leveraged two signature features - Relevance Tuning and Analytics - to create a new tool: A/B Testing. Let’s see what this means.
- Relevance Tuning enables you to give your users the best search results. Algolia offers numerous settings and methods to manage relevance.
- Analytics makes relevance tuning data-driven, ensuring that your configuration choices are sound and effective.
Relevance tuning, however, can be tricky. The choices are not always obvious. It is sometimes hard to know which settings to focus on and what values to set them to. It is also hard to know if what you’ve done is useful or not. What you need is input from your users, to test your changes live.
This is what A/B Testing does. It allows you to create 2 alternative search experiences, A and B, each with its own settings, and to put them both live, to see which one performs better.
A/B testing defined
- Create two alternatives (variants), with only a small difference in their relevance settings. Call them A and B.
- Put them both live on your website, but make it transparent to your users.
- Present A to some users and B to the rest (the split is determined by a unique user id).
- With Algolia’s analytics, capture the same user events for both A and B.
- Measure these captured events against each other, creating scores.
- Use these scores to determine whether A or B is a better user experience.
- Adjust your settings accordingly.
- Start a new test (if you want to make any further configuration changes).
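To illustrate how a split based on a unique user id can stay stable from one search to the next, here is a minimal sketch that hashes the id into one of 100 buckets. This is not Algolia's actual assignment logic, which is not described here; the function name `assign_variant` and its parameters are hypothetical.

```python
import hashlib


def assign_variant(user_id: str, test_name: str, b_percentage: int) -> str:
    """Return "A" or "B" for a user. The same inputs always give the same answer."""
    # Hash the test name together with the user id, so that a user's bucket
    # in one test does not determine their bucket in another test.
    digest = hashlib.sha256(f"{test_name}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # a number from 0 to 99
    # Buckets below the threshold see variant B, the rest see variant A.
    return "B" if bucket < b_percentage else "A"


# Hypothetical usage: roughly 10% of users should land in B.
users = [f"user-{i}" for i in range(10_000)]
share_b = sum(assign_variant(u, "likes-ranking", 10) == "B" for u in users) / len(users)
```

Because the assignment depends only on the id and the test name, a returning user keeps seeing the same variant for the whole test, which keeps the experience consistent and the measurements clean.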
With this feature, you run alternative indices or searches in parallel, capturing click and conversion analytics to compare effectiveness. You make small incremental changes to your main index or search and have those changes tested - live and transparently by your customers - before making them official. A/B Testing goes directly to an essential source of information - your users - by including them in the decision-making process, in the most reliable and least burdensome way.
A/B comparative testing is widely used in the industry to measure the usability and effectiveness of a website. Algolia’s focus is on measuring search, and more specifically relevance: Are your users getting the best search results? Is your search effective in engaging and retaining your customers? Is it leading to more clicks, more sales, more activity for your business?
A/B Testing - Implementation
We have designed A/B Testing with simplicity in mind, to encourage you to perform A/B Testing regularly and often. A/B Testing does not require any coding intervention. It can be managed from start to finish by people with no technical background.
1. Prerequisite: Collect clicks and conversions
In order to perform A/B testing, you need to set up your click and/or conversion analytics. This is the only way to test how each variant is performing.
While A/B testing itself does not require coding, sending clicks and conversions does.
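As a rough idea of what that coding involves, the sketch below builds the body of a click or conversion event. The field names (`eventType`, `eventName`, `index`, `userToken`, `queryID`, `objectIDs`, `positions`) are assumed from the Insights events format and should be checked against the current API reference; the helper `build_event` and the sample values are hypothetical. The sketch only builds the payload and does not send it.

```python
import json


def build_event(event_type, event_name, index, user_token, query_id, object_ids, positions=None):
    """Build one analytics event as a plain dict (assumed field names)."""
    if event_type not in ("click", "conversion"):
        raise ValueError("event_type must be 'click' or 'conversion'")
    event = {
        "eventType": event_type,
        "eventName": event_name,
        "index": index,
        "userToken": user_token,  # the unique user id also used for the A/B split
        "queryID": query_id,      # ties the event back to the search that produced the hit
        "objectIDs": list(object_ids),
    }
    if event_type == "click":
        # Assumption: a click reports the position of each clicked hit.
        if positions is None or len(positions) != len(object_ids):
            raise ValueError("a click needs one position per objectID")
        event["positions"] = list(positions)
    return event


# Hypothetical values, for illustration only.
click = build_event("click", "Product Clicked", "products", "user-42",
                    "example-query-id", ["item-1"], positions=[3])
body = json.dumps({"events": [click]})
```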
2. Set up the index or query
We allow 2 kinds of A/B Tests:
- Test different index settings (index setup required).
- Test different search params (no index setup required).
3. Run the A/B test
- Use the Dashboard to create the A/B test
- Start the A/B test
- Review, interpret, and then act based on the results
A/B Testing - Dashboard
To access A/B testing analytics and create A/B tests, you should go through the A/B testing tab of the Dashboard. The A/B testing section provides very basic analytics, so you may want to get more information through the Analytics tab. However, because all search requests to an A/B test first target the primary (‘A’) index, viewing a test directly on the Analytics tab will include searches to both the A and B variants.
To let you view detailed analytics for each variant independently, we automatically add tags to A/B test indexes. To access the variant’s analytics, click on the small analytics icon on the right of each index description, in the A/B test tab: it appears as a small bar graph in the figure above. The icon will automatically redirect you to the Analytics tab with the appropriate settings (time range and analyticsTags) applied.
Examples - What kind of tests can you make?
As already mentioned, we allow 2 kinds of A/B Tests:
- different index settings
- different search settings
For index-based testing, you can test:
- Changing your index settings
- Reformatting your data
For search-based settings, you can test many different kinds of settings, like:
- disabling typo tolerance and other such engine-level settings
- disabling query rules
- using optional filters
- etc.
Example 1, Changing your index settings
Add a new custom ranking with the attribute number_of_likes
- You’ve recently offered your users the ability to like your items, which include music, films, and blog posts.
- Now you have a large amount of likes and dislikes, and you’d like to use this information to sort your search results.
- So you add a number_of_likes attribute to A, create B (a replica of A), and then adjust A and B’s settings accordingly:
- A does not sort by number_of_likes (so it’s the same as before)
- B sorts by number_of_likes
- You name your test “Test new ranking with number_of_likes”.
- You want this test to run for 30 days, to be sure to get enough data and a good variety of searches.
- You set B at only 10% usage, because of the risk of introducing a new sort order. You don’t want to change the user experience for too many users until you’re absolutely sure the change is desirable.
- When your test reaches 95% confidence or greater, you will be able to see whether there was any improvement, and whether the improvement is large enough to justify the cost of implementing it. In most cases, a settings change costs nothing: it’s just a simple configuration change on the Dashboard.
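To give a feel for what a confidence figure such as 95% expresses, here is a minimal sketch of a textbook two-proportion z-test on click-through rates. It is not necessarily the calculation the Dashboard performs; the function name and the sample counts are hypothetical.

```python
import math


def confidence_rates_differ(clicks_a, searches_a, clicks_b, searches_b):
    """Return 1 minus the two-sided p-value of a two-proportion z-test."""
    rate_a = clicks_a / searches_a
    rate_b = clicks_b / searches_b
    # Pooled rate under the assumption that A and B perform the same.
    pooled = (clicks_a + clicks_b) / (searches_a + searches_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / searches_a + 1 / searches_b))
    if std_err == 0:
        return 0.0  # no variation at all, so nothing can be concluded
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value of a standard normal variable: erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return 1 - p_value


# A 90/10 split: A gets 9,000 searches, B gets 1,000 (made-up numbers).
clear_win = confidence_rates_differ(900, 9000, 150, 1000)   # 10% vs 15% click-through
too_close = confidence_rates_differ(900, 9000, 102, 1000)   # 10% vs 10.2% click-through
```

With these made-up counts, the first comparison clears the 95% bar and the second does not. With only 10% of traffic, B collects data slowly, which is one reason a small split needs a longer test.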
Example 2, Reformatting your data
Add a new search attribute: short_description
- Your company has added a new short description for every item. You want to see if this short description will help return more relevant results.
- Add the new attribute short_description to Index A.
- Create replica B, which will have all of the same attributes and settings as A.
- For B only, add the new attribute to its searchableAttributes settings.
- You create a test called “Testing the new short description”.
- You have enough traffic to know that 7 days is sufficient.
- For the same reason as example 1, you give B only 30% usage (70/30) - because of the risk. You estimate that after 7 days, there will be enough data for both A and B to make a decision, and you’d rather not risk degrading an already good search with an untested attribute.
- Once you reach 95% confidence, you can judge the improvement and the cost of implementation to see whether this change is beneficial.
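The setup in this example can be pictured as two settings objects that differ in exactly one place. In the sketch below, `searchableAttributes` is the setting named above, while the attribute names `name` and `description` and the helper `settings_diff` are hypothetical.

```python
# Index A: existing settings (hypothetical attribute names).
settings_a = {
    "searchableAttributes": ["name", "description"],
}

# Replica B: same settings, plus the new attribute made searchable.
settings_b = {
    "searchableAttributes": settings_a["searchableAttributes"] + ["short_description"],
}


def settings_diff(a: dict, b: dict) -> dict:
    """Return the settings whose values differ, as {key: (value_in_a, value_in_b)}."""
    keys = sorted(set(a) | set(b))
    return {key: (a.get(key), b.get(key)) for key in keys if a.get(key) != b.get(key)}


diff = settings_diff(settings_a, settings_b)
```

Checking that the difference is limited to the one setting under test helps ensure that any change in clicks or conversions can be attributed to that setting.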
Example 3, Enabling/disabling query rules: compare a query with and without merchandising
For this, you will have two kinds of searches:
- one with query rules enabled
- one with query rules disabled
Add a query rule to promote a hit
- Your company has just received the new iPhone. You want this item to appear at the top of the list for all searches that contain “apple” or “iphone” or “mobile”.
- You want to see whether putting the new iPhone at the top will encourage more traffic or sales.
- You create a test called “Testing newly-released iPhone merchandising”.
- Your index should have query rules enabled. This is A.
- For your variant B, add a query parameter to disable query rules.
- You have enough traffic to know that 7 days is sufficient.
- For the same reason as example 1, you give B only 30% usage (70/30) - because of the risk. You estimate that after 7 days, there will be enough data for both A and B to make a decision, and you’d rather not risk degrading an already good search with an untested change.
- Once you reach 95% confidence, you can judge the improvement and the cost of implementation to see whether this change is beneficial.
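The test in this example is created from the Dashboard, but it can help to see it written out as data. The sketch below describes the same test as a plain dict. The field names (`name`, `variants`, `index`, `trafficPercentage`, `customSearchParameters`, `endAt`) are assumptions modelled on the A/B testing REST API and should be verified before use; `enableRules` is the search parameter assumed to switch query rules off, and the index name `products` is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def build_ab_test(name, index, b_percentage, b_search_params, days, start):
    """Describe a search-parameter A/B test: both variants use the same index."""
    if not 0 < b_percentage < 100:
        raise ValueError("b_percentage must be between 1 and 99")
    end_at = start + timedelta(days=days)
    return {
        "name": name,
        "variants": [
            # A: the index as it is, with query rules enabled.
            {"index": index, "trafficPercentage": 100 - b_percentage},
            # B: same index, with extra search parameters applied to every query.
            {
                "index": index,
                "trafficPercentage": b_percentage,
                "customSearchParameters": dict(b_search_params),
            },
        ],
        "endAt": end_at.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }


test_definition = build_ab_test(
    name="Testing newly-released iPhone merchandising",
    index="products",                         # hypothetical index name
    b_percentage=30,                          # the 70/30 split from the example
    b_search_params={"enableRules": False},   # B searches without query rules
    days=7,
    start=datetime(2024, 1, 1, tzinfo=timezone.utc),
)
```

Because B only changes a search parameter, no replica index is needed, which matches the "no index setup required" kind of test.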