Indexing
In order to provide fast and relevant search, our engine restructures your data in a special way through a process called indexing. Our extension runs through all of your data - products, categories, and pages - and creates indexable objects out of it. These objects are then uploaded to our servers, either automatically via the extension’s queue, or manually via the Magento console or the command line. Once pushed to our servers, the objects go through an indexing process that transforms them into searchable data.
To learn about the indexing process, have a look at the documentation.
If you are having any issues with your data, indexes, or queue, please check our troubleshooting guide.
The extension automatically keeps all your data up to date to provide the best search experience for your users. To do this, we provide two indexing mechanisms in Magento:
- Section reindex - An entire section of the catalog (products, categories, pages) is pushed to our servers and reindexed.
- Single item reindex - A single resource (product, category, page) is pushed to our servers and reindexed. This happens when a resource is updated.
By default, the indexing operations run synchronously. This means the Magento administrator has to wait until the indexing process is finished before continuing. Since this is inconvenient, and can cause unexpected issues, we created the indexing queue. This will process all index operations in the background, and has some fail-safes built in.
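The idea of deferring index operations can be pictured with a small in-memory model. This is only an illustrative sketch, not the extension's implementation: the class name `IndexingQueue`, its methods, and the `max_jobs` parameter are all hypothetical, and the built-in fail-safes are not modeled.

```python
from collections import deque


class IndexingQueue:
    """Hypothetical model: operations are recorded now and processed later."""

    def __init__(self):
        self.jobs = deque()
        self.processed = []

    def enqueue(self, operation, resource_id):
        # Called when a resource is saved; returns immediately.
        self.jobs.append((operation, resource_id))

    def run(self, max_jobs):
        # Called in the background; handles at most `max_jobs` jobs per run.
        count = 0
        while self.jobs and count < max_jobs:
            self.processed.append(self.jobs.popleft())
            count += 1
        return count


queue = IndexingQueue()
queue.enqueue("update_product", 42)
queue.enqueue("update_category", 7)
queue.enqueue("delete_product", 13)
handled = queue.run(max_jobs=2)
```

In this model the administrator's save action only pays the cost of `enqueue`, while the actual work happens whenever `run` is called.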
Section Reindex
With indexing queue
With the indexing queue enabled, products are reindexed using temporary indices. Instead of sending all data to the production index, a copy will be created and swapped with the production index only when ready. This approach has several advantages:
- High reindexing speed
- Avoids potential inaccuracies with deleted products
- Lower number of indexing operations needed
Any changes to the index will be visible when the swap of the temporary index has been completed.
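The temporary-index approach can be sketched with plain dictionaries. This is a simplified in-memory model under stated assumptions, not the extension's code or the Algolia API: the function `reindex_with_swap` and the `_tmp` suffix are hypothetical, and records are keyed by `objectID` for illustration.

```python
def reindex_with_swap(indices, name, records):
    """Rebuild `name` in a temporary index, then swap it into place."""
    tmp_name = name + "_tmp"  # hypothetical temporary index name
    # Build the full copy first; the production index is untouched here.
    indices[tmp_name] = {record["objectID"]: record for record in records}
    # Swap: the temporary index replaces production in a single step.
    indices[name] = indices.pop(tmp_name)
    return indices[name]


# Production index before the reindex; product "2" has since been deleted.
indices = {
    "products": {
        "1": {"objectID": "1", "name": "Old name"},
        "2": {"objectID": "2", "name": "Deleted product"},
    }
}
reindex_with_swap(indices, "products", [{"objectID": "1", "name": "New name"}])
```

Because the new index is built from the current catalog only, the deleted product disappears without a separate delete operation, which is one way to read the advantages listed above.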
Without indexing queue
Without the indexing queue enabled, the reindex process has to handle the complete catalog synchronously. This means everything has to wait until the process is completed. Any update has to be pushed to our servers in order to have up-to-date data.
Processing large indices synchronously may trigger PHP timeouts.
This process not only takes more time than reindexing with the queue enabled, it also uses more resources. In addition, it’s slightly less reliable, as product updates performed during the process may not be handled.
Enabling the indexing queue is highly recommended for doing any full reindexing, especially on large catalogs.
Automatic Indexing
Our extension will send every update and deletion on products or categories to our servers to keep all data up-to-date. The indexers’ behavior can be changed to prevent these update calls, and only update the data through manual reindexing. For this to work, the indexers’ mode should be set to ‘Manual Update’.
Manual Indexing
With Manual Indexing enabled, the command line has to be used to send updates of the data to our server. For example, this is the command to completely reindex all products:
$ php path/to/magento/bin/magento indexer:reindex algolia_products
The same command can be used for all other indices created by our extension:
- algolia_products - Reindexes all products
- algolia_categories - Reindexes all categories
- algolia_pages - Reindexes all CMS pages
- algolia_suggestions - Reindexes all search query suggestions
- algolia_additional_sections - Reindexes all additional sections
- algolia_queue_runner - Processes jobs in the indexing queue
- algolia_delete_products - Removes inactive products from Algolia indices
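When several indexers need to be run from a script, the commands can be built from the indexer names listed above. The following is a minimal sketch: the helper `reindex_command` and its `magento_bin` parameter are hypothetical, and `path/to/magento` is the same placeholder used in the commands on this page.

```python
# Indexer names as listed in this documentation.
INDEXERS = [
    "algolia_products",
    "algolia_categories",
    "algolia_pages",
    "algolia_suggestions",
    "algolia_additional_sections",
    "algolia_queue_runner",
    "algolia_delete_products",
]


def reindex_command(indexer, magento_bin="path/to/magento/bin/magento"):
    """Return the argument list for reindexing one indexer (hypothetical helper)."""
    if indexer not in INDEXERS:
        raise ValueError(f"Unknown indexer: {indexer}")
    return ["php", magento_bin, "indexer:reindex", indexer]


# Build the commands for products and categories without running them.
commands = [reindex_command(name) for name in ("algolia_products", "algolia_categories")]
```

Each argument list can then be passed to `subprocess.run(command, check=True)` once the placeholder path has been replaced with the real location of the Magento installation.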
Indexing Products
It is essential for e-commerce businesses to have exact and up-to-date product data in the search. We provide a number of ways to configure your search and indexing to accommodate as many use cases as possible.
Full reindex command
$ php path/to/magento/bin/magento indexer:reindex algolia_products
Indexable products
To avoid indexing too many products and to save indexing operations on your indices, we only index products that will actually show up in the webshop. This results in a set of requirements a product has to meet before we index it.
We only index products that are:
- Visible - either in the catalog, the search, or both
- Enabled
- Not deleted
- In stock - unless the Magento settings tell us to show out-of-stock products, too.
If there’s ever a missing product in your index, make sure the product meets all these requirements.
More information about troubleshooting for missing data can be found here.
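The requirements above can be summarized as a single check. This is a sketch for reasoning about missing products, not the extension's actual logic: the function `is_indexable`, the `enabled` and `deleted` keys, and the `show_out_of_stock` flag are hypothetical names, while `visibility_search`, `visibility_catalog` and `in_stock` mirror the attribute names used on this page.

```python
def is_indexable(product, show_out_of_stock=False):
    """Return True if the product meets all four requirements listed above."""
    # Visible in the catalog, the search, or both.
    visible = product["visibility_search"] or product["visibility_catalog"]
    # In stock, unless out-of-stock products are configured to be shown.
    stock_ok = product["in_stock"] or show_out_of_stock
    return bool(visible and product["enabled"] and not product["deleted"] and stock_ok)


product = {
    "visibility_search": True,
    "visibility_catalog": False,
    "enabled": True,
    "deleted": False,
    "in_stock": False,
}
```

With this sample product, the check fails only because of the stock requirement, and passes once `show_out_of_stock=True` is given.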
Searchable attributes
It’s possible to configure which attributes should be searched when users type their query. To configure the list of searchable attributes, navigate to the products tab through Stores > Configuration > Algolia Search.
In the products tab, it’s possible to configure per attribute if it is searchable, retrievable or ordered. By default, all attributes are set to be searched as unordered. In general, this is better for the relevance of the search and we don’t recommend changing it without a specific reason.
Read the dedicated documentation to learn more about the difference between ordered and unordered search.
Default searchable attributes
Some attributes are indexed, regardless of what is specified in the configuration. These attributes are not all searchable, but can be used for filtering, sorting, customizing the ranking and building the results page.
The attributes that are always indexed:
name | The product’s name |
url | The product’s url |
visibility_search | The product’s visibility in the search |
visibility_catalog | The product’s visibility in the catalog |
categories | The product’s categories, formatted as a tree path |
categories_without_path | The product’s categories, without the tree path |
thumbnail_url | The product’s thumbnail image |
image_url | The product’s main image |
in_stock | The product’s stock availability |
price | The product’s price |
type_id | The product’s type (simple, configurable, bundled, etc.) |
Facets
Facets are the attributes that will be used as filters on the results page. Common facets include price, color, categories, and brand. However, this will not work for every store. The facets have to be tuned to the products being sold and the way the end user searches for these products.
There are a couple of things that need to be specified for each facet:
- The attribute
- The label - this will be displayed above the filter
- The type of facet
Facets can be set to be searchable. This will allow a user to search for a facet value by providing a search box in the filter. This is useful when a facet has a lot of different values, like with brands, for example.
Facets can also be attached to Query Rules. With this setting set to ‘Yes’, we will dynamically filter on a facet when a user’s query contains one of its values. For example, let’s assume we attached a Query Rule to the color attribute. Any time a user’s query contains a color, as in “red shorts”, we will internally filter the results to those with the color ‘red’ as one of their attributes. This leads to more relevant results.
When a Query Rule is attached to an attribute, it is applied in both the autocomplete search and instant search results.
Query Rules have a limited quota. Make sure your plan supports the amount of Query Rules desired.
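The effect of attaching a Query Rule to a facet can be illustrated with a small sketch. Query Rules are evaluated by the search engine itself, so this is only a conceptual model: the function `filters_from_query`, the `facet_values` mapping and the `attribute:value` filter format are assumptions made for illustration.

```python
def filters_from_query(query, facet_values):
    """Return a filter for every known facet value that appears in the query."""
    words = query.lower().split()
    filters = []
    for attribute, values in facet_values.items():
        for value in values:
            if value.lower() in words:
                filters.append(f"{attribute}:{value}")
    return filters


# A Query Rule is assumed to be attached to the color facet only.
facet_values = {"color": ["red", "blue", "green"]}
filters = filters_from_query("red shorts", facet_values)
```

For the query “red shorts”, the sketch detects the facet value `red` and produces a filter on the color attribute; a query without any known color produces no filter.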
By default, we provide facets on the price, categories and color attributes.
Any numeric attribute (like price) will be shown as a slider.
Attributes that are specified as facets are automatically indexed as retrievable but not searchable. There’s no need to specify them in the Searchable Attributes configuration.
Sorting strategies
Sorting is only available on the InstantSearch results page.
When searching for products, users may expect multiple ways to sort the result set. For example, they want to sort by relevance, popularity, price or date.
The default sorting strategy when searching is sorting by relevance. Any other sorting strategy needs to be defined in the Sort Settings. For each strategy, an attribute, sort order (ascending or descending) and label should be defined.
By default, there are three sorting strategies:
- From lowest price to highest price
- From highest price to lowest price
- From newest to oldest
Each sorting strategy will create a new index, which will increase the amount of records. More information can be found here.
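To estimate the impact on the record count, a rough calculation can help. This sketch assumes that each sorting strategy's index holds a full copy of the product records; the function name `total_records` is hypothetical, and other multipliers that may apply to a real installation are not taken into account.

```python
def total_records(product_count, sorting_strategies):
    """Estimate product records across the main index and its sorting indices."""
    # One main index plus one additional index per sorting strategy.
    return product_count * (1 + sorting_strategies)


# 1,000 indexable products with the three default sorting strategies.
estimate = total_records(1000, 3)
```

Under this assumption, the three default strategies turn 1,000 product records into 4,000.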
Attributes that are configured in a sorting strategy are automatically indexed as retrievable, but not searchable. There’s no need to specify them in the Searchable Attributes configuration.
Removing a sorting strategy will not automatically remove the index replica in Algolia. This has to be done manually through the dashboard.
Index Settings
Through the Magento dashboard, the following settings for an index can be configured:
searchableAttributes
customRanking
unretrievableAttributes
attributesForFaceting
maxValuesPerFacet
removeWordsIfNoResults
Additional index settings can be managed in the Algolia dashboard.
It’s also possible to modify the settings programmatically, by hooking into the algolia_products_index_before_set_settings event provided by the extension.
A list of events provided by the extension can be found here.
Any changes done in the Algolia Dashboard will override these settings until a full reindex is performed from the Magento Dashboard.
Indexing Categories
To keep the amount of records and indexing operations as low as possible, we only index categories that are actually active. This behavior can be changed through the settings.
If set to ‘Yes’, all categories will be shown in the autocomplete search and InstantSearch results page.
Full reindex command
$ php path/to/magento/bin/magento indexer:reindex algolia_categories
Searchable Attributes
It’s possible to configure which attributes should be searched when users type their query. To configure the list of searchable attributes, navigate to the Category configuration through Stores > Configuration > Algolia Search > Categories.
It’s possible to configure per attribute if it is searchable, retrievable or ordered. By default, all attributes are set to be searched as unordered. In general, this is better for the relevance of the search and we don’t recommend changing it without a specific reason.
Read the dedicated documentation to learn more about the difference between ordered and unordered search.
Default searchable category attributes
Some attributes are indexed, regardless of what is specified in the configuration. These attributes are not all searchable, but can be used for filtering, sorting, customizing the ranking and building the results page.
The attributes that are always indexed:
name | The category’s name |
url | The category’s url |
path | The category’s path (parent categories) |
level | The category’s level in the category tree |
include_in_menu | The category’s visibility in the menu |
_tags | Filled automatically by the extension |
popularity | The category’s popularity |
product_count | The category’s amount of products |
Index Settings
Through the Magento dashboard, the following setting for an index can be configured.
Additional index settings can be managed in the Algolia dashboard.
It’s also possible to modify the settings programmatically, by hooking into the algolia_categories_index_before_set_settings event provided by the extension.
A list of events provided by the extension can be found here.
Any changes done in the Algolia Dashboard will override these settings until a full reindex is performed from the Magento Dashboard.
Indexing Pages
CMS Pages will be automatically indexed by our extension, allowing the users to search for pages in the autocomplete menu. By default, all active pages are indexed.
The settings provide an option to exclude certain pages, like error pages, so they don’t show up in the search results.
The indexing of pages can be disabled altogether by navigating to the Additional Sections configuration.
Full reindex command
$ php path/to/magento/bin/magento indexer:reindex algolia_pages
Searchable Attributes
For pages, it’s not possible to configure the searchable attributes through the admin interface.
However, it is possible to change them programmatically, by hooking into the algolia_after_create_page_object event provided by the extension.
A list of events provided by the extension can be found here.
Default searchable page attributes
These attributes are indexed by default. Not all of them are searchable, but they can be used for filtering, sorting, customizing the ranking and building the results page.
The attributes that are always indexed:
name | The page’s name |
url | The page’s url |
slug | The page’s slug |
content | The page’s content |
Since records for our engine have to be smaller than 10 kilobytes, any page whose content is longer than 10,000 characters will not be indexed. In this case, only the page’s name would be searchable.
Read more here about the engine’s record limit.
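To find out in advance which pages are affected by this limit, the content length can be checked before reindexing. The following is a minimal sketch: the function `oversized_pages` and the constant name are hypothetical, the `slug` and `content` keys mirror the attributes listed above, and the threshold is the 10,000-character figure stated on this page.

```python
MAX_CONTENT_CHARS = 10_000  # limit stated in this documentation


def oversized_pages(pages):
    """Return the slugs of pages whose content exceeds the character limit."""
    return [page["slug"] for page in pages if len(page["content"]) > MAX_CONTENT_CHARS]


pages = [
    {"slug": "about-us", "content": "a" * 10_000},
    {"slug": "terms", "content": "a" * 10_001},
]
too_long = oversized_pages(pages)
```

A page of exactly 10,000 characters is not reported, since the limit applies to content that is longer than that.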
Index Settings
The following settings will always be sent to configure the index, and cannot be changed through the admin interface:
- searchableAttributes: unordered(slug), unordered(name) and unordered(content)
- attributesToSnippet: content:7
Additional index settings can be managed in the Algolia dashboard.
It’s also possible to modify the settings programmatically, by hooking into the algolia_pages_index_before_set_settings event provided by the extension.
A list of events provided by the extension can be found here.
Any changes done in the Algolia Dashboard will override these settings until a full reindex is performed from the Magento Dashboard.
Indexing Suggestions
Every query that is executed on the Magento installation is stored by Magento in the database.
Magento automatically stores the query, the number of results and the number of searches in the catalogsearch_query table, without any involvement from our extension.
Only back-end searches are stored by Magento. Search-as-you-type searches and instant search queries are not stored.
Our extension offers the possibility to index queries that are performed regularly on the Magento installation. In the settings, it’s possible to filter relevant queries (for example by minimum number of results, minimum popularity, etc.). The resulting queries can be pushed into the suggestions index, providing an autocomplete on the most relevant queries for the Magento installation.
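The filtering step can be sketched as follows. This is an illustration under stated assumptions rather than the extension's implementation: the function `select_suggestions` and its threshold parameters are hypothetical, while the keys `query`, `number_of_results` and `popularity` follow the suggestion attributes documented further down this page.

```python
def select_suggestions(rows, min_results=1, min_popularity=1):
    """Keep queries that pass both thresholds, most popular first."""
    kept = [
        row for row in rows
        if row["number_of_results"] >= min_results
        and row["popularity"] >= min_popularity
    ]
    # Order by popularity, then by number of results, both descending.
    return sorted(kept, key=lambda row: (-row["popularity"], -row["number_of_results"]))


rows = [
    {"query": "red shorts", "number_of_results": 12, "popularity": 40},
    {"query": "asdfgh", "number_of_results": 0, "popularity": 3},
    {"query": "blue jeans", "number_of_results": 30, "popularity": 95},
    {"query": "rare item", "number_of_results": 5, "popularity": 1},
]
suggestions = select_suggestions(rows, min_results=1, min_popularity=2)
```

Queries without results or with too few searches are dropped, so only queries likely to be useful as autocomplete suggestions remain.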
To ensure the data in the suggestion index is good and relevant, the data in the catalogsearch_query table must be relevant as well.
This can be achieved by enabling back-end search with our extension, by turning on the Enable Search and Make SEO Request settings in the Magento Administration.
With these options enabled, searches in the back-end will be processed by our extension.
Since the data in catalogsearch_query will be updated as well, the queries in this table will become more relevant over time.
By default, suggestions are not indexed. When enabling the indexing of suggestions, a manual reindex needs to be triggered. Another way to start the indexing of suggestions is by adding a recurring job to the cron table:
1 * * * * php path/to/magento/bin/magento indexer:reindex algolia_suggestions
Full reindex command
$ php path/to/magento/bin/magento indexer:reindex algolia_suggestions
Searchable Attributes
For suggestions, it’s not possible to configure the searchable attributes through the admin interface.
However, it is possible to change them programmatically, by hooking into the algolia_after_create_suggestion_object event provided by the extension.
A list of events provided by the extension can be found here.
Default searchable query attributes
These attributes are indexed by default. Not all of them are searchable, but they can be used for filtering, sorting, customizing the ranking and building the results page.
The attributes that are always indexed:
query | The query’s value |
number_of_results | The query’s number of results |
popularity | The query’s number of searches |
updated_at | The query’s last update timestamp |
Index Settings
The following settings will always be sent to configure the index, and cannot be changed through the admin interface:
- searchableAttributes: unordered(query)
- customRanking: desc(popularity), desc(number_of_results)
- typoTolerance: false
- attributesToRetrieve: query
- removeWordsIfNoResults: lastWords
Additional index settings can be managed in the Algolia dashboard.
It’s also possible to modify the settings programmatically, by hooking into the algolia_suggestions_index_before_set_settings event provided by the extension.
A list of events provided by the extension can be found here.
Any changes done in the Algolia Dashboard will override these settings until a full reindex is performed from the Magento Dashboard.
Indexing Additional Sections
The autocomplete menu offers the possibility to display other sections based on attributes, such as colors and brands.
For this feature to work, the instant search page must be enabled.
The attributes used for the additional sections have to be set as attributes for faceting.
$ php path/to/magento/bin/magento indexer:reindex algolia_additional_sections
Searchable attributes
For Additional Sections, it’s not possible to configure the searchable attributes through the admin interface.
However, it is possible to change them programmatically, by hooking into the algolia_additional_section_items_before_index event provided by the extension.
A list of events provided by the extension can be found here.
Default searchable attributes
These attributes are indexed by default. Not all of them are searchable, but they can be used for filtering, sorting, customizing the ranking and building the results page.
The attributes that are always indexed:
value | The attribute’s value (e.g. Red, XL, Nike, etc.) |
Index settings
The following settings will always be sent to configure the index, and cannot be changed through the admin interface:
- searchableAttributes: unordered(value)
Additional index settings can be managed in the Algolia dashboard.
It’s also possible to modify the settings programmatically, by hooking into the algolia_additional_sections_index_before_set_settings event provided by the extension.
A list of events provided by the extension can be found here.
Any changes done in the Algolia Dashboard will override these settings until a full reindex is performed from the Magento Dashboard.
Remove inactive products
If there are products indexed in Algolia that are not supposed to be there, they can be removed by running the algolia_delete_products indexer:
$ php path/to/magento/bin/magento indexer:reindex algolia_delete_products
This indexer removes from the Algolia indices all products that shouldn’t be displayed in search. This comes in handy when products were disabled or deleted directly in Magento’s database, so that the extension could not reindex and remove them in the regular way.
The reindexer won’t delete or remove anything from Magento. Products are removed only from Algolia indices.