Aug. 07, 2019

Indexing

In order to provide fast and relevant search, our engine restructures your data in a special way through a process called indexing. Our extension runs through all of your data - products, categories, and pages - and creates indexable objects out of it. These objects are then uploaded to our servers, either automatically via the extension’s queue, or manually via the Magento console or the command line. Once pushed to our servers, the objects go through an indexing process that transforms them into searchable data.

To learn about the indexing process, have a look at the documentation.

If you are having any issues with your data, indexes, or queue, please check our troubleshooting guide.

The extension automatically keeps all your data up to date to provide the best search experience for your users. To do this, we provide two indexing mechanisms in Magento:

  • Section reindex - An entire section of the catalog (products, categories, pages) is pushed to our servers and reindexed.
  • Single item reindex - A single resource (product, category, page) is pushed to our servers and reindexed. This happens when a resource is updated.

By default, the indexing operations run synchronously. This means the Magento administrator has to wait until the indexing process is finished before continuing. Since this is inconvenient, and can cause unexpected issues, we created the indexing queue. This will process all index operations in the background, and has some fail-safes built in.

Section Reindex

With indexing queue

With the indexing queue enabled, products are reindexed using temporary indices. Instead of sending all data to the production index, a copy will be created and swapped with the production index only when ready. This approach has several advantages:

  1. High reindexing speed
  2. Avoids potential inaccuracies with deleted products
  3. Lower number of indexing operations needed

Any changes to the index will be visible when the swap of the temporary index has been completed.
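
The build-then-swap idea described above can be illustrated with a small analogy. This is only a sketch under loud assumptions: local files stand in for indices, and the function name rebuild_and_swap is hypothetical. It does not show how the extension communicates with our servers.

```sh
# Analogy only: a local file plays the role of the production index.
# Usage: rebuild_and_swap <live file> <command that prints all records>
rebuild_and_swap() {
    live=$1
    shift
    tmp="$live.tmp"
    # Build the complete copy first; the live file stays untouched meanwhile.
    "$@" > "$tmp" || { rm -f "$tmp"; return 1; }
    # Replace the live file in one step, only once the copy is ready.
    mv -f "$tmp" "$live"
}
```

If the command that produces the records fails, the temporary copy is discarded and the live file keeps its previous content, which mirrors why readers of the index never see a half-finished reindex.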

Without indexing queue

Without the indexing queue enabled, the reindex process has to handle the complete catalog synchronously. This means everything has to wait until the process is completed. Any update has to be pushed to our servers in order to have up-to-date data.

Processing large indices synchronously may trigger PHP timeouts.

This process not only takes more time than reindexing with the queue enabled, it also uses more resources. It is also slightly less reliable, because product updates made while the reindex is running may not be handled.

Enabling the indexing queue is highly recommended for doing any full reindexing, especially on large catalogs.

Automatic Indexing

Our extension will send every update and deletion on products or categories to our servers to keep all data up-to-date. The indexers’ behavior can be changed to prevent these update calls, and only update the data through manual reindexing. For this to work, the indexers’ mode should be set to ‘Manual Update’.

Manual Indexing

With Manual Indexing enabled, the command line has to be used to send updates of the data to our server. For example, this is the command to completely reindex all products:

$ php path/to/magento/bin/magento indexer:reindex algolia_products

The same command can be used for all other indices created by our extension:

  • algolia_products - Reindexes all products
  • algolia_categories - Reindexes all categories
  • algolia_pages - Reindexes all CMS pages
  • algolia_suggestions - Reindexes all search query suggestions
  • algolia_additional_sections - Reindexes all additional sections
  • algolia_queue_runner - Process jobs in the indexing queue
  • algolia_delete_products - Removes inactive products from Algolia indices
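
To run several of these indexers in one go, a small wrapper can loop over their names. The following is a minimal sketch, not something shipped with the extension: the function name algolia_reindex and the MAGENTO_CLI and DRY_RUN variables are assumptions made for illustration. Only the indexer names and the indexer:reindex command come from the list above.

```sh
# Hypothetical helper: reindex the given Algolia indexers one after another.
# MAGENTO_CLI and DRY_RUN are illustrative names, not extension settings.
MAGENTO_CLI="${MAGENTO_CLI:-php path/to/magento/bin/magento}"

algolia_reindex() {
    for indexer in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            # Only print the command that would be executed.
            echo "$MAGENTO_CLI indexer:reindex $indexer"
        else
            # $MAGENTO_CLI is left unquoted on purpose so that
            # "php <path>" is split into separate words.
            $MAGENTO_CLI indexer:reindex "$indexer" || return 1
        fi
    done
}
```

Calling the function with DRY_RUN set to 1 and the names algolia_products, algolia_categories and algolia_pages prints the three commands without running them. Without DRY_RUN, the loop stops at the first indexer that fails.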

Indexing Products

It is essential for e-commerce businesses to have exact and up-to-date product data in the search. We provide a number of ways to configure your search and indexing to accommodate as many use cases as possible.

Full reindex command

$ php path/to/magento/bin/magento indexer:reindex algolia_products

Indexable products

To avoid indexing more products than necessary, and to save indexing operations on your indices, we only index products that will actually show up in the webshop. A product therefore has to meet a set of requirements before it is indexed.

We only index products that are:

  • Visible - either in the catalog, the search, or both
  • Enabled
  • Not deleted
  • In stock - unless the Magento settings tell us to show out-of-stock products, too.

If there’s ever a missing product in your index, make sure the product meets all these requirements.
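
When tracking down a missing product, it can help to read the requirements above as a single condition. The sketch below is only an illustration of that logic: the function name is_indexable and its yes/no arguments are hypothetical and do not mirror the extension's internal code.

```sh
# Hypothetical check mirroring the four requirements listed above.
# Arguments (each "yes" or "no"):
#   $1 visible (in the catalog, the search, or both)
#   $2 enabled
#   $3 deleted
#   $4 in stock
#   $5 store is configured to show out-of-stock products
is_indexable() {
    [ "$1" = "yes" ] || return 1   # must be visible
    [ "$2" = "yes" ] || return 1   # must be enabled
    [ "$3" = "no" ]  || return 1   # must not be deleted
    # Out-of-stock products pass only if the store shows them.
    [ "$4" = "yes" ] || [ "$5" = "yes" ] || return 1
    return 0
}
```

For example, an enabled, visible, non-deleted product that is out of stock passes the check only when the last argument is "yes".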

More information about troubleshooting for missing data can be found here.

Searchable attributes

It’s possible to configure which attributes should be searched when users type their query. To configure the list of searchable attributes, navigate to the products tab through Stores > Configuration > Algolia Search.

In the products tab, it’s possible to configure per attribute if it is searchable, retrievable or ordered. By default, all attributes are set to be searched as unordered. In general, this is better for the relevance of the search and we don’t recommend changing it without a specific reason.

Read the dedicated documentation to learn more about the difference between ordered and unordered search.

Default searchable attributes

Some attributes are indexed, regardless of what is specified in the configuration. These attributes are not all searchable, but can be used for filtering, sorting, customizing the ranking and building the results page.

The attributes that are always indexed:

  • name - The product’s name
  • url - The product’s url
  • visibility_search - The product’s visibility in the search
  • visibility_catalog - The product’s visibility in the catalog
  • categories - The product’s categories, formatted as a tree path
  • categories_without_path - The product’s categories, without the tree path
  • thumbnail_url - The product’s thumbnail image
  • image_url - The product’s main image
  • in_stock - The product’s stock availability
  • price - The product’s price
  • type_id - The product’s type (simple, configurable, bundled, etc.)

Facets

Facets are the attributes that will be used as filters on the results page. Common facets include price, color, categories, and brand. However, this will not work for every store. The facets have to be tuned to the products being sold and the way the end user searches for these products.

There are a couple of things that need to be specified for each facet:

  • The attribute
  • The label - this will be displayed above the filter
  • The type of facet

Configuration of facets

Facets can be set to be searchable. This will allow a user to search for a facet value by providing a search box in the filter. This is useful when a facet has a lot of different values, like with brands, for example.

Facets can also be attached to Query Rules. With this setting set to ‘Yes’, we will dynamically filter on a facet when a user searches a certain value in the query. For example, let’s assume we attached a Query Rule to the color attribute. Anytime the users’ query contains a color, like in “red shorts”, we will internally filter all results with the color ‘red’ as one of their attributes. This leads to more relevant results.
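
The "red shorts" example can be pictured with a small sketch. It is only an illustration of the idea: the real matching happens inside our engine, and the function name detect_facet_filter, its arguments, and the attribute:value output format are assumptions made for this example.

```sh
# Hypothetical illustration: if a whole word of the query equals a known
# value of the facet, print a filter for it.
# Usage: detect_facet_filter <attribute> <query> <known value>...
detect_facet_filter() {
    attribute=$1
    query=$2
    shift 2
    for value in "$@"; do
        # Pad with spaces so that only whole words match
        # ("reddish" does not match "red").
        case " $query " in
            *" $value "*)
                echo "$attribute:$value"
                return 0
                ;;
        esac
    done
    return 1   # no facet value found in the query
}
```

Called with the attribute color, the query "red shorts" and the known values blue, red and green, the function prints color:red. The comparison in this sketch is case-sensitive.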

When a Query Rule is attached to an attribute, it is applied in both the autocomplete search and instant search results.

Query Rules have a limited quota. Make sure your plan supports the number of Query Rules you need.

By default, we provide facets on the price, categories and color attributes.

Any numeric attribute (like price) will be shown as a slider.

Attributes that are specified as facets are automatically indexed as retrievable but not searchable. There’s no need to specify them in the Searchable Attributes configuration.

Sorting strategies

Sorting is only available on the InstantSearch results page.

When searching for products, users may expect multiple ways to sort the result set. For example, they want to sort by relevance, popularity, price or date.

The default sorting strategy when searching is sorting by relevance. Any other sorting strategy needs to be defined in the Sort Settings. For each strategy, an attribute, sort order (ascending or descending) and label should be defined.

Configuration of sorting strategies

By default, there are three sorting strategies:

  1. From lowest price to highest price
  2. From highest price to lowest price
  3. From newest to oldest

Each sorting strategy will create a new index, which will increase the number of records. More information can be found here.

Attributes that are configured in a sorting strategy are automatically indexed as retrievable, but not searchable. There’s no need to specify them in the Searchable Attributes configuration.

Removing a sorting strategy will not automatically remove the index replica in Algolia. This has to be done manually through the dashboard.

Index Settings

Through the Magento dashboard, the following setting for an index can be configured.

Additional index settings can be managed in the Algolia dashboard. It’s also possible to modify the settings programmatically, by hooking into the algolia_products_index_before_set_settings event provided by the extension. A list of events provided by the extension can be found here.

Any changes done in the Algolia Dashboard will override these settings until a full reindex is performed from the Magento Dashboard.

Indexing Categories

To keep the number of records and indexing operations as low as possible, we only index categories that are actually active. This behavior can be changed through the settings.

Show categories that are not included in the navigation menu configuration

If set to ‘Yes’, all categories will be shown in the autocomplete search and InstantSearch results page.

Full reindex command

$ php path/to/magento/bin/magento indexer:reindex algolia_categories

Searchable Attributes

It’s possible to configure which attributes should be searched when users type their query. To configure the list of searchable attributes, navigate to the Category configuration through Stores > Configuration > Algolia Search > Categories.

It’s possible to configure per attribute if it is searchable, retrievable or ordered. By default, all attributes are set to be searched as unordered. In general, this is better for the relevance of the search and we don’t recommend changing it without a specific reason.

Read the dedicated documentation to learn more about the difference between ordered and unordered search.

Default searchable category attributes

Some attributes are indexed, regardless of what is specified in the configuration. These attributes are not all searchable, but can be used for filtering, sorting, customizing the ranking and building the results page.

The attributes that are always indexed:

  • name - The category’s name
  • url - The category’s url
  • path - The category’s path (parent categories)
  • level - The category’s level in the category tree
  • include_in_menu - The category’s visibility in the menu
  • _tags - Filled automatically by the extension
  • popularity - The category’s popularity
  • product_count - The category’s amount of products

Index Settings

Through the Magento dashboard, the following setting for an index can be configured.

Additional index settings can be managed in the Algolia dashboard. It’s also possible to modify the settings programmatically, by hooking into the algolia_categories_index_before_set_settings event provided by the extension. A list of events provided by the extension can be found here.

Any changes done in the Algolia Dashboard will override these settings until a full reindex is performed from the Magento Dashboard.

Indexing Pages

CMS Pages will be automatically indexed by our extension, allowing the users to search for pages in the autocomplete menu. By default, all active pages are indexed.

Configuration of excluded pages

The settings provide an option to exclude certain pages, like error pages, so they don’t show up in the search results.

The indexing of pages can be disabled altogether by navigating to the Additional Sections configuration.

Full reindex command

$ php path/to/magento/bin/magento indexer:reindex algolia_pages

Searchable Attributes

For pages, it’s not possible to configure the searchable attributes through the admin interface. However, it is possible to change them programmatically, by hooking into the algolia_after_create_page_object event provided by the extension. A list of events provided by the extension can be found here.

Default searchable page attributes

These attributes are indexed by default, and are not all searchable (some are). They can be used for filtering, sorting, customizing the ranking and building the results page.

The attributes that are always indexed:

  • name - The page’s name
  • url - The page’s url
  • slug - The page’s slug
  • content - The page’s content

Since records for our engine have to be smaller than 10 kilobytes, any page whose content is longer than 10,000 characters will not be indexed. In this case, only the page’s name would be searchable.
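
To spot affected pages in advance, the character count of a page's content can be compared against this limit. The following is a minimal sketch: the function name page_content_fits and the MAX_PAGE_CHARS variable are hypothetical, and the file passed in is assumed to hold the content exactly as it would be indexed.

```sh
# Hypothetical pre-check against the 10,000-character limit described above.
MAX_PAGE_CHARS=10000

# Usage: page_content_fits <file containing the page content>
page_content_fits() {
    # wc -m counts characters (not bytes) according to the current locale.
    chars=$(wc -m < "$1") || return 2
    # The arithmetic expansion strips any padding some wc versions add.
    [ "$((chars))" -le "$MAX_PAGE_CHARS" ]
}
```

The function succeeds when the content stays within the limit and fails otherwise, so it can be used directly in an if statement.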

Read more here about the engine’s record limit.

Index Settings

The following settings will always be sent to configure the index, and cannot be changed through the admin interface:

Additional index settings can be managed in the Algolia dashboard. It’s also possible to modify the settings programmatically, by hooking into the algolia_pages_index_before_set_settings event provided by the extension. A list of events provided by the extension can be found here.

Any changes done in the Algolia Dashboard will override these settings until a full reindex is performed from the Magento Dashboard.

Indexing Suggestions

Every query that is executed on the Magento installation is stored by Magento in the database. Magento automatically stores the query, the number of results and the number of searches in the catalogsearch_query table, without any involvement from our extension.

Only back-end searches are stored by Magento. Search-as-you-type searches and instant search queries are not stored.

Our extension offers the possibility to index queries that are performed regularly on the Magento installation. In the settings, it’s possible to filter relevant queries (for example by minimum number of results, minimum popularity, etc.). The resulting queries can be pushed into the suggestions index, providing an autocomplete on the most relevant queries for the Magento installation.
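
The filtering step can be pictured as a threshold check on each stored query. The sketch below assumes tab-separated input lines of query, number of results, and number of searches. The function name filter_suggestions and both threshold values are illustrative assumptions, not the extension's defaults.

```sh
# Hypothetical thresholds; the real values come from the extension settings.
MIN_RESULTS=2
MIN_POPULARITY=5

# Reads tab-separated lines (query, number of results, number of searches)
# on standard input and prints the queries that pass both thresholds.
filter_suggestions() {
    awk -F '\t' -v min_results="$MIN_RESULTS" -v min_popularity="$MIN_POPULARITY" \
        '$2 >= min_results && $3 >= min_popularity { print $1 }'
}
```

Queries that return no results, or that were searched only a few times, are dropped, so only the queries that pass both checks would end up in the suggestions index.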

Configuration of suggestions

To ensure the data in the suggestion index is good and relevant, the data in the catalogsearch_query table must be relevant as well. This can be achieved by enabling back-end search with our extension, by turning on the Enable Search and Make SEO Request settings in the Magento Administration.

With these options enabled, searches in the back-end will be processed by our extension. Since the data in catalogsearch_query will be updated as well, the queries in this table will become more relevant over time.

By default, suggestions are not indexed. After enabling the indexing of suggestions, a manual reindex needs to be triggered. Alternatively, add a recurring job to the cron table. The following entry runs the reindex at minute 1 of every hour:

1 * * * * php path/to/magento/bin/magento indexer:reindex algolia_suggestions

Full reindex command

$ php path/to/magento/bin/magento indexer:reindex algolia_suggestions

Searchable Attributes

For suggestions, it’s not possible to configure the searchable attributes through the admin interface. However, it is possible to change them programmatically, by hooking into the algolia_after_create_suggestion_object event provided by the extension. A list of events provided by the extension can be found here.

Default searchable query attributes

These attributes are indexed by default, and are not all searchable (some are). They can be used for filtering, sorting, customizing the ranking and building the results page.

The attributes that are always indexed:

  • query - The query’s value
  • number_of_results - The query’s number of results
  • popularity - The query’s number of searches
  • updated_at - The query’s last update timestamp

Index Settings

The following settings will always be sent to configure the index, and cannot be changed through the admin interface:

Additional index settings can be managed in the Algolia dashboard. It’s also possible to modify the settings programmatically, by hooking into the algolia_suggestions_index_before_set_settings event provided by the extension. A list of events provided by the extension can be found here.

Any changes done in the Algolia Dashboard will override these settings until a full reindex is performed from the Magento Dashboard.

Indexing Additional Sections

The autocomplete menu offers the possibility to display other sections from attributes, like colors and brands for example.

For this feature to work, the instant search page must be enabled.

The attributes used for the additional sections have to be set as attributes for faceting.

Full reindex command

$ php path/to/magento/bin/magento indexer:reindex algolia_additional_sections

Searchable attributes

For Additional Sections, it’s not possible to configure the searchable attributes through the admin interface. However, it is possible to change them programmatically, by hooking into the algolia_additional_section_items_before_index event provided by the extension. A list of events provided by the extension can be found here.

Default searchable attributes

These attributes are indexed by default, and are not all searchable (some are). They can be used for filtering, sorting, customizing the ranking and building the results page.

The attributes that are always indexed:

  • value - The attribute’s value (e.g. Red, XL, Nike, etc.)

Index settings

The following settings will always be sent to configure the index, and cannot be changed through the admin interface:

Additional index settings can be managed in the Algolia dashboard. It’s also possible to modify the settings programmatically, by hooking into the algolia_additional_sections_index_before_set_settings event provided by the extension. A list of events provided by the extension can be found here.

Any changes done in the Algolia Dashboard will override these settings until a full reindex is performed from the Magento Dashboard.

Remove inactive products

If products that are not supposed to be there are indexed in Algolia, they can be removed by running the algolia_delete_products indexer:

$ php path/to/magento/bin/magento indexer:reindex algolia_delete_products

This indexer removes from the Algolia indices all products that shouldn’t be displayed in search. This comes in handy when products were disabled or deleted directly in Magento’s database, so that the extension couldn’t reindex and remove them through its regular process.

The reindexer won’t delete or remove anything from Magento. Products are removed only from Algolia indices.
