Indexing
Synchronous or asynchronous?
Before indexing any data, it helps to understand how Algolia processes writes.
When you index in Algolia, the write operation is sent to the Algolia API in a synchronous network call, but the engine then processes the operation asynchronously. As a result, if you search right after the API acknowledges an indexing request, the new data may not be searchable yet. To monitor this, every indexing operation returns a taskID that you can use to ask the API whether the task has been processed.
For performance reasons, you may also want to call Algolia asynchronously. This gem supports many background job managers.
When testing, and only when testing, you may want to make everything synchronous and wait for each operation to complete.
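The acknowledge-then-poll flow described above can be sketched in plain Ruby. This is a minimal illustration, not the gem's implementation: `FakeTaskApi`, `task_status`, and `wait_for_task` are hypothetical names, and the status strings are assumed for the example.

```ruby
# Hypothetical stand-in for the indexing API: a task only reports "published"
# after it has been polled a given number of times, mimicking asynchronous
# processing on the server side.
class FakeTaskApi
  def initialize(polls_until_done)
    @remaining = Hash.new(polls_until_done)
  end

  # Returns "published" once the task is processed, "notPublished" before that.
  def task_status(task_id)
    @remaining[task_id] -= 1
    @remaining[task_id] <= 0 ? "published" : "notPublished"
  end
end

# Polls until the task is processed and returns the number of polls needed.
# Raises if the task is still pending after max_attempts polls.
def wait_for_task(api, task_id, max_attempts: 10, delay: 0.01)
  1.upto(max_attempts) do |attempt|
    return attempt if api.task_status(task_id) == "published"
    sleep(delay)
  end
  raise "task #{task_id} not processed after #{max_attempts} attempts"
end

api = FakeTaskApi.new(3)
attempts = wait_for_task(api, 42)
puts "task 42 processed after #{attempts} polls"
```

Bounding the number of attempts keeps a test suite from hanging if a task never completes.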
Manual indexing
Using rake command
The Rake command algoliasearch:reindex will look for all models including the AlgoliaSearch module in your app and index them into Algolia.
$ rake algoliasearch:reindex
Reindexing 1 models: Contact.
Contact
Reindexing 500 records...
Reindexing all models
The gem provides two ways to reindex all your objects. One sends all records to the existing Algolia index in place. The other builds a temporary index and then swaps it in.
Regular reindexing
To reindex all your objects in place, use the reindex! class method. This method sends every record it finds to the Algolia index. Any record with an existing objectID is replaced, and new ones are added. Note that records you’ve deleted from the database since the last reindex are not removed from the index.
If you want to remove them, it’s best to clear your index first.
Contact.clear_index!
Contact.reindex!
Reindexing this way means your index will be empty for a short period of time, during which searches return no results. If you want to reindex without any downtime, use the zero-downtime reindexing described below.
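To make the difference concrete, here is a minimal sketch that models an index as a plain Ruby hash keyed by objectID. The helper names `reindex_in_place` and `clear_then_reindex` are hypothetical and only mimic the behavior described above. They are not part of the gem.

```ruby
# Hypothetical model of an in-place reindex: every database row is upserted
# by objectID, so rows that no longer exist in the database stay in the index.
def reindex_in_place(index, database)
  index.merge(database)
end

# Hypothetical model of clearing first: the index is emptied, then refilled
# from the database, so stale rows disappear.
def clear_then_reindex(_index, database)
  {}.merge(database)
end

index    = { 1 => "Alice", 2 => "Bob", 3 => "Carol" }
# Record 2 was deleted, record 1 was edited, record 4 is new.
database = { 1 => "Alice A.", 3 => "Carol", 4 => "Dan" }

p reindex_in_place(index, database).keys.sort   # still contains objectID 2
p clear_then_reindex(index, database).keys.sort # objectID 2 is gone
```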
Zero-downtime reindexing
To reindex all your records, taking deleted objects into account and without any downtime, use the reindex class method (without the trailing !).
This method indexes all your objects into a temporary index called <INDEX_NAME>.tmp and moves (renames) the temporary index to the final one as soon as everything is indexed. This guarantees that your index is never empty, but requires that your plan has enough record quota to hold the temporary index.
This is the safest way to reindex all your content.
Contact.reindex
Note: if you’re using an index-specific API key, ensure you’re allowing both <INDEX_NAME> and <INDEX_NAME>.tmp.
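The temporary-index approach can be sketched with an in-memory store. This is an illustration under stated assumptions: `FakeIndexStore` and `atomic_reindex` are hypothetical names, and the real move operation happens on the Algolia side rather than in your application.

```ruby
# Hypothetical in-memory stand-in for a set of named indices.
class FakeIndexStore
  def initialize
    @indices = {}
  end

  # Upserts records (objectID => record) into the named index.
  def save(name, records)
    (@indices[name] ||= {}).merge!(records)
  end

  # Replaces the destination index with the source index in one step,
  # which also removes the source.
  def move(source, destination)
    @indices[destination] = @indices.delete(source)
  end

  def records(name)
    @indices.fetch(name, {})
  end
end

# Fills "<index_name>.tmp" first, then swaps it in. The live index keeps
# serving its old content until the move happens.
def atomic_reindex(store, index_name, database)
  tmp_name = "#{index_name}.tmp"
  store.save(tmp_name, database)
  store.move(tmp_name, index_name)
end

store = FakeIndexStore.new
store.save("Contact", { 1 => "Alice", 2 => "Bob" })
atomic_reindex(store, "Contact", { 1 => "Alice", 3 => "Carol" })
p store.records("Contact").keys.sort
```

Because the final index is replaced rather than merged, records deleted from the database do not survive the swap.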
Clearing the index
Clearing an index will remove all records but preserve settings, synonyms, and rules.
Contact.clear_index!
Indexing a subset
You can index a subset of your records using model scoping.
You most likely don’t want to use zero-downtime reindexing in this case, because it replaces the entire index and keeps only the filtered objects. Use reindex! (with the trailing !) instead.
Contact.where('updated_at > ?', 10.minutes.ago).reindex!
If you already have a list of records and want to send them, pass the list to the index_objects class method.
objects = Contact.limit(5)
Contact.index_objects objects
Indexing a single instance
You can trigger indexing using the index! instance method. In the same way, you can remove a record from the Algolia index with remove_from_index!.
c = Contact.create!(params[:contact])
# Add to Algolia
c.index!
# Remove from Algolia
c.remove_from_index!
Automatic updates
To keep Algolia indices up to date, this gem makes extensive use of Rails callbacks to trigger indexing tasks.
If you use methods that bypass the after_validation, before_save, or after_commit callbacks, your changes won’t be indexed. For example, update_attribute doesn’t perform validation checks, so it’s recommended to use update_attributes instead, which does.
Each time a record is saved, it is indexed. In the same way, each time a record is destroyed, it is removed from the index.
You can disable auto-indexing and auto-removing by setting the following options:
class Contact < ActiveRecord::Base
  include AlgoliaSearch

  algoliasearch auto_index: false, auto_remove: false do
    attribute :first_name, :last_name, :email
  end
end
Temporarily disable auto-indexing
You can temporarily disable auto-indexing using the without_auto_index scope.
This is often used for performance reasons. In the following snippets, we’ll delete all contacts in the Algolia index and create 10,000 contacts. The first snippet dispatches an indexing operation to Algolia after each creation, resulting in 10,001 HTTP calls.
Contact.delete_all
1.upto(10000) { Contact.create! attributes }
By disabling auto-indexing while creating contacts and calling reindex! afterwards, we dispatch only about a dozen HTTP calls, because reindex! batches records automatically.
Contact.delete_all
Contact.without_auto_index do
  1.upto(10000) { Contact.create! attributes } # inside this block, auto-indexing tasks will not run
end
Contact.reindex! # will use batch operations
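The saving comes from grouping records into batches. The arithmetic can be sketched as follows, assuming a batch size of 1,000 records purely for illustration. The helper `batched_request_count` is hypothetical and only counts how many requests a given batch size would produce.

```ruby
# Number of requests needed to send `records` in slices of `batch_size`.
# A batch size of 1 models one request per created record.
def batched_request_count(records, batch_size)
  records.each_slice(batch_size).count
end

records = (1..10_000).map { |i| { objectID: i } }

puts batched_request_count(records, 1)     # one request per record
puts batched_request_count(records, 1_000) # one request per batch
```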
Indexing only if attributes have changed
For every attribute bound to a database column, Rails provides a method named #{attr_name}_changed? that tells you whether the attribute has changed.
If you define a dynamic attribute such as full_name, you should also provide a full_name_changed? method. Without this method, every model touched dispatches an indexing operation even if nothing has changed. Internally, the gem calls the _changed? method for every attribute found in the algoliasearch block.
class Contact < ActiveRecord::Base
  include AlgoliaSearch

  algoliasearch do
    attributes :first_name, :email
    attribute :full_name
  end

  def full_name
    "#{first_name} #{last_name}"
  end

  def full_name_changed?
    first_name_changed? || last_name_changed?
  end
end
The #{attr_name}_changed? methods were deprecated in Rails 5.1 and renamed to will_save_change_to_#{attr_name}?.
This gem checks for both method names regardless of your Rails version, so you can:
- Prepare for the future and use the new name with older versions of Rails.
- Upgrade without issues, since the old method name will still be called.
In the previous example, we could define will_save_change_to_full_name? instead.
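A check that accepts either naming convention could look like the sketch below. It is a simplified illustration, not the gem's code: `attribute_changed?` and the three contact classes are hypothetical, the new-style name is assumed to carry the trailing ? used by Rails, and treating a missing method as "changed" is an assumption based on the behavior described earlier in this section.

```ruby
# Returns the result of whichever change-tracking method the object defines,
# trying the newer name first. When neither exists, a change cannot be ruled
# out, so the sketch returns true.
def attribute_changed?(object, attr_name)
  new_name = "will_save_change_to_#{attr_name}?"
  old_name = "#{attr_name}_changed?"

  [new_name, old_name].each do |method_name|
    return object.public_send(method_name) if object.respond_to?(method_name)
  end
  true
end

# Defines only the older method name.
class LegacyContact
  def full_name_changed?
    false
  end
end

# Defines only the newer method name.
class ModernContact
  def will_save_change_to_full_name?
    true
  end
end

# Defines neither method.
class PlainContact
end

p attribute_changed?(LegacyContact.new, :full_name)
p attribute_changed?(ModernContact.new, :full_name)
p attribute_changed?(PlainContact.new, :full_name)
```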
tags and geoloc helpers
If you are using the tags or geoloc helpers, they map to the _tags and _geoloc attributes under the hood, resulting in the following method names. Notice the double underscores in the last two.
- _tags_changed?
- _geoloc_changed?
- will_save_change_to__tags?
- will_save_change_to__geoloc?
class Contact < ActiveRecord::Base
  include AlgoliaSearch

  algoliasearch do
    attributes :first_name, :email
    geoloc :latitude, :longitude
  end

  def _geoloc_changed?
    latitude_changed? || longitude_changed?
  end
end
Single _changed? method
If you prefer, you can also define an algolia_dirty? method. If this method is found, the gem uses it instead of calling all the _changed? methods.
class Contact < ActiveRecord::Base
  include AlgoliaSearch

  algoliasearch do
    attributes :first_name, :email
    attribute :full_name
  end

  def full_name
    "#{first_name} #{last_name}"
  end

  def algolia_dirty?
    # return true if model should be reindexed
  end
end
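The precedence described above, where a single algolia_dirty? method wins over the per-attribute checks, can be sketched as follows. `needs_indexing?` and both contact classes are hypothetical, and the sketch only looks at the older #{attr_name}_changed? names to stay short.

```ruby
# If the record defines algolia_dirty?, its answer is final. Otherwise the
# record needs indexing when any attribute reports a change, or when an
# attribute has no change-tracking method at all.
def needs_indexing?(record, attr_names)
  return record.algolia_dirty? if record.respond_to?(:algolia_dirty?)

  attr_names.any? do |name|
    checker = "#{name}_changed?"
    !record.respond_to?(checker) || record.public_send(checker)
  end
end

# Defines algolia_dirty?, so first_name_changed? is never consulted.
class DirtyAwareContact
  def algolia_dirty?
    false
  end

  def first_name_changed?
    true
  end
end

# Relies on one method per attribute.
class PerAttributeContact
  def first_name_changed?
    false
  end

  def email_changed?
    true
  end
end

p needs_indexing?(DirtyAwareContact.new, [:first_name])
p needs_indexing?(PerAttributeContact.new, [:first_name, :email])
```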
Conditional indexing
You can add constraints that control whether a record must be indexed by using the :if or :unless options.
This lets you do conditional indexing on a per-record basis.
class Post < ActiveRecord::Base
  include AlgoliaSearch

  algoliasearch if: :published?, unless: :deleted? do
  end

  def published?
    # [...]
  end

  def deleted?
    # [...]
  end
end
As soon as you use those constraints, addObjects and deleteObjects calls are performed to keep the index synced with the database. Because this gem is stateless, it can’t tell whether a record no longer matches your constraints or never matched them, so it always sends an add or a delete operation. You can work around this behavior by creating a #{attr_name}_changed? method.
class Contact < ActiveRecord::Base
  include AlgoliaSearch

  algoliasearch if: :published do
  end

  def published
    # true or false
  end

  def published_changed?
    # return true only if you know that the 'published' state changed
  end
end
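The effect of the workaround can be modeled with a small decision function. This is a deliberately simplified sketch that only considers the constraint itself and ignores attribute-level change checks. `operation_for`, `StatelessPost`, and `TrackedPost` are hypothetical names, and the real gem logic may differ.

```ruby
# Decides which operation the constraint alone would trigger on save:
#   :add    when the record matches the constraint,
#   :delete when it does not,
#   :skip   when the record can tell that the constraint's state is unchanged.
def operation_for(record, condition)
  changed_method = "#{condition}_changed?"
  if record.respond_to?(changed_method) && !record.public_send(changed_method)
    return :skip
  end
  record.public_send(condition) ? :add : :delete
end

# No change tracking: an operation is always sent.
StatelessPost = Struct.new(:published)

# Knows whether the published flag changed since the last save.
TrackedPost = Struct.new(:published, :published_changed) do
  def published_changed?
    published_changed
  end
end

p operation_for(StatelessPost.new(false), :published)
p operation_for(TrackedPost.new(false, false), :published)
p operation_for(TrackedPost.new(false, true), :published)
```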