In this article, we will see when to use the Elasticsearch percolator query and how to implement it in Ruby. The setup steps are written for Ubuntu, but they work on other Linux distributions too.

How does it work?

We believe most Elasticsearch developers think conventionally: they design documents according to the structure of their data and store them in an index, then define queries through the search API to retrieve these documents. The percolator works in the opposite direction. First you store queries in an index, and then, through the Percolate API, you supply documents in order to retrieve the queries that match them.

All queries are loaded into memory
Each document is indexed into a temporary in-memory index
All queries are executed against it
Execution time is linear in the number of queries
The in-memory index is cleaned up afterwards
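
To make the reverse direction concrete, here is a toy sketch in plain Ruby. It is not how Elasticsearch implements percolation; ToyPercolator and its methods are hypothetical names, and each stored "query" is just a single term to look for in the message field.

```ruby
# Toy illustration only: this is NOT how Elasticsearch is implemented.
# ToyPercolator and its method names are hypothetical.
class ToyPercolator
  def initialize
    @queries = {} # query id => term the stored query looks for
  end

  # Store a query (here just a single term) under an id.
  def register(id, term)
    @queries[id] = term.downcase
  end

  # Tokenise the document in memory, run every stored query against it,
  # and return the ids of the queries that match.
  def percolate(document)
    tokens = document.fetch(:message).downcase.split(/\W+/)
    @queries.select { |_id, term| tokens.include?(term) }.keys
  end
end

percolator = ToyPercolator.new
percolator.register('foo', 'foo')
percolator.register('bar', 'bar')
percolator.register('baz', 'baz')

p percolator.percolate(message: 'message foo bar') # => ["foo", "bar"]
```

Note that every stored query is checked for every document, which is why the execution time grows with the number of queries.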

When do we need to use percolator?

The Percolate API is commonly used for document monitoring and alerting.

For example, consider a platform that stores users’ interests so that it can send the right content (a notification alert) to the right users whenever new content comes in.

For instance, a user subscribes to a specific topic, and as soon as a new article for that topic comes in, a notification will be sent to the interested users.

How is this done?

By expressing the users’ interests as an Elasticsearch query, using the query DSL, and registering it in Elasticsearch as though it were a document. Every time a new article is published, you can percolate it, without needing to index it, to find out which users are interested in it.

At this point you know who needs to receive a notification containing the article link (sending the notification is not done by Elasticsearch, though). An additional step would be to index the content itself, but that is not required.

This concept has many uses, such as weather forecast alerts, price monitoring, news alerts, stock alerts, logos monitoring and many more.
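
As a minimal sketch of this idea, the snippet below turns a set of subscriptions into query documents. The SUBSCRIPTIONS data and the subscription_queries helper are hypothetical; the message field is the one used in the mapping later in this article.

```ruby
# Hypothetical data: user id => topic the user subscribed to.
SUBSCRIPTIONS = {
  'user-1' => 'elasticsearch',
  'user-2' => 'ruby'
}.freeze

# Build the query document to store for each subscription. The id of the
# stored query is the user id, so a percolate hit points back to the user.
def subscription_queries(subscriptions)
  subscriptions.map do |user_id, topic|
    { id: user_id, body: { query: { match: { message: topic } } } }
  end
end

subscription_queries(SUBSCRIPTIONS).each { |q| puts q.inspect }
```

Using the user id as the query id is only one possible design; it keeps the lookup from a matched query back to the user trivial.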

Pre-requisites & Setup:

Java:

Elasticsearch is developed in Java, so we need to make sure Java is installed, using the command below (Java 8 only accepts the single-dash form):

java -version

Installing Elasticsearch:

Next, install Elasticsearch with the following command:

sudo apt-get install elasticsearch

In order to make sure that Elasticsearch is installed correctly, use the following command:

curl -XGET 'localhost:9200'

The result should be something like the following:

{
  "name" : "lNOxiFt",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "r8yOSyCjRtmHFYmdbijjpg",
  "version" : {
    "number" : "5.1.2",
    "build_hash" : "c8c4c16",
    "build_date" : "2017-01-11T20:18:39.146Z",
    "build_snapshot" : false,
    "lucene_version" : "6.3.0"
  },
  "tagline" : "You Know, for Search"
}

Using Percolator:

The following steps explain how your queries get stored in an index and how you supply documents in order to retrieve these queries through the Percolate API.

Requirement and service setup
Making a connection
Create an index
Index a query
Percolate a document

Requirement & Service setup

In order to implement the Elasticsearch percolator, we need the elasticsearch gem.

gem 'elasticsearch'

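I created a service object to index the queries. It is initialised with the configuration hash (cfg) that holds the Elasticsearch host and port.

index_service = Services::Percolation.new(cfg)
index_service.re_index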

Making a connection

In order to make a connection, we need elasticsearch-transport, which provides a low-level Ruby client for connecting to an Elasticsearch cluster.

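require 'elasticsearch'
require 'typhoeus'
require 'typhoeus/adapters/faraday'

# cfg is a hash that holds the host under cfg['elastic']['url']
# and the port under cfg['elastic']['port'].
def initialize(cfg)
  @cfg = cfg
  # Log responses and use Typhoeus as the Faraday HTTP adapter.
  transport_configuration = lambda do |f|
    f.response :logger
    f.adapter  :typhoeus
  end
  transport = Elasticsearch::Transport::Transport::HTTP::Faraday.new(
    hosts: [{ host: @cfg['elastic']['url'], port: @cfg['elastic']['port'] }],
    &transport_configuration
  )
  @server = Elasticsearch::Client.new log: true, transport: transport
end

# Drop and recreate the index, then register one query per term.
def re_index
  index_name = "percolator-index"
  delete_index(index_name)
  create_index(index_name)
  ds = ['foo', 'bar']
  ds.each do |i|
    index(i, index_name)
  end
end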
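
The re_index method above calls delete_index, which is not shown in this article. A minimal sketch of what it could look like follows. It assumes an object that behaves like client.indices from the elasticsearch gem, which provides exists? and delete. FakeIndices is a hypothetical stand-in so that the sketch runs without a cluster; inside the service you would pass @server.indices instead.

```ruby
# `indices` is expected to behave like `client.indices` from the
# elasticsearch gem, which provides `exists?` and `delete`.
# Returns true if the index was deleted, false if it did not exist.
def delete_index(indices, index_name)
  return false unless indices.exists?(index: index_name)

  indices.delete(index: index_name)
  true
end

# Hypothetical stand-in for `client.indices`, used only so the sketch
# runs without a cluster.
class FakeIndices
  def initialize(names)
    @names = names
  end

  def exists?(index:)
    @names.include?(index)
  end

  def delete(index:)
    @names.delete(index)
  end
end

indices = FakeIndices.new(['percolator-index'])
puts delete_index(indices, 'percolator-index') # => true
puts delete_index(indices, 'percolator-index') # => false
```

Checking for existence first means the very first run, when the index has not been created yet, does not fail.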

Create an index

Create an index with two mappings:

def create_index(index_name)
  @server.indices.create index: index_name, body: {
    mappings: {
      doctype: {
        properties: {
          message: {
            type: "text"
          }
        }
      },
      queries: {
        properties: {
          query: {
            type: "percolator"
          }
        }
      }
    }
  }
end

The doctype mapping is the mapping used to pre-process the document defined in the percolator query before it gets indexed into a temporary index.

The queries mapping is the mapping used for indexing the query documents. A JSON object is stored in the query field, and this JSON object is itself an Elasticsearch query.

The query field is configured to use the percolator field type. This field type is used because it is the one that understands the query DSL.

It is also useful because of the way it stores the query: documents specified in a percolator query can later be matched against the stored query.
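
The two-mapping layout above matches the 5.1.2 version shown earlier. Later Elasticsearch releases moved away from mapping types, so, assuming version 7 or later, the same idea would need a single typeless mapping and a percolate query without document_type. The sketch below only builds the request bodies as Ruby hashes; the helper names are hypothetical and nothing is sent to a cluster.

```ruby
require 'json'

# Index body with a single, typeless mapping: the document field and the
# percolator field live side by side.
def typeless_index_body
  {
    mappings: {
      properties: {
        message: { type: 'text' },
        query: { type: 'percolator' }
      }
    }
  }
end

# Percolate query body without document_type.
def typeless_percolate_body(message)
  { query: { percolate: { field: 'query', document: { message: message } } } }
end

puts JSON.pretty_generate(typeless_index_body)
puts JSON.generate(typeless_percolate_body('message foo bar'))
```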

Index a query

def index(ds, index_name)
  # Store a match query on the message field, using the term as the query id.
  query = { query: { match: { message: ds.to_s } } }
  begin
    r = @server.index index: index_name, type: 'queries', id: ds, body: query
    puts 'Indexing result:'
    puts r.inspect
    r # return the response so that success is distinguishable from failure
  rescue Faraday::Error::ResourceNotFound,
         Faraday::Error::ClientError,
         Faraday::Error::ConnectionFailed => e
    puts "Connection failed: #{e}"
    false
  end
end

Percolate a document

Match a document to the registered percolator queries:

def list_document(index_name = 'percolator-index')
  # Wait for the index refresh so that the newly indexed queries are searchable.
  sleep 2
  percolate = { field: "query", document_type: "doctype", document: { message: 'message foo bar' } }
  doc = { query: { percolate: percolate } }
  data = @server.search index: index_name, type: 'queries', body: doc
  puts "final result"
  puts data
end

The above request yields the following response:

{"took"=>8, "timed_out"=>false, "_shards"=>{"total"=>5, "successful"=>5, "failed"=>0}, "hits"=>{"total"=>2, "max_score"=>0.25316024, "hits"=>[{"_index"=>"percolator-index", "_type"=>"queries", "_id"=>"foo", "_score"=>0.25316024, "_source"=>{"query"=>{"match"=>{"message"=>"foo"}}}}, {"_index"=>"percolator-index", "_type"=>"queries", "_id"=>"bar", "_score"=>0.25316024, "_source"=>{"query"=>{"match"=>{"message"=>"bar"}}}}]}}

This can then be used in whichever manner is needed to render the desired output.

This is a sample implementation of the Elasticsearch percolator query using Ruby, and as mentioned above, the percolator has quite a lot of uses. To learn more, see the Elasticsearch Percolator documentation.

To check out this particular example, please see the agiratech GitHub repo.
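
For instance, a small helper can pull out the ids of the matched queries. This is a sketch: matched_query_ids is a hypothetical name, and the response hash is a trimmed copy of the output shown above.

```ruby
# Trimmed copy of the percolate response shown above.
response = {
  'hits' => {
    'total' => 2,
    'hits' => [
      { '_id' => 'foo', '_score' => 0.25316024 },
      { '_id' => 'bar', '_score' => 0.25316024 }
    ]
  }
}

# Collect the ids of the stored queries that matched the document.
def matched_query_ids(response)
  response.fetch('hits').fetch('hits').map { |hit| hit['_id'] }
end

p matched_query_ids(response) # => ["foo", "bar"]
```

If the query ids encode who registered the query, this list is all that is needed to decide whom to notify.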

Elasticsearch Percolator Query Implementation in Ruby

Bharanidharan Arumugam
