Configuring a remote Elasticsearch connector
Siren Federate provides the capability to query data from an Elasticsearch cluster through the remote clusters module and the Siren Federate connector APIs.
The remote Elasticsearch cluster does not have the Siren Federate plugin installed. Therefore Siren Federate cannot push down a join to the remote cluster. Instead, the computation of the join is done on the local cluster using the |
Compatibility with security systems
To execute joins spanning several clusters, set the following cluster- and index-level permissions on the clusters.
On the local Federate cluster:
- cluster:internal/federate/*
- indices:data/read/mget
- indices:data/read/msearch
- indices:data/read/mtv
- indices:data/read/scroll*
- indices:data/read*
- indices:admin/template/get
- indices:admin/aliases/get
- indices:admin/aliases/exists
- indices:admin/get
- indices:admin/exists
- indices:admin/mappings/fields/get*
- indices:admin/mappings/get*
- indices:admin/mappings/federate/connector/get*
- indices:admin/mappings/federate/connector/fields/get*
- indices:admin/types/exists
- indices:admin/validate/query
- indices:monitor/settings/get
On the remote Elasticsearch cluster:
- indices:data/read/mget
- indices:data/read/msearch
- indices:data/read/mtv
- indices:data/read/scroll*
- indices:admin/template/get
- indices:data/read*
- indices:data/read/search
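The two lists above mix cluster-level and index-level permissions, which are usually granted in separate parts of a role definition. The following is a minimal sketch that groups the listed names by their prefix. The helper name `split_by_level` is hypothetical, and how the resulting groups are attached to a role depends on the security system in use, which this sketch does not model.

```python
# Permission names copied from the two lists above.
LOCAL_PERMISSIONS = [
    "cluster:internal/federate/*",
    "indices:data/read/mget",
    "indices:data/read/msearch",
    "indices:data/read/mtv",
    "indices:data/read/scroll*",
    "indices:data/read*",
    "indices:admin/template/get",
    "indices:admin/aliases/get",
    "indices:admin/aliases/exists",
    "indices:admin/get",
    "indices:admin/exists",
    "indices:admin/mappings/fields/get*",
    "indices:admin/mappings/get*",
    "indices:admin/mappings/federate/connector/get*",
    "indices:admin/mappings/federate/connector/fields/get*",
    "indices:admin/types/exists",
    "indices:admin/validate/query",
    "indices:monitor/settings/get",
]

REMOTE_PERMISSIONS = [
    "indices:data/read/mget",
    "indices:data/read/msearch",
    "indices:data/read/mtv",
    "indices:data/read/scroll*",
    "indices:admin/template/get",
    "indices:data/read*",
    "indices:data/read/search",
]


def split_by_level(permissions):
    """Group permission names by the prefix before the first colon."""
    grouped = {"cluster": [], "indices": []}
    for permission in permissions:
        level = permission.split(":", 1)[0]
        grouped.setdefault(level, []).append(permission)
    return grouped


local = split_by_level(LOCAL_PERMISSIONS)
remote = split_by_level(REMOTE_PERMISSIONS)

print("local cluster-level:", local["cluster"])
print("local index-level count:", len(local["indices"]))
print("remote index-level count:", len(remote["indices"]))
```

In this grouping, only the local Federate cluster needs a cluster-level permission; every permission listed for the remote cluster is index-level.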
The remote Elasticsearch connector is compatible with the following security systems:
Before you begin
- Ensure that the remote clusters are configured as described in the Configuring remote clusters section of the Elasticsearch documentation.
- Set up the remote Elasticsearch clusters. For example, use the following settings:
  curl -X PUT http://localhost:9200/_cluster/settings -H 'Content-Type: application/json' -d '
  {
    "persistent": {
      "cluster": {
        "remote": {
          "remotefederate": {
            "seeds": [ "127.0.0.1:9330" ]
          }
        }
      }
    }
  }
  '
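The same settings request can be assembled programmatically, which helps when the cluster alias or seed addresses vary between environments. The following is a minimal sketch using only the Python standard library. The function names are hypothetical, and the base URL, alias, and seed address are the example values from the curl command above. The request is built but not sent.

```python
import json
import urllib.request


def remote_cluster_settings(alias, seeds):
    """Build the persistent settings body that registers a remote cluster."""
    return {
        "persistent": {
            "cluster": {
                "remote": {
                    alias: {"seeds": list(seeds)}
                }
            }
        }
    }


def build_settings_request(base_url, alias, seeds):
    """Prepare (but do not send) the PUT request to the cluster settings API."""
    body = json.dumps(remote_cluster_settings(alias, seeds)).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/_cluster/settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


request = build_settings_request(
    "http://localhost:9200", "remotefederate", ["127.0.0.1:9330"]
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))

# To apply the settings against a running cluster, uncomment:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```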
Procedure
In this procedure, we are using the example of a remote Elasticsearch cluster called remoteelasticsearch, which contains indices called logs-2019.01, logs-2019.02, …, logs-2019.12, and so on.
- Define the datasource as an alias to the remote Elasticsearch cluster, by using the Siren Federate datasource API as follows:
  curl -X PUT http://localhost:9200/_siren/connector/datasource/remoteelasticsearchds -H 'Content-Type: application/json' -d '
  {
    "elastic": {
      "alias": "remoteelasticsearch"
    }
  }
  '
- Define a virtual index on the coordinator cluster that matches the wildcard index pattern logs-*, by using the Siren Federate virtual index API as follows:
  curl -X PUT http://localhost:9200/_siren/connector/index/logsvi -H 'Content-Type: application/json' -d '
  {
    "datasource": "remoteelasticsearchds",
    "resource": "logs-*",
    "key": "_id"
  }
  '
- Execute a join query. For example, the coordinator cluster contains an index called machines, which contains information about IP addresses on machines of interest. To find the logs that are associated with these machines, execute the following Federate join query:
  curl -X GET http://localhost:9200/siren/logsvi/_search -H 'Content-Type: application/json' -d '
  {
    "query": {
      "join": {
        "indices": [ "machines" ],
        "on": [ "logs_ip_hash", "machines_ip_hash" ],
        "request": {
          "query": {
            "match_all": { }
          }
        }
      }
    }
  }
  '
  logs_ip_hash is the IP field in the index logsvi and machines_ip_hash is the IP field in the index machines.
  The API returns the following response:
  {
    "took": 150,
    "timed_out": false,
    "hits": {
      "total": {
        "value": 1,
        "relation": "eq"
      },
      "max_score": 1,
      "hits": [
        {
          "_index": "logs-2019-11-12",
          "_id": "0",
          "_score": 2,
          "_source": {
            "date": "2019-11-12T12:12:12",
            "message": "trying out Siren"
          }
        }
      ]
    }
  }
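The three steps of the procedure can also be sketched as request bodies built in code, together with a small reader for the search response. This is an illustrative sketch only: the helper names are hypothetical, the paths and field names are the example values used above, and the sample response is the one shown in this section. Nothing is sent to a cluster.

```python
import json


def datasource_body(remote_alias):
    """Body for the datasource API: points at a configured remote cluster alias."""
    return {"elastic": {"alias": remote_alias}}


def virtual_index_body(datasource, resource, key="_id"):
    """Body for the virtual index API: maps a remote index pattern to a datasource."""
    return {"datasource": datasource, "resource": resource, "key": key}


def join_query_body(joined_index, searched_field, joined_field):
    """Body for a join between the searched index and one joined index."""
    return {
        "query": {
            "join": {
                "indices": [joined_index],
                # First the field in the searched index, then the field in the joined index.
                "on": [searched_field, joined_field],
                "request": {"query": {"match_all": {}}},
            }
        }
    }


def summarize_hits(response):
    """Return the total hit count and (index, id, message) for each hit."""
    hits = response["hits"]
    rows = [(h["_index"], h["_id"], h["_source"]["message"]) for h in hits["hits"]]
    return hits["total"]["value"], rows


steps = [
    ("PUT", "/_siren/connector/datasource/remoteelasticsearchds",
     datasource_body("remoteelasticsearch")),
    ("PUT", "/_siren/connector/index/logsvi",
     virtual_index_body("remoteelasticsearchds", "logs-*")),
    ("GET", "/siren/logsvi/_search",
     join_query_body("machines", "logs_ip_hash", "machines_ip_hash")),
]
for method, path, body in steps:
    print(method, path, json.dumps(body))

# Sample response copied from the example above.
sample_response = {
    "took": 150,
    "timed_out": False,
    "hits": {
        "total": {"value": 1, "relation": "eq"},
        "max_score": 1,
        "hits": [
            {
                "_index": "logs-2019-11-12",
                "_id": "0",
                "_score": 2,
                "_source": {
                    "date": "2019-11-12T12:12:12",
                    "message": "trying out Siren",
                },
            }
        ],
    },
}
total, rows = summarize_hits(sample_response)
print(total, rows)
```

Note that the `_index` value in each hit is the name of the concrete index on the remote cluster, not the name of the virtual index that was searched.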