Connecting to JDBC datasources
Siren Federate provides the capability to query data from a remote JDBC databases.
Settings
In order to send queries to virtual indices the Elasticsearch cluster must contain at least one node enabled to issue queries over JDBC; it is advised to use a coordinating only node for this role, although this is not a requirement for testing purposes.
JDBC node settings
In order to enable JDBC on a node where the Siren Federate plugin is installed, add the following setting to elasticsearch.yml
:
node.attr.connector.jdbc: true
Then, create a directory named jdbc-drivers
inside the configuration directory of the node (e.g. elasticsearch/config
or /etc/elasticsearch
).
Finally, copy the JDBC driver for your remote datasource and its dependencies to the jdbc-drivers
directory created above and restart the node; see the JDBC driver installation and compatibility section for a list of compatible drivers and dependencies.
Common configuration settings
Encryption
JDBC passwords are encrypted by default using a predefined 128 bit AES key; before creating datasources, it is advised to generate a custom key by running the keygen.sh
script included in the siren-federate
plugin directory as follows:
bash plugins/siren-federate/tools/keygen.sh -s 128
The command will output a random base64 key; it is also possible to generate keys longer than 128 bit if your JVM supports it.
To use the custom key, the following parameters must be set in elasticsearch.yml
on master nodes and on all the JDBC nodes:
-
siren.connector.encryption.enabled
:true
by default, can be set tofalse
to disable JDBC password encryption. -
siren.connector.encryption.secret_key
: a base64 encoded AES key used to encrypt JDBC passwords.
Example elasticsearch.yml
settings for a master node with a custom encryption key:
siren.connector.encryption.secret_key: "1zxtIE6/EkAKap+5OsPWRw=="
Example elasticsearch.yml
settings for a JDBC node with a custom encryption key:
siren.connector.encryption.secret_key: "1zxtIE6/EkAKap+5OsPWRw==" node.attr.connector.jdbc: true
Restart the nodes after changing the configuration to apply the settings.
JDBC driver installation and compatibility
The JDBC driver for your remote datasource and its dependencies must be copied to the jdbc-drivers
subdirectory inside the configuration directory of JDBC nodes (e.g. elasticsearch/config/jdbc-drivers
).
It is not required nor recommended to copy these drivers to nodes which are not enabled to execute queries.
You may create a sub-directory within jdbc-drivers
to store a driver and to provide a custom security policy file for this driver. It is recommended to use this approach for drivers that come in multiple jars. A custom security policy file enables the definition of driver-specific permissions. The custom security policy file must be named security.policy
and must be located inside the driver sub-directory. The following variables can be used within the policy file:
-
codebase.federate.common
: Path to the directory storing thesecurity.policy
(defaults tojdbc-drivers
if the defaultdrivers-security.policy
file is used) -
codebase.federate.${jar_name)
: Path to a driver jar. Here,${jar_name}
refers to the filename of the jar stored in the directory where thesecurity.policy
file is located (defaults to jars found injdbc-drivers
if the defaultdrivers-security.policy
file is used). For Example:
grant codeBase "${codebase.federate.postgresql-42.2.5.jar}" { // Permissions for postgresql-42.2.5.jar }
If a security.policy
is placed in the main jdbc-drivers
directory, then it overrides the default drivers-security.policy
Name | JDBC class | Notes |
---|---|---|
PostgreSQL | org.postgresql.Driver | Download the latest JDBC 4.2 driver from https://jdbc.postgresql.org/download.html and copy the |
MySQL | com.mysql.jdbc.Driver | Download the latest GA release from https://dev.mysql.com/downloads/connector/j/, extract it, then copy When writing the JDBC connection string, set the |
Microsoft SQL Server 2014 or greater | com.microsoft.sqlserver.jdbc.SQLServerDriver | Download |
Sybase ASE 15.7+ | com.sybase.jdbc4.jdbc.SybDriver OR net.sourceforge.jtds.jdbc.Driver | To use the FreeTDS driver, download the latest version from https://sourceforge.net/projects/jtds/files/, extract it, then copy To use the jConnect driver, copy |
Oracle 12c+ | oracle.jdbc.OracleDriver | Download the latest |
Presto | com.facebook.presto.jdbc.PrestoDriver | Download the latest JDBC driver from https://prestodb.io/docs/current/installation/jdbc.html and copy it to the |
Spark SQL 2.2+ | com.simba.spark.jdbc41.Driver | The Magnitude JDBC driver for Spark can be purchased at https://www.simba.com/product/spark-drivers-with-sql-connector/; once downloaded, extract the bundle, then extract the JDBC 4.1 archive and copy the following jars to the In addition, copy your license file to the |
Dremio | com.dremio.jdbc.Driver | Download the jar at https://download.siren.io/dremio-jdbc-driver-1.4.4-201801230630490666-6d69d32.jar and copy it to the |
Impala | com.cloudera.impala.jdbc41.Driver | Download the latest JDBC bundle from https://www.cloudera.com/downloads/connectors/impala/jdbc/2-5-42.html, extract the bundle, then extract the JDBC 4.1 archive and copy the following jars to the |
Restart the JDBC node after copying the drivers.
Operations on virtual indices
The plugin supports the following operations on virtual indices:
- get mapping
- get field capabilities
- search
- msearch
- get
- mget
Search requests involving a mixture of virtual and normal Elasticsearch indices (e.g. when using a wildcard) are not supported and will be rejected; it is however possible to issue msearch requests containing requests on normal Elasticsearch indices and virtual indices.
When creating a virtual index, the plugin will create an empty Elasticsearch index for interoperability with Search Guard and Elastic X-Pack; if an Elasticsearch index with the same name as the virtual index already exists and it is not empty, the virtual index creation will fail.
When deleting a virtual index, the corresponding Elasticsearch index will not be removed.
Type conversion
The plugin converts JDBC types to their closest Elasticsearch equivalent:
- String types are handled as
keyword
fields. - Boolean types are handled as
boolean
fields. - Date and timestamp are handled as
date
fields. - Integer types are handled as
long
fields. - Floating point types are handled as
double
fields.
Complex JDBC types which are not recognized by the plugin are skipped during query processing and resultset fetching.
Supported search queries
The plugin supports the following queries:
- match_all
- term
- terms
- range
- exists
- prefix
- wildcard
- ids
- bool
At this time the plugin provides no support for datasource specific full text search functions, so all these queries will work as if they were issued against keyword
fields.
Supported aggregations
Currently the plugin provides support for the following aggregations:
Metric:
- Average
- Cardinality
- Max
- Min
- Sum
Bucket:
- Date histogram
- Histogram
- Date range
- Range
- Terms
- Filters
Only terms aggregations can be nested inside a parent bucket aggregation.
Troubleshooting
Cannot reconnect to datasource by hostname after DNS update
When the Java security manager is enabled, the JVM will cache name resolutions indefinitely; if the system you’re connecting to uses round-robin DNS or the IP address of the system changes frequently, you will need to modify the following Java Security Policy properties:
-
networkaddress.cache.ttl
: the number of seconds to cache a successful DNS lookup. Defaults to-1
(forever). -
networkaddress.cache.negative.ttl
: the number of seconds to cache an unsuccessful DNS lookup. Defaults to10
, set to0
to avoid caching.