Introduction

Siren Investigate is an investigative intelligence platform built upon Kibana 5.6.9. Siren Investigate 10.0.0 supports Elasticsearch 5.6.9 and the Siren Federate plugin version 5.6.9-10.0.0. Siren Federate replaces the Siren Join plugin for distributions based on Elasticsearch 5.x.

Siren Investigate enables you to perform complex analytics on large volumes of data by providing customizable visualizations (charts, maps, metrics and tables) on Elasticsearch searches. Visualizations can be organized into multiple dashboards.

Search results can be filtered interactively through a variety of techniques (date ranges, full text queries, field value matching). By setting up relations between indices, it is possible to filter the search results on one dashboard to those matching documents displayed on a different dashboard, for example displaying only companies that received investments in a particular year.

In addition, search results can be filtered and augmented by queries on multiple external datasources such as SQL databases and REST APIs; queries on external datasources can also be used as aggregations in visualizations.

In addition to visualizations provided by Kibana, Siren Investigate provides:

  • The Relational Filter visualization (Deprecated), which enables you to configure relations between fields in different indices and to apply cross-dashboard filters (pivoting).

  • The Relational Navigator visualization, which enables you to navigate between relationally connected dashboards.

  • The Siren Investigate Timeline visualization, which displays a timeline with multiple groups of data coming from different indices.

  • The Radar Chart visualization, which is a graphical method for displaying multivariate data with multiple groups of data coming from different indices.

  • The Bubble Diagram visualization, which displays series of data grouped into packed circles.

  • The Scatter Plot visualization, which displays a scatter plot chart in different modes.

  • The Box Plot visualization, which displays a box plot chart from the data.

  • The Horizontal Bar Chart visualization, which displays a horizontal bar chart.

  • The Multi Chart visualization, which displays different types of charts for the same data and allows you to save and select multiple aggregation configurations.

  • The Enhanced Search Results visualization, which displays query results in a table.

  • The Siren Investigate Query Viewer, which enables the visualization of queries on external datasources through Jade or Handlebars templates.

  • The Siren Investigate Graph Browser, which displays the currently selected Elasticsearch documents as nodes of a graph and enables the user to visually explore the connections between vertices.

The Relational Filter visualization requires the Siren Federate plugin 5.6.9-10.0.0 for Elasticsearch.

How does Siren Investigate compare to Kibana?

Siren Investigate is currently developed as a fork of Kibana 5.6.9. Although configuration objects are mostly the same, you should keep Siren Investigate and Kibana in separate indices.

What’s new in Siren Investigate v10.0.0

To see all the changes, check the full release notes.

Set up Siren Investigate

This section includes information on how to set up Siren Investigate and get it running, including:

  • Downloading

  • Installing

  • Starting

  • Configuring

  • Upgrading

Supported platforms

Packages of Siren Investigate are provided for and tested against Linux, macOS and Windows. Because Siren Investigate runs on Node.js, we include the necessary Node.js binaries for these platforms. Running Siren Investigate against a separately maintained version of Node.js is not supported.

Elasticsearch version

Siren Investigate should be configured to run against an Elasticsearch node of the same version. This is the officially supported configuration.

Running different major version releases of Siren Investigate and Elasticsearch, for example Siren Investigate 5.x and Elasticsearch 2.x, is not supported, nor is running a minor version of Siren Investigate that is newer than the version of Elasticsearch, for example Siren Investigate 5.4 and Elasticsearch 5.0.

Running a minor version of Elasticsearch that is higher than Siren Investigate will generally work in order to facilitate an upgrade process where Elasticsearch is upgraded first, for example Siren Investigate 5.2.2 and Elasticsearch 5.4. In this configuration, a warning will be logged on Siren Investigate server startup, so it is only meant to be temporary until Siren Investigate is upgraded to the same version as Elasticsearch.

Running different patch version releases of Siren Investigate and Elasticsearch, for example Siren Investigate 5.4.1-1 and Elasticsearch 5.4, is generally supported, though we encourage users to run the same versions of Siren Investigate and Elasticsearch down to the patch version.

Installing Siren Investigate

Siren Investigate is provided in the following package formats:

zip

The zip packages are provided for installation on Linux, Darwin and Windows and are the easiest choice for getting started with Siren Investigate.

docker

Siren Investigate Docker images are available at https://hub.docker.com/u/sirensolutions/.

If your Elasticsearch installation is protected by X-Pack Security see Using Kibana with X-Pack Security for additional setup instructions.

Running Siren Investigate on Docker

Docker images for Siren Investigate are available from the sirensolutions organization on Docker Hub.

Pulling the Image

Obtaining Siren Investigate for Docker is as simple as issuing a docker pull command.

The Docker image for Siren Investigate 10.0.0 can be retrieved with the following command:

docker pull sirensolutions/siren-platform:latest
docker run -d -p 5606:5606 -p 9220:9220 sirensolutions/siren-platform:latest

or for a specific version, for example 10.0.0:

docker pull sirensolutions/siren-platform:10.0.0
docker run -d -p 5606:5606 -p 9220:9220 sirensolutions/siren-platform:10.0.0

For an image pre-populated with demonstration data:

docker pull sirensolutions/siren-platform-demo-data:latest
docker run -d -p 5606:5606 -p 9220:9220 sirensolutions/siren-platform-demo-data:latest

Environment variable configuration

Under Docker, Siren Investigate can be configured using environment variables. When the container starts, a helper process checks the environment for variables that can be mapped to Siren Investigate command-line arguments.

For compatibility with container orchestration systems, these environment variables are written in all capitals, with underscores as word separators. The helper translates these names to valid Siren Investigate setting names.

Some example translations are shown here:

Example Docker Environment Variables:

Environment Variable         Siren Investigate Setting
SERVER_NAME                  server.name
KIBANA_DEFAULTAPPID          kibana.defaultAppId
XPACK_MONITORING_ENABLED     xpack.monitoring.enabled

In general, any setting listed in Configuring Siren Investigate or X-Pack Settings can be configured with this technique.
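For example, when starting the container directly, the same settings can be passed as -e flags to docker run (a sketch based on the image and ports shown above; adjust the values to your environment):

docker run -d -p 5606:5606 -p 9220:9220 \
  -e SERVER_NAME=siren.example.org \
  -e ELASTICSEARCH_URL=http://elasticsearch.example.org \
  sirensolutions/siren-platform:10.0.0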

These variables can be set with docker-compose like this:

services:
  investigate:
    image: sirensolutions/siren-platform:10.0.0
    environment:
      SERVER_NAME: siren.example.org
      ELASTICSEARCH_URL: http://elasticsearch.example.org

Because environment variables are translated to CLI arguments, they take precedence over settings configured in investigate.yml.

Docker defaults

The default settings when using the siren-investigate image (standalone Siren Investigate image) are:

elasticsearch.url    http://localhost:9220
server.basePath      ""
kibana.index         .siren

Install Siren Investigate with .zip

Siren Investigate is provided for Linux, Darwin and Windows as a .zip package. These packages are the easiest formats to use when trying out Siren Investigate.

The latest stable version of Siren Investigate can be found on the Siren Support Portal. Descriptions of the separate demonstration packages are also available there.

Download and install the Linux 64-bit package

After you have obtained your license, you should have access to the Siren Support Portal's download pages. The siren-investigate package can be downloaded from there, either by clicking the package link in the browser or by right-clicking the link, copying the link address, and using the copied link as {investigate-link} as follows:

wget {investigate-link}.zip
sha1sum siren-investigate-10.0.0-linux-x86_64.zip (1)
unzip siren-investigate-10.0.0-linux-x86_64.zip -d .
cd siren-investigate-10.0.0-linux-x86_64/ (2)
1 Compare the SHA produced by sha1sum or shasum with the published sha1.txt (found on the Siren Investigate download page on the Siren Support Portal).
2 This folder is known as $INVESTIGATE_HOME.

Download and install the Darwin package

After you have obtained your license, you should have access to the Siren Support Portal's download pages. The siren-investigate package can be downloaded from there, either by clicking the package link in the browser or by right-clicking the link, copying the link address, and using the copied link as {investigate-link} as follows:

wget {investigate-link}.zip
sha1sum siren-investigate-10.0.0-darwin-x86_64.zip (1)
unzip siren-investigate-10.0.0-darwin-x86_64.zip -d .
cd siren-investigate-10.0.0-darwin-x86_64/ (2)
1 Compare the SHA produced by sha1sum or shasum with the published sha1.txt (found on the Siren Investigate download page on the Siren Support Portal).
2 This folder is known as $INVESTIGATE_HOME.

Running Siren Investigate from the command prompt

Siren Investigate can be started from the command prompt as follows:

./bin/investigate

By default, Siren Investigate runs in the foreground, prints its logs to the standard output (stdout), and can be stopped by pressing Ctrl-C.

Configuring Siren Investigate using the config file

Siren Investigate loads its configuration from the $INVESTIGATE_HOME/config/investigate.yml file by default. The format of this config file is explained in Configuring Siren Investigate.
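As a minimal sketch, an investigate.yml that spells out the defaults documented later in this chapter would contain the following (adjust the values to your environment):

# config/investigate.yml
server.port: 5606
server.host: "localhost"
elasticsearch.url: "http://localhost:9220"
kibana.index: ".siren"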

Directory layout of Linux/Darwin .zip archives

The .zip packages are entirely self-contained.

This is very convenient because you do not have to create any directories to start using Siren Investigate, and uninstalling is as easy as removing the folder. However, it is advisable to change the default locations of the config and data directories so that you do not delete important data later on.

home

Siren Investigate home folder, also referred to as $INVESTIGATE_HOME. This is the folder created by unpacking the archive; in demo distributions, the folder is siren-investigate.

bin

Binary scripts, including investigate to start the Siren Investigate server and investigate-plugin to install plugins. Located in $INVESTIGATE_HOME/bin.

config

Configuration files, including investigate.yml. Located in $INVESTIGATE_HOME/config.

data

The location of the data files written to disk by Siren Investigate and its plugins. Located in $INVESTIGATE_HOME/data.

optimize

Transpiled source code. Certain administrative actions, for example plugin install, result in the source code being retranspiled on the fly. Located in $INVESTIGATE_HOME/optimize.

plugins

The location of the plugin files. Each plugin is contained in a subfolder. Located in $INVESTIGATE_HOME/plugins.

Install Siren Investigate on Windows

Siren Investigate can be installed on Windows using the .zip package; zip packages can be downloaded from the download page. The demo versions contain a pre-configured Elasticsearch cluster in addition to Siren Investigate.

Running Siren Investigate from the command prompt

Siren Investigate can be started from the command prompt as follows:

.\bin\investigate.bat

By default, Siren Investigate runs in the foreground, prints its logs to the standard output (stdout), and can be stopped by pressing Ctrl-C.

Configuring Siren Investigate using the config file

Siren Investigate loads its configuration from the $INVESTIGATE_HOME/config/investigate.yml file by default. The format of this config file is explained in Configuring Siren Investigate.

Folder layout of Windows .zip archive

The .zip package is entirely self-contained.

This is very convenient because you do not have to create any directories to start using Siren Investigate, and uninstalling Siren Investigate is as easy as removing the folder. However, it is advisable to change the default locations of the config and data directories so that you do not delete important data later on.

home

Siren Investigate home folder, also referred to as %INVESTIGATE_HOME%. This is the folder created by unpacking the archive; in demo distributions, the folder is siren-investigate.

bin

Binary scripts, including investigate.bat to start the Siren Investigate server and investigate-plugin.bat to install plugins. Located in %INVESTIGATE_HOME%\bin.

config

Configuration files, including investigate.yml. Located in %INVESTIGATE_HOME%\config.

data

The location of the data files written to disk by Siren Investigate and its plugins. Located in %INVESTIGATE_HOME%\data.

optimize

Transpiled source code. Certain administrative actions, for example plugin install, result in the source code being retranspiled on the fly. Located in %INVESTIGATE_HOME%\optimize.

plugins

The location of the plugin files. Each plugin is contained in a subfolder. Located in %INVESTIGATE_HOME%\plugins.

Configuring Siren Investigate

The Siren Investigate server reads properties from the investigate.yml file on startup. The default settings configure Siren Investigate to run on localhost:5606. To change the host or port number, or connect to Elasticsearch running on a different machine, you must update your investigate.yml file. You can also enable SSL and set a variety of other options.

External datasource configuration is documented in the JDBC datasources and Legacy REST datasources chapters, while access control configuration is documented in the Search Guard Integration and Siren Investigate access control chapter.

Environment Variable Placeholders

It is possible to use environment variable placeholders in configuration settings. The syntax of placeholders is ${ENV_VARIABLE_NAME}.

For example, to set elasticsearch.url to the value of the environment variable ES_URL, edit config/investigate.yml as follows:

elasticsearch.url: ${ES_URL}
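
The variable can then be provided when starting the server; for example, on Linux (a sketch, reusing the ES_URL variable from the example above):

export ES_URL="http://elasticsearch.example.org:9220"
./bin/investigate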
Configuration Settings

server.port:

Default: 5606 - Siren Investigate is served by a back end server. This setting specifies the port to use.

server.host:

Default: "localhost" - This setting specifies the host of the back end server.

server.basePath:

Enables you to specify a path to mount Siren Investigate at if you are running behind a proxy. This only affects the URLs generated by Siren Investigate; your proxy is expected to remove the basePath value before forwarding requests to Siren Investigate. This setting cannot end in a slash (/).

server.maxPayloadBytes:

Default: 1048576 - The maximum payload size in bytes for incoming server requests.

server.name:

Default: "your-hostname" - A human-readable display name that identifies this Siren Investigate instance.

server.defaultRoute:

Default: "/app/kibana" - This setting specifies the default route when opening Siren Investigate. You can use this setting to modify the landing page when opening Siren Investigate.

elasticsearch.url:

Default: "http://localhost:9220" - The URL of the Elasticsearch instance to use for all your queries.

elasticsearch.preserveHost:

Default: true - When this setting’s value is true Siren Investigate uses the hostname specified in the server.host setting. When the value of this setting is false, Siren Investigate uses the hostname of the host that connects to this Siren Investigate instance.

kibana.index:

Default: ".siren" - Siren Investigate uses an index in Elasticsearch to store saved searches, visualizations and dashboards. Siren Investigate creates a new index if the index does not already exist.

kibana.defaultAppId:

Default: "discover" - The default application to load.

tilemap.url:

The URL to the tile service that Siren Investigate uses to display map tiles in tilemap visualizations. By default, Siren Investigate reads this URL from an external metadata service, but users can still override this parameter to use their own Tile Map Service.

tilemap.options.minZoom:

Default: 1 - The minimum zoom level.

tilemap.options.maxZoom:

Default: 10 - The maximum zoom level.

tilemap.options.attribution:

Default: "© [OpenStreetMap]("http://www.openstreetmap.org/copyright")" - The map attribution string.

tilemap.options.subdomains:

An array of subdomains used by the tile service. Specify the position of the subdomain in the URL with the token {s}.
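For example, a tilemap configuration using the subdomain token might look like the following (the tile service URL is hypothetical):

tilemap.url: "https://{s}.tiles.example.org/{z}/{x}/{y}.png"
tilemap.options.subdomains: ["a", "b", "c"]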

regionmap

Specifies additional vector layers for use in Region Map visualizations. Each layer object points to an external vector file that contains a geojson FeatureCollection. The file must use the WGS84 coordinate reference system and only include polygons. If the file is hosted on a separate domain from Siren Investigate, the server needs to be CORS-enabled so Siren Investigate can download the file. The following example shows a valid regionmap configuration.

regionmap:
  layers:
     - name: "Departments of France"
       url: "http://my.cors.enabled.server.org/france_departements.geojson"
       attribution: "INRAP"
       fields:
          - name: "department"
            description: "Full department name"
          - name: "INSEE"
            description: "INSEE numeric identifier"
name:

Mandatory. A description of the map being provided.

url:

Mandatory. The location of the geojson file as provided by a webserver.

attribution:

Optional. References the originating source of the geojson file.

fields:

Mandatory. Each layer can contain multiple fields to indicate what properties from the geojson features you want to expose. This example shows how to define multiple properties.

fields.name:

Mandatory. This value is used to do an inner join between the documents stored in Elasticsearch and the geojson file. For example, if the field in the geojson is called Location and holds city names, there must be a field in Elasticsearch that holds the same values, which Siren Investigate can then use to look up the geoshape data.

fields.description:

Mandatory. The human readable text that is shown under the Options tab when building the Region Map visualization.

elasticsearch.username: and elasticsearch.password:

If your Elasticsearch is protected with basic authentication, these settings provide the username and password that the Siren Investigate server uses to perform maintenance on the Siren Investigate index at startup. Your Siren Investigate users still need to authenticate with Elasticsearch, which is proxied through the Siren Investigate server.
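A sketch with placeholder credentials:

elasticsearch.username: "investigate_server"
elasticsearch.password: "changeme"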

server.ssl.enabled

Default: "false" Enables SSL for outgoing requests from the Siren Investigate server to the browser. When set to true, server.ssl.certificate and server.ssl.key are required

server.ssl.certificate: and server.ssl.key:

Paths to the PEM-format SSL certificate and SSL key files, respectively.
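For example, to serve Siren Investigate over SSL (a sketch; the paths are hypothetical):

server.ssl.enabled: true
server.ssl.certificate: "/etc/siren/investigate.crt"
server.ssl.key: "/etc/siren/investigate.key"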

server.ssl.keyPassphrase:

The passphrase that will be used to decrypt the private key. This value is optional as the key may not be encrypted.

server.ssl.certificateAuthorities:

List of paths to PEM encoded certificate files that should be trusted.

server.ssl.supportedProtocols:

Default: TLSv1, TLSv1.1, TLSv1.2 - Supported protocols with versions. Valid protocols: TLSv1, TLSv1.1, TLSv1.2.

server.ssl.cipherSuites:

Default: ECDHE-RSA-AES128-GCM-SHA256, ECDHE-ECDSA-AES128-GCM-SHA256, ECDHE-RSA-AES256-GCM-SHA384, ECDHE-ECDSA-AES256-GCM-SHA384, DHE-RSA-AES128-GCM-SHA256, ECDHE-RSA-AES128-SHA256, DHE-RSA-AES128-SHA256, ECDHE-RSA-AES256-SHA384, DHE-RSA-AES256-SHA384, ECDHE-RSA-AES256-SHA256, DHE-RSA-AES256-SHA256, HIGH, !aNULL, !eNULL, !EXPORT, !DES, !RC4, !MD5, !PSK, !SRP, !CAMELLIA - Details on the format and the valid options are available in the [OpenSSL cipher list format documentation](https://www.openssl.org/docs/man1.0.2/apps/ciphers.html#CIPHER-LIST-FORMAT).

elasticsearch.ssl.certificate: and elasticsearch.ssl.key:

Optional settings that provide the paths to the PEM-format SSL certificate and key files. These files are used to verify the identity of Siren Investigate to Elasticsearch and are required when xpack.ssl.verification_mode in Elasticsearch is set to either certificate or full.

elasticsearch.ssl.keyPassphrase:

The passphrase that will be used to decrypt the private key. This value is optional as the key may not be encrypted.

elasticsearch.ssl.certificateAuthorities:

Optional setting that enables you to specify a list of paths to the PEM file for the certificate authority for your Elasticsearch instance.

elasticsearch.ssl.verificationMode:

Default: full - Controls the verification of certificates presented by Elasticsearch. Valid values are none, certificate, and full. full performs hostname verification, and certificate does not.

elasticsearch.pingTimeout:

Default: the value of the elasticsearch.requestTimeout setting - Time in milliseconds to wait for Elasticsearch to respond to pings.

elasticsearch.requestTimeout:

Default: 30000 - Time in milliseconds to wait for responses from the back end or Elasticsearch. This value must be a positive integer.

elasticsearch.requestHeadersWhitelist:

Default: [ 'authorization' ] - List of Siren Investigate client-side headers to send to Elasticsearch. To send no client-side headers, set this value to [] (an empty list).

elasticsearch.customHeaders:

Default: {} - Header names and values to send to Elasticsearch. Any custom headers cannot be overwritten by client-side headers, regardless of the elasticsearch.requestHeadersWhitelist configuration.

elasticsearch.shardTimeout:

Default: 0 - Time in milliseconds for Elasticsearch to wait for responses from shards. Set to 0 to disable.

elasticsearch.startupTimeout:

Default: 5000 - Time in milliseconds to wait for Elasticsearch at Siren Investigate startup before retrying.

pid.file:

Specifies the path where Siren Investigate creates the process ID file.

path.data:

Default: ./data - The path where Siren Investigate stores persistent data not saved in Elasticsearch.

logging.dest:

Default: stdout - Enables you to specify a file where Siren Investigate stores log output.

logging.silent:

Default: false - Set the value of this setting to true to suppress all logging output.

logging.quiet:

Default: false - Set the value of this setting to true to suppress all logging output other than error messages.

logging.verbose:

Default: false - Set the value of this setting to true to log all events, including system usage information and all requests.

ops.interval:

Default: 5000 - Set the interval in milliseconds to sample system and process performance metrics. The minimum value is 100.

status.allowAnonymous:

Default: false - If authentication is enabled, setting this to true allows unauthenticated users to access the Siren Investigate server status API and status page.

cpu.cgroup.path.override:

Override for the cgroup cpu path when mounted in a manner that is inconsistent with /proc/self/cgroup.

cpuacct.cgroup.path.override:

Override for the cgroup cpuacct path when mounted in a manner that is inconsistent with /proc/self/cgroup.

console.enabled

Default: true - Set to false to disable Console. Toggling this will cause the server to regenerate assets on the next startup, which may cause a delay before pages start being served.

elasticsearch.tribe.url:

Optional URL of the Elasticsearch tribe instance to use for all your queries.

elasticsearch.tribe.username: and elasticsearch.tribe.password:

If your Elasticsearch is protected with basic authentication, these settings provide the username and password that the Siren Investigate server uses to perform maintenance on the Siren Investigate index at startup. Your Siren Investigate users still need to authenticate with Elasticsearch, which is proxied through the Siren Investigate server.

elasticsearch.tribe.ssl.cert: and elasticsearch.tribe.ssl.key:

Optional settings that provide the paths to the PEM-format SSL certificate and key files. These files validate that your Elasticsearch backend uses the same key files.

elasticsearch.tribe.ssl.keyPassphrase:

The passphrase that will be used to decrypt the private key. This value is optional as the key may not be encrypted.

elasticsearch.tribe.ssl.certificateAuthorities:

Optional setting that enables you to specify a path to the PEM file for the certificate authority for your tribe Elasticsearch instance.

elasticsearch.tribe.ssl.verificationMode:

Default: full - Controls the verification of certificates. Valid values are none, certificate, and full. full performs hostname verification, and certificate does not.

elasticsearch.tribe.pingTimeout:

Default: the value of the elasticsearch.tribe.requestTimeout setting - Time in milliseconds to wait for Elasticsearch to respond to pings.

elasticsearch.tribe.requestTimeout:

Default: 30000 - Time in milliseconds to wait for responses from the back end or Elasticsearch. This value must be a positive integer.

elasticsearch.tribe.requestHeadersWhitelist:

Default: [ 'authorization' ] - List of Siren Investigate client-side headers to send to Elasticsearch. To send no client-side headers, set this value to [] (an empty list).

elasticsearch.tribe.customHeaders:

Default: {} - Header names and values to send to Elasticsearch. Any custom headers cannot be overwritten by client-side headers, regardless of the elasticsearch.tribe.requestHeadersWhitelist configuration.

investigate_core.default_dashboard_title

Default: not set - The dashboard that is displayed when clicking the Dashboard tab for the first time. This property is deprecated starting from Siren Investigate 4.6.4-4; it was moved to advanced_settings (Setting Advanced Options).

Accessing Siren Investigate

Siren Investigate is a web application that you access through port 5606. All you need to do is point your web browser at the machine where Siren Investigate is running and specify the port number. For example, http://localhost:5606 or http://YOURDOMAIN.com:5606.

When you access Siren Investigate, the Discover page loads by default with the default index pattern selected. The time filter is set to the last 15 minutes and the search query is set to match-all (*).

If you do not see any documents, try setting the time filter to a wider time range. If you still do not see any results, it is possible that you do not have any documents.

Checking Siren Investigate status

You can reach the Siren Investigate server’s status page by navigating to http://localhost:5606/status. The status page displays information about the server’s resource usage and lists the installed plugins.

kibi status page

Collecting Elasticsearch diagnostics

The Elasticsearch diagnostics button generates a single file by collecting different metrics about your Elasticsearch cluster. All collected information is saved to a local file and never transferred over a network. You can see a full list of Elasticsearch API calls by clicking the more info icon next to the button.

kibi status page diagnostics help
For JSON-formatted server status details, use the API endpoint at localhost:5606/api/status.
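For example, with curl:

curl http://localhost:5606/api/status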

Connect Siren Investigate to Elasticsearch

Before you can start using Siren Investigate, you need to tell it which Elasticsearch indices you want to explore. The first time you access Siren Investigate, you are prompted to define an index pattern that matches the name of one or more of your indices. That’s it. That’s all you need to configure to start using Siren Investigate. You can add index patterns at any time from the Management tab.

By default, Siren Investigate connects to the Elasticsearch instance running on localhost. To connect to a different Elasticsearch instance, modify the Elasticsearch URL in the investigate.yml configuration file and restart Siren Investigate. For information about using Siren Investigate with your production nodes, see Using Siren Investigate in a production environment.
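For example, in config/investigate.yml (the hostname is hypothetical):

elasticsearch.url: "http://elasticsearch.example.org:9220"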

To configure the Elasticsearch indices you want to access with Siren Investigate:

  1. Point your browser at port 5606 to access the Siren Investigate UI. For example, localhost:5606 or http://YOURDOMAIN.com:5606.

    Siren Investigate start page

  2. Specify an index pattern that matches the name of one or more of your Elasticsearch indices. You may have to access the index patterns management in the Management tab. By default, Siren Investigate guesses that you are working with data being fed into Elasticsearch by Logstash. If that’s the case, you can use the default logstash-* as your index pattern. The asterisk (*) matches zero or more characters in an index’s name. If your Elasticsearch indices follow some other naming convention, enter an appropriate pattern. The "pattern" can also simply be the name of a single index.

  3. Select the index field that contains the timestamp that you want to use to perform time-based comparisons. Siren Investigate reads the index mapping to list all of the fields that contain a timestamp. If your index does not have time-based data, disable the Index contains time-based events option.

  4. Click Create to add the index pattern. This first pattern is automatically configured as the default. When you have more than one index pattern, you can designate which one to use as the default by clicking on the star icon above the index pattern title from Management > Index Patterns.

All done. Siren Investigate is now connected to your Elasticsearch data. Siren Investigate displays a read-only list of fields configured for the matching index.

Siren Investigate relies on dynamic mapping to use fields in visualizations and manage the .siren index. If you have disabled dynamic mapping, you need to manually provide mappings for the fields that Siren Investigate uses to create visualizations. For more information, see Siren Investigate and Elasticsearch Dynamic Mapping.

Start exploring your data

You are ready to dive in to your data:

  • Search and browse your data interactively from the Discover page.

  • Chart and map your data from the Visualize page.

  • Create and view custom dashboards from the Dashboard page.

For a step-by-step introduction to these core Siren Investigate concepts, see the Getting Started tutorial.

Siren Investigate and Elasticsearch dynamic mapping

By default, Elasticsearch enables dynamic mapping for fields. Siren Investigate needs dynamic mapping to use fields in visualizations correctly, as well as to manage the .siren index where saved searches, visualizations, and dashboards are stored.

If your Elasticsearch use case requires you to disable dynamic mapping, you need to manually provide mappings for fields that Siren Investigate uses to create visualizations. You also need to manually enable dynamic mapping for the .siren index.

The following procedure assumes that the .siren index does not already exist in Elasticsearch and that the index.mapper.dynamic setting in elasticsearch.yml is set to false:

  1. Start Elasticsearch.

  2. Create the .siren index with dynamic mapping enabled just for that index:

    PUT .siren
    {
      "index.mapper.dynamic": true
    }
  3. Start Siren Investigate, navigate to the web UI, and verify that there are no error messages related to dynamic mapping.
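The index creation in step 2 can also be performed with curl against the default Elasticsearch port used in this guide (a sketch; add credentials if your cluster is secured):

curl -XPUT 'http://localhost:9220/.siren' -H 'Content-Type: application/json' -d '{
  "index.mapper.dynamic": true
}'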

Using Siren Investigate with tribe nodes

While tribe nodes have been deprecated in Elasticsearch in favor of Cross cluster search, you can still use Siren Investigate with tribe nodes until Elasticsearch version 7.0. Unlike tribe nodes, using cross cluster search in Siren Investigate requires no server-side configurations and does not disable functionality like Console.

Siren Investigate can be configured to connect to a tribe node for data retrieval. Because tribe nodes cannot create indices, Siren Investigate additionally requires a separate connection to a node to maintain state. When configured, searches and visualizations will retrieve data using the tribe node and administrative actions (such as saving a dashboard) will be sent to the non-tribe node.

Configuring Siren Investigate for tribe nodes

Tribe nodes take all of the same configuration options used when configuring Elasticsearch in investigate.yml. Tribe options are prefixed with elasticsearch.tribe and at a minimum require a URL:

elasticsearch.url: "<your_administration_node>"
elasticsearch.tribe.url: "<your_tribe_node>"

When configured to use a tribe node, actions that modify Siren Investigate’s state will be sent to the node at elasticsearch.url. Searches and visualizations will retrieve data from the node at elasticsearch.tribe.url. It’s acceptable to use a node for elasticsearch.url that is part of one of the clusters that a tribe node is pointing to.

The full list of configurations can be found in Configuring Siren Investigate.

Limitations

Due to the ambiguity of which cluster is being used, certain features are disabled in Siren Investigate:

  • Console

  • Managing users and roles with the x-pack plugin

Using Siren Investigate in a production environment

How you deploy Siren Investigate largely depends on your use case. If you are the only user, you can run it on your local machine and configure it to point to whatever Elasticsearch instance you want to interact with. Conversely, if you have a large number of heavy users, you may need to load balance across multiple instances that are all connected to the same Elasticsearch cluster.

While Siren Investigate is not terribly resource intensive, we still recommend running Siren Investigate separate from your Elasticsearch data or master nodes.

Make sure the configuration file is readable only by the user running the Siren Investigate process, as it contains encryption keys that protect configuration settings stored in Elasticsearch. If you are connecting Siren Investigate to external datasources, we also recommend using credentials with no write privileges, as write access is not required by the system.

Access control

Siren Investigate is compatible with Elastic x-pack and Search Guard to enable index and document level access control.

For more information about access control features, see the Access Control section.

Load balancing across multiple Elasticsearch nodes

If you have multiple nodes in your Elasticsearch cluster, the easiest way to distribute Siren Investigate requests across the nodes is to run an Elasticsearch client node on the same machine as Siren Investigate. Elasticsearch client nodes are essentially smart load balancers that are part of the cluster. They process incoming HTTP requests, redirect operations to the other nodes in the cluster as needed, and gather and return the results. For more information, see Node in the Elasticsearch reference.

To use a local client node to load balance Siren Investigate requests:

  1. Install Elasticsearch on the same machine as Siren Investigate.

  2. Configure the node as a client node. In elasticsearch.yml, set node.master, node.data and node.ingest to false:

    # 3. You want this node to be neither master nor data node, but
    #    to act as a "search load balancer" (fetching data from nodes,
    #    aggregating results, and so on)
    #
    node.master: false
    node.data: false
    node.ingest: false
  3. Configure the client node to join your Elasticsearch cluster. In elasticsearch.yml, set the cluster.name to the name of your cluster.

    cluster.name: "my_cluster"
  4. Make sure Siren Investigate is configured to point to your local client node. In investigate.yml, the elasticsearch.url should be set to localhost:9220.

    # The Elasticsearch instance to use for all your queries.
    elasticsearch.url: "http://localhost:9220"

Upgrading from a previous version

Before upgrading ensure to check the Breaking changes section; for information on how to upgrade from Kibi 5.x, ensure to read the Migrating from Kibi 5.x to Siren Investigate 10.x section.

An existing Siren Investigate installation can be upgraded as follows:

  • Back up the .siren index.

  • Back up the Siren Investigate configuration file (config/investigate.yml).

  • Back up the .sirenaccess index if ACL (Access Control Layer) is enabled.

  • Upgrade Elasticsearch.

  • Before restarting each Elasticsearch node, ensure to install a compatible version of the Siren Federate plugin and the access control plugins if required.

  • Download and extract the new Siren Investigate version.

  • Copy the previous configuration file to the config folder of the new installation.

  • Copy the files from the data folder of your old installation to the data folder of the new installation.

  • Check for breaking changes to the configuration.

  • Install compatible versions of any third-party Siren Investigate/Kibana plugins that you may need in addition to the bundled ones.

  • Execute the upgrade command.

Backing up and restoring the Siren Investigate indices

Before upgrading you should have a backup of the .siren index; the recommended way to perform regular backups of Elasticsearch indices is through the snapshot and restore modules.

Siren Investigate ships with a command line interface for creating dumps of the .siren index and, in case the ACL is enabled, the .sirenaccess index as well. An index dump is composed of two parts: its mappings and its data.

Backup

The backup command requires a running Elasticsearch instance and the path to a folder where the dumps will be written to.

You can learn more about its options by executing the following:

$ ./bin/investigate backup --help

For example, the following command dumps the .siren index into <MY_FOLDER>, along with the .sirenaccess index if the option investigate_access_control.acl.enabled is true in investigate.yml:

$ ./bin/investigate backup --backup-dir <MY_FOLDER>

Restore

The restore command requires a running Elasticsearch instance and the path to a folder where the dumps were written to by the previous backup command.

You can learn more about its options by executing the following:

$ ./bin/investigate restore --help

For example, you can restore the previously saved indices by executing the command and pointing it at the dump folder; the .sirenaccess index is restored as well if the option investigate_access_control.acl.enabled is true in investigate.yml:

$ ./bin/investigate restore --backup-dir <MY_FOLDER>

An upgrade of the Siren Investigate indices is also executed after a successful restore.

Upgrading the .siren index

To upgrade the objects in the .siren index (dashboards, visualizations, and so on), move to the folder in which Siren Investigate is installed and execute the following command:

bin/investigate upgrade

The command will look for out-of-date objects and upgrade them, for example:

$ bin/investigate upgrade
  log   [17:58:33.494] [info][status][plugin:elasticsearch] Status changed from uninitialized to yellow - Waiting for Elasticsearch
  log   [17:58:36.127] [info][migrations] Executing migration "Upgrade scripts from version 1 to version 2"
  log   [17:58:36.141] [info][migrations] Executed migration "Upgrade scripts from version 1 to version 2"
  log   [17:58:36.142] [info][migrations] Executing migration "Upgrade graph browser visualization to version 2."
  log   [17:58:36.157] [info][migrations] Executed migration "Upgrade graph browser visualization to version 2."
  log   [17:58:36.158] [info][migrations] Executing migration "Upgrade saved queries from version 1 to version 2"
  log   [17:58:36.242] [info][migrations] Executed migration "Upgrade saved queries from version 1 to version 2"
  log   [17:58:36.242] [info][migrations] Executing migration "Upgrade saved templates from version 1 to version 2"
  log   [17:58:36.303] [info][migrations] Executed migration "Upgrade saved templates from version 1 to version 2"
  log   [17:58:36.303] [info][migrations] Executing migration "Upgrade saved queries definitions in external query terms aggregation, enhanced search results and query viewer."
  log   [17:58:36.400] [info][migrations] Executed migration "Upgrade saved queries definitions in external query terms aggregation, enhanced search results and query viewer."
Upgraded 20 objects.

It is possible to run the command multiple times; however, running the command at the same time from multiple machines is not supported.

The upgrade command runs an automatic backup of the Siren indices (.siren, .sirenaccess) and restores them (after deleting the existing index) in the event of a problem in the upgrade process, ensuring the system is not left in an unusable state.
In the event of a successful upgrade, the backup is deleted; if there is an issue, the backed up indices are stored in the backup folder (defaults to the /data folder).

You can specify a number of flags to control the backup/restore process.

  • --dont-backup: Runs the upgrade process without creating a backup of the indices.

  • --keep-backup: Does not destroy the backed up indices when the upgrade is successful.

  • -y: Accepts all prompts, for example backing up the indices and deleting the existing indices before restoring.

  • --backup-dir: Custom backup folder path to store the index backup.

  • --config: Path to the config file.
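For example, combining several of these flags (the backup path is hypothetical):

bin/investigate upgrade --keep-backup --backup-dir /var/backups/siren -y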

Breaking changes

This section discusses the changes that you need to be aware of when migrating your application from one version of Siren Investigate to another.

Migrating from Kibi 5.x to Siren Investigate 10.x

Before starting the upgrade procedure, ensure that Kibi is at version 5.4.3-5 or later.

Pre-requisites

Elasticsearch must be at least at version 5.6.9; if the cluster is secured by Search Guard, it will be necessary to alter the security configuration, so it is recommended to back it up as explained in the next section.

For each node in the cluster, you will need to:

  • Remove the siren-vanguard plugin if installed by running elasticsearch-plugin remove siren-vanguard.

  • Install the version of the siren-federate plugin compatible with your Elasticsearch version.

If you are upgrading Elasticsearch and using Search Guard, ensure to upgrade the plugin as well; to find out the version of Search Guard 5 compatible with your Elasticsearch installation, check the Search Guard version matrix.

Search Guard can be installed from Maven Central using elasticsearch-plugin, e.g.:

/usr/share/elasticsearch/bin/elasticsearch-plugin install -b com.floragunn:search-guard-5:<version>

In addition, the compatible version of any commercial Search Guard add-on will have to be downloaded and copied to the plugins/search-guard-5 directory; the following list provides links to the download page of all the add-ons commonly used in Siren Investigate instances:

Search Guard configuration backup

If the cluster is secured by Search Guard, you will need to retrieve the current security configuration through the sgadmin tool, modify it as instructed and reload it.

sgadmin.sh is available in the plugins/search-guard-5/tools directory on each Elasticsearch instance in which Search Guard has been installed; a standalone version (sgadmin-standalone.zip) can be downloaded from this page.

The current configuration can be retrieved by creating a backup directory and running the tool as follows:

mkdir configbak
bash /path/to/sgadmin.sh -r -ks CN\=sgadmin-keystore.jks -cd configbak -kspass password -ts truststore.jks -tspass password -icl -nhnv -h elasticsearch.local -p 9300

Arguments:

  • r: retrieve configuration.

  • cd: the directory in which the current configuration will be saved.

  • ks: the path to the administrative keystore.

  • kspass: the password of the administrative keystore.

  • ts: the path to the truststore.

  • tspass: the password of the truststore.

  • icl: ignore cluster name.

  • nhnv: do not verify cluster hostname.

  • h: Elasticsearch hostname.

  • p: Elasticsearch transport port.

If execution is successful, you will find the five Search Guard configuration files in the configbak directory suffixed by the current date.

Before proceeding, ensure that the files are not empty and remove the date suffix; you should end up with the following listing:

  • sg_action_groups.yml

  • sg_config.yml

  • sg_internal_users.yml

  • sg_roles_mapping.yml

  • sg_roles.yml

Search Guard configuration changes

This section describes all the Siren Investigate specific changes required to the Search Guard configuration files that were retrieved in the previous section.

After the files have been updated, the updated configuration can be loaded to the cluster using sgadmin as explained at the end of the section.

sg_action_groups.yml

Add the following action groups if missing:

UNLIMITED:
  - "*"

INDICES_ALL:
  - "indices:*"

Set the ALL action group as follows:

ALL:
  - INDICES_ALL

Modify the CREATE_INDEX action group as follows:

CREATE_INDEX:
  - "indices:admin/create"
  - "indices:admin/mapping/put"

Add the MONITOR and INDICES_MONITOR action groups:

MONITOR:
  - INDICES_MONITOR

INDICES_MONITOR:
  - "indices:monitor/*"

Update the DATA_ACCESS, READ, WRITE, DELETE, CRUD and SEARCH groups as follows:

DATA_ACCESS:
  - "indices:data/*"
  - CRUD

WRITE:
  - "indices:data/write*"
  - "indices:admin/mapping/put"

READ:
  - "indices:data/read*"
  - "indices:admin/mappings/fields/get*"

DELETE:
  - "indices:data/write/delete*"

CRUD:
  - READ
  - WRITE

SEARCH:
  - "indices:data/read/search*"
  - "indices:data/read/msearch*"
  - "indices:siren/plan*"
  - "indices:siren/mplan*"
  - SUGGEST

Update the INDEX group as follows:

INDEX:
  - "indices:data/write/index*"
  - "indices:data/write/update*"
  - "indices:admin/mapping/put"
  - "indices:data/write/bulk*"

Add or update the CLUSTER_COMPOSITE_OPS_RO and CLUSTER_COMPOSITE_OPS action groups as follows:

CLUSTER_COMPOSITE_OPS_RO:
  - "indices:data/read/mget"
  - "indices:data/read/msearch"
  - "indices:siren/mplan"
  - "indices:data/read/mtv"
  - "indices:admin/aliases/exists*"
  - "indices:admin/aliases/get*"

CLUSTER_COMPOSITE_OPS:
  - "indices:data/write/bulk*"
  - "indices:admin/aliases*"
  - CLUSTER_COMPOSITE_OPS_RO

Remove the KIBI_MSEARCH action group if present.

Add or update the SIREN_COMPOSITE action group with the following definition:

SIREN_COMPOSITE:
  - "indices:siren/mplan*"

If present, replace the KIBI_COMPOSITE action group with the following definition for backwards compatibility.

KIBI_COMPOSITE:
  - SIREN_COMPOSITE

Add the SIREN_CLUSTER, SIREN_READONLY and SIREN_READWRITE action groups with the following definitions:

SIREN_CLUSTER:
  - "cluster:data/read/lock/create"
  - "cluster:siren/internal*"
  - "cluster:admin/plugin/siren/license/get"
  - "indices:data/read/scroll*"
  - "indices:data/read/msearch*"
  - "indices:siren/mplan*"
  - CLUSTER_COMPOSITE_OPS_RO

SIREN_READONLY:
  - "indices:data/read/field_stats*"
  - "indices:data/read/field_caps*"
  - "indices:data/read/get*"
  - "indices:data/read/mget*"
  - "indices:data/read/search*"
  - "indices:siren/mplan*"
  - "indices:siren/plan*"
  - "indices:admin/get*"
  - "indices:admin/mappings/get*"
  - "indices:admin/mappings/fields/get*"
  - "indices:admin/validate/query*"
  - "indices:admin/version/get*"
  - "indices:data/siren/connector/*"
  - SIREN_COMPOSITE

SIREN_READWRITE:
  - "indices:admin/exists*"
  - "indices:admin/mapping/put*"
  - "indices:admin/refresh*"
  - "indices:data/write/delete*"
  - "indices:data/write/index*"
  - "indices:data/write/update*"
  - "indices:data/write/bulk*"
  - SIREN_READONLY

If present, replace KIBI_CLUSTER, KIBI_READONLY and KIBI_READWRITE with the following definitions for backwards compatibility.

KIBI_CLUSTER:
  - SIREN_CLUSTER

KIBI_READONLY:
  - SIREN_READONLY

KIBI_READWRITE:
  - SIREN_READWRITE

sg_roles.yml

kibiserver

Replace the kibiserver role with the following; ensure to write the correct names for your configuration indices in place of ?siren and ?sirenaccess.

kibiserver:
  cluster:
      - cluster:monitor/nodes/info
      - cluster:monitor/health
      - cluster:monitor/main
      - cluster:monitor/state
      - cluster:monitor/nodes/stats
      - SIREN_CLUSTER
      - CLUSTER_COMPOSITE_OPS
  indices:
    '*':
      '*':
        - indices:admin/get
    '?siren':
      '*':
        - ALL
    '?sirenaccess':
      '*':
        - ALL

Siren Investigate user roles

In each existing user role with access to Siren Investigate:

  • Replace KIBI_MSEARCH with SIREN_COMPOSITE.

  • Replace KIBI_CLUSTER with SIREN_CLUSTER.

  • Replace KIBI_COMPOSITE with SIREN_COMPOSITE.

  • Remove the following permission on the main Siren Investigate index (which was named .kibi in previous versions) from each user role that has it:

    ?kibi:
      null:
        - "indices:data/read/search"
        - "indices:data/read/coordinate-search"

If you are using the Siren Investigate access control plugin and want to enable access control rules on saved objects, ensure that Siren Investigate users do not have access to the Siren Investigate index (which was set to .kibi in previous releases), otherwise they would be able to peek at saved objects by issuing custom Elasticsearch queries.

Access to the Siren Investigate index was usually granted by the following lines in previous releases; these must be removed to use access control rules effectively as the configuration index will be managed exclusively by the kibiserver role:

'?kibi':
  '*':
    - KIBI_READWRITE
'?kibi':
  '*':
    - KIBI_READONLY

If there are users with read access to all indices (*), they will be able to search the Siren Investigate index as well; if you want to inhibit access to it, you should replace the wildcard with an explicit list of indices or a more restrictive index pattern.
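For example, instead of granting read access on *, a user role can enumerate its data indices with a more restrictive pattern (a sketch; the role name and index pattern are hypothetical):

sirenuser:
  indices:
    'data-*':
      '*':
        - SIREN_READONLY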

License management

To grant a role the permission to upload a Siren Investigate license from the UI, ensure that the user has the cluster:admin/plugin/siren/license/* permission, for example:

sirenadmin:
  cluster:
    - SIREN_CLUSTER
    - 'cluster:admin/siren/connector/*'
    - 'cluster:admin/plugin/siren/license/*'
  ...

Monitoring

To grant a user the permission to use monitoring plugins and retrieve diagnostics information from the UI, you will need to add the CLUSTER_MONITOR and INDICES_MONITOR permissions to their role, for example:

# Permissions for an administrator
sirenadmin:
  cluster:
    ...
  indices:
    '*':
      '*':
        - KIBI_READONLY
        - INDICES_MONITOR
    '?kibi':
      '*':
        - KIBI_READWRITE
    'watcher':
      '*':
        - KIBI_READWRITE

JDBC datasources management

For information on how to set up Search Guard for JDBC datasources support, see the JDBC datasources section after the upgrade.

Marvel / X-Pack monitoring

If you were using Marvel and are migrating to X-Pack monitoring, the following role can be used in place of the previous sample marvel role:

# Permissions for an X-Pack monitoring agent.
monitoring:
  cluster:
    - CLUSTER_MONITOR
    - 'indices:admin/aliases'
    - 'indices:admin/template/get'
    - 'indices:admin/template/put'
    - 'cluster:admin/ingest/pipeline/get'
    - 'cluster:admin/ingest/pipeline/put'
    - 'indices:data/write/bulk'
  indices:
    '?marvel*':
      '*':
        - ALL
    '?monitoring*':
      '*':
        - ALL

Loading the updated configuration

To load the modified configuration run sgadmin as follows:

$ bash /path/to/sgadmin.sh -ks CN\=sgadmin-keystore.jks -cd confignew -kspass password -ts truststore.jks -tspass password -icl -nhnv -h elasticsearch.local -p 9300
Search Guard Admin v5
Will connect to elasticsearch.local:9300 ... done
Contacting elasticsearch cluster 'elasticsearch' and wait for YELLOW clusterstate ...
Clustername: escluster
Clusterstate: YELLOW
Number of nodes: 1
Number of data nodes: 1
searchguard index already exists, so we do not need to create one.
Will update 'config' with confignew/sg_config.yml
   SUCC: Configuration for 'config' created or updated
Will update 'roles' with confignew/sg_roles.yml
   SUCC: Configuration for 'roles' created or updated
Will update 'rolesmapping' with confignew/sg_roles_mapping.yml
   SUCC: Configuration for 'rolesmapping' created or updated
Will update 'internalusers' with confignew/sg_internal_users.yml
   SUCC: Configuration for 'internalusers' created or updated
Will update 'actiongroups' with confignew/sg_action_groups.yml
   SUCC: Configuration for 'actiongroups' created or updated
Done with success

Arguments:

  • cd: the directory containing the modified Search Guard configuration.

  • ks: the path to the administrative keystore.

  • kspass: the password of the administrative keystore.

  • ts: the path to the truststore.

  • tspass: the password of the truststore.

  • icl: ignore cluster name.

  • nhnv: do not verify cluster hostname.

  • h: Elasticsearch hostname.

  • p: Elasticsearch transport port.

Investigate upgrade

  1. Extract Siren Investigate 10.x to a new folder.

  2. Copy the config/kibi.yml file from the old installation to the config folder of the new installation.

  3. Copy the data directory from your old installation to the new installation.

  4. Copy the pki directory from your old installation to the new installation.

  5. Update the Search Guard configuration as described in the previous sections.

  6. Install the upgraded versions of third party plugins if applicable.

  7. Run the upgrade scripts:

    1. bin/investigate upgrade-config;

    2. bin/investigate upgrade.

  8. Start the new installation and ensure it is working as expected.

Certificates and keystores

Ensure that any certificate files or keystores are accessible and correctly referenced from the new installation, particularly if relative paths are used.

License

You must upload your license after the cluster is up and running.

From Siren Investigate

To upload the license from Siren Investigate, navigate to the management section.

management no license

This page will show the license has not been installed.

Click License and select Upload license.

license page before install

After you have selected your license and it has been uploaded, the page will show the license details.

license page after install

Via CLI

To upload the license you can use curl as follows:

curl http://localhost:9220/_siren/license -XPUT -T license.sig

If your cluster is protected by Search Guard, remember to specify credentials:

curl -uadmin:password --cacert ca.pem https://localhost:9220/_siren/license -XPUT -T license.sig

Breaking changes in Kibi and Kibana 5.4.0

This section lists the changes in Kibi and Kibana that you need to be aware of when migrating to Kibi 5.4.0.

Kibana binds to localhost by default

Details: Kibana (like Elasticsearch) now binds to localhost for security purposes instead of 0.0.0.0 (all addresses). Previous binding to 0.0.0.0 also caused issues for Windows users.

Impact: If you are running Kibana inside a container or environment that does not allow localhost binding, Kibana will not start up unless server.host is configured in the kibana.yml to a valid IP address or hostname.

Markdown headers

Details: As part of addressing the security issue ESA-2016-03 (CVE-2016-1000220) in the Kibana product, the markdown version has been bumped.

Impact: As a result of the fix to ESA-2016-03, there is a slight change in the markdown format for headers.

Previously, headers were defined using # followed immediately by the title:

###Packetbeat:
   [Dashboard](/#/dashboard/Packetbeat-Dashboard)
   [Web transactions](/#/dashboard/HTTP)

It should now be defined as follows (with a space between # and the title):

### Packetbeat:
    [Dashboard](/#/dashboard/Packetbeat-Dashboard)
    [Web transactions](/#/dashboard/HTTP)

Only whitelisted client headers are sent to Elasticsearch

Details: The only headers that are proxied from the browser client to Elasticsearch are the ones set using the elasticsearch.requestHeadersWhitelist server configuration.

Impact: If you are relying on client headers in Elasticsearch, you must whitelist the specific headers in your kibana.yml.
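For example, in kibana.yml (the custom header name is hypothetical):

elasticsearch.requestHeadersWhitelist: [ authorization, x-custom-header ]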

server.defaultRoute is now always prefixed by server.basePath

Details: The base path configuration now precedes the default route configuration when accessing the default route.

Impact: If you were relying on both defaultRoute and basePath configurations, you must remove the hardcoded basePath from your defaultRoute.
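For example, with the following configuration (the values are hypothetical), the default route is now served at /investigate/app/kibana, so the basePath must not be repeated inside defaultRoute:

server.basePath: "/investigate"
server.defaultRoute: "/app/kibana"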

Folder listings of static assets are no longer rendered

Details: The server no longer renders a list of static files if you try to access a folder.

Impact: If you were relying on this behavior before, you must expose underlying folder listings using a reverse proxy instead.

Console logs display date/time in UTC

Details: All server logs now render in UTC rather than the server’s local time.

Impact: If you are parsing the timestamps of Kibana server logs in an automated way, ensure to update your automation to accommodate UTC values.

A column for Average no longer renders along with Standard Deviation

Details: From the early days of Kibana, adding a standard deviation metric to a data table also resulted in an average column being added to that data table. This is no longer the case.

Impact: If you want to have both standard deviation and average in the same data table, then add both columns just as you would any other metric.

Minimum size on terms aggregations has been changed from 0 to 1

Details: Elasticsearch has removed the ability to specify a size of 0 for terms aggregations, so Kibana’s minimum value has been adjusted to follow suit.

Impact: Any saved visualization that relies on size=0 must be updated.
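
For example, a terms aggregation that was configured with a size of 0 must now specify an explicit size; the field name and value below are illustrative:

{ "terms": { "field": "category_code", "size": 10 } }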

Saved objects with previously deprecated Elasticsearch features

Details: Since Kibana 4.3, users have been able to arbitrarily modify filters using a generic JSON editor. If users took advantage of any deprecated Elasticsearch features in this way, those saved objects will cause errors in Kibana, because the features have been removed in Elasticsearch 5.0. Check the Elasticsearch breaking changes documentation for more details.

Impact: Discover, Visualize, and Dashboard will error for any saved objects that are relying on removed Elasticsearch functionality. Users must update the JSON of any affected filters.

Getting started

The demo distribution

The quickest way to get started with Siren Investigate is to download the Siren Platform demo distribution from the downloads page.

The Siren Platform demo distribution includes a ready-to-use Elasticsearch cluster in the elasticsearch folder; the cluster contains five indices:

company

a collection of companies

article

a collection of articles about companies

investment

a collection of investments in companies

investor

a collection of investors

.siren

Siren Investigate configuration

The indices have been populated through Logstash from the SQLite database in siren-investigate/crunchbase.db, as described in the Loading data into Elasticsearch chapter.

The demo dataset has been built from a sample of articles gathered from tech blogs in 2013 and from data about companies, investments and investors in the CrunchBase 2013 Snapshot, which is copyright © 2013 CrunchBase Inc.

Users

You can access the demo as three different users:

  • sirenadmin: has read access to all data indices and can modify the Siren Investigate configuration.

  • sirenuser: has read access to all data indices but cannot modify the Siren Investigate configuration.

  • sirennoinvestor: has read access to all data indices except investor; the user has no access to the Investors dashboard.

The password for all users is password.

After starting Elasticsearch and Siren Investigate, as described in the Setup chapter, start your web browser and navigate to http://localhost:5606.

The leftmost sidebar is used to navigate between apps within Siren Investigate.

It can be expanded or contracted using the expand/contract button (Contract Button) at the bottom of the sidebar.

By default, Siren Investigate displays the Articles dashboard. Dashboards can be configured to display multiple visualizations on the documents stored in a specific index or returned by a saved search on an index pattern.

Each dashboard is represented by a tab containing the dashboard title and the number of documents available to visualizations.

The articles dashboard

You can quickly search specific articles through the search input in the navigation bar. For example, let’s find all the articles written about wireless or wifi:

The dashboard search bar

We can immediately see that there are 11595 articles about those topics and all the visualizations are refreshed to aggregate data for this subset of articles.

Besides simple text, queries in the search bar can be written using the Lucene query syntax or the Elasticsearch Query DSL.
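
For example, the wireless/wifi search can be expressed in the Query DSL as follows (a minimal sketch of an equivalent query):

{ "query_string": { "query": "wireless OR wifi" } }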

Filters

Visualizations can be used to create filters. For example, you can see all the articles about wireless or wifi published by TechCrunch by clicking the TechCrunch slice inside the Articles by Source pie chart visualization:

Clicking a pie slice
The dashboard with the filter on the slice applied
To disable a filter, just move the mouse over it and click the checkbox icon; you can disable all the filters applied to the dashboard by clicking Actions and then Disable; for more information about filtering, read the filters chapter.

Filters created by Relational Navigator

The Relational Navigator visualization enables you to create cross-dashboard filters. For example, by looking at the Companies button in the dashboard, you can see that there are 470 companies mentioned in the TechCrunch articles about wireless or wifi.

Relational filter from Articles to Companies

The relational filter created by clicking the button is displayed in the filter bar, and can be disabled or deleted just like any other filter. Moving the mouse over the filter will display the list of joined indices and their filters:

Relational filter in the filter bar

Relational filters can be accumulated. For example, if you click the Investment rounds → button, you will see data about the 351 investment rounds related to the subset of 470 companies mentioned in the TechCrunch articles about wireless or wifi.

For more information about the relational navigator, read the Relational Navigator chapter.

Query Based Aggregations

It is possible to get additional information about companies by using the results of queries on SQL databases (or any of the datasources supported by Siren Investigate) as aggregations on Elasticsearch documents.

For example, in the Query on Companies visualization you can see that 41 of the 96 companies have competitors and 8 of them are in the top 500 companies by number of employees:

SQL based aggregations

Companies "With competitors" and Top 500 companies (HR count) are queries on the SQLite database. The records returned by the queries are used to filter Elasticsearch documents, which can be then aggregated in a metric.

To better understand this feature, let’s have a look at the Top 500 companies (HR count) query. To see the query, click Saved Objects in the Management menu.

The query editor

The query returns the id, label and number_of_employees columns from the company table for the top 500 companies by number of employees:

select id, label, number_of_employees
from company
where number_of_employees>0
order by number_of_employees desc
limit 500

Click Dashboard, then Edit, and then Edit (fa pencil) in the heading of the Query on Companies visualization to customize its configuration. The metrics section defines the aggregations on Elasticsearch documents, displayed as columns in the table. The buckets section defines the groups of Elasticsearch documents aggregated by metrics, displayed as row headers in the table.

Query on Companies configuration

By expanding the Split Rows section inside buckets you can see how the queries are used to define groups of Elasticsearch documents. Scroll down to see the configuration of the fourth filter:

Configuration of an external query terms filter

The filter is configured to execute the query Top 500 companies (HR count) on the SQLite database and return the group of Elasticsearch documents from the current search whose id is equal to one of the IDs in the query results. The documents are then processed by the Count metric.

Let’s add a new aggregation to show the average number of employees. Click Add metrics inside the metrics section, then select Metric as the metric type; select Average as the aggregation and number_of_employees as the field, then click Apply Changes (fa play).

Save the visualization by clicking the Save button, then click the Dashboard tab to see the updated visualization in the Companies dashboard:

Average aggregation

Click Add sub-buckets at the bottom, then select Split Rows as the bucket type. Choose the Terms aggregation and the countrycode field from the drop-downs. Click Apply Changes (fa play) to add the new rows to the table.

Countrycode aggregation
Read the Create A Visualization chapter for an in-depth explanation of aggregations.

In addition to defining groups to aggregate, queries can be used as filters. To see this, click Dashboard; then, in the 'Query on Companies' dashboard tile, hover the mouse over the row for Top-500-companies-(HR-count) and click the + icon that appears.

Filter dashboard using a SQL query

Then you will see only the companies mentioned in the articles which are also in the top 500 by number of employees:

Filter dashboard using a SQL query result

Datasource Entity Selection

It is possible to select a company entity (record) in the SQLite database (and entities in external datasources in general) by clicking its label in the Companies Table.

The selected entity can be used as a parameter in queries; for example, click Browshot in Companies Table:

Entity selection

Selecting an entity enables additional queries on external datasources. For example, in the Query on Companies visualization you can see that, amongst the top 500 companies by number of employees mentioned in articles about wireless or wifi, Browshot has 487 competitors and there are 19 companies in the same domain. All widgets affected by the selected entity are marked by a purple header.


Selecting an entity also enables the display of additional data in the Company Info visualization; by clicking the (show) links you can toggle the list of companies in the same domain and competitors. The data in the tables is fetched from queries on the SQLite database, using the selected company ID as a parameter. The queries are rendered using customizable templates, which will be introduced later.

The selected entity appears as a light blue box on the right of the filter bar; to deselect the entity, click the bin icon that is displayed when you move the mouse over the box.

For additional documentation about entity selection, read the Datasource entity selection section in the Legacy REST datasources chapter.

Enhanced Search Results

The Enhanced search results visualization displays the current set of Elasticsearch documents as a table. For example, Companies Table is configured to display the following fields:

  • Time (foundation date)

  • label (the company name)

  • description

  • category_code

  • founded_year

  • countrycode

  • Why Relevant? (a relational column)

Companies table

By selecting Edit and then clicking the pencil icon, you are brought to a view where you can choose which fields to display and customize the order of the columns. If the index is time based, the Time column will always be displayed.

Expand the first row by clicking the right arrow, then scroll down to the homepage_url field and click the Toggle column icon:

Column positioning

You can click the arrows to move the column to the desired position:

Column positioning

Click handlers

You can define click handlers on cells to perform several actions. Let’s add a click handler to open the company homepage when clicking the cell displaying the URL.

The table is pre-configured with a click handler on label that is used to select an entity in the SQLite database.

To add a new click handler, go into edit mode, scroll down to the view options and click Add click handler; select homepage_url in the Column dropdown, then Follow the URL in the On click I want to dropdown. Select homepage_url as the URL field, then click Apply Changes (fa play).

You can test the click handler immediately by clicking a cell displaying a homepage URL in the preview displayed on the right:

URL click handler

Relational column

You can enable the relational column to be displayed when an Elasticsearch document is matched by a query on the SQLite database. The relational column reports on the relationship, based on the queries configured.

In the following example, you can see that Big Fish is listed in the Companies Table because it has competitors.

Relational column example
Relational column configuration

Saving the visualization

Click Save in the top right to save the visualization, then click Dashboard to go back to the Companies dashboard.

For additional documentation about this visualization, read the Enhanced search results chapter.

Query Templates

Company Info, an instance of the Siren Investigate query viewer visualization, displays the results of three SQL queries by rendering them through templates; the queries take the selected entity ID as an input, so the associated templates are displayed only when an entity is selected.

Siren Investigate query viewer example

The association between the query and templates can be set in the visualization configuration:

Siren Investigate query viewer configuration

Query templates can be managed by clicking the Management icon, then selecting Advanced Settings followed by Templates.

You can find the documentation about templates in the Legacy REST datasources chapter; the visualization is documented in the Siren Investigate query viewer chapter.

Discover

You can interactively explore your data from the Discover page. You have access to every document in every index that matches the selected index pattern. You can submit search queries, filter the search results, and view document data. You can also see the number of documents that match the search query and get field value statistics. If a time field is configured for the selected index pattern, the distribution of documents over time is displayed in a histogram at the top of the page.

Discover

Setting the time filter

The time filter restricts the search results to a specific time period. You can set a time filter if your index contains time-based events and a time-field is configured for the selected index pattern.

By default the time filter is set to the last 15 minutes. You can use the Time Picker to change the time filter or select a specific time interval or time range in the histogram at the top of the page.

To set a time filter with the Time Picker:

  1. Click Time Picker (fa clock o) in the Siren Investigate toolbar.

  2. To set a quick filter, click one of the shortcut links.

    Time filter shortcuts
  3. Click Select All in the Apply to Dashboards section to apply the time filter to all dashboards, or select individual dashboards. The time filter is applied to the current dashboard by default.

  4. To specify a time filter relative to the current time, click Relative and specify the start time as a number of seconds, minutes, hours, days, months, or years. You can also specify the end time relative to the current time. Relative times can be in the past or future.

    Relative time filter
  5. To specify both the start and end times for the time filter, click Absolute and select a start and end date. You can adjust the time by editing the To and From fields.

    Absolute time filter
  6. Click the caret in the bottom right corner to close the Time Picker.

To set a time filter from the histogram, do one of the following:

  • Click the bar that represents the time interval you want to zoom in on.

  • Click and drag to view a specific timespan. You must start the selection with the cursor over the background of the chart—​the cursor changes to a plus sign when you hover over a valid start point.

To move forward/backward in time, click the arrows to the left or right of the Time Picker:

Move backwards in time

You can use the browser Back button to undo your changes.

The displayed time range and interval are shown on the histogram. By default, the interval is set automatically based on the time range. To use a different interval, click the link and select an interval.

You can search the indices that match the current index pattern by entering your search criteria in the Query bar. You can perform a simple text search, use the Lucene query syntax, or use the full JSON-based Elasticsearch Query DSL.

When you submit a search request, the histogram, Documents table, and Fields list are updated to reflect the search results. The total number of hits (matching documents) is shown in the toolbar. The Documents table shows the first five hundred hits. By default, the hits are listed in reverse chronological order, with the newest documents shown first. You can reverse the sort order by clicking the Time column header. You can also sort the table by the values in any indexed field. For more information, see Sorting the Documents Table.

To search your data, enter your search criteria in the Query bar and press Enter or click Search (fa search) to submit the request to Elasticsearch.

  • To perform a free text search, simply enter a text string. For example, if you are searching web server logs, you could enter safari to search all fields for the term safari.

  • To search for a value in a specific field, prefix the value with the name of the field. For example, you could enter status:200 to find all of the entries that contain the value 200 in the status field.

  • To search for a range of values, you can use the bracketed range syntax, [START_VALUE TO END_VALUE]. For example, to find entries that have 4xx status codes, you could enter status:[400 TO 499].

  • To specify more complex search criteria, you can use the Boolean operators AND, OR, and NOT. For example, to find entries that have 4xx status codes and have an extension of php or html, you could enter status:[400 TO 499] AND (extension:php OR extension:html).

These examples use the Lucene query syntax. You can also submit queries using the Elasticsearch Query DSL. For examples, see query string syntax in the Elasticsearch Reference.
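
As a sketch, the last Lucene query above could be expressed in the Query DSL as:

{
  "query_string": {
    "query": "status:[400 TO 499] AND (extension:php OR extension:html)"
  }
}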

Saving searches enables you to reload them into Discover and use them as the basis for visualizations. Saving a search saves both the search query string and the currently selected index pattern.

To save the current search:

  1. Click Save in the Siren Investigate toolbar.

  2. Enter a name for the search and click Save.

You can import, export and delete saved searches from management/Siren/Saved Objects.

To load a saved search into Discover:

  1. Click Open in the Siren Investigate toolbar.

  2. Select the search you want to open.

If the saved search is associated with a different index pattern than is currently selected, opening the saved search also changes the selected index pattern.

Changing which indices you are searching

When you submit a search request, the indices that match the currently-selected index pattern are searched. The current index pattern is shown below the toolbar. To change which indices you are searching, click the index pattern and select a different index pattern.

For more information about index patterns, see Creating an Index Pattern.

Refreshing the search results

As more documents are added to the indices you are searching, the search results shown in Discover and used to display visualizations can become stale. You can configure a refresh interval to periodically resubmit your searches to retrieve the latest results.

To enable auto refresh:

  1. Click Time Picker (Time Picker) in the Siren Investigate toolbar.

  2. Click Auto refresh.

  3. Choose a refresh interval from the list.

    autorefresh intervals

When auto refresh is enabled, the refresh interval is displayed next to the Time Picker, along with a Pause button. To temporarily disable auto refresh, click Pause.

If auto refresh is not enabled, you can manually refresh visualizations by clicking Refresh.

Filtering by field

You can filter the search results to display only those documents that contain a particular value in a field. You can also create negative filters that exclude documents that contain the specified field value.

You add field filters from the Fields list, the Documents table, or by manually adding a filter. In addition to creating positive and negative filters, the Documents table enables you to filter on whether or not a field is present. The applied filters are shown below the Query bar. Negative filters are shown in red.

To add a filter from the Fields list:

  1. Click the name of the field you want to filter on. This displays the top five values for that field.

    filter field
  2. To add a positive filter, click Positive Filter (Positive Filter). This includes only those documents that contain that value in the field.

  3. To add a negative filter, click Negative Filter (Negative Filter). This excludes documents that contain that value in the field.

To add a filter from the Documents table:

  1. Expand a document in the Documents table by clicking Expand (Expand Button) to the left of the document’s table entry.

    Expanded Document
  2. To add a positive filter, click Positive Filter (Positive Filter) to the right of the field name. This includes only those documents that contain that value in the field.

  3. To add a negative filter, click Negative Filter (Negative Filter Button) to the right of the field name. This excludes documents that contain that value in the field.

  4. To filter on whether or not documents contain the field, click Exists (Exists Button) to the right of the field name. This includes only those documents that contain the field.

To manually add a filter:

  1. Click Add Filter. A popup will be displayed for you to create the filter.

    add filter
  2. Choose a field to filter by. This list of fields will include fields from the index pattern you are currently querying against.

    add filter field
  3. Choose an operation for your filter.

    add filter operator

    The following operators can be selected:

    is

    Filter where the value for the field matches the given value.

    is not

    Filter where the value for the field does not match the given value.

    is one of

    Filter where the value for the field matches one of the specified values.

    is not one of

    Filter where the value for the field does not match any of the specified values.

    is between

    Filter where the value for the field is in the given range.

    is not between

    Filter where the value for the field is not in the given range.

    exists

    Filter where any value is present for the field.

    does not exist

    Filter where no value is present for the field.

  4. Choose the value(s) for your filter.

    add filter value
  5. (Optional) Specify a label for the filter. If you specify a label, it will be displayed below the Query bar instead of the filter definition.

  6. Click Save. The filter will be applied to your search and be displayed below the Query bar.

To make the filter editor more user-friendly, you can enable the filterEditor:suggestValues advanced setting. Enabling this will cause the editor to suggest values from your indices if you are filtering against an aggregatable field. However, this is not recommended for extremely large datasets, as it can result in long queries.

Managing filters

To modify a filter, hover over it and click one of the action buttons.

filter allbuttons

 

fa check square o Enable Filter

Disable the filter without removing it. Click again to reenable the filter. Diagonal stripes indicate that a filter is disabled.

fa thumb tack mod Pin Filter

Pin the filter. Pinned filters persist when you switch contexts in Siren Investigate. For example, you can pin a filter in Discover and it remains in place when you switch to Visualize. Note that a filter is based on a particular index field—​if the indices being searched do not contain the field in a pinned filter, it has no effect.

fa search minus Invert Filter

Switch from a positive filter to a negative filter and vice-versa.

fa trash Remove Filter

Remove the filter.

fa pencil square o Edit Filter

Edit the filter definition. Enables you to manually update the filter and specify a label for the filter.

To apply a filter action to all of the applied filters, click Actions and select the action.

Editing a filter

You can edit a filter by changing the field, operator, or value associated with the filter (see the Add Filter section), or by directly modifying the filter query that is performed to filter your search results. This enables you to create more complex filters that are based on multiple fields.

  1. To edit the filter query, first click Edit for the filter, then click Edit Query DSL.

    edit filter query
  2. You can then edit the query for the filter.

    edit filter query json

For example, you could use a bool query to create a filter for the sample log data that displays the hits that originated from Canada or China that resulted in a 404 error:

{
  "bool": {
    "should": [
      {
        "term": {
          "geoip.country_name.raw": "Canada"
        }
      },
      {
        "term": {
          "geoip.country_name.raw": "China"
        }
      }
    ],
    "must": [
      {
        "term": {
          "response": "404"
        }
      }
    ]
  }
}

Viewing document data

When you submit a search query, the 500 most recent documents that match the query are listed in the Documents table. You can configure the number of documents shown in the table by setting the discover:sampleSize property in Advanced Settings. By default, the table shows the localized version of the time field configured for the selected index pattern and the document _source. You can add fields to the Documents table from the Fields list. You can sort the listed documents by any indexed field that’s included in the table.

To view a document’s field data, click Expand (Expand Button) to the left of the document’s table entry.

Expanded Document

To view the original JSON document (pretty-printed), click the JSON tab.

To view the document data as a separate page, click the View single document link. You can bookmark and share this link to provide direct access to a particular document.

To display or hide a field’s column in the Documents table, click the Toggle column in table (Add Column) button.

To collapse the document details, click Collapse (Collapse Button).

Sorting the document list

You can sort the documents in the Documents table by the values in any indexed field. If a time field is configured for the current index pattern, the documents are sorted in reverse chronological order by default.

To change the sort order, hover over the name of the field you want to sort by and click the sort button. Click again to reverse the sort order.

Adding field columns to the documents table

By default, the Documents table shows the localized version of the time field that’s configured for the selected index pattern and the document _source. You can add fields to the table from the Fields list or from a document’s field data.

To add a field column from the Fields list, hover over the field and click its add button.

Add Field From Sidebar

To add a field column from a document’s field data, expand the document and click the field’s Toggle column in table (Add Column) button.

Added field columns replace the _source column in the Documents table. The added fields are also added to the Selected Fields list.

To rearrange the field columns, hover over the header of the column you want to move and click the Move left or Move right button.

Move Column

Removing field columns from the documents table

To remove a field column from the Documents table, hover over the header of the column you want to remove and click Remove (Remove Field Button).

Viewing document context

For certain applications it can be useful to inspect a window of documents surrounding a specific event. The context view enables you to do just that for index patterns that are configured to contain time-based events.

To show the context surrounding an anchor document, click Expand (Expand Button) to the left of the document’s table entry and then click the View surrounding documents link.

Expanded Document

 

The context view displays a number of documents before and after the anchor document. The anchor document itself is highlighted in blue. The view is sorted by the time field specified in the index pattern configuration and uses the same set of columns as the Discover view the context was opened from. If there are multiple documents with the same time field value, the internal document order is used as a secondary sorting criterion by default.

The field used for tiebreaking in case of equal time field values can be configured using the advanced setting context:tieBreakerFields in Management > Advanced Settings, which defaults to the _doc field. The value of this setting can be a comma-separated list of field names, which will be checked in sequence for suitability when a context is about to be displayed. The first suitable field is then used as the tiebreaking field. A field is suitable if the field exists and is sortable in the index pattern the context is based on.

While not required, you should only use fields which have doc values enabled to achieve good performance and avoid unnecessary field data usage. Common examples for suitable fields include log line numbers, monotonically increasing counters and high-precision timestamps.

Context View
The number of documents displayed by default can be configured using the context:defaultSize setting in Management > Advanced Settings.

Changing the context size

You can change the number of documents displayed before and after the anchor document independently.

To increase the number of displayed documents that are newer than the anchor document, click Load 5 more above the document list or enter the desired number into the input box to the right of the button.

Discover ContextView SizePicker Newer

 

To increase the number of displayed documents that are older than the anchor document, click Load 5 more below the document list or enter the desired number into the input box to the right of the button.

Discover ContextView SizePicker Older

 

The default number of documents loaded with each button click can be configured using the context:step setting in Management > Advanced Settings.

Filtering the context

Depending on how the documents are partitioned into index patterns, the context view may contain a large number of documents not related to the event under investigation. In order to adapt the focus of the context view to the task at hand, you can use filters to restrict the documents considered by Kibana for display in the context view.

When switching from the Discover view to the context view, the previously applied filters are carried over. Pinned filters remain active, while normal filters are copied in a disabled state. You can selectively re-enable them to refine your context view.

New filters can be added using the Add a filter link in the filter bar, by clicking the filter icons that appear when hovering over a field, or by expanding documents and clicking the filter icons in the table.

Discover ContextView FilterMontage

Viewing field data statistics

From the Fields list, you can see how many of the documents in the Documents table contain a particular field, what the top 5 values are, and what percentage of documents contain each value.

To view field data statistics, click the name of a field in the Fields list.

Field Statistics

Selected fields

Selected fields are displayed in the Selected Fields list at the top left of the Discover page.

Selected Fields

Click a field to see the field’s stats.

To remove a field, hover over or click the field and click Remove.

Remove Selected Field

After you have selected at least one field, you can then click Generate Dashboard to begin autogenerating a new Dashboard.

If you have not selected any fields, you can allow Siren Investigate to select the fields it believes are the most relevant by clicking Autoselect Most Relevant.

This button is only visible if there are no fields selected. If you have selected fields, the Generate Dashboard button will be visible in its place.

Autoselect fields

Siren Investigate analyzes each field for relevance against a number of heuristics; for example, whether all values are unique, which would indicate a potential ID field that is unlikely to be relevant for visualizations.


After all the fields have been analyzed, a test report is displayed.

Auto Select Report

This report shows all the fields in the Discover data, which fields are selected as most relevant, the field type, the visualization selected for that type, the relevancy score and any notes on why the field was or was not selected as a relevant field.

You can add and remove fields you would like selected using the checkboxes on the left.

When you are ready, click Ok to select the fields.

After the fields have populated the Selected Fields list, you are ready to generate a dashboard.

Visualize

Visualize enables you to create visualizations of the data in your Elasticsearch indices. You can then build dashboards that display related visualizations.

Kibana visualizations are based on Elasticsearch queries. By using a series of Elasticsearch aggregations to extract and process your data, you can create charts that show you the trends, spikes, and dips you need to know about.

You can create visualizations from a search saved from Discover or start with a new search query.

Creating a visualization

  1. Click Visualize in the side navigation.

  2. Click the Create new visualization button or the + button.

  3. Choose the visualization type:

    • Siren Visualizations

      Siren Box Plot

      Display data in an x/y chart using upper and lower percentiles.

      Bubble Diagram

      Show data and parent/child relationships as bubbles.

      Enhanced Search Results

      Show the documents matched by a query on an Elasticsearch index with enhanced features.

      Graph browser

Display Elasticsearch documents as nodes and Siren Investigate relations as links of a graph.

      Multichart

      A visualization in which you can switch between other visualizations at will.

      Query Viewer

      Display the results from multiple queries on external data sources using query templates.

      Scatter Plot

      Show data in an x/y graph as scattered points.

    • Siren Relational Visualizations

      Relational Filter

      (Deprecated) Configure the relational buttons to navigate between dashboards.

      Relational Navigator

      Provide navigation between relationally connected dashboards.

      Automatic Relational Filter

      Automatically build the relations between index patterns and entities and generate relational filter buttons.

    • Basic charts

      Line, Area and Bar charts

      Compare different series in X/Y charts.

      Heat maps

      Shade cells within a matrix.

      Pie chart

      Display each source’s contribution to a total.

    • Data

      Data table

      Display the raw data of a composed aggregation.

      Metric

      Display a single number.

      Goal and Gauge

      Display a gauge.

    • Maps

      Coordinate map

      Associate the results of an aggregation with geographic locations.

      Region map

Thematic maps where a shape’s color intensity corresponds to a metric’s value.

    • Time Series

      Timelion

      Compute and combine data from multiple time series data sets.

      Time Series Visual Builder

      Visualize time series data using pipeline aggregations.

    • Other

      Tag cloud

Display words as a cloud in which the size of each word corresponds to its importance.

      Markdown widget

      Display free-form information or instructions.

  4. Specify a search query to retrieve the data for your visualization:

    • To enter new search criteria, select the index pattern for the indices that contain the data you want to visualize. This opens the visualization builder with a wildcard query that matches all of the documents in the selected indices.

    • To build a visualization from a saved search, click the name of the saved search you want to use. This opens the visualization builder and loads the selected query.

      When you build a visualization from a saved search, any subsequent modifications to the saved search are automatically reflected in the visualization. To disable automatic updates, you can disconnect a visualization from the saved search.
  5. In the visualization builder, choose the metric aggregation for the visualization’s Y axis.

  6. For the visualization’s X axis, select a bucket aggregation.

For example, if you are indexing Apache server logs, you could build a horizontal bar chart that shows the distribution of incoming requests by geographic location by specifying a terms aggregation on the geo.src field:

bar terms agg

The y-axis shows the number of requests received from each country, and the countries are displayed across the x-axis.
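
Under the hood, such a chart corresponds to an Elasticsearch terms aggregation along these lines (a simplified sketch; the field name and size are illustrative):

{
  "size": 0,
  "aggs": {
    "requests_by_location": {
      "terms": { "field": "geo.src", "size": 5 }
    }
  }
}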

Bar, line, or area chart visualizations use metrics for the y-axis and buckets for the x-axis. Buckets are analogous to SQL GROUP BY statements. Pie charts use the metric for the slice size and the bucket for the number of slices.

You can further break down the data by specifying sub aggregations. The first aggregation determines the data set for any subsequent aggregations. Sub aggregations are applied in order—​you can drag the aggregations to change the order in which they are applied.

For example, you could add a terms sub aggregation on the geo.dest field to a vertical bar chart to see the locations those requests were targeting.

bar terms subagg
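
In aggregation terms, a sub aggregation is simply nested inside its parent, so each geo.src bucket is split by geo.dest (again, a simplified sketch):

{
  "aggs": {
    "src": {
      "terms": { "field": "geo.src", "size": 5 },
      "aggs": {
        "dest": {
          "terms": { "field": "geo.dest", "size": 5 }
        }
      }
    }
  }
}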

For more information about working with sub aggregations, see Kibana, Aggregation Execution Order, and You.

Enhanced search results

Enhanced search results is a visualization that shows the documents matched by a query on an Elasticsearch index, similar to the stock Discover table.

In addition to column configuration, the visualization provides the following features:

  • To hide the time column, which represents the time field of the Elasticsearch index, check the Hide time column checkbox.

  • It is possible to set the page size, that is, the number of rows displayed on each page; to also enable the top paginator, check the Show top paginator checkbox.

  • If you would like to use aliases in place of the column names in the data, see Rename columns.

  • It is possible to enable a column that indicates whether or not a search result is matched by a query on an external datasource. This is described in Relational column.

  • It is possible to define click handlers on the cells in a column, for example to open the URL displayed in a cell. This is described in Click handlers.

  • If you would like to create filters from table rows, see Row filters.

Configuration view of the Enhanced search results table

Rename columns

It is possible to create an alias and set a minimum width for each column.

To enable column renaming, check the Enable column rename checkbox.

The following image shows the rename columns section of the table.

Rename columns

To configure the column names, set the following parameters:

  • Alias: the column alias that will be displayed as a column name.

  • Min width: Optional parameter to set the minimum width of the column.

Relational column

The relational column can be used to display if a search result is matched by a query on an external datasource.

To enable the relational column, check the Enable Relational Column checkbox.

The following image shows the configuration of a relational column named Why Relevant? where the value of a cell depends on the query Top 500 companies (HR count): if the value of the label index field of a document matches the value of the label variable in at least one record returned by the query, the name of the query will be displayed inside the cell.

Relational column configuration
Relational column example

To configure the relational column, set the following parameters:

  • Column name: the column name that will be displayed in the table header.

  • Source Field: the name of the index field that will be compared to a variable in the query results.

  • Target query: the name of the query to execute.

  • Target query variable name: the name of the query variable that will be compared to the index field specified in Source field.

Click handlers

It is possible to define two different actions to perform when clicking a cell:

  • Open a URL defined in the corresponding index field.

  • Select an entity in an external datasource matching the corresponding index field.

Follow URL

Select the Follow URL action to open a URL stored in an index field in a new window.

For example, the following configuration defines a handler that opens the URL stored in the field homepage_url when clicking the cell displaying the label field.

Follow URL on click

To configure a click handler, you must set the following parameters:

  • Column — the name of the column to which the handler will be bound.

  • On click I want to — the action to perform on click, select Follow the URL here.

  • URL field — the name of the field containing the URL.

  • URL format — a custom format string to compose the URL, where @URL@ is replaced with the value of the field set in URL field.

URL format can be used to create dynamic URLs; the following image shows a configuration in which the value of the id field is used to define the path of a URL on example.org.

With this configuration, if the id field is set to 11, the resulting URL will be http://example.org/11.
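
Assuming the substitution behavior described above, the URL format for that configuration would look like this (illustrative):

http://example.org/@URL@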

Follow URL with a custom format on click

Select an entity

Select the Select an entity action if you want to select an entity stored in an external datasource matching the selected Elasticsearch document; for more information about entity selection, read the Datasource entity selection section.

To configure an entity selection action you must set the following parameters:

  • Column — the name of the column to which the handler will be bound.

  • On click I want to — the action to perform on click, select Select the document here.

  • Redirect to dashboard — if set, clicking the cell will select the entity and display the specified dashboard.

Configuration of an entity selection handler

Row filters

It is possible to create filters from table rows.

To enable the row filters, check the Enable row filters checkbox.

Enable row filters

Then select the rows from which you want to create filters and click the Create Filter button.

Row filters

Siren Investigate query viewer

This visualization displays the results from multiple queries on external data sources using query templates.

To add a query to the visualization, click the Add query button, then set the following parameters:

  • Label: the caption for the table, in case of a table template like kibi-table-jade. This sets the variable label to the given value.

  • Source query: the query used by the template.

  • Template: the template used to render results returned by Source query.

If one of the source queries requires an entity to be selected, you can set an entity URI for testing in the input field above the preview.

If a source query is not activated, the corresponding template will not be rendered.

The following images show the configuration and output of a Templated query viewer visualization for a selected company:

Configuration of a Siren Investigate query viewer visualization

Advanced options

By clicking the Advanced link, you can set additional rendering options.

It is possible to set additional template variables by writing them as JSON object properties in the Template variables textarea.

For example, to customize the heading of the generic table template (this is done automatically by the Label input field above), which is set by default to the id of the source query, you can customize the label variable as follows:

{
    "label": "Info"
}

By default, template contents are hidden and can be displayed by clicking the show link in the heading; to make template contents visible by default, check Render opened box.

Advanced options

Siren Investigate multi chart

This visualization can display multiple types of chart, according to the currently selected configuration.

Multi Chart

multi chart

Multi Chart is not a chart type by itself; it contains a set of other charts (such as a pie chart) and enables you to switch between chart types while keeping the same aggregations.

Multi configurations

multi configurations

Visualize settings

Selection 1

New configuration

After changing the aggregation settings and setting the desired type of chart, you can click Add this configuration to save the configuration as a separate one.

new configuration

Multi Chart has the following options:

  • Show type selector - Enables you to show/hide the button bar for the chart type selection.

  • Show dropdown menu - Enables you to show/hide the dropdown menu for the aggregation configuration selection.

  • Show menu navigation buttons - Enables you to show/hide the navigation buttons around the dropdown menu.

Data table

Metric Aggregations:

Count

The count aggregation returns a raw count of the elements in the selected index pattern.

Average

This aggregation returns the average of a numeric field. Select a field from the drop-down.

Sum

The sum aggregation returns the total sum of a numeric field. Select a field from the drop-down.

Min

The min aggregation returns the minimum value of a numeric field. Select a field from the drop-down.

Max

The max aggregation returns the maximum value of a numeric field. Select a field from the drop-down.

Unique Count

The cardinality aggregation returns the number of unique values in a field. Select a field from the drop-down.

Standard Deviation

The extended stats aggregation returns the standard deviation of data in a numeric field. Select a field from the drop-down.

Percentiles

The percentile aggregation divides the values in a numeric field into percentile bands that you specify. Select a field from the drop-down, then specify one or more ranges in the Percentiles fields. Click the X to remove a percentile field. Click + Add to add a percentile field.

Percentile Rank

The percentile ranks aggregation returns the percentile rankings for the values in the numeric field you specify. Select a numeric field from the drop-down, then specify one or more percentile rank values in the Values fields. Click the X to remove a values field. Click +Add to add a values field.

Parent Pipeline Aggregations:

For each of the parent pipeline aggregations, you have to define the metric for which the aggregation is calculated. That can be one of your existing metrics or a new one. You can also nest these aggregations (for example, to produce a third derivative).

Derivative

The derivative aggregation calculates the derivative of specific metrics.

Cumulative Sum

The cumulative sum aggregation calculates the cumulative sum of a specified metric in a parent histogram.

Moving Average

The moving average aggregation slides a window across the data and emits the average value of that window.

Serial Diff

Serial differencing is a technique in which values in a time series are subtracted from the series at different time lags or periods.

Sibling Pipeline Aggregations:

Just like with parent pipeline aggregations, you need to provide a metric for which to calculate the sibling aggregation. On top of that, you also need to provide a bucket aggregation that defines the buckets on which the sibling aggregation will run.

Average Bucket

The avg bucket calculates the (mean) average value of a specified metric in a sibling aggregation.

Sum Bucket

The sum bucket calculates the sum of the values of a specified metric in a sibling aggregation.

Min Bucket

The min bucket calculates the minimum value of a specified metric in a sibling aggregation.

Max Bucket

The max bucket calculates the maximum value of a specified metric in a sibling aggregation.

You can add an aggregation by clicking the + Add Metrics button.

Enter a string in the Custom Label field to change the display label.

The rows of the data table are called buckets. You can define buckets to split the table into rows or to split the table into additional tables.

Each bucket type supports the following aggregations:

Date Histogram

A date histogram is built from a numeric field and organized by date. You can specify a time frame for the intervals in seconds, minutes, hours, days, weeks, months, or years. You can also specify a custom interval frame by selecting Custom as the interval and specifying a number and a time unit in the text field. Custom interval time units are s for seconds, m for minutes, h for hours, d for days, w for weeks, and y for years. Different units support different levels of precision, down to one second. Intervals are labeled at the start of the interval, using the date-key returned by Elasticsearch. For example, the tooltip for a monthly interval will show the first day of the month.
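
For reference, a custom interval of 3h (three-hour buckets) corresponds to an underlying date histogram aggregation roughly like the following (a sketch; the field name is illustrative):

{
  "date_histogram": {
    "field": "@timestamp",
    "interval": "3h"
  }
}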

Histogram

A standard histogram is built from a numeric field. Specify an integer interval for this field. Select the Show empty buckets checkbox to include empty intervals in the histogram.

Range

With a range aggregation, you can specify ranges of values for a numeric field. Click Add Range to add a set of range endpoints. Click the red (x) symbol to remove a range.

Date Range

A date range aggregation reports values that are within a range of dates that you specify. You can specify the ranges for the dates using date math expressions. Click Add Range to add a set of range endpoints. Click the red (x) symbol to remove a range.

IPv4 Range

The IPv4 range aggregation enables you to specify ranges of IPv4 addresses. Click Add Range to add a set of range endpoints. Click the red (x) symbol to remove a range.

Terms

A terms aggregation enables you to specify the top or bottom n elements of a given field to display, ordered by count or a custom metric.

Filters

You can specify a set of filters for the data. You can specify a filter as a query string or in JSON format, just as in the Discover search bar. Click Add Filter to add another filter. Click Label (fa tag) to open the label field, where you can type in a name to display on the visualization.

Significant Terms

Displays the results of the experimental significant terms aggregation. The value of the Size parameter defines the number of entries this aggregation returns.

Geohash

The geohash aggregation displays points based on the geohash coordinates.

After you have specified a bucket type aggregation, you can define sub-buckets to refine the visualization. Click + Add sub-buckets to define a sub-bucket, then choose Split Rows or Split Table, then select an aggregation from the list of types.

You can use the up or down arrows to the right of the aggregation’s type to change the aggregation’s priority.

Enter a string in the Custom Label field to change the display label.

You can click the Advanced link to display more customization options for your metrics or bucket aggregation:

Exclude Pattern

Specify a pattern in this field to exclude from the results.

Include Pattern

Specify a pattern in this field to include in the results.

JSON Input

A text field where you can add specific JSON-formatted properties to merge with the aggregation definition, as in the following example:

{ "script" : "doc['grade'].value * 1.2" }
In Elasticsearch releases 1.4.3 and later, this functionality requires you to enable dynamic Groovy scripting.

The availability of these options varies depending on the aggregation you choose.

Select the Options tab to change the following aspects of the table:

Per Page

This field controls the pagination of the table. The default value is ten rows per page.

Checkboxes are available to enable and disable the following behaviors:

Show metrics for every bucket/level

Check this box to display the intermediate results for each bucket aggregation.

Show partial rows

Check this box to display a row even when there is no result.

Enabling these behaviors may have a substantial effect on performance.

Viewing detailed information

Visualization Spy

To display the raw data behind the visualization, click Spy Open (fa chevron circle up) in the bottom left corner of the container. The visualization spy panel will open.

Use the select input (highlighted) to view detailed information about the raw data.

spy panel

Table

A representation of the underlying data, presented as a paginated data grid. You can sort the items in the table by clicking the table headers at the top of each column.

Request

The raw request used to query the server, presented in JSON format.

Response

The raw response from the server, presented in JSON format.

Statistics

A summary of the statistics related to the request and the response, presented as a data grid. The data grid includes the query duration, the request duration, the total number of records found on the server, and the index pattern used to make the query.

Debug

The visualization saved state presented in JSON format.

To export the raw data behind the visualization as a comma-separated-values (CSV) file, click either the Raw or Formatted links at the bottom of any of the detailed information tabs. A raw export contains the data as it is stored in Elasticsearch. A formatted export contains the results of any applicable field formatters.

Markdown widget

The Markdown widget is a text entry field that accepts GitHub-flavored Markdown text. Siren Investigate renders the text you enter in this field and displays the results on the dashboard. You can click the Help link to go to the help page for GitHub flavored Markdown. Click Apply to display the rendered text in the Preview pane or Discard to revert to a previous version.

Metric

A metric visualization displays a single number for each aggregation you select:

Metric Aggregations:

Count

The count aggregation returns a raw count of the elements in the selected index pattern.

Average

This aggregation returns the average of a numeric field. Select a field from the drop-down.

Sum

The sum aggregation returns the total sum of a numeric field. Select a field from the drop-down.

Min

The min aggregation returns the minimum value of a numeric field. Select a field from the drop-down.

Max

The max aggregation returns the maximum value of a numeric field. Select a field from the drop-down.

Unique Count

The cardinality aggregation returns the number of unique values in a field. Select a field from the drop-down.

Standard Deviation

The extended stats aggregation returns the standard deviation of data in a numeric field. Select a field from the drop-down.

Percentiles

The percentile aggregation divides the values in a numeric field into percentile bands that you specify. Select a field from the drop-down, then specify one or more ranges in the Percentiles fields. Click the X to remove a percentile field. Click + Add to add a percentile field.

Percentile Rank

The percentile ranks aggregation returns the percentile rankings for the values in the numeric field you specify. Select a numeric field from the drop-down, then specify one or more percentile rank values in the Values fields. Click the X to remove a values field. Click +Add to add a values field.

Parent Pipeline Aggregations:

For each of the parent pipeline aggregations, you have to define the metric for which the aggregation is calculated. That can be one of your existing metrics or a new one. You can also nest these aggregations (for example, to produce a third derivative).

Derivative

The derivative aggregation calculates the derivative of specific metrics.

Cumulative Sum

The cumulative sum aggregation calculates the cumulative sum of a specified metric in a parent histogram.

Moving Average

The moving average aggregation slides a window across the data and emits the average value of that window.

Serial Diff

Serial differencing is a technique in which values in a time series are subtracted from the series at different time lags or periods.

Sibling Pipeline Aggregations:

Just like with parent pipeline aggregations, you need to provide a metric for which to calculate the sibling aggregation. On top of that, you also need to provide a bucket aggregation that defines the buckets on which the sibling aggregation will run.

Average Bucket

The avg bucket calculates the (mean) average value of a specified metric in a sibling aggregation.

Sum Bucket

The sum bucket calculates the sum of the values of a specified metric in a sibling aggregation.

Min Bucket

The min bucket calculates the minimum value of a specified metric in a sibling aggregation.

Max Bucket

The max bucket calculates the maximum value of a specified metric in a sibling aggregation.
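
As a sketch of how the metric and the bucket aggregation fit together (field names are hypothetical), an avg bucket aggregation runs as a sibling of the date histogram whose buckets it averages:

{
  "aggs": {
    "per_day": {
      "date_histogram": { "field": "timestamp", "interval": "day" },
      "aggs": {
        "total_bytes": { "sum": { "field": "bytes" } }
      }
    },
    "avg_daily_bytes": {
      "avg_bucket": { "buckets_path": "per_day>total_bytes" }
    }
  }
}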

You can add an aggregation by clicking the + Add Metrics button.

Enter a string in the Custom Label field to change the display label.

You can click the Advanced link to display more customization options:

JSON Input

A text field where you can add specific JSON-formatted properties to merge with the aggregation definition, as in the following example:

{ "script" : "doc['grade'].value * 1.2" }
In Elasticsearch 5.x, inline scripts written in the default Painless language are enabled by default; using other script languages, such as Groovy, may require you to enable dynamic scripting.
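
The properties you enter are merged into the aggregation definition. For example, adding the script property above to an Average aggregation on a hypothetical numeric field named grade would effectively send:

{
  "avg": {
    "field": "grade",
    "script": "doc['grade'].value * 1.2"
  }
}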

The availability of these options varies depending on the aggregation you choose.

Click the Options tab to display the font size slider.

Viewing detailed information

Visualization Spy

To display the raw data behind the visualization, click Spy Open (fa chevron circle up) in the bottom left corner of the container. The visualization spy panel will open.

Use the select input (highlighted) to view detailed information about the raw data.

spy panel

Table

A representation of the underlying data, presented as a paginated data grid. You can sort the items in the table by clicking the table headers at the top of each column.

Request

The raw request used to query the server, presented in JSON format.

Response

The raw response from the server, presented in JSON format.

Statistics

A summary of the statistics related to the request and the response, presented as a data grid. The data grid includes the query duration, the request duration, the total number of records found on the server, and the index pattern used to make the query.

Debug

The visualization saved state presented in JSON format.

To export the raw data behind the visualization as a comma-separated-values (CSV) file, click either the Raw or Formatted links at the bottom of any of the detailed information tabs. A raw export contains the data as it is stored in Elasticsearch. A formatted export contains the results of any applicable field formatters.

Goal and gauge

A goal visualization displays how your metric progresses toward a fixed goal. A gauge visualization displays the predefined range in which your metric falls.

Metric Aggregations:

Count

The count aggregation returns a raw count of the elements in the selected index pattern.

Average

This aggregation returns the average of a numeric field. Select a field from the drop-down.

Sum

The sum aggregation returns the total sum of a numeric field. Select a field from the drop-down.

Min

The min aggregation returns the minimum value of a numeric field. Select a field from the drop-down.

Max

The max aggregation returns the maximum value of a numeric field. Select a field from the drop-down.

Unique Count

The cardinality aggregation returns the number of unique values in a field. Select a field from the drop-down.

Standard Deviation

The extended stats aggregation returns the standard deviation of data in a numeric field. Select a field from the drop-down.

Percentiles

The percentile aggregation divides the values in a numeric field into percentile bands that you specify. Select a field from the drop-down, then specify one or more ranges in the Percentiles fields. Click the X to remove a percentile field. Click + Add to add a percentile field.

Percentile Rank

The percentile ranks aggregation returns the percentile rankings for the values in the numeric field you specify. Select a numeric field from the drop-down, then specify one or more percentile rank values in the Values fields. Click the X to remove a values field. Click + Add to add a values field.

Parent Pipeline Aggregations:

For each of the parent pipeline aggregations, you have to define the metric for which the aggregation is calculated. That can be one of your existing metrics or a new one. You can also nest these aggregations (for example, to produce a third derivative).

Derivative

The derivative aggregation calculates the derivative of specific metrics.

Cumulative Sum

The cumulative sum aggregation calculates the cumulative sum of a specified metric in a parent histogram.

Moving Average

The moving average aggregation slides a window across the data and emits the average value of that window.

Serial Diff

Serial differencing is a technique in which values in a time series are subtracted from themselves at different time lags or periods.

Sibling Pipeline Aggregations:

Just like with parent pipeline aggregations, you need to provide a metric for which to calculate the sibling aggregation. In addition, you need to provide a bucket aggregation that defines the buckets on which the sibling aggregation runs.

Average Bucket

The avg bucket calculates the (mean) average value of a specified metric in a sibling aggregation.

Sum Bucket

The sum bucket calculates the sum of the values of a specified metric in a sibling aggregation.

Min Bucket

The min bucket calculates the minimum value of a specified metric in a sibling aggregation.

Max Bucket

The max bucket calculates the maximum value of a specified metric in a sibling aggregation.

You can add an aggregation by clicking the + Add Metrics button.

Enter a string in the Custom Label field to change the display label.

You can click the Advanced link to display more customization options:

JSON Input

A text field where you can add specific JSON-formatted properties to merge with the aggregation definition, as in the following example:

{ "script" : "doc['grade'].value * 1.2" }
In Elasticsearch 5.x, inline scripts written in the default Painless language are enabled by default; using other script languages, such as Groovy, may require you to enable dynamic scripting.

The availability of these options varies depending on the aggregation you choose.

Click the Options tab to change the following options:

  • Gauge Type: select between the arc, circle, and metric display types.

  • Percentage Mode: shows all values as percentages.

  • Vertical Split: puts the gauges below one another instead of side by side.

  • Show Labels: select whether to show or hide the labels.

  • Sub Text: text for the label that appears below the value.

  • Auto Extend Range: automatically grows the gauge if the value exceeds its range.

  • Ranges: add custom ranges. Each range is assigned a color; if the value falls within a range, it is displayed in that color. A chart with a single range is called a goal chart; a chart with multiple ranges is called a gauge chart.

  • Color Options: define how to color your ranges (which color schema to use). Color options are only visible if more than one range is defined.

  • Style - Show Scale: shows or hides the scale.

  • Style - Color Labels: whether the labels should have the same color as the range in which the value falls.

Pie charts

The slice size of a pie chart is determined by the metrics aggregation. The following aggregations are available for this axis:

Count

The count aggregation returns a raw count of the elements in the selected index pattern.

Sum

The sum aggregation returns the total sum of a numeric field. Select a field from the drop-down.

Unique Count

The cardinality aggregation returns the number of unique values in a field. Select a field from the drop-down.

Enter a string in the Custom Label field to change the display label.

The buckets aggregations determine what information is being retrieved from your data set.

Before you choose a buckets aggregation, specify whether you are splitting slices within a single chart or splitting into multiple charts. A multiple chart split must run before any other aggregations. When you split a chart, you can choose whether the splits are displayed in a row or a column by clicking the Rows | Columns selector.

You can specify any of the following bucket aggregations for your pie chart:

Date Histogram

A date histogram is built from a date field and organized by date. You can specify a time frame for the intervals in seconds, minutes, hours, days, weeks, months, or years. You can also specify a custom interval frame by selecting Custom as the interval and specifying a number and a time unit in the text field. Custom interval time units are s for seconds, m for minutes, h for hours, d for days, w for weeks, and y for years. Different units support different levels of precision, down to one second. Intervals are labeled at the start of the interval, using the date key returned by Elasticsearch. For example, the tooltip for a monthly interval will show the first day of the month.
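
For example, selecting Custom and entering 3h buckets the documents into three-hour intervals. A sketch of the generated aggregation, assuming a hypothetical timestamp field:

{
  "aggs": {
    "per_three_hours": {
      "date_histogram": { "field": "timestamp", "interval": "3h" }
    }
  }
}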

Histogram

A standard histogram is built from a numeric field. Specify an integer interval for this field. Select the Show empty buckets checkbox to include empty intervals in the histogram.

Range

With a range aggregation, you can specify ranges of values for a numeric field. Click Add Range to add a set of range endpoints. Click the red (x) symbol to remove a range.

Date Range

A date range aggregation reports values that are within a range of dates that you specify. You can specify the ranges for the dates using date math expressions. Click Add Range to add a set of range endpoints. Click the red (x) symbol to remove a range.
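
For example, now-1M/M is date math for "one month ago, rounded down to the start of the month". A sketch of the aggregation that a single such range generates (the timestamp field is hypothetical):

{
  "aggs": {
    "last_month": {
      "date_range": {
        "field": "timestamp",
        "ranges": [
          { "from": "now-1M/M", "to": "now" }
        ]
      }
    }
  }
}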

IPv4 Range

The IPv4 range aggregation enables you to specify ranges of IPv4 addresses. Click Add Range to add a set of range endpoints. Click the red (x) symbol to remove a range.

Terms

A terms aggregation enables you to specify the top or bottom n elements of a given field to display, ordered by count or a custom metric.
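
As a sketch of ordering by a custom metric rather than by count (the product and price fields are hypothetical), the terms aggregation orders its buckets by a nested metric:

{
  "aggs": {
    "top_products": {
      "terms": {
        "field": "product",
        "size": 5,
        "order": { "avg_price": "desc" }
      },
      "aggs": {
        "avg_price": { "avg": { "field": "price" } }
      }
    }
  }
}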

Filters

You can specify a set of filters for the data. You can specify a filter as a query string or in JSON format, just as in the Discover search bar. Click Add Filter to add another filter. Click Label (fa tag) to open the label field, where you can type in a name to display on the visualization.
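
For illustration, here is a sketch of a filters aggregation with two labeled filters, one written as a query string and one as JSON (the response field is hypothetical):

{
  "aggs": {
    "status": {
      "filters": {
        "filters": {
          "errors": { "query_string": { "query": "response:[500 TO 599]" } },
          "ok": { "term": { "response": 200 } }
        }
      }
    }
  }
}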

Significant Terms

Displays the results of the experimental significant terms aggregation. The value of the Size parameter defines the number of entries this aggregation returns.

After defining an initial bucket aggregation, you can define sub-buckets to refine the visualization. Click + Add sub-buckets to define a sub-aggregation, then choose Split Slices to select a sub-bucket from the list of types.

When multiple aggregations are defined on a chart’s axis, you can use the up or down arrows to the right of the aggregation’s type to change the aggregation’s priority.

You can customize the colors of your visualization by clicking the color dot next to each label to display the color picker.

An array of color dots that users can select

Enter a string in the Custom Label field to change the display label.

You can click the Advanced link to display more customization options for your metrics or bucket aggregation:

Exclude Pattern

Specify a pattern in this field to exclude from the results.

Include Pattern

Specify a pattern in this field to include in the results.

JSON Input

A text field where you can add specific JSON-formatted properties to merge with the aggregation definition, as in the following example:

{ "script" : "doc['grade'].value * 1.2" }
In Elasticsearch 5.x, inline scripts written in the default Painless language are enabled by default; using other script languages, such as Groovy, may require you to enable dynamic scripting.

The availability of these options varies depending on the aggregation you choose.

Select the Options tab to change the following aspects of the chart:

Donut

Display the chart as a sliced ring instead of a sliced pie.

Show Tooltip

Check this box to enable the display of tooltips.

After changing options, click Apply changes to update your visualization, or Discard changes to keep your visualization in its current state.

Viewing detailed information

Visualization Spy

To display the raw data behind the visualization, click Spy Open (fa chevron circle up) in the bottom left corner of the container. The visualization spy panel will open.

Use the select input (highlighted) to view detailed information about the raw data.

spy panel

Table

A representation of the underlying data, presented as a paginated data grid. You can sort the items in the table by clicking the table headers at the top of each column.

Request

The raw request used to query the server, presented in JSON format.

Response

The raw response from the server, presented in JSON format.

Statistics

A summary of the statistics related to the request and the response, presented as a data grid. The data grid includes the query duration, the request duration, the total number of records found on the server, and the index pattern used to make the query.

Debug

The visualization saved state presented in JSON format.

To export the raw data behind the visualization as a comma-separated-values (CSV) file, click either the Raw or Formatted links at the bottom of any of the detailed information tabs. A raw export contains the data as it is stored in Elasticsearch. A formatted export contains the results of any applicable field formatters.

Coordinate maps

A coordinate map displays a geographic area overlaid with circles keyed to the data determined by the buckets you specify.

By default, Siren Investigate uses a demonstration tilemap service based on OpenStreetMap to display map tiles. This service has limited features, so you should switch to another tilemap provider that you have configured, especially in a production setting. To use other tile service providers, configure the tilemap settings in investigate.yml.

Configuration

Configuring external tilemap providers

You can use existing free or paid tilemap providers or build and serve your own tilemap tiles.

After you have set up your own tilemap provider, configure these settings in investigate.yml to have map visualizations render these tiles.

For example, to use an OpenStreetMap default provider, the configuration YAML settings would look like:

tilemap:
  url: 'https://{s}.tile.openstreetmap.org/{z}/{x}/{y}.png'
  options:
    attribution: '&copy; [OpenStreetMap]("http://www.openstreetmap.org/copyright")'
    subdomains:
      - a

Data

Metrics

The default metrics aggregation for a coordinate map is the Count aggregation. You can select any of the following aggregations as the metrics aggregation:

Count

The count aggregation returns a raw count of the elements in the selected index pattern.

Average

This aggregation returns the average of a numeric field. Select a field from the drop-down.

Sum

The sum aggregation returns the total sum of a numeric field. Select a field from the drop-down.

Min

The min aggregation returns the minimum value of a numeric field. Select a field from the drop-down.

Max

The max aggregation returns the maximum value of a numeric field. Select a field from the drop-down.

Unique Count

The cardinality aggregation returns the number of unique values in a field. Select a field from the drop-down.

Enter a string in the Custom Label field to change the display label.

Buckets

Coordinate maps use the geohash aggregation. Select a field, typically coordinates, from the drop-down.

  • The Change precision on map zoom box is checked by default. Uncheck the box to disable this behavior. The Precision slider determines the granularity of the results displayed on the map. See the documentation for the geohash grid aggregation for details on the area specified by each precision level. (A sketch of the generated aggregation follows this list.)

Higher precisions increase memory usage for the browser displaying Siren Investigate as well as for the underlying Elasticsearch cluster.

  • The Place markers off grid (use geocentroid) box is checked by default. When this box is checked, the markers are placed in the center of all the documents in that bucket. When unchecked, the markers are placed in the center of the geohash grid cell. Leaving this checked generally results in a more accurate visualization.
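
A sketch of the aggregation that these settings generate, assuming a hypothetical geo_point field named location, with the geo centroid sub-aggregation used for off-grid marker placement:

{
  "aggs": {
    "grid": {
      "geohash_grid": { "field": "location", "precision": 5 },
      "aggs": {
        "centroid": { "geo_centroid": { "field": "location" } }
      }
    }
  }
}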

Enter a string in the Custom Label field to change the display label.

You can click the Advanced link to display more customization options for your metrics or bucket aggregation:

Exclude Pattern

Specify a pattern in this field to exclude from the results.

Include Pattern

Specify a pattern in this field to include in the results.

JSON Input

A text field where you can add specific JSON-formatted properties to merge with the aggregation definition, as in the following example:

{ "script" : "doc['grade'].value * 1.2" }
In Elasticsearch 5.x, inline scripts written in the default Painless language are enabled by default; using other script languages, such as Groovy, may require you to enable dynamic scripting.

The availability of these options varies depending on the aggregation you choose.

Options

Map type

Select one of the following options from the drop-down.

Scaled Circle Markers

Scale the size of the markers based on the metric aggregation’s value.

Shaded Circle Markers

Displays the markers with different shades based on the metric aggregation’s value.

Shaded Geohash Grid

Displays the rectangular cells of the geohash grid instead of circular markers, with different shades based on the metric aggregation’s value.

Heatmap

A heat map applies blurring to the circle markers and applies shading based on the amount of overlap. Heatmaps have the following options:

  • Radius: Sets the size of the individual heatmap dots.

  • Blur: Sets the amount of blurring for the heatmap dots.

  • Maximum zoom: Tilemaps in Siren Investigate support 18 zoom levels. This slider defines the maximum zoom level at which the heatmap dots appear at full intensity.

  • Minimum opacity: Sets the opacity cutoff for the dots.

  • Show Tooltip: Check this box to display a tooltip with the values for a given dot when the cursor hovers over that dot.

Desaturate map tiles

Desaturate the map’s colors to make the markers stand out more clearly.

WMS compliant map server

Check this box to enable the use of a third-party mapping service that complies with the Web Map Service (WMS) standard. Specify the following elements:

  • WMS url: The URL for the WMS map service.

  • WMS layers: A comma-separated list of the layers to use in this visualization. Each map server provides its own list of layers.

  • WMS version: The WMS version used by this map service.

  • WMS format: The image format used by this map service. The two most common formats are image/png and image/jpeg.

  • WMS attribution: An optional, user-defined string that identifies the map source. Maps display the attribution string in the lower right corner.

  • WMS styles: A comma-separated list of the styles to use in this visualization. Each map server provides its own styling options.

After changing options, click Apply changes to update your visualization, or Discard changes to keep your visualization in its current state.

After your tilemap visualization is ready, you can explore the map in several ways:

  • Click and hold anywhere on the map and move the cursor to move the map center. Hold Shift and drag a bounding box across the map to zoom in on the selection.

  • Click Zoom In/Out (si zoom) to change the zoom level manually.

  • Click Fit Data Bounds (fa crop) to automatically crop the map boundaries to the geohash buckets that have at least one result.

  • Click Latitude/Longitude Filter (fa stop), then drag a bounding box across the map, to create a filter for the box coordinates.

Viewing detailed information

Visualization Spy

To display the raw data behind the visualization, click Spy Open (fa chevron circle up) in the bottom left corner of the container. The visualization spy panel will open.

Use the select input (highlighted) to view detailed information about the raw data.

spy panel

Table

A representation of the underlying data, presented as a paginated data grid. You can sort the items in the table by clicking the table headers at the top of each column.

Request

The raw request used to query the server, presented in JSON format.

Response

The raw response from the server, presented in JSON format.

Statistics

A summary of the statistics related to the request and the response, presented as a data grid. The data grid includes the query duration, the request duration, the total number of records found on the server, and the index pattern used to make the query.

Debug

The visualization saved state presented in JSON format.

To export the raw data behind the visualization as a comma-separated-values (CSV) file, click either the Raw or Formatted links at the bottom of any of the detailed information tabs. A raw export contains the data as it is stored in Elasticsearch. A formatted export contains the results of any applicable field formatters.

Region maps

Region maps are thematic maps in which boundary vector shapes are colored using a gradient: higher intensity colors indicate larger values, and lower intensity colors indicate smaller values. These are also known as choropleth maps.

regionmap

Configuration

To create a region map, you configure an inner join that joins the result of an Elasticsearch terms aggregation and a reference vector file based on a shared key.

Data

Metrics

Select any of the supported Metric or Sibling Pipeline Aggregations.

Buckets

Configure a Terms aggregation. The term is the key that is used to join the results to the vector data on the map.
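
For example, to join against a vector layer of world countries keyed by two-letter ISO codes, the terms aggregation would bucket on the country-code field (the geo.src field name is hypothetical):

{
  "aggs": {
    "countries": {
      "terms": { "field": "geo.src", "size": 100 }
    }
  }
}

Each resulting term key (for example, US) is matched against the join field of the vector shapes, and the bucket’s metric value determines the shape’s color.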

Options

Layer settings
  • Vector map: select from a list of vector maps. This list includes the maps that are hosted by the Elastic Maps Service, as well as your self-hosted layers that are configured in the config/investigate.yml file. To learn more about how to configure Siren Investigate to make self-hosted layers available, see the regionmap settings documentation.

  • Join field: this is the property from the selected vector map that will be used to join on the terms in your terms aggregation. When terms cannot be joined to any of the shapes in the vector layer because there is no exact match in the vector layer, Siren Investigate displays a warning. To turn off these warnings, go to Management > Advanced Settings and set visualization:regionmap:showWarnings to false.

Style settings
  • Color Schema: the color range used to color the shapes.

Basic settings
  • Legend Position: the location on the screen where the legend should be rendered.

  • Show Tooltip: indicates whether a tooltip should be displayed when hovering over a shape.

Time Series Visual Builder

Experimental Feature

Time Series Visual Builder is a time series data visualizer with an emphasis on allowing you to use the full power of the Elasticsearch aggregation framework. Time Series Visual Builder enables you to combine an arbitrary number of aggregations and pipeline aggregations to display complex data in a meaningful way.

Time Series Visual Builder Interface

Time Series Visual Builder comes with five different visualization types. You can switch between each visualization type using the tabbed picker at the top of the interface.

Time Series

A histogram visualization that supports area, line, bar, and step charts, along with multiple y-axes. You can fully customize the colors, points, line thickness, and fill opacity. This visualization also supports time shifting to compare two time periods, and annotations, which can be loaded from a separate index based on a query.

Time Series Visualization

Metric

A visualization for displaying the latest number in a series. This visualization supports two metrics: a primary metric and a secondary metric. The labels and backgrounds can be fully customized based on a set of rules.

Metric Visualization

Top N

This is a horizontal bar chart where the y-axis is based on a series of metrics and the x-axis is the latest value in those series, sorted in descending order. The color of the bars is fully customizable based on a set of rules.

Top N Visualization

Gauge

This is a single value gauge visualization based on the latest value in a series. The face of the gauge can be either a half circle or a full circle. You can customize the thickness of the inner and outer lines to achieve a desired design aesthetic. The color of the gauge and the text are fully customizable based on a set of rules.

Gauge Visualization

Markdown

This visualization enables you to enter Markdown text and embed Mustache template syntax to customize the Markdown with data based on a set of series. This visualization also supports HTML markup along with the ability to define a custom stylesheet.

Markdown Visualization

Interface Overview

The user interface for each visualization is composed of a "Data" tab and a "Panel Options" tab. The only exceptions are the Time Series and Markdown visualizations: the Time Series has a third tab for annotations and the Markdown has a third tab for the editor.

Data Tab

The data tab is used for configuring the series for each visualization. This tab enables you to add multiple series, depending on what the visualization supports, with multiple aggregations composed together to create a single metric. Here is a breakdown of the significant components of the data tab UI.

Series label and color

Each series supports a label, which will be used for legends and titles depending on which visualization type is selected. For series that are grouped by a term, you can specify a mustache variable of {{key}} to substitute the term. For most visualizations you can also choose a color by clicking the swatch; this displays the color picker.

Label Example

Metrics

Each series supports multiple metrics (aggregations); the last metric (aggregation) is the value that will be displayed for the series, which is indicated with the "eye" icon to the left of the metric. Metrics can be composed using pipeline aggregations. A common use case is to create a metric with a "max" aggregation, then create a "derivative" metric and choose the previous "max" metric as the source; this creates a rate.

Derivative Example

Series options

Each series also supports a set of options, which depend on the type of visualization you have selected. Across all visualization types you can configure:

  • Data format

  • Time range offset

  • Index pattern, timestamp, and interval override

Default Series Options

For the Time Series visualization you can also configure:

  • Chart type

  • Options for each chart type

  • Legend Visibility

  • Y-Axis options

  • Split color theme

Time Series Series Options

Group by controls

At the bottom of the metrics there is a set of "Group By" controls that enables you to specify how the series should be grouped or split. There are four choices:

  • Everything

  • Filter (single)

  • Filters (multiple with configurable colors)

  • Terms

By default the series is grouped by everything.

Panel Options Tab

The panel options tab is used for configuring the entire panel; the set of options available is dependent on which visualization you have selected. The following is a list of the options available per visualization:

Time Series

  • Index pattern, timestamp, and Interval

  • Y-Axis min and max

  • Y-Axis position

  • Background color

  • Legend visibility

  • Legend position

  • Panel filter

Metric

  • Index pattern, timestamp, and interval

  • Panel filter

  • Color rules for background and primary value

Top N

  • Index pattern, timestamp, and interval

  • Panel filter

  • Background color

  • Item URL

  • Color rules for bar colors

Gauge

  • Index pattern, timestamp, and interval

  • Panel filter

  • Background color

  • Gauge max

  • Gauge style

  • Inner gauge color

  • Inner gauge width

  • Gauge line width

  • Color rules for gauge line

Markdown

  • Index pattern, timestamp, and interval

  • Panel filter

  • Background color

  • Scroll bar visibility

  • Vertical alignment of content

  • Custom Panel CSS with support for Less syntax

Annotations Tab

The annotations tab is used for adding annotation data sources to the Time Series Visualization. You can configure the following options:

  • Index pattern and time field

  • Annotation color

  • Annotation icon

  • Fields to include in message

  • Format of message

  • Filtering options at the panel and global level

Annotation Tab

Markdown Tab

The markdown tab is used for editing the source for the Markdown visualization. The user interface has an editor on the left side and the available variables from the data tab on the right side. You can click the variable names to insert the mustache template variable into the markdown at the cursor position. The mustache syntax uses the Handlebars.js processor, which is an extended version of the Mustache template language.

Markdown Tab

Tag clouds

A tag cloud visualization is a visual representation of text data, typically used to visualize free-form text. Tags are usually single words, and the importance of each tag is shown with font size or color.

The font size for each word is determined by the metrics aggregation. The following aggregations are available for this chart:

Metric Aggregations:

Count

The count aggregation returns a raw count of the elements in the selected index pattern.

Average

This aggregation returns the average of a numeric field. Select a field from the drop-down.

Sum

The sum aggregation returns the total sum of a numeric field. Select a field from the drop-down.

Min

The min aggregation returns the minimum value of a numeric field. Select a field from the drop-down.

Max

The max aggregation returns the maximum value of a numeric field. Select a field from the drop-down.

Unique Count

The cardinality aggregation returns the number of unique values in a field. Select a field from the drop-down.

Standard Deviation

The extended stats aggregation returns the standard deviation of data in a numeric field. Select a field from the drop-down.

Percentiles

The percentile aggregation divides the values in a numeric field into percentile bands that you specify. Select a field from the drop-down, then specify one or more ranges in the Percentiles fields. Click the X to remove a percentile field. Click + Add to add a percentile field.

Percentile Rank

The percentile ranks aggregation returns the percentile rankings for the values in the numeric field you specify. Select a numeric field from the drop-down, then specify one or more percentile rank values in the Values fields. Click the X to remove a values field. Click + Add to add a values field.

Parent Pipeline Aggregations:

For each of the parent pipeline aggregations, you have to define the metric for which the aggregation is calculated. That can be one of your existing metrics or a new one. You can also nest these aggregations (for example, to produce a third derivative).

Derivative

The derivative aggregation calculates the derivative of specific metrics.

Cumulative Sum

The cumulative sum aggregation calculates the cumulative sum of a specified metric in a parent histogram.

Moving Average

The moving average aggregation slides a window across the data and emits the average value of that window.

Serial Diff

Serial differencing is a technique in which values in a time series are subtracted from themselves at different time lags or periods.

Sibling Pipeline Aggregations:

Just like with parent pipeline aggregations, you need to provide a metric for which to calculate the sibling aggregation. In addition, you need to provide a bucket aggregation that defines the buckets on which the sibling aggregation runs.

Average Bucket

The avg bucket calculates the (mean) average value of a specified metric in a sibling aggregation.

Sum Bucket

The sum bucket calculates the sum of the values of a specified metric in a sibling aggregation.

Min Bucket

The min bucket calculates the minimum value of a specified metric in a sibling aggregation.

Max Bucket

The max bucket calculates the maximum value of a specified metric in a sibling aggregation.

You can add an aggregation by clicking the + Add Metrics button.

Enter a string in the Custom Label field to change the display label.

The buckets aggregations determine what information is being retrieved from your data set.

Before you choose a buckets aggregation, select the Split Tags option.

You can specify the following bucket aggregations for the tag cloud visualization:

Terms

A terms aggregation enables you to specify the top or bottom n elements of a given field to display, ordered by count or a custom metric.

You can click the Advanced link to display more customization options for your metrics or bucket aggregation:

JSON Input

A text field where you can add specific JSON-formatted properties to merge with the aggregation definition, as in the following example:

{ "script" : "doc['grade'].value * 1.2" }
In Elasticsearch 5.x, inline scripts written in the default Painless language are enabled by default; using other script languages, such as Groovy, may require you to enable dynamic scripting.

Select the Options tab to change the following aspects of the chart:

Text Scale

You can select linear, log, or square root scales for the text scale. You can use a log scale to display data that varies exponentially or a square root scale to regularize the display of data sets with variabilities that are themselves highly variable.

Orientation

You can select how to orient the text in the tag cloud. You can choose one of the following options: single, right angles, or multiple.

Font Size

Enables you to set the minimum and maximum font sizes to use for this visualization.

Visualization Spy

To display the raw data behind the visualization, click Spy Open (fa chevron circle up) in the bottom left corner of the container. The visualization spy panel will open.

Use the select input (highlighted) to view detailed information about the raw data.

spy panel

Table

A representation of the underlying data, presented as a paginated data grid. You can sort the items in the table by clicking the table headers at the top of each column.

Request

The raw request used to query the server, presented in JSON format.

Response

The raw response from the server, presented in JSON format.

Statistics

A summary of the statistics related to the request and the response, presented as a data grid. The data grid includes the query duration, the request duration, the total number of records found on the server, and the index pattern used to make the query.

Debug

The visualization saved state presented in JSON format.

To export the raw data behind the visualization as a comma-separated-values (CSV) file, click either the Raw or Formatted links at the bottom of any of the detailed information tabs. A raw export contains the data as it is stored in Elasticsearch. A formatted export contains the results of any applicable field formatters.

Heatmap chart

A heat map is a graphical representation of data where the individual values contained in a matrix are represented as colors. The color for each matrix position is determined by the metrics aggregation. The following aggregations are available for this chart:

Metric Aggregations:

Count

The count aggregation returns a raw count of the elements in the selected index pattern.

Average

This aggregation returns the average of a numeric field. Select a field from the drop-down.

Sum

The sum aggregation returns the total sum of a numeric field. Select a field from the drop-down.

Min

The min aggregation returns the minimum value of a numeric field. Select a field from the drop-down.

Max

The max aggregation returns the maximum value of a numeric field. Select a field from the drop-down.

Unique Count

The cardinality aggregation returns the number of unique values in a field. Select a field from the drop-down.

Standard Deviation

The extended stats aggregation returns the standard deviation of data in a numeric field. Select a field from the drop-down.

Percentiles

The percentile aggregation divides the values in a numeric field into percentile bands that you specify. Select a field from the drop-down, then specify one or more ranges in the Percentiles fields. Click the X to remove a percentile field. Click + Add to add a percentile field.

Percentile Rank

The percentile ranks aggregation returns the percentile rankings for the values in the numeric field you specify. Select a numeric field from the drop-down, then specify one or more percentile rank values in the Values fields. Click the X to remove a values field. Click + Add to add a values field.

Parent Pipeline Aggregations:

For each of the parent pipeline aggregations, you have to define the metric for which the aggregation is calculated. That can be one of your existing metrics or a new one. You can also nest these aggregations (for example, to produce a third derivative).

Derivative

The derivative aggregation calculates the derivative of specific metrics.

Cumulative Sum

The cumulative sum aggregation calculates the cumulative sum of a specified metric in a parent histogram.

Moving Average

The moving average aggregation slides a window across the data and emits the average value of that window.

Serial Diff

Serial differencing is a technique in which values in a time series are subtracted from themselves at different time lags or periods.

Sibling Pipeline Aggregations:

Just like with parent pipeline aggregations, you need to provide a metric for which to calculate the sibling aggregation. In addition, you need to provide a bucket aggregation that defines the buckets on which the sibling aggregation runs.

Average Bucket

The avg bucket calculates the (mean) average value of a specified metric in a sibling aggregation.

Sum Bucket

The sum bucket calculates the sum of the values of a specified metric in a sibling aggregation.

Min Bucket

The min bucket calculates the minimum value of a specified metric in a sibling aggregation.

Max Bucket

The max bucket calculates the maximum value of a specified metric in a sibling aggregation.

You can add an aggregation by clicking the + Add Metrics button.

Enter a string in the Custom Label field to change the display label.

The buckets aggregations determine what information is being retrieved from your data set.

Before you choose a buckets aggregation, specify whether you are defining buckets for the X or Y axis within a single chart or splitting into multiple charts. A multiple chart split must run before any other aggregations. When you split a chart, you can choose whether the splits are displayed in a row or a column by clicking the Rows | Columns selector.

This chart’s X and Y axes support the following aggregations. Click the linked name of each aggregation to visit the main Elasticsearch documentation for that aggregation.

Date Histogram

A date histogram is built from a date field and organized by date. You can specify a time frame for the intervals in seconds, minutes, hours, days, weeks, months, or years. You can also specify a custom interval frame by selecting Custom as the interval and specifying a number and a time unit in the text field. Custom interval time units are s for seconds, m for minutes, h for hours, d for days, w for weeks, and y for years. Different units support different levels of precision, down to one second. Intervals are labeled at the start of the interval, using the date key returned by Elasticsearch. For example, the tooltip for a monthly interval will show the first day of the month.

Histogram

A standard histogram is built from a numeric field. Specify an integer interval for this field. Select the Show empty buckets checkbox to include empty intervals in the histogram.

Range

With a range aggregation, you can specify ranges of values for a numeric field. Click Add Range to add a set of range endpoints. Click the red (x) symbol to remove a range.

Date Range

A date range aggregation reports values that are within a range of dates that you specify. You can specify the ranges for the dates using date math expressions. Click Add Range to add a set of range endpoints. Click the red (x) symbol to remove a range.

IPv4 Range

The IPv4 range aggregation enables you to specify ranges of IPv4 addresses. Click Add Range to add a set of range endpoints. Click the red (x) symbol to remove a range.

Terms

A terms aggregation enables you to specify the top or bottom n elements of a given field to display, ordered by count or a custom metric.

Filters

You can specify a set of filters for the data. You can specify a filter as a query string or in JSON format, just as in the Discover search bar. Click Add Filter to add another filter. Click Label (Label button icon) to open the label field, where you can type in a name to display on the visualization.

Significant Terms

Displays the results of the experimental significant terms aggregation.

Enter a string in the Custom Label field to change the display label.

You can click the Advanced link to display more customization options for your metrics or bucket aggregation:

Exclude Pattern

Specify a pattern in this field to exclude from the results.

Include Pattern

Specify a pattern in this field to include in the results.

JSON Input

A text field where you can add specific JSON-formatted properties to merge with the aggregation definition, as in the following example:

{ "script" : "doc['grade'].value * 1.2" }

The availability of these options varies depending on the aggregation you choose.

Select the Options tab to change the following aspects of the chart:

Show Tooltips

Check this box to enable the display of tooltips.

Highlight

Check this box to enable highlighting of elements with the same label.

Legend Position

You can select where to display the legend (top, left, right, or bottom).

Color Schema

You can select an existing color schema or choose Custom and define your own colors in the legend.

Reverse Color Schema

Checking this checkbox will reverse the color schema.

Color Scale

You can switch between linear, log, and square root scales for the color scale.

Scale to Data Bounds

The default Y axis bounds are zero and the maximum value returned in the data. Check this box to change both upper and lower bounds to match the values returned in the data.

Number of Colors

Number of color buckets to create. Minimum is 2 and maximum is 10.

Percentage Mode

Enabling this will show legend values as percentages.

Custom Range

You can define custom ranges for your color buckets. For each color bucket, you need to specify the minimum value (inclusive) and the maximum value (exclusive) of the range.

Show Label

Enables showing labels with the cell values in each cell.

Rotate

Allows rotating the cell value label by 90 degrees.

Visualization Spy

To display the raw data behind the visualization, click Spy Open (fa chevron circle up) in the bottom left corner of the container. The visualization spy panel will open.

Use the select input (highlighted) to view detailed information about the raw data.

spy panel

Table

A representation of the underlying data, presented as a paginated data grid. You can sort the items in the table by clicking the table headers at the top of each column.

Request

The raw request used to query the server, presented in JSON format.

Response

The raw response from the server, presented in JSON format.

Statistics

A summary of the statistics related to the request and the response, presented as a data grid. The data grid includes the query duration, the request duration, the total number of records found on the server, and the index pattern used to make the query.

Debug

The visualization saved state presented in JSON format.

To export the raw data behind the visualization as a comma-separated-values (CSV) file, click either the Raw or Formatted links at the bottom of any of the detailed information tabs. A raw export contains the data as it is stored in Elasticsearch. A formatted export contains the results of any applicable field formatters.

Line, area, and bar charts

Line, area, and bar charts enable you to plot your data on X/Y axes.

First, you need to select the metrics that define the value axis.

Metric Aggregations:

Count

The count aggregation returns a raw count of the elements in the selected index pattern.

Average

This aggregation returns the average of a numeric field. Select a field from the drop-down.

Sum

The sum aggregation returns the total sum of a numeric field. Select a field from the drop-down.

Min

The min aggregation returns the minimum value of a numeric field. Select a field from the drop-down.

Max

The max aggregation returns the maximum value of a numeric field. Select a field from the drop-down.

Unique Count

The cardinality aggregation returns the number of unique values in a field. Select a field from the drop-down.

Standard Deviation

The extended stats aggregation returns the standard deviation of data in a numeric field. Select a field from the drop-down.

Percentiles

The percentile aggregation divides the values in a numeric field into percentile bands that you specify. Select a field from the drop-down, then specify one or more ranges in the Percentiles fields. Click the X to remove a percentile field. Click + Add to add a percentile field.

Percentile Rank

The percentile ranks aggregation returns the percentile rankings for the values in the numeric field you specify. Select a numeric field from the drop-down, then specify one or more percentile rank values in the Values fields. Click the X to remove a values field. Click + Add to add a values field.

Parent Pipeline Aggregations:

For each of the parent pipeline aggregations, you have to define the metric for which the aggregation is calculated. That can be one of your existing metrics or a new one. You can also nest these aggregations (for example, to produce a third derivative).

Derivative

The derivative aggregation calculates the derivative of specific metrics.

Cumulative Sum

The cumulative sum aggregation calculates the cumulative sum of a specified metric in a parent histogram.

Moving Average

The moving average aggregation slides a window across the data and emits the average value of that window.

Serial Diff

Serial differencing is a technique in which values in a time series are subtracted from themselves at different time lags or periods.

Sibling Pipeline Aggregations:

Just like with parent pipeline aggregations, you need to provide a metric for which to calculate the sibling aggregation. In addition, you need to provide a bucket aggregation that defines the buckets on which the sibling aggregation runs.

Average Bucket

The avg bucket calculates the (mean) average value of a specified metric in a sibling aggregation.

Sum Bucket

The sum bucket calculates the sum of the values of a specified metric in a sibling aggregation.

Min Bucket

The min bucket calculates the minimum value of a specified metric in a sibling aggregation.

Max Bucket

The max bucket calculates the maximum value of a specified metric in a sibling aggregation.

You can add an aggregation by clicking the + Add Metrics button.

Enter a string in the Custom Label field to change the display label.

The buckets aggregations determine what information is being retrieved from your data set.

Before you choose a buckets aggregation, specify whether you are defining buckets within a single chart or splitting into multiple charts. A multiple chart split must run before any other aggregations. When you split a chart, you can choose whether the splits are displayed in a row or a column by clicking the Rows | Columns selector.

The X axis of this chart is the buckets axis. You can define buckets for the X axis, for a split area on the chart, or for split charts.

This chart’s X axis supports the following aggregations. Click the linked name of each aggregation to visit the main Elasticsearch documentation for that aggregation.

Date Histogram

A date histogram is built from a date field and organized by date. You can specify a time frame for the intervals in seconds, minutes, hours, days, weeks, months, or years. You can also specify a custom interval frame by selecting Custom as the interval and specifying a number and a time unit in the text field. Custom interval time units are s for seconds, m for minutes, h for hours, d for days, w for weeks, and y for years. Different units support different levels of precision, down to one second. Intervals are labeled at the start of the interval, using the date key returned by Elasticsearch. For example, the tooltip for a monthly interval will show the first day of the month.

Histogram

A standard histogram is built from a numeric field. Specify an integer interval for this field. Select the Show empty buckets checkbox to include empty intervals in the histogram.

Range

With a range aggregation, you can specify ranges of values for a numeric field. Click Add Range to add a set of range endpoints. Click the red (x) symbol to remove a range.

Date Range

A date range aggregation reports values that are within a range of dates that you specify. You can specify the ranges for the dates using date math expressions. Click Add Range to add a set of range endpoints. Click the red (x) symbol to remove a range.

IPv4 Range

The IPv4 range aggregation enables you to specify ranges of IPv4 addresses. Click Add Range to add a set of range endpoints. Click the red (x) symbol to remove a range.

Terms

A terms aggregation enables you to specify the top or bottom n elements of a given field to display, ordered by count or a custom metric.

Filters

You can specify a set of filters for the data. You can specify a filter as a query string or in JSON format, just as in the Discover search bar. Click Add Filter to add another filter. Click Label (Label button icon) to open the label field, where you can type in a name to display on the visualization.

Significant Terms

Displays the results of the experimental significant terms aggregation.

External query terms filter

A Siren Investigate aggregator in which you can define one or more buckets based on a record value (typically a primary key) matching the results of an external query. Multiple such buckets, corresponding to multiple queries, can be defined. For more information, see the query menu in the configuration. This displays the results of the external query terms filter aggregation.

After you have specified an X axis aggregation, you can define sub-aggregations to refine the visualization. Click + Add Sub Aggregation to define a sub-aggregation, then choose Split Area or Split Chart, then select a sub-aggregation from the list of types.

When multiple aggregations are defined on a chart’s axis, you can use the up or down arrows to the right of the aggregation’s type to change the aggregation’s priority.

Enter a string in the Custom Label field to change the display label.

You can customize the colors of your visualization by clicking the color dot next to each label to display the color picker.

An array of color dots that users can select

Enter a string in the Custom Label field to change the display label.

You can click the Advanced link to display more customization options for your metrics or bucket aggregation:

Exclude Pattern

Specify a pattern in this field to exclude from the results.

Include Pattern

Specify a pattern in this field to include in the results.

JSON Input

A text field where you can add specific JSON-formatted properties to merge with the aggregation definition, as in the following example:

{ "script" : "doc['grade'].value * 1.2" }
In Elasticsearch 5.x, inline scripts written in the default Painless language are enabled by default; using other script languages, such as Groovy, may require you to enable dynamic scripting.

The availability of these options varies depending on the aggregation you choose.

Metrics and Axes

Select the Metrics and Axes tab to change the way each individual metric is shown on the chart. The data series are styled in the Metrics section, while the axes are styled in the X and Y axis sections.

Metrics

Modify how each metric from the Data panel is visualized on the chart.

Chart type

Choose between Area, Line, and Bar types.

Mode

Stack the different metrics, or plot them next to each other.

Value Axis

Choose the axis on which you want to plot this data (the properties of each axis are configured under Y-axes).

Line mode

Whether the outline of lines or bars appears smooth, straight, or stepped.

Y-axis

Style all the Y-axes of the chart.

Position

Position of the Y-axis (left or right for vertical charts, and top or bottom for horizontal charts).

Scale type

Scaling of the values (linear, log, or square root).

Advanced Options
Labels - Show Labels

Enables you to show or hide the axis labels.

Labels - Filter Labels

If filter labels is enabled, some labels are hidden when there is not enough space to display them.

Labels - Rotate

Enter the number of degrees by which to rotate the labels.

Labels - Truncate

Enter the size in pixels at which labels are truncated.

Scale to Data Bounds

The default Y-axis bounds are zero and the maximum value returned in the data. Check this box to change both upper and lower bounds to match the values returned in the data.

Custom Extents

You can define a custom minimum and maximum for each axis.

X-axis

Position

Position of the X-axis (left or right for horizontal charts, and top or bottom for vertical charts).

Advanced Options
Labels - Show Labels

Enables you to show or hide the axis labels.

Labels - Filter Labels

If filter labels is enabled, some labels are hidden when there is not enough space to display them.

Labels - Rotate

Enter the number of degrees by which to rotate the labels.

Labels - Truncate

Enter the size in pixels at which labels are truncated.

Panel settings

These are options that apply to the entire chart and not just the individual data series.

Common options

Legend Position

Move your legend to the left, right, top, or bottom.

Show Tooltip

Enables or disables the display of tooltips when hovering over chart objects.

Current Time Marker

Shows a line indicating the current time.

Grid options

You can enable a grid on the chart. By default, the grid is displayed on the category axis only.

X-axis

You can disable the display of grid lines on the category axis.

Y-axis

You can choose on which (if any) of the value axes you want to display grid lines.

Viewing detailed information

Visualization Spy

To display the raw data behind the visualization, click Spy Open (fa chevron circle up) in the bottom left corner of the container. The visualization spy panel will open.

Use the select input (highlighted) to view detailed information about the raw data.

spy panel

Table

A representation of the underlying data, presented as a paginated data grid. You can sort the items in the table by clicking the table headers at the top of each column.

Request

The raw request used to query the server, presented in JSON format.

Response

The raw response from the server, presented in JSON format.

Statistics

A summary of the statistics related to the request and the response, presented as a data grid. The data grid includes the query duration, the request duration, the total number of records found on the server, and the index pattern used to make the query.

Debug

The visualization saved state presented in JSON format.

To export the raw data behind the visualization as a comma-separated-values (CSV) file, click either the Raw or Formatted links at the bottom of any of the detailed information tabs. A raw export contains the data as it is stored in Elasticsearch. A formatted export contains the results of any applicable field formatters.

Siren Investigate timeline

The Siren Investigate Timeline visualization displays series of data coming from different saved searches on a single timeline component. Events are color-coded to distinguish between different groups.

Each event on the timeline becomes a clickable term filter, which enables you to quickly filter the related data based on what is shown on the timeline.

Timeline

Configuration

To configure the visualization, add a new Group and select:

  • Saved search id - data for this group will be taken from the corresponding index.

  • Group label - a label for the group.

  • Event label field - the value of this field will be used as the individual event label.

  • Event start date - the date from this field will be used to position the start of the event.

  • Event end date - (optional) the date from this field will be used to position the end of the event.

  • Events number limit - (optional) limits the number of events in this group.

Timeline configuration

Advanced option

By default, events from multiple groups are all rendered mixed together. It is possible to show different groups on separate levels by enabling the advanced option:

  • Groups rendered on separate levels

Timeline advanced configuration

Below is a timeline where each group is rendered on a separate level.

Timeline

Siren Investigate scatter plot

This visualization displays a scatter plot chart in four different modes: Straight, Significant terms, Any aggregator, Filtered aggregator.

Straight

Straight

This mode does not use aggregations; it pulls the data directly from Elasticsearch, using random scoring to get a random sample of records (see the query sketch after the options list).

  • X values - The value can be String, Date or Numeric. Select a field from the drop-down.

  • Y values - The field value can be String, Date or Numeric. Select a field from the drop-down.

  • X axis label - A label for the X axis.

  • Y axis label - A label for the Y axis.

  • X axis scale - You can select linear, log, or square root scales for the chart’s X axis. You can use a log scale to display data that varies exponentially, such as a compounding interest chart, or a square root scale to regularize the display of data sets with variabilities that are themselves highly variable. This kind of data, where the variability is itself variable over the domain being examined, is known as heteroscedastic data. For example, if a data set of height versus weight has a relatively narrow range of variability at the short end of height, but a wider range at the taller end, the data set is heteroscedastic.

  • Y axis scale - You can select linear, log, or square root scales for the chart’s Y axis.

  • Jitter field - Deterministic jitter that adds a pseudo-random distribution of the data within the X axis interval. Jitter is useful for spreading values across the X axis so that overlapping dots within a bucket remain visible.

  • Jitter scale - You can select linear, log, or square root scales for the Jitter.

  • Label - A label for the dot.

    • Display label - Check this box to enable the display of a label next to the dot.

    • Label hover effect - Check this box to enable the tooltip label.

  • Color - A color for the dot.

  • Color field - The field used as an input to generate the dot colors. Only number field types are allowed.

  • Dot size - A size for the dot.

  • Dot size field - The field used as an input for the dot size. Only number field types are allowed.

  • Dot size scale - You can select linear, log, or square root scales for the dot size.

  • Size - The number of random records to fetch from the Elasticsearch query.

  • Shape opacity - Value from 0 to 1 which defines the dot transparency.
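For reference, the random sample corresponds to an Elasticsearch function_score query with random scoring. The following is a minimal sketch of that kind of query, not the exact request issued by the visualization; the size and seed values are illustrative:

{
  "size": 100,
  "query": {
    "function_score": {
      "query": { "match_all": {} },
      "random_score": { "seed": 42 }
    }
  }
}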

Significant terms

Significant term

In this mode the chart is built from a Significant terms aggregation query result. The X values are taken from the bg_count field and the Y values from the doc_count field (see the sketch after the options list).

  • Field - the field which will provide terms to be aggregated.

  • Size - the number of significant terms to be aggregated.

  • X axis label - A label for the X axis.

  • Y axis label - A label for the Y axis.

  • Color - A color for the dot.

  • Shape opacity - Value from 0 to 1 which defines the dot transparency.
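For reference, the underlying aggregation is a standard Elasticsearch significant_terms request. A minimal sketch, where the field name and sizes are illustrative rather than the exact request issued by the visualization:

{
  "size": 0,
  "aggs": {
    "sig_terms": {
      "significant_terms": {
        "field": "geoip.country_name.raw",
        "size": 20
      }
    }
  }
}

Each bucket in the response carries a doc_count and a bg_count, which provide the Y and X values respectively.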

Any aggregator

Any aggregator

The chart is built from a Date Histogram, Histogram, Terms or Significant terms aggregation query result.

  • Aggregation - Select an aggregation from the drop-down list.

  • X Metric - X axis values. Select a metric from the drop-down list.

  • Y Metric - Y axis values. Select a metric from the drop-down list.

  • Color - A color for the dot.

  • Dot size - A size for the dot.

  • Shape opacity - Value from 0 to 1 which defines the dot transparency.

Filtered aggregator

Filtered aggregator

The chart is built from a Date Histogram, Histogram, Terms or Significant terms aggregation query result. The X and Y values are taken from Filters aggregation results.

  • Aggregation - Select an aggregation from the drop-down list.

  • Filter X - A filter string for the X axis.

  • Filter Y - A filter string for the Y axis.

  • Metric - Metric to be calculated for each filter aggregation. Select a metric from the drop-down list.

  • Color - A color for the dot.

  • Dot size - A size for the dot.

  • Shape opacity - Value from 0 to 1 which defines the dot transparency.

After changing options, click Apply changes to update your visualization, or Discard changes to return your visualization to its previous state.

Radar chart

A radar chart, introduced in kibi-0.3.0, is a graphical method of displaying multivariate data in the form of a two-dimensional chart of three or more quantitative variables represented on axes starting from the same point. The relative position and angle of the axes are typically uninformative.

Radar chart visualization
Radar chart settings

The radar chart is also known as a web chart, spider chart, or star chart. It is developed as a standalone plugin suitable for installation in Kibi 0.3+, Siren Investigate 10+ and Kibana 4.3+.

Siren Investigate Graph Browser

The Siren Investigate Graph Browser displays Elasticsearch documents as nodes and Siren Investigate relations as links of a graph.

Graph Browser Example
Figure 1. Graph Browser Example

Configuration

Big nodes threshold

If a node would expand into more than this configured number of nodes, it is considered a big node and the user is given the choice to proceed or to select a sample.

Big Nodes Handling
Figure 2. Big Nodes Handling

Relations

You can configure the ontology relations you want to use in this visualization. If no relations are set, all of them are used.

Scripts

The Graph Browser supports three types of scripts:

  • Expansion - These scripts customize the expansion policy. The provided one (Default Expansion Policy) retrieves the elements directly connected to the expanded nodes.

  • Contextual - These scripts show up in the contextual menu (shown with a RIGHT CLICK) and allow you to perform operations on the graph.
    Provided contextual scripts:

    • Expand by relation - Opens a popup which enables you to choose one or more of the available relations and expands the selected elements using only those relations. This does not override the graph browser configuration; you will see only the configured relations (if available for the selected nodes).

    • Expand by top comention - To be used with company nodes from our demo. This script expands the selected nodes using an Elasticsearch aggregation to get the top comentioned company nodes

    • Replace investment with edge - To be used with our demo. This script replaces the investment nodes with a direct link between the company nodes and the investor nodes

    • Select - All - Select all the elements. Same as CTRL + A

    • Select - By edge count - Selects nodes based on their link count. You can specify the count in the popup that appears.

    • Select - By type - Selects nodes based on their type. You can specify the type in the popup that appears.

    • Select - Extend - Extends the current selection to the sibling elements

    • Select - Invert - Inverts the current selection

    • Shortest Path - Calculates the shortest path between two selected nodes by fetching the connected elements.

    • Show nodes count by type - Shows a popup with information about how many nodes per type are currently displayed

  • Lenses - Lenses mutate the visual appearance of graph nodes and edges; they can be cascaded as well as switched on and off at will during an investigation.
    Provided lens scripts:

    • Size lens - Enables you to set the size for all nodes using an expression.

    • Color lens - Enables you to define color for all nodes using a field.

    • Conditional lens - Enables you to set node properties using expressions.

    • Label lens - Enables you to set the label for all nodes using an expression.

    • Associate records based on ontology lens - Replaces a node with associated records based on ontology.

    • Time and location lens - Enables you to set time and/or location properties.

  • On Update - These are scripts that mutate the graph when new nodes are inserted. They can be cascaded.
    Provided on update scripts:

    • Add time fields - Adds the time field used by the timebar mode.

    • Add geo-locations for map visualization - Adds the geographic field used by the map mode.

    • Replace investment with edge - Same as the contextual script Replace investment with edge, but executed automatically after every expansion.

    • Signal dead companies - Colors black all the company nodes that have a deadpooled_date.

To create a new script, go to Management > Scripts.

Scripts Management
Figure 3. Scripts Management

Here you can configure new scripts or modify the saved ones.

Fields to exclude

You can configure a set of fields for each entity that you do not want to retrieve. Typically, you would exclude large fields that do not contribute to link analysis (for example, large textual blobs or technical metadata) for extra performance.


Navigating the Graph

After your Siren Investigate Graph Browser visualization is ready, you can start your investigations.

Toolbar

You have several operations available:

Toolbar
Figure 4. Toolbar
  1. Undo - By default, the graph browser saves the last 5 states. With this function you can go back one step at a time, until no more are available. You can configure the number of steps in the kibi advanced settings.

  2. Redo - With redo you can restore an undone state. Be careful: if you undo and then perform any operation, the redo state is lost.

  3. Filter - This adds a filter to the current dashboard, synced with the graph selection. This lets you:

    • Carry out your investigation on the graph, select the vertices you are interested in, activate the filter, pin it, and go back to the related dashboard to get more detailed information about those vertices.

    • Get more information on the selected nodes from other visualizations in the same dashboard. For example, if the current dashboard is associated with a companies saved search, you can carry out your investigation in the graph, activate the filter, select some vertices, and have the other visualizations show information about the selected vertices.

  4. Crop - This deletes every element that is not selected.

  5. Remove - This removes all the selected elements. Right next to the Remove button there is a dropdown that shows the Remove All button, which cleans the whole graph regardless of the selection.

Remove All
Figure 5. Remove All
  6. Expand - This expands the currently selected nodes. Right next to the Expand button, there is a dropdown that shows advanced options for the expansion.

  7. Highlight mode - This toggle enables and disables the Highlight mode. The Highlight mode moves to the background everything that is not selected and/or connected to a selected node/link.

Highlighting On
Figure 6. Highlighting On
Highlighting Off
Figure 7. Highlighting Off
  8. Layouts - This button lets you change the current graph’s layout. There are 2 available layouts:

    • Standard - This is the standard layout used by the graph. Pressing it forces the graph to re-layout. Note: selected nodes preserve their relative position.

    • Hierarchy - This layout arranges nodes top-down according to their connections. Note: it needs at least one selected node to work; selected nodes are moved to the top of the hierarchy.

Standard Layout
Figure 8. Standard Layout
Hierarchy Layout
Figure 9. Hierarchy Layout
  9. Add - The Add button opens a popup with the following options:

    • Add selected document - This adds the currently selected document. You can see your selected document in the upper right purple selection box.

    • Add from saved graph - This opens a popup showing the available saved graphs. Using this feature adds a set of nodes and links, but does not preserve the layout you had when you saved the graph.

    • Add from another dashboard - This adds nodes using the (optionally filtered) dashboard you select.

Add from saved graph
Figure 10. Add from saved graph
  10. Map Mode - This toggle enables or disables the Map mode. The Map mode moves the nodes geographically on an interactive map. You must set up a script to configure the geographic properties of the nodes (see Scripts).

Map mode
Figure 11. Map mode
  11. Timebar Mode - This toggle enables or disables the Timebar mode. The Timebar mode displays a timebar at the bottom of the Graph Browser that enables time-based filtering of nodes. After you enable this mode, you can add or remove node types on the timebar through the new menu: Timebar Filter.
    You must set up a script to configure the time property of the nodes (see Scripts).

Timebar mode
Figure 12. Timebar mode
  12. Save Graph - This button opens a popup that lets you save the current graph.

Save Graph
Figure 13. Save Graph
  13. Open Graph - This button opens a popup that lets you open a saved graph. Note: unlike Add from saved graph, this feature preserves the saved graph layout.

Open Graph
Figure 14. Open Graph

Shortcuts

The Graph Browser supports some shortcuts:

  • CTRL + A: select every element in the graph

  • DEL: delete the selected elements (same as the remove button)

  • CTRL + CLICK: enables you to add elements to the current selection

  • DOUBLE CLICK: expands the selected nodes (same as the expand button)

  • ARROWS: move the selected elements in the input direction

  • Mouse Wheel: changes the zoom level of the graph

Navigation Bar

Navigation Bar
Figure 15. Navigation Bar

The navigation bar enables you to:

  1. Move the graph view in the clicked direction

  2. Switch between:

    • Arrow - enables you to select elements

    • Hand - enables you to move the graph regardless of selected elements

  3. Change the zoom level

Side Bar

Side bar
Figure 16. Side bar

The graph browser side bar enables you to:

  • Show, search, filter, sort, group, and change node/link data.

  • Change the current selection.

  • Change node/link attributes (for example color, label, tooltip).

Lenses Tab
Side bar - lenses tab
Figure 17. Side bar - lenses tab

The graph browser side bar lenses tab enables you to make alterations on the displayed nodes/links.

Lenses can be:
  • Color - Enables you to select a field which is then used to color the nodes using a coloring scheme.

  • Conditional - Enables you to change a node property value using configurable expressions.

  • Label - Enables you to set the node label using an expression.

  • Size - Use a log scale to adjust the node’s size according to an expression.

  • Spatio-Temporal - Enables you to set the node time and/or geographic location from field values.

  • Associate records based on ontology - Enables you to replace a node with a relation between two of its children.

See here for more information on lens expressions.

Parameters
Lens parameters
Figure 18. Lens parameters

Each lens has specific parameters which will be used for every graph node.

Conditional lens
Side bar - conditional lens
Figure 19. Side bar - conditional lens

A conditional lens can change a property for all the nodes that satisfy the condition.

Properties available to change are:

  • Color

  • Node font icon

  • Node glyphs

  • Hidden

  • Label

  • Location

  • Node image

  • Size

  • Time

  • Tooltip

Associate records based on ontology lens
Side bar - associate records based on ontology lens
Figure 20. Side bar - Associate records based on ontology lens

The associate records based on ontology lens can use the node’s underlying model, as in the following example, to replace a node with the relation between two of its children.

Side bar - investment model graph view
Figure 21. Side bar - Investment model graph view

After you configure the lens, two nodes and their relationship are displayed.

For instance, if you apply this lens to this node:

Side bar - investment node
Figure 22. Side bar - Investment node

You could obtain this result:

Side bar - associate records based on ontology result
Figure 23. Side bar - Associate records based on ontology result
Selection Tab
Side bar - selection tab
Figure 24. Side bar - selection tab

The graph browser side bar selection tab enables you to show, search, filter, sort, group and change node/link data. When this tab is open, it reacts to your current node selection and loads the data in rows and columns.

The main component is the data grid: every row represents a node in the graph and every column a field of the related data.

Document type selection

The Main selection combo box enables selection between the different document types in the selected nodes.

Selection change

The second column in the grid enables multiple row selection; once rows are selected, the selection is reflected on the graph by making each node bigger and changing the node’s border to red.

After you complete the selection you can click the Make main selection button floating over the grid to remove the non-selected nodes.

Global filter

Typing in the Filter input enables you to search/filter in all rows and columns.

Local filter

Typing inside of one column’s input enables you to search/filter in all rows of that column.

Grid menu
Side bar - grid menu
Figure 25. Side bar - grid menu

This menu enables you to hide/show columns and clear all local filters.

Column menu
Side bar - column menu
Figure 26. Side bar - column menu

The menu options allow you to:

  • Change the sort order (sorting on multiple columns is supported by keeping the Shift key pressed during column selection).

  • Hide the column.

  • Group the data.

  • Add an aggregate function, the result of which is displayed at the bottom.

  • Pin the column to the left or right side of the grid.

Lens Expressions

Siren Investigate’s lens expression parser is based on Jexl.

The expression created within the lens is applied to each node of the selection. Each node contains an object named payload which contains the node’s data returned from Elasticsearch.
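For example, a label lens could combine two payload fields into a single string. The city and country fields here are hypothetical and stand in for whatever fields your documents contain:

payload.city + ', ' + payload.country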

Jexl Operators

There are a number of operators which can be applied to the payload data for transformation, comparison, etc.

Here is a selection; further details are available on the Jexl GitHub page.

Operators

Use these operators to perform mathematical operations on values

Operation          Symbol   Example
Negate             !        !true → false
Add/Concat         +        3 + 4 → 7
Subtract           -        4 - 3 → 1
Multiply           *        3 * 8 → 24
Divide             /        15 / 4 → 3.75
Divide and Floor   //       15 // 4 → 3
Modulus            %        23 % 2 → 1
Power of           ^        2^3 → 8
Logical AND        &&       true && true → true
Logical OR         ||       true || false → true

Comparisons

Use these operators to compare two values; the boolean results can be used, for example, for filtering.

Operation                    Symbol   Example
Equal                        ==       1 == 2 → false
Not Equal                    !=       1 != 2 → true
Greater Than                 >        2 > 3 → false
Greater Than or Equal        >=       3 >= 3 → true
Less Than                    <        2 < 3 → true
Less Than or Equal           <=       2 <= 4 → true
Element in array or string   in       "cat" in ["cat", "dog", "mouse"] → true

Conditional Operators

Conditional operators return the second or third expression based on the result of the first expression. If the first expression ("Bob" in ["Bob", "Mary"] below) returns true, "Yes" is returned. If it returns false, "No" is returned.

Example                                     Result
"Bob" in ["Bob", "Mary"] ? "Yes" : "No"     "Yes"

Identifiers

Access variables in the payload with dot notation or by using brackets, for example:

{
  name: {
    first: 'John',
    last: 'Smith'
  },
  age: 55,
  colleagues: [
    'Mary',
    'Bob',
    'Ted'
  ],
  teammate: 2
}
Example                Result
name.first             "John"
colleagues[teammate]   "Ted"
name['la' + 'st']      "Smith"

Collection Filtering

Arrays of objects (Collections) can be filtered by including a filter expression in brackets. Properties of each collection can be referenced by prefixing them with a leading dot. The result is an array of objects for which the filter returns a truthy value.

{
  users: [
    { first: 'John', last: 'Smith', age: 20},
    { first: 'Mary', last: 'Jones', age: 46},
    { first: 'Ted', last: 'Cotter', age: 16},
    { first: 'Bob', last: 'White', age: 66}
  ],
  adult: 21
}
Example                        Result
users[.last == 'Jones']        [{ first: 'Mary', last: 'Jones', age: 46 }]
users[.age < adult]            [{ first: 'John', last: 'Smith', age: 20 }, { first: 'Ted', last: 'Cotter', age: 16 }]
users[.first == 'John'].last   "Smith"

Lens Expression Functions

In addition to the general Jexl parsing functionality, Siren Investigate also exposes a number of Javascript-like functions for use in Lens Expressions. Payload values (or the results from earlier parsing) are piped into the function using the | character. These values become the val parameter for the functions below - meaning the val does not need to be added in the () after the function name. In some cases, this value is all that is needed by the function and some functions require extra parameters.

Some functions require string inputs and some require integer or floating-point inputs.
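Transforms can also be chained with repeated pipes. For example, assuming a hypothetical name field, the following sketch normalizes it to a lower-case identifier:

payload.name | lower | replace(' ', '_')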

Table 1. String Lens Expressions
Function Example Explanation

split(val, delimiter[, limit])

payload.IP | split('.', 3)

Splits an IP address by the '.' and returns the first 3 entries as an array

endsWith(val, substring[, length])

payload.name | endsWith('smith', 10)

Returns true if val ends with substring; if length is added, only that number of characters from the beginning of val is checked.

startsWith(val, substring[, position])

payload.name | startsWith('smith', 10)

Returns true if val begins with substring; if position is added, the substring from that position to the end of val is checked.

indexOf(val, substring[, length])

payload.name | indexOf('smith', 10)

Returns the position of the first character of substring if val contains substring; if the third argument is added, the search starts from that position.

upper(val)

payload.name | upper

Returns val in upper case.

lower(val)

payload.name | lower

Returns val in lower case.

substring(val, start, end)

payload.name | substring(5, 10)

Returns the string within val found between start and end.

replace(val, substring, newSubString)

payload.name | replace('smith', 'jones')

Replaces substring with newSubString in val.

Table 2. Number lens Expressions
Function Example Explanation

round(val)

payload.range | round

Returns val rounded to the nearest integer.

trunc(val)

payload.range | trunc

Returns the integer part of val.

sqrt(val)

payload.range | sqrt

Returns √val.

sign(val)

payload.range | sign

Returns 1 if val is positive, -1 if val is negative or 0 if val equals 0.

ceil(val)

payload.price | ceil

Returns the smallest integer greater than or equal to val.

floor(val)

payload.price | floor

Returns the largest integer less than or equal to val.

abs(val)

payload.temperature_change | abs

Returns the absolute value for a Number or 0 if the number is null

exp(val)

payload.difference | exp

Returns e raised to the power of val (the exponential function).

log(val)

payload.difference | log

Returns the natural logarithm of val, that is, ln(val).

random(val)

payload.range | random

Returns val multiplied by a floating-point, pseudo-random number between 0 (inclusive) and 1 (exclusive).

Siren Investigate box plot

This visualization displays a box plot chart from the data in the current set of Elasticsearch documents.

Usage

Box plot

Ensure that you have the following (see the query sketch after this list):

  • One Percentiles metric, with three Percentiles defined:

    • Bottom Percentile (Usually around 25%)

    • Middle Percentile (usually the 50th percentile, that is, the median)

    • Top Percentile (Usually around 75%)

  • One Max metric

  • One Min metric

  • One Aggregation (Optional)
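As a rough sketch, the metrics above correspond to an Elasticsearch request along these lines; the price field is illustrative, not part of the visualization's actual request:

{
  "size": 0,
  "aggs": {
    "box_percentiles": {
      "percentiles": { "field": "price", "percents": [25, 50, 75] }
    },
    "box_max": { "max": { "field": "price" } },
    "box_min": { "min": { "field": "price" } }
  }
}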

Options

Box plot options
  • Y Axis Text - A label for the Y axis.

  • X Axis Text - A label for the X axis.

  • Show values - Check this box to display each value next to its box.

  • Restrict Y axis MAX - Restricts the domain of the Y axis to a maximum value.

    • Global Max Y Value - Y axis domain maximum value.

  • Restrict Y axis MIN - Restricts the domain of the Y axis to a minimum value.

    • Global Min Y Value - Y axis domain minimum value.

After changing options, click Apply changes to update your visualization, or Discard changes to return your visualization to its previous state.

Bubble diagram

The Bubble Diagram visualization displays series of data grouped into packed circles.

First

Bubble size

The radius of circles depends on the type of metric aggregations.

Count

The count aggregation returns a raw count of the elements in the selected index pattern.

Average

This aggregation returns the average of a numeric field. Select a field from the drop-down.

Sum

The sum aggregation returns the total sum of a numeric field. Select a field from the drop-down.

Min

The min aggregation returns the minimum value of a numeric field. Select a field from the drop-down.

Max

The max aggregation returns the maximum value of a numeric field. Select a field from the drop-down.

Unique Count

The cardinality aggregation returns the number of unique values in a field. Select a field from the drop-down.

Buckets aggregations

The bucket aggregations determine what information is displayed in the diagram.

You can do a maximum of two aggregations at one time. The first aggregation will create the parent circles, while the second aggregation will create the child circles.
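For example, using two Terms bucket aggregations corresponds to a nested Elasticsearch request along these lines; the field names are illustrative:

{
  "size": 0,
  "aggs": {
    "parents": {
      "terms": { "field": "countrycode", "size": 10 },
      "aggs": {
        "children": {
          "terms": { "field": "category_code", "size": 10 }
        }
      }
    }
  }
}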

Parent circles look slightly different from the child ones: they have a thicker border and their label is written in bold.

The parent bubbles are distinguished by color. If you add a subaggregation (children), you will see the bubbles divided into families. Children are located near their parent and all have the same color. Families stay together: if you drag a bubble, all members of the family follow it.

Aggregation configuration

Options

The diagram has two options:

Options configuration
Show Parents

If checked, parent bubbles remain visible when a subaggregation is used.

Enable Zoom

Enables zooming on the page using the mouse wheel.

Circles movements

All circles gravitate towards the center of the visualization.

When you drag a circle, its family follows it.

bubbles Movement

When moving the mouse over a circle, detailed information is shown in a tooltip.

Detailed information on hover

Filters

You can create filters by double-clicking the bubbles.

When you double-click a child, you will be asked to confirm the application of the filter on the filter bar.

Filter Child

Pressing Apply Now will set the filter and display the bubble itself and its parent.

Filter Child

When you double-click a parent, you will see the bubble itself and its family.

Filter Parent

Siren Investigate horizontal bar chart

This visualization displays a horizontal bar chart from the data in the current set of Elasticsearch documents.

Usage

Horizontal Bar Chart

Dashboard

A Siren Investigate dashboard displays a set of saved visualizations in a customizable grid layout. You can save a dashboard to share or reload at a later time.

In Siren Investigate, dashboards are displayed in the left panel and can be organized as dashboard groups.

Getting Started

You need at least one saved visualization to use a dashboard.

Building a New Dashboard

Dashboards can be accessed using the home icon or the Siren logo. When you click Dashboard, Siren Investigate displays the first available dashboard or, if no dashboards have been defined, the dashboard creation screen.

New Dashboard screen

You can create a new dashboard by clicking the icon in the dashboard panel:

Create New Dashboard

Creating a New Dashboard

Build your dashboard by adding visualizations. By default, Siren Investigate dashboards use a light color theme. To use a dark color theme instead, click Options (which you can find on the top horizontal menu or by right-clicking the dashboard name) and check the Use dark theme box.

Dark Theme Example

You can change the default theme in the Advanced section of the Settings tab.

Autogenerate dashboard

If you have selected fields (or used the autoselect fields feature) in Discover, you can click Generate Dashboard to autogenerate a dashboard from those selected fields. This will create a new dashboard with visualizations selected to display the data in those fields.

First the generated dashboard details are shown and can be edited:

Create Autogenerated Dashboard Panel

Here, you can edit the title of the new dashboard and choose whether to store the time with the dashboard (a time filter set on the dashboard is stored and restored when the dashboard is reloaded).

If you want to add a Multichart visualization to the dashboard, select the checkbox. Be aware that this can slow the generation of the dashboard.

The generated dashboard will be associated to a saved search, which by default is a new saved search created from the current state of the Discover page. If a saved search is currently open in the Discover page, however, you have the option to use it either before or after saving its state.

Generate dashboard report

Clicking OK on the generated dashboard’s details panel opens a panel showing the details of the visualizations to be added to the dashboard:

Generate Dashboard Visualizations Panel

Here, the visualizations to be applied to the dashboard are listed. You can choose whether or not to add them with the checkboxes on the left and you can also edit the automatically generated title of each visualization in the input boxes on the right.

Clicking OK here will save and generate your new dashboard.

New Auto Generated Dashboard

If you want to resize or reorder your visualizations on the dashboard, click Edit on the top bar to begin customizing.

Saving dashboards

Click the Save button:

Saving a dashboard

The name of the dashboard can be set in the Save As field.

If Store time with dashboard is checked, the currently set time filter is restored when the dashboard is opened.

To display the number of Elasticsearch documents displayed by the dashboard in the corresponding tab, select a Saved Search:

Dashboard settings

Sharing Dashboards

You can share dashboards with other users by sending a link or by embedding them into HTML pages; ensure that your Siren Investigate installation is properly secured when sharing a dashboard on a public facing server.

To view shared dashboards users must be able to access Siren Investigate; keep this in mind if your Siren Investigate instance is protected by an authentication proxy.

To share a dashboard, click Share to display the Sharing panel.

sharing panel

Click Copy to copy the native URL or embed HTML to the clipboard. The Share Snapshot link field contains a pre-shortened URL for sharing or embedding.

Embedding dashboards

Copy the embed code from the Share display into your external web application.
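The embed code is an iframe snippet along the following lines; the URL below is a placeholder for the one provided by the Sharing panel:

<iframe src="http://localhost:5606/app/kibana#/dashboard/My-Dashboard?embed=true" height="600" width="800"></iframe>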

Adding visualizations to a dashboard

Click Add in the toolbar panel, then select a previously created visualization from the list:

Adding a visualization to the dashboard

You can filter the list of visualizations by typing a filter string into the Visualization Filter field.

The visualization you select appears in a container on your dashboard.

If you see a message about the container’s height or width being too small, resize the container.

Reset all dashboards to their default state

remove all filters

Specific filters, a custom query, or a certain time range can be saved with a dashboard.

If you click the Reset filter icon in the toolbar panel, the temporary filters, queries, and time set on all dashboards are removed, and each dashboard reverts to its default state with the saved filters, query, and time.

Customizing Dashboard Elements

The visualizations in your dashboard are stored in resizable containers that you can arrange on the dashboard. This section discusses customizing these containers.

Moving containers

Click and hold a container’s header to move the container around the dashboard.
Other containers will shift as needed to make room for the moving container.
Release the mouse button to confirm the container’s new location.

Resizing containers

Move the cursor to the bottom right corner of the container until the cursor changes to point at the corner.
After the cursor changes, click and drag the corner of the container to change the container’s size.
Release the mouse button to confirm the new container size.

Removing containers

Click the x icon at the top right corner of a container to remove that container from the dashboard.
Removing a container from a dashboard does not delete the saved visualization in that container.

Viewing detailed information

To display the raw data behind the visualization, click the bar at the bottom of the container. Tabs with detailed information about the raw data replace the visualization, as in this example:

Table

A representation of the underlying data, presented as a paginated data grid. You can sort the items in the table by clicking the table headers at the top of each column.

vis spy table

Request

The raw request used to query the server, presented in JSON format.

vis spy request

Response

The raw response from the server, presented in JSON format.

vis spy response

Statistics

A summary of the statistics related to the request and the response, presented as a data grid. The data grid includes the query duration, the request duration, the total number of records found on the server, and the index pattern used to make the query.

vis spy stats

Debug

A summary of the visualization state (for example, visualization parameters and aggregations) and other details.

vis spy debug

To export the raw data behind the visualization as a comma-separated-values (CSV) file, click either the Raw or Formatted links at the bottom of any of the detailed information tabs. A raw export contains the data as it is stored in Elasticsearch. A formatted export contains the results of any applicable Siren Investigate [field formatters].

Changing the Visualization

Click Edit (Pencil button) at the top right of a container to open the visualization in the Visualize page.

Working with filters

When you create a filter anywhere in Siren Investigate, the filter conditions display in an oval under the search text entry box:

filter sample

Hovering on the filter oval displays the following icons:

filter allbuttons
Enable Filter (fa check square o)

Click this icon to disable the filter without removing it. You can enable the filter again later by clicking the icon again. Disabled filters are displayed with a striped shaded color.

Pin Filter (fa thumb tack mod)

Click this icon to pin a filter. Pinned filters persist across Siren Investigate tabs. You can pin filters from the Visualize tab, click the Discover or Dashboard tabs, and those filters remain in place.

If you have a pinned filter and you are not seeing any query results, check that your current tab’s index pattern is one that the filter applies to. For example, a filter name:giovanni will return 0 results if pinned and therefore "dragged along" to a dashboard whose underlying index does not have a name field, let alone a giovanni value. For this reason, a good pattern in Siren Investigate is to use Dashboard Groups to group together dashboards which are based on the same underlying index. In this case the user can safely pin and "drag along" a filter across dashboards in the same group.
Toggle Filter (fa search minus)

Click this icon to toggle a filter. By default, filters are inclusion filters, and are displayed in green. Only elements that match the filter are displayed. To change this to an exclusion filter, displaying only elements that do not match, toggle the filter. Exclusion filters are displayed in red.

Remove Filter (fa trash)

Click this icon to remove a filter entirely.

Custom Filter (fa pencil square o)

Click this icon to display a text field where you can customize the JSON representation of the filter and specify an alias to use for the filter name, for example, to filter the data to just the companies based in London:

London Companies Filter Example

Adding the London Companies label to the filter displays that label on the filter bar:

London Companies Filter Bar

Omitting the label displays the filter query in the filter bar:

London Companies Filter Bar

You can use a JSON filter representation to implement predicate logic, with should for OR, must for AND, and must_not for NOT:

OR Example

{
  "bool": {
    "should": [
      {
        "term": {
          "geoip.country_name.raw": "Canada"
        }
      },
      {
        "term": {
          "geoip.country_name.raw": "China"
        }
      }
    ]
  }
}

AND Example

{
  "bool": {
    "must": [
      {
        "term": {
          "geoip.country_name.raw": "United States"
        }
      },
      {
        "term": {
          "geoip.city_name.raw": "New York"
        }
      }
    ]
  }
}

NOT Example

{
  "bool": {
    "must_not": [
      {
        "term": {
          "geoip.country_name.raw": "United States"
        }
      },
      {
        "term": {
          "geoip.country_name.raw": "Canada"
        }
      }
    ]
  }
}
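These clauses can also be combined in a single bool query. For example, a sketch that keeps documents from Canada while excluding a specific city (the city value is illustrative):

{
  "bool": {
    "must": [
      {
        "term": {
          "geoip.country_name.raw": "Canada"
        }
      }
    ],
    "must_not": [
      {
        "term": {
          "geoip.city_name.raw": "Toronto"
        }
      }
    ]
  }
}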

Click Done to update the filter with your changes.

See Query DSL documentation for more information on the possibilities.

To apply any of the filter actions to all the filters currently in place, click Actions > Global Filter Actions and select an action.

Dashboard groups

Dashboards can be organized in dashboard groups.

Dashboard Groups Panel

If the dashboard is associated with a saved search, the count of documents on the dashboard is displayed next to the dashboard name. Two additional indicators that may be displayed are:

  • Filters/Queries indicator - the filter icon is displayed if any filters or queries are currently applied to the dashboard

  • Pruned joins indicator - a star symbol is displayed if any of the join operations was pruned.

Edit dashboard groups

Create New Dashboard Group

In the left dashboard panel, you can change the order of the dashboard groups and move dashboards between groups by dragging and dropping.

Dashboard groups can be managed by clicking the Create new group icon in the dashboard panel, and by right clicking the dashboard group name to get Edit and Delete options.

Dashboard Group Edit Panel

In the edit panel, you can change the title of an existing group and set the icon to a custom image by inserting a URL, or use a Font Awesome icon.

Refreshing the search results

You can configure a refresh interval to automatically refresh the page with the latest index data. This periodically resubmits the search query.

When a refresh interval is set, it is displayed to the left of the Time Filter in the menu bar.

To set the refresh interval:

  1. Click Time Filter (Time Filter).

  2. Click the Refresh Interval tab.

  3. Choose a refresh interval from the list.

To automatically refresh the data, click Auto-refresh (fa repeat) when the time picker is open and select an autorefresh interval:

autorefresh intervals

When auto-refresh is enabled, Siren Investigate’s top bar displays a Pause (fa pause) button and the auto-refresh interval.

Timelion

Timelion is a time series data visualizer that enables you to combine totally independent data sources within a single visualization. It’s driven by a simple expression language you use to retrieve time series data, perform calculations to tease out the answers to complex questions, and visualize the results.

For example, Timelion enables you to easily get the answers to questions like:

  • How many pages does each unique user view over time?

  • What’s the difference in traffic volume between this Friday and last Friday?

  • What percent of Japan’s population came to my site today?

  • What’s the 10-day moving average of the S&P 500?

  • What’s the cumulative sum of all search requests received in the last 2 years?  
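For instance, the last question could be sketched with a single expression using Timelion's .cusum() function; the query string is illustrative:

.es(q='request:search').cusum().label('Total search requests')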


Getting started

Ready to experience all that is Timelion? This getting started tutorial shows you how to create time series visualizations, customize and format them, apply mathematical functions and conditional logic, and add the results to a dashboard.

Creating time series visualizations

This tutorial will be using the time series data from Metricbeat to walk you through a number of functions that Timelion offers. To get started, download Metricbeat and follow the instructions here to start ingesting the data locally.

The first visualization you will create will compare the real-time percentage of CPU time spent in user space to the results offset by one hour. In order to create this visualization, we will need to create two Timelion expressions. One with the real-time average of system.cpu.user.pct and another with the average offset by one hour.

To start, you must define an index, timefield and metric in the first expression. Go ahead and enter the following expression into the Timelion query bar.

.es(index=metricbeat-*, timefield='@timestamp', metric='avg:system.cpu.user.pct')
timelion create01

 

Now you need to add another series with data from the previous hour for comparison. To do so, you will have to add an offset argument to the .es() function. offset offsets the series retrieval by a date expression. For this example, you will want to offset the data back one hour, using the date expression -1h. Using a comma to separate the two series, enter the following expression into the Timelion query bar:

.es(index=metricbeat-*, timefield='@timestamp', metric='avg:system.cpu.user.pct'), .es(offset=-1h,index=metricbeat-*, timefield='@timestamp', metric='avg:system.cpu.user.pct')
timelion create02

 

It is a bit hard to differentiate the two series. Customize the labels in order to easily distinguish them. You can always append the .label() function to any expression to add a custom label. Enter the below expression into the Timelion query bar to customize your labels:

.es(offset=-1h,index=metricbeat-*, timefield='@timestamp', metric='avg:system.cpu.user.pct').label('last hour'), .es(index=metricbeat-*, timefield='@timestamp', metric='avg:system.cpu.user.pct').label('current hour')
timelion create03

 

Save the entire Timelion sheet as Metricbeat Example. As a best practice, you should be saving any significant changes made to this sheet as you progress through this tutorial.

Customize and format visualizations

Timelion has plenty of options for customization. You can personalize nearly every aspect of a chart with the functions available. For this tutorial, you will perform the following modifications.

  • Add a title

  • Change a series type

  • Change the color and opacity of a series

  • Modify the legend

In the previous section, you created a Timelion chart with two series. Let’s continue to customize this visualization.

Before making any other modifications, append the .title() function to the end of an expression to add a title with a meaningful name. This will make it much easier for unfamiliar users to understand the visualization's purpose. For this example, add .title('CPU usage over time') to the original series. Use the following expression in your Timelion query bar:

.es(offset=-1h,index=metricbeat-*, timefield='@timestamp', metric='avg:system.cpu.user.pct').label('last hour'), .es(index=metricbeat-*, timefield='@timestamp', metric='avg:system.cpu.user.pct').label('current hour').title('CPU usage over time')
timelion customize01

 

To differentiate the last hour series a bit more, you are going to change the chart type to an area chart. To do so, use the .lines() function to customize the line chart, setting the fill and width arguments to control the fill and the line width respectively. In this example, you will set the fill level to 1 and the width of the border to 0.5 by appending .lines(fill=1,width=0.5). Use the following expression in the Timelion query bar:

.es(offset=-1h,index=metricbeat-*, timefield='@timestamp', metric='avg:system.cpu.user.pct').label('last hour').lines(fill=1,width=0.5), .es(index=metricbeat-*, timefield='@timestamp', metric='avg:system.cpu.user.pct').label('current hour').title('CPU usage over time')
timelion customize02

 

Let’s color these series so that the current hour series pops a bit more than the last hour series. The color() function can be used to change the color of any series and accepts standard color names, hexadecimal values or a color schema for grouped series. For this example you will use .color(gray) for the last hour and .color(#1E90FF) for the current hour. Enter the following expression into the Timelion query bar to make the adjustments:

.es(offset=-1h,index=metricbeat-*, timefield='@timestamp', metric='avg:system.cpu.user.pct').label('last hour').lines(fill=1,width=0.5).color(gray), .es(index=metricbeat-*, timefield='@timestamp', metric='avg:system.cpu.user.pct').label('current hour').title('CPU usage over time').color(#1E90FF)
timelion customize03

 

Last but not least, adjust the legend so that it takes up as little space as possible. You can utilize the .legend() function to set the position and style of the legend. For this example, place the legend in the north west position of the visualization with two columns by appending .legend(columns=2, position=nw) to the original series. Use the following expression to make the adjustments:

.es(offset=-1h,index=metricbeat-*, timefield='@timestamp', metric='avg:system.cpu.user.pct').label('last hour').lines(fill=1,width=0.5).color(gray), .es(index=metricbeat-*, timefield='@timestamp', metric='avg:system.cpu.user.pct').label('current hour').title('CPU usage over time').color(#1E90FF).legend(columns=2, position=nw)
timelion customize04

 

Save your changes and continue on to the next section to learn about mathematical functions.

Using mathematical functions

You have learned how to create and style a Timelion visualization in the previous two sections. This section will explore the mathematical functions Timelion offers. You will continue to use the Metricbeat data to create a new Timelion visualization for inbound and outbound network traffic. To start, you must add a new Timelion visualization to the sheet.

In the top menu, click Add to add a second visualization. When added to the sheet, you will notice that the query bar has been replaced with the default .es(*) expression. This is because the query is associated with the visualization on the Timelion sheet you have selected.

timelion math01

 

To start tracking the inbound / outbound network traffic, your first expression will calculate the maximum value of system.network.in.bytes. Enter the following expression into your Timelion query bar:

.es(index=metricbeat*, timefield=@timestamp, metric=max:system.network.in.bytes)
timelion math02

 

Monitoring network traffic is much more valuable when plotting the rate of change. The .derivative() function does just that: it plots the change in values over time. Simply append .derivative() to the end of an expression. Use the following expression to update your visualization:

.es(index=metricbeat*, timefield=@timestamp, metric=max:system.network.in.bytes).derivative()
timelion math03

 

Now for the outbound traffic. You must add a similar calculation for system.network.out.bytes. Because outbound traffic is leaving your machine, it makes sense to represent this metric as a negative number. The .multiply() function will multiply the series by a number, the result of a series or a list of series. For this example, you will use .multiply(-1) to convert the outbound network traffic to a negative value. Use the following expression to update your visualization:

.es(index=metricbeat*, timefield=@timestamp, metric=max:system.network.in.bytes).derivative(), .es(index=metricbeat*, timefield=@timestamp, metric=max:system.network.out.bytes).derivative().multiply(-1)
timelion math04

 

To make this visualization a bit easier to consume, convert the series from bytes to megabytes. Timelion has a .divide() function that can be used. .divide() accepts the same input as .multiply() and will divide the series by the divisor defined. Use the following expression to update your visualization:

.es(index=metricbeat*, timefield=@timestamp, metric=max:system.network.in.bytes).derivative().divide(1048576), .es(index=metricbeat*, timefield=@timestamp, metric=max:system.network.out.bytes).derivative().multiply(-1).divide(1048576)
timelion math05

 

Utilizing the formatting functions .title(), .label(), .color(), .lines() and .legend() learned in the last section, let’s clean up the visualization a bit. Use the following expression to update your visualization:

.es(index=metricbeat*, timefield=@timestamp, metric=max:system.network.in.bytes).derivative().divide(1048576).lines(fill=2, width=1).color(green).label("Inbound traffic").title("Network traffic (MB/s)"), .es(index=metricbeat*, timefield=@timestamp, metric=max:system.network.out.bytes).derivative().multiply(-1).divide(1048576).lines(fill=2, width=1).color(blue).label("Outbound traffic").legend(columns=2, position=nw)
timelion math06

 

Save your changes and continue on to the next section to learn about conditional logic and tracking trends.

Using conditional logic and tracking trends

In this section you will learn how to modify time series data with conditional logic and create a trend with a moving average. This is helpful to easily detect outliers and patterns over time.

For the purposes of this tutorial, you will continue to use Metricbeat data to add another visualization that monitors memory consumption. To start, use the following expression to chart the maximum value of system.memory.actual.used.bytes.

.es(index=metricbeat-*, timefield='@timestamp', metric='max:system.memory.actual.used.bytes')
timelion conditional01

 

Let’s create two thresholds to keep an eye on the amount of used memory. For the purposes of this tutorial, your warning threshold will be 12.5GB and your severe threshold will be 15GB. When the maximum amount of used memory exceeds either of these thresholds, the series will be colored accordingly.

If the threshold values are too high or low for your machine, adjust accordingly.

To configure these two threshold values, you can utilize Timelion’s conditional logic. In this tutorial you will use if() to compare each point to a number, adjust the styling if the condition evaluates to true and use the default styling if the condition evaluates to false. Timelion offers the following six operator values for comparison.

eq    equal
ne    not equal
lt    less than
lte   less than or equal to
gt    greater than
gte   greater than or equal to

Because there are two thresholds, it makes sense to style them differently. Use the gt operator to color the warning threshold yellow with .color('#FFCC11') and the severe threshold red with .color('red'). Enter the following expression into the Timelion query bar to apply the conditional logic and threshold styling:

.es(index=metricbeat-*, timefield='@timestamp', metric='max:system.memory.actual.used.bytes'), .es(index=metricbeat-*, timefield='@timestamp', metric='max:system.memory.actual.used.bytes').if(gt,12500000000,.es(index=metricbeat-*, timefield='@timestamp', metric='max:system.memory.actual.used.bytes'),null).label('warning').color('#FFCC11'), .es(index=metricbeat-*, timefield='@timestamp', metric='max:system.memory.actual.used.bytes').if(gt,15000000000,.es(index=metricbeat-*, timefield='@timestamp', metric='max:system.memory.actual.used.bytes'),null).label('severe').color('red')
timelion conditional02

 

For additional information on Timelion's conditional capabilities, check out the blog post I have but one .condition().

Now that you have thresholds defined to easily identify outliers, let’s create a new series to determine what the trend really is. Timelion’s .mvavg() function enables you to calculate the moving average over a given window. This is especially helpful for noisy time series. For this tutorial, you will use .mvavg(10) to create a moving average with a window of 10 data points. Use the following expression to create a moving average of the maximum memory usage:

.es(index=metricbeat-*, timefield='@timestamp', metric='max:system.memory.actual.used.bytes'), .es(index=metricbeat-*, timefield='@timestamp', metric='max:system.memory.actual.used.bytes').if(gt,12500000000,.es(index=metricbeat-*, timefield='@timestamp', metric='max:system.memory.actual.used.bytes'),null).label('warning').color('#FFCC11'), .es(index=metricbeat-*, timefield='@timestamp', metric='max:system.memory.actual.used.bytes').if(gt,15000000000,.es(index=metricbeat-*, timefield='@timestamp', metric='max:system.memory.actual.used.bytes'),null).label('severe').color('red'), .es(index=metricbeat-*, timefield='@timestamp', metric='max:system.memory.actual.used.bytes').mvavg(10)
timelion conditional03

 

Now that you have thresholds and a moving average, let’s format the visualization so it is a bit easier to consume. As with the last section, use the .color(), .lines(), .title() and .legend() functions to update your visualization accordingly:

.es(index=metricbeat-*, timefield='@timestamp', metric='max:system.memory.actual.used.bytes').label('max memory').title('Memory consumption over time'), .es(index=metricbeat-*, timefield='@timestamp', metric='max:system.memory.actual.used.bytes').if(gt,12500000000,.es(index=metricbeat-*, timefield='@timestamp', metric='max:system.memory.actual.used.bytes'),null).label('warning').color('#FFCC11').lines(width=5), .es(index=metricbeat-*, timefield='@timestamp', metric='max:system.memory.actual.used.bytes').if(gt,15000000000,.es(index=metricbeat-*, timefield='@timestamp', metric='max:system.memory.actual.used.bytes'),null).label('severe').color('red').lines(width=5), .es(index=metricbeat-*, timefield='@timestamp', metric='max:system.memory.actual.used.bytes').mvavg(10).label('mvavg').lines(width=2).color(#5E5E5E).legend(columns=4, position=nw)
timelion conditional04

 

Save your Timelion sheet and continue on to the next section to add these new visualizations to your dashboard.

Add to dashboard

You have officially harnessed the power of Timelion to create time series visualizations. The final step of this tutorial is to add your new visualizations to a dashboard. This section will show you how to save a visualization from your Timelion sheet and add it to an existing dashboard.

To save a Timelion visualization as a dashboard panel:

  1. Select the visualization you would like to add to one (or multiple) dashboards

  2. Click the Save option in the top menu

  3. Select Save current expression as Kibana dashboard panel

  4. Name your panel and click Save to save as a dashboard visualization

timelion save01

 

Now you can add this dashboard panel to any dashboard you would like. This visualization will now be listed in the Visualize list. Go ahead and follow the same process for the rest of the visualizations you created.

Create a new dashboard or open an existing one to add the Timelion visualizations as you would any other visualization.

timelion save02

 

You can also create time series visualizations right from the Visualize app: just select the Timeseries visualization type and enter a Timelion expression in the expression field.

Inline help and documentation

Cannot remember a function or looking for a new one? You can always reference the inline help and documentation in Timelion.

Documentation for the Timelion expression language is built-in. Click Docs in the top menu to view the available functions and access the inline reference. As you start to enter functions in the query bar, Timelion will display the relevant arguments in real time.

Timelion inline help

Siren Alert

The Siren Alert infrastructure is integrated in the Siren Platform but can also be used standalone.

When used standalone, it is called Sentinl.

In this section we will refer to Sentinl as a synonym of Siren Alert.

Alerting and reporting

Watching your data, 24/7/365


Siren Alert extends Siren Investigate and Kibana with Alerting and Reporting functionality to monitor, notify, and report on data series changes using standard queries, programmable validators, and a variety of configurable actions. Think of it as a free and independent Watcher and Reporting alternative, further extended and expanded by the unique Siren Investigate features.

Siren Alert is also designed to simplify the process of creating and managing alerts and reports in Siren Investigate/Kibana through its integrated app and Spy integration.


New to Siren Alert and watchers?

No problem. Check out our Siren Alert presentation to get started off right!

Introduction

What is Siren Alert?

Siren Alert is an App Plugin extending Kibana or Siren Investigate with dynamic Alerting and Reporting functionality.

"Designed to monitor, validate and inform users and systems on data series changes using standard or join queries, programmable result validators, transformers and templates to send out notifications using a variety of configurable actions reaching users, interfacing with remote APIs for data and commands, generating new Elasticsearch documents, arbitrary metrics towards any other platform, planting triggers for itself to use and so much more."

Siren Alert compared to X-Pack

Siren Alert provides X-Pack-like Alerting and Reporting functionality directly within Siren Investigate and Kibana in the form of a powerful plugin, leveraging all available native features such as a secure client for queries and extending the UI with tools for managing configuration, scheduling and handling executions of user Alerts and Reports.

Siren Alert is also transparent to the Elasticsearch cluster(s) it monitors, appearing as a regular client and requiring no complex installation, restarts and no licensing fees.

Powered by the many I/O modules the Node.js community offers, Siren Alert usage is not limited to Elasticsearch and its capabilities can easily be extended to fully interface with third-party data sources and platforms for data ingress and egress.

What is a watcher?

Siren Alert enables automation of recurring "questions" (as queries) by using Watchers.

Some Examples for illustration:

  • HIT COUNT PER HOUR

  • QUESTION: How many hits does index X receive hourly?

  • WATCHER: query index and return count of hits in last hour

  • ACTION: Notify with number of Hits per hour

  • METRIC THRESHOLDS

  • QUESTION: Are any of my monitored metrics surpassing a certain value?

  • WATCHER: query index and type for specific values, aggregated by an arbitrary field.

  • ACTION: Notify with aggs bucket details every time a threshold is surpassed or spike anomaly detected.

  • BLACKLISTS HITS

  • QUESTION: Are any of my users trying to reach blacklisted destinations?

  • WATCHER: query firewall logs comparing destination IPs to a blacklist.

  • ACTION: Notify admin using email if >= 10 matching IPs are returned

  • FAILED LOGINS

  • QUESTION: Are there recurring failure attempts authenticating users on my network?

  • WATCHER: query Active Directory logs for login failures in the last hour and compare to the user index.

  • ACTION: Notify admin using webhook if >= 10 matches returned

  • LEAK DETECTION (chain)

  • QUESTION: Are there any public leaks about my data I was not aware of?

  • WATCHER: query for user emails included in published leaks ingested from third parties.

  • ACTION: Save hits in a secondary result index. Notify using email from a secondary Watcher if the leak was not previously known


Installation

Libraries to install

Debian, Ubuntu

sudo apt-get install libfontconfig libfreetype6

CentOS

sudo yum install fontconfig freetype

Kibana 4.x

Snapshot plugin install

Browse to our releases and choose the relevant version, for example tag-4.6.4-4, to use for installing the plugin:

/opt/kibana/bin/kibana plugin --install sentinl -u https://github.com/sirensolutions/sentinl/releases/download/tag-4.6.4-4/sentinl.zip

Dev plugin install

git clone https://github.com/sirensolutions/sentinl
cd sentinl && npm install && npm run package
/opt/kibana/bin/kibana plugin --install sentinl -u file://`pwd`/sentinl-latest.tar.gz

Kibana 5.x

There are two ways to install Siren Alert:

User mode

  1. Look at the Siren Alert releases, find the release that matches your Kibana version, then find the .zip package that matches the Kibana subversion and copy its URL. For example https://github.com/sirensolutions/sentinl/releases/download/tag-5.6.2/sentinl-v5.6.5.zip

  2. Go to the Kibana folder: cd kibana

  3. Install Siren Alert:

    ./bin/kibana-plugin install https://github.com/sirensolutions/sentinl/releases/download/tag-5.6.2/sentinl-v5.6.5.zip
  4. Start Kibana: ./bin/kibana

Developer mode

  1. Ensure you have the correct Node.js version to run your Kibana: cat kibana/.node_version

  2. Clone the Siren Alert repository: git clone https://github.com/sirensolutions/sentinl.git

  3. Go to the Siren Alert folder: cd sentinl

  4. Install packages: npm install

  5. Look at the available branches: git branch -a

  6. Find a branch which matches your Kibana version, for example branch-5.6

  7. Check out this branch: git checkout branch-5.6

  8. Ensure the subversion matches the Kibana subversion: grep version package.json. If necessary, correct the subversion in package.json to match the Kibana subversion; for example, it should read 5.6.5 if Kibana is 5.6.5.

  9. Install Siren Alert and leave gulp running to live-sync code changes: gulp dev --kibanahomepath=/path/to/kibana

  10. Open a new terminal or shell session.

  11. Go to the Kibana folder: cd kibana

  12. Start Kibana: npm start


NEXT: Proceed with configuration.

Siren Investigate Config

Siren Alert Configuration: yaml

Siren Alert is configured using parameters in the main Siren Investigate (or Kibana) YAML file.

By default, all actions are disabled and will only produce log entries. To enable one or more actions, configure the required parameters on each, and set the active flag.

Siren Alert before v5.6

Example (minimal)
sentinl:
  settings:
    email:
      active: true
      user: smtp_username
      password: smtp_password
      host: smtp.server.com
      ssl: true
    report:
      active: true
      tmp_path: /tmp/
Example (extended)
sentinl:
  es:
    host: localhost
    port: 9200
    timefield: '@timestamp'
    default_index: watcher
    type: sentinl-watcher
    alarm_index: watcher_alarms
    alarm_type: sentinl-alarm
    script_type: sentinl-script
  sentinl:
    history: 20
    results: 50
    scriptResults: 50
  settings:
    email:
      active: false
      user: username
      password: password
      host: smtp.server.com
      ssl: true
      timeout: 10000  # mail server connection timeout
    slack:
      active: false
      username: username
      hook: 'https://hooks.slack.com/services/<token>'
      channel: '#channel'
    webhook:
      active: false
      method: POST
      host: host
      port: 9200
      path: ':/{{payload.watcher_id}}'
      body: '{{payload.watcher_id}}:{{payload.hits.total}}'
    report:
      active: false
      tmp_path: /tmp/
      search_guard: false
      simple_authentication: false
    pushapps:
      active: false
      api_key: '<pushapps API Key>'

Siren Alert v5.6+

Example (minimal)
sentinl:
  settings:
    email:
      active: true
      user: smtp_username
      password: smtp_password
      host: smtp.server.com
      ssl: true
    report:
      active: true
      executable_path: '/usr/bin/chromium' # path to Chrome v59+ or Chromium v59+
Example (extended)
sentinl:
  es:
    host: 'localhost'
    port: 9200
    # protocol: 'http'
    # results: 50
    # timefield: '@timestamp'
    # default_type: 'doc'
    # alarm_index: 'watcher_alarms'
    # alarm_type: 'sentinl-alarm'
  settings:
    email:
      active: true
      host: 'localhost'
      # user: 'admin'
      # password: 'password'
      # port: 25
      # domain: 'beast.com'
      # ssl: false
      # tls: false
      # authentication: ['PLAIN', 'LOGIN', 'CRAM-MD5', 'XOAUTH2']
      # timeout: 10000  # mail server connection timeout
      # cert:
      #   key: '/full/sys/path/to/key/file'
      #   cert: '/full/sys/path/to/cert/file'
      #   ca: '/full/sys/path/to/ca/file'
    slack:
      active: false
      username: 'username'
      hook: 'https://hooks.slack.com/services/<token>'
      channel: '#channel'
    webhook:
      active: false
      host: 'localhost'
      port: 9200
      # use_https: false
      # path: ':/{{payload.watcher_id}}'
      # body: '{{payload.watcher_id}}:{{payload.hits.total}}'
      # method: POST
    report:
      active: true
      executable_path: '/usr/bin/chromium' # path to Chrome v59+ or Chromium v59+
      timeout: 5000
      # authentication:
      #   enabled: true
      #   mode:
      #     searchguard: false
      #     xpack: false
      #     basic: false
      #     custom: true
      #   custom:
      #     username_input_selector: '#username'
      #     password_input_selector: '#password'
      #     login_btn_selector: '#login-btn'
      # file:
      #   pdf:
      #     format: 'A4'
      #     landscape: true
      #   screenshot:
      #     width: 1280
      #     height: 900
    pushapps:
      active: false
      api_key: '<pushapps API Key>'

Tutorial

This tutorial illustrates a working example of Siren Alert for alerting.

WARNING: This guide is a work-in-progress and should not be used as-is in production!

Requirements

  • Elasticsearch + Siren Investigate or Kibana 5.x

  • shell + curl to execute commands

Setup

Before starting, download and install the latest dev version of the plugin using the README instructions.

Dataset

To illustrate the logic and elements involved with Siren Alert, we will generate some random data and insert it into Elasticsearch. Our sample JSON object will report a UTC @timestamp and a mos value for each interval:
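For example, a single generated document has this shape (values produced by the script below):

{"mos":3,"@timestamp":"2016-08-02T11:41:00.000"}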

The following BASH script will produce our entries for a realistic example:

#!/bin/bash
INDEX=`date +"%Y.%m.%d"`
SERVER="http://127.0.0.1:9200/mos-$INDEX/mos/"

echo "Press [CTRL+C] to stop.."
while :
do
    header="Content-Type: application/json"
    timestamp=`TZ=UTC date +"%Y-%m-%dT%T.%3N"`
    mos=$(( ( RANDOM % 5 )  + 1 ))
    mystring="{\"mos\":${mos},\"@timestamp\":\"${timestamp}\"}"
    echo $mystring;
    curl -sS -i -XPOST -H "$header" -d "$mystring" "$SERVER"
    sleep 5
done
  • Save the file as elasticgen.sh and execute it for a few minutes
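For example, make it executable and run it:

chmod +x elasticgen.sh
./elasticgen.sh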

Watcher rule

To illustrate the trigger logic, we will create an alert for an aggregation against the data we just created.

The basic Siren Alert example will use simple parameters:

  • Run every 60 seconds

  • Target the daily mos-* index with query aggregation

  • Trip condition when aggregations.avg.value < 3

  • Email action with details

curl -XPUT -H 'Content-Type: application/json' http://127.0.0.1:9200/watcher/watch/mos -d'
{
  "trigger": {
    "schedule" : { "later" : "every 1 minute"  }
  },
  "input" : {
    "search" : {
      "request" : {
        "indices" : [ "<mos-{now/d}>", "<mos-{now/d-1d}>"  ],
        "body" : {
          "query" : {
            "filtered" : {
              "query": {
                "query_string": {
                  "query": "mos:*",
                  "analyze_wildcard": true
                }
              },
              "filter" : { "range" : { "@timestamp" : { "from" : "now-5m"  } } }
            }
          },
           "aggs": {
             "avg": {
               "avg": {
                 "field": "mos"
               }
             }
           }
        }
      }
    }
  },
  "condition" : {
    "script" : {
      "script" : "payload.aggregations.avg.value < 3"
    }
  },
  "transform" : {},
  "actions" : {
    "email_admin" : {
    "throttle_period" : "15m",
    "email" : {
      "to" : "mos@qxip.net",
      "from" : "sirenalert@qxip.net",
      "subject" : "Low MOS Detected: {{payload.aggregations.avg.value}} ",
      "priority" : "high",
      "body" : "Low MOS Detected:\n {{payload.aggregations.avg.value}} average with {{payload.aggregations.count.value}} measurements in 5 minutes"
    }
    }
  }
}'

Extending logic

The basic Watcher can be extended and improved following the same logic used with the stock Watcher, for example by using a transform to insert detections back into Elasticsearch. An interesting set of examples is available here

Alarm triggering

Siren Alert automatically fetches and schedules jobs, executing the watcher queries according to the trigger.schedule parameter and validating their results according to the provided condition.script.

Check output

Assuming all data and scripts are correctly executed, you should start seeing output in the Siren Alert Alarms tab and in the Elasticsearch watcher_alarms-{year-month-date} index.
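For example, you can inspect the alarm index directly with curl (assuming Elasticsearch on 127.0.0.1:9200 as in this tutorial):

curl -sS 'http://127.0.0.1:9200/watcher_alarms-*/_search?pretty&size=5'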

Watcher

Watcher anatomy

A Siren Alert watcher is created using the following structure:

➔ Trigger Schedule

When and How to run the Watcher

➔ Input Query

What Query or Join Query to Execute

➔ Condition

How to conditionally Analyze Response

➔ Transform

How to Adapt or Post-Process data

➔ Actions

How to Notify users about this event

Trigger schedule

The schedule defines a set of constraints that must be met to execute a saved watcher. Any number of constraints can be added to a single schedule, and multiple rules can be combined to achieve complex intervals, programmed as simple text expressions based on the Node.js later module.

watcher anatomy

Interval exceptions can also be defined as follows:

every 2 hours except after 20th hour
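For reference, schedule expressions used by watchers elsewhere in this guide include:

every 50 seconds
every 5 minutes
every 1 hours
on the first day of the week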

Input query

The input parameter is the key element of a watcher: it defines the dynamic date-range index query feeding the processing pipeline. The input field accepts any standard Elasticsearch query, including server-side scripts in supported languages, and fully supports the Siren Join capabilities out of the box.

"input": {
  "search": {
    "request": {
      "index": [
        "<mos-{now/d}>",
        "<mos-{now/d-1d}>"
      ],
      "body": {}
    }
  }
}

Condition

The condition block is the "entry gate" into the processing pipeline of a Watcher and determines its triggered status.

  • On true condition, the pipeline will proceed further.

  • On false condition, the pipeline will stop (no action will be executed) until its next invocation.

Never condition

Use the never condition to set the condition to false. This means the watch actions are never executed when the watch is triggered. Nevertheless, the watch input is executed. This condition is used for testing. There are no attributes to specify for the never condition.

condition: {
  "never" : {}
}
Compare condition

Use the compare condition to perform a simple comparison against a value in the watch payload.

condition: {
  "compare" : {
    "payload.hits.total" : {
      "gte" : 5
    }
  }
}

Comparison operators (apply to numeric, string and date)

eq

Returns true when the resolved value equals the given one

not_eq

Returns true when the resolved value does not equal the given one

lt

Returns true when the resolved value is less than the given one

lte

Returns true when the resolved value is less than or equal to the given one

gt

Returns true when the resolved value is greater than the given one

gte

Returns true when the resolved value is greater than or equal to the given one

Array compare condition

Use array_compare to compare an array of values. For example, the following array_compare condition returns true if there is at least one bucket in the aggregation that has a doc_count greater than or equal to 25:

"condition": {
  "array_compare": {
    "payload.aggregations.top_amounts.buckets" : {
      "path": "doc_count" ,
      "gte": {
        "value": 25,
      }
    }
  }
}

Options

array.path

The path to the array in the execution context, specified in dot notation

array.path.path

The path to the field in each array element that you want to evaluate

array.path.operator.quantifier

How many matches are required for the comparison to evaluate to true: some or all. Defaults to some, meaning there must be at least one match. If the array is empty, the comparison evaluates to false (see the sketch after this list)

array.path.operator.value

The value to compare against
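For example, a sketch requiring every bucket to match would place the quantifier inside the operator object, following the option paths above (the exact key name is an assumption based on array.path.operator.quantifier):

"condition": {
  "array_compare": {
    "payload.aggregations.top_amounts.buckets": {
      "path": "doc_count",
      "gte": {
        "value": 25,
        "quantifier": "all"
      }
    }
  }
}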

Script condition

A condition that evaluates a script. The scripting language is JavaScript. It can be as simple as an expression checking a boolean condition or a counter.

condition: {
  "script": {
    "script": "payload.hits.total > 100"
  }
}

Also, it can be as complex as an aggregation parser to filter buckets.

condition: {
  "script": {
    "script": "payload.newlist=[];var match=false;var threshold=10;var start_level=2;var finish_level=3;var first=payload.aggregations[start_level.toString()];function loop_on_buckets(element,start,finish,upper_key){element.filter(function(obj){return obj.key;}).forEach( function ( bucket ) { if (start == finish - 1) { if (bucket.doc_count >= threshold) { match=true;payload.newlist.push({line: upper_key + bucket.key + ' ' + bucket.doc_count}); } } else { loop_on_buckets(bucket[start + 1].buckets, start + 1, finish, upper_key + ' ' + bucket.key); } }); } var upper_key = ''; loop_on_buckets(first.buckets, start_level, finish_level, upper_key);match;"
  }
}
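Exploded for readability, the same one-liner script reads:

payload.newlist = [];
var match = false;
var threshold = 10;
var start_level = 2;
var finish_level = 3;
var first = payload.aggregations[start_level.toString()];
// walk nested terms aggregations from start_level to finish_level
function loop_on_buckets(element, start, finish, upper_key) {
  element.filter(function (obj) { return obj.key; }).forEach(function (bucket) {
    if (start == finish - 1) {
      // innermost level: keep buckets above the threshold
      if (bucket.doc_count >= threshold) {
        match = true;
        payload.newlist.push({ line: upper_key + bucket.key + ' ' + bucket.doc_count });
      }
    } else {
      loop_on_buckets(bucket[start + 1].buckets, start + 1, finish, upper_key + ' ' + bucket.key);
    }
  });
}
var upper_key = '';
loop_on_buckets(first.buckets, start_level, finish_level, upper_key);
match;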

Anomaly detection

Simple anomaly finder based on the three-sigma rule of thumb.

  1. Dynamic detection of outliers/peaks/drops

    {
      "script": {
        "script": "payload.hits.total > 0"
      },
      "anomaly": {
        "field_to_check": "fieldName"
      }
    }
  2. Static detection for known ranges/interrupts

    {
      "script": {
        "script": "payload.hits.total > 0"
      },
      "anomaly": {
        "field_to_check": "fieldName",
        "normal_values": [
          5,
          10,
          15,
          20,
          25,
          30
        ]
      }
    }

Range filtering

Use range filtering to get documents whose value lies between two bounds. For example, with min 50, max 150 and tolerance 5, only documents with an Amount field between 45 and 155 are kept.

{
  "script": {
    "script": "payload.hits.total > 0"
  },
  "range": {
    "field_to_check": "Amount",
    "min": 50,
    "max": 150,
    "tolerance": 5
  }
}

Transform

A transform processes and changes the payload in the watch execution context to prepare it for the watch actions. No actions are executed if the payload is empty after transform processing.

Search transform

A transform that executes a search on the cluster and replaces the current payload in the watch execution context with the returned search response.

"transform": {
  "search": {
    "request": {
      "index": [
        "credit_card"
      ],
      "body": {
        "size": 300,
        "query": {
          "bool": {
            "must": [
              {
                "match": {
                  "Class": 1
                }
              }
            ]
          }
        }
      }
    }
  }
}

Script transform

A transform that executes a script (JavaScript) on the current payload and replaces it with a newly generated one.

Use it for:

  • converting format types

  • generating brand new payload keys

  • interpolating data

Create new payload property:

"transform": {
  "script": {
    "script": "payload.outliers = payload.aggregations.response_time_outlier.values['95.0']"
  }
}

Filter aggregation buckets:

"transform": {
  "script": {
    "script": "payload.newlist=[]; payload.payload.aggregations['2'].buckets.filter(function( obj ) { return obj.key; }).forEach(function(bucket){ console.log(bucket.key); if (doc_count.length > 1){ payload.newlist.push({name: bucket.key }); }});"
  }
}

Chain transform

A transform that executes an ordered list of configured transforms in a chain, where the output of one transform serves as the input of the next transform in the chain.

"transform": {
  "chain": [
    {
      "search": {
        "request": {
          "index": [
            "credit_card"
          ],
          "body": {
            "size": 300,
            "query": {
              "bool": {
                "must": [
                  {
                    "match": {
                      "Class": 1
                    }
                  }
                ]
              }
            }
          }
        }
      }
    },
    {
      "script": {
        "script": "payload.hits.total > 100"
      }
    }
  ]
}

Actions

Actions are used to deliver any results obtained by a Watcher to users, APIs or new documents in the cluster. Multiple actions and groups can be defined for each watcher.

Actions use the {{ mustache }} logic-less template syntax, and work by iterating arrays and expanding tags in a template using values provided in the response payload.
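For example, the following hypothetical action body (aggregation names assumed, in the style of the _bulk examples later in this guide) iterates an aggregation's buckets and expands each bucket's key and doc_count from the payload:

"body": "{{#payload.aggregations.metrics.buckets}}{{key}}: {{doc_count}}\n{{/payload.aggregations.metrics.buckets}}"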

A dedicated page is available with supported actions.


Full watcher example

{
  "_index": "watcher",
  "_type": "watch",
  "_id": "new",
  "_source": {
    "trigger": {
      "schedule": {
        "later": "every 5 minutes"
      }
    },
    "input": {
      "search": {
        "request": {
          "index": [
            "<mos-{now/d}>",
            "<mos-{now/d-1d}>"
          ],
          "body": {}
        }
      }
    },
    "condition": {
      "script": {
        "script": "payload.hits.total > 100"
      }
    },
    "transform": {
      "script": {
        "script": "payload.hits.total += 100"
      }
    },
    "actions": {
      "email_admin": {
        "throttle_period": "15m",
        "email": {
          "to": "alarm@localhost",
          "subject": "Siren Alert Alarm",
          "priority": "high",
          "body": "Found {{payload.hits.total}} Events"
        }
      },
      "slack_admin": {
        "throttle_period": "15m",
        "slack": {
          "channel": "#kibi",
          "message": "Siren Alert Alert! Found {{payload.hits.total}} Events"
        }
      }
    }
  }
}

Actions

Currently supported "actions" for Siren Alert watchers:

Email

Send Query results and message using Email/SMTP

"email" : {
  "to" : "root@localhost",
  "from" : "sirenalert@localhost",
  "subject" : "Alarm Title",
  "priority" : "high",
  "body" : "Series Alarm {{ payload._id}}: {{payload.hits.total}}",
  "stateless" : false
}

Email HTML

Send Query results and message using Email/SMTP using HTML body

"email_html" : {
  "to" : "root@localhost",
  "from" : "sirenalert@localhost",
  "subject" : "Alarm Title",
  "priority" : "high",
  "body" : "Series Alarm {{ payload._id}}: {{payload.hits.total}}",
  "html" : "<p>Series Alarm {{ payload._id}}: {{payload.hits.total}}</p>",
  "stateless" : false
}

webHook

Deliver message to remote web API

"webhook" : {
  "method" : "POST",
  "host" : "remote.server",
  "port" : 9200,
  "path": ":/{{payload.watcher_id}}",
  "body" : "{{payload.watcher_id}}:{{payload.hits.total}}",
  "create_alert" : true
}

webHook using Proxy

Deliver message to remote API using Proxy - Telegram example:

"webhook": {
  "method": "POST",
  "host": "remote.proxy",
  "port": "3128",
  "path": "https://api.telegram.org/bot{botId}/sendMessage",
  "body": "chat_id={chatId}&text=Count+total+hits:%20{{payload.hits.total}}",
  "headers": {
    "Content-Type": "application/x-www-form-urlencoded"
  },
  "create_alert" : true
}

Slack

Deliver message to a Slack channel

"slack" : {
  "channel": "#channel",
  "message" : "Series Alarm {{ payload._id}}: {{payload.hits.total}}",
  "stateless" : false
}

Report (BETA)

Take a website Snapshot using PhantomJS and send it using Email/SMTP

  • Requires action settings in Siren Investigate configuration

  • Requires Pageres/PhantomJS: npm install -g pageres

"report" : {
  "to" : "root@localhost",
  "from" : "kaae@localhost",
  "subject" : "Report Title",
  "priority" : "high",
  "body" : "Series Report {{ payload._id}}: {{payload.hits.total}}",
  "snapshot" : {
    "res" : "1280,900",
    "url" : "http://127.0.0.1/app/kibana#/dashboard/Alerts",
    "path" : "/tmp/",
    "params" : {
      "username" : "username",
      "password" : "password",
      "delay" : 5000,
      "crop" : false
    }
  },
  "stateless" : false
}

Console

Output Query results and message to Console

"console" : {
  "priority" : "DEBUG",
  "message" : "Average {{payload.aggregations.avg.value}}"
}

Watcher controllers

The following controls are presented when listing existing Watchers:

watcher controllers

  1. Expand / Edit a Watcher

  2. Execute a Watcher Manually

  3. Delete a Watcher

  4. Disable / Enable a Watcher

Examples

Watchers can be as simple or complex as the query and aggregations they use. Here are some examples to get started with.

Siren Alert: hit watcher example:

{
  "_index": "watcher",
  "_type": "watch",
  "_id": "new",
  "_source": {
    "trigger": {
      "schedule": {
        "later": "every 5 minutes"
      }
    },
    "input": {
      "search": {
        "request": {
          "index": [
            "<mos-{now/d}>",
            "<mos-{now/d-1d}>"
          ],
          "body": {}
        }
      }
    },
    "condition": {
      "script": {
        "script": "payload.hits.total > 100"
      }
    },
    "transform": {},
    "actions": {
      "email_admin": {
        "throttle_period": "15m",
        "email": {
          "to": "alarm@localhost",
          "from": "sirenalert@localhost",
          "subject": "Siren Alert Alarm",
          "priority": "high",
          "body": "Found {{payload.hits.total}} Events"
        }
      },
      "slack_admin": {
        "throttle_period": "15m",
        "slack": {
          "channel": "#kibi",
          "message": "Siren Alert Alert! Found {{payload.hits.total}} Events"
        }
      }
    }
  }
}

Siren Alert: transform example (es 2.x):

{
  "_index": "watcher",
  "_type": "watch",
  "_id": "95th",
  "_score": 1,
  "_source": {
    "trigger": {
      "schedule": {
        "later": "every 5 minutes"
      }
    },
    "input": {
      "search": {
        "request": {
          "index": [
            "<access-{now/d}>",
            "<access-{now/d-1d}>"
          ],
          "body": {
            "size": 0,
            "query": {
              "filtered": {
                "query": {
                  "query_string": {
                    "analyze_wildcard": true,
                    "query": "*"
                  }
                },
                "filter": {
                  "range": {
                    "@timestamp": {
                      "from": "now-5m"
                    }
                  }
                }
              }
            },
            "aggs": {
              "response_time_outlier": {
                "percentiles": {
                  "field": "response_time",
                  "percents": [
                    95
                  ]
                }
              }
            }
          }
        }
      }
    },
    "condition": {
      "script": {
        "script": "payload.aggregations.response_time_outlier.values['95.0'] > 200"
      }
    },
    "transform": {
      "script": {
        "script": "payload.myvar = payload.aggregations.response_time_outlier.values['95.0']"
      }
    },
    "actions": {
      "email_admin": {
        "throttle_period": "15m",
        "email": {
          "to": "username@mycompany.com",
          "from": "sirenalert@mycompany.com",
          "subject": "Siren Alert ALARM {{ payload._id }}",
          "priority": "high",
          "body": "Series Alarm {{ payload._id}}: {{ payload.myvar }}"
        }
      }
    }
  }
}

Siren Alert: insert back to Elasticsearch using _bulk (via nginx or direct)

{
 "_index": "watcher",
 "_type": "watch",
 "_id": "surprise",
 "_score": 1,
 "_source": {
   "trigger": {
     "schedule": {
       "later": "every 50 seconds"
     }
   },
   "input": {
     "search": {
       "request": {
         "index": "my-requests-*",
         "body": {
           "query": {
             "filtered": {
               "query": {
                 "query_string": {
                   "query": "*",
                   "analyze_wildcard": true
                 }
               },
               "filter": {
                 "range": {
                   "@timestamp": {
                     "from": "now-5m"
                   }
                 }
               }
             }
           },
           "size": 0,
           "aggs": {
             "metrics": {
               "terms": {
                 "field": "first_url_part"
               }
             }
           }
         }
       }
     }
   },
   "condition": {
     "script": {
       "script": "payload.hits.total > 1"
     }
   },
   "transform": {},
   "actions": {
     "ES_bulk_request": {
       "throttle_period": "1m",
       "webhook": {
         "method": "POST",
         "host": "elasticsearch.foo.bar",
         "port": 80,
         "path": ":/_bulk",
         "body": "{{#payload.aggregations.metrics.buckets}}{\"index\":{\"_index\":\"aggregated_requests\", \"_type\":\"data\"}}\n{\"url\":\"{{key}}\", \"count\":\"{{doc_count}}\", \"execution_time\":\"tbd\"}\n{{/payload.aggregations.metrics.buckets}}",
         "headers": {
           "Content-Type": "text/plain; charset=ISO-8859-1"
         },
         "create_alert": true
       }
     }
   }
 }
}

Wizard

Siren Alert provides a built-in wizard to assist in forming proper watchers using a step-by-step sequence.

Step 1: New Watcher

The first step is to give our Watcher a name and choose an execution frequency.

Step 2: Input Query

The input query is the focal part of our watcher. Make sure time-range fields are dynamic.

Step 3: Condition

Condition is used as a gate to validate whether the results received back are worth processing.

Step 4: Transform

Our data may need adjustments or post-processing. Process the payload using a JavaScript expression or script.

Step 5: Actions

Our data is ready. Let’s form a notification using the mustache templating language.

Step 6: Expert Mode

Here’s our fully formed Siren Alert JSON watcher in its naked beauty…

Authentication

Siren Alert supports authentication via Search Guard. There are several options available.

Authenticate search request

Kibana’s Elasticsearch basic authentication credentials are used to authenticate the search request.

Valid certificate

sentinl:
  settings:
    authentication:
      enabled: true
      username: 'elastic'
      password: 'password'
      cert:
        selfsigned: false
        pem: '/path/to/pem/key'

Self-signed certificate

sentinl:
  settings:
    authentication:
      enabled: true
      username: 'elastic'
      password: 'password'
      cert:
        selfsigned: true

Siren Platform (former Siren Investigate)

Authenticate Siren Alert using a single user, the default sentinl user from the Access Control app. For example, the default investigate.yml:


# Access Control configuration
investigate_access_control:
  enabled: true
  cookie:
    password: "12345678123456781234567812345678"
  admin_role: kibiadmin
  sentinl:
    elasticsearch:
      username: sentinl
      password: password
...

Kibana or Siren Platform

It is possible to create multiple user credentials and assign these credentials to watchers, one credential per watcher, thereby authenticating each watcher separately. This is called impersonation.

  1. Create credentials in Search Guard or X-Pack and assign the permissions you need.

    • You need one user for Sentinl and one user per watcher.

  2. Set Sentinl authentication

    sentinl:
      settings:
        authentication:
          enabled: true
          impersonate: true
          username: 'elastic'
          password: 'password'
          sha: '6859a748bc07b49ae761f5734db66848'
          cert:
            selfsigned: true
    • Set the password as clear text in the password property. Alternatively, the password can be stored in encrypted form: set the password hash in the sha property, after which you can remove the password option.

    • Use the sentinl/scripts/encryptPassword.js script to obtain the hash (see the example invocation after this list). Edit the value of the plainTextPassword variable, replacing admin with your password. Copy the generated hash and paste it as the sha value. You can also change the password hashing complexity by setting options inside encryption. The Node.js crypto library is used to hash and decrypt the user password.

  3. Set watcher authentication

    • Create a sha hash of the watcher password using encryptPassword.js. Put it into the password input field and the username into the username field. Note that these fields are visible only when impersonation is enabled (impersonate: true). The fields are write-only: you can insert credentials but cannot read them back, which prevents other Sentinl admins from seeing the credentials.

watcher authentication
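A hypothetical invocation of the helper script (assuming Node.js is on the PATH and you run it from the Sentinl source folder after editing plainTextPassword; the exact output format may differ):

cd sentinl
node scripts/encryptPassword.js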

Authenticate report

Both username and password should be set in the report action in UI.

Kibana configuration for Siren Alert before v5.6

Search Guard

Put the following configuration inside investigate.yml:

sentinl:
  settings:
    report:
      active: true
      search_guard: true
Basic authentication

Put the following configuration inside investigate.yml:

sentinl:
  settings:
    report:
      active: true
      simple_authentication: true

Kibana configuration for Siren Alert v5.6+

Search Guard
sentinl:
  settings:
    report:
      active: true
      authentication:
        enabled: true
        mode:
          searchguard: true
X-Pack
sentinl:
  settings:
    report:
      active: true
      authentication:
        enabled: true
        mode:
          xpack: true
Basic
sentinl:
  settings:
    report:
      active: true
      authentication:
        enabled: true
        mode:
          basic: true
Custom
sentinl:
  settings:
    report:
      active: true
      authentication:
        enabled: true
        mode:
          custom: true
        custom: # you have to replace the following selectors with selectors found on your login page
          username_input_selector: '#username'
          password_input_selector: '#password'
          login_btn_selector: '#login-btn'

How-to

Manual in dashboard

In order to display Siren Alert alarms in Siren Investigate/Kibana:

  • Switch to Discover tab

  • Create and Save a search table for watcher_alarms-* with any desired columns

  • Switch to Dashboard tab

  • Add a new Search Widget using the Search tab and selecting the saved query


Query aggregations watcher for Nagios NRDP

In this example, we will configure a Siren Alert (or Elastic) Watcher to stream statuses to an external Nagios NRDP endpoint.

1. Query Request

Let’s run an aggregation query in Sense to find low-MOS groups in the last 5-minute interval:

GET _search
{
  "query": {
    "filtered": {
      "query": {
        "query_string": {
          "query": "_type:metrics_calls_total_mos AND tab:mos",
          "analyze_wildcard": true
        }
      },
      "filter": {
        "bool": {
          "must": [
            {
              "range": {
                "@timestamp": {
                  "gte": "now-5m",
                  "lte": "now"
                }
              }
            },
            {
              "range" : {
                "value" : {
                  "lte" : 3
                }
              }
            }
          ],
          "must_not": []
        }
      }
    }
  },
  "size": 0,
  "aggs": {
    "mos": {
      "date_histogram": {
        "field": "@timestamp",
        "interval": "30s",
        "time_zone": "Europe/Berlin",
        "min_doc_count": 1
      },
      "aggs": {
        "by_group": {
          "terms": {
            "field": "group.raw",
            "size": 5,
            "order": {
              "_term": "desc"
            }
          },
          "aggs": {
            "avg": {
              "avg": {
                "field": "value"
              }
            }
          }
        }
      }
    }
  }
}

2. Query Response

The response should look similar to this example - let’s analyze the data structure:

{
  "took": 5202,
  "timed_out": false,
  "_shards": {
    "total": 104,
    "successful": 104,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "mos": {
      "buckets": [
        {
          "key_as_string": "2016-08-02T13:41:00.000+02:00",
          "key": 1470138060000,
          "doc_count": 2,
          "by_group": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "domain1.com",
                "doc_count": 2,
                "avg": {
                  "value": 1.85
                }
              }
            ]
          }
        },
        {
          "key_as_string": "2016-08-02T13:42:00.000+02:00",
          "key": 1470138120000,
          "doc_count": 1,
          "by_group": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "domain2.com",
                "doc_count": 1,
                "avg": {
                  "value": 2.81
                }
              }
            ]
          }
        }
      ]
    }
  }
}

3. Watcher Query

Next, let’s use Sense to create a custom Siren Alert Watcher based on the query and its response, using mustache syntax to loop through the aggregation buckets and extract grouped results in an XML structure accepted by Nagios:

PUT _watcher/watch/low_mos
{
  "metadata": {
    "mos threshold": 3
  },
  "trigger": {
    "schedule": {
      "interval": "5m"
    }
  },
  "input": {
    "search": {
      "request": {
        "indices": [
          "<pcapture_*-{now/d}>"
        ],
        "body": {
          "size": 0,
          "query": {
            "filtered": {
              "query": {
                "query_string": {
                  "query": "_type:metrics_calls_total_mos AND tab:mos",
                  "analyze_wildcard": true
                }
              },
              "filter": {
                "bool": {
                  "must": [
                    {
                      "range": {
                        "@timestamp": {
                          "gte": "now-5m",
                          "lte": "now"
                        }
                      }
                    },
                    {
                      "range": {
                        "value": {
                          "lte": 3
                        }
                      }
                    }
                  ],
                  "must_not": []
                }
              }
            }
          },
          "aggs": {
            "mos": {
              "date_histogram": {
                "field": "@timestamp",
                "interval": "30s",
                "time_zone": "Europe/Berlin",
                "min_doc_count": 1
              },
              "aggs": {
                "by_group": {
                  "terms": {
                    "field": "group.raw",
                    "size": 5,
                    "order": {
                      "_term": "desc"
                    }
                  },
                  "aggs": {
                    "avg": {
                      "avg": {
                        "field": "value"
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  },
  "condition": {
     "script": {
        "script": "payload.hits.total > 1"
     }
  },
  "actions" : {
  "my_webhook" : {
    "throttle_period" : "5m",
    "webhook" : {
      "method" : "POST",
      "host" : "nagios.domain.ext",
      "port" : 80,
      "path": ":/nrdp",
      "body" : "token=TOKEN&cmd=submitcheck&XMLDATA=<?xml version='1.0'?><checkresults>{{#ctx.payload.aggregations.mos.buckets}} <checkresult type='host' checktype='1'>{{#by_group.buckets}}<hostname>{{key}}</hostname><servicename>MOS</servicename><state>0</state><output>MOS is {{avg.value}}</output> {{/by_group.buckets}}</checkresult>{{/ctx.payload.aggregations.mos.buckets}}</checkresults></xml>"
    }
  }
}
}

Action Body (mustache generated)

<?xml version='1.0'?>
<checkresults>
<checkresult type='host' checktype='1'>
<hostname>domain1.com</hostname><servicename>MOS</servicename><state>0</state><output>MOS is 1.85</output> </checkresult>
<checkresult type='host' checktype='1'>
<hostname>domain2.com</hostname><servicename>MOS</servicename><state>0</state><output>MOS is 2.81</output> </checkresult>
</checkresults>
</xml>

Mustache Playground

A simple playground simulating this response and output is available here.

Reports

Siren Alert watchers can generate snapshots of Siren Investigate, Kibana (or any other website) and deliver them on your schedule using the dedicated report action, powered by PhantomJS.

So your Boss wants to see some charts each Monday? No problem!

{
  "_index": "watcher",
  "_type": "watch",
  "_id": "reporter_v8g6p5enz",
  "_score": 1,
  "_source": {
    "trigger": {
      "schedule": {
        "later": "on the first day of the week"
      }
    },
    "report": true,
    "actions": {
      "report_admin": {
        "report": {
          "to": "reports@localhost",
          "from": "sirenalert@localhost",
          "subject": "Siren Alert Report",
          "priority": "high",
          "body": "Sample Siren Alert Screenshot Report",
          "snapshot": {
            "res": "1280x900",
            "url": "http://www.google.com",
            "params": {
              "delay": 5000
            }
          }
        }
      }
    }
  }
}

Requirements

Report actions require:

  • Siren Alert after v4.5 and before v5.6: PhantomJS installed on the Siren Investigate/Kibana host, for example: npm install phantomjs-prebuilt -g

  • Siren Alert v5.6+: Chrome v59+ or Chromium v59+

Note: Chromium is included in the Linux build of Sentinl. Windows and macOS Sentinl users must download Chromium (https://www.chromium.org/getting-involved/download-chromium) and change the sentinl.settings.report.executable_path to point to it, for example:

sentinl:
  app_name: 'Sentinl'
  settings:
    email:
      active: true
      host: 'localhost'
      #cert:
      #key: '/home'
    report:
      active: true
      executable_path: '/usr/bin/chromium'
  • Valid configuration in kibana.yml, for example

sentinl:
  settings:
    email:
      active: true
      host: 'localhost'
    report:
      active: true
      executable_path: '/usr/bin/chromium' # path to Chrome v59+ or Chromium v59+ # Siren Alert v5.6+
      # tmp_path: '/tmp/' # Siren Alert before v5.6

Report away. With a pinch of luck, you will soon receive your first report with a screenshot attached.

Common Issues

  • Unhandled rejection Error: spawn phantomjs ENOENT

    • PhantomJS is not available to Node-Horseman

Spy plugin

Siren Alert features an integrated Siren Investigate/Kibana plugin extending the default Spy functionality to help users quickly shape new prototype Watchers based on Visualize queries and provide them to Siren Alert for fine editing and scheduling.

Annotations

Siren Alert alerts and detections can be superimposed over visualization widgets using the Annotations feature in Kibana 5.5+, revealing points of contact and indicators in real time. The familiar mustache syntax is used to render row elements from the alert based on case requirements.

How-To

Follow this procedure to enable Siren Alert Annotations over your data:

  • Visualize your timeseries using the Query Builder widget

  • Switch to the Annotations Tab

  • Annotations > Add Data Source

  • Select the Index and Timefield for Siren Alert

  • Index Pattern: watcher_alerts*

  • Time Field: @timestamp

  • Select the Field to Display in Annotations

  • Fields: message

  • Row Template: {{ message }}

Visual Example

sentinl annotation

Search Guard

Siren Alert with Kibana 5.5.2 + Search Guard 5.5.2 demo

ATTENTION! In a production environment, you should use unique passwords and valid trusted certificates. Read more about this in Search Guard documentation.

Install Search Guard

  • Install the Search Guard plugin for your Elasticsearch version, for example:

    <ES folder>/bin/elasticsearch-plugin install https://github.com/floragunncom/search-guard/releases/tag/ves-5.5.2-16
  • cd <ES folder>/plugins/search-guard-<version>/tools

  • Execute ./install_demo_configuration.sh, chmod the script first if necessary. This will generate all required TLS certificates and add the Search Guard configuration to your elasticsearch.yml file.

  • Start Elasticsearch ./bin/elasticsearch

  • Execute ./sgadmin_demo.sh, chmod the script first if necessary. This will execute sgadmin and populate the Search Guard configuration index with the files contained in the plugins/search-guard-<version>/sgconfig folder.

  • Test the installation

    curl -uadmin:admin -sS -i --insecure -XGET https://localhost:9200/_searchguard/authinfo?pretty

Allow Siren Alert access

Allow Siren Alert to access watcher and credit_card indices in sg_roles.yml.

sg_kibana_server:
  cluster:
      - CLUSTER_MONITOR
      - CLUSTER_COMPOSITE_OPS
      - cluster:admin/xpack/monitoring*
  indices:
    '?kibana':
      '*':
        - INDICES_ALL
    'watcher*':
      '*':
       - indices:data/read/search
       - MANAGE
       - CREATE_INDEX
       - INDEX
       - READ
       - WRITE
       - DELETE
    'credit_card':
      '*':
       - indices:data/read/search

Apply Search Guard configuration

  • cd into elasticsearch

  • Execute

    ./plugins/search-guard-5/tools/sgadmin.sh -cd plugins/search-guard-5/sgconfig/ -ts config/truststore.jks -ks config/kirk.jks -icl -nhnv

    More details are here

Install Search Guard Kibana plugin

  • cd into kibana folder

  • Execute:

    ./bin/kibana-plugin install https://github.com/floragunncom/search-guard-kibana-plugin/releases/download/v5.5.2-4/searchguard-kibana-5.5.2-4.zip
  • Set HTTPS connection for Elasticsearch in kibana/config/kibana.yml

    elasticsearch.url: "https://localhost:9200"
  • Set Kibana user and password in kibana/config/kibana.yml

    elasticsearch.username: "kibanaserver"
    elasticsearch.password: "kibanaserver"
  • Disregard validity of SSL certificate in kibana/config/kibana.yml

    elasticsearch.ssl.verificationMode: 'none'

Transform

Siren Alert Nuggets

Random nuggets for recurring challenges

Dot Field Selection Transform for Percentile objects

"transform": {
  "script": {
    "script": "payload = JSON.parse(JSON.stringify(payload).split('95.0').join('95'));"
  }
}

Bucket Cloning

"transform": {
  "script": {
    "script": "payload.aggregations.metrics.buckets.forEach(function(e){ e.ninetieth_surprise.value = e.ninetieth_surprise.values['95.0'] })"
  }
}

Anomaly detection

The Siren Alert anomaly detection mechanism is based on the three-sigma rule of thumb. In short, anomalies are the values which lie outside a band around the mean in a normal distribution with a width of two, four and six standard deviations (68.27%, 95.45% and 99.73%).
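As a minimal sketch (not Siren Alert source code), flagging values outside the widest band looks like this in JavaScript:

// returns the values lying outside mean ± 3 * stddev (the 99.73% band)
function threeSigmaAnomalies(values) {
  var n = values.length;
  var mean = values.reduce(function (s, v) { return s + v; }, 0) / n;
  var variance = values.reduce(function (s, v) { return s + (v - mean) * (v - mean); }, 0) / n;
  var std = Math.sqrt(variance);
  return values.filter(function (v) { return Math.abs(v - mean) > 3 * std; });
}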

  1. Create a new watcher.

  2. In the watcher editor, inside the Input tab, insert an Elasticsearch query to get the credit card transactions dataset.

    {
      "search": {
        "request": {
          "index": [
            "credit_card"
          ],
          "body": {
            "size": 10000,
            "query": {
              "bool": {
                "must": [
                  {
                    "exists": {
                      "field": "Amount"
                    }
                  }
                ]
              }
            }
          }
        }
      }
    }
  3. In the Condition tab, specify a minimum number of results to look for (payload.hits.total > 0) and a field name in which to look for anomalies, Amount in our example.

    {
      "script": {
    "script": "payload.hits.total > 0"
      },
      "anomaly": {
    "field_to_check": "Amount"
      }
    }
  4. In the Action tab, create an email html action. In the Body HTML field, render all the anomalies you have in payload.anomaly using mustache syntax.

    <h1 style="background-color:DodgerBlue;color:white;padding:5px">Anomalies</h1>
    <div style="background-color:Tomato;color:white;padding:5px">
    <ul>
    {{#payload.anomaly}}
    <li><b>id:</b> {{_id}} <b>Amount</b>: {{_source.Amount}}</li>
    {{/payload.anomaly}}
    </ul>
    </div>

As a result, we have an email with a list of anomaly transactions.

anomaly detection

Also, the list of anomalies was indexed in today’s alert index watcher_alarms-{year-month-date}.

watcher alarms

Statistical anomaly detection

In this example, we will implement the ATLAS statistical anomaly detector using Siren Alert.

Our situation:

  • We have a varnish-cache server as Frontend-LB and caching Proxy

  • The backends are selected based on their first_url_part

  • Backends are dynamically added or removed by our development teams (even new applications)

If we look at the 95th percentile of our consolidated backend runtimes, we cannot see problems affecting a single backend service. If we draw a graph for every service, there is too much to look at to spot a problem.

To solve this, we will implement the ATLAS algorithm:

Here is a Timelion screenshot of a load balancer problem:


How to do this? We need two watchers:

  • First, one to collect the most surprising req_runtime of every backend for every hour

  • The second watcher iterates every 5 minutes over the atlas index to find anomalies to report

First Watcher

This watcher will collect the most surprising req_runtime of every backend for every hour, and insert the results into the atlas index (using webhook and _bulk)

{
  "_index": "watcher",
  "_type": "watch",
  "_id": "surprise",
  "_score": 1,
  "_source": {
    "trigger": {
      "schedule": {
        "later": "every 1 hours"
      }
    },
    "input": {
      "search": {
        "request": {
          "index": "public-front-*",
          "body": {
            "query": {
              "filtered": {
                "filter": {
                  "range": {
                    "@timestamp": {
                      "gte": "now-24h"
                    }
                  }
                }
              }
            },
            "size": 0,
            "aggs": {
              "metrics": {
                "terms": {
                  "field": "first_url_part"
                },
                "aggs": {
                  "queries": {
                    "terms": {
                      "field": "backend"
                    },
                    "aggs": {
                      "series": {
                        "date_histogram": {
                          "field": "@timestamp",
                          "interval": "hour"
                        },
                        "aggs": {
                          "avg": {
                            "avg": {
                              "script": "doc['req_runtime'].value*1000",
                              "lang": "expression"
                            }
                          },
                          "movavg": {
                            "moving_avg": {
                              "buckets_path": "avg",
                              "window": 24,
                              "model": "simple"
                            }
                          },
                          "surprise": {
                            "bucket_script": {
                              "buckets_path": {
                                "avg": "avg",
                                "movavg": "movavg"
                              },
                              "script": {
                                "file": "surprise",
                                "lang": "groovy"
                              }
                            }
                          }
                        }
                      },
                      "largest_surprise": {
                        "max_bucket": {
                          "buckets_path": "series.surprise"
                        }
                      }
                    }
                  },
                  "ninetieth_surprise": {
                    "percentiles_bucket": {
                      "buckets_path": "queries>largest_surprise",
                      "percents": [
                        90.01
                      ]
                    }
                  }
                }
              }
            }
          }
        }
      }
    },
    "condition": {
      "script": {
        "script": "payload.hits.total > 1"
      }
    },
    "transform": {
      "script": {
        "script": "payload.aggregations.metrics.buckets.forEach(function(e){ e.ninetieth_surprise.value = e.ninetieth_surprise.values['90.01']; e.newts = new Date().toJSON(); })"
      }
    },
    "actions": {
      "ES_bulk_request": {
        "throttle_period": "1m",
        "webhook": {
          "method": "POST",
          "host": "myhost",
          "port": 80,
          "path": "/_bulk",
          "body": "{{#payload.aggregations.metrics.buckets}}{\"index\":{\"_index\":\"atlas\", \"_type\":\"data\"}}\n{\"metric\":\"{{key}}\", \"value\":{{ninetieth_surprise.value}}, \"execution_time\":\"{{newts}}\"}\n{{/payload.aggregations.metrics.buckets}}",
          "headers": {
            "content-type": "text/plain; charset=ISO-8859-1"
          }
        }
      }
    }
  }
}

The transform script makes the 90.01th-percentile surprise value of every bucket accessible to mustache and generates a NOW timestamp. The action writes the relevant values back to a separate index named atlas.
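Exploded for readability, the transform script is:

payload.aggregations.metrics.buckets.forEach(function (e) {
  // copy the 90.01th-percentile surprise into a plain field mustache can reach
  e.ninetieth_surprise.value = e.ninetieth_surprise.values['90.01'];
  // the NOW timestamp written back to the atlas index by the action
  e.newts = new Date().toJSON();
});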

Second Watcher

The second watcher iterates every 5 minutes over the atlas index to find anomalies to report:

{
  "_index": "watcher",
  "_type": "watch",
  "_id": "check_surprise",
  "_score": 1,
  "_source": {
    "trigger": {
      "schedule": {
        "later": "every 5 minutes"
      }
    },
    "input": {
      "search": {
        "request": {
          "index": "atlas",
          "body": {
            "query": {
              "filtered": {
                "filter": {
                  "range": {
                    "execution_time": {
                      "gte": "now-6h"
                    }
                  }
                }
              }
            },
            "size": 0,
            "aggs": {
              "metrics": {
                "terms": {
                  "field": "metric"
                },
                "aggs": {
                  "series": {
                    "date_histogram": {
                      "field": "execution_time",
                      "interval": "hour"
                    },
                    "aggs": {
                      "avg": {
                        "avg": {
                          "field": "value"
                        }
                      }
                    }
                  },
                  "series_stats": {
                    "extended_stats": {
                      "field": "value",
                      "sigma": 3
                    }
                  }
                }
              }
            }
          }
        }
      }
    },
    "condition": {
      "script": {
        "script": "var status=false;payload.aggregations.metrics.buckets.forEach(function(e){ var std_upper=parseFloat(e.series_stats.std_deviation_bounds.upper); var avg=parseFloat(JSON.stringify(e.series.buckets.slice(-1)[0].avg.value)); if(isNaN(std_upper)||isNaN(avg)) {return status;}; if(avg > std_upper) {status=true; return status;};});status;"
      }
    },
    "transform": {
      "script": {
        "script": "var alerts=[];payload.payload.aggregations.metrics.buckets.forEach(function(e){ var std_upper=parseFloat(e.series_stats.std_deviation_bounds.upper); var avg=parseFloat(JSON.stringify(e.series.buckets.slice(-1)[0].avg.value)); if(isNaN(std_upper)||isNaN(avg)) {return false;}; if(avg > std_upper) {alerts.push(e.key)};}); payload.alerts=alerts"
      }
    },
    "actions": {
      "series_alarm": {
        "throttle_period": "15m",
        "email": {
          "to": "alarms@email.com",
          "from": "sirenalert@localhost",
          "subject": "ATLAS ALARM Varnish_first_url_part",
          "priority": "high",
          "body": "there is an alarm for the following Varnish_first_url_parts:{{#alerts}}{{.}}<br>{{/alerts}}"
        }
      }
    }
  }
}

The condition script tests whether the average runtime of the last bucket is greater than the upper std_deviation bound.
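Exploded for readability, the condition script is:

var status = false;
payload.aggregations.metrics.buckets.forEach(function (e) {
  // upper bound of the three-sigma band for this metric
  var std_upper = parseFloat(e.series_stats.std_deviation_bounds.upper);
  // average of the most recent bucket
  var avg = parseFloat(JSON.stringify(e.series.buckets.slice(-1)[0].avg.value));
  if (isNaN(std_upper) || isNaN(avg)) { return; }
  if (avg > std_upper) { status = true; }
});
status;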

The transform script does something similar, pushing the offending metric keys into an alerts array at the top of the payload. At the end, we alert by email (or REST POST, and so on).

Credits:
Thanks to Christian (@cherweg) for contributing his examples for the community

Outliers

This example performs a brutal outlier detection against a bucket of detections in one go.

Super-Basic Outlier Condition (exploded)

var match=false; // false by default

payload.offenders = new Array();

payload.detections = new Array();

function detect(data){
   data.sort(function(a,b){return a-b});
   var l = data.length;
   var sum=0;
   var sumsq = 0;
   for(var i=0;i<data.length;++i){ sum+=data[i];sumsq+=data[i]*data[i];}
   var mean = sum/l;
   var median = data[Math.round(l/2)];
   var LQ = data[Math.round(l/4)];
   var UQ = data[Math.round(3*l/4)];
   var IQR = UQ-LQ;
   for(var i=0;i<data.length;++i){if(!(data[i]> median - 2 * IQR && data[i] < mean + 2 * IQR)){
      match=true; payload.detections.push(data[i]);
   }
 }
};

var countarr=[];

payload.aggregations.hits_per_hour.buckets.forEach(function(e){
  if(e.doc_count > 1) countarr.push(e.doc_count);
}); detect(countarr);

payload.aggregations.hits_per_hour.buckets.forEach(function(e){
  payload.detections.forEach(function(mat){
     if(e.doc_count == mat) payload.offenders.push(e);
  })
});

match;

Example Siren Alert Watcher

{
  "_index": "watcher",
  "_type": "watch",
  "_id": "anomaly_runner",
  "_score": 1,
  "_source": {
    "uuid": "anomaly_runner",
    "disable": false,
    "trigger": {
      "schedule": {
        "later": "every 30 minutes"
      }
    },
    "input": {
      "search": {
        "request": {
          "body": {
            "size": 0,
            "query": {
              "filtered": {
                "query": {
                  "query_string": {
                    "analyze_wildcard": true,
                    "query": "_type:cdr AND status:8"
                  }
                },
                "filter": {
                  "bool": {
                    "must": [
                      {
                        "range": {
                          "@timestamp": {
                            "gte": "now-1h",
                            "lte": "now"
                          }
                        }
                      }
                    ],
                    "must_not": []
                  }
                }
              }
            },
            "aggs": {
              "hits_per_hour": {
                "date_histogram": {
                  "field": "@timestamp",
                  "interval": "1m",
                  "time_zone": "Europe/Berlin",
                  "min_doc_count": 1
                },
                "aggs": {
                  "top_sources": {
                    "terms": {
                      "field": "source_ip.raw",
                      "size": 5,
                      "order": {
                        "_count": "desc"
                      }
                    }
                  }
                }
              }
            }
          },
          "index": [
            "<pcapture_cdr_*-{now/d}>",
            "<pcapture_cdr_*-{now/d-1d}>"
          ]
        }
      }
    },
    "condition": {
      "script": {
        "script": "payload.detections = new Array();payload.offenders = new Array();function detect(data){data.sort(function(a,b){return a-b});var l = data.length;var sum=0;var sumsq = 0;for(var i=0;i<data.length;++i){sum+=data[i];sumsq+=data[i]*data[i];}var mean = sum/l; var median = data[Math.round(l/2)];var LQ = data[Math.round(l/4)];var UQ = data[Math.round(3*l/4)];var IQR = UQ-LQ;for(var i=0;i<data.length;++i){if(!(data[i]> median - 2 * IQR && data[i] < mean + 2 * IQR)){ match=true; payload.detections.push(data[i]); } }}; var match=false;var countarr=[]; payload.aggregations.hits_per_hour.buckets.forEach(function(e){ if(e.doc_count > 1) countarr.push(e.doc_count); });detect(countarr);payload.aggregations.hits_per_hour.buckets.forEach(function(e){ payload.detections.forEach(function(mat){ if(e.doc_count == mat) payload.offenders.push(e); })});match;"
      }
    },
    "transform": {},
    "actions": {
      "kibi_actions": {
        "email": {
          "to": "root@localhost",
          "from": "sirenalert@localhost",
          "subject": "Series Alarm {{ payload._id}}: User Anomaly {{ payload.detections }} CDRs per Minute",
          "priority": "high",
          "body": "Series Alarm {{ payload._id}}: Anomaly Detected. Possible Offenders: {{#payload.offenders}} \n{{key_as_string}}: {{doc_count}} {{#top_sources.buckets}}\n IP: {{key}} ({{doc_count}} failures) {{/top_sources.buckets}} {{/payload.offenders}} "
        }
      }
    }
  }
}

Troubleshooting

This page offers some common problem-solution pairs, dedicated to both new and existing users.

This page is a work in progress; make sure you also check the Siren Alert FAQ.


Error after Kibana upgrade

Remove Kibana Webpack bundles and restart Kibana.

rm -rf kibana/optimize/bundles/*

The error is most likely caused by stale build output; the bundles will be regenerated when you start Kibana.

Debug Siren Alert

Ensure you have the following options in kibana.yml:

# Enables you to specify a file where Siren Investigate stores log output.
logging.dest: stdout

# Set the value of this setting to true to suppress all logging output.
logging.silent: false

# Set the value of this setting to true to suppress all logging output
# other than error messages.
logging.quiet: false

# Set the value of this setting to true to log all events, including
# system usage information and all requests.
logging.verbose: true

All messages that have Siren Alert in their status are related to Siren Alert.

No alert emails

Basic config, kibana.yml:

logging.verbose: true
sentinl:
  settings:
    email:
      active: true
      host: beast-cave
      ssl: false
    report:
      active: true
      tmp_path: /tmp/

Check your server using some email client, for example mailx:

mailx -S smtp=<smtp-server-address> -r <from-address> -s <subject> -v <to-address> < body.txt
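
For example, with hypothetical values (mail.example.com standing in for your SMTP relay):

echo "Siren Alert test body" > body.txt
mailx -S smtp=smtp://mail.example.com -r sirenalert@localhost -s "Siren Alert test" -v root@localhost < body.txt

If this command fails, the problem lies with the mail relay rather than with Siren Alert.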

Security exception while using Search Guard

For example, this message

p-f45016r31z8-yok6hzhmmii: [security_exception] no permissions for indices:data/read/search :: {"path":"/logstash-2017.09.22/_search","query":{},"body":"{}","statusCode":403,"response":"{\"error\":{\"root_cause\":[{\"type\":\"security_exception\",\"reason\":\"no permissions for indices:data/read/search\"}],\"type\":\"security_exception\",\"reason\":\"no permissions for indices:data/read/search\"},\"status\":403}"}

It says that Siren Alert cannot execute indices:data/read/search on the logstash-2017.09.22 index. Ensure you have the following role for logstash-* indices in sg_roles.yml:

# For the kibana server
sg_kibana_server:
  indices:
    'logstash-*':
      '*':
        - indices:data/read/search

Do not forget to apply the Search Guard configuration change using sgadmin.sh.

FAQ

Q: Is Siren Alert a Watcher clone?

Siren Alert is not a Watcher clone per se, but it shares generic concepts and a configuration style with Elastic Watcher to ease the pain of users migrating between the two solutions, and it could potentially be used to manage Elastic Watcher alerts.

Siren Alert is a Siren Investigate application: its core scheduler runs within the Siren Investigate/Kibana server and is controlled with a dedicated UI, while Elastic Watcher is a headless Elasticsearch plugin that runs inside Elasticsearch (and requires a commercial license to function past the trial period).


Q: How can I help?

Siren Alert is Open Source and anyone can tremendously help the project by contributing code, testing, hunting bugs, and extending documentation. Non-technical user? Help us by spreading the word about our solutions with a blog post or tweet, or by sharing your experience using it.

Q: Emails are not being sent - Why?

Siren Alert uses the emailjs npm module to ship out emails. The module requires a correctly formed message, so ensure your configuration includes a valid FROM and TO address as well as the proper authentication method for your mail relay. If in doubt, refer to the documentation.

Q: Reports are not being generated - Why?

Siren Alert uses the node-horseman npm module to control PhantomJS at the core of this feature. The module requires PhantomJS to be pre-installed on the system running Siren Alert and Reports.


Q: Can I disable a watcher without deleting it?

Sure. Just set the watcher parameter _source.disable: true and Siren Alert will bypass it entirely.
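
For example, using the Elasticsearch update API against the anomaly_runner watcher shown earlier (a sketch; adjust the index, type, and _id to your setup):

curl -XPOST "http://localhost:9200/watcher/watch/anomaly_runner/_update" -d'
{
  "doc": { "disable": true }
}'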


Q: How many concurrent watchers can Siren Alert handle?

Siren Alert relies on the Elasticsearch search thread pool. By default, it can queue up to 1000 concurrent requests (if the server hardware is powerful enough), and this value can be configured. Theoretically, therefore, Siren Alert can by default support up to 1000 watchers running at the same time.


Q: Watchers are not running in my timezone - Why?

Siren Alert uses the UTC timezone internally to execute schedules. While rolling watchers (every x minutes) are not affected, the UTC timezone is used for absolute timed executions. Future versions will allow adapting to the local timezone of the server executing Kibana/Siren Investigate.


Q: How can I avoid string encoding in mustache templates output?

When using mustache templates, all variables are HTML-escaped by default. If you want to return unescaped HTML, use the triple mustache: {{{name}}}. You can also use & to unescape a variable: {{& name}}. This may be useful when changing delimiters (see the documentation).
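
For example, if a hypothetical variable named name contains <b>Siren</b>:

{{name}}    renders as &lt;b&gt;Siren&lt;/b&gt;
{{{name}}}  renders as <b>Siren</b>
{{& name}}  renders as <b>Siren</b>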


Q: How can I use Siren Alert with readonlyRest authentication?

When using readonlyRest, the following Siren Alert exceptions should be added to its configuration:

- name: ALLOWPOST
  type: allow
  methods: [POST,HEAD,GET,DELETE,OPTIONS]
  uri_re:  ^/watcher_alarms-.*/
  hosts: [localhost]
  verbosity: info

- name: ALLOWHEAD
  type: allow
  methods: [POST,HEAD,GET,DELETE]
  uri_re:  ^/watcher.*/
  hosts: [localhost]
  verbosity: info

Q: How can I use Siren Alert with SearchGuard authentication?

Here is an example provided by our community on using Siren Alert with Search Guard. Full demo configuration.

  1. Edit the sg_kibana_server role in sg_roles.yml:

sg_kibana_server:
  cluster:
    - CLUSTER_MONITOR
    - CLUSTER_COMPOSITE_OPS
  indices:
    '?kibana':
      '*':
        - INDICES_ALL
    'watcher*':
      '*':
        - MANAGE
        - CREATE_INDEX
        - INDEX
        - READ
        - WRITE
        - DELETE

  2. Reinitialize Search Guard afterwards, for example:

elasticsearch-5.4.0$ ./plugins/search-guard-5/tools/sgadmin.sh -cd plugins/search-guard-5/sgconfig/ -icl -ts config/truststore.jks -ks config/keystore.jks -h localhost -p 9300 -nhnv

Q: Why are prebuilt Siren Alert packages so big?

Siren Alert prebuilt packages include PhantomJS binaries, occupying most of the archive space.


Dev tools

The Dev Tools page contains development tools that you can use to interact with your data in Kibana.

Console

The Console plugin provides a UI to interact with the REST API of Elasticsearch. Console has two main areas: the editor, where you compose requests to Elasticsearch, and the response pane, which displays the responses to the request.

introduction screen
Figure 27. The Console UI

Console understands commands in a cURL-like syntax. For example, the following Console command

GET /_search
{
  "query": {
    "match_all": {}
  }
}

is a simple GET request to Elasticsearch’s _search API. Here is the equivalent command in cURL.

curl -XGET "http://localhost:9200/_search" -d'
{
  "query": {
    "match_all": {}
  }
}'

In fact, you can paste the command into Console and it will automatically be converted into the Console syntax.

When typing a command, Console will make context-sensitive suggestions. These suggestions can help you explore the parameters of each API, or just speed up typing. Console will suggest APIs, indexes, and field names.

Suggestions
Figure 28. API suggestions

After you have typed a command into the left pane, you can submit it to Elasticsearch by clicking the little green triangle that appears next to the URL line of the request. Notice that as you move the cursor around, the little triangle and wrench icons follow you around. We call this the Action Menu. You can also select multiple requests and submit them all at once.

The Action Menu
Figure 29. The Action Menu

When the response comes back, you should see it in the right panel:

introduction output
Figure 30. The Output Pane

The console user interface

In this section you will find a more detailed description of the UI of the Console. The basic aspects of the UI are explained in the Console section.

Multiple requests support

The Console editor enables writing multiple requests below each other. As shown in the Console section, you can submit a request to Elasticsearch by positioning the cursor and using the Action Menu. Similarly you can select multiple requests in one go:

Multiple Requests
Figure 31. Selecting Multiple Requests

Console will send the requests one by one to Elasticsearch and show the output in the right pane as Elasticsearch responds. This is very handy when debugging an issue or trying out query combinations in multiple scenarios.

Selecting multiple requests also enables you to auto format and copy them as cURL in one go.

Auto formatting

Console enables you to auto format messy requests. To do so, position the cursor on the request you would like to format and select Auto Indent from the action menu:

Auto format before
Figure 32. Auto Indent a request

Console will adjust the JSON body of the request and it will now look like this:

Auto format after
Figure 33. A formatted request

If you select Auto Indent on a request that is already perfectly formatted, Console will collapse the request body to a single line per document. This is very handy when working with Elasticsearch’s bulk APIs:

Auto format bulk
Figure 34. One doc per line

Keyboard shortcuts

Console comes with a set of nifty keyboard shortcuts that make working with it even more efficient. Here is an overview:

General editing

Ctrl/Cmd + I

Auto indent current request.

Ctrl + Space

Open Auto complete (even if not typing).

Ctrl/Cmd + Enter

Submit request.

Ctrl/Cmd + Up/Down

Jump to the previous/next request start or end.

Ctrl/Cmd + Alt + L

Collapse/expand current scope.

Ctrl/Cmd + Option + 0

Collapse all scopes but the current one. Expand by adding a shift.

When auto-complete is visible

Down arrow

Switch focus to auto-complete menu. Use arrows to further select a term.

Enter/Tab

Select the currently selected or the topmost term in the auto-complete menu.

Esc

Close auto-complete menu.

History

Console maintains a list of the last 500 requests that Elasticsearch executed successfully. The history is available by clicking the History icon in the top right of the window. The icon opens the history panel, where you can see your past requests. You can also select a request here and it will be added to the editor at the current cursor position.

History Panel
Figure 35. History Panel

Settings

Console has multiple settings that you can configure. All of them are available in the Settings panel. To open the panel, click the Settings icon on the top right.

Setting Panel
Figure 36. Settings Panel

Configuring console

You can add the following options in the config/investigate.yml file:

console.enabled

Default: true. Set to false to disable Console. Toggling this setting causes the server to regenerate assets on the next startup, which may cause a delay before pages start being served.

Translate join query

Siren Investigate has a tool for translating Siren Investigate specific DSL query syntax into raw Elasticsearch query syntax.

To access it, go to /app/kibana#/dev_tools/translateJoinQuery in the URL bar.

Translate Join Query

Paste your DSL query into the Raw Query box at the top.

Then click Translate to see the raw Elasticsearch query in the Translated Query box at the bottom.

Management

The Management application is where you perform your runtime configuration of Siren Investigate, including both the initial setup and ongoing configuration of index patterns, advanced settings that tweak the behaviors of Siren Investigate itself, and the various "objects" that you can save throughout Siren Investigate such as searches, visualizations, and dashboards.

This section is pluginable, so in addition to the out-of-the-box capabilities, packs such as Siren Investigate Access Control and X-Pack can add further management capabilities.

Index patterns

To use Siren Investigate, you have to tell it about the Elasticsearch indices that you want to explore by configuring one or more index patterns. You can also:

  • Create scripted fields that are computed on the fly from your data. You can browse and visualize scripted fields, but you cannot search them.

  • Set advanced options such as the number of rows to show in a table and how many of the most popular fields to show. Use caution when modifying advanced options, as it is possible to set values that are incompatible with one another.

  • Configure Siren Investigate for a production environment

Creating an index pattern to connect to Elasticsearch

An index pattern identifies one or more Elasticsearch indices that you want to explore with Siren Investigate. Siren Investigate looks for index names that match the specified pattern. An asterisk (*) in the pattern matches zero or more characters. For example, the pattern myindex-* matches all indices whose names start with myindex-, such as myindex-1 and myindex-2.

An index pattern can also simply be the name of a single index.

To create an index pattern to connect to Elasticsearch:

  1. Go to the Management > Indexes and Relations tab.

  2. Click Add Index Pattern.

  3. Specify an index pattern that matches the name of one or more of your Elasticsearch indices. By default, Siren Investigate guesses that you are working with log data being fed into Elasticsearch by Logstash.

    When you switch between top-level tabs, Siren Investigate remembers where you were. For example, if you view a particular index pattern from the Settings tab, switch to the Discover tab, and then go back to the Settings tab, Siren Investigate displays the index pattern you last looked at. To get to the create pattern form, click Add in the Index Patterns list.
  4. If your index contains a timestamp field that you want to use to perform time-based comparisons, select the Index contains time-based events option and select the index field that contains the timestamp. Siren Investigate reads the index mapping to list all of the fields that contain a timestamp.

  5. By default, Siren Investigate restricts wildcard expansion of time-based index patterns to indices with data within the currently selected time range. Click Do not expand index pattern when searching to disable this behavior.

  6. Click Create to add the index pattern.

  7. To designate the new pattern as the default pattern to load when you view the Discover tab, click the favorite button.

When you define an index pattern, indices that match that pattern must exist in Elasticsearch. Those indices must contain data.
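
For example, you can check which indices match a pattern, and whether they contain documents, with the cat indices API from the Console described earlier (myindex-* is the hypothetical pattern from above):

GET /_cat/indices/myindex-*?v

The docs.count column of the response shows how many documents each matching index contains.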

To use an event time in an index name, enclose the static text in the pattern and specify the date format using the tokens described in the following table.

For example, [logstash-]YYYY.MM.DD matches all indices whose names have a timestamp of the form YYYY.MM.DD appended to the prefix logstash-, such as logstash-2015.01.31 and logstash-2015.02.01.

Date Format Tokens
M

Month - cardinal: 1 2 3 …​ 12

Mo

Month - ordinal: 1st 2nd 3rd …​ 12th

MM

Month - two digit: 01 02 03 …​ 12

MMM

Month - abbreviation: Jan Feb Mar …​ Dec

MMMM

Month - full: January February March …​ December

Q

Quarter: 1 2 3 4

D

Day of Month - cardinal: 1 2 3 …​ 31

Do

Day of Month - ordinal: 1st 2nd 3rd …​ 31st

DD

Day of Month - two digit: 01 02 03 …​ 31

DDD

Day of Year - cardinal: 1 2 3 …​ 365

DDDo

Day of Year - ordinal: 1st 2nd 3rd …​ 365th

DDDD

Day of Year - three digit: 001 002 …​ 364 365

d

Day of Week - cardinal: 0 1 2 …​ 6

do

Day of Week - ordinal: 0th 1st 2nd …​ 6th

dd

Day of Week - 2-letter abbreviation: Su Mo Tu …​ Sa

ddd

Day of Week - 3-letter abbreviation: Sun Mon Tue …​ Sat

dddd

Day of Week - full: Sunday Monday Tuesday …​ Saturday

e

Day of Week (locale): 0 1 2 …​ 6

E

Day of Week (ISO): 1 2 3 …​ 7

w

Week of Year - cardinal (locale): 1 2 3 …​ 53

wo

Week of Year - ordinal (locale): 1st 2nd 3rd …​ 53rd

ww

Week of Year - 2-digit (locale): 01 02 03 …​ 53

W

Week of Year - cardinal (ISO): 1 2 3 …​ 53

Wo

Week of Year - ordinal (ISO): 1st 2nd 3rd …​ 53rd

WW

Week of Year - two-digit (ISO): 01 02 03 …​ 53

YY

Year - two digit: 70 71 72 …​ 30

YYYY

Year - four digit: 1970 1971 1972 …​ 2030

gg

Week Year - two digit (locale): 70 71 72 …​ 30

gggg

Week Year - four digit (locale): 1970 1971 1972 …​ 2030

GG

Week Year - two digit (ISO): 70 71 72 …​ 30

GGGG

Week Year - four digit (ISO): 1970 1971 1972 …​ 2030

A

AM/PM: AM PM

a

am/pm: am pm

H

Hour: 0 1 2 …​ 23

HH

Hour - two digit: 00 01 02 …​ 23

h

Hour - 12-hour clock: 1 2 3 …​ 12

hh

Hour - 12-hour clock, 2 digit: 01 02 03 …​ 12

m

Minute: 0 1 2 …​ 59

mm

Minute - two-digit: 00 01 02 …​ 59

s

Second: 0 1 2 …​ 59

ss

Second - two-digit: 00 01 02 …​ 59

S

Fractional Second - 10ths: 0 1 2 …​ 9

SS

Fractional Second - 100ths: 0 1 …​ 98 99

SSS

Fractional Seconds - 1000ths: 0 1 …​ 998 999

Z

Timezone - zero UTC offset (hh:mm format): -07:00 -06:00 -05:00 …​ +07:00

ZZ

Timezone - zero UTC offset (hhmm format): -0700 -0600 -0500 …​ +0700

X

Unix Timestamp: 1360013296

x

Unix Millisecond Timestamp: 1360013296123

Setting the default index pattern

The default index pattern is loaded automatically when you view the Discover tab. Siren Investigate displays a star to the left of the name of the default pattern in the Index Patterns list on the Management > Indexes and Relations tab. The first pattern you create is automatically designated as the default pattern.

To set a different pattern as the default index pattern:

  1. Go to the Management > Indexes and Relations tab.

  2. Select the pattern you want to set as the default in the Index Patterns list.

  3. Click the pattern’s Favorite button.

You can also manually set the default index pattern in Advanced > Settings.

Reloading the index fields list

When you add an index pattern, Siren Investigate automatically scans the indices that match the pattern to display a list of the index fields. You can reload the index fields list to pick up any newly added fields.

Reloading the index fields list also resets Siren Investigate’s popularity counters for the fields. The popularity counters keep track of the fields you have used most often within Siren Investigate and are used to sort fields within lists.

To reload the index fields list:

  1. Go to the Management > Indexes and Relations tab.

  2. Select an index pattern from the Index Patterns list.

  3. Click the pattern’s Reload button.

Deleting an index pattern

  1. Go to the Management > Indexes and Relations tab.

  2. Select the pattern you want to remove in the Index Patterns list.

  3. Click the pattern’s Delete button.

  4. Confirm that you want to remove the index pattern.

Cross cluster search

Elasticsearch supports the ability to run search and aggregation requests across multiple clusters using a module called cross cluster search.

In order to take advantage of cross cluster search, you must configure your Elasticsearch clusters accordingly. Review the corresponding Elasticsearch documentation before attempting to use cross cluster search in Kibana.
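
As a minimal sketch, assuming Elasticsearch 5.6's search.remote settings and hypothetical seed addresses, the cluster_one and cluster_two clusters used in the following examples could be registered through the Console:

PUT /_cluster/settings
{
  "persistent": {
    "search.remote.cluster_one.seeds": ["127.0.0.1:9301"],
    "search.remote.cluster_two.seeds": ["127.0.0.1:9302"]
  }
}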

After your Elasticsearch clusters are configured for cross cluster search, you can create specific index patterns in Kibana to search across the clusters of your choosing. Using the same syntax that you would use in a raw cross cluster search request in Elasticsearch, create your index pattern in Kibana with the convention <cluster-names>:<pattern>.

For example, if you want to query logstash indices across two of the Elasticsearch clusters that you set up for cross cluster search, which were named cluster_one and cluster_two, you would use cluster_one:logstash-*,cluster_two:logstash-* as your index pattern in Kibana.

Just like in raw search requests in Elasticsearch, you can use wildcards in your cluster names to match any number of clusters, so if you wanted to search logstash indices across any clusters named cluster_foo, cluster_bar, and so on, you would use cluster_*:logstash-* as your index pattern in Kibana.

If you want to query across all Elasticsearch clusters that have been configured for cross cluster search, then use a standalone wildcard for your cluster name in your Kibana index pattern: *:logstash-*.

After an index pattern is configured using the cross cluster search syntax, all searches and aggregations using that index pattern in Kibana take advantage of cross cluster search.

Relations

In this panel, you can define relationships between index patterns. These relationships ultimately form a graph of index patterns. This graph is used in conjunction with the Siren Federate plugin for Elasticsearch, allowing you to perform join operations between dashboards, that is, filtering a dashboard’s documents with regard to the documents of another.

Graph of index patterns

A relationship is defined as a join operation between two indices with the following fields:

  • Left Index Pattern: the left index of the join;

  • Left Type: the type of the left index of the join;

  • Left Field: the field of the left index to join on;

  • Right Index Pattern: the right index of the join;

  • Right Type: the type of the right index of the join;

  • Right Field: the field of the right index to join with; and

  • Label: the label of the relation.

The following image displays a graph of four index patterns, where three relationships have been defined. You can add a new relationship by clicking the "Add relation" button.

Graph of Index Patterns

Join task timeout

kibi:joinLimit is deprecated and will have no effect. If a timeout is needed, use siren:joinTaskTimeout.

Clicking Edit (Edit pencil) opens the advanced setting for each relation where you can set the maximum time spent by each join task for that relation in milliseconds. Once the timeout has expired, the task will pass the documents accumulated at that point onto the next task.

This is a per-task time limit and, as each join contains several tasks, the overall response time of the request can be a multiple of the joinTaskTimeout.

As a semi-join, these documents will be filtered based on the presence of a non-empty value for the join field in the other index pattern in the relation.

The index pattern in question is then filtered by the values returned.

Setting the limit to -1 here reverts to the default siren:joinTaskTimeout set in Advanced Settings, while setting it to 0 disables the timeout entirely.

Join type

Siren Federate provides two types of join algorithms; the plugin will try to pick the best algorithm for a given join automatically, however you can force the selection by choosing one of the available options:

  • HASH_JOIN - Distributed join using hash join algorithm

  • BROADCAST_JOIN - Broadcast join

A detailed description of each algorithm can be found in the Siren Federate plugin documentation.

Index Patterns Management icon

A relation is also indicated in the Index Patterns Management application by the Index relation icon. If you hover the mouse over it, a tooltip message is displayed indicating the index pattern and field with which that field is joined.

For example, in the following image, that icon is displayed next to the field "companies" of the "articles" index, which is joined with the field "id" of the "companies" index.

Investor Index

Datasources

For an overview of datasources, read the JDBC datasources and Legacy REST datasources chapters.

Queries

For an overview of queries, read the Legacy REST datasources chapter.

Templates

You can define templates to format the results of a query on an external datasource, as well as the results of an Elasticsearch query, in an Enhanced search results visualization.

Siren Investigate supports three template engines: Jade, Handlebars, and Angular.

There are four pre-defined templates:

  • kibi-json-jade: this template presents the query results as a pretty-printed JSON object using the jade engine. This is useful to test queries while writing them.

  • kibi-table-jade: this template displays the query results in a table, using the jade engine.

  • kibi-table-handlebars: like kibi-table-jade, using the handlebars engine instead.

  • kibi-html-angular: for each document, this template displays a panel populated with all property values (currently supported only in the Enhanced search results visualization).

You can define your own custom template by clicking the Settings / Templates tab.

Then pick the engine you prefer and write the template. To see a preview, click Save and select a query from the list; depending on the query selected, the EntityURI may need to be set.

Query template editor

Managing fields

The fields for the index pattern are listed in a table. Click a column header to sort the table by that column. Click Controls in the rightmost column for a given field to edit the field’s properties. You can manually set the field’s format from the Format drop-down. Format options vary based on the field’s type.

You can also set the field’s popularity value in the Popularity text entry box to any desired value. Click Update Field to confirm your changes or Cancel to return to the list of fields.

Siren Investigate has field formatters for the following field types:

String field formatters

String fields support the String and Url formatters.

The String field formatter can apply the following transformations to the field’s contents:

  • Convert to lowercase

  • Convert to uppercase

  • Convert to title case

  • Apply the short dots transformation, which replaces the content before a . character with the first character of that content, as in the following example:

Original: com.organizations.project.ClassName
Becomes: c.o.p.ClassName

The Url field formatter can take on the following types:

  • The Link type turns the contents of the field into a URL.

  • The Image type can be used to specify an image folder where a specified image is located.

You can customize either type of URL field formats with templates. A URL template enables you to add specific values to a partial URL. Use the string {{value}} to add the contents of the field to a fixed URL.

For example, when:

  • A field contains a user ID

  • The Url template is http://company.net/profiles?user_id={{value}}

The resulting URL replaces {{value}} with the user ID from the field.

The {{value}} template string URL-encodes the contents of the field. When a field encoded into a URL contains non-ASCII characters, these characters are replaced with a % character and the appropriate hexadecimal code. For example, field contents users/admin result in the URL template adding users%2Fadmin.

When the formatter type is set to Image, the {{value}} template string specifies the name of an image at the specified URI.

In order to pass unescaped values directly to the URL, use the {{rawValue}} string.

A Label Template enables you to specify a text string that displays instead of the raw URL. You can use the {{value}} template string normally in label templates. You can also use the {{url}} template string to display the formatted URL.

Date field formatters

Date fields support the Date, Url, and String formatters.

The Date formatter enables you to choose the display format of date stamps using the moment.js standard format definitions.

The String field formatter can apply the following transformations to the field’s contents:

  • Convert to lowercase

  • Convert to uppercase

  • Convert to title case

  • Apply the short dots transformation, which replaces the content before a . character with the first character of that content, as in the following example:

Original: com.organizations.project.ClassName
Becomes: c.o.p.ClassName

The Url field formatter can take on the following types:

  • The Link type turns the contents of the field into a URL.

  • The Image type can be used to specify an image folder where a specified image is located.

You can customize either type of URL field formats with templates. A URL template enables you to add specific values to a partial URL. Use the string {{value}} to add the contents of the field to a fixed URL.

For example, when:

  • A field contains a user ID

  • The Url template is http://company.net/profiles?user_id={{value}}

The resulting URL replaces {{value}} with the user ID from the field.

The {{value}} template string URL-encodes the contents of the field. When a field encoded into a URL contains non-ASCII characters, these characters are replaced with a % character and the appropriate hexadecimal code. For example, field contents users/admin result in the URL template adding users%2Fadmin.

When the formatter type is set to Image, the {{value}} template string specifies the name of an image at the specified URI.

In order to pass unescaped values directly to the URL, use the {{rawValue}} string.

A Label Template enables you to specify a text string that displays instead of the raw URL. You can use the {{value}} template string normally in label templates. You can also use the {{url}} template string to display the formatted URL.

Geographic point field formatters

Geographic point fields support the String formatter.

The String field formatter can apply the following transformations to the field’s contents:

  • Convert to lowercase

  • Convert to uppercase

  • Convert to title case

  • Apply the short dots transformation, which replaces the content before a . character with the first character of that content, as in the following example:

Original: com.organizations.project.ClassName
Becomes: c.o.p.ClassName

Numeric field formatters

Numeric fields support the Url, Bytes, Duration, Number, Percentage, String, and Color formatters.

The Url field formatter can take on the following types:

  • The Link type turns the contents of the field into a URL.

  • The Image type can be used to specify an image folder where a specified image is located.

You can customize either type of URL field formats with templates. A URL template enables you to add specific values to a partial URL. Use the string {{value}} to add the contents of the field to a fixed URL.

For example, when:

  • A field contains a user ID

  • The Url template is http://company.net/profiles?user_id={{value}}

The resulting URL replaces {{value}} with the user ID from the field.

The {{value}} template string URL-encodes the contents of the field. When a field encoded into a URL contains non-ASCII characters, these characters are replaced with a % character and the appropriate hexadecimal code. For example, field contents users/admin result in the URL template adding users%2Fadmin.

When the formatter type is set to Image, the {{value}} template string specifies the name of an image at the specified URI.

In order to pass unescaped values directly to the URL, use the {{rawValue}} string.

A Label Template enables you to specify a text string that displays instead of the raw URL. You can use the {{value}} template string normally in label templates. You can also use the {{url}} template string to display the formatted URL.

The String field formatter can apply the following transformations to the field’s contents:

  • Convert to lowercase

  • Convert to uppercase

  • Convert to title case

  • Apply the short dots transformation, which replaces the content before a . character with the first character of that content, as in the following example:

Original: com.organizations.project.ClassName
Becomes: c.o.p.ClassName

The Duration field formatter can display the numeric value of a field in the following increments:

  • Picoseconds

  • Nanoseconds

  • Microseconds

  • Milliseconds

  • Seconds

  • Minutes

  • Hours

  • Days

  • Weeks

  • Months

  • Years

You can specify these increments with up to 20 decimal places for both input and output formats.

The Color field formatter enables you to specify colors with specific ranges of values for a numeric field.

When you select the Color field formatter, Siren Investigate displays the Range, Font Color, Background Color, and Example fields.

Click Add Color to add a range of values to associate with a particular color. You can click in the Font Color and Background Color fields to display a color picker. You can also enter a specific hex code value in the field. The effect of your current color choices are displayed in the Example field.

colorformatter

The Bytes, Number, and Percentage formatters enable you to choose the display formats of numbers in this field using the numeral.js standard format definitions.

Scripted fields

Scripted fields compute data on the fly from the data in your Elasticsearch indices. Scripted field data is shown on the Discover tab as part of the document data, and you can use scripted fields in your visualizations. Scripted field values are computed at query time so they are not indexed and cannot be searched.

Siren Investigate cannot query scripted fields.
Computing data on the fly with scripted fields can be very resource intensive and can have a direct impact on Siren Investigate’s performance. Keep in mind that there’s no built-in validation of a scripted field. If your scripts are buggy, you will get exceptions whenever you try to view the dynamically generated data.

When you define a scripted field in Siren Investigate, you have a choice of scripting languages. Starting with 5.0, the default options are Lucene expressions and Painless. While you can use other scripting languages if you enable dynamic scripting for them in Elasticsearch, this is not recommended because they cannot be sufficiently sandboxed.

Use of Groovy, Javascript, and Python scripting is deprecated starting in Elasticsearch 5.0, and support for those scripting languages will be removed in the future.

You can reference any single value numeric field in your expressions, for example:

doc['field_name'].value

For more background on scripted fields and additional examples, refer to this blog: Using Painless in Kibana scripted fields
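
As a minimal sketch, a Lucene expression that combines two hypothetical numeric fields (price and quantity) into a total value looks like this:

doc['price'].value * doc['quantity'].value

The resulting value is computed for each document at query time and is displayed like any other field.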

Creating a scripted field

  1. Go to Settings > Indices

  2. Select the index pattern you want to add a scripted field to.

  3. Go to the pattern’s Scripted Fields tab.

  4. Click Add Scripted Field.

  5. Enter a name for the scripted field.

  6. Enter the expression that you want to use to compute a value on the fly from your index data.

  7. Click Save Scripted Field.

For more information about scripted fields in Elasticsearch, see Scripting.

Modifying a scripted field

  1. Go to Settings > Indices

  2. Click Edit for the scripted field you want to change.

  3. Make your changes and then click Save Scripted Field to update the field.

Keep in mind that there’s no built-in validation of a scripted field. If your scripts are buggy, you will get exceptions whenever you try to view the dynamically generated data.

Deleting a scripted field

  1. Go to Settings > Indices

  2. Click Delete for the scripted field you want to remove.

  3. Confirm that you really want to delete the field.

Setting advanced options

The Advanced Settings page enables you to directly edit settings that control the behavior of the Siren Investigate application. For example, you can change the format used to display dates, specify the default index pattern, and set the precision for displayed decimal values.

To set advanced options:

  1. Go to Settings > Advanced.

  2. Click Edit for the option you want to modify.

  3. Enter a new value for the option.

  4. Click Save.

Modifying the following settings can significantly affect Siren Investigate’s performance and cause problems that are difficult to diagnose. Setting a property’s value to a blank field will revert to the default behavior, which may not be compatible with other configuration settings. Deleting a custom setting removes it from Siren Investigate permanently.

Common Settings Reference

query:queryString:options

Options for the Lucene query string parser.

sort:options

Options for the Elasticsearch sort parameter.

dateFormat

The format to use for displaying pretty-formatted dates.

dateFormat:tz

The timezone that Siren Investigate uses. The default value of Browser uses the timezone detected by the browser.

dateFormat:scaled

These values define the format used to render ordered time-based data. Formatted timestamps must adapt to the interval between measurements. Keys are ISO8601 intervals.

dateFormat:dow

This property defines what day weeks should start on.

defaultIndex

Default is null. This property specifies the default index.

defaultColumns

Default is _source. Defines the columns that appear by default on the Discover page.

metaFields

An array of fields outside of _source. Siren Investigate merges these fields into the document when displaying the document.

discover:sampleSize

The number of rows to show in the Discover table.

discover:aggs:terms:size

Determines how many terms will be visualized when clicking the Visualize button in the field drop-downs in the Discover sidebar. The default value is 20.

doc_table:highlight

Highlight results in Discover and in Saved Searches on dashboards. Highlighting slows down requests when working with big documents. Set this property to false to disable highlighting.

doc_table:highlight:all_fields

Improves highlighting by using a separate highlight_query that uses all_fields mode on query_string queries. Set to false if you are using a default_field in your index.

courier:maxSegmentCount

Siren Investigate splits requests in the Discover app into segments to limit the size of requests sent to the Elasticsearch cluster. This setting constrains the length of the segment list. Long segment lists can significantly increase request processing time.

courier:ignoreFilterIfFieldNotInIndex

Set this property to true to skip filters that apply to fields that do not exist in a visualization’s index. Useful when dashboards consist of visualizations from multiple index patterns.

fields:popularLimit

This setting governs how many of the top most popular fields are shown.

histogram:barTarget

When date histograms use the auto interval, Siren Investigate attempts to generate this number of bars.

histogram:maxBars

Date histograms are not generated with more bars than the value of this property, scaling values when necessary.

visualization:tileMap:maxPrecision

The maximum geoHash precision displayed on tile maps: 7 is high, 10 is very high, 12 is the maximum. Explanation of cell dimensions.

visualization:tileMap:WMSdefaults

Default properties for the WMS map server support in the coordinate map.

visualization:colorMapping

Maps values to specified colors within visualizations.

visualization:loadingDelay

Time to wait before dimming visualizations during query.

visualization:dimmingOpacity

When part of a visualization is highlighted, by hovering over it for example, this is the opacity applied to the other elements. A higher number means other elements will be less opaque.

visualization:regionmap:showWarnings

Whether the region map shows a warning when terms cannot be joined to a shape on the map.

csv:separator

A string that serves as the separator for exported values.

csv:quoteValues

Set this property to true to quote exported values.

history:limit

In fields that have history, such as query inputs, the value of this property limits how many recent values are shown.

shortDots:enable

Set this property to true to shorten long field names in visualizations. For example, instead of foo.bar.baz, show f.b.baz.

truncate:maxHeight

This property specifies the maximum height that a cell occupies in a table. A value of 0 disables truncation.

indexPattern:fieldMapping:lookBack

The value of this property sets the number of recent matching patterns to query the field mapping for index patterns with names that contain timestamps.

indexPattern:placeholder

The default placeholder value used when adding a new index pattern to Kibana.

format:defaultTypeMap

A map of the default format name for each field type. Field types that are not explicitly mentioned use "default".

format:number:defaultPattern

Default numeral format for the "number" format.

format:bytes:defaultPattern

Default numeral format for the "bytes" format.

format:percent:defaultPattern

Default numeral format for the "percent" format.

format:currency:defaultPattern

Default numeral format for the "currency" format.

savedObjects:perPage

The number of objects shown on each page of the list of saved objects. The default value is 5.

timepicker:timeDefaults

The default time filter selection.

timepicker:refreshIntervalDefaults

The time filter’s default refresh interval.

dashboard:defaultDarkTheme

Set this property to true to make new dashboards use the dark theme by default.

filters:pinnedByDefault

Set this property to true to make filters have a global state by default.

filterEditor:suggestValues

Set this property to true to have the filter editor suggest values for fields, instead of just providing a text input. This may result in heavy queries to Elasticsearch.

notifications:banner

You can specify a custom banner to display temporary notices to all users. This field supports Markdown.

notifications:lifetime:banner

Specifies the duration in milliseconds for banner notification displays. The default value is 3000000. Set this field to Infinity to disable banner notifications.

notifications:lifetime:error

Specifies the duration in milliseconds for error notification displays. The default value is 300000. Set this field to Infinity to disable error notifications.

notifications:lifetime:warning

Specifies the duration in milliseconds for warning notification displays. The default value is 10000. Set this field to Infinity to disable warning notifications.

notifications:lifetime:info

Specifies the duration in milliseconds for information notification displays. The default value is 5000. Set this field to Infinity to disable information notifications.

metrics:max_buckets

The maximum number of buckets; this limit cannot be exceeded. A high bucket count can arise, for example, when the user selects a short interval (for example, 1s) for a long time period (for example, 1 year).

timelion:showTutorial

Set this property to true to show the Timelion tutorial to users when they first open Timelion.

timelion:es.timefield

Default field containing a timestamp when using the .es() query.

timelion:es.default_index

Default index when using the .es() query.

timelion:target_buckets

Used for calculating automatic intervals in visualizations, this is the number of buckets to try to represent.

timelion:max_buckets

Used for calculating automatic intervals in visualizations, this is the maximum number of buckets to represent.

timelion:default_columns

The default number of columns to use on a timelion sheet.

timelion:default_rows

The default number of rows to use on a timelion sheet.

timelion:graphite.url

[experimental] Used with graphite queries, this is the URL of your graphite host.

timelion:quandl.key

[experimental] Used with quandl queries, this is your API key from www.quandl.com

state:storeInSessionStorage

[experimental] Kibana tracks UI state in the URL, which can lead to problems when there is a lot of information there and the URL gets very long. Enabling this will store parts of the state in your browser session instead, to keep the URL shorter.

context:defaultSize

Specifies the initial number of surrounding entries to display in the context view. The default value is 5.

context:step

Specifies the number to increment or decrement the context size by when using the buttons in the context view. The default value is 5.

context:tieBreakerFields

A comma-separated list of fields to use for tiebreaking between documents that have the same timestamp value. From this list the first field that is present and sortable in the current index pattern is used.

Siren Investigate Specific Settings Reference

siren:timePrecision

Set to generate time filters with a certain precision. Possible values are: s, m, h, d, w, M, y.

siren:relations

Relations between index patterns and dashboards.

siren:joinLimit

Maximum number of unique source values in a relation returned to filter the target documents.

siren:session_cookie_expire

Set duration of cookie session (in seconds).

siren:enableAllDashboardsCounts

Enable counts on all dashboards.

siren:enableAllRelBtnCounts

Enable counts on all relational buttons.

siren:defaultDashboardTitle

The dashboard that is displayed when clicking the Dashboard tab for the first time.

siren:graphUseWebGl

Set to false to disable WebGL rendering.

siren:graphExpansionLimit

Limit the number of elements to retrieve during the graph expansion.

siren:graphMaxConcurrentCalls

Limit the number of concurrent calls done by the Graph Browser.

siren:graphRelationFetchLimit

Limit the number of relations to retrieve after the graph expansion.

siren:shieldAuthorizationWarning

Set to true to show all authorization warnings.

Managing saved searches, visualizations, and dashboards

You can view, edit, and delete saved searches, visualizations, and dashboards from Settings > Objects. You can also export or import sets of searches, visualizations, and dashboards.

Viewing a saved object displays the selected item in the Discover, Visualize, or Dashboard page. To view a saved object:

  1. Go to Settings > Objects.

  2. Select the object you want to view.

  3. Click the View button.

Editing a saved object enables you to directly modify the object definition. You can change the name of the object, add a description, and modify the JSON that defines the object’s properties.

If you attempt to access an object whose index has been deleted, Siren Investigate displays its Edit Object page. You can:

  • Recreate the index so you can continue using the object.

  • Delete the object and recreate it using a different index.

  • Change the index name referenced in the object’s kibanaSavedObjectMeta.searchSourceJSON to point to an existing index pattern. This is useful if the index you were working with has been renamed.
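
For example, the relevant fragment of a saved object might look like the following, with myindex-* as a hypothetical replacement pattern (the exact contents of searchSourceJSON vary by object):

"kibanaSavedObjectMeta": {
  "searchSourceJSON": "{\"index\":\"myindex-*\",\"query\":{\"query_string\":{\"query\":\"*\"}},\"filter\":[]}"
}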

No validation is performed for object properties. Submitting invalid changes will render the object unusable. Generally, you should use the Discover, Visualize, or Dashboard pages to create new objects instead of directly editing existing ones.

To edit a saved object:

  1. Go to Settings > Objects.

  2. Select the object you want to edit.

  3. Click the Edit button.

  4. Make your changes to the object definition.

  5. Click the Save Object button.

To delete a saved object:

  1. Go to Settings > Objects.

  2. Select the object you want to delete.

  3. Click the Delete button.

  4. Confirm that you really want to delete the object.

To export a set of objects:

  1. Go to Settings > Objects.

  2. Select the type of object you want to export. You can export a set of dashboards, searches, or visualizations.

  3. Click the selection box for the objects you want to export, or click the Select All box.

  4. Click Export to select a location to write the exported JSON.

Exported dashboards do not include their associated index patterns. Re-create the index patterns manually before importing saved dashboards to a Siren Investigate instance running on another Elasticsearch cluster.

To import a set of objects:

  1. Go to Settings > Objects.

  2. Click Import to navigate to the JSON file representing the set of objects to import.

  3. Click Open after selecting the JSON file.

  4. If any objects in the set would overwrite objects already present in Siren Investigate, confirm the overwrite.

Authentication and access control

Siren Investigate can be integrated with Elasticsearch clusters protected by either Search Guard or Elastic X-Pack.

In this scenario, both Siren Investigate and Gremlin Server (the backend component used by the graph browser visualization) must be configured to serve requests over HTTPS.

Enabling HTTPS in Siren Investigate

You should protect your Siren Investigate installation by using a reverse proxy. Some example configurations follow, but other reverse proxies may also be used.

Using NginX as a reverse proxy with HTTPS (Linux)

Add the following virtual server to your configuration. Here we assume letsencrypt has been used to provide the certificate.

server {
    listen 443 ssl;
    listen [::]:443 ssl;
    server_name siren.example.com;

    root /var/www/html;
    index index.html index.htm;

    ssl_certificate /etc/letsencrypt/live/siren.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/siren.example.com/privkey.pem;

    access_log /var/log/nginx/siren-ssl.access.log;
    error_log /var/log/nginx/siren-ssl.error.log error;

    include snippets/ssl-params.conf;

    location / {
        proxy_pass http://127.0.0.1:5606;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_cache_bypass $http_upgrade;
    }
}

In /etc/nginx/snippets/ssl-params.conf configure:

ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
ssl_prefer_server_ciphers on;
ssl_ciphers "EECDH+AESGCM:EDH+AESGCM:AES256+EECDH:AES256+EDH";
ssl_ecdh_curve secp384r1;
ssl_session_cache shared:SSL:10m;
#ssl_session_tickets off;
ssl_stapling on;
ssl_stapling_verify on;
resolver 8.8.8.8 8.8.4.4 valid=300s;
resolver_timeout 5s;
# Disable preloading HSTS for now. You can use the commented out header line that includes
# the "preload" directive if you understand the implications.
# Also do not include subdomains by default
#add_header Strict-Transport-Security "max-age=63072000; includeSubdomains; preload";
add_header Strict-Transport-Security "max-age=63072000";
add_header X-Frame-Options DENY;
add_header X-Content-Type-Options nosniff;
ssl_dhparam /etc/ssl/certs/dhparam.pem;

The SSL configuration in ssl-params.conf can be shared among multiple virtual servers.

Now generate a unique set of Diffie-Hellman parameters (this mitigates the LOGJAM vulnerability):

openssl dhparam -out /etc/ssl/certs/dhparam.pem 2048

Note that this constitutes a MINIMUM RECOMMENDED LEVEL of security. Your installation’s requirements may be more stringent.

Using Apache as a reverse proxy with HTTPS (Linux)

Add the following virtual host to your configuration. Here we assume letsencrypt has been used to provide the certificate.

<VirtualHost *:443>
    ServerName siren.example.com
    DocumentRoot /var/www/html
    DirectoryIndex index.html index.htm

    CustomLog /var/log/apache2/siren-ssl.access.log combined
    ErrorLog /var/log/apache2/siren-ssl.error.log

    SSLEngine on
    SSLStrictSNIVHostCheck off
    SSLCertificateFile /etc/letsencrypt/live/siren.example.com/fullchain.pem
    SSLCertificateKeyFile /etc/letsencrypt/live/siren.example.com/privkey.pem
    SSLCACertificateFile /etc/letsencrypt/live/siren.example.com/chain.pem

    <Location />
        ProxyPass http://127.0.0.1:5606
        ProxyPassReverse http://127.0.0.1:5606
    </Location>
</VirtualHost>

Now configure /etc/apache2/conf.d/security.conf:

Header unset X-Powered-By
Header set X-Frame-Options: "sameorigin"
Header set X-Content-Type-Options: "nosniff"
TraceEnable Off
ServerTokens Prod
ServerSignature Off

And /etc/apache2/mods-available/ssl.conf:

<IfModule mod_ssl.c>

SSLRandomSeed startup builtin
SSLRandomSeed startup file:/dev/urandom 512
SSLRandomSeed connect builtin
SSLRandomSeed connect file:/dev/urandom 512

AddType application/x-x509-ca-cert .crt
AddType application/x-pkcs7-crl	.crl

SSLPassPhraseDialog  exec:/usr/share/apache2/ask-for-passphrase
SSLSessionCache		shmcb:${APACHE_RUN_DIR}/ssl_scache(512000)
SSLSessionCacheTimeout  300

SSLProtocol all -SSLv2 -SSLv3
SSLHonorCipherOrder on
SSLCipherSuite \
  "EECDH+ECDSA+AESGCM EECDH+aRSA+AESGCM EECDH+ECDSA+SHA384 \
  EECDH+ECDSA+SHA256 EECDH+aRSA+SHA384 EECDH+aRSA+SHA256 \
  EECDH EDH+aRSA !3DES \
  !aNULL !eNULL !LOW !MD5 !EXP !PSK !SRP !KRB5 !DSS !RC4 !DES"
SSLCompression off

## Strict Transport Security
Header set Strict-Transport-Security "max-age=15768000"

## Apache 2.4 only
SSLUseStapling on
SSLStaplingResponderTimeout 5
SSLStaplingReturnResponderErrors off
SSLStaplingCache shmcb:/var/run/ocsp(128000)

## Apache >=2.4.8 + OpenSSL >=1.0.2 only
SSLOpenSSLConfCmd DHParameters /etc/ssl/certs/dhparam.pem

</IfModule>

You must enable mod_headers for the SSL security settings to take effect.

Now generate a unique set of Diffie-Hellman parameters (this mitigates the LOGJAM vulnerability):

openssl dhparam -out /etc/ssl/certs/dhparam.pem 2048

Note that this constitutes a MINIMUM RECOMMENDED LEVEL of security. Your installation’s requirements may be more stringent.

Native SSL support

While you should always run Siren Investigate behind an SSL reverse proxy, it is sometimes necessary to also enable SSL support on the Siren Investigate server itself, for example when the reverse proxy is an appliance or is installed on a separate server.

Native SSL support can be enabled by copying the certificate and key files to a location readable by the Siren Investigate process and setting the following parameters in config/investigate.yml:

  • server.ssl.enabled: set to true to enable SSL.

  • server.ssl.certificate: path to a certificate.

  • server.ssl.key: path to the certificate key.

  • server.ssl.keyPassphrase: the passphrase of the certificate key; if the key is not encrypted the parameter can be omitted.

The certificate and key files must be PEM encoded.

For example:

server.ssl.enabled: true
server.ssl.certificate: "pki/server.crt"
server.ssl.key: "pki/server.key"

The Siren Investigate demo distribution includes a sample certificate and key in the pki folder.
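
If you only need a throwaway certificate for testing, a self-signed PEM pair can be generated with openssl (a sketch; do not use self-signed certificates in production):

openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
  -subj "/CN=siren.example.com" \
  -keyout pki/server.key -out pki/server.crt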

For additional SSL settings, refer to the settings chapter.

Enabling HTTPS in Gremlin Server

HTTPS must be enabled in Gremlin Server to secure requests from Siren Investigate, even if Siren Investigate is configured behind a reverse SSL proxy.

To enable HTTPS in the Gremlin Server, set the following parameters in the investigate_core.gremlin_server section of the config/investigate.yml file:

  • url: the URL of the Gremlin Server endpoint; ensure that the protocol is set to https.

  • ssl.key_store: the path to the Gremlin Server certificate in Java KeyStore format.

  • ssl.key_store_password: the password of the Gremlin Server certificate keystore.

  • ssl.ca: the path of the certification authority chain bundle that can be used to validate requests from Siren Investigate to the Gremlin API; you can omit this parameter if the certificates for the Siren Investigate HTTPS interface have been issued and signed by a public authority.

For example:

investigate_core:
  gremlin_server:
    url: https://127.0.0.1:8061
    ssl:
      key_store: "pki/gremlin.jks"
      key_store_password: "password"
      ca: "pki/cacert.pem"

After restarting Siren Investigate, click Settings, then click Datasources, and ensure that the URL of the Siren Investigate Gremlin Server datasource is equal to the URL set in investigate.yml.

The Siren Investigate demo distribution includes a sample keystore and CA bundle in the pki folder.
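
To inspect the keystore and verify its certificate before restarting, you can use keytool; the command prompts for the keystore password:

$ keytool -list -v -keystore pki/gremlin.jks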

Search Guard Integration and Siren Investigate access control

This section offers an overview of how to integrate Search Guard with Siren Investigate; for further reference and detailed options, consult the Search Guard documentation.

Before proceeding, ensure that:

  • Siren Investigate is either running with HTTPS enabled or behind a reverse SSL proxy.

  • The Gremlin Server is running with HTTPS enabled.

Refer to the Authentication and access control section for instructions on how to enable HTTPS in both components.

SSL Certificates

All the Elasticsearch nodes in a cluster secured by Search Guard are required to use SSL to encrypt all network traffic.

In addition, changing the Search Guard configuration requires the use of a client SSL certificate to perform administrative actions.

To setup a Search Guard cluster, you must generate the following files:

  • A truststore file, common to all nodes, containing the CA certificate chain.

  • A keystore file, for each node, containing the certificate for the node.

  • A keystore file, for each administrative user, containing a certificate bundle that identifies the user.

  • A keystore file containing an SSL certificate for the Elasticsearch HTTP server.

These files can be either Java KeyStore files or PKCS12 bundles.

Issuing certificates in an existing PKI infrastructure

If your organization has a PKI infrastructure in place, you can generate Java KeyStore files from a PEM bundle by using the keytool command, for example:

$ keytool  \
  -importcert \
  -file ca.pem  \
  -keystore truststore.jks

The command will store the contents of ca.pem into a file named truststore.jks in the current folder.

The same command can be used to convert certificates signed by your CA for nodes, administrative users and the REST API.
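
Alternatively, a PEM certificate and key issued by your CA can be converted into a PKCS12 bundle with openssl; the file names below are illustrative:

$ openssl pkcs12 -export \
  -in node.crt.pem -inkey node.key.pem \
  -certfile ca.pem -name node \
  -out node-keystore.p12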

Node certificates must include oid:1.2.3.4.5.5 as a Subject Alternative Name entry to work correctly with Search Guard; for details on how to customize the OID, consult the Search Guard documentation.

If you want to enable hostname verification, ensure that at least one Subject Alternative Name is equal to the DNS name of the node.

Client certificates for administrative users must contain a unique Distinguished Name to identify the user, for example:

CN=admin,DC=siren,DC=solutions
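
To check the Distinguished Name of an existing client certificate, you can use openssl; the file name is illustrative:

$ openssl x509 -in admin.crt.pem -noout -subject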

Certificates for the Elasticsearch HTTP server can be used across multiple nodes by setting multiple hostnames in the Subject Alternative Name attribute or by using a wildcard certificate.

Issuing certificates using the TLS certificate generator

Floragunn GmbH provides a TLS certificate generation service which can be used to create a bundle of certificates for evaluation purposes.

To try the certificates in a single node setup, it is possible to just specify localhost as the first hostname and submit the form.

The bundle has the following contents:

  • README.txt: provides an overview of the bundle and the passwords for all the keystores.

  • truststore.jks: the CA certificate chain in Keystore format.

  • node-certificates: the transport certificates for the nodes in several formats; these certificates can also be used for the Elasticsearch HTTP server.

  • client-certificates: client certificates and private keys.

  • root-ca: the root CA bundle in PEM format.

  • signing-ca: the signing CA bundle in PEM format.

In addition to the online generator, Floragunn provides a TLS tool which can be used to manage a private certification authority.

Search Guard installation

Install the search-guard-5 plugin on each node in the Elasticsearch cluster by changing to the node folder and running the following command; to find the most recent version of the plugin for your Elasticsearch version, consult the Search Guard version matrix.

$ bin/elasticsearch-plugin install -b com.floragunn:search-guard-5:<version>

Then, copy the following files to the config folder of each node:

  • The truststore file (for example truststore.jks).

  • The keystore file containing the node certificate (for example CN=localhost-keystore.jks)

  • The keystore file containing the certificate for the Elasticsearch HTTP server, only if different than the node certificate.

Open the config/elasticsearch.yml file and set the following Search Guard options:

Node to node transport options:

  • searchguard.ssl.transport.enabled: needs to be set to true for Search Guard to work.

  • searchguard.ssl.transport.keystore_filepath: the filename of the keystore file that contains the node certificate.

  • searchguard.ssl.transport.keystore_password: the password of the keystore file that contains the node certificate.

  • searchguard.ssl.transport.truststore_filepath: the filename of the truststore file that contains the root certificate chain.

  • searchguard.ssl.transport.truststore_password: the password of the truststore file that contains the root certificate chain.

  • searchguard.ssl.transport.enforce_hostname_verification: set to true to enable hostname verification, false otherwise.

REST API options:

  • searchguard.ssl.http.enabled: set to true to enable SSL on the HTTP interface.

  • searchguard.ssl.http.keystore_filepath: the filename of the keystore file that contains the certificate for the HTTP interface.

  • searchguard.ssl.http.keystore_password: the password of the keystore file that contains the certificate for the HTTP interface.

  • searchguard.ssl.http.truststore_filepath: the filename of the truststore file that contains the root certificate chain for the HTTP certificate.

  • searchguard.ssl.http.truststore_password: the password of the truststore file that contains the root certificate chain for the HTTP certificate.

Administrative user options:

  • searchguard.authcz.admin_dn: a list of Distinguished Names in SSL client certificates which are authorized to submit administrative requests.

Client certificate authentication options:

  • searchguard.ssl.http.clientauth_mode: set to OPTIONAL to enable optional client certificate authentication on the REST endpoint.

For example:

searchguard.ssl.transport.enabled: true
searchguard.ssl.transport.truststore_filepath: truststore.jks
searchguard.ssl.transport.truststore_password: <password>
searchguard.ssl.transport.keystore_filepath: CN=localhost-keystore.jks
searchguard.ssl.transport.keystore_password: <password>
searchguard.ssl.transport.enforce_hostname_verification: false
searchguard.ssl.http.enabled: true
searchguard.ssl.http.keystore_filepath: CN=localhost-keystore.jks
searchguard.ssl.http.keystore_password: <password>
searchguard.ssl.http.truststore_filepath: truststore.jks
searchguard.ssl.http.truststore_password: <password>
searchguard.authcz.admin_dn:
  - CN=sgadmin
searchguard.ssl.http.clientauth_mode: OPTIONAL

Make sure that all the files in the configuration folder and the certificate files are readable only by the user running Elasticsearch.

Start Elasticsearch:

$ bin/elasticsearch

If either a certificate or a password is incorrect, Elasticsearch will not start.

Access control configuration

Access control configuration (users, roles and privileges) is stored in an Elasticsearch index which can be modified through the sgadmin.sh script.

The script reads the configuration from a local folder containing YAML files and uploads it to the index; the request is authenticated through a client SSL certificate.

After the configuration has been uploaded, it will be available to all the nodes in the cluster; it is therefore not necessary to copy the Search Guard configuration folder to every Elasticsearch node, only to the node from which sgadmin is run.

sgadmin.sh is available in the plugins/search-guard-5/tools folder in each Elasticsearch instance in which Search Guard has been installed; a standalone version (sgadmin-standalone.zip) can be downloaded from this page.

After a Search Guard enabled cluster has been initialized, sgadmin can be used to upload new configurations.

Siren Investigate Certificates

Here we give an example of where to store client certificates and keystores for Siren Investigate. Note: these examples assume a fresh install using the TLS certificate generator.

In siren-investigate/pki (which was created earlier for HTTPS support), a new folder named searchguard was created with the following files:

  • CN=sgadmin.crtfull.pem: a certificate bundle with administrative privileges over the Search Guard Management REST API. Copied from the client-certificates folder of the TLS certificate generator bundle.

  • CN=sgadmin.key.pem: the key of the administrative certificate. Copied from the client-certificates folder of the TLS certificate generator bundle.

  • ca.pem: the cluster CA certificate chain in PEM format. A copy of root-ca.pem from the top-level folder of the TLS certificate generator bundle.

  • CN=sgadmin-keystore.jks: the keystore containing the admin certificate, used with sgadmin. Copied from the client-certificates folder of the TLS certificate generator bundle.

The password of all Java keystores can be found in the README.txt file in the top-level folder of the TLS certificate generator bundle.

Search Guard configuration

A Search Guard configuration folder contains the following files:

  • sg_config.yml: contains the general configuration.

  • sg_action_groups.yml: contains named groups of permissions.

  • sg_roles.yml: contains the definition of roles.

  • sg_internal_users.yml: the Search Guard internal users database.

  • sg_roles_mapping.yml: contains the mapping between users and roles.

The following sample configuration is used for Elasticsearch with no data, for example your own Elasticsearch or using our no-data-no-security package. Further examples are available in the config/sgconfig folder in the Elasticsearch instance included in the demo distribution; the contents of the files are explained in the next sections and can be used as a general guideline.

For additional configuration options, refer to the official Search Guard documentation.

General configuration (sg_config.yml)

searchguard:
  dynamic:
    http:
      anonymous_auth_enabled: false
      xff:
        enabled: false
    authc:
      transport_auth_domain:
        enabled: true
        order: 2
        http_authenticator:
          type: basic
        authentication_backend:
          type: internal
      basic_internal_auth_domain:
        enabled: true
        http_authenticator:
          type: basic
          challenge: true
        authentication_backend:
          type: intern

The sg_config.yml file contains the configuration of the authentication mechanisms and backends; this configuration:

  • Disables the anonymous role (anonymous_auth_enabled: false)

  • Disables support for external proxies (xff.enabled: false)

  • Enables HTTP basic authentication on the internal Search Guard user database.

Action groups (sg_action_groups.yml)

UNLIMITED:
  - '*'

###### INDEX LEVEL ######

INDICES_ALL:
  - 'indices:*'

# for backward compatibility
ALL:
  - INDICES_ALL

MANAGE:
  - 'indices:monitor/*'
  - 'indices:admin/*'

CREATE_INDEX:
  - 'indices:admin/create'
  - 'indices:admin/mapping/put'

MANAGE_ALIASES:
  - 'indices:admin/aliases*'

# for backward compatibility
MONITOR:
  - INDICES_MONITOR

INDICES_MONITOR:
  - 'indices:monitor/*'

DATA_ACCESS:
  - 'indices:data/*'
  - CRUD

WRITE:
  - 'indices:data/write*'
  - 'indices:admin/mapping/put'

READ:
  - 'indices:data/read*'
  - 'indices:admin/mappings/fields/get*'

DELETE:
  - 'indices:data/write/delete*'

CRUD:
  - READ
  - WRITE

SEARCH:
  - 'indices:data/read/search*'
  - 'indices:data/read/msearch*'
  - 'indices:siren/plan*'
  - 'indices:siren/mplan*'
  - SUGGEST

SUGGEST:
  - 'indices:data/read/suggest*'

INDEX:
  - 'indices:data/write/index*'
  - 'indices:data/write/update*'
  - 'indices:admin/mapping/put'
  - 'indices:data/write/bulk*'

GET:
  - 'indices:data/read/get*'
  - 'indices:data/read/mget*'

###### CLUSTER LEVEL ######

CLUSTER_ALL:
  - 'cluster:*'

CLUSTER_MONITOR:
  - 'cluster:monitor/*'

CLUSTER_COMPOSITE_OPS_RO:
  - 'indices:data/read/mget'
  - 'indices:data/read/msearch'
  - 'indices:siren/mplan'
  - 'indices:data/read/mtv'
  - 'indices:admin/aliases/exists*'
  - 'indices:admin/aliases/get*'

CLUSTER_COMPOSITE_OPS:
  - 'indices:data/write/bulk'
  - 'indices:admin/aliases*'
  - CLUSTER_COMPOSITE_OPS_RO

##### SIREN #####

SIREN_CLUSTER:
  - 'indices:data/read/scroll'
  - 'cluster:data/read/lock/create'
  - 'indices:data/read/msearch*'
  - 'cluster:internal/data/create'
  - 'cluster:internal/data/transfer*'
  - 'cluster:internal/data/delete'
  - 'cluster:internal/data/update/metadata'
  - 'cluster:siren/internal*'
  - 'indices:siren/mplan*'
  - 'cluster:admin/plugin/siren/license/get'
  - CLUSTER_COMPOSITE_OPS_RO

SIREN_COMPOSITE:
  - 'indices:siren/mplan*'
  - 'indices:siren/plan*'
  - 'indices:data/read/search-join'
  - 'indices:data/read/lock/release'

SIREN_READONLY:
  - 'indices:data/read/field_stats*'
  - 'indices:data/read/field_caps*'
  - 'indices:data/read/get*'
  - 'indices:data/read/mget*'
  - 'indices:data/read/search*'
  - 'indices:data/read/lock*'
  - 'indices:siren/mplan'
  - 'indices:siren/plan'
  - 'indices:admin/mappings/get*'
  - 'indices:admin/mappings/fields/get*'
  - 'indices:admin/validate/query*'
  - 'indices:admin/get*'
  - 'indices:admin/version/get*'
  - 'indices:data/siren/connector/*'
  - SIREN_COMPOSITE

SIREN_READWRITE:
  - 'indices:admin/exists*'
  - 'indices:admin/mapping/put*'
  - 'indices:admin/refresh*'
  - 'indices:data/write/delete*'
  - 'indices:data/write/index*'
  - 'indices:data/write/update*'
  - 'indices:data/write/bulk*'
  - SIREN_READONLY

This file contains named groups of permissions which can be used in the roles configuration file; this configuration includes the Search Guard default groups plus four Siren Investigate specific groups:

  • SIREN_READWRITE: groups all the permissions needed to search and update the main Siren Investigate index (.siren); the group has to be assigned on the main index to all roles that can modify the Siren Investigate configuration.

  • SIREN_READONLY: groups all the permissions needed to search any Elasticsearch index from Siren Investigate. The group has to be assigned on all indices that a role has access to.

  • SIREN_CLUSTER: sets the permission to read results from scrolling searches and send composite requests.

  • SIREN_COMPOSITE: groups all the permissions to execute composite requests not recognized by Search Guard; the group has to be granted on all indices to roles that have access only to a subset of indices (for example sirennoinvestor).
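
Because action groups can reference other groups, you can also compose custom groups from the ones above; for example, a hypothetical group granting only search and get permissions could be defined as:

SIREN_SEARCH_GET:
  - SEARCH
  - GET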

Roles (sg_roles.yml)

# Allows any action on the cluster.
sg_all_access:
  cluster:
    - '*'
  indices:
    '*':
      '*':
        - '*'

# Allows reading data from all indices.
sg_readall:
  indices:
    '*':
      '*':
        - READ

# Permissions for a Logstash client.
logstash:
  cluster:
    - 'indices:data/write/bulk*'
    - 'indices:admin/template/*'
    - CLUSTER_MONITOR
    - SIREN_CLUSTER
  indices:
    'logstash-*':
      '*':
        - CRUD
        - CREATE_INDEX
    '*beat*':
      '*':
        - CRUD
        - CREATE_INDEX

# Permissions for the Siren server process.
sirenserver:
  cluster:
      - 'cluster:monitor/nodes/info'
      - 'cluster:monitor/health'
      - 'cluster:monitor/main'
      - 'cluster:monitor/state'
      - 'cluster:monitor/nodes/stats'
      - SIREN_CLUSTER
      - CLUSTER_COMPOSITE_OPS
  indices:
    '*':
      '*':
        - indices:admin/get
    '?siren':
      '*':
        - ALL
    '?sirenaccess':
      '*':
        - ALL

# Permissions for the internal federate user
federateserver:
  cluster:
    - 'indices:admin/aliases'
  indices:
    ?siren-federate-datasources:
      '*':
        - ALL
    ?siren-federate-indices:
      '*':
        - ALL
    ?siren-federate-target:
      '*':
        - ALL

# Permissions for a Siren Investigate administrator
sirenadmin:
  cluster:
    - SIREN_CLUSTER
    - CLUSTER_MONITOR
    - 'cluster:admin/plugin/siren/license/put'
    - 'cluster:admin/plugin/siren/license/delete'
    - 'indices:data/write/bulk'
    - 'cluster:admin/siren/connector/*'
  indices:
    'watcher_alarms*':
      '*':
        - SIREN_READONLY
    'data-*':
      '*':
        - SIREN_READONLY
    'db-*':
      '*':
        - SIREN_READWRITE
        - indices:admin/create
        - indices:admin/siren/connector/*
    '*':
      '*':
        - INDICES_MONITOR
        - SIREN_COMPOSITE
        - 'indices:admin/siren/connector/*'
        - 'indices:data/siren/connector/get'
        - 'indices:data/siren/connector/mappings/get'
        - 'indices:admin/get'

# Permissions for a Siren Investigate regular user.
sirenuser:
  cluster:
    - SIREN_CLUSTER
  indices:
    'watcher_alarms*':
      '*':
        - SIREN_READONLY
    'data-*':
      '*':
        - SIREN_READONLY
    'db-*':
      '*':
        - SIREN_READONLY
    '*':
      '*':
        - SIREN_COMPOSITE

# Permissions for a Siren Alert user.
sirenalert:
  cluster:
    - 'indices:data/write/bulk'
    - 'indices:admin/template/*'
    - SIREN_CLUSTER
  indices:
    '*':
      '*':
        - SIREN_READONLY
    'watcher_alarms*':
      '*':
        - SIREN_READWRITE
        - CREATE_INDEX

A permission is defined by the following syntax:

<role name>:
  <indices or cluster>:
    '<index name or regular expression>':
      '<type name or regular expression>':
        - <list of permissions or action group names>

The index name can contain the simple expansion characters * and ? to match any sequence of character/any single character; for further information about defining permissions, refer to the Search Guard configuration documentation.

This sample configuration defines the following roles:

  • sg_all_access: enables every action on the cluster.

  • sg_readall: enables you to search data on all the indices in the cluster.

  • logstash: defines the permission for a Logstash client with all write and creation privileges enabled on Logstash and Elastic Beats templates and indices.

  • federateserver: defines the permissions for the internal Siren Federate user, with full access to the ?siren-federate-* indices.

  • sirenserver: defines the permissions for the Siren Investigate server process, with read/write access to the internal Siren Investigate indices.

  • sirenuser: defines the permissions for a Siren Investigate user with readonly access to all indices whose name starts with data-* and to the Siren Alert alarm history (indices whose name starts with watcher_alarms*).

  • sirenadmin: defines the permissions for a Siren Investigate user that has the same permissions as sirenuser and, in addition, can create virtual indices whose name starts with db-. This user also has additional permissions to upload the Siren Investigate license, get monitoring information from the cluster and manage JDBC datasources.

  • sirenalert: defines the permissions for the Siren Alert user; this role has read access to all indices and is also authorized to create, search and delete indices whose name starts with watcher_alarms* (the alarm history).

Users (sg_internal_users.yml)

# Internal user database
# The hash value is a bcrypt hash and can be generated with plugin/tools/hash.sh
admin:
  hash: $2a$12$zMeFc6Xi.pcgDVHsvtCV9ePNteVwTE5uGxcKdf7XQcKB9.VkD8iOy
federateserver:
  hash: $2a$12$zMeFc6Xi.pcgDVHsvtCV9ePNteVwTE5uGxcKdf7XQcKB9.VkD8iOy
sirenserver:
  hash: $2a$12$zMeFc6Xi.pcgDVHsvtCV9ePNteVwTE5uGxcKdf7XQcKB9.VkD8iOy
sirenadmin:
  hash: $2a$12$zMeFc6Xi.pcgDVHsvtCV9ePNteVwTE5uGxcKdf7XQcKB9.VkD8iOy
sirenuser:
  hash: $2a$12$zMeFc6Xi.pcgDVHsvtCV9ePNteVwTE5uGxcKdf7XQcKB9.VkD8iOy
logstash:
  hash: $2a$12$zMeFc6Xi.pcgDVHsvtCV9ePNteVwTE5uGxcKdf7XQcKB9.VkD8iOy
sirenalert:
  hash: $2a$12$zMeFc6Xi.pcgDVHsvtCV9ePNteVwTE5uGxcKdf7XQcKB9.VkD8iOy

The file defines the credentials for Search Guard internal users; passwords are stored as hashes in the hash attribute beneath each username.

The password for all the accounts in the example is password.

To change the password of a user, you must generate the corresponding hash; this can be done by executing the plugins/search-guard-5/tools/hash.sh script as follows:

$ bash plugins/search-guard-5/tools/hash.sh -p password

The script will output the hash for the password specified after the -p switch.

It is also possible to change passwords for internal users from the Access Control application in the Siren Investigate UI once configured.

Role mappings (sg_roles_mapping.yml)

sg_all_access:
  users:
    - admin

federateserver:
  users:
    - federateserver

sirenserver:
  users:
    - sirenserver

sirenadmin:
  users:
    - sirenadmin

sirenuser:
  users:
    - sirenuser

logstash:
  users:
    - logstash

sirenalert:
  users:
    - sirenalert

The file defines the list of users assigned to each role using the following form:

<role name>:
  users:
    - <username>
    - <username>

Uploading the configuration to the cluster

To upload the configuration defined in the previous steps, go to the Elasticsearch folder and execute the plugins/search-guard-5/tools/sgadmin.sh script as follows:

$ bash plugins/search-guard-5/tools/sgadmin.sh \
  -cd config/sgconfig \
  -cn siren-distribution \
  -ts config/truststore.jks \
  -tspass password \
  -ks ../siren-investigate/pki/searchguard/CN\=sgadmin-keystore.jks \
  -kspass password \
  -h localhost \
  -p 9330 \
  -nhnv

To reload the configuration, use the same command with the -rl flag instead of -cd, for example:

$ bash plugins/search-guard-5/tools/sgadmin.sh \
  -rl \
  -cn siren-distribution \
  -ts config/truststore.jks \
  -tspass password \
  -ks ../siren-investigate/pki/searchguard/CN\=sgadmin-keystore.jks \
  -kspass password \
  -h localhost \
  -p 9330 \
  -nhnv

You must specify the following arguments based on your environment configuration:

  • -cd: the path to the folder containing the Search Guard access control configuration.

  • -cn: the name of the Elasticsearch cluster.

  • -ts: the path to the truststore file.

  • -tspass: the password of the truststore file.

  • -ks: the path to the administrative client certificate keystore.

  • -kspass: the password of the client certificate keystore file.

  • -h: the hostname of a node in the cluster.

  • -p: the transport port of the node specified in the -h option.

  • -nhnv: disables host name verification; remove this option if you installed node certificates with the correct hostname (recommended in production).

  • -rl: reloads the configuration and flushes the internal cache.

By default the number of replicas for the searchguard index will be set at creation time to the number of data nodes - 1.

For additional information on how to set replication settings and sgadmin in general, refer to the sgadmin documentation.

If the command is executed successfully, a list of the actions executed and their outcome will be printed on screen, for example:

Clustername: elasticsearch
Clusterstate: YELLOW
Number of nodes: 1
Number of data nodes: 1
searchguard index does not exists, attempt to create it ... done
Populate config from /elasticsearch/sg_config
Will update 'config' with sg_config/sg_config.yml
   SUCC: Configuration for 'config' created or updated
Will update 'roles' with sg_config/sg_roles.yml
   SUCC: Configuration for 'roles' created or updated
Will update 'rolesmapping' with sg_config/sg_roles_mapping.yml
   SUCC: Configuration for 'rolesmapping' created or updated
Will update 'internalusers' with sg_config/sg_internal_users.yml
   SUCC: Configuration for 'internalusers' created or updated
Will update 'actiongroups' with sg_config/sg_action_groups.yml
   SUCC: Configuration for 'actiongroups' created or updated
Done with success

You can then verify that SSL and authentication are enabled by making an authenticated request with wget, for example:

$ wget --ca-certificate=../siren-investigate/pki/searchguard/ca.pem --http-user=sirenserver --http-password=password -qO - https://localhost:9220

To display information about the certificate as seen by a client you can execute the following command:

$ echo | openssl s_client -servername localhost -connect localhost:9220 -showcerts | openssl x509 -inform pem -text -noout

Siren Investigate access control configuration

Edit config/investigate.yml and specify the credentials of the sirenserver user, for example:

elasticsearch.username: 'sirenserver'
elasticsearch.password: 'password'

If HTTPS is enabled for the Elasticsearch REST API, ensure that the elasticsearch.url setting contains a URL starting with https, for example:

elasticsearch.url: 'https://localhost:9220'

If the certificate is not signed by a public authority, you will also need to set the elasticsearch.ssl.certificateAuthorities to the path of the CA chain bundle in PEM format, for example:

elasticsearch.ssl.certificateAuthorities: 'pki/searchguard/ca.pem'

If you are using the certificates generated by the TLS generator service, the PEM file containing the certification bundles is available in root-ca/root-ca.pem.

To enable certificate verification, set elasticsearch.ssl.verificationMode to full, for example:

elasticsearch.ssl.verificationMode: full

If you want to validate the certificate but not the hostname, set elasticsearch.ssl.verificationMode to certificate, for example:

elasticsearch.ssl.verificationMode: certificate

Set the investigate_core.elasticsearch.auth_plugin option to searchguard:

investigate_core:
  elasticsearch:
    auth_plugin: searchguard

To enable the Siren Investigate access control plugin, specify the following configuration values in the investigate_access_control section:

  • enabled: set to true to enable the Siren Investigate access control plugin. Defaults to false.

  • backend: the authentication backend installed in the cluster; valid values are searchguard and xpack. Defaults to searchguard.

  • cookie.password: a 32-character alphanumeric string used to derive the key used to encrypt and sign cookies.

  • cookie.secure: if set to true, the cookie will be transmitted only if the request is being served over HTTPS. You must set this to false if Siren Investigate is behind an SSL proxy or if you are running Siren Investigate without HTTPS (which is not advised). Defaults to true.

  • admin_role: the name of the role that will have access to the access control management UI. Users with this role are not subject to any permission check by Siren Investigate, but are still subject to permission checks when issuing queries to Elasticsearch. Defaults to sirenadmin.

  • acl.enabled: set to true to enable access control on saved objects. Defaults to false.

Example minimal configuration:

investigate_access_control:
  enabled: true
  acl:
    enabled: true
  cookie:
    secure: true
    password: '12345678123456781234567812345678'

Make sure to personalize the session cookie password.
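
One way to generate a suitable random value is the openssl rand command; -hex 16 prints exactly 32 alphanumeric (hexadecimal) characters:

$ openssl rand -hex 16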

Additional configuration options:

  • session.ttl: the lifetime of the session in milliseconds; if set to null, the session will last as long as the session cookie is valid. Defaults to 3600000 (1 hour).

  • session.keepAlive: if set to true, every time a request is received within the session lifetime, the session lifetime will be extended by session.ttl. Defaults to true.

  • cookie.ttl: the lifetime of the session cookie in milliseconds. If not set, the cookie will expire when the browser is closed, which is the recommended setting. Note that browsers may not remove session cookies when a tab is closed or even across restarts, so you should set session.ttl for additional protection. Defaults to null.

  • cookie.name: the name of the session cookie. Defaults to kac.

  • acl.index: the Elasticsearch index in which access control rules and saved objects metadata will be stored (.sirenaccess by default).
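
For example, a hypothetical configuration that expires sessions after 30 minutes and renames the session cookie:

investigate_access_control:
  enabled: true
  session:
    ttl: 1800000
    keepAlive: true
  cookie:
    password: '12345678123456781234567812345678'
    name: 'investigate_session'
    secure: true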

If Siren Investigate is running behind a reverse SSL proxy like Nginx, remember to set cookie.secure to false, otherwise the cookie will not be sent, for example:

investigate_access_control:
  enabled: true
  acl:
    enabled: true
  cookie:
    password: '12345678123456781234567812345678'
    secure: false

If you want to use the Siren Alert plugin, you must specify the Siren Alert user credentials in the investigate_access_control.sirenalert section, for example:

investigate_access_control:
  enabled: true
  acl:
    enabled: true
  cookie:
    password: '12345678123456781234567812345678'
    secure: false
  sirenalert:
    elasticsearch:
      username: sirenalert
      password: password

If Siren Alert credentials are not specified, Siren Alert will use the backend credentials to execute the watchers.

Restart Siren Investigate after changing the configuration file; if the configuration is correct, you should see an authentication dialog when browsing to Siren Investigate.

Authentication dialog
Figure 37. Authentication dialog

Saved objects access control

Siren Investigate features an access control system on saved objects that enables you to filter dashboards and visualizations visible to end users; this was enabled in the previous section by setting investigate_access_control.acl.enabled to true.

When the system is enabled, normal users should not have access to the .siren and .sirenaccess indices, as these will be managed by the backend user (sirenserver).

Search Guard management UI Setup

Siren Investigate includes an optional user interface for the Search Guard REST Management API add-on; to use it, the Siren Investigate backend must connect to the Elasticsearch cluster using a PEM client certificate with administrative privileges.

Add-on installation

To install the Search Guard REST Management API add-on, download the correct jar for your Elasticsearch / Search Guard version from this page and copy it to the plugins/search-guard-5 folder of each node in the cluster.

Accessing the API requires a client certificate with administrative privileges; to enable optional client certificate authentication on the REST interface, ensure that the following option is present in elasticsearch.yml:

searchguard.ssl.http.clientauth_mode: OPTIONAL

After the plugin has been copied and the configuration updated, the nodes must be restarted; a rolling restart is enough to install the add-on.

When using this add-on, ensure that the sgadmin configuration folder contains only the sg_config.yml file, otherwise sgadmin will replace users, roles, action groups and mappings that may have been modified through the API.

Siren Investigate configuration

Copy the client certificate and its key to a folder readable by Siren Investigate (for example pki); then add the following parameters to the investigate_access_control configuration section:

  • admin_role: the Search Guard role that has access to the Search Guard management UI (sirenadmin by default).

  • backends.searchguard.admin.ssl.cert: the path to the administrative client certificate bundle in PEM format.

  • backends.searchguard.admin.ssl.key: the path to the administrative client certificate key in PEM format.

  • backends.searchguard.admin.ssl.keyPassphrase: the passphrase of the administrative client certificate key. Not required if the key is not encrypted.

For example:

investigate_access_control:
  enabled: true
  acl:
    enabled: true
  admin_role: sirenadmin
  cookie:
    password: '12345678123456781234567812345678'
    secure: false
  backends:
    searchguard:
      admin.ssl.cert: pki/searchguard/CN=sgadmin.crtfull.pem
      admin.ssl.key: pki/searchguard/CN=sgadmin.key.pem
      admin.ssl.keyPassphrase: password

Note that the administrative client certificate bundle must contain both the full CA chain and the client certificate; if using certificates generated by the TLS generation service, the file name will be CN=sgadmin.crtfull.pem, otherwise it is possible to generate the bundle manually by using cat, for example:

$ cat user.crt.pem ca-chain.pem > user.crtfull.pem

Access Control: Authentication UI

After the certificate is set up, restart Siren Investigate, log in with a user having an administrative role, click the apps button, click Access control, and then click Authentication.

The Access control app
Figure 38. The Access control app

The Authentication section enables you to browse, edit and create the following Search Guard resources:

  • Internal users

  • Roles

  • Role mappings

  • Action groups

To verify that the application is working correctly, click Roles then click the Open button; you should see the list of roles defined during the initial Search Guard setup or an authorization error if the certificate is incorrect:

Browsing Search Guard roles
Figure 39. Browsing Search Guard roles

If you get an error upon opening the Authentication app, most probably the client certificate does not contain the full CA chain or the add-on has not been installed correctly; check the Elasticsearch and Siren Investigate logs for related errors.

If you experience a Siren Investigate crash when opening the application, ensure that the option investigate_access_control.backends.searchguard.admin.ssl.keyPassphrase is set to the correct password.

Access Control: ACL UI

The ACL section
Figure 40. The ACL section

The ACL Roles panel in the ACL section enables you to define Siren Investigate roles, which are collections of permissions on saved objects and UI elements. The main purpose of this system is to hide and block access to:

  • UI elements - applications, for example: Timelion, Access control, Siren Alert

  • UI elements - specific functionalities, for example: export CSV feature

  • UI elements - Siren Investigate sections, for example: discover, management

  • Saved objects on unauthorized indices, for example: dashboards, searches

to end users and avoid unauthorized changes to configuration objects or use of certain parts of the system.

There are two kinds of rules:

  • rules - to set permissions for saved objects

  • ui rules - to set permissions to view different UI elements

The everyone role defines permissions for all the users in the system and is mapped by default to any user logged in to Siren Investigate; by default it grants all users read-only access to the Siren Investigate configuration (Advanced settings), saved searches and index patterns, as well as permission to view all applications and UI elements.

The everyone role
Figure 41. The everyone role

Denying access to certain saved objects, like a saved search, using the first set of rules is usually transparent to the user, who will simply not see the objects anywhere in Siren Investigate.

Usually it is not required to create explicit UI rules for the dashboard application as access to specific dashboards can be restricted through saved object rules.

Denying access to an application like Timelion or a Siren Investigate section like management will hide the navigation menu elements, block access at the route level and display an error.

Blocked Timelion application and Siren Investigate management section
Figure 42. Blocked Timelion application and Siren Investigate management section

When the user tries to access app/timelion, the following error is shown.

Blocked timelion error
Figure 43. Blocked Timelion error

When the user tries to access /app/kibana#/management, the following error is shown.

Blocked Siren Investigate management section error
Figure 44. Blocked Siren Investigate management section error

For most setups it makes sense to grant view permissions on visualizations as well, then set specific permissions on dashboards and dashboard groups for each role.

To define a new role, click the Create role button, then set the following parameters:

  • Role ID: the ID of the role (for example sirenuser); must be a lowercase alphanumeric string.

  • Backend roles: a list of Search Guard roles that will be mapped to this Siren Investigate role (for example sirenuser)

  • Rules: a list of rules on saved object types.

Each rule is defined by three parameters:

  • Action: allow or deny

  • Permission: the permission to allow or deny

  • Context: the saved object type on which the permission must be enforced.

The Create role button
Figure 45. The Create role button
Saving a role
Figure 46. Saving a role

Object permissions

In addition to role level permissions, it is possible to define permissions on specific objects by visiting Settings > Objects and clicking the permissions button next to an object:

The object permissions button
Figure 47. The object permissions button

The object permissions form enables you to set the owner of the object and custom access rules.

By default the owner is set to the user that created the object, and the owner has all permissions on it; to unset the owner of an object, leave the field blank and click the Save button.

Custom access rules can be used to grant access to an object that would be otherwise hidden; for example, if the everyone role is not granted permission to display dashboards but you want to display the Overview dashboard to all users, visit the object permissions form for the Overview dashboard and set the View permission for everyone to Allow.

If everyone can see dashboards but you would like to hide the IT dashboard from users, set the View permission for everyone to Deny.

The object permissions form
Figure 48. The object permissions form

Notes

Although users cannot view or edit the following object types unless they have permission to do so, the objects will still be retrieved and executed by the backend if they are used by a visualization:

  • Query

  • Query templates

  • Data source

Logstash configuration

To enable authentication in Logstash, set the following parameters in the output.elasticsearch section:

  • user: the username of the user having the logstash role.

  • password: the password of the user having the logstash role.

  • ssl: set to true to enable SSL.

  • truststore: the path to the CA truststore file.

  • truststore_password: the password of the CA truststore file.

For example:

output {
    elasticsearch {
       hosts => ['localhost:9220']
       user => logstash
       password => password
       ssl => true
       truststore => '/etc/pki/logstash/truststore.jks'
       truststore_password => password
    }
}

The truststore file must be copied to all nodes running Logstash.

Beats configuration

To enable authentication in a beat which connects directly to Elasticsearch, set the following parameters in the output.elasticsearch section:

  • protocol: set to https.

  • username: the username of the user having the logstash role.

  • password: the password of the user having the logstash role.

  • tls.certificate_authorities: an array containing the path to the CA truststore file in PEM format.

For example:

output:

  elasticsearch:
    hosts: ['localhost:9220']

    protocol: 'https'
    username: 'logstash'
    password: 'password'

    tls:
      certificate_authorities: ['/etc/pki/filebeat/ca.pem']

The root certification authority in PEM format must be copied to all nodes running one or more beats.

Console configuration

To successfully submit queries from Console to a cluster secured by Search Guard, set the following parameters in config/investigate.yml:

console.proxyConfig:
  - match:
      protocol: 'https'

    ssl:
      ca: 'pki/searchguard/ca.pem'

console.proxyConfig.ssl.ca must point to the CA certificate bundle, so it can be set to the same value as the elasticsearch.ssl.certificateAuthorities parameter.

X-Pack monitoring configuration

To store monitoring data in a cluster secured by Search Guard, you must configure the agent exporters to submit data over an authenticated HTTPS connection.

The exporter configuration in elasticsearch.yml must include the following parameters:

  • type: http.

  • host: an array of URLs that will be contacted by the exporter.

  • auth.username: the username of the monitoring agent user.

  • auth.password: the password of the monitoring agent user.

  • ssl.truststore.path: the path to the CA certificate truststore (this will usually be the same as the one specified in the Search Guard configuration).

  • ssl.truststore.password: the password of the CA certificate truststore.

For example, the following configuration defines an exporter that sends data to the cluster at https://localhost:9220, authenticating as the monitoring user:

xpack.monitoring.exporters:
  id1:
    type: http
    host: ['https://localhost:9220']

    auth:
      username: monitoring
      password: password

    ssl:
      truststore.path: truststore.jks
      truststore.password: password

X-Pack security integration

Create a sirenserver role with the following definition and map it to a sirenserver user:

{
  "cluster": [
    "cluster:admin/plugin/siren/license/get",
    "monitor"
  ],
  "indices" : [
    {
      "names" : [ "*" ],
      "privileges" : [ "indices:admin/get" ]
    },
    {
      "names" : [ ".siren" ],
      "privileges" : [ "all" ]
    },
    {
      "names" : [ ".sirenaccess" ],
      "privileges" : [ "all" ]
    }
  ]
}

If using a custom configuration, replace the configuration index name (.siren by default) and access control index name (.sirenaccess by default) with the correct names.

Set elasticsearch.username and elasticsearch.password to the credentials of the sirenserver user, for example:

elasticsearch.username: sirenserver
elasticsearch.password: password

If HTTPS is enabled for the Elasticsearch REST API, ensure that the elasticsearch.url setting contains a URL starting with https, for example:

elasticsearch.url: 'https://localhost:9220'

If the certificate is not signed by a public authority, you will also need to set the elasticsearch.ssl.certificateAuthorities to the path of the CA chain bundle in PEM format, for example:

elasticsearch.ssl.certificateAuthorities: 'pki/searchguard/ca.pem'

To enable certificate verification, set elasticsearch.ssl.verificationMode to full, for example:

elasticsearch.ssl.verificationMode: full

Set the investigate_core.elasticsearch.auth_plugin option to xpack:

investigate_core:
  elasticsearch:
    auth_plugin: xpack

Then, set the backend parameter of the investigate_access_control section of the investigate.yml to xpack:

investigate_access_control:
  enabled: true
  backend: xpack
  acl:
    enabled: true
  cookie:
    secure: true
    password: '12345678123456781234567812345678'

For a complete description of the options, see Siren Investigate access control configuration.

All users with access to Siren Investigate should have the following role definition:

Example standard user role with access to all indices starting with data- and to all virtual indices starting with db-:

{
  "cluster": [
    "cluster:admin/plugin/siren/license/get",
    "cluster:siren/internal"
  ],
  "indices" : [
    {
      "names" : [ "*" ],
      "privileges" : [ "indices:siren/mplan" ]
    },
    {
      "names" : [ "data-*" ],
      "privileges" : [
        "read",
        "view_index_metadata",
        "indices:data/read/search-join",
        "indices:siren",
        "indices:admin/version/get",
        "indices:admin/get"
      ]
    },
    {
      "names" : [ "db-*" ],
      "privileges" : [
        "read",
        "view_index_metadata",
        "indices:data/read/search-join",
        "indices:data/siren",
        "indices:siren/plan",
        "indices:admin/version/get",
        "indices:admin/get"
      ]
    }
  ]
}

For administrative users, ensure that admin_role is configured in the investigate_access_control section of investigate.yml, for example:

investigate_access_control:
  admin_role: sirenadmin

Example administrative user role with access to all indices starting with data- and to all virtual indices starting with db-, plus permissions to manage the license, external datasources and virtual indices:

{
  "cluster": [
    "cluster:admin/plugin/siren/license",
    "cluster:admin/siren/connector",
    "cluster:siren/internal",
    "manage"
  ],
  "indices" : [
    {
      "names" : [ "*" ],
      "privileges" : [ "indices:siren/mplan" ]
    },
    {
      "names" : [ "data-*" ],
      "privileges" : [
        "read",
        "view_index_metadata",
        "indices:data/read/search-join",
        "indices:siren",
        "indices:admin/version/get",
        "indices:admin/get"
      ]
    },
    {
      "names" : [ "db-*" ],
      "privileges" : [
        "read",
        "create_index",
        "view_index_metadata",
        "indices:data/read/search-join",
        "indices:data/siren",
        "indices:siren",
        "indices:admin/version/get",
        "indices:admin/get",
        "indices:admin/siren/connector"
      ]
    }
  ]
}

For additional information on datasources configuration, check the JDBC datasources section.

Kerberos/SPNEGO authentication support

This section offers an overview of how to enable Kerberos/SPNEGO authentication in Siren Investigate.

Before enabling Kerberos support you should setup Siren Investigate and Search Guard as described in the Search Guard Integration and Siren Investigate access control chapter.

Limitations

The current implementation requires disabling the Kerberos replay cache in Search Guard, as the Siren Investigate backend needs to make multiple requests to the Elasticsearch cluster on behalf of the user in several places without the ability to generate new service tickets.

As long as all the traffic to Siren Investigate is encrypted and the service ticket lifetime is short (the default in most systems is 5 to 10 minutes), this should not pose a significant security risk.

Prerequisites

Service Principal

In order to enable Kerberos authentication, you need to create a service Principal to identify the Elasticsearch REST interface; usually the principal name is HTTP/<public DNS name of the cluster> (for example HTTP/es.ad.local).

Active Directory

On an Active Directory domain controller it is possible to use the setspn command to set a Service Principal Name for a domain user; for example, the following command run in an elevated command prompt associates the Service Principal Name HTTP/es.ad.local to a user named elasticsearch:

setspn -A HTTP/es.ad.local elasticsearch

Refer to the Active Directory documentation for more details about setspn and Kerberos integration.

Keytab

After the service Principal is defined, you need to generate a keytab file that will be used by the Kerberos add-on to authenticate with the KDC.

Active Directory

On an Active Directory domain controller you can generate a keytab by running the ktpass command in an elevated command prompt as follows:

ktpass -out es.keytab -princ <principal name>@<domain> /mapuser <principal user> /pass "<principal user password>" /kvno 0

For example, to generate a keytab for the SPN HTTP/es.ad.local, associated to elasticsearch user in the AD.LOCAL domain, you need to run the following command:

ktpass -out es.keytab -princ HTTP/es.ad.local@AD.LOCAL /mapuser elasticsearch /pass "password" /kvno 0

Verification

This verification step is optional but it is useful to ensure that the keytab is correct before configuring Search Guard.

To verify that the keytab works correctly, copy it to a different machine with access to the KDC / domain controller; the keytab contains the credentials of the service principal user, so it should be removed from any intermediate machine used to transfer the file and from the target machine after the test is complete.

Create a file named krb5.conf in the same folder as the keytab with the contents below; replace AD.LOCAL with your domain name and DC.AD.LOCAL with the name or IP address of your KDC or domain controller, keeping the case of domains as in the example:

[libdefaults]
default_realm = AD.LOCAL
forwardable=true
default_tkt_enctypes = rc4-hmac,aes256-cts-hmac-sha1-96,aes128-cts-hmac-sha1-96
default_tgs_enctypes = rc4-hmac,aes256-cts-hmac-sha1-96,aes128-cts-hmac-sha1-96

[realms]
AD.LOCAL = {
kdc = dc.ad.local:88
default_domain = ad.local
}

[domain_realm]
.ad.local = AD.LOCAL
ad.local = AD.LOCAL

Linux and macOS

On Linux and macOS systems, set the KRB5_CONFIG variable temporarily to point to the absolute path of the file created before and run kinit -t <keytab> <principal>, for example:

KRB5_CONFIG=./krb5.conf kinit -t es.keytab HTTP/es.ad.local

If the keytab is correct, kinit should exit immediately and not show a password prompt; to verify that the ticket has been issued, execute the klist -v command and check that it outputs the details of the ticket:

klist -v
Credentials cache: API:123
        Principal: HTTP/es.ad.local@ES.AD.LOCAL
    Cache version: 0

Server: krbtgt/AD.LOCAL@AD.LOCAL
Client: HTTP/es.ad.local@AD.LOCAL
Ticket etype: aes256-cts-hmac-sha1-96, kvno 2
Session key: arcfour-hmac-md5
Ticket length: 1194
Auth time:  May 12 19:59:10 2017
End time:   May 13 05:59:10 2017
Ticket flags: enc-pa-rep, pre-authent, initial, forwardable
Addresses: addressless

You can then destroy the ticket by executing the kdestroy command.

Windows systems

If you are running Elasticsearch nodes on Windows, you can use the Kerberos tools bundled with the Java Runtime Environment to verify the keytab.

If the JRE folder is not in the system path, prepend it to each command.

Execute kinit <principal> -t <keytab> -J-Djava.security.krb5.conf=<path to krb5.conf> to get a ticket, for example:

kinit HTTP/es.ad.local -t es.keytab -J-D"java.security.krb5.conf=C:\Users\test\krb5.conf"

If the keytab is correct kinit will print the path to the file where the ticket has been saved, for example:

New ticket is stored in cache file C:\Users\test\krb5cc_test

Execute klist to see the details of the ticket; to destroy the ticket you can simply remove the file created by kinit.

Setup and configuration

Search Guard add-on

Kerberos authentication support requires the installation of the commercial Search Guard Kerberos HTTP Authentication add-on; to install it, download the correct jar for your Search Guard version from this page and copy it to the plugins/search-guard-5 folder on each node.

Kerberos configuration file

Create a file named krb5.conf in the config folder of each node with the following contents; replace AD.LOCAL with your domain name and DC.AD.LOCAL with the name or IP address of your KDC/domain controller, keeping the case of domains as in the example:

[libdefaults]
default_realm = AD.LOCAL
forwardable=true
default_tkt_enctypes = rc4-hmac,aes256-cts-hmac-sha1-96,aes128-cts-hmac-sha1-96
default_tgs_enctypes = rc4-hmac,aes256-cts-hmac-sha1-96,aes128-cts-hmac-sha1-96

[realms]
AD.LOCAL = {
kdc = dc.ad.local:88
default_domain = ad.local
}

[domain_realm]
.ad.local = AD.LOCAL
ad.local = AD.LOCAL

Keytab

Copy the keytab file for the service principal to the configuration folder of each Elasticsearch node.

Elasticsearch configuration

Add the following options to the elasticsearch.yml file of each node:

  • searchguard.kerberos.krb5_filepath: the path to the Kerberos configuration file, usually krb5.conf.

  • searchguard.kerberos.acceptor_keytab_filepath: the path to the keytab file relative to the configuration folder of the Elasticsearch node. It is mandatory to store the keytab in this folder.

  • searchguard.kerberos.acceptor_principal: the name of the principal stored in the keytab (for example HTTP/es.ad.local).

Example configuration:

searchguard.kerberos.krb5_filepath: 'krb5.conf'
searchguard.kerberos.acceptor_keytab_filepath: 'es.keytab'
searchguard.kerberos.acceptor_principal: 'HTTP/es.ad.local'

To disable the Kerberos replay cache in Search Guard, you must set the sun.security.krb5.rcache JVM property to none; this can be done by setting the following line in config/jvm.options:

-Dsun.security.krb5.rcache=none

For information on where to set/modify this variable, refer to Running as a service on Linux or Running as a service on Windows.

Cluster restart

After the previous steps have been completed on all nodes, perform a rolling restart of the cluster.

Search Guard authenticator configuration

To complete the Kerberos configuration you need to modify your sg_config.yml file and upload it to the cluster using sgadmin; if you are using the Search Guard management API, ensure that you include only sg_config.yml in the sgadmin configuration folder, or you will overwrite internal users, action groups, roles and mappings defined through the API.
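
For example, assuming a folder named sgconfig_kerberos that contains only sg_config.yml, the upload mirrors the command used during the initial setup:

$ bash plugins/search-guard-5/tools/sgadmin.sh \
  -cd config/sgconfig_kerberos \
  -cn siren-distribution \
  -ts config/truststore.jks \
  -tspass password \
  -ks ../siren-investigate/pki/searchguard/CN\=sgadmin-keystore.jks \
  -kspass password \
  -h localhost \
  -p 9330 \
  -nhnv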

To enable Kerberos authentication over HTTP, you need to:

  • Add a Kerberos authenticator stanza to searchguard.authc

  • Disable challenge in the existing HTTP Basic authenticator if enabled

Example sg_config.yml:

searchguard:
  dynamic:
    http:
      anonymous_auth_enabled: false
      xff:
        enabled: false
    authc:
      kerberos_auth_domain:
        enabled: true
        order: 2
        http_authenticator:
          type: kerberos
          challenge: true
          config:
            krb_debug: false
            strip_realm_from_principal: true
        authentication_backend:
          type: noop
      basic_internal_auth_domain:
        enabled: true
        order: 1
        http_authenticator:
          type: basic
          challenge: false
        authentication_backend:
          type: intern

With this configuration, if the user is not authenticated, Search Guard replies with a 401 challenge; SPNEGO compatible browsers will then repeat the request automatically with Kerberos credentials if the cluster is in a trusted network, or display an authentication popup where the user can enter their domain credentials.

If an HTTP request to the cluster contains an HTTP Basic authorization header, it will still be authenticated by the HTTP authenticator defined in basic_internal_auth_domain; it is necessary to leave this enabled as the Siren Investigate backend uses this method to authenticate with the cluster.

Only a single HTTP challenge can be enabled; if your browser is configured to automatically send Kerberos credentials in a trusted zone, you can disable the challenge attribute by setting kerberos_auth_domain.http_authenticator.challenge to false.

For more details about configuring Search Guard authenticator, refer to the official documentation.

Verification

After sg_config.yml has been loaded, you can verify that authentication is working by mapping a username in the Active Directory / Kerberos domain to a Search Guard role, for example:

sirenuser:
  users:
    - sirenuser
    - domainuser

After the mapping is loaded to the cluster, log on to a machine in the domain with the domain user and open the cluster URL in a Kerberos enabled browser (for example Chrome on Windows).

If everything is set up correctly, you should see the default JSON response of Elasticsearch in the browser without having to enter credentials, for example:

{
  "name" : "Node",
  "cluster_name" : "cluster",
  "cluster_uuid" : "nimUDAyBQWSskuHoAQG06A",
  "version" : {
    "number" : "5.4.0",
    "build_hash" : "fcbb46dfd45562a9cf00c604b30849a6dec6b017",
    "build_timestamp" : "2017-01-03T11:33:16Z",
    "build_snapshot" : false,
    "lucene_version" : "5.5.2"
  },
  "tagline" : "You Know, for Search"
}

If you are getting an authentication popup, ensure that the Elasticsearch cluster URL is in a trusted zone.

To add a site to the trusted zone on Windows you need to:

  • open Internet Explorer and click Internet options.

  • click the Security tab.

  • click Local Intranet.

  • click Sites.

  • click Advanced.

  • add the URL of the cluster to the list (the port can be omitted).

After the cluster is in the trusted zone try to open the cluster URL again.

Internet Explorer options are also used by Chrome on Windows.

Trusted sites setup
Figure 49. Trusted sites

Troubleshooting

To find out why a request is not authenticated, check the Elasticsearch logs of the client node serving the REST API.

The most common issues are:

  • cluster URL not present in the trusted sites list.

  • a keytab containing an incorrect Service Principal Name or a wrong password for the user account associated with the SPN.

  • an incorrect address of the domain controller / KDC in the krb5.conf file.

To get additional debugging information you can set krb_debug to true temporarily in sg_config.yml and upload it to the cluster using sgadmin.
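For reference, a typical sgadmin invocation to upload the modified configuration might look as follows; the configuration folder, keystore and truststore file names and passwords are assumptions specific to your deployment:

# upload the Search Guard configuration from the sgconfig folder;
# -icl ignores the cluster name and -nhnv disables host name verification
bash plugins/search-guard-5/tools/sgadmin.sh \
  -cd sgconfig \
  -ks kirk-keystore.jks -kspass changeit \
  -ts truststore.jks -tspass changeit \
  -icl -nhnv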

Siren Investigate configuration

To enable SPNEGO support in Siren Investigate, set the investigate_access_control.backends.searchguard.authenticator option in investigate.yml to http-negotiate, for example:

investigate_access_control:
  #... existing options
  backends:
    searchguard:
      #... existing options
      authenticator: 'http-negotiate'

Then restart Siren Investigate and verify that you can login from a browser in the domain using a user defined in Search Guard.

When SPNEGO support is enabled, cookie-based authentication will be disabled; if you need to provide both authentication methods for different networks, it is possible to start an additional Siren Investigate instance with investigate_access_control.backends.searchguard.authenticator set to http-basic or not set at all.

JWT authentication support

This section offers an overview of how to integrate Siren Investigate with the Search Guard JWT authenticator when Siren Investigate is embedded into an iframe by another application.

Before enabling JWT support, you should set up Siren Investigate and Search Guard as described in the Search Guard integration chapter and ensure that it works as expected.

Prerequisites

Search Guard add-on

JWT authentication support requires the installation of the commercial Search Guard Kerberos JWT HTTP Authentication add-on; to install it, download the jar matching your Search Guard version from this page, copy it to the plugins/search-guard-5 folder on each node, and then perform a rolling restart of the cluster.

Siren Investigate proxy

Siren Investigate and the container application must be published on the same domain to allow cross frame communication; this can be achieved by implementing a proxy to Siren Investigate in the container application routes or by configuring a reverse proxy on a path in the application server configuration.
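As a minimal sketch, an nginx reverse proxy publishing Siren Investigate under a path of the container application's domain might look as follows; the path, upstream host and port are assumptions:

# publish Siren Investigate under /investigate/ on the application's domain
location /investigate/ {
    proxy_pass http://localhost:5606/;
    proxy_set_header Host $host;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}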

JWT token issuance

The application that embeds Siren Investigate is responsible for generating JWT tokens; jwt.io provides a good overview of the technology, a browser based debugging tool and a list of libraries for several platforms.

The Search Guard documentation provides an overview of all the claims supported by the add-on and a list of all the configuration options.

The application must specify an expiration date claim (exp) to avoid creating tokens with unlimited duration.
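As a minimal sketch, an HS256-signed token can be generated with standard shell tools; the subject, the one hour lifetime and the signing key password (matching the decoded signing_key in the example configuration of the next section) are assumptions:

# base64url encode stdin without padding
b64url() { base64 | tr '+/' '-_' | tr -d '=\n'; }

header=$(printf '{"alg":"HS256","typ":"JWT"}' | b64url)
# set the exp claim to one hour from now (seconds since the Epoch, UTC)
payload=$(printf '{"sub":"sirenuser","exp":%s}' "$(($(date +%s) + 3600))" | b64url)
signature=$(printf '%s.%s' "$header" "$payload" \
  | openssl dgst -sha256 -hmac 'password' -binary | b64url)
printf '%s.%s.%s\n' "$header" "$payload" "$signature"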

Configuration

After the add-on has been installed in the cluster, you need to modify the sg_config.yml file and upload it to the cluster using sgadmin; if you are using the Search Guard management API, ensure that you include only sg_config.yml in the sgadmin configuration folder, otherwise you will overwrite internal users, action groups, roles and mappings defined through the API.

To enable JWT authentication over HTTP, you need to add a JWT authenticator stanza to searchguard.authc; an example sg_config.yml follows:

searchguard:
  dynamic:
    http:
      anonymous_auth_enabled: false
      xff:
        enabled: false
    authc:
      jwt_auth_domain:
        enabled: true
        order: 1
        http_authenticator:
          type: jwt
          challenge: false
          config:
            signing_key: "cGFzc3dvcmQ="
            jwt_header: "Authorization"
        authentication_backend:
          type: noop
      basic_internal_auth_domain:
        enabled: true
        order: 2
        http_authenticator:
          type: basic
          challenge: true
        authentication_backend:
          type: internal

With this configuration, Search Guard will check if the Authorization header contains a JWT token signed with the signing key specified in http_authenticator.config.signing_key.

The signing key must be base64 encoded; in the example, the decoded key is password. When using RSA public keys, it is also possible to write them on multiple lines as follows:

searchguard:
    ...
    authc:
      jwt_auth_domain:
        ...
        http_authenticator:
          ...
          config:
            signing_key: |-
              -----BEGIN PUBLIC KEY-----
              123123abcbc
              -----END PUBLIC KEY-----

If the token is decoded successfully, Search Guard will validate the following claims:

  • iat - Issued At: the date when the token was issued (optional).

  • exp - Expiration Time: the date after which the token expires; this claim is optional, but you should set it, otherwise tokens will have unlimited duration.

  • nbf - Not Before: the date before which the token should be rejected (optional).

All dates are expressed as seconds since the Epoch in UTC.

If time claims are validated, Search Guard will get the username from the Subject claim (sub), assign role mappings and evaluate role permissions.

If an HTTP request to the cluster contains an HTTP Basic authorization header it will be authenticated by the HTTP authenticator defined in basic_internal_auth_domain; it is necessary to leave this enabled as the Siren Investigate backend uses this method to authenticate with the cluster.

It is possible to customize the claim used to retrieve the username through the subject_key parameter, for example:

searchguard:
  dynamic:
    http:
      anonymous_auth_enabled: false
      xff:
        enabled: false
    authc:
      jwt_auth_domain:
        enabled: true
        order: 1
        http_authenticator:
          type: jwt
          challenge: false
          config:
            signing_key: |-
              -----BEGIN PUBLIC KEY-----
              123123abcbc
              -----END PUBLIC KEY-----
            subject_key: "service:username"
            jwt_header: "Authorization"
        authentication_backend:
          type: noop

User cache

When using the JWT authentication mechanism it is recommended to disable the Search Guard user cache as each token contains the complete description of the user; this can be done by adding the following setting to elasticsearch.yml:

searchguard.cache.ttl_minutes: 0

Each node must be restarted after writing the setting.

Roles

It is possible to specify user roles in a token claim by setting the roles_key attribute in the authenticator configuration to the desired claim name, for example:

#...
      jwt_auth_domain:
        enabled: true
        order: 1
        http_authenticator:
          type: jwt
          challenge: false
          config:
            roles_key: "roles"
            signing_key: "cGFzc3dvcmQ="
            jwt_header: "Authorization"
#...

After the attribute is set and the configuration is updated, it is possible to assign backend roles to the user by setting the claim defined in http_authenticator.config.roles_key in the token payload, for example:

{
  "sub": "sirenuser",
  "exp": 1495711765,
  "roles": "sales,marketing"
}

Note that in order to map roles set in the JWT token to Search Guard roles you must define a role mapping such as the following:

JWT role mapping
Figure 50. JWT role mapping
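The equivalent entry in sg_roles_mapping.yml maps a Search Guard role to the backend roles carried by the token; this is a sketch, and the role name sg_sales is an assumption:

sg_sales:
  backendroles:
    - sales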

Verification

To verify that Search Guard JWT authentication is working correctly you can generate a JWT token from your application and pass it to Elasticsearch using curl’s -H option, for example:

curl -k -H "Authorization: eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiJraWJpdXNlciJ9.tqCYxJsORvro59Q01J9HUeFpQtauc81CcTlS5bVl93Y" https://localhost:9200/_searchguard/authinfo

To test if it is working correctly before the application is ready, you can use the jwt.io debugger to generate tokens using the signing key defined in sg_config.yml.

Siren Investigate configuration

To enable JWT support in Siren Investigate, set the investigate_access_control.backends.searchguard.authenticator option to http-jwt, in investigate.yml, for example:

investigate_access_control:
  #... existing options
  backends:
    searchguard:
      #... existing options
      authenticator: 'http-jwt'

Then restart Siren Investigate and open it in a browser; you should get a blank page and the URL should end with login.

To test JWT authentication, open your browser console (CTRL+SHIFT+I on Chrome and Firefox) and call setJWTToken on the sireninvestigate object, for example:

window.sireninvestigate
.setJWTToken(yourtoken)
.then(function() {
  console.log('JWT token set.');
})
.catch(function(error) {
  console.log('An error occurred setting the token.');
});

After the token is set, Siren Investigate will store it in an encrypted cookie and send it in every request to the backend; the backend will then forward the JWT token to Search Guard to authenticate the user.

After the token is set, you can switch to the desired Siren Investigate URL by simply changing location.href.

When the user is logged out from the main application, sessionStorage and localStorage should be cleared.

For more information on how to call setJWTToken from the parent frame, refer to the cross frame communication section.

Indexes and relations

Indexes and Relations

In Indexes and Relations you can define relationships between your data tables, whether they are stored on the local Siren Elasticsearch nodes or mapped remotely to JDBC databases. From there you can create dashboards with "relational pivoting" buttons (going from the set of currently selected records to the set of connected records in another table).

Operations that you can do:

  • Configure which Elasticsearch Index(es) or Virtual Index(es) you are going to have available inside Siren Investigate. With Elasticsearch indexes, you can also create new "scripted fields".

  • Define relations between these Indexes - this effectively defines a data model, also known as an ontology: "Indexes" are treated as "Classes" and their records can be seen as "Entities". The ontology also specifies properties of these indexes/classes, for example icons, labels and so on.

  • Define "Entity Identifiers" - these are Classes of strings or integers you may have here and there in the data representing an entity which are "well understood as such" but you do not (yet?) have a specific index listing them. Typical Entity identifiers are things like IP Address: It is an entity (and you want to join on it) but you do not have an "index of all the IPs". Other examples are normalized phone numbers, hashfunctions, userids, name of locations or cities, tags or labels etc.

In summary, from now on "Classes" refers to either Index Patterns or EIDs, and "Entities" refers to either the individual records in Index Patterns or the individual EID values (for example, an IP address).

Creating an index pattern or entity identifier

Just use the Add Index Pattern or Add Entity Identifier button.

Here, you can choose a name for the entity, add long and short descriptions and, using the icon and colour pickers, choose an icon and colour that will be associated with that entity in graph views, for example when you select the Graph View tab.

For more details on creating an index pattern, see Creating an Index Pattern.

Creating relationships

Relationships are defined from a Class to other Classes (but it is not possible to define a relationship between two EIDs).

A relationship is defined as a join operation between two indices with the following fields:

  • The Left Field: the field of the local index to join on;

  • Right Class: (the EID or Index pattern) to connect to;

  • Right Field (only if the Right Class is an Index Pattern): the field of the right index to join with; and

  • Label: the label of the relation.

Indexes and Relations

New relations are created with the Add Relation button. Relations do NOT need to be created in both originating and target classes as they appear automatically in both edit screens when created.

Pressing the visualize data model as a graph button shows a visual representation in which the currently selected class is highlighted, as in the following example:

Relations Graph

How to use entity identifiers

Siren 10 introduces the concept of an "Entity Identifier" (EID). Previously in Siren, to be able to join between two indexes you had to specify that a direct connection existed between them. For example, if you had two logs which could be connected by the IP value, you would have specified a direct connection, thus creating a relational button between the two.

But what if you have many indexes having IPs (or anything else: MAC Addresses, User IDs, URLs, Port Numbers, Transaction IDs, etc) that are in multiple roles (Source IP, Destination IP) and it may be useful to join from any of these roles and indexes to any other role and index?

Our new relational model enables this. Automatically.

For example, in this configuration, we have defined the IP concept as an EID and tied it in with other indexes where "IPs" show up. For each connection, we specify the name of the relation that describes the role of the IP in that index (Is it the source IP in that log or the blocked IP?).

Relations Graph

With just this configuration, you can now have buttons that explore the ontology and show you all possible matches across your data. At this point, one click and you will be pivoting to the target dashboard, with the right relational filter applied.

For example, to see the records of the Apache logs where the Agent IP matches the Destination IP in the current log, just navigate from "Destination IP" as per the picture:

Automatic relational buttons

EIDs are great for anything that identifies "things" across indexes but does not have an index per se (otherwise, you would pivot to it): phone numbers, but also tags, labels from standalone indexes, and so on. In practice a single Excel spreadsheet can be seen as a "knowledge graph" if you consider labels as identifiers that interconnect records. Here is an example with EIDs (Tissue and Organism) in a Life Science deployment.

Knowledge Graph

Note that the automatic connections between dashboards are shown when using the new relational button. The old one still requires manual input about which relation to show where.

Visualise

Again, this is how the new relational button appears in action.

Automatic relational buttons

How to name relations

It is well known that naming is a very hard problem in any domain. In Siren, naming entities and relationships incorrectly will result in dashboards that are hard to navigate.

When naming things, put yourself in the shoes of the user at the moment the relational navigation is performed. Say that I am looking at "companies": how would I refer to "investments"?

A natural way is to say that a "company" received an "investment". On the other hand, if I am thinking of an investment, I can say it has been "secured by" a company.

In the UI, look at the directions of the arrows and think of the sentences "X relationship Y" and "Y relationship X", for example:

How to name relations

In this case we are using two different verbs, but often the simple solution is to use active/passive forms, for example "saw" and "seen by". Sometimes the inverse of the property is the same, for example "brother of" or "spouse".

As a rule of thumb, it is always best to keep things quite short, for example "source" and "is source of".

For more information about the Relational Navigator, the component which provides navigation between relationally connected dashboards, see the Relational Navigator chapter.

Relational browsing

Siren Investigate enables you to filter documents on a dashboard by showing only those that have a relation with documents displayed on a different dashboard, possibly stored in different indices.

Relational Navigator

The Relational Navigator component typically requires no configuration. It automatically discovers all possible destinations based on the relationships between indices defined in Management > Indexes and Relations.

It can be reused across any dashboard (there is no need to create different ones for different dashboards).

Each dashboard has an associated "Saved Search" which is based on an underlying Index Pattern. If the index patterns underlying two dashboards have a relational connection, this relationship will be shown by the Relational Navigator, by default.

It is much simpler than it sounds: just drop this component into a dashboard which is associated with a Saved Search and it will show you all the possible relational connections with other dashboards that have related entities.

The relational filter visualization requires the Siren Federate plugin 5.6.9-10.0.0 for Elasticsearch.

Index to index relations

When two entities are directly connected

For example, let’s take the following indices:

article

an index containing articles; each document in the index has a field called companies which is an array that contains the ID of companies mentioned in the article. This index is displayed on the dashboard entitled Articles.

company

an index containing information about companies; each document in the index has a field called id that contains the ID of the company. This index is displayed on the dashboard entitled Companies.

Both indices are configured so that they are joined on the field companies of article with the field id of company. Then, it is possible to use that configuration in order to create a relational filter that would filter companies based on connected articles (or vice-versa).

In the Articles dashboard, the relational navigator visualization is displayed as a button which indicates the number of documents in the Companies dashboard that are mentioned in the articles of the current dashboard.

The following image shows the button for the relation described in the example; there are two possible destinations from the 646,896 articles currently displayed:

  • 18508 companies on all companies dashboard

  • 2 companies on Companies timeline analysis dashboard

Relational Navigator on the Articles dashboard

Clicking the first button will switch you to the Companies dashboard and display the 18508 companies; the relational filter is displayed in the filter bar:

Relational Navigator filter on the Companies dashboard

Index to Entity Identifier relations

When two entities are connected using an Entity Identifier

For example, let’s take the following indices:

company

an index containing information about companies; each document in the index has a field called city that contains the name of the city this company is located in

investor

an index containing information about investors; each document in the index has a field called city that contains the name of the city this investor is located in

city

an entity identifier which represents the concept of a city

There is no direct relation between investor and company, but thanks to the city entity identifier the Relational Navigator is able to join the two. The following image shows a light-blue button for a relation using the entity identifier described previously.

Relational Navigator on the Investor dashboard

There are two possible destinations using the is located in relation from the 15k investors currently displayed:

  • 54028 companies on all companies dashboard

  • 3 companies on Companies timeline analysis dashboard

Ordering

Use the symbol to switch the ordering

Relational filter button on the Articles dashboard

There are two possible ways to navigate the tree of possible destinations; we can order by:

dashboards - where you first see the destination dashboards and then the relations through which you can reach them

Order by dashboards first

relations - where you first see the relation and then the destination dashboards that it leads to

Order by relation first

Configuration

All possible relations are discovered automatically.

Relational Navigator settings

Hierarchy types (layout)

There are two possible values:

Normal - the entity identifier subtree layout is EIDs → relations → dashboards or EIDs → dashboards → relations

Light - the same layout with the entity identifier subtree collapsed into a more compact form

The following image shows the Light layout activated

Relational Navigator settings, layout

Visibility

In the component settings, you can change the visibility status of each individual connection at multiple levels.

The three-state eye control enables you to specify "never show", "always show" or "inherit show from the previous level".

Users can restrict which destinations are visible by clicking the eye symbols next to the listed destinations.

For example, let's hide one company dashboard destination when we are on the "articles" index:

Relational Navigator settings, hide one destination

As we can see, the Companies timeline analysis destination is no longer displayed on the Articles dashboard.

Relational Navigator on the Articles dashboard, one destination hidden

Relational filter

Deprecated: replaced by the Relational Navigator.

The relational filter visualization enables you to "pivot" from one dashboard to another by creating a join between multiple indices based on their relations. This enables you to interactively build the sequence of dashboards to join.

The relational filter visualization is configured based on the relationships between indices defined in the settings tab. For example, let’s take the following indices:

article

an index containing articles; each document in the index has a field called companies which is an array that contains the ID of companies mentioned in the article. This index is displayed on the dashboard entitled Articles.

company

an index containing information about companies; each document in the index has a field called id that contains the ID of the company. This index is displayed on the dashboard entitled Companies.

Both indices are configured so that they are joined on the field companies of article with the field id of company. Then, it is possible to use that configuration in order to create a relational filter that would filter companies based on connected articles (or vice-versa).

In the Articles dashboard, the relational filter visualization is displayed as a button which indicates the number of documents in the Companies dashboard that are mentioned in the articles of the current dashboard.

The following image shows the button for the relation described in the example; there are 18508 companies mentioned in the 646,896 articles currently displayed:

Relational filter button on the Articles dashboard

Clicking the button will switch you to the Companies dashboard and display the 18508 companies; the relational filter is displayed in the filter bar:

Relational filter on the Companies dashboard
The relational filter visualization requires the Siren Federate plugin 5.6.9-10.0.0 for Elasticsearch.

Configuration

To edit the Relational Filter configuration, click Edit on the Dashboard top navigation bar.

Then click Edit (Visualization Edit Button) on the Relational Filter visualization.

The filter is defined by the following parameters:

  • Button label: the label of the button that will be displayed inside the visualization, for example Companies →.

  • Custom filter label: the label of the filter that will be displayed in the filter bar, which by default is ... related to ($COUNT) from $DASHBOARD. Several variables are available for customizing the label:

    • $COUNT is the number of items on the source dashboard,

    • $DASHBOARD is the source dashboard name.

  • Source dashboard: optional parameter that indicates the dashboard on which the relational filter should appear.

  • Target dashboard: the dashboard to join the current dashboard with. If the Source dashboard parameter is set, it is used as the current dashboard.

  • Relation: the label of the relation between indices to use for this relational filter. This is set in the relations settings tab.

The following image shows the configuration of a relation from the Articles dashboard to the Companies dashboard, using the mentions relation:

Relational filter configuration

It is possible to define multiple relations in a single Siren Investigate relational filter visualization; the visualization will display only buttons applicable to the currently displayed dashboard.

Usage

When clicking a button in the relational filter visualization, the current state of the source dashboard is added to the relational filter and applied to the target dashboard. Move the mouse over the relational filter to see an explanation of what is being joined.

Walkthrough example

We start on the Articles dashboard, search for pizza and click the relational filter to switch to the Companies dashboard.

Relational filter explanation

Hovering over the blue filter displays an explanation. It indicates that the relational filter involves only one join, that is the one from Articles to Companies with pizza filtering the articles.

Relational filter explanation

Next, we add a regular filter to the Companies dashboard by clicking Positive Filter (Positive Filter) in the USA row of the Companies by Country visualization.

Relational filter explanation

Now, we click the Investment rounds → button which takes us to the Investment rounds dashboard. The explanation on that filter shows that the investment rounds are filtered as follows:

  • the current investments rounds are joined with companies from the USA; and

  • those companies are joined with articles which match the term pizza.

Relational filter explanation
The sequence of joins in the explanation is shown in reverse; that is, the last join is on top.

Viewing detailed information

To display the raw data behind the visualization, click Spy Open (Spy Open Button) at the bottom left of the container. Tabs with detailed information about the raw data replace the visualization, as in this example:

Spy panel of the relational filter visualization

This panel provides two kinds of data: information about the query behind the relational filter in the Multi Search tab, and details about the visualization object in the Debug tab.

The Multi Search tab presents information about the msearch request executed to perform the joins; a relational filter corresponds to one query of the msearch.

At the top, the time reported in Multi search request duration indicates how long the msearch request took. There is also additional information about each query of the msearch:

  • Query Duration: The time spent for this particular query.

  • Hits: the total number of documents resulting from the query.

  • Index: the index pattern used to execute the query.

  • Type: the type of the indices matched by the index pattern.

For a particular relational filter, you can get additional information about the query that was executed.

Raw Request

The filterjoin query as sent by Siren Investigate. This uses the internal API for defining the join.

Translated Request

The filterjoin query as sent to the Elasticsearch cluster, presented in JSON format.

Response

The raw response from the server, presented in JSON format.

Debug

The Debug tab presents the JSON object that Siren Investigate uses for this relational filter.

Debug spy panel of the relational filter visualization

Join task limit

The number of unique values returned from the source of the relation is limited by the kibi:joinTaskTimeout Advanced Setting in the management section. These source values are then used to filter the documents on the destination. In general, the destination is the current dashboard.

For more on this and how to set the limit for each relation individually, see the Join Limit section of the Relation Panel documentation.

JDBC datasources

Setting up Siren Investigate to work with JDBC datasources.

Siren Investigate can analyze data by directly querying remote JDBC datasources using the Siren Federate plugin.

To create dashboards on JDBC datasources you must:

  • enable JDBC support on at least one node in the Elasticsearch cluster and install the appropriate JDBC driver, as described in the following sections;

  • configure a JDBC datasource in Siren Investigate;

  • map the tables or views of interest to virtual indices.

You can then configure index patterns on your virtual indices, display them in Discover and configure dashboards and visualizations for the supported aggregations.

Siren Federate plugin configuration

The Federate plugin stores its configuration in two Elasticsearch indices:

  • .siren-federate-datasources: used to store the JDBC configuration parameters of remote datasources.

  • .siren-federate-indices: used to store the configuration parameters of virtual indices.

You should restrict access to these indices only to the Federate user, as explained later in the document.

Settings

In order to send queries to virtual indices, the Elasticsearch cluster must contain at least one node enabled to issue queries over JDBC; it is advised to use a coordinating-only node for this role, although this is not a requirement for testing purposes.

JDBC node settings

In order to enable JDBC on a node where the Siren Federate plugin is installed, add the following setting to elasticsearch.yml:

node.attr.connector.jdbc: true

Then, create a folder named jdbc-drivers inside the configuration folder of the node (for example elasticsearch/config or /etc/elasticsearch).

Finally, copy the JDBC driver for your remote datasource and its dependencies to the jdbc-drivers folder you created and restart the node; see the JDBC driver installation and compatibility section for a list of compatible drivers and dependencies.
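For example, on a package-based Linux installation the steps might look as follows; the paths and the driver file name are assumptions:

# create the driver folder inside the node configuration folder
mkdir /etc/elasticsearch/jdbc-drivers
# copy the JDBC driver jar, then restart the node
cp postgresql-<version>.jar /etc/elasticsearch/jdbc-drivers/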

Common configuration settings

Encryption

JDBC passwords are encrypted by default using a predefined 128 bit AES key; before creating datasources, it is advised to generate a custom key by running the keygen.sh script included in the siren-federate plugin folder as follows:

bash plugins/siren-federate/tools/keygen.sh -s 128

The command will output a random base64 key; it is also possible to generate keys longer than 128 bits if your JVM supports it.

To use the custom key, the following parameters must be set in elasticsearch.yml on master nodes and on all the JDBC nodes:

  • siren.connector.encryption.enabled: true by default, can be set to false to disable JDBC password encryption.

  • siren.connector.encryption.secret_key: a base64 encoded AES key used to encrypt JDBC passwords.

Example elasticsearch.yml settings for a master node with a custom encryption key:

siren.connector.encryption.secret_key: "1zxtIE6/EkAKap+5OsPWRw=="

Example elasticsearch.yml settings for a JDBC node with a custom encryption key:

siren.connector.encryption.secret_key: "1zxtIE6/EkAKap+5OsPWRw=="
node.attr.connector.jdbc: true

Restart the nodes after changing the configuration to apply the settings.

Cluster wide settings

The following parameters can be set in elasticsearch.yml on JDBC nodes or by using the Elasticsearch cluster update settings API:

  • siren.connector.timeout.connection: the maximum number of seconds to wait when establishing or acquiring a JDBC connection (30 by default).

  • siren.connector.timeout.query: the maximum execution time for JDBC queries, in seconds (30 by default).

  • siren.connector.enable_union_aggregations: true by default, can be set to false to disable the use of unions in nested aggregations.
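For example, to raise the query timeout at runtime through the cluster update settings API (a sketch assuming a local unsecured cluster):

# persistently raise the JDBC query timeout to 60 seconds
curl -X PUT "http://localhost:9200/_cluster/settings" \
  -H 'Content-Type: application/json' -d '
{
  "persistent": {
    "siren.connector.timeout.query": 60
  }
}'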

Authentication

The Federate server role

If your cluster is protected by Search Guard or Elastic X-Pack, you must define a role with access to the Federate indices and internal operations and create a user having this role.

For interoperability with these plugins, whenever a virtual index is created the Federate plugin creates a concrete Elasticsearch index with the same name as the virtual index; when starting up, the Federate plugin will check for missing concrete indices and will attempt to create them automatically.

Sample Search Guard role definition:

federateserver:
  cluster:
    - "indices:admin/aliases"
  indices:
    ?siren-federate-datasources:
      '*':
        - ALL
    ?siren-federate-indices:
      '*':
        - ALL
    ?siren-federate-target:
      '*':
        - ALL

Sample X-Pack role definition:

{
  "cluster": [
    "monitor",
    "cluster:admin/siren/connector"
  ],
  "indices" : [
    {
      "names" : [ "*" ],
      "privileges" : [ "create_index", "indices:data/read/get", "indices:admin/siren/connector" ]
    },
    {
      "names" : [ ".siren-federate-*" ],
      "privileges" : [ "all", "indices:admin/siren/connector" ]
    }
  ]
}

Then create a user with that role, for example a user called federateserver.

Example elasticsearch.yml settings for a master node in a cluster with authentication and federateserver user:

siren.connector.username: federateserver
siren.connector.password: password
siren.connector.encryption.secret_key: "1zxtIE6/EkAKap+5OsPWRw=="

Example elasticsearch.yml settings for a JDBC node in a cluster with authentication and federateserver user:

siren.connector.username: federateserver
siren.connector.password: password
siren.connector.encryption.secret_key: "1zxtIE6/EkAKap+5OsPWRw=="
node.attr.connector.jdbc: true

Restart the nodes after setting the appropriate configuration parameters.

Administrative role

In order to manage datasources and virtual indices, it is required to grant the cluster:admin/siren/connector/* permissions at the cluster level.

In addition, the user must have the indices:admin/siren/connector/* and indices:data/siren/connector/* permissions on all the index names that they are allowed to define, in addition to create, write, read and search permissions.

Write permissions are required because, when a virtual index is defined, the plugin creates a concrete Elasticsearch index with the same name for interoperability with authentication plugins, unless such an index already exists.

Example Search Guard role allowed to manage virtual indices starting with db-:

sirenadmin:
  cluster:
    - SIREN_CLUSTER
    - cluster:admin/plugin/siren/license/put
    - cluster:admin/plugin/siren/license/get
    - cluster:admin/siren/connector/*
  indices:
    'db-*':
      '*':
        - SIREN_READWRITE
        - indices:admin/create
        - indices:admin/siren/connector/*
    '*':
      '*':
        - SIREN_COMPOSITE

Example X-Pack role allowed to manage virtual indices starting with db-:

{
  "cluster": [
    "cluster:admin/siren/connector"
    "cluster:admin/plugin/siren/license",
    "cluster:siren/internal",
    "manage"
  ],
  "indices" : [
    {
      "names" : [ "*" ],
      "privileges" : [ "indices:siren/mplan" ]
    },
    {
      "names" : [ "db-*" ],
      "privileges" : [
        "read",
        "create_index",
        "view_index_metadata",
        "indices:data/siren",
        "indices:siren",
        "indices:admin/version/get",
        "indices:admin/get",
        "indices:admin/siren/connector"
      ]
    }
  ]
}

Search role

In order to search virtual indices, users must have the indices:data/siren/connector/* permission on these indices in addition to the standard read and search permissions.

Example Search Guard role allowed to search virtual indices starting with db-:

sirenuser:
  cluster:
    - SIREN_CLUSTER
  indices:
    '*':
      '*':
        - SIREN_COMPOSITE
    'db-*':
      '*':
        - SIREN_READONLY
        - indices:data/siren/connector/*

Example X-Pack role allowed to search virtual indices starting with db-:

{
  "cluster": [
    "cluster:admin/plugin/siren/license/get",
    "cluster:siren/internal"
  ],
  "indices" : [
    {
      "names" : [ "*" ],
      "privileges" : [ "indices:siren/mplan" ]
    },
    {
      "names" : [ "db-*" ],
      "privileges" : [
        "read",
        "view_index_metadata",
        "indices:data/siren",
        "indices:siren",
        "indices:admin/version/get",
        "indices:admin/get"
      ]
    }
  ]
}

JDBC driver installation and compatibility

The JDBC driver for your remote datasource and its dependencies must be copied to the jdbc-drivers subfolder inside the configuration folder of JDBC nodes (for example elasticsearch/config/jdbc-drivers).

It is not required nor recommended to copy these drivers to nodes which are not enabled to execute queries.

Table 3. Supported JDBC drivers; each entry lists the driver name, the JDBC driver class and installation notes.

PostgreSQL

org.postgresql.Driver

Download the latest JDBC 4.2 driver from https://jdbc.postgresql.org/download.html and copy the postgresql-<version>.jar file to the jdbc-drivers folder.

MySQL

com.mysql.jdbc.Driver

Download the latest GA release from https://dev.mysql.com/downloads/connector/j/, extract it, then copy mysql-connector-java-<version>.jar to the jdbc-drivers plugin folder.

When writing the JDBC connection string, set the useLegacyDatetimeCode parameter to false to avoid issues when converting timestamps.
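For example, a MySQL connection string with this parameter set might look as follows; the host, port and database name are assumptions:

jdbc:mysql://mysql.example.com:3306/sales?useLegacyDatetimeCode=false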

Microsoft SQL Server 2014 or greater

com.microsoft.sqlserver.jdbc.SQLServerDriver

Download sqljdbc_<version>_enu.tar.gz from https://www.microsoft.com/en-us/download/details.aspx?id=55539, extract it, then copy mssql-jdbc-<version>.jre8.jar to the jdbc-drivers folder.

Sybase ASE 15.7+

com.sybase.jdbc4.jdbc.SybDriver

OR

net.sourceforge.jtds.jdbc.Driver

To use the FreeTDS driver, download the latest version from https://sourceforge.net/projects/jtds/files/, extract it, then copy jtds-<version>.jar to the jdbc-drivers folder.

To use the jConnect driver, copy jConnect-<version>.jar from your ASE folder to the jdbc-drivers folder.

Oracle 12c+

oracle.jdbc.OracleDriver

Download the latest ojdbc8.jar from http://www.oracle.com/technetwork/database/features/jdbc/jdbc-ucp-122-3110062.html and copy it to the jdbc-drivers plugin folder.

Presto

com.facebook.presto.jdbc.PrestoDriver

Download the latest JDBC driver from https://prestodb.io/docs/current/installation/jdbc.html and copy it to the jdbc-drivers plugin folder.

Spark SQL 2.2+

com.simba.spark.jdbc41.Driver

The Magnitude JDBC driver for Spark can be purchased at https://www.simba.com/product/spark-drivers-with-sql-connector/; once downloaded, extract the bundle, then extract the JDBC 4.1 archive and copy the following jars to the jdbc-drivers plugin folder:

SparkJDBC41.jar

commons-codec-<version>.jar

hive_metastore.jar

hive_service.jar

libfb303-<version>.jar

libthrift-<version>.jar

ql.jar

TCLIServiceClient.jar

zookeeper-<version>.jar

In addition, copy your license file to the jdbc-drivers plugin folder.

Dremio

com.dremio.jdbc.Driver

Download the jar at https://download.siren.io/dremio-jdbc-driver-1.4.4-201801230630490666-6d69d32.jar and copy it to the jdbc-drivers plugin folder.

Impala

com.cloudera.impala.jdbc41.Driver

Download the latest JDBC bundle from https://www.cloudera.com/downloads/connectors/impala/jdbc/2-5-42.html, extract the bundle, then extract the JDBC 4.1 archive and copy the following jars to the jdbc-drivers plugin folder:

ImpalaJDBC41.jar

commons-codec-<version>.jar

hive_metastore.jar

hive_service.jar

libfb303-<version>.jar

libthrift-<version>.jar

ql.jar

TCLIServiceClient.jar

zookeeper-<version>.jar

Restart the JDBC node after copying the drivers.

Siren Investigate Datasource Configuration

Open Siren Investigate in your browser, then go to Management/Datasource:

Navigate to Management/Datasource

Select the JDBC choice in the dropdown:

Select JDBC option

The datasource configuration supports the following parameters:

  • Database name: the name of the default database / catalog on the remote datasource (usually optional).

  • Datasource name: the name of the datasource (for example mysql-sales)

  • Driver class: the JDBC driver class name (for example com.mysql.jdbc.Driver)

  • Username and Password: the credentials of the user that will be used by the plugin to open connections.

  • Timezone: if date and timestamp fields are stored in a timezone different from UTC, specifying this parameter will instruct the plugin to convert dates and times to/from the specified timezone when performing queries and retrieving results.

  • Connection string: the JDBC connection string; see the JDBC driver installation and compatibility section for information about database specific connection string parameters.

Fill in the required parameters, then press Save in the top right corner.

Fill in connection parameters

Check the configuration by pressing Test Connection. If the settings are properly configured you should get the following feedback:

Test connection

Press Yes, take me there to map a table from the DB to a virtual index:

Virtual Index Configuration

The virtual index configuration supports the following parameters:

  • Datasource name: the name of an existing datasource.

  • Resource name: the name of a table or view on the remote datasource.

  • Virtual index name: the name of the virtual index; this must be a valid lowercase Elasticsearch index name. You should start virtual indices with a common prefix to simplify handling of permissions.

  • Primary key: the name of a unique column; if a virtual index has no primary key it will be possible to perform aggregations, however visualizations that require a unique identifier, such as the graph browser, will not be usable on the index.

  • Catalog and Schema: the catalog and schema containing the table specified before; these are usually required only if the connection does not specify a default catalog or schema.

  • Search fields: an optional list of field names that will be searched using the LIKE operator when processing queries written in the search bar.

After the virtual index is configured, press Save in the top right corner; press Yes, take me there to create an index pattern pointing to the virtual index.

Virtual Index Configuration Success

Press Add Index Pattern and fill in the name with the same name used for the Virtual Index, in this example indexfromdb, and press Create.

Index Pattern Configuration

From this point, the indexfromdb index pattern can be used in Discover, Visualize and so on.

Operations on virtual indices

The plugin supports the following operations on virtual indices:

  • get mapping

  • get field capabilities

  • search

  • msearch

  • get

  • mget

Search requests involving a mixture of virtual and normal Elasticsearch indices (for example, when using a wildcard) are not supported and will be rejected; it is, however, possible to issue msearch requests containing requests on normal Elasticsearch indices and virtual indices, as in the sketch below.
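A sketch of such a mixed msearch request, assuming a local unsecured cluster, a normal index called company and a virtual index called db-sales:

# one msearch body mixing a normal index and a virtual index;
# --data-binary preserves the newlines required by the ndjson format
curl -X POST "http://localhost:9200/_msearch" \
  -H 'Content-Type: application/x-ndjson' --data-binary @- <<'EOF'
{"index":"company"}
{"query":{"match_all":{}}}
{"index":"db-sales"}
{"query":{"term":{"country":"US"}}}
EOF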

When creating a virtual index, the plugin will create an empty Elasticsearch index for interoperability with Search Guard and X-Pack; if an Elasticsearch index with the same name as the virtual index already exists and it is not empty, the virtual index creation will fail.

When deleting a virtual index, the corresponding Elasticsearch index will not be removed.

Type conversion

The plugin converts JDBC types to their closest Elasticsearch equivalent:

  • String types are handled as keyword fields.

  • Boolean types are handled as boolean fields.

  • Date and timestamp are handled as date fields.

  • Integer types are handled as long fields.

  • Floating point types are handled as double fields.

Complex JDBC types which are not recognized by the plugin are skipped during query processing and resultset fetching.

Supported search queries

The plugin supports the following queries:

  • match_all

  • term

  • terms

  • range

  • exists

  • prefix

  • wildcard

  • ids

  • bool

At this time the plugin provides no support for datasource specific full text search functions, so all these queries will work as if they were issued against keyword fields.

Supported aggregations

Currently the plugin provides support for the following aggregations:

Metric:

  • Average

  • Cardinality

  • Max

  • Min

  • Sum

Bucket:

  • Date histogram

  • Histogram

  • Date range

  • Range

  • Terms

  • Filters

Only terms aggregations can be nested inside a parent bucket aggregation.

Known Limitations

  • Cross backend join currently supports only integer keys.

  • Cross backend support has very different scalability depending on the direction of the join: a join which involves sending IDs to a remote system may be hundreds of times less scalable (for example, thousands instead of millions of keys) than one where the keys are fetched from a remote system.

  • Only terms aggregations can be nested inside a parent bucket aggregation.

  • The missing parameter in bucket aggregations is not supported.

  • Scripted fields are not supported.

  • When issuing queries containing string comparisons, the plugin does not force a specific collation; if a table behind a virtual index uses a case insensitive collation, string comparisons will be case insensitive.

  • Wildcards on virtual index names are not supported by any API; a wildcard search will silently ignore virtual indices.

  • Currently cross cluster searches on virtual indices are not supported.

Troubleshooting

Cannot reconnect to datasource by hostname after DNS update

When the Java security manager is enabled, the JVM will cache name resolutions indefinitely; if the system you are connecting to uses round-robin DNS or the IP address of the system changes frequently, you must modify the following Java Security Policy properties:

  • networkaddress.cache.ttl: the number of seconds to cache a successful DNS lookup. Defaults to -1 (forever).

  • networkaddress.cache.negative.ttl: the number of seconds to cache an unsuccessful DNS lookup. Defaults to 10, set to 0 to avoid caching.
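These are Java security properties rather than system properties; a minimal sketch of the corresponding entries in the JVM's java.security file (for example jre/lib/security/java.security) follows, where the 60 second TTL is an assumption:

# cache successful DNS lookups for 60 seconds instead of indefinitely
networkaddress.cache.ttl=60
# keep caching failed lookups for the default 10 seconds
networkaddress.cache.negative.ttl=10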

Legacy REST datasources

Siren Investigate provides visualizations and aggregations to integrate data from REST APIs. This section explains how to configure queries and query templates.

Configuration

To create a new external datasource navigate to "Settings/Datasources".

First fill in the datasource title and name, select REST, and then set the following parameters:

  • timeout: connection timeout in milliseconds.

  • cache_enabled: enable server side cache for this datasource.

  • max_age: the max age of an object in the cache, in milliseconds.

  • url: the URL of the REST API.

  • response_type: the format of API results; currently Siren Investigate supports only JSON.

  • username: if set, the username to specify in HTTP Basic credentials.

  • password: optional password to specify in HTTP Basic credentials if a username is set.

  • auth_token: optional token to set in Token Authentication headers.

To control the maximum number of query results kept in cache, set the investigate_core.datasource_cache_size parameter in investigate.yml and restart Siren Investigate.

Parameters encryption

Sensitive datasource parameters like passwords are encrypted before being stored in the backend.

Before creating datasources containing sensitive parameters, ensure to set a custom encryption key by running the replace_encryption_key command:

bin/investigate replace_encryption_key [options] <current_key> <new_key> <new_cipher>

  • current_key: a base64 encoded string containing the current encryption key.

  • new_key: a base64 encoded string containing the new encryption key.

  • new_cipher: the cipher algorithm to use (currently only AES-GCM is supported).

The current encryption key can be read from the investigate.yml file in the datasource_encryption_key parameter.

Keys can have a length of 16, 24 or 32 bytes; a quick way to encode a plaintext string to base64 is to use the base64 utility from the coreutils package:

$ echo -n changemechangemechangemechangeme | base64
Y2hhbmdlbWVjaGFuZ2VtZWNoYW5nZW1lY2hhbmdlbWU=

Make sure to set the configuration file as readable only by the user running the Siren Investigate process.

Datasource entity selection

Selected entities can be used as a source of parameters for queries. Each selected entity is uniquely identified by a URI:

  • INDEX/TYPE/ID where INDEX is an index pattern, TYPE is a type of a document, and ID is document ID.
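For example, the URI company/Company/AVgfaYQ0Q2VQXwxDgyfY identifies the document with ID AVgfaYQ0Q2VQXwxDgyfY of type Company in the company index pattern; this is the entity selected in the query example later in this section.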

As explained in the following sections, queries on external datasources can extract variables from the selected entity URI; in order to allow the user to select an entity, you must add an Enhanced search results visualization to a dashboard and configure at least one click handler to select an entity.

After the visualization is configured, clicking the cell will display a purple box in the filter bar, and the variables stored in the entity URI will be available to queries and query templates.

The following image shows the effect of clicking a cell configured with an entity selection handler; after selecting an entity, the Company Info template viewer shows the information about the company fetched by a query.

Entity selection
Entity selection configuration example

To disable or cancel the selection, click the icons displayed inside the entity selection widget when the mouse is over it, as shown in the following image:

Entity selection options

Queries

Queries can be used to provide data to templates and to tag and filter Elasticsearch documents.

To create a new query, click the "Settings/Queries" tab.

You then need to set the following fields to define a query:

  • Title: the title of the query.

  • Datasource: the name of a configured datasource.

  • Results query: the query declaration.

You may also set a description for the query and one or more tags.

The following is an example configuration of a query on a SQL database called Top 50 companies (HR count) that returns the Top 50 companies by number of employees in a table called company.
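A possible results query for this example follows; this is a sketch, and the column names are assumptions about the schema:

SELECT label, number_of_employees
FROM company
ORDER BY number_of_employees DESC
LIMIT 50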

Configuration of a SQL endpoint

The preview section will display the results of the query as a table or as a JSON object.

Template rendering is currently a blocking operation; therefore, queries returning a large number of results may make the backend unresponsive for an indeterminate amount of time.

Query variables

One of the most useful features of queries is that it is possible to set some of their parameters before execution by using datasource specific variables, which can be set at runtime by configuring click handlers in the Enhanced search results visualization to select an entity.

Variable values are taken from the Elasticsearch document selected using the selected entity URI.

All properties of the selected document can be accessed using the following syntax: @doc[PATH_ELEMENT_1][PATH_ELEMENT_2]...[PATH_ELEMENT_N]@

  • to get the document id use: @doc[_id]@

  • to get the value of property called action use: @doc[_source][action]@

  • to get the value of nested property called person.age use: @doc[_source][person][age]@

In order to view the results of the query, you have to specify an entity URI manually in the field on the top right.

The following is an example configuration for a query named Company Info using a variable to get the value of the property called id of the currently selected entity. In the example, @doc[_source][id]@ is replaced with the id taken from the selected company. In the Selected Entity box we see that the selected company is from the index company, has the type Company and has the id AVgfaYQ0Q2VQXwxDgyfY.

SQL query with variables
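The underlying results query might look like the following sketch; the selected columns are assumptions:

SELECT label, description, homepage
FROM company
WHERE id = '@doc[_source][id]@'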

Activation query

An activation query can be specified to conditionally execute the results query.

For example, if you have a table called Vehicles but some of the queries are only relevant to "Motorcycles" and not to "Cars", the activation query could be used to determine, by looking at its type, whether the results query should be executed when an entity in Vehicles is selected. If the query is not executed, any template or aggregator using the query will be automatically disabled.

On SQL datasources, activation queries will trigger results query execution when returning at least one record.

Example:

SELECT id
FROM Vehicles
WHERE id='@doc[_source][id]@' AND vehicle_type='Motorcycle'

Use cases

After you have configured query templates and queries, you can use them in the visualizations that support them, such as the Siren Investigate Query Viewer.

It is also possible to use queries as aggregations as explained as follows.

External query terms filters aggregation

The query results from an external data source can be used as an aggregation in visualizations.

This enables you to compute metrics on Elasticsearch documents joined with query results.

To use a query as an aggregation, select a bucket type and select External Query Terms Filter in the Aggregation dropdown; then, click the Add an external query terms filter button.

You can then configure how to join the query results with the Elasticsearch documents by setting the following parameters:

  • Source query id: the name of the query on the external datasource.

  • Source query variable: the name of the variable in query results which contains the first value used in the join.

  • Target field: the name of the field in the target index which contains the second value used in the join.

The aggregation will return only documents in the Elasticsearch index whose target field value is equal to the source query variable value in at least one of the results returned by the query. If Negate the query is checked, the aggregation will instead return only documents whose target field value is not equal to any of the values of the source query variable in the results returned by the query.

For example, the following image shows the configuration of a Data table visualization with three aggregations based on external queries:

  • A query that selects the labels of the competitors of the currently selected company

  • A query that selects the labels of all the companies which have a competitor

  • A query that selects the IDs of the top 500 companies by number of employees

If a query requires a selected entity and no entity is selected, the computed aggregation will return 0; the Selected entity controls will also indicate (with red borders) that it is necessary to select one.

Configuration of an external query terms filter aggregation on a data table visualization

The following image shows the configuration of two external query terms filter aggregations on a pie chart visualization:

Configuration of an external query terms filter aggregation on a pie chart visualization

Siren Investigate Gremlin Server

The Siren Investigate Gremlin Server component is a backend component required by the Siren Investigate Graph Browser visualization.

In order to enable the Gremlin Server, ensure that investigate.yml contains the following configuration:

investigate_core:
  gremlin_server:
    url: http://127.0.0.1:8061
    path: gremlin_server/gremlin-es2-server.jar

To use Gremlin Server with an authentication enabled cluster, refer to the Authentication and access control section.

Log4J file configuration path:

The Log4J configuration file is optional for the Gremlin Server. If you want to use your own custom configuration, specify the path to your file with the investigate_core.gremlin_server.log_conf_path parameter in your investigate.yml file. Here is an example log4j.properties file for your Gremlin Server:

# For the general syntax of property based configuration files see
# the documentation of org.apache.log4j.PropertyConfigurator.

# The root category uses two appenders: A1 and FILE.
# Both gather all log output starting with the priority INFO.
log4j.rootLogger=INFO, A1, FILE

log4j.appender.A1=org.apache.log4j.ConsoleAppender
log4j.appender.A1.layout=org.apache.log4j.PatternLayout
log4j.appender.A1.threshold=INFO
# Print the date in ISO 8601 format
log4j.appender.A1.layout.ConversionPattern=%d [%t] %-5p %c - %m%n

log4j.appender.FILE=org.apache.log4j.FileAppender
log4j.appender.FILE.append=true
log4j.appender.FILE.file=log/gremlin-server.log
log4j.appender.FILE.threshold=INFO
log4j.appender.FILE.layout=org.apache.log4j.PatternLayout
log4j.appender.FILE.layout.ConversionPattern=%-5p %c: %m%n


# Print only messages of level WARN or higher in the package org.springframework
log4j.logger.org.springframework=WARN

Cross frame communication

To allow cross frame communication, Siren Investigate exposes an object at window.sireninvestigate; the object can be called only if both Siren Investigate and the container page are in the same domain.

Methods

generateShortUrl(shareAsEmbed, displayNavBar)

Generates a shortened URL containing the current Siren Investigate state and returns a promise fulfilled with the URL.

Parameters:

  • shareAsEmbed: if set to true, the top navigation bar and dashboard tabs will be hidden when opening the shortened URL.

  • displayNavBar: if set to true, the dashboard tabs will not be hidden when shareAsEmbed is set to true.

Sample usage:

Put the following code in the container page, replacing investigateframe with the ID of the frame in which Siren Investigate is embedded:

document.getElementById('investigateframe')
.contentWindow
.sireninvestigate
.generateShortUrl(true, true)
.then(function(url) {
  console.log("Generated URL: " + url);
})
.catch(function(error) {
  console.log("An error occurred while generating the URL");
});

If possible, you should purge old documents of type url from the .siren index periodically; old documents can be identified by looking at the createDate attribute.
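
For example, the following command is a minimal sketch that deletes url documents older than 30 days (the Elasticsearch URL and the retention period are assumptions; adjust both to your deployment):

curl -X POST "http://localhost:9220/.siren/url/_delete_by_query" -H "Content-Type: application/json" -d '
{
  "query": {
    "range": {
      "createDate": {
        "lte": "now-30d"
      }
    }
  }
}'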

setJWTToken(token)

Sets or updates the JWT token for the current session if JWT authentication support is enabled; returns a Promise after the token has been sent to the backend.

Parameters:

  • token: a base64 encoded JWT token.

Sample usage:

Put the following code in the container page, replacing investigateframe with the ID of the frame in which Siren Investigate is embedded:

document.getElementById('investigateframe')
.contentWindow
.sireninvestigate
.setJWTToken(`eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiJraWJpdXNlciJ9.kZhLu15FwxrX4hPE1ciyzw_NufZ_oH7aSGpLZHachPg`)
.then(function() {
  console.log('JWT token set.');
})
.catch(function(error) {
  console.log('An error occurred setting the token.', error);
});

After the token is set, you can change the Siren Investigate URL and the user should be authenticated; the application should call the method again with an updated token before the current one expires.
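
For example, the following is a minimal sketch of a refresh loop; fetchTokenFromYourBackend() is a hypothetical helper that returns a Promise resolving to a fresh base64 encoded JWT token, and the 55 minute interval assumes tokens that expire after one hour:

function refreshToken() {
  // Hypothetical helper: obtain a fresh token from your authentication backend.
  fetchTokenFromYourBackend()
    .then(function(token) {
      return document.getElementById('investigateframe')
        .contentWindow
        .sireninvestigate
        .setJWTToken(token);
    })
    .then(function() {
      console.log('JWT token refreshed.');
    })
    .catch(function(error) {
      console.log('An error occurred refreshing the token.', error);
    });
}

// Refresh well before the current token expires.
setInterval(refreshToken, 55 * 60 * 1000);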

Loading data into Elasticsearch

This chapter contains basic information on how to load data into Elasticsearch for evaluation purposes.

From a SQL database using Logstash

The indices in the Siren Platform demo distribution have been populated by running four Logstash configurations over the SQLite database in siren-investigate/crunchbase.db.

The database has the following schema:

SQLite database schema

Index setup

Before loading data, we need to set up indices and mappings; for example, let’s create an index called company-minimal in the Elasticsearch cluster at http://localhost:9220.

Create the index by running the following command in a terminal window:

curl -X PUT http://localhost:9220/company-minimal

If curl is not available on your system, download it from http://curl.haxx.se/download.html.

If the index is created correctly, Elasticsearch will return the following response:

{"acknowledged":true}

If you want to destroy the index and start from scratch, execute the following command:

curl -X DELETE http://localhost:9220/company-minimal

Mapping definition

Mappings allow you to configure how documents are stored in the index; for example, they define how fields are matched by the search engine and set their types (strings, dates, numbers, locations, and so on).

For detailed documentation about indices and mappings, we recommend reading the Elasticsearch Reference.

Let’s define a simple mapping to describe a company. The mapping will define the following fields:

  • id: the ID of the company in the SQLite database

  • name: the name of the company

  • description: a description of the company

  • homepage: the URL of the company homepage

  • number_of_employees: the number of employees

  • location: the geographical coordinates of the company

Open a text editor and paste the following text:

{
    "CompanyMinimal": {
        "properties": {
            "id": {
                "type": "keyword"
            },
            "number_of_employees": {
                "type": "long"
            },
            "name": {
                "type": "text"
            },
            "description": {
                "type": "text"
            },
            "homepage": {
                "type": "keyword"
            },
            "location": {
                "type": "geo_point"
            }
        }
    }
}

CompanyMinimal is the name of the mapping; properties contains the options for each field.

Save the file to demo/example/CompanyMinimal.mapping inside the folder where you extracted the demo distribution.

To apply the mapping, execute the following command:

curl -X PUT "http://localhost:9220/company-minimal/_mapping/CompanyMinimal" -d "@demo/example/CompanyMinimal.mapping"

If the mapping is created correctly, Elasticsearch will return the following response:

{"acknowledged":true}

SQL query definition

To extract the values that will be loaded into the index by Logstash, we need to write a SQL query. Open a text editor and paste the following query:

SELECT id,
  label AS name,
  description,
  homepage_url as homepage,
  number_of_employees,
  CASE WHEN lat IS NULL THEN
    NULL
  ELSE
    lat || ', ' || lng
  END AS location
  FROM company
  LEFT JOIN company_geolocation ON company.id = company_geolocation.companyid
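
The CASE expression concatenates the latitude and longitude into a single "lat, lng" string (for example, "53.33, -6.25"), which Elasticsearch parses as a geo_point; companies without coordinates get a NULL location.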

Save the file to demo/example/company-minimal.sql inside the folder where you extracted the demo distribution.

Logstash configuration

We now need to write a Logstash configuration to process the records returned by the query and populate the company-minimal index.

Support for SQL databases is provided by the Logstash jdbc input plugin; you must download Logstash to the demo/example folder and install the required plugin.
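
Assuming Logstash 5.x has been extracted to demo/example/logstash, the plugin can be installed with a command like the following (a sketch; the exact path depends on where you extracted Logstash):

cd demo/example
logstash/bin/logstash-plugin install logstash-input-jdbc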

Open a text editor and paste the following:

input {
  jdbc {
    jdbc_driver_library => "sqlitejdbc-v056.jar"
    jdbc_driver_class => "org.sqlite.JDBC"
    jdbc_connection_string => "jdbc:sqlite:crunchbase.db"
    jdbc_user => ""
    jdbc_password => ""
    statement_filepath => "company-minimal.sql"
    jdbc_paging_enabled => true
    jdbc_page_size => 10000
  }
}

filter {
  mutate {
    remove_field => ["@timestamp", "@version"]
  }
}

output {
  elasticsearch {
    hosts => "localhost:9220"
    manage_template => false
    action => "index"
    index => "company-minimal"
    document_type => "CompanyMinimal"
  }
}

The statement_filepath parameter specifies the path to the file containing the SQL query; the jdbc_* parameters set the database connection string and authentication options.

The mutate filter is configured to remove default Logstash fields which are not needed in the destination index.

The output section specifies the destination index; manage_template is set to false as the index mapping has been explicitly defined in the previous steps.

Save the file to demo/example/company-minimal.conf.

Copy the SQLite database to demo/example/crunchbase.db, then go to the demo/example folder and run the following command:

cd demo/example
logstash/bin/logstash -f company-minimal.conf

Logstash will execute the query and populate the index.

For more information about Logstash, we recommend reading the Logstash reference and the jdbc input plugin documentation.

Browsing the index in Siren Investigate

Open http://localhost:5606 in your browser, click the Management tab, then click Index Patterns.

Deselect Index contains time-based events, then write company-minimal in the Index name or pattern field:

Adding the company-minimal index

Click Create to create the index reference, then click the Discover tab and select company-minimal in the dark gray dropdown:

Discovering the company-minimal index

Click the right arrow at the beginning of each row to expand it and see all the loaded fields:

Viewing all the fields in a document

Script to load the demo data

The complete demo data loading process can be repeated by running the demo/sql/bin/index_crunchbase_sqlite.sh script. The script performs the following actions:

  • Creates a copy of the database in the folder containing Logstash configurations

  • Creates the indices article, company, investor and investment

  • Sets the mappings for each index

  • Runs the Logstash configuration for each index

The Logstash configurations and Elasticsearch mappings are available in the demo/sql/crunchbase/conf/logstash_sqlite folder.

Acknowledgements

CentOS and Red Hat are trademarks of Red Hat Inc., registered in the U.S. and other countries. CrunchBase is a trademark of CrunchBase Inc., registered in the U.S. and other countries. Elasticsearch, Kibana, Logstash, and Packetbeat are trademarks of Elasticsearch BV, registered in the U.S. and other countries. Excel, SQL Server, and Windows are trademarks of Microsoft Corporation, registered in the U.S. and other countries. Java, Javascript, and Oracle are trademarks of Oracle Corporation, registered in the U.S. and other countries. macOS is a trademark of Apple Inc., registered in the U.S. and other countries. OpenSUSE is a trademark of SUSE LLC, registered in the U.S. and other countries. Search Guard is a trademark of floragunn GmbH, registered in the U.S. and in other countries.

All other trademarks are the property of their respective owners. All trademarks, registered trademarks and copyrighted terms in the Siren Investigate demo dataset are the property of their respective owners.

Siren Investigate plugins

Add-on functionality for Siren Investigate/Kibana is implemented with plugin modules. You can use the bin/investigate-plugin command to manage these modules. You can also install a plugin manually by moving the plugin file to the plugins folder and unpacking the plugin files into a new folder.

Generally Kibana plugins are compatible with Siren Investigate provided the plugin is compatible with the Kibana version mentioned in the Management section.

Plugin compatibility

The Kibana plugin interfaces are in a state of constant development. We cannot provide backwards compatibility for plugins due to the high rate of change. Kibana enforces that the installed plugins match the version of Kibana itself. Plugin developers will have to release a new version of their plugin for each new Kibana release as a result.

Installing plugins

Use the following command to install a plugin:

bin/investigate-plugin install <package name or URL>

When you specify a plugin name without a URL, the plugin tool attempts to download an official Elastic plugin, such as:

$ bin/investigate-plugin install x-pack

Installing plugins from an arbitrary URL

You can download official Elastic plugins simply by specifying their name. You can alternatively specify a URL to a specific plugin, as in the following example:

$ bin/investigate-plugin install https://artifacts.elastic.co/downloads/packs/x-pack/x-pack-10.0.0.zip

You can specify URLs that use the HTTP, HTTPS, or file protocols.

Installing plugins to an arbitrary folder

Use the -d or --plugin-dir option after the install command to specify a folder for plugins, as in the following example:

$ bin/investigate-plugin install file:///some/local/path/x-pack.zip -d path/to/folder

This command creates the specified folder if it does not already exist.

Installing plugins with Linux packages

The Siren Investigate server needs to be able to write to files in the optimize folder. If you are installing plugins using sudo or su, ensure that these commands are run as the user myuser. This user is already added for you as part of the package installation.

$ sudo -u myuser bin/investigate-plugin install x-pack

If plugins were installed as a different user and the server is not starting, then you must change the owner of these files:

$ chown -R myuser:myuser /path/to/siren-investigate/optimize

Updating and removing plugins

To update a plugin, remove the current version and reinstall the plugin.

To remove a plugin, use the remove command, as in the following example:

$ bin/investigate-plugin remove x-pack

You can also remove a plugin manually by deleting the plugin’s subfolder under the plugins/ folder.
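
For example, assuming the plugin was installed to plugins/x-pack (a hypothetical path):

rm -rf plugins/x-pack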

Removing a plugin will result in an "optimize" run which will delay the next start of Siren Investigate.

Disabling plugins

Use the following command to disable a plugin:

./bin/investigate --<plugin ID>.enabled=false

You can find a plugin’s plugin ID as the value of the name property in the plugin’s package.json file.

Disabling or enabling a plugin will result in an "optimize" run which will delay the start of Siren Investigate.
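
For example, for a hypothetical plugin whose package.json name is sample-plugin:

./bin/investigate --sample-plugin.enabled=false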

Configuring the plugin manager

By default, the plugin manager provides you with feedback on the status of the activity you have asked the plugin manager to perform. You can control the level of feedback with the --quiet and --silent options. Use the --quiet option to suppress all non-error output. Use the --silent option to suppress all output.

By default, plugin manager requests do not time out. Use the --timeout option, followed by a time, to change this behavior, as in the following examples:

Waits for 30 seconds before failing:

bin/investigate-plugin install username/sample-plugin --timeout 30s

Waits for 1 minute before failing:

bin/investigate-plugin install username/sample-plugin --timeout 1m

Plugins and custom Siren Investigate configurations

Use the -c or --config option to specify the path to the configuration file used to start Siren Investigate. By default, Siren Investigate uses the configuration file config/investigate.yml. When you change your installed plugins, the bin/investigate-plugin command restarts the Siren Investigate server. When you are using a customized configuration file, you must specify the path to that configuration file each time you use the bin/investigate-plugin command.

Plugin manager exit codes

0    Success
64   Unknown command or incorrect option parameter
74   I/O error
70   Other error
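
These exit codes can be checked in provisioning scripts; a minimal sketch in shell:

bin/investigate-plugin install x-pack
if [ $? -ne 0 ]; then
  echo "Plugin installation failed" >&2
  exit 1
fi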

Limitations

Siren Investigate currently has the following limitations.

Nested objects

Siren Investigate cannot perform aggregations across fields that contain nested objects. It also cannot search on nested objects when Lucene Query Syntax is used in the query bar.

Using include_in_parent or copy_to as a workaround is not supported and may stop functioning in future releases.

Release notes

This section summarizes the changes in each release.

Siren Investigate 10.0.0

Siren Investigate Changes

Added:

  • Added Elasticsearch 5.6.9 compatibility.

  • Added a JDBC datasource browser that allows the user to browse the datasource that is used when creating a virtual index and to select which table to import.

  • Now the system offers to automatically add a saved search when creating an index pattern.

  • After index creation, the user is now taken to the new index’s edit page for modification, if needed.

  • EID buttons now reflect changes to counts in the data, for example after applying a filter.

  • Added a user confirmation to the CLI upgrade procedure to check if the user has backed up their .siren index.

  • Investigate now handles empty index patterns more gracefully.

  • The relational graph in the Indexes and Relations section is moved to a tab.

  • Total Duration time of a request is now displayed on the Spy Panel.

  • Added config file migration for investigate.yml files to allow migration between post-10 versions.

  • Added migrations for custom configuration .yml files and for .yml files in custom folders.

Fixed:

  • Fixed bug in the Relational Navigator when creating a dashboard without a saved search.

  • Fixed bug where the Relational Navigator would show an EID button, even though there was no destination dashboard.

  • A number of fixes to the upgrade backup process:

    • The backup files are now stored in the /data folder

    • Allow the user to specify a custom backup folder

    • Changed backup folder names to use ISO datetimes for timestamp

    • If there is a problem, the index is removed and restored from scratch to prevent objects from the new index being left behind.

  • Fixed missing docs link in time filter creator.

  • Fixed visibility toggle on the Relational Navigator - now buttons are hidden when configured in the visualization.

  • Autoselect now does not discard multifields if their parent is unselectable, for example when it is not aggregatable.

  • Fixed Dashboard sidebar drag and drop UI to make it clearer the dashboard is being dragged when grabbed with the cursor.

  • Fixed explanation when a filter was negated - now says NOT ….

  • Fixed bug in the Relational Navigator where the buttons were not shown on an index pattern with no relations.

  • Fixed a bug with filters being merged with the state unnecessarily causing issues on dashboard reload.

  • Now deleting an index pattern in Indexes and Relations updates the list so the deleted index pattern is removed.

  • Fixed bug in rendering the TagCloud visualization that would cause a browser crash on tag cloud load.

  • Sorting is now possible again in the Enhanced Table visualization.

  • Fixed filter selection icons showing in each column of a row when hovering over a cell.

  • Fixed a bug where multiple filters from individual relational buttons could be added to the Elasticsearch request.

  • Now the names of the datasources can be edited after they have been saved.

  • Now returning more explanation if your query fails because of an Out Of Memory exception.

  • A wildcard query on a dashboard no longer shows a filter icon on the dashboard sidebar.

  • Fixed a bug in relational buttons that would remove parts of the state if a request from the button was null.

  • Completely refactored how automatically generated buttons are rendered to handle the number of requests sent on dashboard navigation.

  • Fixed 'Hide Borders' function. It now hides the borders.

  • Text filter to search relations and edit relational buttons now responds to text input.

  • Now the date is reset when the user cancels an edit in a saved dashboard.

  • Relations with no destination other than the EID are not listed in the automatic relational buttons.

  • Fixed a crash when filtering visualizations.

  • Added support for siren:timePrecision back in.

  • The URL shortener in the Dashboard Share panel now generates shortened URLs correctly.

  • Fixed intermittent error where dashboard ID was not passed correctly to relational buttons.

  • Allow creation of index pattern directly from create virtual index page without manually editing index pattern name.

  • Fixed bug in saving dashboard in Saved Objects after making no