Siren Platform User Guide

Query based aggregations

It is possible to get additional information about companies by using the results of queries on SQL databases (or any of the datasources supported by Siren Investigate) as aggregations on Elasticsearch documents.

For example, in the Query on Companies visualization you can see that 41 of the 96 companies have competitors, and eight of them are in the top 500 companies by number of employees:

SQL based aggregations

Companies "With competitors" and Top 500 companies (HR count) are queries on the SQLite database. The records returned by the queries are used to filter Elasticsearch documents, which can be then aggregated in a metric.

To better understand this feature, let’s examine the Top 500 companies (HR count) query. To see the query, click Saved Objects in the Management menu.

view_saved_object.png

The query returns the idlabel and number_of_employees columns from the company table for the top 500 companies by number of employees:

select id, label, number_of_employees
from company
where number_of_employees>0
order by number_of_employees desc
limit 500

Click Dashboard, then Edit, and then Edit (fa-pencil.png) in the heading of the Query on Companies visualization to customize its configuration. The metrics section defines the aggregations on Elasticsearch documents, displayed as columns in the table. The buckets section defines the groups of Elasticsearch documents aggregated by metrics, displayed as row headers in the table.

dashboard_edit_query_vis_agg.png

By expanding the Split Rows section inside buckets you can see how the queries are used to define groups of Elasticsearch documents. Scroll down to see the configuration of the fourth filter:

query_vis_filter_agg.png

The filter is configured to execute the query Top 500 companies (HR count) on the SQLite database and return the group of Elasticsearch documents from the current search whose ID is equal to one of the IDs in the query results. The documents are then processed by the Count metric.

Let’s add a new aggregation to show the average number of employees. Click Add metrics inside the metrics section, then select Metric as the metric type; select Average as the aggregation and number_of_employees as the field, then click Apply Changes (fa-play.png).

Save the visualization by clicking Save, then click the Dashboard tab to see the updated visualization in the Companies dashboard:

add_metric.png

Click Add sub-buckets at the bottom, then select Split Rows as the bucket type. Choose the Terms aggregation and the countrycode field from the boxes. Click Apply Changes (fa-play.png) to add an external ring with the new results.

bucket_aggregation.png

For an in-depth explanation of aggregations, see Creating a visualization.

In addition to defining groups to aggregate, queries can be used as filters. To see this click Dashboard, then in the Query on Companies dashboard, move the mouse pointer over the row for Top-500-companies-(HR-count) and click the plus (+) icon that appears.

selecting_filter_queryoncompanies.png

Then you will see only the companies mentioned in the articles which are also in the top 500 by number of employees:

query_vis_filterbar.png