Siren Platform User Guide

Distributed joins between indices

Siren Federate extends the Elasticsearch DSL with a join query clause that lets a user execute a join between indices, whether virtual or not. The join capabilities are implemented on top of an in-memory distributed computing layer that scales with the number of nodes available in the cluster.

The join capability is currently limited to a (left) semi-join between two sets of documents based on a common attribute, where the result contains only the attributes of one of the joined sets. This join is used to filter one set of documents by a second set. It is equivalent to the EXISTS() operator in SQL. Joins on both numerical and textual fields are supported, but the joined attributes must be of the same type. You can also freely combine and nest multiple joins using Boolean operators (conjunction, disjunction, negation) to create complex query plans. The join clause is fully integrated with the Elasticsearch API and is compatible with distributed environments.
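
To make the semi-join semantics concrete, the following is a minimal in-memory sketch in plain Python. It is not Siren Federate code and does not use the join query clause; the function name semi_join, the sample documents, and the field names (mentions, name, sector) are all hypothetical and chosen only for illustration. It shows that the result keeps the attributes of the left set only, and that a negated join can be expressed by inverting the match.

```python
def semi_join(left, right, left_key, right_key, negate=False):
    """Return documents from `left` that have (or, if negate=True, lack)
    at least one document in `right` with an equal join value.

    Only attributes of the `left` documents appear in the result,
    which is what distinguishes a semi-join from a full join.
    """
    # Collect the join values present on the right side once.
    right_values = {doc[right_key] for doc in right if right_key in doc}
    result = []
    for doc in left:
        matched = left_key in doc and doc[left_key] in right_values
        if matched != negate:
            result.append(doc)
    return result


# Hypothetical sample data: two sets of documents sharing a textual attribute.
articles = [
    {"id": 1, "mentions": "acme"},
    {"id": 2, "mentions": "globex"},
    {"id": 3, "mentions": "initech"},
]
companies = [
    {"name": "acme", "sector": "tools"},
    {"name": "initech", "sector": "software"},
]

# Articles that mention a known company (comparable to SQL EXISTS).
matched = semi_join(articles, companies, "mentions", "name")

# Negation: articles that mention no known company (comparable to NOT EXISTS).
unmatched = semi_join(articles, companies, "mentions", "name", negate=True)

print([doc["id"] for doc in matched])
print([doc["id"] for doc in unmatched])
```

In this sketch the joined values on both sides are strings, mirroring the requirement that joined attributes be of the same type. Conjunction and disjunction of several joins could be modelled in the same way by intersecting or uniting the sets of matching document ids.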