GQL Overview and Examples
The Graph Query Language (GQL) is an ISO standard that formalizes interactions with a graph database. A GQL query matches patterns in a graph, and those patterns are written in an ASCII-art-like syntax. The Siren implementation of GQL is based on the data model definition of a graph, which decouples the query logic from the physical modeling of the data. This section presents the graph-querying use cases supported in Siren.
The examples are based on the LDBC Social Network Benchmark (SNB). This dataset models the operation of a social network, a scenario characterized by graph-shaped data. The data model is depicted with a UML diagram, sourced from the LDBC GitHub repository. We assume that class names and relationships from the UML diagram are translated into the data model previously presented. For clarity, the examples only present the GQL queries.
Selecting a pattern made of a single entity
In order to get entities labeled Person, use a node pattern:
SELECT x
FROM "snb"
MATCH (x:Person)
In order to get some properties of a Person entity, such as the lastName and the email:
SELECT x.lastName, x.email
FROM "snb"
MATCH (x:Person)
We can also filter the set of Person entities returned as results, e.g., to keep only people who speak German:
SELECT x
FROM "snb"
MATCH (x:Person WHERE "speaks: German")
Selecting a pattern made of a single relation
In order to get the entities labeled Person and City connected with the relation labeled isLocatedIn, use an edge pattern:
SELECT x, y
FROM "snb"
MATCH (x:Person) -[:isLocatedIn]-> (y:City)
In order to get the lastName of each Person and the name of each City connected with the isLocatedIn relation:
SELECT x.lastName, y.name
FROM "snb"
MATCH (x:Person) -[:isLocatedIn]-> (y:City)
Selecting a pattern made of several relations
It is also possible to retrieve data from a path pattern, i.e., a pattern composed of several edges. Below we retrieve the firstName of each Person who is a member of a Forum that has the tag comics:
SELECT x.firstName
FROM "snb"
MATCH (:Tag WHERE "name: comics") <-[:hasTag]- (:Forum) -[:hasMember]-> (x:Person)
Selecting data when some labels are unknown
The labels on a node or edge of a pattern may be omitted. The data model is then used to find the possible labels that fit the pattern. Below we get the id of entities that connect to a Tag named comics, without specifying the labels of those entities or of the relations:
SELECT x.id
FROM "snb"
MATCH (x) -> (:Tag WHERE "name: comics")
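As an illustration of how a data model can constrain an unlabeled pattern, here is a minimal Python sketch. It is not Siren's implementation: the DATA_MODEL list is a hypothetical, partial fragment built from relation names that appear in this section, and candidate_labels is a name chosen for this example only.

```python
# Hypothetical, partial data model: (source label, relation label, target label).
DATA_MODEL = [
    ("Person", "hasInterest", "Tag"),
    ("Forum", "hasTag", "Tag"),
    ("Forum", "hasMember", "Person"),
    ("Person", "isLocatedIn", "City"),
]


def candidate_labels(model, target_label):
    """Return the (source label, relation label) pairs that can point at target_label."""
    return sorted((src, rel) for src, rel, dst in model if dst == target_label)


# Labels that could fill in the pattern (x) -> (:Tag ...)
print(candidate_labels(DATA_MODEL, "Tag"))
```

With this fragment, the unlabeled node x could be a Forum (via hasTag) or a Person (via hasInterest). A fuller data model would typically yield more candidates.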
Selecting data from a pattern of varying length
The length of a path can vary between user-defined bounds. Below we retrieve the lastName of persons connected to Alice through a chain of one to five knows relations, i.e., directly or indirectly via common acquaintances:
SELECT y.lastName
FROM "snb"
MATCH (:Person WHERE "firstName: Alice") -[:knows]->{1,5} (y:Person)
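The effect of the {1,5} quantifier can be sketched in Python as a traversal bounded by a minimum and maximum number of hops. This is only an illustration on a hypothetical toy graph (KNOWS), assuming walk semantics where a node may be visited more than once. The way Siren evaluates such patterns may differ.

```python
# Hypothetical toy graph: directed "knows" edges between first names.
KNOWS = {
    "Alice": ["Bob"],
    "Bob": ["Carol"],
    "Carol": ["Dave"],
    "Dave": [],
}


def reachable_within(graph, start, min_hops, max_hops):
    """Return the nodes reachable from start by following min_hops..max_hops edges."""
    found = set()
    frontier = {start}
    for hops in range(1, max_hops + 1):
        # Advance every node of the current frontier by exactly one edge.
        frontier = {nxt for node in frontier for nxt in graph.get(node, [])}
        if not frontier:
            break  # No path of this length exists, so longer ones cannot either.
        if hops >= min_hops:
            found |= frontier
    return found


print(reachable_within(KNOWS, "Alice", 1, 5))
```

Lowering the upper bound narrows the result: with bounds 1 and 2, only Bob and Carol are returned.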
Matching a conjunction of path patterns
A GQL query can define several path patterns that connect to one another. This makes it possible to express graph patterns more complex than a single path, for example a pattern with branches.
In the query below, we match the path from a Forum to a Tag via a Person, and the Person must also work at a Company. The Person entity requires a branch in the pattern, which can only be expressed using a conjunction of path patterns. The query has the following additional requirements:
- The title of the Forum must contain the term Tolkien.
- The name of the Tag must be sci-fi.
- The Company name must be Siren.
SELECT x.lastName
FROM "snb"
MATCH (:Forum WHERE "title: Tolkien") -[:hasModerator]-> (x:Person)
-[:hasInterest]-> (:Tag WHERE "name: 'sci-fi'"), (1)
(x) -[:workAt]-> (:Company WHERE "name: Siren") (2)
| 1 | The first path pattern matching a Forum to a Tag via a Person. |
| 2 | The second path pattern matching a Person to a Company. The Person from the previous pattern is referenced through the variable x. |
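Conceptually, a conjunction of path patterns behaves like a join on the shared variable: only bindings of x that satisfy every pattern are kept. The Python sketch below illustrates this on hypothetical toy data. The EDGES list and the helpers targets and match_conjunction are invented for this example and do not reflect how Siren executes the query.

```python
# Hypothetical toy edges, as (source, relation, target) triples.
EDGES = [
    ("forum1", "hasModerator", "alice"),
    ("forum2", "hasModerator", "bob"),
    ("alice", "hasInterest", "sci-fi"),
    ("bob", "hasInterest", "sci-fi"),
    ("alice", "workAt", "Siren"),
    ("bob", "workAt", "Acme"),
]


def targets(edges, source, relation):
    """Return the nodes reached from source through the given relation."""
    return {dst for src, rel, dst in edges if src == source and rel == relation}


def match_conjunction(edges, forums, tag, company):
    """Return the bindings of x that satisfy both path patterns."""
    # Path pattern 1: Forum -hasModerator-> x -hasInterest-> Tag
    first = {
        x
        for forum in forums
        for x in targets(edges, forum, "hasModerator")
        if tag in targets(edges, x, "hasInterest")
    }
    # Path pattern 2: x -workAt-> Company
    second = {src for src, rel, dst in edges if rel == "workAt" and dst == company}
    # The shared variable x must satisfy both patterns at once.
    return first & second


print(match_conjunction(EDGES, ["forum1", "forum2"], "sci-fi", "Siren"))
```

In this toy data both alice and bob satisfy the first pattern, but only alice also works at Siren, so she is the only binding that survives the conjunction.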