Siren Platform User Guide

Integrating Neo4j data

Neo4j is a graph database management system that uses graph structures with nodes, relations, and properties to represent and store data. Siren can now ingest and reflect (periodically update) Neo4j data. The Neo4j Import Wizard (beta) makes this a straightforward process.

  • Nodes represent entities that are to be tracked, and correspond to a record in a relational database.

  • Relations are lines that connect nodes to other nodes, representing the relationship between them. Relations are the key concept in graph databases because they represent an abstraction that is not directly implemented in a conventional relational database.

  • Properties describe information relevant to nodes.
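
To make the three concepts concrete, here is a minimal Python sketch of a property graph. It is illustrative only: the class names, sample labels, and the relations_of helper are hypothetical and are not part of Siren or Neo4j.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Node:
    """An entity in the graph, comparable to a record in a relational table."""
    node_id: str
    labels: List[str]
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Relation:
    """A typed connection from a start node to an end node."""
    relation_type: str
    start_node_id: str
    end_node_id: str
    properties: Dict[str, Any] = field(default_factory=dict)


def relations_of(node: Node, relations: List[Relation]) -> List[Relation]:
    """Return every relation that starts or ends at the given node."""
    return [
        r for r in relations
        if node.node_id in (r.start_node_id, r.end_node_id)
    ]


# Hypothetical sample data: two nodes and two relations.
alice = Node("p1", ["Person"], {"name": "Alice"})
acme = Node("c1", ["Company"], {"name": "Acme"})

relations = [
    Relation("WORKS_AT", alice.node_id, acme.node_id, {"since": 2019}),
    # A node can also have a relation with itself.
    Relation("KNOWS", alice.node_id, alice.node_id),
]

for rel in relations_of(alice, relations):
    print(rel.relation_type, rel.start_node_id, "->", rel.end_node_id)
```

In a relational database, the WORKS_AT connection would typically need a join table; in the graph model it is a first-class object that can carry its own data.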

The following diagram shows two nodes with five relations between them. One of the nodes also has a relation with itself (node properties are not shown here).

neo4j_model.png
Using the Neo4j Import Wizard (beta)

The Neo4j Import Wizard (beta) makes it easy to import Neo4j data from a datasource and then configure and view the data model.

  1. Go to Management > Datasources.

  2. In the New Datasource Type dropdown, click JDBC.

  3. Specify the details of the Neo4j datasource, then click Save.

  4. Go to Data Reflections and click Reflection Jobs; add a new job.

  5. Select the Neo4j datasource; an option to use the Neo4j Importer is then displayed.

    neo4j_importer.png
  6. Click Use Neo4j Importer. The Neo4j Import Wizard screen opens.

  7. Select the required nodes from the Select Nodes dropdown, and click Next.

    neo4j_step1.png

    All data reflection jobs for nodes and relations are displayed. Note that relations are also listed under Node Name; this is because Neo4j relations contain data, and Siren Investigate runs a reflection job for each relation, just as it does for each node.

    neo4j_step2a.png

    Note

    Ensure that an appropriate primary key is selected. The use of stable unique identifiers as primary keys in your Neo4j data model is recommended.

  8. Click Confirm.

    Wait for the data to start reflecting, that is, until every job has indexed at least one document, which indicates that the fields are mapping successfully. The Continue button is enabled at this point.

    neo4j_reflect.png

    While a job is indexing, its status is shown as running; when a job completes, the status changes to successful.

    Tip

    You can see the document count and other information by hovering over the status.

    neo4j_tip.png

    You can leave the wizard at this point by clicking ../ at the top left of the wizard, or by navigating anywhere else in the application. This may be necessary if reflection jobs are failing and you have to go back to fix them.

  9. To continue where you left off, follow the notification on the Datasource Reflection Jobs page and click Pending Jobs.

    neo4j_pending_jobs.png
  10. Click Continue Neo4j Job.

    neo4j_continue_job.png

    Note

    Pending jobs are stored in the server cache, which is cleared when the Investigate server is restarted or shut down. The behavior of pending jobs is unpredictable when multiple Investigate instances are running behind a load balancer.

  11. On the Configure Saved Searches screen, you can modify the search name, and specify a color for the nodes and relations:

    neo4j_copy.png

    Click Next.

  12. On the Configure Relations screen, you can modify the Relation Label and the Inverse Relation Label for each relation:

    neo4j_relations.png
  13. Click Create Ontology.

  14. You can now go to the Data Model page to view the Neo4j data model:

    neo4j_end.png
Node behavior

You should note the following aspects of node behavior:

  • If a node has labels Location and Residence, then it will be reflected onto both indices.

  • A field node_labels is added to every document to denote all its labels.

  • Extra fields are added to all nodes, including node_id (containing the value of the primary key). Relation nodes have a few additional fields: start_node_id, end_node_id, and relation_type.

  • In the case of relation-based reflection jobs, relation documents are only included for added nodes.
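
The behavior described above can be sketched in Python as follows. This is a rough illustration under stated assumptions, not Siren's implementation: the function names are hypothetical, the index name follows the ${datasourceId}-${nodeType} schema mentioned in this guide, and any normalization that Siren may apply to index names (for example, casing) is not modeled.

```python
from typing import Any, Dict, List, Tuple


def index_name(datasource_id: str, node_type: str) -> str:
    """Build a target index name following ${datasourceId}-${nodeType}."""
    return f"{datasource_id}-{node_type}"


def reflect_node(
    datasource_id: str,
    node_id: str,
    labels: List[str],
    properties: Dict[str, Any],
) -> List[Tuple[str, Dict[str, Any]]]:
    """Return one (index, document) pair per label of the node."""
    doc = dict(properties)
    doc["node_id"] = node_id           # value of the primary key
    doc["node_labels"] = list(labels)  # all labels of the node
    return [(index_name(datasource_id, label), dict(doc)) for label in labels]


def reflect_relation(
    relation_id: str,
    relation_type: str,
    start_node_id: str,
    end_node_id: str,
    properties: Dict[str, Any],
) -> Dict[str, Any]:
    """Return a relation document with the additional relation fields."""
    doc = dict(properties)
    doc["node_id"] = relation_id
    doc["start_node_id"] = start_node_id
    doc["end_node_id"] = end_node_id
    doc["relation_type"] = relation_type
    return doc


# Hypothetical node carrying the labels Location and Residence.
targets = reflect_node(
    "neo4j", "42", ["Location", "Residence"], {"city": "Galway"}
)
for index, document in targets:
    print(index, document)
```

In this sketch, the same document is written once per label, which is why a node labeled both Location and Residence appears in both indices.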

Other considerations
  • Mappings for the same field should not conflict within a single node.

  • All id() values are unique. However, the use of stable unique identifiers from your Neo4j data model as primary keys is recommended.

    Neo4j does provide a unique id for each node and relationship, but these ids are not persistent. An id can be accessed by returning id(node) or id(relationship). While this id is unique, it can change if the database store is compacted.

Adding a Shortest Path script for Neo4j

The following steps add a 'Shortest Path' query script to the Graph Browser to find the shortest path between two or more (Neo4j reflected) data nodes.

  1. Use the Neo4j Import Wizard (beta) to import your Neo4j data.

  2. Locate the Shortest Path script in the demo data supplied with Siren 10.3.1; alternatively, you can find the script at this location.

  3. Navigate to Management > Scripts ① and create a new script.

    1. Enter a title and description for the script ②.

    2. Select contextual from the Type drop-down list ③.

    3. Paste the script into the Source section ④.

    4. Click Save ⑤.

    SPScript1numbered.png
  4. Go to the Graph Browser and click the Edit button on the top panel.

    1. On the Options tab on the left, click the Add Contextual Script button ① under Contextual Function.

    2. Select the newly created script from the drop-down list ② to add it to the Graph Browser.

    3. Click Save ③.

    SPScript2numbered.png
Running the Shortest Path script

You can now use this script to compute shortest paths between your selected (Neo4j reflected) data nodes.

  1. Add the Neo4j reflected nodes to the Graph Browser.

    1. Select the required nodes ①.

    2. Right-click and select Neo4j Shortest Path (the name of the script) ②.

    SPScript3numbered.png
  2. In the dialog that opens, enter the maximum path length and click OK (both fields are already populated).

    SPScript4.png
  3. The shortest path between the nodes is displayed.

    SPScript5.png
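
For readers unfamiliar with the idea, the following Python sketch shows what a shortest path with a maximum path length means on a small in-memory graph. It is not the Siren script and does not query Neo4j: it uses a plain breadth-first search, the function and variable names are hypothetical, and path length is assumed to be counted in relations (hops).

```python
from collections import deque
from typing import Dict, List, Optional


def shortest_path(
    adjacency: Dict[str, List[str]],
    start: str,
    goal: str,
    max_length: int,
) -> Optional[List[str]]:
    """Breadth-first search for a shortest path of at most max_length hops.

    Returns the list of node ids from start to goal, or None if no path
    within the limit exists.
    """
    if start == goal:
        return [start]
    parents: Dict[str, Optional[str]] = {start: None}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_length:
            continue  # expanding further would exceed the limit
        for neighbor in adjacency.get(node, ()):
            if neighbor in parents:
                continue  # already reached by a path that is no longer
            parents[neighbor] = node
            if neighbor == goal:
                # Walk back through the parents to rebuild the path.
                path = [goal]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append((neighbor, depth + 1))
    return None


# Hypothetical undirected graph, stored as adjacency lists.
graph = {
    "a": ["b", "e"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c", "e"],
    "e": ["a", "d"],
}

print(shortest_path(graph, "a", "d", max_length=3))
print(shortest_path(graph, "a", "d", max_length=1))
```

A smaller maximum path length keeps the search cheap but can cause it to report that no path exists even when a longer one does.
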
Neo4j Shortest Path limitations

The following limitations apply to running a Neo4j Shortest Path script:

  • The Neo4j Import Wizard must have been used for importing the data.

  • All selected nodes should belong to the same datasource (Neo4j cluster).

  • Neo4j field names must not have been changed.

  • The Neo4j reflection target index schema ${datasourceId}-${nodeType} must not have been changed.

  • All documents for a given node type or relation type must be indexed into a single index.