Sample transform pipelines
During the import process, you can specify an additional transform pipeline.
A pipeline defines a series of processors that are executed in the order in which they are declared.
A pipeline consists of two main fields: a description and a list of processors. The pipeline is structured as follows:
{
  "description": "...",
  "processors": []
}
description: Contains a helpful description of what the pipeline does.
processors: Specifies a list of processors to be executed in order.
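As a rough illustration of this structure, the Python sketch below assembles a pipeline definition and wraps it, together with a sample document, in a request body for the Elasticsearch simulate pipeline API (`POST /_ingest/pipeline/_simulate`). The helper name `build_simulate_body` and the sample document are hypothetical, and the sketch assumes you want to try a pipeline against test data before importing; nothing is sent over the network.

```python
import json


def build_simulate_body(description, processors, sample_docs):
    """Wrap a pipeline definition and sample documents in a simulate request body.

    Hypothetical helper for illustration; it only builds a dictionary.
    """
    pipeline = {
        "description": description,
        # A JSON array keeps its order, so processors run as listed here.
        "processors": list(processors),
    }
    return {
        "pipeline": pipeline,
        # The simulate API expects each sample document under "_source".
        "docs": [{"_source": doc} for doc in sample_docs],
    }


body = build_simulate_body(
    "Split parents and convert each value to a long",
    [
        {"split": {"field": "parents", "separator": "\\|"}},
        {"convert": {"field": "parents", "type": "long"}},
    ],
    [{"parents": "1|2|3"}],  # made-up sample document
)
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted into any HTTP client that talks to your cluster. Because `processors` is a list, its order is preserved and the split step always runs before the convert step.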
The following section contains some sample transform pipelines that will help you to get started.
Split fields
The following pipeline splits a string delimited by | into a list of substrings. If no initial string exists, the on_failure handler fills the target field with an empty string.
{
  "description": "_description",
  "processors": [
    {
      "split": {
        "on_failure": [
          {
            "set": {
              "field": "parents",
              "value": ""
            }
          }
        ],
        "field": "parents",
        "separator": "\\|"
      }
    }
  ]
}
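To show the intended effect on a document, here is a minimal Python sketch that mimics the pipeline above on a plain dictionary. It is not Elasticsearch code: the function name `split_with_fallback` and the sample documents are made up for illustration, and edge cases such as trailing empty strings may behave differently in the real split processor.

```python
import re


def split_with_fallback(doc, field="parents", separator=r"\|", fallback=""):
    """Mimic the split processor and its on_failure handler on a plain dict."""
    value = doc.get(field)
    if not isinstance(value, str):
        # Missing or non-string field: the split step fails, so the
        # on_failure "set" handler writes the fallback value instead.
        doc[field] = fallback
    else:
        # The separator is a regular expression, hence the escaped pipe.
        doc[field] = re.split(separator, value)
    return doc


print(split_with_fallback({"parents": "a|b|c"}))
# {'parents': ['a', 'b', 'c']}
print(split_with_fallback({"title": "no parents"}))
# {'title': 'no parents', 'parents': ''}
```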
Split fields to a "long"
The following pipeline accomplishes a similar goal, but also converts each substring to a long. If no value exists in the initial field, the on_failure handler sets the target field to -1.
{
  "description": "_description",
  "processors": [
    {
      "split": {
        "on_failure": [
          {
            "set": {
              "field": "parents",
              "value": -1
            }
          }
        ],
        "field": "parents",
        "separator": "\\|"
      }
    },
    {
      "convert": {
        "field": "parents",
        "type": "long"
      }
    }
  ]
}
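The intended result can be sketched in Python as follows. This only imitates the split and convert steps on a plain dictionary; `split_to_longs` and the sample values are hypothetical, and Python's `int` stands in for the Elasticsearch long type.

```python
import re


def split_to_longs(doc, field="parents", separator=r"\|", fallback=-1):
    """Mimic split (with on_failure fallback) followed by convert to long."""
    value = doc.get(field)
    if not isinstance(value, str):
        # Split fails on a missing field, so on_failure sets the fallback.
        doc[field] = fallback
    else:
        doc[field] = re.split(separator, value)

    # Convert step: a list is converted element by element,
    # a single value is converted directly.
    current = doc[field]
    if isinstance(current, list):
        doc[field] = [int(item) for item in current]
    else:
        doc[field] = int(current)
    return doc


print(split_to_longs({"parents": "10|20|30"}))
# {'parents': [10, 20, 30]}
print(split_to_longs({}))
# {'parents': -1}
```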
Extract text and create a new field (using regex)
Extract the text between the first set of parentheses in the Title field and create a new field for it called Patent_ID.
Note: You must first enable regex in the elasticsearch.yml file.
{
  "description": "extract the text between the first set of parentheses",
  "processors": [
    {
      "script": {
        "source": "def f = ctx['Title']; if (f != null) { def m = /\\((.*?)\\)/.matcher(f); if (m.find()) { ctx.Patent_ID = m.group(1); } }"
      }
    }
  ]
}
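To check the regular expression outside of Elasticsearch, the following Python sketch applies the same pattern to a sample title. The function name `extract_patent_id` and the title text are invented for illustration; the script processor itself runs Painless, not Python.

```python
import re

# Non-greedy match of the text inside the first pair of parentheses.
FIRST_PARENS = re.compile(r"\((.*?)\)")


def extract_patent_id(doc):
    """Copy the text in the first parentheses of Title into Patent_ID."""
    title = doc.get("Title")
    if title is not None:
        match = FIRST_PARENS.search(title)
        # Only set the new field when the title actually contains parentheses.
        if match:
            doc["Patent_ID"] = match.group(1)
    return doc


print(extract_patent_id({"Title": "Widget assembly (US1234567) (granted)"}))
# {'Title': 'Widget assembly (US1234567) (granted)', 'Patent_ID': 'US1234567'}
```

The non-greedy `.*?` is what limits the match to the first set of parentheses; a greedy `.*` would run through to the last closing parenthesis in the title.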
Merge two fields to create a geo_point
Merge two fields that contain latitude and longitude values into a single Elasticsearch geo_point field. Documents that are missing either value are dropped:
{
  "description": "Create geo point field",
  "processors": [
    {
      "drop": {
        "if": "ctx.latitude_field == null || ctx.longitude_field == null"
      }
    },
    {
      "set": {
        "field": "geo_location",
        "value": {
          "lat": "{{latitude_field}}",
          "lon": "{{longitude_field}}"
        }
      }
    }
  ]
}
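The logic of this pipeline can be approximated in Python as shown below. This is a sketch only: `to_geo_point` is a hypothetical name, returning `None` stands in for dropping the document, and rendering the values with `str` is an assumption about how the `{{...}}` templates turn field values into text.

```python
def to_geo_point(doc, lat_field="latitude_field", lon_field="longitude_field"):
    """Mimic the drop and set processors on a plain dict."""
    # Drop step: documents missing either coordinate are discarded.
    if doc.get(lat_field) is None or doc.get(lon_field) is None:
        return None

    # Set step: the templates render each field value as a string.
    doc["geo_location"] = {
        "lat": str(doc[lat_field]),
        "lon": str(doc[lon_field]),
    }
    return doc


print(to_geo_point({"latitude_field": 40.5, "longitude_field": -74.25}))
print(to_geo_point({"latitude_field": 40.5}))  # None: the document is dropped
```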