Importing data by using Logstash
This process is for advanced users. For a simpler approach, try importing data from a spreadsheet.
If you want to stream live data, such as logs, into the Elasticsearch cluster, then you can import data by using Logstash. You can also import CSV or JSON files in this way.
The following section contains an example of how to configure Logstash for importing data. You can adapt this example for use with your own data set.
The data sets used in the example contain millions of records. If you use these data sets, loading takes some time to complete.
Before you begin
- (Optional) To walk through this example before using your own data, download the following publicly available files:
  - Download company data as one CSV file from this Web page.
  - Download 'person of significant control' data as one JSON file from this Web page.
- Extract the .csv and .txt files.
- Open the example scripts and edit them to match the path and file names.
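Because the extracted files are large, it can help to look at only their first few lines before you edit the configuration, for example to confirm that the .csv file starts with a header row and that the .txt file holds one JSON object per line. The following is a minimal Python sketch, not part of the product; the function name `peek` and the path in the usage comment are hypothetical.

```python
from itertools import islice


def peek(path, n=3, encoding="utf-8"):
    """Return the first n lines of a text file without reading the rest."""
    with open(path, "r", encoding=encoding, errors="replace") as handle:
        # islice stops after n lines, so a multi-gigabyte file is not loaded.
        return [line.rstrip("\r\n") for line in islice(handle, n)]


# Usage (hypothetical path; replace it with your extracted file):
#     for line in peek(r"C:\data\BasicCompanyDataAsOneFile-date.csv"):
#         print(line)
```

The encoding defaults to UTF-8 as an assumption; undecodable bytes are replaced rather than raising an error, which is sufficient for a quick visual check.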
Creating configuration files
- Create a plain text file and enter the following content:

    input {
      file {
        path => "<location of BasicCompanyDataAsOneFile-date.csv>"
        start_position => "beginning"
      }
    }
    filter {
      csv {
        separator => ","
        # Take field names from the first row. The csv filter documentation
        # notes that this option needs a single pipeline worker.
        autodetect_column_names => true
        autogenerate_column_names => true
      }
    }
    output {
      elasticsearch {
        hosts => ["127.0.0.1:9220"]
        index => "company"
      }
    }

- Edit the path to match the location of the .csv file and save it as logstash_csv.conf in the same path as the data set.
- Create another plain text file and enter the following content:

    input {
      file {
        type => "json"
        path => "<location of persons-with-significant-control-snapshot-date.txt>"
        start_position => "beginning"
      }
    }
    filter {
      # Parse each line of the file as one JSON document.
      json {
        source => "message"
      }
      # Convert the nested name field under data to uppercase.
      mutate {
        uppercase => [ "[data][name]" ]
      }
    }
    output {
      elasticsearch {
        hosts => ["127.0.0.1:9220"]
        index => "persons-control"
      }
    }

- Edit the path to match the location of the .txt file and save it as logstash_json.conf in the same path as the data set.
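To make the effect of the two filter sections easier to picture, the following Python sketch imitates them on a tiny sample. It is an illustration only, not how Logstash is implemented: the sample rows and field names are made up, the function names are hypothetical, and the metadata fields that Logstash adds to each event are omitted.

```python
import csv
import io
import json

# Made-up sample data; the real files have different and many more fields.
SAMPLE_CSV = "CompanyName,CompanyNumber\nExample Ltd,01234567\n"
SAMPLE_JSON_LINE = '{"company_number": "01234567", "data": {"name": "Ms Jane Example"}}'


def csv_events(text):
    """Imitate the csv filter with autodetect_column_names: the first row
    supplies the field names, and each later row becomes one event."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def json_event(line):
    """Imitate the json filter followed by the mutate filter, which
    uppercases the name field nested under data."""
    event = json.loads(line)
    event["data"]["name"] = event["data"]["name"].upper()
    return event


print(csv_events(SAMPLE_CSV))
print(json_event(SAMPLE_JSON_LINE))
```

Each dictionary corresponds roughly to one document that the output section sends to the company or persons-control index.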
Loading the data
From a command prompt, navigate to the logstash/bin folder and run Logstash with the configuration files that you created.
For example, run the following commands:
logstash -f C:\data\logstash_csv.conf
logstash -f C:\data\logstash_json.conf
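With a file input, Logstash typically keeps watching the file instead of exiting when it reaches the end, so one way to follow progress is to ask Elasticsearch how many documents each index holds. The sketch below uses the Elasticsearch `_count` endpoint. It assumes the host and index names from the example configuration and a cluster reachable over plain HTTP without authentication; the function names are hypothetical.

```python
import json
import urllib.request


def count_url(host, index):
    """Build the URL of the Elasticsearch _count endpoint for one index."""
    return f"http://{host}/{index}/_count"


def parse_count(body):
    """Extract the document count from a _count response body."""
    return json.loads(body)["count"]


def document_count(host, index, timeout=10):
    """Ask Elasticsearch how many documents the index currently holds."""
    with urllib.request.urlopen(count_url(host, index), timeout=timeout) as resp:
        return parse_count(resp.read().decode("utf-8"))


# Usage, assuming the example configuration:
#     print(document_count("127.0.0.1:9220", "company"))
#     print(document_count("127.0.0.1:9220", "persons-control"))
```

When the count stops increasing and is close to the number of records in the source file, the import has most likely finished.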
To speed up the import process, you can install a second instance of Logstash and run the imports concurrently.
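As a rough sketch of running both imports at the same time, the following Python code starts one process per Logstash installation and waits for them. The installation paths in the usage comment are hypothetical, and the helper names are not part of Logstash.

```python
import subprocess


def logstash_command(logstash_bin, config_path):
    """Build the command line that runs one Logstash pipeline."""
    return [logstash_bin, "-f", config_path]


def run_concurrently(commands):
    """Start every command at once, then wait for all of them to exit."""
    processes = [subprocess.Popen(cmd) for cmd in commands]
    return [proc.wait() for proc in processes]


# Usage (hypothetical installation folders, one per Logstash instance):
#     run_concurrently([
#         logstash_command(r"C:\logstash-1\bin\logstash.bat", r"C:\data\logstash_csv.conf"),
#         logstash_command(r"C:\logstash-2\bin\logstash.bat", r"C:\data\logstash_json.conf"),
#     ])
```

Because a file input usually keeps running after the file has been read, expect both processes to stay active until you stop them.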
Next steps
Add relations between the entities in your data.