Elasticsearch Recon Ingestion Scripts (ERIS) 🔎

A utility for ingesting various large-scale reconnaissance data logs into Elasticsearch.

Prerequisites

Usage

python ingest_XXXX.py [options] <input>

Note: The <input> can be a file or a directory of files, depending on the ingestion script.

Argument	Description
`input_path`	Path to the input file or directory
`--dry-run`	Dry run (do not index records to Elasticsearch)
`--watch`	Watch the input file for new lines and index them in real time
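
As a rough illustration of how the general arguments above might be declared, the sketch below uses Python's standard `argparse` module. The `build_parser` function name is hypothetical and not taken from the ERIS source; only the argument names and descriptions come from the table.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser covering only the general arguments (a sketch)."""
    parser = argparse.ArgumentParser(description='ERIS ingestion script (sketch)')
    # Positional argument: a file or a directory, depending on the script.
    parser.add_argument('input_path', help='Path to the input file or directory')
    # Boolean switches default to False and become True when passed.
    parser.add_argument('--dry-run', action='store_true',
                        help='Dry run (do not index records to Elasticsearch)')
    parser.add_argument('--watch', action='store_true',
                        help='Watch the input file for new lines and index them in real time')
    return parser
```

Calling `build_parser().parse_args(['--dry-run', 'masscan.json'])` would yield a namespace with `dry_run` set to `True` and `watch` left at `False`.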

Argument	Description	Default
`--host`	Elasticsearch host (will sniff for other nodes in the cluster)	`localhost`
`--port`	Elasticsearch port	`9200`
`--user`	Elasticsearch username	`elastic`
`--password`	Elasticsearch password (if not provided, the `ES_PASSWORD` environment variable is checked)
`--api-key`	Elasticsearch API Key for authentication
`--self-signed`	Elasticsearch instance is using a self-signed certificate	`true`
`--index`	Elasticsearch index name	`masscan-logs`
`--shards`	Number of shards for the index	`1`
`--replicas`	Number of replicas for the index	`1`
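
The password fallback described for `--password` could look roughly like the sketch below. The `resolve_auth` helper is hypothetical, and giving the API key precedence over the username and password is an assumption, not something the table states. The returned keys mirror the `api_key` and `basic_auth` keyword arguments accepted by the 8.x `elasticsearch` Python client.

```python
import os
from typing import Optional


def resolve_auth(user: str = 'elastic',
                 password: Optional[str] = None,
                 api_key: Optional[str] = None) -> dict:
    """Return client keyword arguments for authentication (hypothetical helper)."""
    if api_key:
        # Assumption: an API key wins over user/password when both are given.
        return {'api_key': api_key}
    # Fall back to the ES_PASSWORD environment variable when no password is passed.
    password = password or os.environ.get('ES_PASSWORD')
    if not password:
        raise ValueError('No password given and ES_PASSWORD is not set')
    return {'basic_auth': (user, password)}
```

The resulting dictionary could then be unpacked into the client constructor alongside the host and port settings.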

Argument	Description	Default
`--batch-max`	Maximum size in MB of a batch	`10`
`--batch-size`	Number of records to index in a batch	`5000`
`--batch-threads`	Number of threads to use when indexing in batches	`2`
`--retries`	Number of times to retry indexing a batch before failing	`10`
`--timeout`	Number of seconds to wait before retrying a batch	`30`

NOTE: With `--batch-threads 4` and `--batch-size 10000` on a 3-node cluster, 120,000 records (4 threads × 10,000 records × 3 nodes) are processed before indexing, which works out to 40,000 per node.
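
To make the interplay between `--batch-size` and `--batch-max` concrete, here is a minimal batching sketch. It assumes a batch is closed by whichever limit is reached first and that a record's size can be approximated by its JSON encoding; the README does not say how ERIS actually combines the two, and the `batch_records` name is hypothetical. The `helpers.parallel_bulk` function in the `elasticsearch` Python library exposes comparable `chunk_size`, `max_chunk_bytes`, and `thread_count` parameters, though whether ERIS maps its options onto them is also an assumption.

```python
import json
from typing import Iterable, Iterator, List


def batch_records(records: Iterable[dict],
                  batch_size: int = 5000,
                  batch_max_mb: float = 10) -> Iterator[List[dict]]:
    """Yield lists of records, closing a batch when either limit is reached."""
    max_bytes = int(batch_max_mb * 1024 * 1024)
    batch: List[dict] = []
    batch_bytes = 0
    for record in records:
        # Approximate the record's size by its JSON encoding.
        record_bytes = len(json.dumps(record).encode('utf-8'))
        # Flush first if adding this record would exceed the size cap.
        if batch and batch_bytes + record_bytes > max_bytes:
            yield batch
            batch, batch_bytes = [], 0
        batch.append(record)
        batch_bytes += record_bytes
        # Flush once the record-count limit is hit.
        if len(batch) >= batch_size:
            yield batch
            batch, batch_bytes = [], 0
    if batch:
        yield batch  # Emit the final partial batch.
```

Each yielded list could be handed to a worker thread, so the number of records in flight at once grows with both the batch size and the thread count.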