# Elasticsearch Recon Ingestion Scripts (ERIS)
> A utility for ingesting various large-scale reconnaissance data logs into Elasticsearch
## Prerequisites
- [python](https://www.python.org/)
- [elasticsearch](https://pypi.org/project/elasticsearch/) *(`pip install elasticsearch`)*
- [aiofiles](https://pypi.org/project/aiofiles) *(`pip install aiofiles`)*
- [aiohttp](https://pypi.org/project/aiohttp) *(`pip install aiohttp`)*
## Usage
```shell
python eris.py [options] <input>
```
**Note:** The `<input>` can be a file or a directory of files, depending on the ingestion script.
### Options
###### General arguments
| Argument | Description |
|--------------|-----------------------------------------------|
| `input_path` | Path to the input file or directory |
| `--watch` | Create or watch a FIFO for real-time indexing |
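
As a rough illustration of what FIFO-based watching can look like, the sketch below creates a named pipe if it does not exist and yields lines as a producer writes them. The function name `watch_fifo` is hypothetical and not taken from ERIS, and the sketch assumes a POSIX system where `os.mkfifo` is available.

```python
import os
import stat
from typing import Iterator


def watch_fifo(path: str) -> Iterator[str]:
    """Yield non-empty lines written to the FIFO at `path` until the writer closes it."""
    if not os.path.exists(path):
        os.mkfifo(path)  # create the named pipe (POSIX only)
    elif not stat.S_ISFIFO(os.stat(path).st_mode):
        raise ValueError(f'{path} exists but is not a FIFO')

    # Opening a FIFO for reading blocks until a writer opens the other end
    with open(path, 'r') as fifo:
        for line in fifo:
            line = line.strip()
            if line:
                yield line
```

Any process that writes newline-delimited records to the pipe can act as the producer; each yielded line would then be parsed and queued for indexing.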
###### Elasticsearch arguments
| Argument | Description | Default |
|-----------------|---------------------------------------------------------|---------------------|
| `--host` | Elasticsearch host | `http://localhost/` |
| `--port` | Elasticsearch port | `9200` |
| `--user` | Elasticsearch username | `elastic` |
| `--password` | Elasticsearch password | `$ES_PASSWORD` |
| `--api-key` | Elasticsearch API Key for authentication | `$ES_APIKEY` |
| `--self-signed` | Elasticsearch connection with a self-signed certificate | |
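
To make the relationship between these arguments concrete, here is a minimal sketch of how they could be parsed and turned into client options. The helper names `build_parser` and `client_options` are hypothetical, the option keys follow the elasticsearch-py 8.x client (`hosts`, `api_key`, `basic_auth`, `verify_certs`), and mapping `--self-signed` to disabled certificate verification is an assumption rather than confirmed ERIS behavior.

```python
import argparse
import os


def build_parser() -> argparse.ArgumentParser:
    """Declare the connection arguments with the defaults listed in the table."""
    parser = argparse.ArgumentParser(description='Connection options sketch')
    parser.add_argument('--host', default='http://localhost/')
    parser.add_argument('--port', type=int, default=9200)
    parser.add_argument('--user', default='elastic')
    parser.add_argument('--password', default=os.getenv('ES_PASSWORD'))
    parser.add_argument('--api-key', default=os.getenv('ES_APIKEY'))
    parser.add_argument('--self-signed', action='store_true')
    return parser


def client_options(args: argparse.Namespace) -> dict:
    """Build keyword arguments for an Elasticsearch client (assumed 8.x names)."""
    options = {
        # Strip the trailing slash so the port can be appended to the host URL
        'hosts': [f"{args.host.rstrip('/')}:{args.port}"],
        # Assumption: a self-signed certificate means verification is skipped
        'verify_certs': not args.self_signed,
    }
    if args.api_key:
        options['api_key'] = args.api_key  # API key takes precedence in this sketch
    elif args.password:
        options['basic_auth'] = (args.user, args.password)
    return options
```

In this sketch an API key wins over a username and password when both are supplied; that precedence is a design choice made for the example only.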
###### Elasticsearch indexing arguments
| Argument | Description | Default |
|--------------|--------------------------------------|---------------------|
| `--index` | Elasticsearch index name | Depends on ingestor |
| `--pipeline` | Use an ingest pipeline for the index | |
| `--replicas` | Number of replicas for the index | `1` |
| `--shards` | Number of shards for the index | `1` |
###### Performance arguments
| Argument | Description | Default |
|-------------------|----------------------------------------------------------|---------|
| `--chunk-max` | Maximum size in MB of a chunk | `100` |
| `--chunk-size` | Number of records to index in a chunk | `50000` |
| `--retries` | Number of times to retry indexing a chunk before failing | `100` |
| `--timeout` | Number of seconds to wait before retrying a chunk | `60` |
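
The interaction between `--chunk-size` and `--chunk-max` can be sketched as a generator that flushes a batch when either limit is reached. This is only an illustration under stated assumptions: `chunk_records` is a hypothetical helper, record size is estimated from its JSON encoding, and 1 MB is taken to be 1024 * 1024 bytes; ERIS itself may measure chunks differently.

```python
import json
from typing import Iterable, Iterator, List


def chunk_records(records: Iterable[dict],
                  chunk_size: int = 50000,
                  chunk_max_mb: int = 100) -> Iterator[List[dict]]:
    """Group records into chunks bounded by record count and approximate byte size."""
    max_bytes = chunk_max_mb * 1024 * 1024
    chunk: List[dict] = []
    chunk_bytes = 0

    for record in records:
        size = len(json.dumps(record).encode('utf-8'))

        # Flush first if adding this record would exceed the byte limit
        if chunk and chunk_bytes + size > max_bytes:
            yield chunk
            chunk, chunk_bytes = [], 0

        chunk.append(record)
        chunk_bytes += size

        # Flush once the record-count limit is reached
        if len(chunk) >= chunk_size:
            yield chunk
            chunk, chunk_bytes = [], 0

    # Emit whatever is left over at the end of the input
    if chunk:
        yield chunk
```

Each yielded chunk would then be sent as one bulk request, with `--retries` and `--timeout` governing how a failed chunk is retried.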
###### Ingestion arguments
| Argument | Description |
|-------------|--------------------------|
| `--certs` | Index Certstream records |
| `--httpx` | Index HTTPX records |
| `--masscan` | Index Masscan records |
| `--massdns` | Index massdns records |
| `--zone` | Index zone DNS records |

This ingestion suite uses the built-in node sniffer, so by connecting to a single node you can load balance across the entire cluster.
It helps to know how many nodes are in your cluster when tuning these arguments for the best performance in your environment.
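
For readers unfamiliar with sniffing, the fragment below shows the kind of client settings that enable it. The parameter names are those of the elasticsearch-py 8.x clients; the values are illustrative, and whether ERIS uses exactly these settings is an assumption.

```python
# Hypothetical keyword arguments for an elasticsearch-py 8.x client
# (Elasticsearch or AsyncElasticsearch); ERIS may configure sniffing differently.
sniff_options = {
    'sniff_on_start': True,            # discover cluster nodes when the client is created
    'sniff_on_node_failure': True,     # re-discover nodes after a node stops responding
    'min_delay_between_sniffing': 60,  # seconds to wait between sniff attempts
}
```

These would be passed alongside the connection options, for example `AsyncElasticsearch(hosts=[...], **sniff_options)`, after which requests are spread across the discovered nodes.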
## GeoIP Pipeline
Create and add a GeoIP ingest pipeline, then use the following in your index mappings:
```json
"geoip": {
"city_name": "City",
"continent_name": "Continent",
"country_iso_code": "CC",
"country_name": "Country",
"location": {
"lat": 0.0000,
"lon": 0.0000
},
"region_iso_code": "RR",
"region_name": "Region"
}
```
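
A pipeline that produces a `geoip` object like the one above could be defined along these lines. This is a sketch, not the pipeline ERIS ships: the helper `geoip_pipeline` and the source field name `ip` are assumptions, while `field`, `target_field`, and `ignore_missing` are options of the Elasticsearch `geoip` ingest processor.

```python
import json


def geoip_pipeline(source_field: str = 'ip', target_field: str = 'geoip') -> dict:
    """Return an ingest pipeline body that enriches documents with GeoIP data."""
    return {
        'description': 'Add GeoIP information based on an IP address field',
        'processors': [
            {
                'geoip': {
                    'field': source_field,         # field holding the IP address (assumed name)
                    'target_field': target_field,  # where the lookup result is stored
                    'ignore_missing': True,        # skip documents without the source field
                }
            }
        ],
    }


# Serialized body, suitable for a PUT _ingest/pipeline/<name> request
pipeline_json = json.dumps(geoip_pipeline(), indent=2)
```

Once the pipeline exists in the cluster, its name is what you would pass to `--pipeline`.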
## Changelog
- Added an ingestion script that indexes certificate transparency logs in real time using WebSockets.
- Removed `--dry-run` as the project nears production level.
- Implemented [async elasticsearch](https://elasticsearch-py.readthedocs.io/en/latest/async.html) into the codebase & refactored some of the logic to accommodate it.
- The `--watch` feature now uses a FIFO to do live ingestion.
- Isolated eris.py into its own file and separated the ingestion agents into their own modules.
## Roadmap
- Fix `ingest_certs.py` so that it no longer requires a file to be passed to it
- WHOIS database ingestion scripts
- Dynamically update the batch metrics when the sniffer adds or removes nodes
___
###### Mirrors for this repository: [acid.vegas](https://git.acid.vegas/eris) • [SuperNETs](https://git.supernets.org/acidvegas/eris) • [GitHub](https://github.com/acidvegas/eris) • [GitLab](https://gitlab.com/acidvegas/eris) • [Codeberg](https://codeberg.org/acidvegas/eris)