pylcg/README.md

139 lines
3.7 KiB
Markdown
Raw Normal View History

2024-11-26 03:28:06 +00:00
# PyLCG
2024-11-26 05:13:30 +00:00
> Ultra-fast Linear Congruential Generator for IP Sharding
2024-11-26 03:28:06 +00:00
2024-11-26 05:13:30 +00:00
PyLCG is a high-performance Python implementation of a memory-efficient IP address sharding system using Linear Congruential Generators (LCG) for deterministic random number generation. This tool enables distributed scanning & network reconnaissance by efficiently dividing IP ranges across multiple machines while maintaining pseudo-random ordering.
2024-11-26 03:28:06 +00:00
2024-11-26 05:13:30 +00:00
## Features
2024-11-26 03:28:06 +00:00
2024-11-26 05:13:30 +00:00
- Memory-efficient IP range processing
- Deterministic pseudo-random IP generation
- High-performance LCG implementation
- Support for sharding across multiple machines
- Zero dependencies beyond Python standard library
- Simple command-line interface
2024-11-26 03:28:06 +00:00
2024-11-26 05:13:30 +00:00
## Installation
2024-11-26 03:28:06 +00:00
2024-11-26 05:13:30 +00:00
```bash
git clone https://github.com/acidvegas/pylcg
cd pylcg
chmod +x pylcg.py
```
2024-11-26 03:28:06 +00:00
2024-11-26 05:13:30 +00:00
## Usage
2024-11-26 03:28:06 +00:00
2024-11-26 05:13:30 +00:00
### Command Line
2024-11-26 03:28:06 +00:00
2024-11-26 05:13:30 +00:00
```bash
./pylcg.py 192.168.0.0/16 --shard-num 1 --total-shards 4 --seed 12345
```
2024-11-26 03:28:06 +00:00
2024-11-26 05:13:30 +00:00
### As a Library
2024-11-26 03:28:06 +00:00
2024-11-26 05:13:30 +00:00
```python
from pylcg import ip_stream
2024-11-26 03:28:06 +00:00
2024-11-26 05:13:30 +00:00
# Generate IPs for the first shard of 4 total shards
for ip in ip_stream('192.168.0.0/16', shard_num=1, total_shards=4, seed=12345):
print(ip)
```
2024-11-26 03:28:06 +00:00
2024-11-26 05:13:30 +00:00
## How It Works
2024-11-26 03:28:06 +00:00
2024-11-26 05:13:30 +00:00
### Linear Congruential Generator
2024-11-26 03:28:06 +00:00
2024-11-26 05:13:30 +00:00
PyLCG uses an optimized LCG implementation with carefully chosen parameters:
- Multiplier (a): 1664525
- Increment (c): 1013904223
- Modulus (m): 2^32
2024-11-26 03:28:06 +00:00
2024-11-26 05:13:30 +00:00
This generates a deterministic sequence of pseudo-random numbers using the formula:
```
next = (a * current + c) mod m
2024-11-26 03:28:06 +00:00
```
2024-11-26 05:13:30 +00:00
### Memory-Efficient IP Processing
2024-11-26 03:28:06 +00:00
2024-11-26 05:13:30 +00:00
Instead of loading entire IP ranges into memory, PyLCG:
1. Converts CIDR ranges to start/end integers
2. Uses generator functions for lazy evaluation
3. Calculates IPs on-demand using index mapping
4. Maintains constant memory usage regardless of range size
2024-11-26 03:28:06 +00:00
2024-11-26 05:13:30 +00:00
### Sharding Algorithm
2024-11-26 03:28:06 +00:00
2024-11-26 05:13:30 +00:00
The sharding system uses an interleaved approach:
1. Each shard is assigned a subset of indices based on modulo arithmetic
2. The LCG randomizes the order within each shard
3. Work is distributed evenly across shards
4. No sequential scanning patterns
2024-11-26 03:28:06 +00:00
2024-11-26 05:13:30 +00:00
## Performance
2024-11-26 03:28:06 +00:00
2024-11-26 05:13:30 +00:00
PyLCG is designed for maximum performance:
- Generates millions of IPs per second
- Constant memory usage (~100KB)
- Minimal CPU overhead
- No disk I/O required
2024-11-26 03:28:06 +00:00
2024-11-26 05:13:30 +00:00
Benchmark results on a typical system:
- IP Generation: ~5-10 million IPs/second
- Memory Usage: < 1MB for any range size
- LCG Operations: < 1 microsecond per number
2024-11-26 03:28:06 +00:00
2024-11-26 05:13:30 +00:00
## Contributing
2024-11-26 03:28:06 +00:00
2024-11-26 05:13:30 +00:00
### Performance Optimization
2024-11-26 03:28:06 +00:00
2024-11-26 05:13:30 +00:00
We welcome contributions that improve PyLCG's performance. When submitting optimizations:
2024-11-26 03:28:06 +00:00
2024-11-26 05:13:30 +00:00
1. Run the included benchmark suite:
```bash
python3 unit_test.py
```
2024-11-26 03:28:06 +00:00
2024-11-26 05:13:30 +00:00
2. Include before/after benchmark results for:
- IP generation speed
- Memory usage
- LCG sequence generation
- Shard distribution metrics
3. Consider optimizing:
- Number generation algorithms
- Memory access patterns
- CPU cache utilization
- Python-specific optimizations
4. Document any tradeoffs between:
- Speed vs memory usage
- Randomness vs performance
- Complexity vs maintainability
### Benchmark Guidelines
When running benchmarks:
1. Use consistent hardware/environment
2. Run multiple iterations
3. Test with various CIDR ranges
4. Measure both average and worst-case performance
5. Profile memory usage patterns
6. Test shard distribution uniformity
2024-11-26 03:28:06 +00:00
2024-11-26 05:09:16 +00:00
## Roadmap
2024-11-26 03:28:06 +00:00
2024-11-26 05:13:30 +00:00
- [ ] IPv6 support
- [ ] Custom LCG parameters
- [ ] Configurable chunk sizes
- [ ] State persistence
- [ ] Resume capability
- [ ] S3/URL input support
- [ ] Extended benchmark suite
## License
This project is released under the MIT License.
2024-11-26 03:28:06 +00:00
2024-11-26 05:13:30 +00:00
---
2024-11-26 03:28:06 +00:00
2024-11-26 05:13:30 +00:00
###### Mirrors: [acid.vegas](https://git.acid.vegas/pylcg) • [SuperNETs](https://git.supernets.org/acidvegas/pylcg) • [GitHub](https://github.com/acidvegas/pylcg) • [GitLab](https://gitlab.com/acidvegas/pylcg) • [Codeberg](https://codeberg.org/acidvegas/pylcg)