Compare commits
No commits in common. "v1.0.1" and "main" have entirely different histories.
2
LICENSE
@@ -1,6 +1,6 @@
ISC License

Copyright (c) 2024, acidvegas <acid.vegas@acid.vegas>
Copyright (c) 2025, acidvegas <acid.vegas@acid.vegas>

Permission to use, copy, modify, and/or distribute this software for any
purpose with or without fee is hereby granted, provided that the above
174
README.md
@@ -3,6 +3,8 @@

PyLCG is a high-performance Python implementation of a memory-efficient IP address sharding system using Linear Congruential Generators (LCG) for deterministic random number generation. This tool enables distributed scanning & network reconnaissance by efficiently dividing IP ranges across multiple machines while maintaining pseudo-random ordering.

###### A GoLang version of this library is also available [here](https://github.com/acidvegas/golcg)

## Features

- Memory-efficient IP range processing
@@ -10,28 +12,35 @@ PyLCG is a high-performance Python implementation of a memory-efficient IP addre
- High-performance LCG implementation
- Support for sharding across multiple machines
- Zero dependencies beyond Python standard library
- Simple command-line interface
- Simple command-line interface and library usage

## Installation

### From PyPI
```bash
pip install pylcg
```

### From Source
```bash
git clone https://github.com/acidvegas/pylcg
cd pylcg
chmod +x pylcg.py
```

## Usage

### Command Line

```bash
./pylcg.py 192.168.0.0/16 --shard-num 1 --total-shards 4 --seed 12345
pylcg 192.168.0.0/16 --shard-num 1 --total-shards 4 --seed 12345

# Resume from previous state
pylcg 192.168.0.0/16 --shard-num 1 --total-shards 4 --seed 12345 --state 987654321

# Pipe to dig for PTR record lookups
pylcg 192.168.0.0/16 --seed 12345 | while read ip; do
    echo -n "$ip -> "
    dig +short -x $ip
done

# One-liner for PTR lookups
pylcg 198.150.0.0/16 | xargs -I {} dig +short -x {}

# Parallel PTR lookups
pylcg 198.150.0.0/16 | parallel "dig +short -x {} | sed 's/^/{} -> /'"
```

### As a Library
@@ -39,55 +48,103 @@ chmod +x pylcg.py
```python
from pylcg import ip_stream

# Generate IPs for the first shard of 4 total shards
# Basic usage
for ip in ip_stream('192.168.0.0/16', shard_num=1, total_shards=4, seed=12345):
    print(ip)

# Resume from previous state
for ip in ip_stream('192.168.0.0/16', shard_num=1, total_shards=4, seed=12345, state=987654321):
    print(ip)
```

## State Management & Resume Capability

PyLCG automatically saves its state every 1000 IPs processed to enable resume functionality in case of interruption. The state is saved to a temporary file in your system's temp directory (usually `/tmp` on Unix systems or `%TEMP%` on Windows).

The state file follows the naming pattern:
```
pylcg_[seed]_[cidr]_[shard]_[total].state
```

For example:
```
pylcg_12345_192.168.0.0_16_1_4.state
```

The state file contains the current LCG value written as a plain-text integer. To resume from a previous state:

1. Locate your state file in the temp directory
2. Read the state value from the file
3. Use the same parameters (CIDR, seed, shard settings) with the `--state` parameter

Example of resuming:
```bash
# Read the last state
state=$(cat /tmp/pylcg_12345_192.168.0.0_16_1_4.state)

# Resume processing
pylcg 192.168.0.0/16 --shard-num 1 --total-shards 4 --seed 12345 --state $state
```

Note: When using the `--state` parameter, you must provide the same `--seed` that was used in the original run.
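
The lookup described above can also be done from Python. The following is a minimal sketch, assuming the file name pattern shown earlier and a single plain-text integer in the file; `state_path` and `load_state` are hypothetical helpers written for illustration, not part of PyLCG.

```python
import os
import tempfile
from typing import Optional


def state_path(seed: int, cidr: str, shard: int, total: int) -> str:
    """Build the expected state file path (hypothetical helper)."""
    # Follows the documented pattern: pylcg_[seed]_[cidr]_[shard]_[total].state
    file_name = f'pylcg_{seed}_{cidr.replace("/", "_")}_{shard}_{total}.state'
    return os.path.join(tempfile.gettempdir(), file_name)


def load_state(seed: int, cidr: str, shard: int, total: int) -> Optional[int]:
    """Return the saved LCG state, or None when no state file exists."""
    path = state_path(seed, cidr, shard, total)
    if not os.path.exists(path):
        return None
    with open(path) as f:
        return int(f.read().strip())
```

The returned value could then be passed as `state=` to `ip_stream`, together with the original seed, CIDR, and shard settings.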

## How It Works

### IP Address Integer Representation

Every IPv4 address is fundamentally a 32-bit number. For example, the IP address "192.168.1.1" can be broken down into its octets (192, 168, 1, 1) and converted to a single integer:
```
192.168.1.1 = (192 × 256³) + (168 × 256²) + (1 × 256¹) + (1 × 256⁰)
            = 3232235777
```
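
As a quick check of the arithmetic above, here is a minimal sketch using only the standard library. `ip_to_int` is a name chosen for illustration; the `ipaddress` module performs the same conversion.

```python
import ipaddress


def ip_to_int(ip: str) -> int:
    """Convert a dotted-quad IPv4 string to its 32-bit integer value."""
    value = 0
    for octet in ip.split('.'):
        # Multiply the accumulated value by 256 and add the next octet
        value = value * 256 + int(octet)
    return value


# The standard library gives the same answer
assert ip_to_int('192.168.1.1') == int(ipaddress.ip_address('192.168.1.1'))
print(ip_to_int('192.168.1.1'))  # 3232235777
```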

This integer representation allows us to treat IP ranges as simple number sequences. A CIDR block like "192.168.0.0/16" becomes a continuous range of integers:
- Start: 192.168.0.0 → 3232235520
- End: 192.168.255.255 → 3232301055
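
The start and end values can be reproduced with the standard `ipaddress` module. This short sketch reads the same two attributes (`network_address` and `broadcast_address`) that the `IPRange` class in this diff uses; the variable names are illustrative.

```python
import ipaddress

network = ipaddress.ip_network('192.168.0.0/16')
start = int(network.network_address)     # first address in the block
end = int(network.broadcast_address)     # last address in the block
range_size = end - start + 1             # number of addresses in the block

print(start, end, range_size)  # 3232235520 3232301055 65536
```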

By working with these integer representations, we can perform efficient mathematical operations on IP addresses without the overhead of string manipulation or complex data structures. This is where the Linear Congruential Generator comes into play.

### Linear Congruential Generator

PyLCG uses an optimized LCG implementation with carefully chosen parameters:
PyLCG uses an optimized LCG implementation with three carefully chosen parameters that work together to generate high-quality pseudo-random sequences:

| Name       | Variable | Value        |
|------------|----------|--------------|
| Multiplier | `a`      | `1664525`    |
| Increment  | `c`      | `1013904223` |
| Modulus    | `m`      | `2^32`       |

This generates a deterministic sequence of pseudo-random numbers using the formula:
```
next = (a * current + c) mod m
```
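
A minimal sketch of one step of this recurrence, using the parameters from the table above. The function name `lcg_next` is illustrative; the `LCG` class in this diff keeps the state on an instance instead.

```python
A = 1664525        # multiplier
C = 1013904223     # increment
M = 2 ** 32        # modulus


def lcg_next(current: int) -> int:
    """Advance the LCG by one step."""
    return (A * current + C) % M


# The same starting value always yields the same sequence
state = 12345
for _ in range(3):
    state = lcg_next(state)
    print(state)
```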
###### Modulus
The modulus value of `2^32` is both a mathematical and a performance choice. Because it is a power of two, the modulo operation reduces to a single bitwise AND with `2^32 - 1`. It also equals the size of the IPv4 address space, so the generator's period is large enough for even the biggest IPv4 range we might encounter.
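
The equivalence between the modulo and the bitwise AND can be checked directly. This is only an illustration of the identity; whether the AND form is measurably faster in CPython is not something this sketch claims.

```python
M = 2 ** 32
MASK = M - 1  # 0xFFFFFFFF


def step_mod(x: int) -> int:
    """LCG step using the modulo operator."""
    return (1664525 * x + 1013904223) % M


def step_mask(x: int) -> int:
    """Same step using a bitwise AND, valid because M is a power of two."""
    return (1664525 * x + 1013904223) & MASK


for x in (0, 1, 12345, 2 ** 32 - 1):
    assert step_mod(x) == step_mask(x)
```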

### Memory-Efficient IP Processing
###### Multiplier
The multiplier value of `1664525` is the one popularized by Numerical Recipes. It satisfies the Hull-Dobell theorem's requirement on the multiplier for maximum period length in power-of-2 modulus LCGs: `a - 1` is a multiple of 4. (The theorem separately requires the increment, not the multiplier, to be relatively prime to the modulus.) This value also performs well in spectral tests, giving good distribution properties across the entire range. Python integers are arbitrary precision, so the intermediate product `a * current` cannot overflow before the modulus is applied.
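
The Hull-Dobell conditions can be checked in a few lines. The sketch below is restricted to power-of-two moduli, and `hull_dobell_ok` and `period` are hypothetical helper names. Because walking all `2^32` states is slow, the period is measured on a reduced modulus of `2^16`, where the same multiplier and increment still satisfy the conditions.

```python
import math
from typing import Optional


def hull_dobell_ok(a: int, c: int, m: int) -> bool:
    """Check the Hull-Dobell conditions for a power-of-two modulus m >= 4."""
    if m < 4 or m & (m - 1) != 0:
        raise ValueError('this sketch only handles power-of-two moduli >= 4')
    increment_coprime = math.gcd(c, m) == 1   # c must share no factor with m
    multiplier_ok = (a - 1) % 4 == 0          # a - 1 divisible by 2 and by 4
    return increment_coprime and multiplier_ok


def period(a: int, c: int, m: int, seed: int = 0) -> Optional[int]:
    """Count steps until the sequence returns to seed (None if it never does within m steps)."""
    x = seed
    for n in range(1, m + 1):
        x = (a * x + c) % m
        if x == seed:
            return n
    return None


print(hull_dobell_ok(1664525, 1013904223, 2 ** 32))   # True
print(period(1664525, 1013904223, 2 ** 16))           # 65536
```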

Instead of loading entire IP ranges into memory, PyLCG:
1. Converts CIDR ranges to start/end integers
2. Uses generator functions for lazy evaluation
3. Calculates IPs on-demand using index mapping
4. Maintains constant memory usage regardless of range size
###### Increment
The increment value of `1013904223` is a carefully selected prime number that completes our parameter trio. When combined with our chosen multiplier and modulus, it ensures optimal bit mixing throughout the sequence and helps eliminate common LCG issues like short cycles or poor distribution. This specific value was selected after extensive testing showed it produced excellent statistical properties and passed rigorous spectral tests for dimensional distribution.

### Applying LCG to IP Addresses

Once we have our IP addresses as integers, the LCG is used to generate a pseudo-random sequence that permutes through all possible values in our IP range:

1. For a given IP range *(start_ip, end_ip)*, we calculate the range size: `range_size = end_ip - start_ip + 1`

2. The LCG generates a sequence using the formula: `X_{n+1} = (a * X_n + c) mod m`

3. To map this sequence back to valid IPs in our range:
   - Generate the next LCG value
   - Take modulo of the value with range_size to get an offset: `offset = lcg_value % range_size`
   - Add this offset to start_ip: `ip = start_ip + offset`

This process ensures that:
- Every IP in the range is visited exactly once
- The sequence appears random but is deterministic
- We maintain constant memory usage regardless of range size
- The same seed always produces the same sequence
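
The steps above can be sketched for a single, unsharded range as follows. `shuffled_ips` is a hypothetical name and this is an illustration of the listed steps rather than PyLCG's exact code. The "exactly once" property relies on the CIDR block size being a power of two that divides `2^32`: the low bits of a full-period power-of-two LCG form a full-period LCG themselves, so `range_size` consecutive steps produce each offset once.

```python
import ipaddress
from typing import Iterator

A, C, M = 1664525, 1013904223, 2 ** 32


def shuffled_ips(cidr: str, seed: int) -> Iterator[str]:
    """Yield every address in cidr once, in LCG order (illustrative sketch)."""
    network = ipaddress.ip_network(cidr)
    start_ip = int(network.network_address)
    range_size = network.num_addresses
    state = seed
    for _ in range(range_size):
        state = (A * state + C) % M       # generate the next LCG value
        offset = state % range_size       # fold it into the range
        yield str(ipaddress.ip_address(start_ip + offset))


ips = list(shuffled_ips('192.168.1.0/24', seed=12345))
assert len(set(ips)) == 256  # each address appears exactly once
```

Only the current LCG state is held in memory, so the footprint does not grow with the size of the range.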

### Sharding Algorithm

The sharding system uses an interleaved approach:
1. Each shard is assigned a subset of indices based on modulo arithmetic
2. The LCG randomizes the order within each shard
3. Work is distributed evenly across shards
4. No sequential scanning patterns

## Performance

PyLCG is designed for maximum performance:
- Generates millions of IPs per second
- Constant memory usage (~100KB)
- Minimal CPU overhead
- No disk I/O required

Benchmark results on a typical system:
- IP Generation: ~5-10 million IPs/second
- Memory Usage: < 1MB for any range size
- LCG Operations: < 1 microsecond per number
The sharding system employs an interleaved approach that ensures even distribution of work across multiple machines while maintaining randomness. Each shard operates independently using a deterministic sequence derived from the base seed plus the shard index. The system distributes IPs across shards using modulo arithmetic, ensuring that each IP is assigned to exactly one shard. This approach prevents sequential scanning patterns while guaranteeing complete coverage of the IP range. The result is a system that can efficiently parallelize work across any number of machines while maintaining the pseudo-random ordering that's crucial for network scanning applications.
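
The interleaving can be illustrated on plain indices. The sketch below follows the logic of `ip_stream` as it appears in this diff (per-shard seed offset, modulo ownership, remainder distribution), reduced to integers; `shard_indices` is a hypothetical name, and the coverage guarantee assumes `range_size` is a power of two, as it is for any CIDR block.

```python
from typing import Iterator

A, C, M = 1664525, 1013904223, 2 ** 32


def shard_indices(range_size: int, shard_num: int, total_shards: int, seed: int) -> Iterator[int]:
    """Yield the indices owned by one shard (1-based shard_num), in LCG order."""
    shard_index = shard_num - 1
    # Each shard owns floor(range_size / total_shards) indices,
    # plus one more if it falls within the remainder
    remaining = range_size // total_shards
    if shard_index < range_size % total_shards:
        remaining += 1
    state = seed + shard_index                       # per-shard starting point
    while remaining > 0:
        state = (A * state + C) % M
        index = state % range_size
        if index % total_shards == shard_index:      # interleaved ownership
            yield index
            remaining -= 1


shards = [set(shard_indices(256, n, 4, seed=12345)) for n in range(1, 5)]
assert set().union(*shards) == set(range(256))   # complete coverage, no overlap
```

Each shard discards the indices it does not own, so a shard performs roughly `total_shards` LCG steps per address it emits.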

## Contributing

@@ -100,43 +157,6 @@ We welcome contributions that improve PyLCG's performance. When submitting optim
python3 unit_test.py
```

2. Include before/after benchmark results for:
   - IP generation speed
   - Memory usage
   - LCG sequence generation
   - Shard distribution metrics

3. Consider optimizing:
   - Number generation algorithms
   - Memory access patterns
   - CPU cache utilization
   - Python-specific optimizations

4. Document any tradeoffs between:
   - Speed vs memory usage
   - Randomness vs performance
   - Complexity vs maintainability

### Benchmark Guidelines

When running benchmarks:
1. Use consistent hardware/environment
2. Run multiple iterations
3. Test with various CIDR ranges
4. Measure both average and worst-case performance
5. Profile memory usage patterns
6. Test shard distribution uniformity

## Roadmap

- [ ] IPv6 support
- [ ] Custom LCG parameters
- [ ] Configurable chunk sizes
- [ ] State persistence
- [ ] Resume capability
- [ ] S3/URL input support
- [ ] Extended benchmark suite

---

###### Mirrors: [acid.vegas](https://git.acid.vegas/pylcg) • [SuperNETs](https://git.supernets.org/acidvegas/pylcg) • [GitHub](https://github.com/acidvegas/pylcg) • [GitLab](https://gitlab.com/acidvegas/pylcg) • [Codeberg](https://codeberg.org/acidvegas/pylcg)
114
pylcg.py
@@ -1,114 +0,0 @@
#!/usr/bin/env python3
# Python implementation of a Linear Congruential Generator for IP Sharding - Developed by acidvegas in Python (https://git.acid.vegas/pylcg)
# pylcg.py

import argparse
import ipaddress
import random


class LCG:
    '''Linear Congruential Generator for deterministic random number generation'''

    def __init__(self, seed: int, m: int = 2**32):
        self.m = m
        self.a = 1664525
        self.c = 1013904223
        self.current = seed


    def next(self) -> int:
        '''Generate next random number'''

        self.current = (self.a * self.current + self.c) % self.m

        return self.current


class IPRange:
    '''Memory-efficient IP range iterator'''

    def __init__(self, cidr: str):
        network = ipaddress.ip_network(cidr)
        self.start = int(network.network_address)
        self.total = int(network.broadcast_address) - self.start + 1


    def get_ip_at_index(self, index: int) -> str:
        '''
        Get IP at specific index without generating previous IPs

        :param index: The index of the IP to get
        '''

        if not 0 <= index < self.total:
            raise IndexError('IP index out of range')

        return str(ipaddress.ip_address(self.start + index))


def ip_stream(cidr: str, shard_num: int = 1, total_shards: int = 1, seed: int = 0):
    '''
    Stream random IPs from the CIDR range. Optionally supports sharding.
    Each IP in the range will be yielded exactly once in a pseudo-random order.

    :param cidr: Target IP range in CIDR format
    :param shard_num: Shard number (1-based), defaults to 1
    :param total_shards: Total number of shards, defaults to 1 (no sharding)
    :param seed: Random seed for LCG (default: random)
    '''
    # Convert to 0-based indexing internally
    shard_index = shard_num - 1

    # Initialize IP range and LCG
    ip_range = IPRange(cidr)

    # Use random seed if none provided
    if not seed:
        seed = random.randint(0, 2**32-1)

    # Initialize LCG
    lcg = LCG(seed + shard_index)

    # Calculate how many IPs this shard should generate
    shard_size = ip_range.total // total_shards

    # Distribute remainder
    if shard_index < (ip_range.total % total_shards):
        shard_size += 1

    # Remaining IPs to yield
    remaining = shard_size

    while remaining > 0:
        index = lcg.next() % ip_range.total
        if total_shards == 1 or index % total_shards == shard_index:
            yield ip_range.get_ip_at_index(index)
            remaining -= 1


def main():
    parser = argparse.ArgumentParser(description='Ultra-fast random IP address generator with optional sharding')
    parser.add_argument('cidr', help='Target IP range in CIDR format')
    parser.add_argument('--shard-num', type=int, default=1, help='Shard number (1-based)')
    parser.add_argument('--total-shards', type=int, default=1, help='Total number of shards (default: 1, no sharding)')
    parser.add_argument('--seed', type=int, default=0, help='Random seed for LCG')

    args = parser.parse_args()

    if args.total_shards < 1:
        raise ValueError('Total shards must be at least 1')

    if args.shard_num > args.total_shards:
        raise ValueError('Shard number must be less than or equal to total shards')

    if args.shard_num < 1:
        raise ValueError('Shard number must be at least 1')

    for ip in ip_stream(args.cidr, args.shard_num, args.total_shards, args.seed):
        print(ip)



if __name__ == '__main__':
    main()
@@ -1,5 +1,9 @@
#!/usr/bin/env python
# PyLCG - Linear Congruential Generator for IP Sharding - Developed by acidvegas in Python (https://github.com/acidvegas/pylcg)
# pylcg/__init__.py

from .core import LCG, IPRange, ip_stream

__version__ = "1.0.0"
__author__ = "acidvegas"
__all__ = ["LCG", "IPRange", "ip_stream"]
__version__ = "1.0.3"
__author__ = "acidvegas"
__all__ = ["LCG", "IPRange", "ip_stream"]
52
pylcg/cli.py
@@ -1,26 +1,38 @@
#!/usr/bin/env python
# PyLCG - Linear Congruential Generator for IP Sharding - Developed by acidvegas in Python (https://github.com/acidvegas/pylcg)
# pylcg/cli.py

import argparse

from .core import ip_stream


def main():
    parser = argparse.ArgumentParser(description='Ultra-fast random IP address generator with optional sharding')
    parser.add_argument('cidr', help='Target IP range in CIDR format')
    parser.add_argument('--shard-num', type=int, default=1, help='Shard number (1-based)')
    parser.add_argument('--total-shards', type=int, default=1, help='Total number of shards (default: 1, no sharding)')
    parser.add_argument('--seed', type=int, default=0, help='Random seed for LCG')

    args = parser.parse_args()

    if args.total_shards < 1:
        raise ValueError('Total shards must be at least 1')

    if args.shard_num > args.total_shards:
        raise ValueError('Shard number must be less than or equal to total shards')

    if args.shard_num < 1:
        raise ValueError('Shard number must be at least 1')

    for ip in ip_stream(args.cidr, args.shard_num, args.total_shards, args.seed):
        print(ip)
    parser = argparse.ArgumentParser(description='Ultra-fast random IP address generator with optional sharding')
    parser.add_argument('cidr', help='Target IP range in CIDR format')
    parser.add_argument('--shard-num', type=int, default=1, help='Shard number (1-based)')
    parser.add_argument('--total-shards', type=int, default=1, help='Total number of shards (default: 1, no sharding)')
    parser.add_argument('--seed', type=int, required=True, help='Random seed for LCG (required)')
    parser.add_argument('--state', type=int, help='Resume from specific LCG state (must be used with same seed)')

    args = parser.parse_args()

    if args.total_shards < 1:
        raise ValueError('Total shards must be at least 1')

    if args.shard_num > args.total_shards:
        raise ValueError('Shard number must be less than or equal to total shards')

    if args.shard_num < 1:
        raise ValueError('Shard number must be at least 1')

    if args.state is not None and not args.seed:
        raise ValueError('When using --state, you must provide the same --seed that was used originally')

    for ip in ip_stream(args.cidr, args.shard_num, args.total_shards, args.seed, args.state):
        print(ip)



if __name__ == '__main__':
    main()
    main()
@@ -1,19 +1,93 @@
#!/usr/bin/env python
# PyLCG - Linear Congruential Generator for IP Sharding - Developed by acidvegas in Python (https://github.com/acidvegas/pylcg)
# pylcg/core.py

import ipaddress
import random


class LCG:
    '''Linear Congruential Generator for deterministic random number generation'''
    '''Linear Congruential Generator for deterministic random number generation'''

    def __init__(self, seed: int, m: int = 2**32):
        self.m = m
        self.a = 1664525
        self.c = 1013904223
        self.current = seed
    def __init__(self, seed: int, m: int = 2**32):
        self.m = m
        self.a = 1664525
        self.c = 1013904223
        self.current = seed

    def next(self) -> int:
        '''Generate next random number'''
        self.current = (self.a * self.current + self.c) % self.m
        return self.current
    def next(self) -> int:
        '''Generate next random number'''

# Rest of the code from pylcg.py goes here...
# (IPRange class and ip_stream function)
        self.current = (self.a * self.current + self.c) % self.m
        return self.current


class IPRange:
    '''Memory-efficient IP range iterator'''

    def __init__(self, cidr: str):
        network = ipaddress.ip_network(cidr)
        self.start = int(network.network_address)
        self.total = int(network.broadcast_address) - self.start + 1

    def get_ip_at_index(self, index: int) -> str:
        '''
        Get IP at specific index without generating previous IPs

        :param index: The index of the IP to get
        '''

        if not 0 <= index < self.total:
            raise IndexError('IP index out of range')

        return str(ipaddress.ip_address(self.start + index))


def ip_stream(cidr: str, shard_num: int = 1, total_shards: int = 1, seed: int = 0, state: int = None):
    '''
    Stream random IPs from the CIDR range. Optionally supports sharding.
    Each IP in the range will be yielded exactly once in a pseudo-random order.

    :param cidr: Target IP range in CIDR format
    :param shard_num: Shard number (1-based), defaults to 1
    :param total_shards: Total number of shards, defaults to 1 (no sharding)
    :param seed: Random seed for LCG (default: random)
    :param state: Resume from specific LCG state (default: None)
    '''

    # Convert to 0-based indexing internally
    shard_index = shard_num - 1

    # Initialize IP range and LCG
    ip_range = IPRange(cidr)

    # Use random seed if none provided
    if not seed:
        seed = random.randint(0, 2**32-1)

    # Initialize LCG
    lcg = LCG(seed + shard_index)

    # Set LCG state if provided
    if state is not None:
        lcg.current = state

    # Calculate how many IPs this shard should generate
    shard_size = ip_range.total // total_shards

    # Distribute remainder
    if shard_index < (ip_range.total % total_shards):
        shard_size += 1

    # Remaining IPs to yield
    remaining = shard_size

    while remaining > 0:
        index = lcg.next() % ip_range.total
        if total_shards == 1 or index % total_shards == shard_index:
            yield ip_range.get_ip_at_index(index)
            remaining -= 1
            # Save state every 1000 IPs
            if remaining % 1000 == 0:
                from .state import save_state
                save_state(seed, cidr, shard_num, total_shards, lcg.current)
24
pylcg/state.py
Normal file
@@ -0,0 +1,24 @@
#!/usr/bin/env python
# PyLCG - Linear Congruential Generator for IP Sharding - Developed by acidvegas in Python (https://github.com/acidvegas/pylcg)
# pylcg/state.py

import os
import tempfile


def save_state(seed: int, cidr: str, shard: int, total: int, lcg_current: int):
    '''
    Save LCG state to temp file

    :param seed: Random seed for LCG
    :param cidr: Target IP range in CIDR format
    :param shard: Shard number (1-based)
    :param total: Total number of shards
    :param lcg_current: Current LCG state
    '''

    file_name = f'pylcg_{seed}_{cidr.replace("/", "_")}_{shard}_{total}.state'
    state_file = os.path.join(tempfile.gettempdir(), file_name)

    with open(state_file, 'w') as f:
        f.write(str(lcg_current))
82
setup.py
@@ -1,43 +1,47 @@
#!/usr/bin/env python
# PyLCG - Linear Congruential Generator for IP Sharding - Developed by acidvegas in Python (https://github.com/acidvegas/pylcg)
# setup.py

from setuptools import setup, find_packages

with open("README.md", "r", encoding="utf-8") as fh:
    long_description = fh.read()
with open('README.md', 'r', encoding='utf-8') as fh:
    long_description = fh.read()

setup(
    name="pylcg",
    version="1.0.0",
    author="acidvegas",
    author_email="acid.vegas@acid.vegas",
    description="Linear Congruential Generator for IP Sharding",
    long_description=long_description,
    long_description_content_type="text/markdown",
    url="https://github.com/acidvegas/pylcg",
    project_urls={
        "Bug Tracker": "https://github.com/acidvegas/pylcg/issues",
        "Documentation": "https://github.com/acidvegas/pylcg#readme",
        "Source Code": "https://github.com/acidvegas/pylcg",
    },
    classifiers=[
        "Development Status :: 5 - Production/Stable",
        "Intended Audience :: Developers",
        "License :: OSI Approved :: ISC License (ISCL)",
        "Operating System :: OS Independent",
        "Programming Language :: Python :: 3",
        "Programming Language :: Python :: 3.6",
        "Programming Language :: Python :: 3.7",
        "Programming Language :: Python :: 3.8",
        "Programming Language :: Python :: 3.9",
        "Programming Language :: Python :: 3.10",
        "Programming Language :: Python :: 3.11",
        "Topic :: Internet",
        "Topic :: Security",
        "Topic :: Software Development :: Libraries :: Python Modules",
    ],
    packages=find_packages(),
    python_requires=">=3.6",
    entry_points={
        'console_scripts': [
            'pylcg=pylcg.cli:main',
        ],
    },
)
    name='pylcg',
    version='1.0.3',
    author='acidvegas',
    author_email='acid.vegas@acid.vegas',
    description='Linear Congruential Generator for IP Sharding',
    long_description=long_description,
    long_description_content_type='text/markdown',
    url='https://github.com/acidvegas/pylcg',
    project_urls={
        'Bug Tracker': 'https://github.com/acidvegas/pylcg/issues',
        'Documentation': 'https://github.com/acidvegas/pylcg#readme',
        'Source Code': 'https://github.com/acidvegas/pylcg',
    },
    classifiers=[
        'Development Status :: 5 - Production/Stable',
        'Intended Audience :: Developers',
        'License :: OSI Approved :: ISC License (ISCL)',
        'Operating System :: OS Independent',
        'Programming Language :: Python :: 3',
        'Programming Language :: Python :: 3.6',
        'Programming Language :: Python :: 3.7',
        'Programming Language :: Python :: 3.8',
        'Programming Language :: Python :: 3.9',
        'Programming Language :: Python :: 3.10',
        'Programming Language :: Python :: 3.11',
        'Topic :: Internet',
        'Topic :: Security',
        'Topic :: Software Development :: Libraries :: Python Modules',
    ],
    packages=find_packages(),
    python_requires='>=3.6',
    entry_points={
        'console_scripts': [
            'pylcg=pylcg.cli:main',
        ],
    },
)
245
unit_test.py
@ -1,135 +1,150 @@
|
||||
#!/usr/bin/env python3
|
||||
import unittest
|
||||
#!/usr/bin/env python
|
||||
# PyLCG - Linear Congruential Generator for IP Sharding - Developed by acidvegas ib Python (https://github.com/acidvegas/pylcg)
|
||||
# unit_test.py
|
||||
|
||||
|
||||
import ipaddress
|
||||
import time
|
||||
import unittest
|
||||
|
||||
from pylcg import IPRange, ip_stream, LCG
|
||||
|
||||
|
||||
class Colors:
|
||||
BLUE = '\033[94m'
|
||||
GREEN = '\033[92m'
|
||||
YELLOW = '\033[93m'
|
||||
CYAN = '\033[96m'
|
||||
RED = '\033[91m'
|
||||
ENDC = '\033[0m'
|
||||
BLUE = '\033[94m'
|
||||
GREEN = '\033[92m'
|
||||
YELLOW = '\033[93m'
|
||||
CYAN = '\033[96m'
|
||||
RED = '\033[91m'
|
||||
ENDC = '\033[0m'
|
||||
|
||||
def print_header(message: str) -> None:
|
||||
print(f'\n\n{Colors.BLUE}{"="*80}')
|
||||
print(f'TEST: {message}')
|
||||
print(f'{"="*80}{Colors.ENDC}\n')
|
||||
print(f'\n\n{Colors.BLUE}{"="*80}')
|
||||
print(f'TEST: {message}')
|
||||
print(f'{"="*80}{Colors.ENDC}\n')
|
||||
|
||||
|
||||
def print_success(message: str) -> None:
|
||||
print(f'{Colors.GREEN}✓ {message}{Colors.ENDC}')
|
||||
print(f'{Colors.GREEN}✓ {message}{Colors.ENDC}')
|
||||
|
||||
|
||||
def print_info(message: str) -> None:
|
||||
print(f"{Colors.CYAN}ℹ {message}{Colors.ENDC}")
|
||||
print(f"{Colors.CYAN}ℹ {message}{Colors.ENDC}")
|
||||
|
||||
|
||||
def print_warning(message: str) -> None:
|
||||
print(f"{Colors.YELLOW}! {message}{Colors.ENDC}")
|
||||
print(f"{Colors.YELLOW}! {message}{Colors.ENDC}")
|
||||
|
||||
|
||||
class TestIPSharder(unittest.TestCase):
|
||||
@classmethod
|
||||
def setUpClass(cls):
|
||||
print_header('Setting up test environment')
|
||||
cls.test_cidr = '192.0.0.0/16' # 65,536 IPs
|
||||
cls.test_seed = 12345
|
||||
cls.total_shards = 4
|
||||
|
||||
# Calculate expected IPs
|
||||
network = ipaddress.ip_network(cls.test_cidr)
|
||||
cls.all_ips = {str(ip) for ip in network}
|
||||
print_success(f"Initialized test environment with {len(cls.all_ips):,} IPs")
|
||||
@classmethod
|
||||
def setUpClass(cls):
|
||||
print_header('Setting up test environment')
|
||||
cls.test_cidr = '192.0.0.0/16' # 65,536 IPs
|
||||
cls.test_seed = 12345
|
||||
cls.total_shards = 4
|
||||
|
||||
def test_ip_range_initialization(self):
|
||||
print_header('Testing IPRange initialization')
|
||||
start_time = time.perf_counter()
|
||||
|
||||
ip_range = IPRange(self.test_cidr)
|
||||
self.assertEqual(ip_range.total, 65536)
|
||||
|
||||
first_ip = ip_range.get_ip_at_index(0)
|
||||
last_ip = ip_range.get_ip_at_index(ip_range.total - 1)
|
||||
|
||||
elapsed = time.perf_counter() - start_time
|
||||
print_success(f'IP range initialization completed in {elapsed:.6f}s')
|
||||
print_info(f'IP range spans from {first_ip} to {last_ip}')
|
||||
print_info(f'Total IPs in range: {ip_range.total:,}')
|
||||
# Calculate expected IPs
|
||||
network = ipaddress.ip_network(cls.test_cidr)
|
||||
cls.all_ips = {str(ip) for ip in network}
|
||||
print_success(f"Initialized test environment with {len(cls.all_ips):,} IPs")
|
||||
|
||||
|
||||
def test_ip_range_initialization(self):
|
||||
print_header('Testing IPRange initialization')
|
||||
start_time = time.perf_counter()
|
||||
|
||||
ip_range = IPRange(self.test_cidr)
|
||||
self.assertEqual(ip_range.total, 65536)
|
||||
|
||||
first_ip = ip_range.get_ip_at_index(0)
|
||||
last_ip = ip_range.get_ip_at_index(ip_range.total - 1)
|
||||
|
||||
elapsed = time.perf_counter() - start_time
|
||||
print_success(f'IP range initialization completed in {elapsed:.6f}s')
|
||||
print_info(f'IP range spans from {first_ip} to {last_ip}')
|
||||
print_info(f'Total IPs in range: {ip_range.total:,}')
|
||||
|
||||
|
||||
def test_lcg_sequence(self):
|
||||
print_header('Testing LCG sequence generation')
|
||||
|
||||
# Test sequence generation speed
|
||||
lcg = LCG(seed=self.test_seed)
|
||||
iterations = 1_000_000
|
||||
|
||||
start_time = time.perf_counter()
|
||||
for _ in range(iterations):
|
||||
lcg.next()
|
||||
elapsed = time.perf_counter() - start_time
|
||||
|
||||
print_success(f'Generated {iterations:,} random numbers in {elapsed:.6f}s')
|
||||
print_info(f'Average time per number: {(elapsed/iterations)*1000000:.2f} microseconds')
|
||||
|
||||
# Test deterministic behavior
|
||||
lcg1 = LCG(seed=self.test_seed)
|
||||
lcg2 = LCG(seed=self.test_seed)
|
||||
|
||||
start_time = time.perf_counter()
|
||||
for _ in range(1000):
|
||||
self.assertEqual(lcg1.next(), lcg2.next())
|
||||
elapsed = time.perf_counter() - start_time
|
||||
|
||||
print_success(f'Verified LCG determinism in {elapsed:.6f}s')
|
||||
|
||||
|
||||
    def test_shard_distribution(self):
        print_header('Testing shard distribution and randomness')

        # Test distribution across shards
        sample_size = 65_536 # Full size for /16
        shard_counts = {i: 0 for i in range(1, self.total_shards + 1)} # 1-based sharding
        unique_ips = set()
        duplicate_count = 0

        start_time = time.perf_counter()

        # Collect IPs from each shard
        for shard in range(1, self.total_shards + 1): # 1-based sharding
            ip_gen = ip_stream(self.test_cidr, shard, self.total_shards, self.test_seed)
            shard_unique = set()

            # Get all IPs from this shard
            for ip in ip_gen:
                if ip in unique_ips:
                    duplicate_count += 1
                else:
                    unique_ips.add(ip)
                    shard_unique.add(ip)

            shard_counts[shard] = len(shard_unique)

        elapsed = time.perf_counter() - start_time

        # Print distribution statistics
        print_success(f'Generated {len(unique_ips):,} IPs in {elapsed:.6f}s')
        print_info(f'Average time per IP: {(elapsed/len(unique_ips))*1000000:.2f} microseconds')
        print_info(f'Unique IPs generated: {len(unique_ips):,}')

        if duplicate_count > 0:
            print_warning(f'Duplicates found: {duplicate_count:,} ({(duplicate_count/len(unique_ips))*100:.2f}%)')

        expected_per_shard = sample_size // self.total_shards
        for shard, count in shard_counts.items():
            deviation = abs(count - expected_per_shard) / expected_per_shard * 100
            print_info(f'Shard {shard}: {count:,} unique IPs ({deviation:.2f}% deviation from expected)')

        # Test randomness by checking sequential patterns
        ips_list = sorted([int(ipaddress.ip_address(ip)) for ip in list(unique_ips)[:1000]])
        sequential_count = sum(1 for i in range(len(ips_list)-1) if ips_list[i] + 1 == ips_list[i+1])
        sequential_percentage = (sequential_count / (len(ips_list)-1)) * 100

        print_info(f'Sequential IP pairs in first 1000: {sequential_percentage:.2f}% (lower is more random)')

if __name__ == '__main__':
    print(f"\n{Colors.CYAN}{'='*80}")
    print(f"Starting IP Sharder Tests - Testing with 65,536 IPs (/16 network)")
    print(f"{'='*80}{Colors.ENDC}\n")
    unittest.main(verbosity=2)