Compare commits
4 Commits
| Author | SHA1 | Date |
|---|---|---|
| | e0228615dc | |
| | 778fb772ec | |
| | 0c6927220f | |
| | 445b6dcf26 | |
34
Makefile
Normal file
@@ -0,0 +1,34 @@
+# Compiler settings
+CC = gcc
+CFLAGS = -Wall -Wextra -O2
+PREFIX = /usr/local
+VERSION = 1.0.1
+
+# Files
+PROG = shardz
+SOURCES = shardz.c
+OBJECTS = $(SOURCES:.c=.o)
+
+# Targets
+all: $(PROG)
+
+$(PROG): $(OBJECTS)
+	$(CC) $(OBJECTS) $(CFLAGS) -o $(PROG)
+
+install: $(PROG)
+	install -d $(DESTDIR)$(PREFIX)/bin
+	install -m 755 $(PROG) $(DESTDIR)$(PREFIX)/bin/$(PROG)
+	install -d $(DESTDIR)$(PREFIX)/lib/pkgconfig
+	install -m 644 shardz.pc $(DESTDIR)$(PREFIX)/lib/pkgconfig/
+	install -d $(DESTDIR)$(PREFIX)/share/man/man1
+	install -m 644 man/shardz.1 $(DESTDIR)$(PREFIX)/share/man/man1/
+
+uninstall:
+	rm -f $(DESTDIR)$(PREFIX)/bin/$(PROG)
+	rm -f $(DESTDIR)$(PREFIX)/lib/pkgconfig/shardz.pc
+	rm -f $(DESTDIR)$(PREFIX)/share/man/man1/shardz.1
+
+clean:
+	rm -f $(PROG) $(OBJECTS)
+
+.PHONY: all install uninstall clean
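Review note on the Makefile: the install target prefixes every destination with `$(DESTDIR)$(PREFIX)`, which is what lets the packaging files in this change stage the install into a scratch root instead of the live system. A minimal sketch of how the two variables compose, assuming a temporary directory as the scratch root and a placeholder script standing in for the built binary (both are illustrative and not part of these commits):

```shell
#!/usr/bin/env bash
# Illustrative only: repeat the first two steps of the install target with a
# placeholder program, so nothing is written outside temporary directories.
DESTDIR=$(mktemp -d)      # scratch root, playing the role of "$pkgdir"
PREFIX=/usr               # packagers override the /usr/local default
PROG=shardz-placeholder   # hypothetical stand-in for the built binary

workdir=$(mktemp -d)
printf '#!/bin/sh\necho placeholder\n' > "$workdir/$PROG"

# Same install commands as the Makefile, with the variables expanded by the shell.
install -d "$DESTDIR$PREFIX/bin"
install -m 755 "$workdir/$PROG" "$DESTDIR$PREFIX/bin/$PROG"

# The staged tree mirrors the final layout, rooted at $DESTDIR.
find "$DESTDIR" -type f
```

The staged file ends up at `$DESTDIR/usr/bin/shardz-placeholder`, the same shape of path that a package build would later archive.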
41
README.md
@@ -11,11 +11,25 @@ Shardz is a lightweight C utility that shards *(splits)* the output of any proce
 - Load balancing input streams
 - Splitting any line-based input for distributed processing
 
-## Building
+## Building & Installation
+
+### Quick Build
 ```bash
 gcc -o shardz shardz.c
 ```
 
+### Using Make
+```bash
+# Build only
+make
+
+# Build and install system-wide (requires root/sudo)
+sudo make install
+
+# To uninstall
+sudo make uninstall
+```
+
 ## Usage
 ```bash
 some_command | shardz INDEX/TOTAL
@@ -26,19 +40,21 @@ Where:
 - `TOTAL` is the total number of shards
 
 ### Examples
+Let's say you have a very large list of domains and you want to do recon on each domain. Using a single machine, this could take a very long time. However, you can split the workload across multiple machines:
+
 - Machine number 1 would run:
 ```bash
-curl https://example.com/large_file.txt | shardz 1/3
+curl https://example.com/datasets/large_domain_list.txt | shardz 1/3 | httpx -title -ip -tech-detect -json -o shard-1.json
 ```
 
 - Machine number 2 would run:
 ```bash
-curl https://example.com/large_file.txt | shardz 2/3
+curl https://example.com/datasets/large_domain_list.txt | shardz 2/3 | httpx -title -ip -tech-detect -json -o shard-2.json
 ```
 
 - Machine number 3 would run:
 ```bash
-curl https://example.com/large_file.txt | shardz 3/3
+curl https://example.com/datasets/large_domain_list.txt | shardz 3/3 | httpx -title -ip -tech-detect -json -o shard-3.json
 ```
 
 ## How It Works
@@ -50,6 +66,23 @@ Shardz uses a modulo operation to determine which lines should be processed by e
 
 This ensures an even distribution of the workload across all shards.
 
+## Simplicity
+
+For what it's worth, the same functionality can be achieved with a bash function in your `.bashrc`:
+```bash
+shardz() {
+    awk -v n="${1%/*}" -v t="${1#*/}" 'NR % t == n % t'
+}
+```
+
+```bash
+cat domains.txt | shardz 1/3 | httpx -title -ip -tech-detect -json -o shard-1.json
+cat domains.txt | shardz 2/3 | httpx -title -ip -tech-detect -json -o shard-2.json
+cat domains.txt | shardz 3/3 | httpx -title -ip -tech-detect -json -o shard-3.json
+```
+
+This was just a fun little project to brush up on my C, and to explore what it takes to get a package added to Linux package manager repositories.
+
 ---
 
 ###### Mirrors: [acid.vegas](https://git.acid.vegas/shardz) • [SuperNETs](https://git.supernets.org/acidvegas/shardz) • [GitHub](https://github.com/acidvegas/shardz) • [GitLab](https://gitlab.com/acidvegas/shardz) • [Codeberg](https://codeberg.org/acidvegas/shardz)
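Review note on the README: the claim of an even split across shards can be checked with a small experiment. The sketch below is not part of these commits. `shard` is a hypothetical awk stand-in for the `shardz` binary, and the rule it uses (line k, counted from 1, goes to shard ((k - 1) mod TOTAL) + 1) is an assumption, since `shardz.c` does not appear in this diff.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the shardz binary, so the sketch runs without it.
# Assumed rule: line k (1-based) goes to shard ((k - 1) % TOTAL) + 1.
shard() {
    awk -v n="${1%/*}" -v t="${1#*/}" '(NR - 1) % t + 1 == n'
}

total=3
counts=()
for index in 1 2 3; do
    # Ten input lines, split three ways.
    count=$(seq 1 10 | shard "$index/$total" | wc -l)
    counts+=("$((count))")   # arithmetic expansion strips any padding from wc
    echo "shard $index/$total: $((count)) lines"
done
```

With 10 input lines and 3 shards the counts come out as 4, 3 and 3. Round-robin assignment of this kind keeps shard sizes within one line of each other, and every input line lands in exactly one shard.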
35
man/shardz.1
Normal file
@@ -0,0 +1,35 @@
+.TH SHARDZ 1 "2025" "shardz 1.0.1" "User Commands"
+.SH NAME
+shardz \- shard the output of any process for distributed processing
+.SH SYNOPSIS
+.B some_command
+.RI "|"
+.B shardz
+.I INDEX/TOTAL
+.SH DESCRIPTION
+.B shardz
+is a lightweight utility that shards (splits) the output of any process for distributed processing.
+It allows you to easily distribute workloads across multiple processes or machines by splitting
+input streams into evenly distributed chunks.
+.SH OPTIONS
+.TP
+.I INDEX/TOTAL
+INDEX is the shard number (starting from 1), and TOTAL is the total number of shards.
+.SH EXAMPLES
+Machine number 1 would run:
+.PP
+.nf
+curl https://example.com/large_file.txt | shardz 1/3
+.fi
+.PP
+Machine number 2 would run:
+.PP
+.nf
+curl https://example.com/large_file.txt | shardz 2/3
+.fi
+.SH AUTHOR
+Written by acidvegas <acid.vegas@acid.vegas>
+.SH COPYRIGHT
+Copyright \(co 2025 acidvegas
+.br
+Licensed under the ISC License.
23
pkg/arch/PKGBUILD
Normal file
@@ -0,0 +1,23 @@
+# Maintainer: acidvegas <acid.vegas@acid.vegas>
+
+pkgname=shardz
+pkgver=1.0.0
+pkgrel=1
+pkgdesc="Utility that shards the output of any process for distributed processing"
+arch=('x86_64' 'i686' 'aarch64' 'armv7h')
+url="https://github.com/acidvegas/shardz"
+license=('ISC')
+depends=('glibc')
+source=("$pkgname-$pkgver.tar.gz::$url/archive/v$pkgver.tar.gz")
+sha256sums=('SKIP')
+
+build() {
+    cd "$pkgname-$pkgver"
+    make
+}
+
+package() {
+    cd "$pkgname-$pkgver"
+    make DESTDIR="$pkgdir" PREFIX=/usr install
+    install -Dm644 LICENSE "$pkgdir/usr/share/licenses/$pkgname/LICENSE"
+}
16
pkg/debian/control
Normal file
@@ -0,0 +1,16 @@
+Source: shardz
+Section: utils
+Priority: optional
+Maintainer: acidvegas <acid.vegas@acid.vegas>
+Build-Depends: debhelper-compat (= 13), gcc, make
+Standards-Version: 4.5.1
+Homepage: https://github.com/acidvegas/shardz
+
+Package: shardz
+Architecture: any
+Depends: ${shlibs:Depends}, ${misc:Depends}
+Description: Utility that shards process output for distributed processing
+ Shardz is a lightweight C utility that shards (splits) the output of any
+ process for distributed processing. It allows you to easily distribute
+ workloads across multiple processes or machines by splitting input streams
+ into evenly distributed chunks.
7
pkg/debian/rules
Normal file
@@ -0,0 +1,7 @@
+#!/usr/bin/make -f
+
+%:
+	dh $@
+
+override_dh_auto_install:
+	dh_auto_install -- PREFIX=/usr
35
pkg/rpm/shardz.spec
Normal file
@@ -0,0 +1,35 @@
+Name: shardz
+Version: 1.0.1
+Release: 1%{?dist}
+Summary: Utility that shards the output of any process for distributed processing
+
+License: ISC
+URL: https://github.com/acidvegas/shardz
+Source0: %{url}/archive/v%{version}/%{name}-%{version}.tar.gz
+
+BuildRequires: gcc
+BuildRequires: make
+
+%description
+Shardz is a lightweight C utility that shards (splits) the output of any process
+for distributed processing. It allows you to easily distribute workloads across
+multiple processes or machines by splitting input streams into evenly distributed chunks.
+
+%prep
+%autosetup
+
+%build
+%make_build
+
+%install
+%make_install PREFIX=%{_prefix}
+
+%files
+%license LICENSE
+%{_bindir}/shardz
+%{_mandir}/man1/shardz.1*
+%{_prefix}/lib/pkgconfig/shardz.pc
+
+%changelog
+* Sat Dec 07 2024 acidvegas <acid.vegas@acid.vegas> - 1.0.1
+- Initial package