xstdin

xstdin is a CLI tool that distributes stdin across a set of long-running workers. By default, it sends input in chunks of ~8KiB (always split on line boundaries) to 4 workers. With the -l flag, input is distributed line by line in strict round-robin order, which comes at a huge performance cost (see Benchmarks).
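The two distribution modes described above can be sketched in Python. This is only an illustration of the idea, not the tool's actual implementation; the function names `chunks_on_line_boundaries` and `distribute` are hypothetical, and the handling of lines longer than the buffer is an assumption.

```python
from typing import Iterable, Iterator, List


def chunks_on_line_boundaries(blocks: Iterable[bytes], size: int = 8192) -> Iterator[bytes]:
    """Yield chunks of roughly `size` bytes that always end on a newline."""
    pending = b""
    for block in blocks:
        pending += block
        while len(pending) >= size:
            # Cut at the last newline within the first `size` bytes.
            cut = pending.rfind(b"\n", 0, size)
            if cut == -1:
                # Assumption: a line longer than `size` is sent whole.
                cut = pending.find(b"\n", size)
                if cut == -1:
                    break  # line is not complete yet, wait for more input
            yield pending[: cut + 1]
            pending = pending[cut + 1 :]
    if pending:
        yield pending  # trailing data, possibly without a final newline


def distribute(pieces: Iterable[bytes], workers: int = 4) -> List[List[bytes]]:
    """Assign pieces (chunks or single lines) to workers in round-robin order."""
    assigned: List[List[bytes]] = [[] for _ in range(workers)]
    for i, piece in enumerate(pieces):
        assigned[i % workers].append(piece)
    return assigned
```

Default mode would correspond to `distribute(chunks_on_line_boundaries(blocks))`, while the -l mode would correspond to passing individual lines, for example `distribute(data.splitlines(keepends=True), workers=2)`.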

Like xargs, but for stdin. Like parallel, but keeps a set of workers running. Imagined by paddor. Developed with help from GPT-4.

Installation

cargo install --path .

Usage

Usage: xstdin [-n NUM] [-b SIZE] [-l] <command> [<arg1> <arg2> ...]

Options:
    -n, --workers NUM   set number of workers (default is 4)
    -b, --buffer-size SIZE
                        set buffer capacity (default is 8KiB)
    -l, --line-mode     strictly distribute input by line (default
                        buffer-size)
    -h, --help          print this help menu

Examples

seq 1 10 | xstdin -n 2 cat
1
3
5
7
9
2
4
6
8
10

seq 1 10 | xstdin -l -n 2 -- ruby -e 'STDIN.each_line { |line| puts "#$$: #{line}" }'
23026: 2
23014: 1
23026: 4
23014: 3
23026: 6
23014: 5
23026: 8
23014: 7
23026: 10
23014: 9

Benchmarks

MacBook Air, M2, 2023:

# yes baseline
yes | pv --rate | cat > /dev/null
[3.79GiB/s]

# strict round robin
yes | pv --rate | xstdin -l cat > /dev/null
[1.55MiB/s]

# chunked round robin
yes | pv --rate | xstdin cat > /dev/null
[2.64GiB/s]

# big chunks round robin
yes | pv --rate | xstdin -b 32000 cat > /dev/null
[3.29GiB/s]

# large input
du -sh input_large.txt
9.7G	input_large.txt

time pv --rate input_large.txt | xstdin -- cat > /dev/null
[2.57GiB/s]
[2.24GiB/s]
pv --rate input_large.txt  0.11s user 2.10s system 50% cpu 4.343 total
xstdin -- cat > /dev/null  1.02s user 6.54s system 174% cpu 4.343 total

wc -l input_large.txt 
 5219249490 input_large.txt

time pv --rate input_large.txt | xstdin -- wc -l | awk '{s+=$1} END {print s}'
[1.67GiB/s]
5219249490
pv --rate input_large.txt  0.11s user 2.46s system 44% cpu 5.828 total
xstdin -- wc -l  11.05s user 4.86s system 272% cpu 5.827 total
awk '{s+=$1} END {print s}'  0.00s user 0.00s system 0% cpu 5.827 total
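
The last benchmark works because each worker's `wc -l` prints a partial count for its share of the input, and `awk` sums those partial counts. A minimal Python sketch of that idea, using an in-memory split in place of real workers (the `partial_counts` helper is hypothetical and not part of xstdin):

```python
def partial_counts(chunks):
    """Count newlines per chunk, as each worker's `wc -l` would for its share."""
    return [chunk.count(b"\n") for chunk in chunks]


data = b"a\nb\nc\nd\ne\n"
chunks = [data[:4], data[4:]]        # two shares, split on a line boundary
total = sum(partial_counts(chunks))  # the awk step: s += $1
print(total)                         # prints 5
```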