6/30/2026 at 9:42:52 PM
> It might seem odd to prefer shell scripting over a full-featured dynamic scripting language, but shell scripts like this have some material advantages over Python:
And thus 99% of bioinformatics pipelines are shell at their heart... You need 10 packages, written in 4 different programming languages, and the common interfaces are files and pipes.
And for that matter, this could use a named pipe rather than a file (assuming `odgi depth` only uses streaming access):
odgi depth -i chr8.pan.og -r chm13#chr8 | \
bedtools makewindows -b /dev/stdin -w 5000 > chm13.chr8.w5kbps.bed
odgi depth -i chr8.pan.og -b chm13.chr8.w5kbps.bed --threads 2 | \
bedtools sort > chr8.pan.depth.w5kbps.bed
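For anyone who hasn't used a named pipe, here is a rough sketch of the pattern. To keep it runnable without odgi or bedtools installed, `make_windows` and `use_windows` are hypothetical stand-ins for the two pipeline stages above, and the FIFO path is made up for illustration. Whether the real `odgi depth -b` is happy reading from a FIFO is the same streaming-access assumption as before.

```shell
#!/bin/sh
set -eu

# Hypothetical stand-ins (not real odgi/bedtools calls):
#   make_windows ~ odgi depth -i chr8.pan.og -r chm13#chr8 | bedtools makewindows ...
#   use_windows  ~ odgi depth -i chr8.pan.og -b FILE --threads 2 | bedtools sort
make_windows() { printf 'chr8\t5000\t10000\nchr8\t0\t5000\n'; }
use_windows()  { sort -k1,1 -k2,2n "$1"; }

# Create the FIFO in a private temp directory and clean it up on exit.
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
fifo="$tmpdir/windows.bed"
mkfifo "$fifo"

# Opening a FIFO for writing blocks until a reader opens it,
# so the producer has to run in the background.
make_windows > "$fifo" &
writer=$!

# The consumer sees the FIFO as an ordinary file path.
result=$(use_windows "$fifo")
wait "$writer"

printf '%s\n' "$result"
```

The window data never touches disk, but you take on the bookkeeping yourself: create the FIFO, background the writer, and remove the FIFO afterwards.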
And Bash process substitution allows writing it all without an explicitly named pipe, though it may look a bit ugly:
odgi depth -i chr8.pan.og --threads 2 -b <( \
odgi depth -i chr8.pan.og -r chm13#chr8 | \
bedtools makewindows -b /dev/stdin -w 5000
) | \
bedtools sort > chr8.pan.depth.w5kbps.bed
Which is why bioinformaticians get bad reputations with software engineers. (I still have a fair amount of misplaced pride for adding a shebang to a Makefile once to make a pipeline into a command several decades ago...)
by epistasis
6/30/2026 at 10:11:39 PM
The "bash is 235x faster than Hadoop" article pops up on HN every so often, and this is another great time to link it here: https://adamdrake.com/command-line-tools-can-be-235x-faster-...
by alexpotato
6/30/2026 at 10:04:07 PM
I added a shebang to a readme once (written in literate style) so the poor engineers on the other side wouldn't have to deal with the multi-step monstrosity within.
by AlotOfReading