4/17/2026 at 2:29:51 AM
I read this article a week or so ago and immediately implemented a VS Code extension that I've always wanted: a static analysis tool for targets pipelines. targets is an R package that provides Make-like pipelines for data science and analysis work. You write your pipeline as a DAG, and targets orchestrates the analysis, re-running downstream nodes only when upstream ones are invalidated and their output changes. Fantastic tool, but at a certain level of complexity the DAG becomes a bit hard to navigate and reason about ("wait, what targets are downstream of this one again?"). This isn't really a targets problem, since it happens with any analysis of decent complexity, but the structure targets adds to the analysis actually allows for a decent amount of static analysis of the environment and code. Enter tree-sitter. I wrote a VS Code extension that analyzes the pipeline and provides useful hover information (size, time last invalidated, computation time for that target, and children/parent info), as well as links to quickly jump to different targets and their children/parents. I've dogfooded the hell out of it, and it's already vastly improved my targets workflow within a week. Things like better error hints in the IDE for targets-specific malformed inputs and showing which targets are emitting errors take a lot of the friction out of an analysis.
All that to say: nice work on extending tree-sitter to R!
tarborist: targets + tree-sitter https://open-vsx.org/extension/tylermorganwall/tarborist
by tylermw
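For readers unfamiliar with targets: a pipeline is declared as a list of `tar_target()` calls in a `_targets.R` file, and the package infers the DAG from which targets reference which. A minimal sketch (the file path and column names here are hypothetical, not from the extension or the thread):

```r
# _targets.R -- minimal sketch of a targets pipeline.
# Each tar_target() is a node; dependencies are inferred from the
# variable names used inside each command.
library(targets)

list(
  tar_target(raw_data, read.csv("data/tweets.csv")),  # upstream node
  tar_target(cleaned,  na.omit(raw_data)),            # depends on raw_data
  tar_target(stats,    summary(cleaned))              # depends on cleaned
)
```

Running `tar_make()` executes the pipeline and caches each target's output; `tar_visnetwork()` renders the DAG, which is the structure the extension described above can analyze statically.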
4/17/2026 at 6:53:14 AM
I only dabble in data analysis. I scratch the surface of what R can do, and my most complicated analysis fits in 100 or so lines of code I manage manually rather than with the help of tools like targets. What sort of work do you do where you get to play around with fun tools like that?
by kqr
4/17/2026 at 11:30:51 AM
It's not necessarily the number of lines that motivates these tools. Say you're running an NLP pipeline where you want to do sentiment analysis on a large text corpus (tweets, for example) and then relate sentiment over time to some other variables. Each of those steps might only be a dozen lines of code, but the sentiment analysis might take a non-negligible amount of time. If you can avoid rerunning it when only the later analysis has changed, that can save you considerable time while iterating on the second step of the analysis. The old-fashioned way to do this in R is to use the REPL and only rerun the lines of the script that have changed, with the earlier part staying in the environment. But it's easy to make mistakes doing it manually that way; having the computer track what has changed and needs to be rerun is much less error-prone.
by CrazyStat
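The two-step workflow described above maps directly onto a targets pipeline. A sketch, where `score_sentiment()` and `fit_model()` are hypothetical stand-ins for the real analysis functions:

```r
# _targets.R -- sketch of the slow-then-cheap pipeline described above.
library(targets)

list(
  tar_target(corpus,    readRDS("tweets.rds")),     # raw text corpus
  tar_target(sentiment, score_sentiment(corpus)),   # slow step, cached on disk
  tar_target(model,     fit_model(sentiment))       # cheap step, iterated often
)
```

Editing only `fit_model()` invalidates just the `model` target, so a subsequent `tar_make()` skips `corpus` and `sentiment` and reruns only the final step; `tar_outdated()` reports which targets would rerun before you commit to it.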
4/17/2026 at 1:46:21 PM
Yes, the main benefit is caching and reproducibility: with targets (or any other DAG-based approach), you only recompute what needs to be recomputed, and you are assured that no stale inputs or temporary analysis artifacts end up in the final product. That matters especially if you don't own the underlying data sources and they can change at any point.
by tylermw
4/17/2026 at 8:55:53 AM
Long time lurker on HN but this totally deserves my first (edit: second) ever post. Looks amazing, thank you!
by adamalt
4/18/2026 at 1:55:13 AM
Honored!
by tylermw
4/17/2026 at 11:45:47 AM
It has been a lot of fun watching you iterate on this via bluesky updates!
by davisvaughan
4/17/2026 at 1:40:07 PM
Thanks for all the work you (and the rest of the contributors) have done putting this together! I think bringing tree-sitter to R has already shown massive benefits: Air alone has been a big improvement to my workflow.
by tylermw
4/17/2026 at 4:06:25 PM
What is the advantage of targets over nextflow or snakemake?
by kjkjadksj