alt.hn

3/19/2026 at 2:20:34 AM

Cook: A simple CLI for orchestrating Claude Code

https://rjcorwin.github.io/cook/

by staticvar

3/19/2026 at 4:32:22 AM

I did a Show HN[0] a few days back with my CLI agent called cook[1] and for a moment I was ecstatic my tool made it to the front page. haha.

[0]: https://news.ycombinator.com/item?id=47262711

[1]: https://getcook.dev

by vadepaysa

3/19/2026 at 9:29:36 AM

Oh haha, small world! Maybe I should add cook (agent) support for cook (orchestration) and then we'll have cook manage subagents via cook!

by staticvar

3/19/2026 at 5:44:22 AM

[dead]

by catlifeonmars

3/19/2026 at 8:25:07 AM

I dunno. I just let Claude build a Python script that calls Claude Code through subprocess.run().

I recently made a sort of Autoresearch with that approach. The script calls Claude Code to create a hypothesis, then write code based on that, then evaluate; rinse and repeat. I am still trying to figure out if I am actually on to something or just burning tokens. Jury is still out.

by maCDzP
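
A minimal sketch of the hypothesis, code, evaluate loop described in the comment above. This is not the commenter's script: the function names, prompt wording, and history structure are assumptions made for illustration. The only external piece is the `claude` CLI with `-p` (the headless mode mentioned elsewhere in this thread), and it is assumed to be on PATH.

```python
import subprocess
from typing import Callable


def run_claude(prompt: str) -> str:
    """Run one headless Claude Code call and return its stdout.

    Assumes the `claude` CLI is installed and on PATH.
    """
    result = subprocess.run(
        ["claude", "-p", prompt],
        capture_output=True,
        text=True,
        check=True,  # raise if the CLI exits with a non-zero status
    )
    return result.stdout.strip()


def research_loop(
    goal: str,
    rounds: int,
    ask: Callable[[str], str] = run_claude,
) -> list[dict]:
    """Repeat hypothesis -> code -> evaluation, feeding evaluations forward."""
    history: list[dict] = []
    for _ in range(rounds):
        # Earlier evaluations are passed back in so each round can build on them.
        notes = "\n".join(h["evaluation"] for h in history) or "none yet"
        hypothesis = ask(
            f"Goal: {goal}\nPrevious evaluations:\n{notes}\n"
            "Propose one new hypothesis."
        )
        code = ask(f"Write code that tests this hypothesis:\n{hypothesis}")
        evaluation = ask(
            "Evaluate this code against the hypothesis.\n"
            f"Hypothesis:\n{hypothesis}\nCode:\n{code}"
        )
        history.append(
            {"hypothesis": hypothesis, "code": code, "evaluation": evaluation}
        )
    return history
```

Passing `ask` as a parameter keeps the loop testable without spending tokens: swap in a stub function while debugging the control flow, then fall back to `run_claude` for real runs.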

3/19/2026 at 9:40:26 AM

That's totally a valid approach! Especially for a very specific workflow you are looking for. For the cases I cover in cook, I had done those patterns enough times that I figured it was time to build a tool/skill for Claude so that I didn't have to explain it as much and also not have to wait for claude to code it up, and possibly interpret me wrong. Now ask claude to "/cook race 3 of foo plan with review, pick the best" and it knows what to do.

by staticvar

3/19/2026 at 10:32:46 AM

I think you're onto something, but I would add that it's sort of like a live REPL that has an integrated agent but with extra steps.

I haven't used python much but I wouldn't be surprised if you can set up a sufficiently powerful REPL with it. I know Julia can do it very well and it's a very similar language. Obviously there are powerful Lisps that do this very well as well.

by dgb23

3/19/2026 at 11:15:51 AM

I went quite far down this approach last year; you're welcome to take what you want from my repo -

https://github.com/riazarbi/way

by scrappyjoe

3/19/2026 at 12:50:07 PM

Hey scrappyjoe, way looks pretty cool. The goal of cook is to be unopinionated, exposing primitives for the shape of workflows as opposed to defining what happens in those workflows. Cook is something that way could use under the hood.

by staticvar

3/19/2026 at 1:41:04 PM

Cool, I'm already digging into your stuff, thanks for posting it.

by scrappyjoe

3/19/2026 at 12:42:23 PM

that's rlm

by danr4

3/19/2026 at 9:18:18 AM

[dead]

by jamiemallers

3/19/2026 at 3:05:42 AM

Can someone explain what this is to my n00b brain. I don't get what claude-cli is missing that this adds in?

by rc_kas

3/19/2026 at 4:16:31 AM

As a prerequisite you’d want to understand the purpose of Ralph Wiggum Loops.

But in general this is meta to the CLI agent.

So if you were to use the CLI to perform a review of some code, this tool would allow you to loop the output of the code review onto itself 5 times.

by sghiassy

3/19/2026 at 7:13:08 AM

> So if you were to use the CLI to perform a review of some code, this tool would allow you to loop the output of the code review onto itself 5 times.

Claude already does that if you ask nicely.

by exolab

3/19/2026 at 10:17:04 AM

To a certain extent, yes it does! For my cases, I'm often running 3 parallel implementations that get 10 to 20 iterations deep, and then Claude has to sort out the pros and cons of the options and also take the best bits of each. Easy to hit the context window with Claude just running those on its own, so giving `/cook` to Claude, it can offload a bit more via cook and stay higher level.

by staticvar

3/19/2026 at 3:37:37 AM

IMO the raw Claude CLI is great for one-off interactive sessions, but as soon as you want repeatable multi-step workflows you’re either copy-pasting prompts forever or hacking your own solution manually. That’s exactly the gap these tools fill.

My take on a solution for this is https://ossature.dev — .smd spec markdown files + ossature audit / build that gives you DAG orchestration, SHA-traced increments, and tiny focused contexts.

by beshrkayali

3/19/2026 at 4:40:05 AM

Isn’t a repeatable, multi-step workflow exactly what a script or Makefile does?

by eloisius

3/19/2026 at 7:17:22 AM

Yeah, bash scripts start clean but the sprawl kicks in quickly as the workflow and project become more complex. Prompts get copied, deps turn manual, and maintenance of your workflow itself becomes the chore.

Ossature swaps that for structured SMDs and optional AMDs. Multiple specs build a clean DAG that drops into an editable plan.toml so everything stays traceable without the mess.

Feel free to check the example projects on https://github.com/ossature/ossature-examples

by beshrkayali

3/19/2026 at 7:52:33 AM

> Yeah, bash scripts start clean but the sprawl kicks in quickly as the workflow and project become more complex.

Then just use Python.

by wiseowise

3/19/2026 at 8:00:08 AM

That’s what Ossature is :)

by beshrkayali

3/19/2026 at 12:49:15 PM

[dead]

by hrmtst93837

3/19/2026 at 5:13:31 AM

I use bash scripts. Both Claude and Vibe support all kinds of arguments if you need a prompt to “become a task”. Bash is also deterministic and easy to read and debug.

by isodev

3/19/2026 at 7:11:06 AM

can you elaborate on "easy to read and debug", because in my experience it is anything but

by Yiin

3/19/2026 at 7:12:15 AM

Compared to a random tool someone vibecoded?

by isodev

3/19/2026 at 10:09:33 AM

what about a random bash script that somebody vibe coded

by petcat

3/19/2026 at 5:24:50 AM

Had a quick look. Stumbled upon the markdown format smd.

Was wondering if using front-matter instead of a "custom" encoding for parseable data was considered?

by je42

3/19/2026 at 9:43:26 PM

Yeah, I did briefly consider front-matter, but ended up with inline @ tags because I thought they kept the entire document feeling like one coherent spec instead of header-data + body. Front matter felt like config to me, but this is 0.0.1 so things might change :)

by beshrkayali

3/19/2026 at 2:15:12 PM

This is for co-ordinating instances of Claude or Codex, not something you do inside each instance.

by niobe

3/19/2026 at 6:08:07 PM

Claude and Codex can also use the cook command to coordinate runs of other agents. This is similar to how you can describe a workflow to them of how to use subagents, and they'll try, but this gives them a reliable deterministic way to run those agents. An added benefit of having Claude/Codex/etc. use cook directly is that they are really good at analyzing the traces of what is happening inside of cook and after the fact.

by rjcorwin

3/19/2026 at 3:19:10 AM

Maybe not adds in, but wraps around. You could accomplish much of this with fairly simple bash scripts.

by transitorykris

3/19/2026 at 3:32:48 AM

You could accomplish all of it with claude -p (headless mode).

by esperent

3/19/2026 at 3:49:45 AM

Admittedly I might be missing a flag or two with claude, but how are multiple loops and comparisons of solutions done with just headless mode?

by transitorykris

3/19/2026 at 12:50:04 PM

You have building blocks like "--resume <sessionId>" and "--fork-session".

For example, one thing you can do is curate the context of an "immutable" conversation and then reuse it as a base context for other prompts.

by hombre_fatal
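
A rough sketch of the "curated base context, reused for other prompts" idea from the comment above. It only builds argument lists and does not run anything. The flag names (`-p`, `--resume`, `--fork-session`) are taken from this thread; their exact behavior is not verified here, and the helper names `claude_argv` and `fan_out` are hypothetical.

```python
from typing import Optional


def claude_argv(
    prompt: str,
    session_id: Optional[str] = None,
    fork: bool = False,
) -> list[str]:
    """Build the argv for one headless call, optionally resuming a session.

    Flag names come from the thread; treat them as assumptions.
    """
    if fork and session_id is None:
        raise ValueError("forking only makes sense with a session to fork from")
    argv = ["claude", "-p", prompt]
    if session_id is not None:
        argv += ["--resume", session_id]
        if fork:
            # Fork so the base conversation is left untouched by this prompt.
            argv.append("--fork-session")
    return argv


def fan_out(base_session_id: str, prompts: list[str]) -> list[list[str]]:
    """One forked call per prompt, all sharing the same base context."""
    return [claude_argv(p, base_session_id, fork=True) for p in prompts]
```

Each list returned by `fan_out` could then be handed to `subprocess.run`, one call per prompt, so the same base session serves as the starting point for every branch.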

3/19/2026 at 8:01:07 AM

It's just a prompt.

by loveparade

3/19/2026 at 4:37:31 AM

Via skills.

by esperent

3/19/2026 at 3:49:19 AM

Indeed.

Where are people finding time for these sorts of projects?

by brcmthrowaway

3/19/2026 at 8:46:50 AM

They bootstrap a workflow with a prompt then build an orchestrator off that then prompt it to be converted to an opencode plugin and then prompt a website to be generated advertising it and then prompt a tool that reviews hacker news feedback and automatically incorporates feedback into next generation of the tool. At the end of the week they go to their manager and complain they are out of tokens for the actual job they are being paid for.

by injidup

3/19/2026 at 10:25:40 AM

Haha, not far off. Only difference is I'm not spending my tokens at work. I use this on a side project video game that I'm developing.

by staticvar

3/19/2026 at 8:50:35 AM

A noob question: is there a tool that automatically instructs Claude Code to "continue" when the token quota is reset after 5h? I am interested in that more than some rather fancy loops.

by smarx007

3/19/2026 at 9:53:16 AM

This is a fantastic idea and I'll add it!

by staticvar

3/19/2026 at 1:20:58 PM

Done! @let-it-cook/cli@5.1.0 is out. This works when running loops, but also when you just run `cook "do something"`, which itself is not a loop, just a call to your agent.

by staticvar

3/19/2026 at 4:00:14 PM

I wrote https://jsr.io/@cdaringe/ralphmania which has a lot of feature overlap. cook looks more polished.

1. How do you handle worktree merge conflicts and/or integration validation issues?

2. Can I work straight from a list of requirements? I think I saw you support it…

3. I have my variant write a minimal explainer for every satisfied spec, aka receipts. It’s pretty great, because I often review the receipts, and if imperfect, mark as NEEDS_REWORK + notes, and it’ll eventually just pick that up on a future iter.

by cdaringe

3/19/2026 at 3:57:39 AM

There is a skill installation option. The skill markdown has 180 lines [1].

My take? I like it. It's concise enough for me to try it out. And I love the webpage.

[1] https://github.com/rjcorwin/cook/blob/main/no-code/SKILL.md

by sbinnee

3/19/2026 at 5:36:13 AM

Given that subagents have different thinking/effort behavior from the main agent and very limited control on that front (I’m not completely sure about this but see https://github.com/anthropics/claude-code/issues/14321 and I’ve also noticed very different behavior when the same prompt is used in the main agent or passed to a subagent), I’m not sure this skill will be the same.

by oefrha

3/19/2026 at 9:48:10 AM

Nice! You found the no-code option that just has the outer agent perform the duties of the workflows that cook describes. It's a bit experimental (the whole thing is really), but it would be nice to get some folks impressions of whether this works well as a pure skill or if y'all find the deterministic nature of the cook script improves reliability.

by staticvar

3/19/2026 at 6:12:20 AM

Looks pretty nice. I think a lot of devs have been making similar tools, I've written my own thing that does a work review loop. I like the interface you've made. I'll probably give it a go, but I'm also reluctant to relinquish the control I have when it's my own code doing orchestration.

by jemmyw

3/19/2026 at 9:58:40 AM

Oh ya, lots of tools out there orchestrating these days, and just writing a script is a valid option. On the control bit, note that if you `cook init` in your project, it generates a COOK.md that lets you template the meta prompt. Claude could probably take a look at how you've been doing it and port it over to COOK.md so it's similar to the prompts you've been using.

by staticvar

3/19/2026 at 2:07:29 PM

The composability here is really elegant — review v3 pick as a pipeline that "just works" is the kind of DX that makes agent orchestration feel tractable rather than overwhelming.

by lazybean

3/19/2026 at 2:39:38 PM

em dash + "x rather than y" sentence construction is setting off my brain's LLM detector here.

by theowaway213456

3/19/2026 at 8:00:01 PM

A lot of people are coming up with these tools to orchestrate agents to do a one-shot implementation in a sandbox (in a server/container/pipeline).

I also have a similar, yet different, approach with a Mother Agent (MoMa) planner-reviewer-implementer multi-agent pattern that orchestrates a feedback loop using Claude memory between agents.

https://news.ycombinator.com/item?id=47437012#47437013

I understood that you have a Judge agent that evaluates independent subagent solo executions and chooses best solution based on ralph algorithm. Did you play with limits on how many solo agents it is sufficient to spawn vs. not getting a better solution? and what is the limit of soloagent solutions that the Judge can compare? (obviously must depend on the complexity and context cost of a solution)

by mizioand

3/19/2026 at 12:59:44 PM

AI agent orchestration is the future. That's where workflow engines shine. I'm doing the same thing using Dagu.sh and I don't use the terminal so much anymore.

by yohamta

3/19/2026 at 2:32:41 PM

Dagu.sh, using yaml files to describe the flow, looks like a nice step up in sophistication from the cook approach that's just trying to make it easy to issue directly from the command line.

My 2 cents on the dagu.sh website, it should lead with the demo section (https://docs.dagu.sh/overview/#demo). That helped me connect what it was and how I might use it.

by staticvar

3/19/2026 at 7:13:24 AM

How heavy on tokens is this? I don't use these style workflows and am fairly new to claude code, so I assume it's better than 3x tokens when doing 3 passes?

by kasperstorgaard

3/19/2026 at 7:20:51 AM

It's not 3x because of 3 runs; it can be more tokens, it can be fewer.

The way to think about it is as telling Claude to tackle the problem 3 times. Each time it may or may not use a different approach, or fix or improve on things it did previously.

by hasperdi

3/19/2026 at 12:58:07 PM

That's right. However, if you use the v3 operator, you get three parallel versions being built and then combined depending on which resolver you use (pick, merge, or compare).

by staticvar
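
To make the "three parallel versions, then a resolver" shape concrete, here is a small sketch. It is not how cook is implemented: `race`, `pick`, and the `build` and `score` callables are hypothetical stand-ins, and a plain scoring function takes the place of whatever agent-driven judgment a real resolver would use.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, TypeVar

T = TypeVar("T")


def race(build: Callable[[int], T], versions: int = 3) -> list[T]:
    """Build `versions` candidates in parallel; results keep index order."""
    with ThreadPoolExecutor(max_workers=versions) as pool:
        # pool.map yields results in input order, not completion order.
        return list(pool.map(build, range(versions)))


def pick(candidates: list[T], score: Callable[[T], float]) -> T:
    """A 'pick'-style resolver: keep the single highest-scoring candidate."""
    return max(candidates, key=score)
```

A "merge" or "compare" style resolver would have the same signature shape as `pick` but return a combined result or a report instead of one winner.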

3/19/2026 at 3:28:24 PM

One thing I wish more CLI tools did: non-interactive mode. I build bash tools that have interactive prompts for first-time users, but everything the prompt asks also has a CLI flag. Makes scripting and CI/CD so much simpler: you can test the exact same code path without mocking stdin.

by bivlked
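
The "every prompt also has a flag" pattern from the comment above can be sketched in a few lines. The tool, its `--name` and `--port` options, and the `resolve_settings` helper are all hypothetical; the comment describes bash tools, and this only shows the same idea in Python with `argparse`.

```python
import argparse
from typing import Callable, Optional, Sequence


def resolve_settings(
    argv: Optional[Sequence[str]] = None,
    ask: Callable[[str], str] = input,
) -> dict:
    """Take each setting from its flag, prompting only for what is missing."""
    parser = argparse.ArgumentParser(description="Hypothetical setup tool")
    parser.add_argument("--name", help="project name")
    parser.add_argument("--port", type=int, help="port to listen on")
    args = parser.parse_args(argv)

    # Interactive prompts are a fallback, so scripts and CI never block on stdin
    # as long as they pass every flag.
    name = args.name if args.name is not None else ask("Project name: ")
    port = args.port if args.port is not None else int(ask("Port: "))
    return {"name": name, "port": port}
```

With all flags supplied, `ask` is never called, so the scripted and interactive paths share the same code after the settings are resolved.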

3/19/2026 at 2:59:21 PM

I really like the idea but my gut says it would be hard to trust. In the last example... "cleanest result" is not a great definition of done (that's only sort of a nitpick).

In general, I feel that removing the decision process (or relegating it to a language model) is not a good idea.

by 4b11b4

3/19/2026 at 3:07:07 PM

Yes, plz don't trust it, always review! The idea is that one prompt in Claude Code got you 80% of the way there, but with some automated review/iterate, it gets you 95% of the way there. It's not worth your time to review the 80% done version when you could be reviewing the 95% done version.

by staticvar

3/19/2026 at 3:19:20 PM

Also on that point about keeping humans in the loop on decisions, I've found following the Research-Plan-Implement process where we humans review at each of those stages, to be really helpful. This doc describes the skill I use with my agents so they keep me looped in: https://gist.github.com/rjcorwin/296885590dc8a4ebc64e70879dc...

Then I use cook to iterate and explore during the AI led parts.

by staticvar

3/20/2026 at 4:57:53 AM

Is the form factor what makes this different amongst the 500 that exist? Like CLI vs UIs?

by itsankur

3/19/2026 at 2:11:22 PM

Very nice. However, I do like to read every agent summary before letting them move on. I'm not sure I'd be able to apply this level of automation to many tasks.

by niobe

3/20/2026 at 8:28:17 AM

Surprised that there's no discussion about the prevalence of using TypeScript for developing these CLI agent harnesses. To me it seems concerning that such CLI programs are so chunky and commonly have to use over 1GB of RAM. Might be boomer-talk, but I am used to CLI apps being extremely lightweight and fast (think Total Commander, which has a UI, but is still very lightweight and responsive)

I understand that part of the reason is because many of these harnesses are vibe-coded, so plenty is lost in terms of optimization. And, well, because LLMs code best in TypeScript

by bean469

3/19/2026 at 8:35:14 AM

Semi-on-topic: Anyone know a way to get a good alternative UI on top of Cursor?

My company’s tracking how much we use the damn thing (its autocomplete is literally less-useful than standard VSCode, only time it’s consistently good is when it sees me do one thing to a line, sees repeated similar lines after that, and suggests I do it on the next one too, one at a time, and that’s only useful to me because I’ve never actually bothered to learn how to properly use a text editor) so I can’t avoid it, but even on codebases in the hundreds of lines it’s OOM killing things on my 16GB laptop (it, plus goddamn Teams, were eating half the memory by themselves the other day… with Cursor sitting at almost 6GB alone. JFC. On the plus side if this is what software from a company that should be full of experts at using these things looks like, guess our jobs are safe from them… though not from recession and ZIRP unwinding)

by genthree

3/19/2026 at 5:54:49 AM

Dull colors and a display font used for copy makes this website incredibly unpleasant to read.

by NetOpWibby

3/19/2026 at 10:02:50 AM

Ah sorry about that. I have weird tastes in design. The README.md is less detailed but covers the basics: https://github.com/rjcorwin/cook/

by staticvar

3/19/2026 at 10:55:45 AM

Two types of people eh, I thought it was quite enjoyable! Reminded me of a SNES game

by hmokiguess

3/19/2026 at 4:35:09 AM

How does this handle when Claude needs user input? To choose an option, grant tool permission, clarify questions…

by khazhoux

3/19/2026 at 10:09:57 AM

On asking for user input during implementation, it's best to use this when you have a plan sufficiently written up that you can point it to. To prep that plan, you can also use cook to iterate on the plan for you. Having Claude Code use `/cook` directly is nice because it watches what the subagents are up to and can speak for them, although Claude can't speak to the subagents running through cook.

On permissions: by default, the instances of Claude that cook runs inherit your Claude's permissions. So if there is no permission to `rm -rf /`, Claude will just get denied and move on. If you use the Docker sandbox option (see bottom of page), it runs inside the sandbox with `--dangerously-skip-permissions` and gets more stuff done (my preferred option). The hard part is that you need to set up the Docker sandbox with any dependencies your project needs. Run `cook init` and edit the `.cook/Dockerfile` to set those up.

by staticvar

3/19/2026 at 11:01:46 AM

Re: So if there is no permission to `rm -rf /`, Claude will just get denied and move on.

Until it doesn't and it finds a way to work around the restriction. Lots of stories around about that.

by trumbitta2

3/19/2026 at 1:02:53 PM

I would be interested in which stories you are thinking of. Stories of Claude breaking out of the restrictions set in its sandbox or stories of people not configuring Claude's sandbox correctly?

by staticvar

3/19/2026 at 8:18:19 AM

If you impl this as a backend and connect to Telegram bots, agents can just do `$ ask "Should I do this?"` for agent→human and `$ alert "this thing blocked me"` for coder→planner. That's what I'm actually doing — I have 1 manager + 3 designers + 1 researcher + 2 debugger + 1 communicator + any number of temporal coders/reviewers in my setup, all connected to taskwarrior for task-driven-dev

by neilbb

3/19/2026 at 8:11:53 PM

That is pretty cool, building the whole dev team of agents. Is it still a star topology, with a Manager agent interacting with all the other subagents?

I usually spawn 1 Mother Agent in a star topology with 3 subagents (Planner, Reviewer, Implementer) and then let them talk using Claude's built-in agent tool. But the best thing, I think, was probably that a "do-nothing" setup wizard is part of the workflow.

https://github.com/mizioandOrg/claude-planner-reviewer-imple...

Did you have success with running stuff in a pipeline and being requested for input in agent->human needed scenarios?

by mizioand

3/20/2026 at 2:42:02 AM

Yeah the pipeline runs effectively and I'm able to be in the loop when the loop needs me.

In my setup there are two planes — manager and worker. On the manager plane, all primary agents form a mesh with p2p communication. Each designer connects to 1 or more workers in a star topology, since workers may have questions or get blocked while executing a plan.

The limitation of the built-in agent tool is it doesn't allow nested subagent spawning. But it's normal for a designer or researcher to need subagents — when a plan is done, I use a plan-review-leader agent to review it. If you try mother → planner → plan-review-leader → plan-vs-reality-validator, the nesting gets deep fast and blocks your manager from doing other work.

I wrote a blog post about this yesterday: https://dev.to/neil_agentic/ttal-more-than-a-harness-enginee...

by neilbb

3/20/2026 at 2:47:55 AM

Using a single plan-reviewer would be slow when there are multiple aspects to review. That's why a local star topology with a plan-review-leader is needed: it spawns multiple reviewers in parallel, each focused on a different aspect.

by neilbb

3/19/2026 at 5:21:58 AM

It seems to be in the spirit of automated vibecoding. I assume it skips all permission checks.

by facorreia

3/19/2026 at 10:11:51 AM

By default it's locked down to the permissions you have granted in your Claude config. If you use the docker sandbox mode, then you can really let it fly as it can issue more commands in a safer environment.

by staticvar

3/19/2026 at 5:10:48 AM

claude> "We want to add a title section that shows what page we are currently on, use cook to manage the development process"

* coolers whirring, gpus on fire, tokens flying, investors happy, developer goes for 6th break of the day

by nurettin

3/19/2026 at 10:04:39 AM

Can we integrate it with Devin too? Seems like it's doable.

by viditraj

3/20/2026 at 9:39:54 AM

[dead]

by odziggy

3/19/2026 at 2:11:02 PM

[dead]

by bhekanik

3/19/2026 at 3:47:18 AM

[dead]

by perfmode

3/19/2026 at 8:55:32 AM

[dead]

by BANRONFANTHE

3/19/2026 at 4:01:18 AM

[flagged]

by eddie-wang

3/19/2026 at 2:05:02 PM

[dead]

by olivercoleai

3/19/2026 at 11:06:55 AM

[dead]

by derodero24

3/19/2026 at 3:21:31 AM

[dead]

by shablulman

3/19/2026 at 12:46:39 PM

[dead]

by uxmaniik52

3/19/2026 at 1:07:54 PM

Good to hear that you're having luck with small models. Note that cook exposes a --model param, also workflow specific model params (--model-work, --model-review, etc) so you can have a smaller model implementing a plan and a larger model reviewing the implementation.

by staticvar

3/19/2026 at 12:17:27 PM

[dead]

by gethwhunter34

3/19/2026 at 8:54:17 AM

[dead]

by erdmozkn62

3/19/2026 at 3:03:12 AM

[dead]

by fortylove

3/19/2026 at 2:50:40 PM

[dead]

by catlover76

3/19/2026 at 5:00:19 AM

[dead]

by NikitaCometa65

3/19/2026 at 4:20:54 AM

[dead]

by panditaditya21

3/19/2026 at 4:10:50 AM

[dead]

by pissedoffadmin

3/19/2026 at 4:20:02 AM

[flagged]

by xiaolu627

3/19/2026 at 3:03:13 AM

[flagged]

by rafaamaral

3/19/2026 at 4:35:56 AM

If this was human written sarcasm, bravo.

by cheriot

3/19/2026 at 3:14:21 AM

just use 200usd plan, I forgot what limits are.

by Yiin

3/19/2026 at 4:33:27 AM

Do you hit the limit pretty quickly on the Pro plan these days? I'm thinking about subscribing for video editing, but I'm still not sure.

by tmatsuzaki

3/19/2026 at 3:24:01 AM

You'll remember it soon

by croes

3/19/2026 at 3:39:18 AM

Do you often hit the limits recently on the $200 plan? I don't even come close

by weird-eye-issue

3/19/2026 at 5:25:20 PM

I didn’t often hit the limits with the plus plan, but it changed.

The same will happen with the $200 plan.

But the $500 plan will fix that.

by croes

3/19/2026 at 3:54:22 AM

I used to, it's much better now. Opus 4.6 has been great on tokens.

by dionian

3/19/2026 at 4:05:24 AM

Yes, quite a while back, they used to charge a lot more for the Opus tokens

by weird-eye-issue

3/19/2026 at 4:37:17 AM

Have not hit limits for 2 months now and I use it a lot. I have 200 max as well.

by anonzzzies