6/13/2026 at 12:02:06 AM
DESIGN.md:
> Each rule below is enforced mechanically by the skill, not left to vibes.
> R1. Repo docs are the memory; not in HANDOFF.md = didn't happen
SKILL.md:
> Not in docs/HANDOFF.md = didn't happen. Refuse to judge results that exist only in conversation or builder chat output.
"Mechanical enforcement" just means "prompting the LLM a bit extra" these days? It (still) amazes me how much effort and how many tokens we expend on what could and should be a two-line script...
by Denvercoder9
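For illustration, the "two-line script" version of rule R1 might look something like the sketch below. Nothing here comes from the project itself: the function name `recorded_in_handoff` and the plain substring match are assumptions, and only the `docs/HANDOFF.md` path is taken from the quoted SKILL.md.

```python
from pathlib import Path


def recorded_in_handoff(claim: str, handoff_path: str = "docs/HANDOFF.md") -> bool:
    """Return True only if the claim text literally appears in the handoff file.

    Hypothetical helper: substring matching is an assumption, a real check
    might look for a result ID or a structured entry instead.
    """
    path = Path(handoff_path)
    # A missing file means nothing was recorded, so every claim is rejected.
    return path.is_file() and claim in path.read_text(encoding="utf-8")
```

A harness could call this before handing any result to a judge and refuse to continue when it returns False, which takes the decision out of the prompt entirely.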
6/13/2026 at 1:05:06 AM
Agents are in a wacky state, which makes projects like this fall into a weird spot. E.g. I vaguely expect my agent to do two disparate things: manage dependency injection for tools, prompt modifications, etc., but also be the sort of “brain trust” that controls the flow of execution (can we stop now, do we keep going, etc.). This project is meant to be the latter, but there’s not a clean way to integrate that into Claude Code or Codex because they expect to do both.
Pi can do it, but then your users can’t use their Claude subscriptions, so you have to kludgily try to do the same thing via LLM prompts.
by everforward
6/13/2026 at 5:19:29 AM
But why does your agent control doneness? It seems to me the most odd part to delegate. All LLMs are terrible at it. Most LLM tasks can be expressed as a DAG or DAG of DAGs. Why delegate that to a random point in context instead of enforcing the flow?
by nostrebored
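A minimal sketch of what "enforcing the flow" could mean, assuming the task really is a DAG: the harness, not the model, walks the graph in dependency order and decides when the job is done. The names `run_pipeline`, `deps`, and `steps` are hypothetical, and each step is a plain callable standing in for an LLM call or a script. `graphlib.TopologicalSorter` is in the Python standard library from 3.9.

```python
from graphlib import TopologicalSorter


def run_pipeline(deps, steps):
    """Run every step exactly once, in dependency order.

    deps  maps a step name to the set of step names it depends on.
    steps maps a step name to a callable that receives the results so far.
    Both shapes are assumptions made for this sketch.
    """
    # static_order() yields each node only after all of its predecessors.
    order = list(TopologicalSorter(deps).static_order())
    results = {}
    for name in order:
        results[name] = steps[name](results)
    # "Done" is simply the end of the graph, not something the model decides.
    return order, results


deps = {"plan": set(), "implement": {"plan"}, "test": {"implement"}}
steps = {
    "plan": lambda r: "spec",
    "implement": lambda r: f"code for {r['plan']}",
    "test": lambda r: f"checked {r['implement']}",
}
order, results = run_pipeline(deps, steps)
```

`TopologicalSorter` raises `CycleError` if the graph has a cycle, so a loop has to be handled inside a single node rather than as an edge in the graph.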
6/13/2026 at 10:20:57 PM
Most LLM tasks can be expressed as a DAG, but the odds of it succeeding go way, way up if you drop the acyclic requirement (e.g. a “run tests, if they fail, fix it and loop back to running the tests” stage). And it gets delegated to context because it’s easier to have another session and tell it to double-check and critique the first LLM than it is to write a deterministic test for every prompt. Like if I want a new form that sends a REST request on submit, I can have two LLMs duking it out in 5 minutes. If I have to write Selenium tests then I might as well just write the feature. Or I can have an LLM write the tests, but that’s more or less the same as letting a second LLM judge the first.
by everforward
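The "run tests, fix, loop back" stage can be sketched as a bounded loop, so the cycle lives in ordinary control flow rather than in the model's context. This is only an illustration under assumptions: `fix_until_green`, `run_tests`, and `fix` are hypothetical names, and `run_tests` could be a deterministic test suite or a second LLM acting as judge.

```python
def fix_until_green(run_tests, fix, max_rounds=3):
    """Alternate testing and fixing until tests pass or the budget runs out.

    run_tests() returns (passed, report); fix(report) attempts a repair.
    Returns (passed, rounds_used).
    """
    for round_no in range(1, max_rounds + 1):
        passed, report = run_tests()
        if passed:
            return True, round_no
        # Skip the fix on the last round: nothing would re-check it.
        if round_no < max_rounds:
            fix(report)
    return False, max_rounds


# Toy stand-ins: two "bugs", and each fix call removes one.
state = {"bugs": 2}


def toy_tests():
    return state["bugs"] == 0, f"{state['bugs']} failing"


def toy_fix(report):
    state["bugs"] -= 1


outcome = fix_until_green(toy_tests, toy_fix, max_rounds=3)
```

The `max_rounds` cap is the part a prompt cannot guarantee: whatever the judge says, the loop ends after a fixed number of rounds.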
6/13/2026 at 12:18:52 AM
[dead]
by rbren
6/13/2026 at 12:50:17 AM
[dead]
by uvbfibsuvkdh