3/26/2026 at 9:28:36 AM
I’ve come to the realization that these kinds of systems don’t work, and that a human in the loop is crucial for task planning; the LLM’s role being to identify issues, communicate the design / architecture, etc. before it’s handed off, otherwise the LLM always ends up doing not entirely the correct thing. How is this part tackled when all that you have is GH issues? Doesn’t this work only for the most trivial issues?
by stingraycharles
3/26/2026 at 6:02:56 PM
I've come to the opposite conclusion: the big limitation of systems like this is starting and ending with human involvement at the same level, instead of directing at a higher level. You end up quibbling over detail the agents can handle themselves with sufficient guardrails and process, instead of setting higher-level requirements, reviewing higher-level decisions and outcomes, and dealing with exceptions. You can afford a lot of extra guardrails and process to ensure sufficient quality when the result is a system that gets improved autonomously 24/7.
I'm on my way home from a client, and meanwhile another project has spent the last 10 hours improving with no involvement from me. I spent a few minutes reviewing things this morning, after it's spent the whole night improving unattended.
by vidarh
3/26/2026 at 6:35:18 PM
I find that that doesn’t work in the long run. Software agents are not yet capable of maintaining a decently active repository for extended periods of time. I am all for delegating everything to AI agents, but it just becomes a mess over time if you don’t steer things often enough.
by stingraycharles
3/26/2026 at 7:40:42 PM
Not my experience at all. If anything, they make it cheap enough to deal with tech debt that it is far easier to justify being strict.
by vidarh
3/26/2026 at 10:25:46 AM
Had the same realization, which inspired eforge (shameless plug): https://github.com/eforge-build/eforge - planning stays in the developer’s control, with all engineering (agent orchestration) handed off to eforge. This has been working well for a solo or siloed developer (me) who is free to plan independently. It allows the developer to confidently stay in the planning plane while eforge handles the rest, using a methodology that in my experience works well. Of course, garbage in, garbage out - thorough human planning (AI assisted, not autonomous) is key.
by mshark
3/26/2026 at 10:35:20 AM
To me that doesn't do enough yet in terms of up-front planning and visualization, but it's a step in the right direction. I prefer Traycer myself.
by stingraycharles
3/26/2026 at 12:09:30 PM
Hadn’t seen Traycer, that looks really polished. An important difference is that eforge is open source (Apache 2.0). I purposefully left out planning features from eforge because I don’t want the same tool that builds my code to force me into a planning methodology. Our role as developers has shifted heavily into planning (offloading implementation), and I’m still getting comfortable with that and want to be free to explore the planning space. Maybe I’ll change my mind after my planning opinions evolve.
by mshark
3/26/2026 at 3:22:38 PM
Maybe - I do think as the models get better they'll be able to handle more and more difficult tasks. And yet, even if they can only solve the simplest issues now, why not let them so you can focus on the more important things?
by jawiggins