We are experimenting with this kind of development style and from my experience so far this shift a lot of the complexity of building into the story writing and manual testing phases.As I will need to fully handover the task and let the agent(s) essentially one-shot the implementation I need to be way for specific and clear in giving it context and goals, otherwise I’m afraid it will start build code purely by accumulation creating a pile of unmanageable garbage.
Also changes which requires new UI components tend to require more manual adjustments and detailed testing on the UX and general level of polishing of the experience our users expect at this stage.
I’m starting to develop a feeling of tasks that can be done this way and I think those more or less represent 20 to 30% of the tasks in a normal sprint. The other 70% will have diminishing returns if not actually a negative return as I will need to familiarise with the code before being able to instruct AI to improve/fix it.
From your experience building this, what’s your take on:
1. How do your product helps in reducing the project management/requirements gathering for each individual tasks to be completed with a sufficient level of accuracy?
2. Your strong point seems to be in parallelisation, but considering my previous analysis I don’t see how this is a real pain for a small teams. Is this intended to be more of a tool for scale up with a stable product mostly in maintenance/enhancement mode?
3. Are you imagining a way for this tool to implement some kind of automated way of actually e2e test the code of each task?
2/25/2026
at
2:16:46 AM
Thanks! What tools have you been experimenting with?Agreed. That this evolution pushes much of the work into describing desired outcomes and giving sufficient context.
To your questions:
Emdash helps reduce the setup cost of each environment by allowing you to open an isolated git worktree, copying over env variables and other desired context. And then preserving your conversation per task. That said, you still need to write clear goals and point it in the right direction.
I think it's less about team scale and more about individual throughput. My working mode is that I'm actively working on one or two tasks, switching between them as one runs. Then I have a long list of open tasks in the sidebar that are more explorative, quick reviews, issue creation etc. So for me it's not about one-shotting tasks, but instead about navigating between them easily as they're progressing
Automated e2e testing is tricky, particularly for rendering. I think roborev (https://github.com/roborev-dev/roborev) is moving in the right direction. Generating bug reports synchronously per commit and not just once you create a PR. I also think what https://cursor.com shipped today with computer-use agents testing interfaces is very interesting.
by onecommit