2/28/2026 at 5:52:27 PM
Everything in this post stems from the assumption that you already know what you're doing, which is probably true for things you've built before. But I hope we can agree that you can't spec out something you have no clue how to build, let alone write the tests before you've even explored the boundaries of the problem space. That's completely unreasonable.

My second point is that this approach is fundamentally wrong for AI-first development. If the cost of writing code is approaching zero, there's no point investing resources to perfect a system in one shot. What matters more is how fast you can explore the edges. You can now spin up five agents to implement five different versions of the thing you're building and simply pick the best one.
In our shop, we have hundreds of agents working on various problems at any given time. Most of the code gets discarded. What we accept to merge are the good parts.
by _pdp_
2/28/2026 at 6:49:14 PM
Nothing of what you write here matches my experience with AI.

Specification is worth writing (and spending a lot more time on than implementation) because it's the part that you can still control, fully read, understand etc. Once it gets into the code, reviewing it will be a lot harder, and if you insist on reviewing everything it'll slow things down to your speed.
> If the cost of writing code is approaching zero, there's no point investing resources to perfect a system in one shot.
The AI won't get the perfect system in one shot, far from it! And especially not from sloppy initial requirements that leave a lot of edge (or not-so-edge) cases unaddressed. But if you have a good requirement to start with, you have a chance to correct the AI, keep it on track; you have something to go back to and ask other AI, "is this implementation conforming to the spec or did it miss things?"
> five different versions of the thing you're building and simply pick the best one.
Problem is, what if the best one is still not good enough? Then what? You do 50? They might all be bad. You need a way to iterate to convergence.
by virgilp
3/1/2026 at 7:12:56 AM
Same. I've sort of converged on: make a rough plan, get second and third opinions on it from various AIs, make choices while shaping the plan, and turn that into a detailed spec sheet. Then follow the 'How to Design Programs' method, which is mostly writing documentation first, then expected outcomes, then tests, then the functions, then testing the flow of the pipeline. In practice this looks like starting with Claude to write the documentation and expectations and create the scaffolding, then having Gemini write the tests and the code, then having Codex try to run the pipeline and fix anything it finds broken along the way. I've found this to work fairly well. It's looser than waterfall, but waterfall-ish, and also sort of TDD-ish: it knows there will be failures and things to fix, but it also knows the overall strategy and flow of how things will work before we start.
by michaelbrave
2/28/2026 at 9:59:00 PM
This. Waterfall never worked for a reason. Humans and agents both need to develop a first draft, then re-evaluate with the lessons learned and the structure that has evolved. It's very, very time consuming to plan a complex, working system up front. NASA has done it, for the moon landing. But we don't have those resources, so we plan, build, evaluate, and repeat.
by manmal
2/28/2026 at 10:14:41 PM
That "first draft" still has to start with a spec. Your only real choice is whether the spec is an actual part of project documentation with a human in the loop, or it's improvised on the spot within the AI's hidden thinking tokens. One of these choices is preferable to the other.
by zozbot234
3/1/2026 at 8:39:07 AM
I agree, and personally I often start with a spec. However, I haven't found it useful to make this very detailed. The best ROI I've been getting is from closing the loop as tightly as possible before starting work, through very elaborate test harnesses and invariants that help keep the implementation simple.

I'd rather spend 50% of my time on test setup than 20% on a spec that will not work.
by manmal
2/28/2026 at 10:59:39 PM
So, roll back and try again with the insight. AI makes it cheap to implement complex first drafts and iterations.
I'm building a CRM system for my business; first time it took about 2 weeks to get a working prototype. V4 from scratch took about 5 hours.
by ErrantX
2/28/2026 at 11:10:19 PM
AI is also excellent at reverse engineering specs from existing code, so you can also ask it to reflect simple iterative changes to the code back into the spec, and use that to guide further development. That doesn't have much of an equivalent in the old Waterfall.
by zozbot234
3/1/2026 at 8:42:31 AM
Yeah, if done right. In my experience, such a reimplementation is often lossy if tests don't enforce the presence of all features and nonfunctional requirements. Maybe the primary value of the early versions is building up the test system, allowing an ideal implementation once that's in place.

Or put it this way: we're brute forcing (nicer term: evolutionizing) the codebase toward a better structure. Evolutionary pressure (tests) needs to exist so things move in a better direction.
by manmal
3/1/2026 at 1:17:01 PM
What matters ultimately is whether the system achieves your goals. The clearer you can be about that, the less the implementation detail actually matters.

For example: do you care if the UI has a purple theme or a blue one? Or if it's React or Vue? If you do, that's part of your goals; if not, it doesn't really matter if V1 is blue and React but V4 ends up purple and Vue.
by ErrantX
3/1/2026 at 1:57:50 AM
Are you intentionally being vague here because it's an HN comment and you can't be arsed going into detail? Or do you literally type:
> Look at the git repo that took us 2 weeks, re-do it in another fresh repo... do better this time.
I think you don't and that your response is intentional misdirection to pointlessly argue against the planning artifact approach.
by NamlchakKhandro
3/1/2026 at 7:32:20 AM
> NASA has done it, for the moon landing.

Which one? The one in the 1960s, or the one which has just been delayed, again?
I think you can just as well develop a first spec and iterate on it, rather than coding up a solution; what's important is exploration and iteration, in this specific case.
by Towaway69
3/1/2026 at 8:36:38 AM
Iterating on paper, in my experience, never captures the full complexity that is iteratively created by the new constraints of the code as it's being written.
by manmal
3/1/2026 at 3:14:10 AM
> Waterfall never worked for a reason

We're going to need some evidence for this claim. I feel like nearly 70 years of NASA has something to say about this.
by virgil_disgr4ce
3/1/2026 at 8:47:40 AM
While writing the comment, I did think to myself that NASA did a ton of prototypes to de-risk. They simulated the landing as closely as they possibly could, on Earth. So, probably not pure waterfall either. Maybe my comment was a bit too brusque in that regard.
by manmal
3/1/2026 at 6:58:37 AM
It does say: you will never have the time and resources of NASA.
by blabla1224
3/1/2026 at 2:28:15 PM
"Waterfall" was primarily a strawman that the agile salesmen made up. Sure, it existed in some form but was not widely practiced.
by osigurdson
3/1/2026 at 12:25:21 PM
You claim to disagree with OP, but you seem to be describing basically the same core loop of planning and execution.

Doing OODA faster has always been key to creating high-quality outcomes.
by __alexs
3/1/2026 at 7:52:03 PM
No, OP literally claims "you can't spec out something you have no clue how to build"; I claim that, on the contrary, you absolutely can. You don't need to know how to build it, but you do need to clarify what you want to build. You can't ask AI to build something (and actually obtain a good "something") until you can say exactly what that "something" is.

You iterate, yes: sometimes because the AI gets it wrong, and sometimes because you got it wrong (or didn't say exactly what you wanted, and the AI assumed you wanted something else). But the less specific and clear you are in your requirements, the less likely it is you'll actually get what you want. Being unspecific in the requirements only really works if you want something that lots of people are building or have built before, because that allows the AI to make correct assumptions about what to build.
by virgilp
3/1/2026 at 3:07:48 AM
> The AI won't get the perfect system in one shot, far from it! And especially not from sloppy initial requirements that leave a lot of edge (or not-so-edge) cases unaddressed. But if you have a good requirement to start with, you have a chance to correct the AI, keep it on track; you have something to go back to and ask other AI, "is this implementation conforming to the spec or did it miss things?"

This is an antiquated way of thinking. If you ramp up the number of agents you're using, self-correcting and reviewing behavior kicks in, which means much less human intervention until the final code review.
by nojito
3/1/2026 at 6:57:52 AM
Yes, but what about the "spec review"? Isn't that even more important? Is the system doing what we (and its users) need it to be doing?
by galaxyLogic
3/1/2026 at 3:16:06 AM
> You can now spin up five agents to implement five different versions of the thing you're building and simply pick the best one.

Or you end up with five different mediocre solutions where the best parts are randomly distributed amongst all five.
by petersumskas
2/28/2026 at 8:30:11 PM
There's a real tension here.

If you are vibe-coding, this approach is definitely going to kill your buzz and lose all the rapid-iteration benefits.
But if you are working in an existing large system, vibe coding is hard to bring into the core. So I think something more formal like OP is needed to reap major benefits from AI.
by theptip
2/28/2026 at 10:10:32 PM
This is just AI-written slop, but even if you're vibe coding and want to go for rapid iteration, you still benefit by having the AI write out a broad plan of what it's going to do and looking it over before telling it to implement it. One-shot vibe coding is totally worthless, but the more you're aware of what the AI is thinking about and ready to revise its plans, the better it can potentially do.
by zozbot234
3/1/2026 at 5:39:17 AM
> In our shop, we have hundreds of agents working on various problems at any given time. Most of the code gets discarded. What we accept to merge are the good parts.

What you've described is an incredibly expensive and inefficient genetic algorithm with human review as the fitness function. It's not the flex you might think it is.
by hdhdhsjsbdh
2/28/2026 at 8:53:14 PM
If the price of code is zero, then changing the spec also costs zero in terms of code. This was always the problem with specs before: you'd write one, run it through the prover, write the code, then have to throw out the whole thing because there was a business case you didn't account for.

Now the bottom 98% can be given to a robot with a clear success signal other than 'it looks about right'.
by noosphr
2/28/2026 at 9:13:41 PM
code is orthogonal to spec. you can iterate on the code and iterate on the spec. the spec is not meant to be constant, it's a form of ECC for the artifacts of the coding pipeline.
by baq
3/1/2026 at 6:51:35 AM
Exactly.

Also, if you want to gain something by being less specific, e.g. not writing code, but then want to be specific in writing a spec, you've just swapped a precise system for an imprecise one.
by LunicLynx
2/28/2026 at 6:49:22 PM
That's why I have AI do a write-up about the system I want to build, which I then review in full. If it looks good, I use it as my prompt.
by giancarlostoro
2/28/2026 at 7:33:53 PM
> But I hope we can agree that you can't spec out something you have no clue how to build

Eh, of course you can. You can specify anything as long as you know what you want it to do. This is like systems engineering 101, and people do it successfully all the time.
by zppln
2/28/2026 at 5:58:21 PM
If you don't mind the question with regard to your second point, couldn't what you've done in your shop also be used here? There's no reason why 'try to develop it five different ways and pick the best parts out of each' is incompatible with the 'VSDD' concept; seems like it could be included?
by DaylitMagic
2/28/2026 at 6:09:16 PM
> you can't spec out something you have no clue how to build

Ideally—and at least somewhat in practice—a specification language is as much a tool for design as it is for correctness. Writing the specification lets you explore the design space of your problem quickly, with feedback from the specification language itself, even before you get to implementing anything. A high-level spec lets you pin down which properties of the system actually matter, automatically finds inconsistencies, and forces you to resolve them explicitly. (This is especially important when using AI, because an AI model will silently resolve inconsistencies in ways that don't always make sense but are also easy to miss!)
Then, when you do start implementing the system and inevitably find issues you missed, the specification language gives you a clear place to update your design to match your understanding. You get a concrete artifact that captures your understanding of the problem and the solution, and you can use that to keep the overall complexity of the system from getting beyond practical human comprehension.
A key insight is that formal specification absolutely does not have to be a totally up-front tool. If anything, it's a tool that makes iterating on the design of the system easier.
Traditionally, formal specifications have been hard to use as design tools, partly because of incidental complexity in the spec systems themselves, but mostly because of the overhead needed to not only implement the spec but also maintain a connection between the spec and the implementation. The tools that have been practical outside of specific niches are the ones that solve this connection problem. Type systems are a lightweight sort of formal verification, and the reason they took off more than other approaches is that typechecking automatically maintains the connection between the types and the rest of the code.
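As a toy illustration of that point (my own example, not from the comment; all names are made up), a type annotation can act as a tiny machine-checked spec that a checker like mypy keeps in sync with the code:

```python
# Distinct NewTypes encode a small "spec": user IDs and order IDs
# are both ints at runtime, but must not be mixed in the code.
from typing import NewType

UserId = NewType("UserId", int)
OrderId = NewType("OrderId", int)

def cancel_order(order: OrderId) -> str:
    # The annotation is the spec: only order IDs may be cancelled.
    return f"cancelled order {int(order)}"

print(cancel_order(OrderId(42)))   # fine under the type checker
# print(cancel_order(UserId(7)))   # mypy rejects this; plain ints would not
```

The connection maintenance is exactly the appeal: nobody has to remember to re-check the "spec" against the code, because the checker does it on every run.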
LLMs help smooth out the learning curve for using specification languages, and make it much easier to generate and check that implementations match the spec. There are still a lot of rough edges to work out but, to me, this absolutely seems to be the most promising direction for AI-supported system design and development in the future.
by tikhonj
2/28/2026 at 5:58:10 PM
"Most of the code gets discarded." If you don't mind sharing, what's your signal-to-token ratio?
by politician
2/28/2026 at 6:55:36 PM
How do you propose we measure signal? Lines of code is renowned for being a very bad measure of anything, and I really can't come up with anything better.
by kvdveer
2/28/2026 at 8:23:29 PM
The OP said that they kept what they liked and discarded the rest. I think that's a reasonable definition for signal; so, the signal-to-token ratio would be a simple ratio of (tokens committed)/(tokens purchased). You could argue that any tokens spent exploring options or refining things could be signal and I would agree, but that's harder to measure after the fact. We could give them a flat 10x multiplier to capture this part if you want.
by politician
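The proposed metric is simple enough to write down. A sketch with invented numbers (nothing here comes from the thread):

```python
# The metric above: (tokens committed) / (tokens purchased), optionally
# with the flat 10x credit for exploration tokens the commenter suggests.
def signal_to_token_ratio(tokens_committed: int, tokens_purchased: int,
                          exploration_multiplier: int = 1) -> float:
    return (tokens_committed * exploration_multiplier) / tokens_purchased

print(signal_to_token_ratio(1_000, 500_000))      # raw: 0.002
print(signal_to_token_ratio(1_000, 500_000, 10))  # with 10x credit: 0.02
```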
2/28/2026 at 9:56:36 PM
I'm going to call it out as bullshit; you can't dig out "what you like" from hundreds of agents running all the time.
by mirekrusin
2/28/2026 at 10:33:16 PM
One of our projects has 1.2K open pull requests.

https://i.postimg.cc/Jnfk9b8g/Xnapper-2026-02-28-22-25-42.pn...
We probably accept 1-2 per day.
I personally discard code for the tiniest of reasons. If something feels off moments after I open the PR, it gets deleted. The reason we still have 1.2K open PRs is because we can't review all of them in time.
The most likely solution is to delete all of them after a month or two. By that time the open PRs on this project alone will be at least 10-20 more.
by _pdp_
3/1/2026 at 4:42:16 AM
Doesn't seem like a very efficient process, no? Seems to me that investment in better output quality is exactly what's needed here, wouldn't you agree?
by mirekrusin
3/1/2026 at 7:04:46 AM
I feel they sit on the opposite end from the OP here. One side wants to write out specs to control the agent implementation and achieve a one-shot execution. The other side says: let's not waste human time writing anything.

I'm personally torn. A lot of the spec talk, now combined with TDD etc., feels like the pipe dreams of the mid-2000s. There was this idea of the Architect role who writes UML and specs, and a normal engineer just fills in the gaps. Then there was TDD. Nothing against it personally, but trying to write code test-first when you don't really have a clue how a specific platform/system/library works had tons of overhead. It also had the side effect of code written in the most convenient way to be tested, not to be executed. And now all these ideas get thrown together for AI... But throwing tokens out of the window and hoping for the token lottery to generate the best PR is also not the right direction in my book. Somebody needs to investigate both extremes, I say.
by larusso
3/1/2026 at 10:24:13 AM
Actually, nobody said the spec needs to be written by humans.

My personal opinion: with today's LLMs, the spec should be steered by a human, because its quality is proportional to result quality. Human interaction is much cheaper at that stage; it's all natural language that makes sense. Later, reasoning about the code itself will be harder.
In general, any non-trivial, valuable output must be based on some verification loop. A spec is just one way to express verification (natural language — a bit fuzzy, but still counts). Others are typecheckers, tests, and linters (especially when linter rules relate to correctness, not just cosmetics).
Personally, on non-trivial tasks, I see very good results with iterative, interactive, verifiable loops:
- Start with a task
- Write spec in e.g. SPEC.md → "ask question" until answer is "ok"/proceed
- Write implementation PLAN.md — topologically sorted list of steps, possibly with substeps → ask question
- For each step: implement, write tests, verify (step isn't done until tests pass, typecheck passes, etc.); update SPEC/PLAN as needed → ask question
- When done, convert SPEC.md and PLAN.md into PR description (summary) and discard
("Ask question" means an interactive prompt that appears for the user. Each step is gated by this prompt — it holds off further progress, giving you a chance to review and modify the result in small bits you can actually reason about.) The workflow: you accept all changes before confirming the next step. This way you get code deltas that make sense. You can review and understand them, and if something's wrong you can modify by hand (especially renames, which editors like VS Code handle nicely) or prompt for a change. The LLM is instructed to proceed only when the re-asked answer is "ok".
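The gated workflow above can be sketched as a loop. Everything here (function names, the ask() callback) is illustrative, not any real tool's API:

```python
# Minimal sketch of the gated loop: each step blocks until the reviewer
# (human, or eventually a judge LLM) answers "ok"; only then does it run.
def run_gated(steps, ask, execute):
    completed = []
    for step in steps:
        # Re-ask until approval; in practice the reviewer edits
        # SPEC.md/PLAN.md or the diff between answers.
        while ask(step) != "ok":
            pass
        execute(step)
        completed.append(step)
    return completed

# Demo: approve every step on the second ask, to exercise the re-ask gate.
asked = {}
def ask(step):
    asked[step] = asked.get(step, 0) + 1
    return "ok" if asked[step] >= 2 else "revise"

log = []
result = run_gated(["write SPEC.md", "write PLAN.md", "implement step 1"],
                   ask, log.append)
print(result)  # steps complete in order, each gated by two asks
```

The key property is that the gate sits between steps, so each approved delta is small enough to actually reason about.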
This works with systems like VSCode Copilot, not so much with CC cli.
I'm looking forward to an automated setup where the "human" is replaced by an "LLM judge" — I think you could already design a fairly efficient system like this, but for my work LLMs aren't quite there yet.
That said, there's an aspect that shouldn't be forgotten: this interactive approach keeps you in the driving seat and you know what's happening with the codebase, especially if you're running many of these loops per day. Fully automated solutions leave you outside the picture. You'll quickly get disconnected from what's going on — it'll feel more like a project run by another team where you kind of know what it does on the surface but have no idea how. IMO this is dangerous for long-term, sustainable development.
by mirekrusin
2/28/2026 at 9:47:44 PM
A lot of interesting replies below this comment that I won't be able to respond to individually.

I'll just leave this here:
by _pdp_
2/28/2026 at 10:30:03 PM
That seems barely related and settles nothing? Bottom line is simple: saying "you can't spec out something you have no clue how to build" is like saying you cannot desire coldness unless you understand how to build a refrigerator. It's just the difference between what and how. If you don't know the difference between implementation and specification, just try a whole day of answering "what" and "why" questions with "how" answers and see how it goes.
by robot-wrangler
2/28/2026 at 10:42:46 PM
Writing tests for a known solution (verification) is straightforward. But speccing out and testing something you haven't even figured out how to build yet (discovery) is a fundamentally harder problem.

Try speccing out a flux capacitor. I'll wait.
https://chatbotkit.com/reflections/verification-is-easier-th...
by _pdp_
2/28/2026 at 11:21:25 PM
> Try speccing out a flux capacitor. I'll wait.

One way to spec that is presumably something like "X% more efficient than current best-in-class", "made of Y, Z with no exotic materials", "takes no longer than T days to create", and so on.
Anyway, being "anti-spec" isn't even wrong, because it's just a completely incoherent position. There's always a spec, including any informal prompt you kick off your agents with. Call it a "structured prompt" if that soothes you and your agents, then let's move on to the interesting part, where we decide how much structure is optimal.
by robot-wrangler