7/5/2026 at 6:20:39 PM
I was worried this time last year that by this time this year, companies would have slashed their engineering teams down to a handful and everything would be driven by mostly autonomous agents with human guidance. But it just hasn't happened. Do I write all my code with an agent now? Yes. Can you just give an agent a desired outcome and let it work, unsupervised? Absolutely not. I can produce more code than I used to, but if I want it to be good, to be stable, to do what the product manager and designers want, it's only about 2 to 3 times more code than before. And that productivity is offset by the fact that I'm reviewing 2 to 3 times more code than before (and you have to review, even more so now than before, because if you just let Opus or GPT-5 do its thing, you'll get some terrible results, and I've found a lot of engineers on my team are just letting it do its thing without a lot of iteration).
by efficax
7/5/2026 at 6:48:07 PM
> I was worried this time last year that by this time this year, companies would have slashed their engineering teams down to a handful and everything would be driven by mostly autonomous agents with human guidance. But it just hasn't happened.
I find this somewhat puzzling. I thought things were moving quickly, but at this time last year I couldn't even get Claude (using Cursor) to spin me up a service skeleton that would compile, let alone do anything meaningful.
I know it feels like a long time somehow, but it was only between November and February that things started to actually somewhat work without significant hand holding. Even now, it seems like we're still figuring out how to fully leverage the current models and tooling, even in organizations that have largely gotten on board.
by supern0va
7/5/2026 at 7:13:30 PM
It's not all that surprising that people were worried and believed this. The AI companies and the infrastructure companies partnering with them have spent a lot of money and time trying to convince people this is the case, year after year. The critical clue people miss is that everyone claiming it has very clear financial incentives to convince people that's the case even when they know it isn't. Anyone who was actually building with LLMs and judging for themselves based on their performance knew full well that wasn't the case, year after year.
by iepathos
7/5/2026 at 7:56:43 PM
I've said this before: if Anthropic (et al) thought they genuinely had a shot at replacing even 30% of white collar work, they would ABSOLUTELY NOT warn ANYONE. They would do what oil, leaded gas, and cigarette companies did. Swear under oath this is completely safe, commit GRIEVOUS societal harm that you explicitly promised wouldn't happen, and then end up in history books instead of jail for reasons beyond my ability to fathom.
No. The very fact they are trying to "warn" us means it's all marketing.
This has been corroborated for me on the engineering front: I can't find a single IC I respect who actually thought there was any evidence AI was going to live up to the hype. I saw a lot of people I always thought were idiots/sycophants/brown nosers go insane with AI. Never saw anyone I'd trust to help me cross a street blindfolded say more than "I may be wrong, but I'm not seeing any evidence yet".
by atomicnumber3
7/5/2026 at 8:21:08 PM
Fwiw, you're conflating multiple things and consequently drawing premature conclusions.
It can be massively overhyped for its current capacity and still decimate white collar work.
A lot of the difference of opinion is down to point of view. At my day job, LLMs will not live up to anything because the enterprise is not structured to take advantage of their strengths. That's unlikely to change in the foreseeable future.
I strongly suspect you mostly talked with people coming from just such a background, because it's hard to go beyond our own bubbles.
by ffsm8
7/5/2026 at 8:23:40 PM
Sure, naturally. And yet the parent commenter is remarking that no AI-true-believer startups have supplanted the old money and that, at the same time, despite much talk, the bigcos have not slashed headcount to tiny AI-powered teams.
by atomicnumber3
7/5/2026 at 7:01:04 PM
> at this time last year I couldn't even get Claude (using Cursor) to spin me up a service skeleton that would compile, let alone do anything meaningful
I've been using it to do this for 2 years now, and so have many other people. The change you mention is primarily one of Overton windows, of vibes.
by deaux
7/5/2026 at 7:04:55 PM
Which harness software were you using for this 2 years ago? VS Code Copilot? Cursor?
by simonw
7/5/2026 at 6:43:09 PM
> Can you just give an agent a desired outcome and let it work, unsupervised? Absolutely not.
Ignoring instructions - whether in AGENTS.md or my prompt - is the worst of it, and it routinely happens. It just waives things that I explicitly told it to do as part of the design.
Vibe coders (in the true sense, zero oversight) claim that you just need to prompt it carefully. That's completely untrue when faced with your careful prompt being ignored.
I even have "don't overrule me without asking" in my global AGENTS.md, and it simply doesn't do that.
by zamalek
7/5/2026 at 6:54:07 PM
Your context isn’t there to give it orders; they just don’t work like that. Your context (AGENTS.md, skills, the per-request context we send in with each request to the bots) is there to give it the info it needs, in the language category it’s trained on, for the answers you want; you have to give it a clear instruction in each prompt. Basically, in a long session you can see this by saying, ok, now moving on to another thing, blah blah blah (implicitly ignoring all previous instructions).
It can even backfire: nagging too much about “don’t skip tests” in the context can make it slip into the linguistic space where there is some emergency and faking the results might be justified (I imagine there is a certain amount of training data out there along the lines of “just making the tests pass for now, will fix later, I promise”). If you rarely mention tests except “this one is failing, please investigate what is going on” (an informational outcome, not a test outcome), it doesn’t really “cheat” (tho it can leap to conclusions as always). The tests need to be a deterministic step in the process anyway; tests don’t need fuzzy, word-directed search capabilities.
But the models just don’t have the structure to let you feed in a ten-page set of rules and have them followed. You can add a step that says, please check this git commit for compliance with the 23 rules in this standards file, and that will work better to catch the gaps.
by lanstin
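The compliance-check step suggested in the comment above could be sketched roughly as follows. This is only an illustration, not anyone's actual tooling: the rules-file format (one rule per line), the `<number>: PASS` / `<number>: FAIL` answer format, and all function names are assumptions made for the example, and the call to the model itself is deliberately left out.

```python
def load_rules(text: str) -> list[str]:
    """Parse a standards file: one rule per non-empty line, '#' lines are comments.

    The file format is an assumption made for this sketch.
    """
    rules = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            rules.append(stripped)
    return rules


def build_review_prompt(rules: list[str], diff: str) -> str:
    """Build a single-purpose prompt that asks for one verdict per numbered rule."""
    numbered = "\n".join(f"{i}. {rule}" for i, rule in enumerate(rules, start=1))
    return (
        "Check the commit below against each numbered rule.\n"
        "Answer with one line per rule, formatted as '<number>: PASS' or "
        "'<number>: FAIL - <reason>'.\n\n"
        f"Rules:\n{numbered}\n\nCommit diff:\n{diff}\n"
    )


def failed_rules(response: str, rule_count: int) -> list[int]:
    """Return rule numbers that failed or that the response never answered.

    Treating a missing answer as a failure keeps the check conservative.
    """
    passed = set()
    for line in response.splitlines():
        number, sep, verdict = line.partition(":")
        if sep and number.strip().isdigit() and verdict.strip().upper().startswith("PASS"):
            passed.add(int(number.strip()))
    return [n for n in range(1, rule_count + 1) if n not in passed]
```

The prompt would be sent as its own request, separate from the coding session, and `failed_rules` turns the reply into something a script can act on instead of relying on the model to police itself.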
7/5/2026 at 6:53:02 PM
These are word generators, not agents. I’m really not sure why people think they could be capable agents (i.e. independent) when they consistently ignore instructions, generate the wrong things and then double down when questioned, etc etc.
You’ve been sold something that simply doesn’t work for the purported use case (intelligence) and instead is like a stupid database of all world knowledge with the appearance of intelligence.
Useful tools at times (if you bear in mind their limitations), but not close to intelligent, independent agents.
by grey-area
7/5/2026 at 7:00:57 PM
> I even have "don't overrule me without asking" in my global AGENTS.md, and it simply doesn't do that.
You really need to look into hooks for your coding agent. This is very much a solved problem, as I demonstrate with
https://github.com/gitsense/pi-brains
I have a test repo
https://github.com/gitsense/gsc-rules-demos
that shows how you can block and warn and do other things.
You obviously can't have a "Don't make a mistake" rule though.
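For readers unfamiliar with the idea, here is a minimal sketch of what a block/warn hook decision might look like. It is not the API of the linked projects or of any particular coding agent: the `Rule` type, the `evaluate` function, and the example patterns are all hypothetical, and a real hook would be wired into whatever pre-execution callback your agent exposes.

```python
import re
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    WARN = "warn"
    BLOCK = "block"


@dataclass(frozen=True)
class Rule:
    pattern: str    # regex matched against the command the agent wants to run
    action: Action  # what to do when the pattern matches
    message: str    # feedback returned to the agent (and the human)


# Hypothetical rules, for illustration only.
RULES = (
    Rule(r"\bgit\s+push\b.*--force", Action.BLOCK, "Force pushes need human approval."),
    Rule(r"\bgit\s+commit\b.*--no-verify", Action.BLOCK, "Do not skip commit hooks."),
    Rule(r"\brm\s+-rf\b", Action.WARN, "Recursive delete: double-check the path."),
)

_SEVERITY = {Action.ALLOW: 0, Action.WARN: 1, Action.BLOCK: 2}


def evaluate(command: str, rules: tuple[Rule, ...] = RULES) -> tuple[Action, list[str]]:
    """Return the strictest action triggered by the command, plus every message."""
    decision = Action.ALLOW
    messages: list[str] = []
    for rule in rules:
        if re.search(rule.pattern, command):
            messages.append(rule.message)
            if _SEVERITY[rule.action] > _SEVERITY[decision]:
                decision = rule.action
    return decision, messages
```

The point of the design is that the decision is made by plain pattern matching outside the model, so an instruction cannot be "forgotten" partway through a session. As the comment says, this only works for rules that can be checked mechanically.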
by sdesol
7/5/2026 at 6:52:14 PM
I’m convinced the magic bullet is deterministic checks. Linters, static analyzers, etc. Whatever you can do to create deterministic gates that the LLM simply must overcome to reach a “done” state, do it. Has been making a huge difference for my team, but sister teams are so invested in writing the perfect Make No Mistakes prompt that they just can’t see it.
Basically I treat it like a junior dev. We don’t get junior devs to write code correctly by cajoling them just right, we add CI gates. It still works.
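A minimal sketch of such a gate runner, assuming each check is just a command whose exit status decides pass or fail. The names `GateResult`, `run_gate`, `run_gates`, and `is_done` are made up for this example and do not describe the commenter's actual setup; the commands you pass in would be your own linter, type checker, and test runner.

```python
import subprocess
from dataclasses import dataclass


@dataclass
class GateResult:
    name: str
    passed: bool
    output: str


def run_gate(name: str, command: list[str]) -> GateResult:
    """Run one check; the gate passes only if the command exits with status 0."""
    proc = subprocess.run(command, capture_output=True, text=True)
    return GateResult(name, proc.returncode == 0, (proc.stdout + proc.stderr).strip())


def run_gates(gates: dict[str, list[str]]) -> list[GateResult]:
    """Run every gate, even after a failure, so all problems surface in one pass."""
    return [run_gate(name, command) for name, command in gates.items()]


def is_done(results: list[GateResult]) -> bool:
    """'Done' means every gate passed; there is no partial credit."""
    return all(result.passed for result in results)
```

In use, you would call `run_gates` with something like `{"tests": [sys.executable, "-m", "unittest", "discover"]}` after each agent turn, feed the `output` of any failed gate back to the agent, and only accept the change once `is_done` returns `True`. The same commands can then run unchanged in CI.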
by rogerrogerr
7/5/2026 at 7:55:18 PM
Why aren't the teams using shared checks? Is the code in different repos?
by sdesol
7/5/2026 at 8:02:56 PM
They’re very, very different projects.
by rogerrogerr
7/5/2026 at 6:49:58 PM
Also noticed this. Their intelligence is very jagged. I’ve had them produce some highly optimized code yet fail to follow basic code guidelines.
by codemog
7/5/2026 at 7:13:08 PM
In my limited testing Fable is far better at obeying CLAUDE.MD than Opus is.
by ls612
7/5/2026 at 6:24:25 PM
I have experienced much the same and feel the same way, and it is refreshing to see a realistic post about the success of agentic coding instead of the usual hype or doom.
by alt227
7/5/2026 at 6:36:57 PM
As crazy as it may sound, my workflow today does not look too different from a year ago, when I was already heavy into Claude Code.
I'm not certain things will look too different a year from now either. We still have serious bottlenecks in terms of the focus/attention you have for both delegating agent work and being able to review it. Even if we solve the "trust what AI does" problem, these cognitive deficit issues still exist - for teams coordinating work, even users adopting new shit, etc.
As an industry we are leaning heavily into accepting "slop" as the status quo - we care more about efficiency of output right now. Slop will get better & we can become more adaptive to living with the paradox of amazing yet delicate systems generated by AI. But I feel big shifts coming in this regard, and if/when they do we may find ourselves in the dystopia of broader unemployment with worse net outcomes.
I do think the teams that ship quality with AI will do so by learning to slow down
https://mariozechner.at/posts/2026-03-25-thoughts-on-slowing...
by ramoz