Productivity gains from AI coding assistants haven’t budged past 10% – survey

2/19/2026 at 8:14:26 PM

This is self-reported productivity, in that devs are saying AI saves them about 4 hours per week. But let’s not forget the METR study that found a 20% increase in self-reported productivity but a 19% decrease in actual measured productivity.

(It used a clever and rigorous technique for measuring productivity differences, BTW, for anyone as skeptical of productivity measures as I am.)

by jdlshore

2/19/2026 at 9:10:04 PM

Let's also not forget the multiple other studies that found significant boosts to productivity using rigorous methods like RCTs.

However, because these threads always go the same way whenever I post this, I'll link to a previous thread in hopes of preempting the same comments and advancing the discussion! https://news.ycombinator.com/item?id=46559254

Also, DX (whose CTO was giving the presentation) actually collects telemetry-based metrics (PRs etc.) as well: https://getdx.com/uploads/ai-measurement-framework.pdf

It's not clear from TFA if these savings are self-reported or from DX metrics.

by keeda

2/19/2026 at 8:39:45 PM

https://metr.org/blog/2025-07-10-early-2025-ai-experienced-o...

That info is from mid 2025, talking about models released in Oct 2024 and Feb 2025. It predates tools like Claude Code and Codex, Lovable was 1/3 current ARR, etc.

This might still be true but we desperately need new data.

by samuelknight

2/19/2026 at 8:54:57 PM

None of those changes address the issue jdlshore is pointing out: self-assessed developer productivity increases from LLMs are not a reliable indication of actual productivity increases. It's true that modern LLMs might have less of a negative impact on productivity, or might even increase it, but you won't be able to tell by asking developers if they feel more productive.

(Also, Anthropic released Claude Code in February of 2025, which was near the start of the period the study ran.)

by lunar_mycroft

2/19/2026 at 8:56:41 PM

Yeah, new data would be great, but I feel like these tools are not substantively better and this is becoming the new "it's different this time!"

by monkaiju

2/19/2026 at 8:38:30 PM

Has the METR study been replicated?

by williamcotton

2/19/2026 at 8:59:19 PM

Not a scientific study, but someone did replicate the experiment on themselves [0] and found that in their case, any effect from LLM use wasn't detectable in their sample. Notably they almost certainly had more experience with LLMs than most of the METR participants did.

[0] https://mikelovesrobots.substack.com/p/wheres-the-shovelware...

by lunar_mycroft

2/19/2026 at 8:43:57 PM

I haven’t heard about any similar studies, no. I’m planning to conduct one at my workplace but we’re still deciding exactly which uses of AI to test.

by jdlshore

2/19/2026 at 7:36:33 PM

You're only as fast as your biggest bottleneck. Adding AI to an existing organization is just going to show you where your bottlenecks are, it's not going to magically make them go away. For most companies, the speed of writing code probably wasn't the bottleneck in the first place.

by overgard

2/19/2026 at 7:47:27 PM

the amount of people that work in technology and have never heard of amdahl's law always shocks me

https://en.wikipedia.org/wiki/Amdahl's_law

a 100% increase in coding speed means I then get to spend an extra 30 minutes a week in meetings

while now hating my job, because the only fun bit has been removed

"progress"

by blibble
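For reference, the Amdahl's law point above can be worked through numerically. This is only a minimal sketch: the 40-hour week is an assumption, the 1 hour of weekly coding is simply what "an extra 30 minutes" from a 100% speedup would imply, and the function names are made up for illustration.

```python
def amdahl_speedup(p: float, s: float) -> float:
    """Overall speedup when a fraction p of total time is sped up by factor s."""
    return 1.0 / ((1.0 - p) + p / s)


def hours_saved(total_hours: float, p: float, s: float) -> float:
    """Hours freed per week when the sped-up fraction p runs s times faster."""
    return total_hours * p * (1.0 - 1.0 / s)


if __name__ == "__main__":
    week = 40.0          # assumed working week, in hours
    coding = 1.0 / week  # assumed: 1 hour of coding per week
    print(f"overall speedup: {amdahl_speedup(coding, 2.0):.3f}x")
    print(f"hours saved per week: {hours_saved(week, coding, 2.0):.2f}")
```

Under those assumptions, doubling coding speed frees half an hour a week, which is an overall speedup of a little over 1%.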

2/19/2026 at 8:38:32 PM

So if I'm understanding you correctly, prior to AI tools you spent 1 hour per week coding? And now you spend 30 minutes per week?

by hnuser847

2/19/2026 at 8:16:03 PM

the number of people who have heard of Amdahl's law but don't know when to use "amount of X" vs "number of Y" always shocks me as well

by zht

2/19/2026 at 8:12:13 PM

Agreed. The bottleneck is QA/Code review and that is never going away from most corps. I've never worked at a job in tech that didn't require code review and no, asking a code agent to review a PR is never going to be "good enough".

And here we are, the central argument for why code agents are not these job killing hype beasts that are so regularly claimed.

Has anyone seen what multi-agent code workflows produce? Take a look at openclaw, the code base is an absolute disaster. 500k LoC for something that can be accomplished in 10k.

by qudat

2/19/2026 at 8:53:03 PM

My head of engineering spent half a day creating a complex setup of agents in opencode, to refactor a data model across multiple repositories. After a day running agents and switching between providers to work around the token limits, it dumped a -20k +30k change set we'll need to review.

If we're very lucky, we'll break even time wise compared to just running a single agent on a tight leash.

by lbreakjai

2/19/2026 at 10:19:11 PM

YOLO. Just ship it.

by voidfunc

2/19/2026 at 9:54:49 PM

While reading your comment the Benny Hill theme Yackety Sax started playing in my head.

by jimbokun

2/19/2026 at 8:25:05 PM

> I've never worked at a job in tech that didn't require code review

I have. Sometimes the resulting code was much worse than what you get from an LLM, and yet the project itself was still a success despite this.

I've also worked in places with code review, where the project's own code quality architecture-and-process caused it to be so late to the market it was an automatic failure.

What matters to a business is ideally identical to the business metrics, which are usually not (but sometimes are) the code metrics.

by ben_w

2/19/2026 at 8:42:48 PM

The bottleneck at larger orgs is almost always decision-making.

Getting code written and reviewed is the trivial part of the job in most cases, discovering the product needs, considering/uncovering edge-cases, defining business logic that is extensible or easily modifiable when conditions change, etc. are the parts that consume 80% of my time.

We in the engineering org at the company I work for have raised this flag many times during the adoption of AI-assisted tools. Now that the rollout is deeply in progress, with most developers using the tools and changing workflows, it has become the sore thumb sticking out: yes, we can deliver more code if it's needed, but for what exactly do you need it?

So far I haven't seen a speed up in decision-making, the same chain of approvals, prioritisation, definitions chugs along as it was and it is clearly the bottleneck.

by piva00

2/19/2026 at 8:35:21 PM

i don't think that's actually the bottleneck?

the bottleneck is aligning people on what the right thing to do is, and fitting the change into everyone's mental models. it gets worse the more people are involved

by 8note

2/19/2026 at 8:28:17 PM

> Take a look at openclaw, the code base is an absolute disaster. 500k LoC for something that can be accomplished in 10k.

Mission accomplished: acquihire worth probably millions and millions.

I agree with you, by the way.

by oblio

2/19/2026 at 8:40:10 PM

It was a hire not an acquihire. There was no acquisition.

by MYEUHD

2/19/2026 at 9:06:11 PM

There was a big payoff on signing so to-may-to, to-mah-to.

by oblio

2/19/2026 at 8:16:10 PM

I'm sorry but consider how many more edge cases and alternatives can be handled in 500k LoC as compared to that tiny 10k.

In the days of AGI, higher LoC is better. It just means the code is more robust, more adaptable, better suited to real world conditions.

by co_king_5

2/19/2026 at 8:58:51 PM

That’s… not how software works, no matter how it is produced. Complexity is the enemy; always.

by Yodel0914

2/19/2026 at 8:06:13 PM

In high-performance teams it is. In bike-shedding environments of course it is not.

by menaerus

2/19/2026 at 9:34:37 PM

I'm not sure I'd call it bike shedding so much as that a lot of time and effort tends to go into hard to answer questions: what to build, why to build it, figuring out the target customer, etc. A lot of times going a thousand miles per hour with an LLM just means you figure out pretty quickly you're building the wrong thing. There's a lot of value to that (although we used to just call this "prototyping"), but, that doesn't remove the work of actually figuring out what your product is.

The least productive teams I've been on, it wasn't usually engineering talent that was the problem, it was extremely vague or confused requirements.

by overgard

2/19/2026 at 10:12:30 PM

I think you meant to say incompetent leadership.

by menaerus

2/19/2026 at 8:02:58 PM

This. The key bottleneck in many organizations is the "socialize and align" on what to build. Or just "socialize and align" in general. :)

by outside1234

2/19/2026 at 7:43:51 PM

one thing that always slowed me down was writing jsdocs and testing.

Now I can write one passing example and then get codex to read the code and write a test for every branch in that section. It saves time, as it can type a lot faster than I can, and it's mostly copying the example I already have but changing the input to hit all the branches.

by vorticalbox

2/19/2026 at 7:56:56 PM

> let's have LLMs check our code for correctness

Lmao. Rofl even.

(Testing is the one thing you would never outsource to AI.)

by otabdeveloper4

2/19/2026 at 8:04:48 PM

Outsourcing testing to AI makes perfect sense if you assume that tests exist out of an obligation to meet some code coverage requirements, rather than to ensure correctness. Often I'll write a module and a few tests that cover its functionality, only for CI to complain that line coverage has decreased and reject my merge! AI to the rescue! A perfect job for a bullshit generator.

by idle_zealot

2/19/2026 at 8:37:51 PM

outsourcing testing to the AI also gets its code connected to deterministic results, and should let the agent interact with the code to speculate about expectations and check them against the actual code.

it could still speculate wrong things, but it won't speculate that the code is supposed to crash on the first line of code

by 8note

2/19/2026 at 8:41:41 PM

> Testing is the one thing you would never outsource to AI

That's not really true.

Making the AI write the code, the test, and the review of itself within the same session is YOLO.

There's a ton of scaffolding in testing that can be easily automated.

When I ask the AI to test, I typically provide a lot of equivalence classes.

And the AI still surprises me with finding more.

On the other hand, it's equally excellent at saying "it tested", and when you look at the tests, they can be extremely shallow. Or they can be fairly many unit tests of certain parts of the code, but when you run the whole program, it just breaks.

The most valuable tests when programming with AI (generated by AI, or otherwise) are near-realistic integration tests. That's true for human programmers too, but we take for granted that casual use of the program as we develop it constitutes a poor man's test. When people who generally don't write tests start using AI, there's just nothing but fingers crossed.

I'd rather say: If there's one thing you would never outsource to AI, it's final QA.

by sshine

2/19/2026 at 8:31:45 PM

> (Testing is the one thing you would never outsource to AI.)

I would rephrase that as "all LLMs, no matter how many you use, are only as good as one single pair of eyes".

If you're a one-person team and have no capital to spend on a proper test team, set the AI at it. If you're a megacorp with 10k full time QA testers, the AI probably isn't going to catch anything novel that the rest of them didn't, but it's cheap enough you can have it work through everything to make sure you have, actually, worked through everything.

by ben_w

2/19/2026 at 8:09:21 PM

You don't use the LLM to check your code for correctness; you use the LLM to generate tests to exercise code paths, and verify that they do exercise those code paths.

by LoganDark

2/19/2026 at 8:30:45 PM

And that test will check the code paths are run.

That doesn't tell you that the code is correct. It tells you that the branching code can reach all the branches. That isn't very useful.

by onion2k
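The gap between "the branches ran" and "the code is correct" can be sketched in a few lines. This is an illustrative example only: `shipping_cost`, its free-shipping rule, and the deliberate off-by-one bug are all made up for the purpose.

```python
def shipping_cost(order_total: float) -> float:
    """Hypothetical rule: orders of 50 or more ship free, others cost 5."""
    if order_total > 50:  # deliberate bug: the rule says >= 50
        return 0.0
    return 5.0


def coverage_only_test() -> None:
    # Runs both branches but never checks the returned values.
    shipping_cost(100.0)
    shipping_cost(10.0)


def behavior_test() -> None:
    # Checks the boundary the rule actually specifies.
    assert shipping_cost(50.0) == 0.0, "orders of exactly 50 should ship free"


def passes(test) -> bool:
    """Return True if the test function finishes without an AssertionError."""
    try:
        test()
    except AssertionError:
        return False
    return True


if __name__ == "__main__":
    print("coverage-only test passes:", passes(coverage_only_test))
    print("behavior test passes:", passes(behavior_test))
```

The coverage-only test reaches every line and still passes with the bug in place; only the test that asserts on the boundary value fails.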

2/19/2026 at 7:33:54 PM

I think that over time people will start looking at AI-assisted coding the same way we now look at loosely typed code, or at (heavy) frameworks: it saves time in the short term, but may cause significant problems down the line. Whether or not this tradeoff makes sense in a specific situation is a matter of debate, and there's usually no obviously right or wrong answer.

by nasretdinov

2/19/2026 at 7:43:06 PM

Once the free money runs out, the AI cos may shift to making heavily verified code snippets with more direct language control. This will heavily simplify a lot of boilerplate instead of fairytales of some AGI coding wiz.

by doomslayer999

2/19/2026 at 7:45:11 PM

Isn't the boilerplate that "AI" is capable of generating becoming more and more dated with each passing day?

Are the AI firms capable of retraining their models to understand new features in the technologies we work with? Or are LLMs going to be stuck generating circa-2022 boilerplate forever?

by co_king_5

2/19/2026 at 9:57:00 PM

I mean if people continue checking open source code into GitHub using those new features then they should be able to learn them just fine.

by jimbokun

2/19/2026 at 10:30:07 PM

This is only true if there continues to be tremendous amounts of money/hardware/power available to perform the training, in perpetuity.

by danaris

2/19/2026 at 7:46:50 PM

No to the first question, and maybe with a lot of money for the second question.

by doomslayer999

2/19/2026 at 8:28:55 PM

In the 20 years I've been in the industry, boiler plate has dropped dramatically in the backend.

Right now, front end has tons of boilerplate. It's one of the reasons AI has such a wow factor for FE: trivial tasks require a lot of code.

But even that is much better than it was 10 years ago.

That was a long way of saying I disagree with your no.

by mattmanser

2/19/2026 at 8:36:28 PM

FE has a lot of boilerplate only if you’re starting from scratch every single time. That’s why we had template systems and why we invented view libraries. Once you’ve defined your libraries, you just copy-paste stuff.

by skydhash

2/19/2026 at 7:48:46 PM

It seems like they should be able to “overweight” newer training data. But the risk is the newer training data is going to skew more towards AI slop than older training data.

by matthewbauer

2/19/2026 at 7:59:47 PM

There won't ever be newer training data.

The OG data came from sites like Stackoverflow. These sites will stop existing once LLMs become better and easier to use. Game over.

by otabdeveloper4

2/19/2026 at 8:08:43 PM

Every time claude code runs tests or builds after a change, it's collecting training data.

by esclerofilo

2/19/2026 at 8:10:51 PM

Has Anthropic been able to leverage this training data successfully?

by co_king_5

2/19/2026 at 8:23:02 PM

I can't pretend to know how things work internally, but I would expect it to be involved in model updates.

by esclerofilo

2/19/2026 at 8:32:55 PM

You need human language programming-related questions to train on too, not just the code.

by otabdeveloper4

2/19/2026 at 8:39:13 PM

that's what the related chats are for?

by 8note

2/19/2026 at 9:19:33 PM

It really depends on the situation. I think there's an argument for generating in a lower level strongly typed language, where most of the work of writing the pointlessly verbose parts is eliminated, any errors are found by the compiler immediately, but it still leaves the option for handwritten optimizations when needed. Sort of how one can drop down to C in python for the parts that need more performance.

by moffkalast

2/19/2026 at 8:07:58 PM

Apparently "AI is speeding up the onboarding process", they say. But isn't that because the onboarding process is about learning, and by having an AI regurgitate the answers you can complete the process without learning anything, which might speed it up but completely defeats the purpose?

by ptx

2/19/2026 at 8:31:13 PM

Yes, that's how I'd interpret it, too.

According to the article, onboarding speed is measured as “time to the 10th Pull Request (PR).”

As we have seen on public GitHub projects, LLMs have made it really easy to submit a large number of low-effort pull requests without having any understanding of a project.

Obviously, this kind of faster onboarding is not necessarily good for an organization.

by raphman

2/19/2026 at 9:59:22 PM

Yeah it should only count ACCEPTED pull requests.

by jimbokun
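To make the difference concrete, here is a rough sketch of "time to the 10th PR" computed both ways. The `PullRequest` record, the function name, and the sample history are all hypothetical; the article does not say how DX actually computes the metric.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class PullRequest:
    opened: date  # day the PR was opened
    merged: bool  # whether it was eventually accepted


def days_to_nth_pr(start: date, prs: list[PullRequest], n: int = 10,
                   merged_only: bool = False) -> Optional[int]:
    """Days from start until the n-th qualifying PR was opened, or None if fewer than n qualify."""
    counted = sorted(
        (pr for pr in prs if pr.merged or not merged_only),
        key=lambda pr: pr.opened,
    )
    if len(counted) < n:
        return None
    return (counted[n - 1].opened - start).days


if __name__ == "__main__":
    start = date(2026, 1, 5)  # hypothetical first day on the job
    # Made-up history: one PR per day for 20 days, every second one merged.
    prs = [PullRequest(start + timedelta(days=i), merged=(i % 2 == 0))
           for i in range(1, 21)]
    print("any PR:", days_to_nth_pr(start, prs))
    print("merged only:", days_to_nth_pr(start, prs, merged_only=True))
```

With this made-up history, counting only accepted PRs doubles the measured onboarding time, which is the kind of gap a flood of low-effort PRs would hide.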

2/19/2026 at 8:36:28 PM

I think there's definite scope for that being true; not because you can start doing stuff before you understand it (you can), but because you can ask questions of a codebase you're unfamiliar with to learn about it faster.

by mjfisher

2/19/2026 at 8:42:04 PM

i'd guess the time until first being able to make useful changes has dropped to near zero, but the time to get mastery of the code base has gone towards infinity.

is that mastery still useful as time goes on though? it's always felt a bit unhealthy for code to have people with mastery of it. it's a sign of a bad bus factor. every effort i've ever seen around code quality and documentation improvement has been to make that code mastery and full understanding irrelevant.

by 8note

2/19/2026 at 8:10:26 PM

Correct. Reading code is important. The details are in the minutiae, and the way code works is that the minutiae are important.

Summarizing this with AI makes you lose that context.

by OptionOfT

2/19/2026 at 8:16:43 PM

This has been my experience as a dev, and it always confuses me when people say they prefer to work at a “higher level”. The minutiae are often just as important as some of the higher level decisions. Not everything, but not an insignificant portion either. This applies to basic things like correctness, performance, and security - craft, style, and taste are not involved.

by snsjzhhz

2/19/2026 at 8:18:38 PM

> This has been my experience as a dev, and it always confuses me when people say they prefer to work at a “higher level”.

> The minutiae are often just as important as some of the higher level decisions.

Frankly, a failure to understand this is a tell that someone is not equipped to evaluate code quality.

by co_king_5

2/19/2026 at 7:49:46 PM

Unsurprising for multiple reasons. Most organizations have other bottlenecks and limiting factors than “how fast can you develop”.

Regardless, if you’re a dev who is now 2x as productive in terms of work completed per day, and quality remains stable, why should this translate to 2x the output? Most people are paid by the hour and not for outcomes.

And yes, I am suggesting that if you complete in 4 hours that which took you 8 hours in 2019, that you should consider calling it a day.

by xeiotos

2/19/2026 at 7:40:47 PM

I found the title for this post misleading. To clarify it a bit, AI has only improved productivity by 10% even though 93% of devs are using it.

by bluejekyll

2/19/2026 at 8:00:47 PM

Yeah, the title may suggest that productivity is still 10% out of 100% after CEOs fired half of developers believing that the rest will do all the work with the help of AI.

by dandanua

2/19/2026 at 7:46:13 PM

I think some AI companies are just now starting to feel the pressure to profit.

Soon, I predict we will see a pretty significant jump in price that will make a 10% productivity gain seem tiny compared to the associated bills.

For now, these companies are trying to reach critical mass so their users are so dependent on their tech that they have to keep paying at least in the short term.

by ilovetux

2/19/2026 at 8:54:57 PM

The real takeaway here -- also corroborated by the DORA 2025 report https://dora.dev/research/2025/ -- is that more than anything, AI amplifies your current development culture. Organizations with strong quality control discipline enjoy more velocity, those with weak practices suffer more outages.

Expecting AI to magically overcome your development culture is like expecting consultants to magically fix your business culture.

Furthermore, by various estimates, engineers only spend 10 - 60% of their time on actual code. So, given that currently AI is largely used only for coding activities, 10% is actually considerable savings.

Also this is the result of retro-fitting AI into existing workflows; actual "AI-native" workflows would probably look very different, likely having refactored in other parts of software engineering. Spotify's "Honk" workflow is probably just a starting point.

by keeda
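The "10% is considerable" argument above can be checked with some quick arithmetic. This is a sketch under stated assumptions: the 10% is read as a share of total working time saved (roughly 4 hours of an assumed 40-hour week), AI is assumed to speed up only the coding share, and the function name is invented for illustration.

```python
import math


def required_coding_speedup(coding_share: float, time_saved: float) -> float:
    """Coding speedup needed so that total time drops by `time_saved`.

    Solves coding_share * (1 - 1/s) = time_saved for s. Returns math.inf
    when the target cannot be reached with any finite speedup.
    """
    if coding_share <= time_saved:
        return math.inf
    return coding_share / (coding_share - time_saved)


if __name__ == "__main__":
    saved = 0.10  # assumed: 4 hours saved out of a 40-hour week
    for share in (0.10, 0.30, 0.60):
        s = required_coding_speedup(share, saved)
        print(f"coding share {share:.0%}: coding must be {s:.2f}x faster")
```

Under these assumptions, someone who codes 60% of the time needs only a 1.2x coding speedup to save 10% overall, while someone who codes 10% of the time would need coding to take no time at all.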

2/19/2026 at 9:06:23 PM

I'm pretty sure it has to do with the individual as well as the culture. Juniors/new hires use AI to double their wrong/unsafe output, and seniors then have to spend more time correcting it.

I'll be honest: I write piss-poor code; each time I come back to an old project I see where I could have done better. New hires are worse, but before AI (and especially Opus) they didn't produce that much code before spending like 6 months learning (I'm on a netsec tooling team). Now, they start producing code after two weeks or less, and every line has to be checked because they don't understand what they are doing.

I think my personal output was increased by 15% on average (maybe 5 on difficult projects), but our team output decreased overall.

by orwin

2/19/2026 at 9:19:57 PM

Yes, we as a society urgently have to figure out how to learn and educate with AI. There are even studies showing that students who use AI to do their work do not learn the necessary skills.

And I'm also hearing grumblings about entry level talent that is absolutely clueless without AI, which does not help the junior hiring scene at all.

At this point it seems clear that people wishing to learn a discipline should restrict their usage of AI until they have "built the muscles", but none of our educational, testing, recruitment and upskilling practices are conducive to that.

by keeda

2/19/2026 at 8:44:58 PM

My biggest roadblocks as an engineer have almost never been the authorship of code but everything else around it.

* Getting code reviewed

* Making sure it's actually solving the problem

* Communicating to the rest of the team what's happening

* Getting tests to pass

* Getting it deployed

* Verifying that the fix is implemented in production

* Starting it all over when there is a misunderstanding

Slinging more code faster is great and getting unit testing more-or-less for free is awesome but the separation between a good and great engineer is one of communication and management.

AI is causing us to regress to thinking that code velocity is a good metric to use when comparing engineers.

by kiernanmcgowan

2/19/2026 at 8:18:29 PM

As far as I can tell from my workplace the total impact on productivity is neutral to negative.

by chvid

2/19/2026 at 8:20:49 PM

I read this article as the CTO being the bottleneck if he's only seeing 10% productivity boost at his organization.

I don't think this is purely an AI problem. It's more about the legacy costs of maintaining many minds, which can't be solved by just giving people AI tools, at least until the AI also comes for the CTO role (but not the CEO or revenue-generating roles) and for whichever manager is the bottleneck.

I imagine a future where we have Nasdaq-listed companies run by just a dozen people, with AI agents running and talking to each other so fast that text becomes a bottleneck and they need another medium that can only be understood by an AI that will hold humans' hands.

This shift would also be reflected by new hardware shifts...perhaps photonic chips or anything that lets AI scale up crazy without the energy cost....

Exciting times are ahead for AI, but it's also accelerating digital UBI....could be good and bad.

by agentifysh

2/19/2026 at 8:27:18 PM

> it's also accelerating digital UBI

Do you have sources for this claim?

by Nezteb

2/19/2026 at 8:32:02 PM

A 10% uplift in productivity for the cost of probably 0.001% of the salary budget is an incredible success.

by onion2k

2/19/2026 at 8:35:58 PM

This is exactly right. And assuming organizations use the gains to cut headcount rather than boost total productivity, a 10% reduction in white collar employment would still be an era-defining systemic shock to the economy.

by arctic-true

2/19/2026 at 8:41:07 PM

Productivity improvements from automation actually result in an increase in jobs, not fewer jobs. Basic economics.

by emp17344

2/19/2026 at 8:47:39 PM

How are CTOs so out of touch and yet loud and proud about it?

by wewewedxfgdf

2/19/2026 at 8:29:26 PM

The title is misleading. Productivity isn't at 10%, it's at 110%.

by rcfox

2/19/2026 at 9:56:13 PM

Ximm's Law applies to the "plateau" of 10%

In other words: notionally, if not literally, by the time trailing numbers are collected they are out of date.

This is of course axiomatic, but, that staleness is a serious matter in this particular moment.

It's a cliché that six months can be a lifetime on the bleeding edge of tech.

This is the first time in my career that is more or less literally true.

Humans reason poorly with non-linear change.

This entire article is a demonstration of that.

by aaroninsf

2/19/2026 at 7:45:04 PM

Blunt opinion: Most devs are not that good and really only execute what they are told to do.

The threat of AI for devs, and the way to drastically improve productivity is there: keep the better devs who can think systemically, who can design solutions, who can solve issues themselves and give them all the AI help available, cut the rest.

by mytailorisrich

2/19/2026 at 8:24:39 PM

That’s how I feel too. When I was an architect at a ~300-person company, a big chunk of my job shifted to reviews, technical design docs, and guidance. I’m getting great results by feeding context like that into Claude Code, then reviewing and steering what it produces.

It really does feel like a multiplier on me and I understand things enough to get my hands dirty where Claude struggles.

Lately I’ve been wondering if that role evolves into a more hierarchical review system: senior engineers own independent modules end-to-end, and architects focus on integration, interfaces, and overall coherence. Honestly, the best parts of our product already worked like that even before AI.

by tyleo

2/19/2026 at 7:34:44 PM

I can see where productivity could be higher if all I did was type in programs to some spec, or bootstrap new apps all day - but that's like not the reality of "programming", at least for me over the past 25 years. Sorting through what to even make and interpreting "requirements" is what takes the most time.

by gedy

2/19/2026 at 7:51:55 PM

AI adoption has reduced productivity at my workplace, and by a noticeable amount!

by moralestapia

2/19/2026 at 8:16:14 PM

This will lead to natural selection. As AI becomes increasingly integrated into all areas, companies that manage it less effectively than others will face greater selection pressure.

by randomtoast

2/19/2026 at 8:42:13 PM

Or, AI will turn out to just not be that useful.

by emp17344

2/19/2026 at 9:06:48 PM

It's such a weird effect.

At a personal level, AI has made non-trivial improvements to my life. I can clearly see the value in there.

At an organizational level, it tends to get in the way much more than helping out. I do not yet see the value in there.

by moralestapia

2/19/2026 at 8:02:03 PM

That's expected for any new "low-code" solution du jour.

by otabdeveloper4

2/19/2026 at 7:35:24 PM

Yeah, industry has told them that devs aren't valuable and AI can do their job. Who TF has motivation after that?

by downrightmike

2/19/2026 at 8:45:08 PM

People getting paid >$400k TC

by havefunbesafe

2/19/2026 at 7:40:01 PM

No motivation? I'm sorry buddy but your ass is getting replaced by Claude Code in the next 3-6 weeks.

by co_king_5

2/19/2026 at 8:03:04 PM

[dead]

by ihsw