alt.hn

2/23/2026 at 3:31:17 PM

Anthropic Education: The AI Fluency Index

https://www.anthropic.com/research/AI-fluency-index

by armcat

2/23/2026 at 6:07:09 PM

> But we know that any person who uses AI is likely to improve at what they do.

Do we?

by mlpoknbji

2/23/2026 at 6:12:04 PM

I would suggest that any person who uses AI will atrophy their compositional skills unless they specifically take care to preserve those skills.

by co_king_5

2/23/2026 at 6:36:48 PM

As a student, I constantly worry about this. But everyone in my class is producing output at a pace I can't compete with without AI assistance.

by rishabhaiover

2/23/2026 at 6:44:34 PM

what class are you in that "producing output at a [rapid] pace" is relevant to the grade?

by Avshalom

2/23/2026 at 6:53:01 PM

pick any cs class

by rishabhaiover

2/23/2026 at 6:55:16 PM

I have a minor in CS, and no: producing the assignment by the deadline is important, but grades are not based on quantity of code versus classmates.

by Avshalom

2/23/2026 at 6:58:47 PM

I mean, maybe things have changed (I finished college about 20 years ago), but I don't remember producing large volumes of stuff as being a particularly important part of a CS degree.

by rsynnott

2/23/2026 at 7:25:42 PM

Between a challenging job market, ever-expanding frontiers to learn (AI, MLOps, parallel hardware), and an average mind like mine, a tool that increases throughput will be adopted by the masses whether you like it or not. Quality is not a concern for most; passing and getting an A is (most of my professors actively encourage using LLMs for reports, code generation, and presentations).

by rishabhaiover

2/23/2026 at 7:58:28 PM

It will be a very interesting experiment when your generation of computer science graduates enters the job market, to put it mildly.

by plastic-enjoyer

2/23/2026 at 9:29:04 PM

Individuals believe they act freely, but they are constrained and directed by historical forces beyond their awareness - Leo Tolstoy

by rishabhaiover

2/23/2026 at 7:02:54 PM

That was never a worry in any of my CS classes.

by lawn

2/23/2026 at 7:52:43 PM

My brother is a CS student and he is pretty much in the same boat.

by co_king_5

2/23/2026 at 7:30:06 PM

Copying AI slop isn’t producing output! It’s also not conducive to learning

by theappsecguy

2/23/2026 at 11:25:58 PM

As if you are such a genius that the models are of no use to you.

How can you not think that makes you sound like a complete moron?

by fatherwavelet

2/23/2026 at 6:24:33 PM

Yeah, and this seems to be supported by preliminary evidence on the impact of AI on things like retention and cognitive ability.

by Insanity

2/23/2026 at 10:52:26 PM

Not even just skills, motivation too.

by wasmainiac

2/23/2026 at 7:55:04 PM

I could have sworn there was research stating that the more you use these tools, the quicker your skills degrade, which honestly feels accurate to me and is why I've started reading more technical books again.

by shimman

2/23/2026 at 11:21:51 PM

I just don't understand how someone can have these models at their disposal and not learn anything.

The general lack of intellectual curiosity is just mind blowing to me.

by fatherwavelet

2/23/2026 at 8:01:05 PM

> I've started reading more technical books again

How's that working out for you in the context of working with AI tools? Do you feel like it's helping you make better use of them? Or keeping your mind sharp?

I've been considering getting some books on core topics I haven't (re)visited in a long time to see if not having to write as much code anymore instead gives me time to (re)learn more and accelerate.

by rkomorn

2/23/2026 at 6:31:43 PM

Not until large-N research is done without sponsorship, support, or veiled threats from AI companies.

At which point, if the evidence turns out to be negative, it will be considered invalid because no model less recent than November 2027 is worth using for anything. If the evidence turns out to be slightly positive, it will be hailed as the next educational paradigm shift and AI training will be part of unemployment settlements.

by dsr_

2/23/2026 at 7:32:02 PM

We DEEPLY do not.

That's not, IMO, a "skills go down" position. It's respecting that this is a bigger maybe than anyone in living memory has encountered.

by selridge

2/23/2026 at 7:55:21 PM

Clearly Anthropic believes this, but it would be nice to have a footnote pointing to research backing the claim.

by jimbokun

2/23/2026 at 9:25:40 PM

It is also not very convincing considering that, while the UI of Claude is not bad, it is also not exactly stellar.

by amelius

2/23/2026 at 6:34:46 PM

Let me add a single data point.

> is likely to improve at what they do

personally, my skills are not improving.

professionally, my output has increased

by throwaw12

2/23/2026 at 6:52:45 PM

My software development skillset has improved. I’m learning and stress testing new patterns that would have taken far longer pre-AI. I’m also working in new domains and tech stacks that would have taken me much longer to get up to speed on.

by mobattah

2/23/2026 at 7:00:16 PM

I would even say it's likely the opposite. My output as a programmer is now much higher than before, but I am losing my programming skills with each use of claude code.

by poszlem

2/23/2026 at 7:37:29 PM

People who use AI mindfully and actively can possibly improve.

The olden days of building skills and competencies are largely dying or dead, now that skills and competencies are changing faster than skills-and-competency training was ever designed to keep up with.

by j45

2/23/2026 at 7:52:16 PM

If things change fast, learning becomes even more important. And learning about the principles that don't change becomes most important of all.

by tovej

2/24/2026 at 12:32:31 AM

Yup, continuous learning. The principles that don't change are partly identified and partly still coming to the forefront.

by j45

2/23/2026 at 5:57:03 PM

So I guess the key takeaway is basically that the better Claude gets at producing polished output, the less users bother questioning it. They found that artifact conversations have lower rates of fact-checking and reasoning challenges across the board. That's kind of an uncomfortable loop for a company selling increasingly capable models.

by dmk

2/23/2026 at 7:30:16 PM

> the less users bother questioning it

This makes me think of checklists. We have decades of experience in uncountable areas showing that checklists reminding users to question the universe improve outcomes: Is the chemical mixture at the temperature indicated by the chart? Did you get confirmation from Air Traffic Control? Are you about to amputate the correct limb? Is this really the file you want to permanently erase?

Yet our human brains are usually primed to skip steps, take shortcuts, and see what we expect rather than what's really there. It's surprisingly hard to keep doing the work both consistently and to notice deviations.

> lower rates of fact-checking and reasoning challenges

Now here we are with LLMs, geared to produce a flood of superficially plausible output that strikes at our weak point: the ability to do intentional review in a deep and sustained way. We've automated the stuff that wasn't as hard, putting even greater pressure on the remaining bottleneck.

Rather than the old definition involving customer interaction and ads, I fear the new "attention economy" is going to be managing the scarce resource of human inspection and validation.

by Terr_

2/23/2026 at 7:59:54 PM

Sounds like having a strong checklist of steps to take for every pull request will be crucial for creating reliable and correct software when AIs write most of the code.

But the temptation to shortchange this step when it becomes the bottleneck for shipping code will be immense.

by jimbokun

2/23/2026 at 7:35:38 PM

> So I guess the key takeaway is basically that the better Claude gets at producing polished output, the less users bother questioning it.

This is exactly what I worry about when I use AI tools to generate code. Even if I check it, and it seems to work, it's easy to think, "oh, I'm done." However, I'll (often) later find obvious logical errors that make all of the code suspect. Most of the time, though, I don't bother.

I'm starting to group code in my head by code I've thoroughly thought about, and "suspect" code that, while it seems to work, is inherently not trustworthy.

by boplicity

2/23/2026 at 6:09:03 PM

I think we're still at the stage where model performance largely depends on:

- how many data sources it has access to

- the quality of your prompts

So, if prompting quality decreases, so does model performance.

by Florin_Andrei

2/23/2026 at 6:15:14 PM

Sure, but the study is saying something slightly different: it's not that people write bad prompts for artifacts. They actually write better ones (more specific, more examples, clearer goals, ...). They just stop evaluating the result. So input quality goes up, but quality control goes down.

by dmk

2/23/2026 at 8:02:29 PM

Seems like it’s impossible for output to be good if the prompt is bad. Unless the AI is ignoring the literal instructions and just guessing “what you really want” which would be bad in a different way.

by jimbokun

2/23/2026 at 8:34:29 PM

> On two occasions I have been asked, — "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" In one case a member of the Upper, and in the other a member of the Lower, House put this question. I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.

- Charles Babbage, https://archive.org/details/passagesfromlife03char/page/67/m...

EDIT: This is a new iteration of an old problem. Even GIGO [1] arguably predates computers and describes a lot of systemic problems. It does seem a lot more difficult to distinguish between a "garbage" or "good" prompt though. Perhaps this problem is just going to keep getting harder.

1. https://en.wikipedia.org/wiki/Garbage_in,_garbage_out

by AnIrishDuck

2/23/2026 at 6:32:11 PM

What does prompting quality even mean, empirically? I feel like the LLM providers could/should provide prompt scoring as some kind of metric and provide hints to users on ways they can improve (possibly including ways the LLM is specifically trained to act for a given prompt).

by candiddevmike

2/23/2026 at 6:33:18 PM

That would be a quality metric, and right now they are focused on quantity metrics.

by dsr_

2/23/2026 at 10:24:31 PM

> In line with our recent Economic Index, we find that the most common expression of AI fluency is augmentative—treating AI as a thought partner, rather than delegating work entirely. In fact, these conversations exhibit more than double the number of AI fluency behaviors than quick, back-and-forth chats.

> But we also find that when AI produces artifacts—including apps, code, documents, or interactive tools—users are less likely to question its reasoning (-3.1 percentage points) or identify missing context (-5.2pp). This aligns with related patterns we observed in our recent study on coding skills.

Well, sure. If you're asking the AI to produce artifacts directly, it's likely because you pre-judged yourself less competent to do that kind of analysis.

by zahlman

2/23/2026 at 6:40:00 PM

I feel like the authors make a logical inconsistency. They present the drop in "identify missing context" behavior in artifact conversations as potentially concerning, like people are thinking less critically. But their own data suggests a simpler explanation: artifact conversations show higher rates of upfront specification (clarifying goals +14.7pp, specifying format +14.5pp, providing examples +13.4pp). It's obvious that when you provide more context upfront, you end up with less missing context later. I'd be more sceptical about such research.

by kseniamorph

2/23/2026 at 8:22:30 PM

This is a highly circular method of evaluation. It correlates "fluency behaviors" with longer conversations and more back and forth.

What it notably does not correlate any of these behaviors with is external value or utility.

It is entirely possible that those people who are getting the most value out of LLMs are the ones with shorter interactions, and that those who engage in lengthier interactions are distracting themselves, wasting time, or chasing rabbit trails (the equivalent of falling in a wiki-hole, at the most charitable.)

I can't prove that either -- but this data doesn't weigh in one way or the other. It only confirms that people who are chatty with their LLMs are chatty with their LLMs.

In my own case, I find the longer I "chat" with the LLM the more likely I am to end up with a false belief, a bad strategy, or some other rabbit hole. 90% of the value (in my personal experience) is in the initial prompt, perhaps with 1-2 clarifying follow-ups.

by lukev

2/23/2026 at 5:51:48 PM

I’m not alone in finding this at odds with the claims of the product, right?

Claude is meant to be so clever it can replace all white collar work in the next n-years, but also “you’re not using it right?” Which one is it?

by bargainbin

2/23/2026 at 6:34:16 PM

Which one will convince you to buy more Claude? Please answer honestly, it's for the sake of profits.

by dsr_

2/23/2026 at 7:00:40 PM

Anthropic in particular seem to be in a weird place where on the one hand they fund some real research, which is often not all roses and sunshine for them, but on the other hand, like all AI companies, they feel the need to make absurdly over-the-top claims about what's coming up Real Soon Now(TM).

by rsynnott

2/23/2026 at 8:04:33 PM

Anthropic is a weird company where the CEO at times almost admits they are probably building the Torment Nexus, yet still feels the need to do it anyway... because someone else might do it first?

by jimbokun

2/23/2026 at 6:03:41 PM

I'm not quite convinced of the maximalist claims, but these two aren't incompatible. Every time we talk about a company being "mismanaged" by e.g. a private equity buyout, what we mean is that the owners had access to a large volume of high quality white collar work but couldn't figure out how to use it right.

by SpicyLemonZest

2/23/2026 at 4:21:29 PM

You could arrive at the essence of this by just having read and internalized Carl Sagan's The Demon-Haunted World. Especially the Baloney Detection Kit.

In my experience good prompting is mostly just good thinking.

by Kye

2/23/2026 at 7:17:02 PM

And having the experience and judgment to ask the right thing.

by esafak

2/23/2026 at 7:41:02 PM

And being willing to be wrong and to be misled; finding ways to contain that or build forcing functions against it.

In a strange way that's exciting, because it forces me to learn. And sometimes forces me to confront whether stuff I had was domain knowledge or portable as experience.

by selridge

2/23/2026 at 6:33:07 PM

[dead]

by sdf2erf

2/23/2026 at 5:45:38 PM

To the extent that this should be a thing, there are very few people I would want doing it less than the company who has repeatedly been caught lying about its product's achievements. Anthropic should not be taken seriously after their track record.

by bigstrat2003

2/23/2026 at 7:53:26 PM

[dead]

by MarcLore

2/23/2026 at 5:52:54 PM

Honestly, to use LLMs properly all you need to know is that it's a next-word (or action) prediction model, and like all such models, increased entropy hurts it. Try to reduce entropy to get better results. The rest is just sugarcoated nonsense. To use LLMs properly you need a physics class.
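To make the entropy point concrete, here's a minimal sketch (the probability numbers are made up for illustration): a vague prompt leaves the model's next-token distribution flat, i.e. high Shannon entropy, while a specific prompt concentrates probability on a few continuations, lowering it.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy (in bits) of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Vague prompt: many continuations are roughly equally likely.
vague = [0.25, 0.25, 0.25, 0.25]

# Specific prompt: probability mass concentrates on one continuation.
specific = [0.85, 0.10, 0.03, 0.02]

print(shannon_entropy(vague))     # 2.0 bits (maximum for 4 outcomes)
print(shannon_entropy(specific))  # roughly 0.8 bits
```

"Reduce entropy" in the comment's sense amounts to writing prompts that push the model toward the second distribution.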

by sarkarghya

2/23/2026 at 6:15:27 PM

Which class? Or which subjects?

by Barbing

2/23/2026 at 6:35:11 PM

And then some alignment, prompting structure, and task decomposition.

by rishabhaiover

2/23/2026 at 7:58:41 PM

And praying that your desired output was embedded into the training data that was used to generate the model.

by arcanemachiner