6/30/2026 at 7:24:14 PM
I'm struggling to understand why I'd ever use this instead of just using a lower effort level for Opus, given that on many of the benchmarks listed the cost per task rises above Opus at anything higher than medium effort.
The only thing I can think of is when someone is out of Opus credits. Of course there are API billing use cases, but I'd probably still just use Opus on low.
by Jcampuzano2
6/30/2026 at 7:42:31 PM
More and more I find myself trying to stop Opus from doing something stupid, and at every turn I need to tell it to stop overcomplicating things.
I think the models are being optimized for wealth extraction from users and companies, instead of solving problems.
I don't know why Opus would try to create an entire library when I told it specifically to do something simple that would take 2-3 lines of Python.
by itopaloglu83
7/1/2026 at 5:38:52 AM
"I think the models are being optimized for wealth extraction from users and companies, instead of solving problems."
YES! They introduced the new tokenizer to increase token generation by up to 33%.
On top of this, Anthropic is generating almost twice as much revenue per paid user as OpenAI, whilst their subscriptions have lower usage limits than OpenAI's:
by scrollop
7/1/2026 at 9:10:22 AM
This slot machine has access to your bank account and can decide how much to play on its own!
by Traubenfuchs
6/30/2026 at 9:22:30 PM
> More and more I find myself trying to stop Opus from doing something stupid, and at every turn I need to tell it to stop overcomplicating things
Yeah, those are my thoughts as well. I feel it’s great for benchmarks and some tasks, while in others it tries to spend as many tokens as possible, overcomplicates the task, and needs a second or third round of steering, which costs money. At the scale Anthropic operates, I bet it’s a huge amount of extra money just to make sure their model works.
by __natty__
6/30/2026 at 10:49:49 PM
It’s really weird when you go to one of the open models and suddenly the same context window stretches nearly 3-4 times as long.
by Aeolun
7/1/2026 at 9:24:45 AM
> I think the models are being optimized for wealth extraction from users and companies, instead of solving problems.
I don't think so. I'd expect that in a market with high vendor lock-in, but that's not the case here. The market is extremely competitive and switching costs are near zero. Anthropic can't afford to pull shit like this and sacrifice quality.
by ngruhn
7/1/2026 at 1:14:42 PM
If you think there is no lock-in, you don't have LLM-based processes. There may be no lock-in for coding if you enforce decent rules (though some ambiguous docs can still be interpreted differently), but in any non-trivial pipeline/system it shows: these models are not stupid, but each has some quirks. Sometimes, for some reason, one will ignore an instruction that all other models have no trouble following. These things accumulate.
Plus there's subjective stuff even for coding, with people learning how to deal with each model. Even on HN you can already see Claude/Codex camps, each strongly convinced that one is better than the other.
by comboy
7/1/2026 at 1:05:01 PM
My employer just finalized a contract with Anthropic, for enterprise Claude Code use. Which means that unless there is a _major_ downgrade in service quality, we are now locked in for the next few years (but at least for a year, although vendor contracts are renegotiated less frequently).
Just checked the dashboard, and we seem to have the exact same $200 credit as others, enterprise or not. Token inflation affects us just like everyone else.
It feels a bit like buying the same box of chocolates every day, but the size / weight of the box is shrinking... the price remains unchanged!
by haspok
7/1/2026 at 11:35:23 AM
The disconnect between the reality of this particular realm of products and the consumer sentiment around it seems to be one of the most dramatic and widespread I’ve personally ever seen.
by notnaut
7/1/2026 at 1:52:37 PM
You'd also expect that in a market where the players think they're in a bubble.
by somenameforme
7/1/2026 at 2:09:05 PM
> Anthropic can't afford to pull shit like this and sacrifice quality.
And yet, the Java language exists.
For market share, ANTHROP\C needs to optimize for the vast mediocrity that is mid-bell-curve users and enterprises.
This adaptation tends to come with significant drag for firms or teams at the right end of the bell curve.
Unless ANTHROP\C has a separate objective function per cohort and ensures that it doesn't regress, improving results for the emerging middle will nerf the tools from the point of view of those with high in-domain expertise.
by Terretta
7/1/2026 at 12:38:28 AM
Yeah. Mine really likes to read excess code. I'll ask it questions like "If I move all these three ETL jobs into a subfolder will it break anything?" It'll start with giving me the simple answer but then continue on to consider another question and realize it requires reading my entire other repo that handles all of my cloud's infrastructure. And it'll proceed to read through tens of thousands of lines of Terraform.
by indoordin0saur
7/1/2026 at 1:33:11 PM
Then it tells you nothing will break, spends a bunch of tokens on the migration, hits a wall, has an oops moment, tells you it made a mistake by assuming a key fact instead of verifying it, and presents you with the option to roll everything back or rewrite the rest of your system.
by anygivnthursday
7/1/2026 at 2:13:52 PM
By contrast, I keep trying to find an incantation that can get it to AT LEAST read the comments surrounding a code target before a grep replace if it isn't going to read more than the first 60 lines of any doc. (Btw, the receiver of a -tail 60 doesn't seem to know it was cut, it insists it "read" the whole file.)
It seems to me ANTHROP\C have harnessed hard for not spending tokens to bring content into context. I wish they'd left us a "LEROY_JENKINS" flag: read and think before you code. In Claude Code anyway, it appears to default to:
export CLAUDE_CODE_LEROY_JENKINS=true
by Terretta
6/30/2026 at 8:10:24 PM
> I don't know why Opus would try to create an entire library when I told it specifically to do something simple that would take 2-3 lines of Python.
Because it reasons in one direction. First it encounters some kind of issue with the 2-3 lines of Python that might make them not work, and then it goes on to plan B, which is making a library, but it doesn't circle back and compare the effort of making the library to working around whatever might make the 2-3 lines not work. Except sometimes it does, because it's inscrutable.
by post-it
7/2/2026 at 12:37:25 AM
So, as the quip goes, junior level reasoning.
by ethbr1
7/1/2026 at 12:45:29 PM
My experience with Opus in the last few weeks is the opposite. I have the feeling Opus got smarter since they released and blocked Fable. Maybe they got more compute available since a) they finished training Mythos/Fable and b) couldn't provide inference for it?
by benny_s
7/1/2026 at 1:29:42 PM
Interesting, I have the exact opposite experience, with Opus 4.8 being nearly unusably dumb in the past couple of days. My guess is that the new Sonnet release announcement may have overloaded their systems again, but let's see in a few days. Right now it hurts my workflow more than it helps.
by anygivnthursday
7/1/2026 at 2:14:55 PM
Agree.
And this type of observation appears highly related to how people hold it.
by Terretta
7/1/2026 at 9:09:28 AM
It's really bad when you let Opus do investigations on broken Java or infrastructure stuff. It starts decompiling .jar files, sometimes multiple versions of the same dependency, reading every single Kubernetes/Terraform file and loading all the logs and info kubectl offers.
by Traubenfuchs
7/1/2026 at 1:35:32 PM
Is this a new thing? I only noticed recently that instead of looking at the sources it now prefers to decompile and read Java bytecode for some reason.
by anygivnthursday
6/30/2026 at 10:46:14 PM
[dead]
by MagicMoonlight
7/1/2026 at 12:15:02 AM
[flagged]
by 3ffs
6/30/2026 at 7:29:43 PM
Older Opus models will likely get deprecated, and then over time this becomes the cheapest model. That is how prices are currently increased.
by nicce
6/30/2026 at 9:35:21 PM
Yeah... Sonnet becomes the new cheap model, and some Fable class model becomes the more expensive/better one.
by ChrisLTD
7/1/2026 at 2:52:32 AM
Wat. Price/perf has been going down massively over the last few years.
by theptip
7/1/2026 at 10:22:40 AM
Because they still haven't fully captured the market for Agentic Development.
by darkwater
6/30/2026 at 8:47:53 PM
Looking at some of the agentic coding benchmarks on the system card[0], pages 117-118, it seems that running it at low outperforms Sonnet 4.6 at any level, and is a good deal cheaper as well. So on low it could be a good workhorse for an Opus-planned task.
by phainopepla2
7/1/2026 at 4:06:03 PM
That is certainly an improvement then. Sonnet 4.6 is a great everyday agent for the limited Pro plan, but it’s not much better than M3 or Kimi 2.7, both significantly cheaper models.
by port11
6/30/2026 at 7:35:07 PM
Speed is a huge reason. Sometimes you just need some simple tasks to get done fast, and waiting 30-60 seconds for Opus to even start thinking can really slow things down.
by enraged_camel
6/30/2026 at 7:52:10 PM
Opus with low reasoning effort would be faster than Sonnet with high reasoning. So that won't exactly help. I think it just comes down to what those models are optimized for.
by humanymous
6/30/2026 at 10:39:49 PM
Specific task-based benchmarks don't reflect a lot of day-to-day agentic use cases in my experience. If you are working on a series of discrete tasks and can clear context after each one and move to the next, you might get that sort of efficiency from Opus at low effort. I often find that when working through a real problem, iterating and discovering, context length can creep up, and that is where Opus tends to get expensive.
by c0m47053
6/30/2026 at 7:57:54 PM
Maybe it's not for you? I don't pay, so I can't even use Opus... So this is an upgrade over Sonnet 4.6 for me.
by SirMaster
7/1/2026 at 11:58:42 AM
Is there a router or wrapper that provides a real-time cost estimation for alternative settings? Obviously, you can't predict exact output tokens without running the inference, but a tool that calculates the exact input cost across models and applies a historical average for the output tokens could be useful. Like, you run a task on Sonnet, and it estimates: "Based on your input tokens and a 1:1 output ratio, this would have cost $X on Opus at a low effort level."
by licjon
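For what it's worth, the arithmetic behind that kind of estimate is small. Below is a rough sketch, not an existing tool: the model keys and the per-million-token prices are placeholders made up for illustration (check the current price list before trusting any number), and `output_ratio` stands in for the "historical average" of output tokens per input token.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Price:
    """USD per million tokens. Values used below are placeholders, not real prices."""
    input_per_mtok: float
    output_per_mtok: float


# Hypothetical price table; keys and numbers are assumptions for illustration only.
PRICES: Dict[str, Price] = {
    "sonnet": Price(input_per_mtok=3.0, output_per_mtok=15.0),
    "opus": Price(input_per_mtok=15.0, output_per_mtok=75.0),
}


def estimate_cost(model: str, input_tokens: int, output_ratio: float = 1.0) -> float:
    """Estimate the cost of one task in USD.

    Output tokens are not known in advance, so they are approximated as
    input_tokens * output_ratio (e.g. a historical average for your workload).
    """
    price = PRICES[model]
    output_tokens = input_tokens * output_ratio
    total = input_tokens * price.input_per_mtok + output_tokens * price.output_per_mtok
    return total / 1_000_000


def compare(input_tokens: int, output_ratio: float = 1.0) -> Dict[str, float]:
    """Return the estimated cost of the same task on every model in PRICES."""
    return {model: estimate_cost(model, input_tokens, output_ratio) for model in PRICES}
```

Calling `compare(input_tokens, output_ratio)` after a run gives the "this would have cost $X on the other model" figure. It ignores effort levels, caching, and tokenizer differences between models, all of which would shift the real number.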
7/1/2026 at 2:12:35 PM
Not sure of any out-of-the-box tool. But Anthropic has a token count API which gives a close estimate of the input tokens for messages [1].
So this API can be used in a UserPromptSubmit hook [2] in the harness to get the token count for any model, calculate the cost, and compare.
[1] https://platform.claude.com/docs/en/build-with-claude/token-...
by annjose
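A minimal sketch of the hook side of that idea, with assumptions stated up front: it assumes the hook receives a JSON object containing a `prompt` field, and the token counter is injected as a plain function so the example runs offline. `rough_counter` is a made-up stand-in (about 4 characters per token), not the token count API from [1]; a real version would call that API once per model.

```python
import json
from typing import Callable, Dict, Iterable

# (model, text) -> input token count. Injected so a real API-backed counter
# can replace the offline stand-in without changing the rest of the code.
TokenCounter = Callable[[str, str], int]


def prompt_from_payload(raw_json: str) -> str:
    """Extract the prompt text from the hook's JSON payload.

    Assumption: the payload is a JSON object with a "prompt" field.
    A missing field yields an empty string.
    """
    payload = json.loads(raw_json)
    return str(payload.get("prompt", ""))


def count_for_models(
    prompt: str, models: Iterable[str], counter: TokenCounter
) -> Dict[str, int]:
    """Count input tokens for the same prompt under each model."""
    return {model: counter(model, prompt) for model in models}


def rough_counter(model: str, text: str) -> int:
    """Offline stand-in: roughly 4 characters per token, identical for every model."""
    return max(1, len(text) // 4)
```

In an actual hook script you would pass `sys.stdin.read()` to `prompt_from_payload`, swap `rough_counter` for a function that calls the token count endpoint, and multiply the resulting counts by each model's input price.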
7/1/2026 at 2:51:49 AM
Are we reading the same chart? They have Sonnet <= high as Pareto dominant on $/perf.
You have to test each task obviously but it is not a bad model on its face.
by theptip
7/1/2026 at 9:48:44 AM
They have updated it.
by frozeus
7/2/2026 at 1:41:13 AM
Ha! So we were not looking at the same chart. That makes more sense.
> Anthropic did post an official explanation, stating the original chart used a "simpler methodology" that "underestimated Sonnet 5's performance." The new chart supposedly uses their "standard methodology."
Oops!
by theptip
7/1/2026 at 10:38:35 AM
Did Anthropic have Opus 4.8 and Sonnet 5 switched in the Agentic Search chart at first?
by LUmBULtERA
7/1/2026 at 11:21:55 AM
No, and the original had everything more expensive. There's a comparison here:
https://www.reddit.com/r/ClaudeAI/comments/1ukgqwr/looks_lik...
The explanation Anthropic gave for the update doesn't address why the x-axis needed to range up to $50 previously and only $10 now. In any case the pass rates are also lower.
Probably the same kind of difference people notice when they say models become "nerfed".
by fluidcruft
7/1/2026 at 1:53:10 PM
Huh, super interesting. Thanks!
by LUmBULtERA
7/1/2026 at 12:49:23 PM
I would use Sonnet instead of Opus because it's faster. Isn't it? It's a smaller model.
by hollownobody
7/1/2026 at 11:28:09 AM
I concur. I already use Opus 4.8 for almost all my tasks and this gives me almost no reason to try Sonnet 5.
by southforgeai
7/1/2026 at 11:18:07 AM
If you are out of Opus credits, you are out of all model credits.
by fluidcruft