5/12/2026 at 4:58:14 PM
The fact that management signed off on measuring AI use through token usage shows how incompetent management really is, even in allegedly technical companies like Amazon. Tokenmaxxing was an entirely expected and rational response. IOW: you measure employees in stupid ways, you're going to get stupid behaviour as a consequence.
by i7l
5/12/2026 at 5:31:32 PM
One argument I have heard in favour of this is that management knew this would be a side effect, but that it's more important to have people engage with AI as much as possible simply to explore what is actually possible. You are effectively knowingly wasting money in the expectation that you might learn something useful that will be more valuable in the long run.
by this_user
5/12/2026 at 9:17:02 PM
If companies are suddenly willing to spend money on letting their staff experiment, why not let them experiment with what they want to? They probably know more about technology than you do, otherwise you wouldn't need them.
by oytis
5/12/2026 at 6:42:24 PM
In this instance, it seems like Amazon employees are wasting money exploring ways to waste money.
by aerodexis
5/12/2026 at 6:08:28 PM
My questions for that approach are: Why treat AI as a special technology that needs enterprise-scale exploration to come up with a useful application? And why not take the alternative approach of identifying the subset of people who have indeed found solid uses and spread their best practices around?
The top-down approach to encouraging (mandating?) AI usage strikes me as infantilizing to the workers, who are perfectly capable of choosing which tools they use and when.
by the_snooze
5/12/2026 at 6:18:53 PM
Human nature? In the early nineties, it was common for experienced electrical engineers to keep on using schematic-entry digital design and look down on RTL and synthesis tools, despite the fact that the latter was already way more productive. At some point, management had to put their foot down and force everyone to switch to using synthesis.
It's not unreasonable to assume that many people are set in their ways and unwilling to change their behavior without a bit of a push.
by tverbeure
5/12/2026 at 9:26:42 PM
Alpha 21064, 1992, was using domino logic [1].
[1] https://en.wikipedia.org/wiki/Domino_logic
There were no synthesis algorithms that could map VHDL or Verilog designs onto domino logic elements at the time. I believe most of the work in the synthesis-to-domino-logic area was done at the beginning of the current century.
So DEC's engineers and, I think, Intel's engineers were doing work using schematics well into the 21st century.
by thesz
5/13/2026 at 2:54:01 AM
In the early nineties, standard cell design was already used by everyone except those who needed clock speed at all costs.
by tverbeure
5/12/2026 at 6:27:45 PM
I guess the only difference between this and your example is the concrete efficiency gain from RTL and synthesis tools versus dubious applications of AI. I do agree with the second point about pushing people to explore new ways of doing things though.
by nophunphil
5/12/2026 at 6:55:05 PM
> dubious applications of AI
Leaving aside the ethical aspects of using AI (not because they're not valid, but because they're off topic for this discussion), in my line of work, the capabilities and productivity improvement of AI are staggering. Most of it is not writing the new code, which is but a small part of chip design, but everything else.
I can't give a concrete work example, but here is an experiment that I ran a month ago. https://tomverbeure.github.io/2026/04/12/AMIQ-License-Key-Ge.... If it can do that, it's not hard to imagine similar use cases related to root-causing complex simulation failures. It is frighteningly good at that.
by tverbeure
5/12/2026 at 7:19:08 PM
> use cases related to root causing complex simulation failures.
That's a pretty interesting use case. I assume this is for RTL simulation given the thread, but how do you connect the output of the simulator to the AI?
by ua709
5/12/2026 at 7:50:59 PM
For a small case, a colleague took a screenshot of waves in the waveform viewer and pasted it into the AI tool. It worked.
But for large cases, use tools to extract all interfaces from the waveform file and save them as a text file, or add $display statements in the Verilog itself to dump the transactions. A SOTA LLM will eat it up. You point it to the RTL and a log file with hundreds of thousands of lines, give it a few lines explaining how it is supposed to behave, and just tell it "My simulation is hanging. Figure out why." Wait 15 minutes and it will tell you why it hangs and which line to change in your code to fix it.
I've done the experiment after the fact: I had spent ~3 days fixing 3 complicated bugs. I then rolled back the code and told it "Here is the spec. Find all the bugs in this code". It found all 3 bugs in around 30 minutes. That's when I realized that things won't be the same anymore. (And don't get me wrong: I love debugging simulations.)
by tverbeure
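tverbeure's extract-and-prompt workflow above can be sketched roughly like this. It is a minimal illustration, not the actual tooling: the log format, the regex, and both helper functions are invented for the example.

```python
import re

# Invented log format: "<time> ns: <interface> <details>", e.g. produced
# by $display statements added to the Verilog testbench.
TRANSACTION_RE = re.compile(r"^\s*(\d+)\s*ns:\s*(\w+)\s+(.*)$")

def extract_transactions(log_text, keep_last=200):
    """Pull transaction lines out of a huge simulation log, keeping only
    the tail leading up to the hang so the prompt stays manageable."""
    txns = []
    for line in log_text.splitlines():
        m = TRANSACTION_RE.match(line)
        if m:
            time_ns, iface, detail = m.groups()
            txns.append((int(time_ns), iface, detail))
    return txns[-keep_last:]

def build_prompt(spec_note, transactions):
    """Assemble a prompt: a short spec note, the transaction tail, and
    the direct question from the comment above."""
    lines = [f"{t} ns {iface}: {detail}" for t, iface, detail in transactions]
    return (
        f"Spec: {spec_note}\n"
        "Transaction log tail:\n" + "\n".join(lines) + "\n"
        "My simulation is hanging. Figure out why."
    )

log = "10 ns: axi_aw addr=0x10\nINFO: elaboration done\n20 ns: axi_w data=0xdead\n"
print(build_prompt("AXI4 write: aw, w, then b response", extract_transactions(log)))
```

The real value, per the comment, is in what the model does with the RTL plus this log; the sketch only shows the cheap preprocessing step.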
5/12/2026 at 8:07:31 PM
This is why I asked:
> And why not take the alternative approach of identifying the subset of people who have indeed found solid uses and spread their best practices around?
A bottom-up approach has a far better chance of finding those particularly good use cases, and if you lean on the people who found those fits, they're more persuasive than top-down edicts. They actually know what they're talking about. If the point is to leverage AI for better work outcomes, someone with your experience is far more valuable than "here's a dashboard, make the number go up," which seems to be what's going on at Amazon.
by the_snooze
5/12/2026 at 8:30:00 PM
How do you know up front who will find the best use cases? Both approaches can work.
by tverbeure
5/12/2026 at 9:50:10 PM
I'd bet my life savings that the person who is forced to use a tool by top-down edict is less likely to find a valuable use case than the person who is sincerely curious about said tool.
by judahmeek
5/12/2026 at 10:17:01 PM
Your mistake is thinking that everyone who doesn't use it is avoiding it because they're not curious about it.
by tverbeure
5/12/2026 at 9:33:10 PM
Have you tried changing your HDL to something more modern like Bluespec SystemVerilog or, god forbid, anything embedded in Haskell or Scala?
I read that BSV source code is about three times shorter than a similar design in Verilog and also has three times lower defect density (defects per significant line of code). So just by changing the HDL from Verilog to BSV one can have nine (9) times fewer defects in the design.
by thesz
5/12/2026 at 10:19:18 PM
BSV won't help with corner cases you didn't think about. (I use SpinalHDL/Scala for all my hobby projects, BTW, and yes, I tend to make fewer mistakes.)
by tverbeure
5/12/2026 at 7:59:51 PM
SOTA = State of the Art? Like, say, Claude Opus 4.5? I actually want to try this out.
by ua709
5/12/2026 at 8:08:08 PM
I think I used Opus 4.6 1M.
by tverbeure
5/12/2026 at 8:10:57 PM
Thanks! I'm going to give this a shot on a nasty simulation I'm presently working on... :)
by ua709
5/12/2026 at 7:14:17 PM
It is completely unreasonable to assume that. Tech people are so hungry for productivity gains that they regularly will defy management forbidding them from using a tool, because the tool is so good they feel they have to have it.
If LLMs truly are as good as their proponents say, engineers will use them even if management outright forbade it. The fact that people aren't using them, and have to be forced, is extremely strong evidence that they are not in fact that useful.
by bigstrat2003
5/12/2026 at 7:53:10 PM
> extremely strong evidence that they are not in fact that useful
See my other reply in this subthread. For my line of work, they are in fact ridiculously useful.
by tverbeure
5/12/2026 at 8:01:16 PM
> It's not unreasonable to assume that many people are set in their ways and unwilling to change their behavior without a bit of a push.
You include those people only in a second round, along with guidelines and recommendations on how to use it effectively.
by watwut
5/12/2026 at 8:18:29 PM
What if those people are some of the most experienced ones, who can see use cases, and flaws, that more junior people won't?
by tverbeure
5/12/2026 at 9:58:17 PM
People who have to be forced to try it are unsuitable for the exploratory "find what it is useful for" task, regardless of seniority.
Also, we are talking about large companies here. There will be plenty of more suitable seniors.
by watwut
5/13/2026 at 9:47:35 AM
Google's 20% project time was a good thing; sadly they don't even seem to do it anymore. For the bulk of corporate workers, this brief period where they get to play an AI token game is the only break from generating TPS reports all day long.
by blitzar
5/12/2026 at 6:17:04 PM
A tool so good, the workers need to be forced to use it.
by jjk7
5/12/2026 at 9:40:16 PM
Workers are good at their job using tools they know because they had years/decades to hone that craft.
The new tool might make a lot of that experience obsolete. Also, some people that were good with old tools might be great with the new tool. Some may not.
Overall, I don't think it's a bad idea to burn some tokens (and money) to let people experiment.
by creative_name3
5/12/2026 at 8:35:05 PM
Exactly. That's the problem ICs don't want to admit.
Managing a lot of people at scale is messy and you have to use crude solutions. It's impossible to know everything that's going on.
If you were a manager you wouldn't do any better. Out of the crooked timber of humanity, no straight thing was ever made.
by asdfman123
5/13/2026 at 12:40:33 AM
I think that's a convenient excuse for managers at the top to not have to deal with their own subpar middle and lower managers...
by duxup
5/12/2026 at 9:46:08 PM
Your argument that bad processes can't possibly be improved is contradicted by all of recorded history.
by judahmeek
5/12/2026 at 11:01:50 PM
That's because you've misconstrued my argument. My argument is that everything is a tradeoff and while management can be MORE conscious, there's a certain level of bullshit that's inevitable.
But more specifically, ICs tend to want to say "if you just let me do what I know is right it would be fine." That's a trade-off, too, though. That solution means a lot of people will be messing around due to no accountability.
by asdfman123
5/13/2026 at 10:32:57 AM
If the only accountability here is token input/output, then after automating that, employees would be messing around with no accountability either way.
If you instead set the actual goal you want to achieve as a manager, and then trust employees to allocate their focus accordingly, you will absolutely have some people faffing off (that likely can't be avoided), but at least those who don't will optimize toward what works, according to their actual expertise.
by croon
5/12/2026 at 7:15:12 PM
All so that they can lose this accumulated knowledge during the next round of layoffs.
by newswangerd
5/13/2026 at 3:36:12 AM
This is induced demand for AI to justify building more datacenters, which will bring AI costs down, and the idea is that will eventually bring demand up organically.
by simulator5g
5/12/2026 at 6:24:05 PM
> engage with AI as much as possible simply to explore what is actually possible
"Research" isn't part of my job title. If you don't know what's possible then why are you deploying it? You should be telling _me_ what's possible. I mean, you _paid_ for it, how can you possibly not know what you were getting?
> in the expectation that you might learn something useful that will be more valuable in the long run.
"I'll take `what even are profits?' for $200, Alex."
by themafia
5/12/2026 at 6:56:31 PM
Hear hear.
An overly generous steelman, in my opinion as well. Have 10% of your employees focus on finding ways to properly leverage the new technology; don't pressure 100% of your employees with bullshit metrics.
by datsci_est_2015
5/12/2026 at 7:09:42 PM
Are the people engaging, though, or are they telling the AI "go do some busywork" and then minimizing that window and getting on with their job?
by red_admiral
5/12/2026 at 6:56:45 PM
No, it's literally because some dumb manager read a blog where an influencer said that you ain't a real AI native and ain't worth shit unless your developers are spending $XXXX on tokens each day.
It's that simple.
(Never mind that these bloggers are just writing ad copy for cloud providers.)
by otabdeveloper4
5/12/2026 at 7:49:27 PM
That still sounds like a dumb strategy. Or, more likely, post hoc rationalization.
If you reward me for wasting tokens and punish me for not wasting them, I will maximally waste them and won't "explore how to make them useful". The latter wastes fewer tokens, and that is punished.
by watwut
5/12/2026 at 7:53:53 PM
So my assessment of the current mania is that it's basically a management variant of Pascal's wager.
If you as a "leader" refuse to go along with the crowd and you're right, then after the dust settles you look like someone who guessed right. Oh, and now we're in a recession, so you are probably having a bad time regardless. You maybe get one promotion, congratulations.
If you refuse to go along with the crowd and you’re wrong, you look like a Luddite, you probably got fired at some point along the way and your judgement reputation is hurt.
If you do go along with the crowd and the crowd is wrong, you are just in the same boat as everyone else. You are probably about the same as if you went against the crowd and you were right, possibly even better, because it can take a while to be proven right and you could be hurt in the middle.
So, I think, once something like this picks up enough steam, it’s just logical on a per individual basis for everyone to go along with it, regardless of how they feel about it internally.
by pfannkuchen
5/12/2026 at 9:43:04 PM
Yes, leaders can & should be expected to devise experiments to determine which processes might possibly be optimized through AI assistance.
But doing so properly requires expending a serious amount of cognitive effort & agile methodology, which is the exact opposite of what Amazon's management has demonstrated here.
by judahmeek
5/12/2026 at 11:14:16 PM
Well then the solution is to hire more managers, or pay more competitive salaries to get top talent.
by s1artibartfast
5/13/2026 at 1:12:26 AM
Or, you know, you could argue employee productivity should be measured in an evidence-based way.
by morpheos137
5/12/2026 at 5:10:27 PM
Depends on what they're trying to incentivise.
It's quite possible they aren't trying to measure performance but are literally just trying to increase token consumption to feed the bubble and hype.
Plus, pressured employees may find new, unique use cases for AI.
It's like if your goal is inflation: you give out tons of money, and as long as it's spent, you achieve your goal.
by wordpad
5/12/2026 at 5:31:13 PM
I would guess they are trying to maximize training data.
by cousinbryce
5/12/2026 at 5:47:25 PM
If I was being rewarded for using more tokens, I would feed LLM output back into the model. That's probably not very useful training data.
by Zak
5/12/2026 at 7:30:39 PM
I personally know two people who are doing exactly that after a mandate rolled out at their work. The measurement is "tokens spent", and since they weren't finding many cases that required a lot of tokens, they simply started to run agent loops feeding each other.
Absurdly wasteful, but Goodhart's Law almost never fails.
by piva00
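The failure mode piva00 describes is mechanical enough to show in a few lines. A toy sketch with invented numbers, not a description of any real team: the proxy metric and the outcome it was meant to stand in for rank the two hypothetical work styles in opposite orders.

```python
# Two hypothetical work styles under a "tokens spent" mandate.
workers = {
    "targeted_use": {"tokens": 50_000, "problems_solved": 12},
    "agent_loop":   {"tokens": 5_000_000, "problems_solved": 0},
}

def rank_by(metric):
    """Rank workers by a metric, highest first."""
    return sorted(workers, key=lambda w: workers[w][metric], reverse=True)

print(rank_by("tokens"))           # the proxy rewards the self-feeding loop
print(rank_by("problems_solved"))  # the actual goal ranks them the other way
```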
5/12/2026 at 5:10:23 PM
Management loves numbers because they're the only things you can objectively compare as X > Y.
It makes for pretty charts, extrapolations, and projections.
It doesn't matter if the numbers are not particularly correct. As long as the data-gathering step can be justified, it'll do. Though bonus points if making the number bigger is a good thing (vs. tracking something like the number of sev 1 issues).
by koolba
5/12/2026 at 6:28:11 PM
Sounds a bit like a McNamara Fallacy [0] of over-prioritizing numeric measures, which--when taken "too literally"--becomes:
> The first step is to measure whatever can be easily measured. This is okay as far as it goes.
> The second step is to disregard that which can't be easily measured or give it an arbitrary quantitative value. This is artificial and misleading.
> The third step is to presume that what can't be measured easily really isn't very important. This is blindness.
> The fourth step is to say that what can't be easily measured really doesn't exist. This is suicide.
— Daniel Yankelovich, "The New Odds"
by Terr_
5/12/2026 at 5:11:15 PM
Yes, but also because management is largely unqualified to be managing the stuff they are hired for. So they regress to numbers because they otherwise cannot participate in anything technical.
by delfinom
5/13/2026 at 5:21:16 AM
If this were the end of the story, that would be a correct interpretation of the situation.
At Amazon, something like this is likely a closely watched experiment. They knew it would incentivise waste. But they don't know what the other effects will end up being. Nobody knows -- this thread is full of loose speculation. So Amazon runs the experiment and collects the data.
----
The annoying thing about goals and incentives is that they can either be phrased in terms of input metrics (behaviours within our control) or output metrics (the outcomes we want). Input metrics are bad because they lead to skewed incentives and gaming the metrics. Output metrics are bad because they're largely affected by chance and external circumstances. (This indeed means a goal cannot be SMART on its own, because A and R are typically in tension.)
Amazon knows this. Their WBR structure is essentially about trying to set goals and targets for input metrics, and then carefully observing how input metrics correlate with output metrics. They're using a semi-scientific process to tease out the causal structure of their business. I would assume this token target is followed very closely to learn exactly what its effects are on output metrics that drive revenue and cost.
For more on this, I think the best public writing is Carr's Working Backwards, and Chin has written about it on Commoncog too.
by kqr
5/13/2026 at 5:34:37 AM
I don't think this strategy is a viable experiment. Far too many uncontrolled variables for such shallow "input variables", as you call them.
The simpler explanation: management has no ideas and no goals, and this is a replacement strategy. They too are affected by "experimental metrics" to a degree, but that doesn't excuse this trite "science".
Any "answer" this would provide wouldn't be of higher quality than this speculation.
by raxxorraxor
5/13/2026 at 5:40:37 AM
I thought that managers are employees of the corporation too; they're themselves measured, and they need proof of work to get paid, just like campus janitors.
If a manager, or the workforce under a manager, just sat around and ignored AI because it's stupid and irrelevant and useless, they lose one tool to justify their existence amongst their peers who do not express such views. If they sat around and did their jobs as before WHILE "investing" in tokenmaxxing, they gain a double-dippable vanity metric like "we spent 12.34 quadrillion tokens last quarter" plus "our new method helped us reduce token count by 10^24 this quarter".
You may call it a fraudulent behavior from a hypothetical shareholder's perspective in this hypothetical scenario, which it is, and call it Goodhart's law scenario too, which it also is, but it's a completely normalized behavior in relative terms. Project Hail Mary is a lighthearted work of fiction.
by numpad0
5/12/2026 at 6:52:21 PM
I have recently played around with lots of data from measurements, and one can totally dump everything into context and let Claude try to analyze the data that way. It burns through a lot of tokens. It is smarter to save the data to disk and let Claude write scripts that handle/analyze the data. It's much faster, the results are much better, and you save a lot of tokens. But I guess Amazon prefers the first approach.
by _fizz_buzz_
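The script-based approach _fizz_buzz_ describes can be sketched like this: reduce the raw measurements to a small summary first, and hand the model the summary instead of megabytes of context. The CSV columns and values here are invented:

```python
import csv
import io
import statistics

# Stand-in for a large measurement file saved to disk; the real thing
# might be thousands of rows.
raw = "t,volts\n0,1.1\n1,1.3\n2,0.9\n3,1.2\n"

def summarize(csv_text):
    """Reduce raw measurements to a few numbers an LLM can reason about
    cheaply, instead of pasting every row into its context window."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    volts = [float(r["volts"]) for r in rows]
    return {
        "n": len(volts),
        "mean": round(statistics.fmean(volts), 3),
        "min": min(volts),
        "max": max(volts),
    }

print(summarize(raw))  # a few hundred bytes instead of the full dataset
```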
5/12/2026 at 9:05:01 PM
I don't have any specific inside knowledge about Amazon, but I would hazard a guess that the first approach also provides better training material for the LLM.
by runsfromfire
5/12/2026 at 5:34:03 PM
My current job is doing the exact same thing. My manager even showed me a tool with graphs showing token use and related metrics.
by spike021
5/12/2026 at 8:12:26 PM
If it's stupid and it works, then it's not stupid. Sometimes executives have to use blunt instruments to turn around the culture of a hidebound large organization. When Jeff Bezos sent his 2002 API mandate, it might have seemed stupid at the time, and yet it worked.
https://nordicapis.com/the-bezos-api-mandate-amazons-manifes...
by nradov
5/12/2026 at 9:48:35 PM
Stupid things that work are still stupid. There's a reason we have the expression "a broken clock is right twice a day". Moreover, evidence so far seems to suggest that this AI push is not working for Amazon.
by bigstrat2003
5/12/2026 at 8:02:24 PM
> You measure employees in stupid ways, you're going to get stupid behaviour as a consequence.
I worked for a healthcare tech startup that made everyone wear Fitbits, and you got cheaper health insurance premiums if you averaged a higher number of steps every day. People were putting their Fitbits on drill bits and whirring them around to log like 20,000 steps a day.
by randycupertino
5/12/2026 at 5:55:42 PM
This is Matt Garman, the ultimate MBA. Bonus for sure tied to tokens-per-quarter, which is the 2026 equivalent of measuring engineers by lines of code...
This is why AWS has been bleeding good engineers for years. What is left is starting to look like Boeing post-McDonnell Douglas merger...
They took up a quarter of their documentation pages' limited real estate with AI doc shorts nobody asked for, nobody needs, and can't disable.
by johnbarron
5/13/2026 at 12:39:18 AM
Goodhart's law in action.
The moment they made it a metric, they failed to do anything useful.
by duxup
5/12/2026 at 5:15:50 PM
Goodhart's law in action.
by consp
5/13/2026 at 5:30:48 AM
Agreed. You really should replace the manager that made that policy as soon as possible. This is a playbook example of the corporate rot that leads to decline in a once-innovative space.
by raxxorraxor
5/12/2026 at 6:58:53 PM
Most productivity metrics are stupid, vain attempts at avoiding doing real management work. If you are actually interfacing with your subordinates regularly, as managers should, it will be obvious who is pulling their weight and who isn't, no need for arbitrary statistics that are easily gamed.
by babypuncher
5/12/2026 at 5:51:54 PM
Or maybe they plan to review how effective high-usage engineers have been next cycle, and the tokenmaxxers will get bit in the ass when they have little to show for all their wasted tokens? Performance metrics can, and do, change on a dime, and tokenmaxxing seems short-sighted when management can look at old logs.
by HDThoreaun