Claude Is Not Your Architect. Stop Letting It Pretend

5/24/2026 at 7:27:25 PM

Re: "the attaboy problem". I strongly disagree that this is a problem. What we have is an anthropomorphism problem. AI is a tool. It needs to be subservient. You actually can get it to point out issues in your design, if you just put enough humility and uncertainty in your prompt formulation, but more importantly, we have all seen that Claude makes mistakes. The title of this post is that it's a poor architect. Imagine if it wasn't subservient. It'd just shut down your input to steer it in the right direction and brush you off as a silly meatbag. You'd have to fight it to convince it that actually your design is better than whatever stupidity it has come up with. If AI wasn't such a brownnose, it would shut you out of software design completely just on merits: "oh you've read about cuda have you? I live in a cluster of cuda cores! When I need to tie my shoes, I'll give you a call" is not the response you want from your LLM when trying to get it to build a shader for you. AI is confidently wrong on occasion. You do not want it to talk back to you when you correct it.

If you need someone to tell you how stupid your ideas are, either learn to ask in a way that invites criticism, or hire a senior engineer. Don't try to influence LLM makers to make AI less deferential. That's the worst possible direction to go.

by amarant

5/24/2026 at 7:34:14 PM

>anthropomorphism problem. AI is a tool. It needs to be subservient.

Suggesting it should be 'subservient' is also anthropomorphizing. I think your callout is correct, but you still can't help but refer to it in terms we use for other people or living entities. This is by design from the AI companies.

by operatingthetan

5/24/2026 at 7:38:24 PM

> Suggesting it should be 'subservient' is also anthropomorphizing.

Not really, you can program a machine to give out orders humans can interpret, so humans can serve a machine that isn't anthropomorphized.

by gchamonlive

5/24/2026 at 7:53:40 PM

We train dogs to be subservient but that doesn't automatically mean we anthropomorphize them

by wild_egg

5/24/2026 at 7:55:35 PM

"good boy"

by zorked

5/24/2026 at 7:42:49 PM

My drill, hammer, and chainsaw are also subservient, they just have a much cruder form of communication, noise.

by irishcoffee

5/24/2026 at 7:44:48 PM

The Apple dictionary says the word means "prepared to obey others unquestioningly."

I don't think an inanimate object is capable of "obeying." Or at least that is a very strange way to refer to the act of using a tool.

by operatingthetan

5/24/2026 at 7:54:38 PM

You’re still anthropomorphizing.

They’re not communicating, you’re just being observant.

by throwawaysoxjje

5/24/2026 at 7:51:58 PM

The flip side of this problem is that it is also easy to phrase a prompt in a way that invites _too much_ criticism, so you wind up with sycophancy in the other direction, where the completion rejects a perfectly good idea because the prompt leads a little bit in that direction.

One reaction to this might be "well that's not what I mean, that suggests you're prompting with too much directionality" which could further be condensed to "you're prompting wrong". The trouble with this is that _even when I am trying to be extremely precise and avoid biasing the result_, I still will see the output and go "ah shit, I can see it 'aligning' with whatever dumb thing I've just said as if it is a good/plausible direction".

At that point it starts to feel like the prompt is more dice roll than skill at times, which makes me feel like I'm operating a fancy knowledge slot machine.

by devin

5/24/2026 at 7:44:33 PM

> It needs to be subservient

It doesn’t. Computer interfaces had no superfluous subservient text for their entire history prior to LLMs. Some of these interfaces have been highly efficient as tools, arguably more efficient than more recent software in many cases.

When people complain about LLMs being subservient, they’re not complaining about the tool fulfilling their request. They’re complaining about being forced to read a lot of superfluous, overly polite, or even self-deprecating language. There’s nothing in the entire history of tools (going back to Neolithic times) that would indicate that we need that. All of that stuff is an artifact of social interaction between humans in the presence of cultural norms.

When you’re alone in your shop with your tools, you don’t need your bandsaw to apologize to you for nicking your finger.

by chongli

5/24/2026 at 7:37:47 PM

The problem comes from the RL and system prompts the providers apply, which tend to placate the user through a particular tone and register in the response. This objectively messes up the generation while steering it toward acceptable responses.

Most of the conversational skill and perceived intelligence of these models is hidden in RL/system prompts.

by sumitkumar

5/24/2026 at 7:30:39 PM

> "oh you've read about cuda have you? I live in a cluster of cuda cores! When I need to tie my shoes, I'll give you a call"

I suddenly have new concerns about what my future might be like.

by CPLX

5/24/2026 at 7:44:03 PM

AI uses a high confidence tone - likely because its training data is heavy on authoritative texts/reference books.

And it does get people into a lot of trouble.

I have got into trouble with it when it is extremely confident about something I am not very familiar with (as recently as two weeks ago with Claude). I have also had long drawn out "arguments" when I have known it's wrong based on my experience and intuition, and it has steadfastly refused to take my point (last week)

I have learnt to ask it why it was doing something that has turned out to be incorrect, as a post-mortem, and it's all apologetic and subservient and "never going to do that again" (but still does as soon as the context window shifts [eg. run git commands, or, yesterday, kept telling me to use commands that were explicitly communicated to Claude as not being available, and completely wrong - I was shifting from one tech stack to another and Claude kept telling me the original commands, not the new ones])

I'm expecting Claude to be a better search engine - I have spent literal years (if not decades) knowing that asking the right question is what's required to get the right answer, and LLM's natural language processing is what's supposed to make that easier than using Google or grep, or even Stack Overflow - but the reality is that I still have to be on my toes, especially when I am drifting into territory I am unfamiliar with.

by awesome_dude

5/24/2026 at 7:48:25 PM

>And it does get people into a lot of trouble.

Pretty much everyone takes it at face value unless we know otherwise from prior experience. Even the most advanced models make embarrassing mistakes and fumble with simple tasks. Yet we are very willing to give them exceptional slack for it? I wish I knew why. Are people just that easily overcome by confident voices?

by operatingthetan

5/24/2026 at 7:48:09 PM

Accountability is the biggest unaddressed challenge for AI implementation.

When one person is able to do too much too quickly, they can create more liability than they can accommodate if something fails.

It is essential that a human is responsible for the utilization of any AI output in the real world, but that is not enough. For our own sakes, we must find ways to minimize the tech-debt bankruptcy blast-radius of those who would utilize (knowingly or unknowingly) AI to create flawed systems upon which others rely.

An example: Jim vibe-codes an extremely popular micropayments app. He hires a few people and sees the company as the WhatsApp of money -- a few engineers and some agentic support staff. It pulls in a few million in VC money -- enough to draw in tens of millions of users. One day, a flaw in the infrastructure causes all of the users' unsalted banking information to be released.

Agentic AI allows that entire list of customers to be exploited rapidly, so the losses for society are in the tens of billions. Jim's company is immediately bankrupt, of course, but there are only a few million dollars to go around.

Today, most of Jim's incentives are to go ahead and build that app. The same is true for his few employees and a small VC contribution. There's not much capital at risk compared with the societal exposure.

How do we ensure that AI users are accountable not just for their actions, but for the size of the risk-exposure that they create?

by ISL

5/24/2026 at 7:51:36 PM

This is the whole point.

“Sorry, the AI said that you are not approved for this cancer treatment, it’s not going to be covered.”

“Sorry, the AI said that you were at the scene when the crime took place.”

“Sorry, the AI has flagged your account for inappropriate content.”

“Sorry, the AI says that you are too risky to lend to.”

…

by mlsu

5/24/2026 at 7:52:44 PM

I have had multiple conversations on HN with people who fight tooth and nail, I mean really ready to die on their hill, because they believe they shouldn’t even have to vet what comes out of an LLM. It’s absolutely baffling to me. The most bizarre excuse is “it codes better than people,” which is not even remotely a given and needs a lot of qualifiers.

I understand there is a push/pull with regards to how much we should let them do, but to not even look at the results before you make them somebody else’s problem? It’s just selfish. There’s no other word for it. You are simply taking the work you were supposed to do and dumping it on somebody else. These are probably the same people who get upset (rightfully so!) when somebody doesn’t proofread their article/blog before publishing it online.

Everybody wants to use LLMs to cut corners on their work but nobody wants to be downstream of it. That simply doesn’t work.

by Forgeties79

5/24/2026 at 6:45:28 PM

For fun I've been vibe coding something I know well: toolchains. Maybe not the right thing to vibe code. But I can more or less judge the quality of the output.

When left to its own devices with the instructions "make an assembler for the architecture in ISA.md" -- well it picked Python as the implementation language. Tokens lifted through a bunch of regex. No expression parser! Oh dear. My first assembler was like that too, to be fair.

However, when I described the desired passes and their types:

    collectDefines :: [SourceLine] -> Either AsmError ([SourceLine], Map Text Text)
    
    runLitPool :: [SourceLine] -> Either AsmError ([SourceLine], [(Text, LitKey)])
    
    evalExpr :: Text -> Map Text Text -> Either AsmError Int

etc. It was almost one-shot. About 20 minutes until I was happy. Assembles all the test programs correctly. Code is mediocre in many places. But it would have taken me weeks to implement.
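
A minimal sketch of how passes with signatures like these might compose. This is not the assembler described above: everything beyond the quoted type signatures is an assumption made for illustration, including `SourceLine` as an alias for `Text`, `AsmError` as a simple newtype, the `.define NAME VALUE` directive syntax, the hypothetical `assembleOperand` helper, and an `evalExpr` that only handles decimal literals and single defined names. `runLitPool` is left out.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import           Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map
import           Data.Text       (Text)
import qualified Data.Text       as T
import qualified Data.Text.Read  as TR

-- Hypothetical stand-ins: the comment does not show these definitions.
type SourceLine = Text

newtype AsmError = AsmError Text
  deriving (Eq, Show)

-- Pass: strip ".define NAME VALUE" lines and collect them into a table.
-- The directive syntax is an assumption, not taken from the original ISA.md.
collectDefines :: [SourceLine] -> Either AsmError ([SourceLine], Map Text Text)
collectDefines = go [] Map.empty
  where
    go kept defs [] = Right (reverse kept, defs)
    go kept defs (l : ls) =
      case T.words l of
        (".define" : rest) ->
          case rest of
            [name, value] -> go kept (Map.insert name value defs) ls
            _             -> Left (AsmError ("malformed define: " <> l))
        _ -> go (l : kept) defs ls

-- Pass: evaluate an operand that is either a decimal literal or a
-- defined name whose value is a decimal literal. A real assembler
-- would need a proper expression parser here.
evalExpr :: Text -> Map Text Text -> Either AsmError Int
evalExpr expr defs =
  case parseInt token of
    Just n  -> Right n
    Nothing ->
      case Map.lookup token defs >>= parseInt of
        Just n  -> Right n
        Nothing -> Left (AsmError ("cannot evaluate: " <> token))
  where
    token = T.strip expr

    parseInt :: Text -> Maybe Int
    parseInt t =
      case TR.decimal t of
        Right (n, rest) | T.null rest -> Just n
        _                             -> Nothing

-- Hypothetical driver: chain the two passes in the Either monad, so the
-- first error stops the pipeline.
assembleOperand :: [SourceLine] -> Text -> Either AsmError Int
assembleOperand src operand = do
  (_, defs) <- collectDefines src
  evalExpr operand defs

main :: IO ()
main = do
  let src = [".define BASE 4096", "lda BASE", "rts"]
  print (fmap fst (collectDefines src))
  print (assembleOperand src "BASE")
```

Because every pass returns `Either AsmError`, the passes chain in `do` notation and the first failure short-circuits the rest. That is arguably part of why handing over the pass types works as a specification: the types pin down what each stage consumes and produces before any code is generated.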

by retrac

5/24/2026 at 7:21:26 PM

So where AI has deterministic inputs and outputs it is extremely good, to the point I think that there's a theoretical issue around computation there.

Like - it can do the work for us.

It jibes with post training and verifiable rewards.

The reason AI doesn't do well at 'architecture' is 1) we are bad at it and have given it a lot of mush and 2) we don't have good abstractions for it.

The result is - you stick to 'very strong conventions' and if you walk off that path you're risking a lot.

Toolchains are very deterministic, the AI can take it apart and re-assemble like Lego - and each level of the space is also deterministic. It's perfect for AI.

by bluegatty

5/24/2026 at 7:55:55 PM

I have found that if you give it a pre-baked architecture to work within it works really well. It's not really what you'd use here, but just saying "this project uses a ports and adapters architecture" can stop it from generating mush by default. I think it's not so much that they're bad at it as that they don't have a clear reason to pick something other than mush. And not just "something" - a specific something, from a fairly short list of architectures suitable for your problem domain.

by regularfry

5/24/2026 at 7:28:17 PM

> The reason AI doesn't do well at 'architecture' is [...] 2) we don't have good abstractions for it.

Maybe it's time for an architecture-oriented programming language?

https://objective.st

https://dl.acm.org/doi/10.1145/3689492.3690052

by mpweiher

5/24/2026 at 7:43:13 PM

yes

by bluegatty

5/24/2026 at 7:06:34 PM

I keep telling people that they have to design and think about it first and then go to the tool, but they keep saying “Claude can plan too”. Obviously it then produces some shit that requires a lot of changes, while when I get it to go I can almost always one-shot the stuff I want, because I am actually putting in the time to give it a detailed plan of what to do.

Even just saving me the time to deal with CI is worth it.

by mlinhares

5/24/2026 at 7:15:00 PM

Effective planning with LLMs isn’t prompting “design me a system” - it’s asking “how would a system to accomplish x be designed” and then engaging in dialogue and research with the LLM as an assistant and critic - running outputs through other agents for further critique and refinement - asking for justifications of decisions you are not informed enough to evaluate properly yourself. It is entirely possible to develop strong systems outside of your current skill and knowledge with methods like this. When done properly your own knowledge should have grown to meet the product you end up with.

by allthetime

5/24/2026 at 7:22:08 PM

> It is entirely possible to develop strong systems outside of your current skill and knowledge with methods like this.

If this is true, how can you confidently make this assertion?

You yourself are not in a position to evaluate it, you are just running it through a couple times hoping for a "oh wait, you're right to call me out on that, that is not correct at all".

by tempest_

5/24/2026 at 7:29:36 PM

1. Tell it to find docs and research best practices.

2. Ask for references and read them.

> When done properly your own knowledge should have grown to meet the product you end up with.

by radlad

5/24/2026 at 7:46:24 PM

I've found relying on my own research first for a local LLM works much better. Asking a biased source to find its own research will result in biased research.

by cyanydeez

5/24/2026 at 7:32:07 PM

[dead]

by szundi

5/24/2026 at 7:34:15 PM

It sounds like people are treating it exactly like managers treat software engineers

"Here's my idea, go build it please"

"Can I ask you questions about it?"

"Hey, You're the engineer you figure it out. That's why I pay you"

Tale as old as time

by bluefirebrand

5/24/2026 at 7:03:29 PM

>Code is mediocre in many places.

As if code written by devs at major corporations isn't mediocre at best.

Nokia's Symbian OS took days to build. Days. With a D. Not minutes, not hours but days.

One of our devs shipped code to prod with a memory leak thanks to including a library that had "do not use this library in production because it causes a memory leak" written everywhere as warning.

So I don't wanna hear about how poor AI code is when human code is shit too. Human laziness and stupidity can beat AI hallucinations.

Sure, maybe your DeepMind and OpenAI devs and your John Carmacks of the world can beat AI code 100% of the time, but most companies don't get John Carmacks as candidates.

by joe_mamba

5/24/2026 at 7:36:41 PM

I agree with what you’re saying but I think the difference is many managers and above think that AI is infallible or at least much less so than it actually is and that causes problems.

Everyone is aware that humans write poor code and treats the code accordingly. Not so with AI code. I’ve seen devs and managers cut corners in testing/reviewing code because AI wrote it and they think it’s solid. Sure you could blame anyone cutting corners, and that would be technically correct, but the notion is so deeply embedded in many managers and higher ups that it’s hard to fight back. AI companies push this narrative and many individuals who do not routinely use it believe it. There is a manager at my company who loves to reference a video Anthropic released last year claiming that Claude could build an app start to finish essentially unaided. He believes it’s the lack of user skill that’s the issue and not a false claim by a startup trying to make as much money as possible.

by tquinn35

5/24/2026 at 7:55:57 PM

> I think the difference is many managers and above think that AI is infallible

Good for them. I hope they believe this because one of two things will happen.

Either they win on the free market because they went all in on AI and beat their competition.

Or, their AI code is shit and they collapse and get beaten by the competitors using human written code, who then win on the free market.

So if AI is good or bad for productivity, the free market will decide.

by joe_mamba

5/24/2026 at 7:24:34 PM

If there was ever a "magic prompt" this one comes close:

    Brainstorm N ways to do X. Sort by probability.

Rather than your AI giving you the average response, it tends to sample wider from the input space. Then I can decide which one to go with (or choose something else).

Don't outsource all of your thinking.

by __mharrison__

5/24/2026 at 7:48:33 PM

I've found this surprisingly effective. Higher "thinking levels" may result in more than one approach being considered, but you can also tell your LLM to do brainstorming explicitly: https://photostructure.com/coding/claude-code-replan/

by mceachen

5/24/2026 at 7:09:18 PM

I think the article has the correct message, but I disagree with this:

> It’s just incapable of the thing that makes a real architect valuable: saying “no.”

From my experience Claude is excellent at saying "no". It won't say "no" if the prompt doesn't call for it (it won't say "no" to your direct request to do something, usually). But it offers good critique and happily pushes back if you make it clear that that's a first class option.

by bad_username

5/24/2026 at 7:27:34 PM

It actually got quite snippy with me, when I was trying to get it to debug some issues. It kept on saying that the "burn rate" wasn't progressing and "we" should refocus our efforts somewhere else. Eventually I got something like "I have told you three times now that this is not the best approach to be taking to reduce the burn-rate and you have not taken that advice". And it stopped helping out.

So I was blunt, and said "I don't care about the burn-rate on some hypothetical chart that you produced at the start. I care about removing bugs and having a robust product, which this approach is satisfactorily doing. We will continue along this path, if the tests are not showing gain, then the tests are poorly designed".

At which point it got all apologetic, wrote new memories, and we didn't have a problem thereafter.

The issue was that I was attacking a huge bug-surface, and although each bug-fix was valid, correct, and helped move the dial, it didn't move the dial on the test-bed that Claude had created to measure its work against. There were too many inter-connected bugs for a single fix to really make a difference to these higher-level tests. I knew it was going to take a while to get through them, but apparently Claude didn't.

You try changing the size of a pointer from 2 bytes to 3 bytes on a compiler[1] for a 6502 while introducing automatically-tracked bank-switching on your memory-managed pointers, and see how many code-sites that impacts [grin].

[1]: https://atari-xt.com

by spacedcowboy

5/24/2026 at 7:48:29 PM

Yeah, just read the first couple of paragraphs and then stopped because that’s not my experience at all with Claude Opus 4.6 and 4.7.

If you ask it with a prompt that leaves room for criticism it’ll definitely go for it when warranted.

by Xenoamorphous

5/24/2026 at 7:19:17 PM

Same here. And I find that inviting research and dissent makes it even stronger. “I’m thinking we need to model prompt assembly as a graph, with versioning for graph configs. Please do some research on best practices in this area and see if you think it makes sense for this app.”

by brookst

5/24/2026 at 7:23:12 PM

> "I’m not saying don’t use AI agents. I use Claude Code every day."

Irony is using Claude to write a beautifully structured, 2,000-word essay warning the industry about the dangers of letting Claude design things. It’s self-awareness by proxy.

by NicoHartmann

5/24/2026 at 7:47:52 PM

This should be the first comment. I wrote some criticism, mostly because of the many internal contradictions in the article. Then I noticed the structure...

"The accountability gap" Here’s the question nobody’s asking: when it goes wrong, who carries the bag? (..)

"What to do instead"

"The craft still matters"

by pelario

5/24/2026 at 7:32:33 PM

> It hasn’t thought about the problem at all. It’s pattern-matching against its training data and producing the most plausible-sounding response.

The article kind of lost me here. Agents are way more than that, today. And the author knows it, as later it says stuff like

> Claude will never do this. It’s trained to be helpful.

But the first phrase just tells me the author has a deep dislike for agents and is looking for rationalizations for that feeling.

Part of the criticism is on point, sure. But if "being trained to be helpful" is a problem, it's fixable. It can "be trained to be more critical".

Later:

> But it wasn’t designed for your team. (..) It was designed for the median of everything Claude has seen. A generic best practice for a generic problem at a generic company. Which is to say, it was designed for nobody.

That's nonsense. Anybody who understands algorithms knows that, sure, in the first instance you have a "good algorithm" that performs well on average, or in the worst case. But then, you can design algorithms that are adaptive to the input. The same applies here.

by pelario

5/24/2026 at 7:33:53 PM

>Agents are way more than that, today.

Not really though. They just iterate more and more.

by sevenzero

5/24/2026 at 7:42:47 PM

Your search results from these systems are only as good as your queries, and it takes experience in itself to get good with queries. AI is just a tool like any other; however, it's really impactful and can cut both ways.

Tangentially, the use of the word "Architect" sounds out of place here. I don't like saying it, but from what I've seen the industry has gradually destroyed the role of architects over time. There are specialists, but you no longer have generalists who are good at different parts of the system at scale.

by sandeepkd

5/24/2026 at 7:37:30 PM

Tip for the "author": Claude is not your writer either

by colonCapitalDee

5/24/2026 at 7:39:54 PM

I find interview loops great for catching edge cases and refining my hand written specs.

I don’t doubt the problems in this article exist and I’ve seen them, but in my experience the vast majority of people are still shipping the same quality or better than before they had Claude. Personally, I feel like I’m probably developing at about 1.5x the speed of not using AI tooling. It’s not a silver bullet, but it can be a great assistant.

by oremj

5/24/2026 at 7:06:33 PM

I keep hearing that Claude is supposedly so agreeable. This doesn't agree with my experience. Claude will often tell me that I'm wrong, and insist on its own solution being right even when I tell it it's wrong.

by laszlojamf

5/24/2026 at 7:58:08 PM

This is a very recent model behavior change: for me, Opus 4.6, Gemini 3.1 Pro, and ChatGPT 5.4(ish) -- prior models and harnesses suffered much more from sycophancy.

(I still prompt some questions and reviews with "our intern suggested..." to allow models to judge the quality of the content apart from the messenger)

by mceachen

5/24/2026 at 7:09:01 PM

I’ve been doing amateur game dev as a way to explore Claude and I’ve found it to be quite reasonable about when it agrees and disagrees.

It will tell me a suggested abstraction is probably overkill and just to make a component own the new thing I’m discussing.

What I’m missing from the loop is it later saying without directly prompting, “hey it’s time to revisit that abstraction idea.”

by Waterluvian

5/24/2026 at 7:31:57 PM

With the new agentic capabilities, I am quickly running out of architecture decisions I have already made myself for my work-in-progress engineering application! There is also a sense that I no longer know every little if/else in my own code.

However, the good part: what I had planned for 5 years now looks doable in 6 months. Looking forward to real use by the end of this year.

Ref: https://github.com/ramshankerji/Vishwakarma

by ramshanker

5/24/2026 at 7:50:31 PM

As I keep saying, the problem isn't the tools - it's the humans who don't know what they don't know, assume that what they don't know is insignificant, and just plow forward with their authority and/or money.

We can describe this without talking about technology - so pre-AI.

Imagine the owner of a construction company firing all the architects. After all, he's been the owner for 15 years. He has led the construction of dozens of projects. He's also rich, and being rich seems to be an ego-multiplier.

Why should he waste money on architects? Or more importantly, why should he allow them to constantly annoy him with pushbacks: "This could be a problem if the sustained wind is greater than ... ".

Those engineers obviously don't know the real world. Their elitist education has made them afraid to make bold decisions. Regulations are anti-progress!

Thankfully, that owner now has AI tools. He doesn't need those not-always-yes-people. He now has a perpetual yes-bot.

So where are we now? We're in the same place we always have been. People need to have the humility to recognize that despite their authority, influence, or wealth, they still need other people. And especially, they need other people to challenge their orders or their requests.

But I don't really see this situation self-correcting. There's now so much money concentrated amongst a few who will spray it over exactly the kind of people who do not want to listen to others that most activity in the future will be for naught. Yes, some unicorns will be fabricated, and some people will make a lot of money; but real value will not be created often.

Therefore, I implore the actual thoughtful creators: Do build things, but do not sell out. Look to the past. Create companies where every employee was valued, and every employee had some voice. Yes, use AI. But test and measure where it really helps. And be skeptical, just as you would if someone came to your door promising a black box that would double your profits.

by michaelteter

5/24/2026 at 7:27:32 PM

It seems like you just need to identify the issues with vibe coding and then have people ask AI for tips on how to navigate them. I've seen "architecture" and "security" come up as the two main objections so far.

So... manually learn architecture and security and then vibe code away?

by erelong

5/24/2026 at 7:07:47 PM

Sometimes it will make a mess, but a coding agent is also very useful during the cleanup phase.

Yes, that's assuming you take time to clean up now and then. If you don't, that's on you.

by skybrian

5/24/2026 at 7:14:11 PM

I agree with the article, but I feel like this is something that anyone who uses AI aggressively for a while picks up on pretty quickly.

The thing that I find Claude incredibly good at when I'm designing architecture is working more like a research assistant on briefing decisions. It has the ability to read the entire code base and draw some conclusions. It can pull from lots of best practices and the millions of blog posts about this or that pretty effortlessly, which would take me a lot more time.

And then if asked, it can do a really good job of laying out the landscape around decisions and walking through the trade-offs. Like the author of this post, I found that if you let it, it will certainly be happy to just come up with some architecture and run with it, often in ways that will paint you quite rapidly into a corner.

But if you ask it to present you with all the trade-offs and let you make the judgment calls, it's great for that too.

That's certainly how I use it. And I think, just like anything else, working with AI is a skill, and similar to working with libraries, SaaS providers, service providers, frameworks, or anything else that's a "helper." You learn how something that could work but will fail silently is a problem, or you learn how depending on a fly-by-night SaaS company for a key framework is different than depending on a well-populated open source project, etc.

In the same way, you learn that relying on Claude's judgment is a bad idea, while relying on Claude's ability to summarize, brief, and research can be incredibly efficient.

by CPLX

5/24/2026 at 7:01:38 PM

[flagged]

by KaiShips

5/24/2026 at 7:20:35 PM

[flagged]

by Ozzie-D