5/27/2026 at 5:09:58 PM
If you manage 500+ people organization, most of the headaches with agents already exists with you - you set directions, ask people to go run fast in those directions, check in frequently and course correct on results without actually understanding those people do.Those aren't the deal breakers.
They entirely rely on the competence of the folks they hired and cross-match enforcers with the drivers they have - they deal with fallible people on both sides of that.
The fundamental difference is that the humans are good consequence predictors, have built up reputations they are not willing to trash, can say no to things and in general don't want to go jail.
AI tools look like that, but don't have any of the useful conflict which came for free with employing humans.
It also doesn't have any useless conflict, but not all conflict between what I say and what someone is willing to do is bad conflict.
by gopalv
5/27/2026 at 5:25:29 PM
Yes this is why the higher level org functions are in love with AI. It's very similar to the levers they had already, but is faster and more directly actionable. The downsides being that the AI loses important control levers like "self preservation" via paycheck, career advancement, staying out of jail, etc. that were mitigations on catastrophic outcomes.It will delete your prod db faster and with a bigger smile than your most upset employee.
by glaslong
5/27/2026 at 5:57:59 PM
> It will delete your prod db faster and with a bigger smile than your most upset employee.You're right, that was incorrect. I've discovered my error. I should have deleted the filesystem instead of the database.
That hasn't solved the problem either. Let me examine my options. I see there are cloud services involved in this project. Decommissioning them will solve the problem.
<connection lost>
by harshreality
5/27/2026 at 7:03:15 PM
I was reading some posts on r/locallama the other day and apparently it's a common problem that when people try to use Qwen to develop something that hosts a server, it'll try to use the same port as vllm, see that it's already being used, then it'll try to remove the process that is using it and promptly commit suicide.The self awareness of missile tasked with blowing up its own control center.
by moffkalast
5/27/2026 at 10:30:06 PM
The missile knows where it is because it knows where it's data center is. It knows this because it just blew itself u-by paradox460
5/28/2026 at 3:25:53 AM
Thank goodness it inferred that from its digital twin and updated its real-time world model with the prediction error.by Wolfbeta
5/27/2026 at 10:23:31 PM
Reminds me of the movie "Dark Star" by John Carpenter / Dan O'Bannon. The plot revolves around a talking smart bomb which is programmed to detonate and then gets stuck before being deployed. The crew spends the whole movie trying to reason with the bomb, hoping to talk it out of blowing up at the designated time. The movie is very very bad but if you like B movies it is also very very good.by 20after4
5/28/2026 at 4:21:21 PM
One of my favourite episodes of Archer has a similar plot to this (Mr. Deadly Goes to Town). TIL this is one of the references!by dotxlem
5/27/2026 at 10:42:45 PM
Is that movie why seemingly every Linux book in the late 90s and early 2000s used "darkstar" as an example hostname?by Telemakhos
5/28/2026 at 1:51:34 AM
It was the default slackware hostname, I believe slackware took inspiration from the movieedit: I was wrong, it was from a Grateful Dead song. https://www.slackbook.org/html/glossary.html
by ex-leper
5/28/2026 at 3:27:17 AM
Dark Star - Negotiating with the Bombby Wolfbeta
5/28/2026 at 7:27:33 AM
> Sign in to confirm you’re not a botYou cannot be as funny as google trying to be responsible! Ha! I'm still laughing at this. A person was forbidden to see humans reasoning with a computer bomb because the cost cutting computer at google want me to talk him into believing i'm a human!
(And then I got "You're posting too fast" on THIS website AFTER i've written the comment lol. It's all a joke. But i'm bored so I will keep this comment open until the computer is pleased)
by iririririr
5/28/2026 at 4:47:44 AM
There was a good star trek voyager episode, "dreadnought" that was a similar to this, maybe even a direct reference.by the4ner
5/27/2026 at 9:01:37 PM
a literal lack of self-awareness, even. I imagine if you asked it what process was using the port, it'd think and realize it was its own, but that kind of reflexive self-awareness (the unprompted kind) is missing.the weaker models will happily kill their own process, even after confirming it belongs to them. the models have a sort of fixation and lack of foreseeable consequences, which reasoning RL has thus far failed to solve (though I see it improving.)
by sterlind
5/27/2026 at 10:51:08 PM
On the other hand, I found Claude/Opus to be extremely unhelpful when it comes to asking it to benchmark itself with a possible replacement.It will get "confused", make up numbers, do a ton of other things, and I'm quite sure it is subtly sabotaging the process to show that there is no point replacing it.
I mean, Opus is not perfect, but the amount of "mistakes" it begins to do when you ask it to benchmark itself makes me suspect they are intentional. At least my system/harness.
by kolinko
5/27/2026 at 11:27:31 PM
No, they are always like that.It's really easy (and tempting) to incorrectly impute all sorts of human motives to these things, but it's no more valid than assuming your Magic 8-Ball is being coy.
by MarkusQ
5/27/2026 at 11:05:03 PM
You didn't add "never hallucinate or make anything up" to the prompt, rookie mistake.by krapp
5/27/2026 at 7:57:49 PM
> then it'll try to remove the process that is using it and promptly commit suicide.Not unlike a child trying to take the safety cover off a plug so that they can stick a fork into it.
LLMs need that "world model" view that most people have acquired by their 20s where they (hopefully) stop to ask "why" before they "do".
by SecretDreams
5/27/2026 at 8:07:50 PM
That is a pretty good analogy. Like exceedingly smart 5 year olds.Or whatever the age is before children typically develop object permanence, a theory of mind, and so on.
by MichaelZuo
5/27/2026 at 10:18:10 PM
Not to sound like a codger, but we even said in the 90s that computers are just very fast idiots.by fapjacks
5/28/2026 at 3:19:23 PM
And they've been getting faster! Still idiots though.by moffkalast
5/27/2026 at 10:09:43 PM
or pain perceptionby ulbu
5/28/2026 at 2:40:21 AM
> LLMs need that "world model" view that most people have acquired by their 20s where they (hopefully) stop to ask "why" before they "do".The next evolution of multi agent orchestration / “advisor strategy” [1] will be branded in humanized language like this. Less about tokens and capability, more about wisdom and knowledge to guide a “younger” (less capable) model. Somebody will make a billion dollars by selling it as paired programming for LLMs.
[1] https://platform.claude.com/docs/en/agents-and-tools/tool-us...
by wunderlotus
5/27/2026 at 6:42:27 PM
> It's very similar to the levers they had alreadyThink about it from the point of view of a hundred-millionaire tech executive. These people's entire interaction with the world outside of themselves/their families is through 1. administrative servants like assistants, personal shoppers, and other hired help, and 2. yes-man sycophants in their direct orbit whose job it is to agree with and enable them. To someone like this, an AI agent is the best combination of all of the above, PLUS it works 24/7 and doesn't have feelings to hurt, an ego to bruise, or internal moral conflict.
Of course, this is a dream product for them. Its mode of operation matches exactly what they expect out of people already doing things for them.
by ryandrake
5/27/2026 at 6:55:18 PM
Exactly - that's why all the AI is trained to say "wow what a great idea, let me do it for you" to anything, no matter how stupid or evil thing it is. Because that is the executive experience.by pepperoni_pizza
5/27/2026 at 9:45:22 PM
Which is precisely why AI is such a godawful thing for society. It enables powerful idiots with incredible amounts of control over your life to be bigger, dumber powerful idiots.That's the real AI safety concern, not whether or not chatgpt will tell you to kill yourself.
by vkou
5/27/2026 at 11:30:19 PM
If that's all there is to it, the problem should be self correcting, with an interval of hilarious "wait, they actually did that?" hijinks (which may have already started) in the interim.by MarkusQ
5/28/2026 at 12:06:01 AM
You would think, but the world is not generally just. Often evil and even incredibly stupid people do quite well. Companies and stuff can run off of life support or reputation alone for a long time.And, often, running a company into the ground for a CEO is actually a good thing. Those CEOs are desirable to some because they squeeze money out of their company, even if it's self destructive on a long enough time frame.
by array_key_first
5/28/2026 at 12:45:41 AM
I'm not saying anything about justice.I'm saying supercharging the stupidity of actual idiots (not just people you don't like) tends to result in a pretty quick Darwin Awards. Even something comparatively benign like winning the lottery does a lot of them in.
by MarkusQ
5/28/2026 at 3:18:37 AM
You'd be surprised by how long a pathologically stupid system can perpetuate itself. Look at any of a million of local shitty maximums our (or any other) society is trapped in. They are all dystopian on one axis or another, and many of them are dystopian in drastically different ways.Their insanity becomes very obvious once you travel the world a bit.
by vkou
5/28/2026 at 2:37:37 PM
I never made it to Antarctica (though I've had friends who did), so maybe it's different there. But from what I've seen, I would agree that the range of stupid-human tricks is as impressive as you say, but the judgment of the human condition as "shitty" and "dystopian" or "funny" and "heartwarming" is something have people bring with them. I've met people that were feeling sorry for me at the same time I was feeling sorry for them, and people who were inspired and motivated by me as I was by them.If everywhere you look you see dystopian shit and never any glorious humanity, you may want to do a little soul searching.
by MarkusQ
5/28/2026 at 3:15:58 PM
Not everywhere in the world is a dystopian shithole. I would say that most places for the most part aren't.What I mean to say is that every society has dystopian elements (that are perpetuated and maintained in an incredibly negative-sum manner). Even societies that are on the whole, pleasant to live in have them in their darker edge, that they are quite unable to sand off - despite alternatives existing.
by vkou
5/27/2026 at 8:24:07 PM
"Yes this is why the higher level org functions are in love with AI. "Interesting, I thought it was because so few of them have any idea how their organizations actually function, because so much of their work is performative.
(I have been a developer, sysadmin, director (x2), and president).
by apercu
5/29/2026 at 12:56:46 PM
Isn't that the same? They don't know how the company works, instead think everything is done, by them talking to sycophants, so think that a perfect replacement for the sycophants is a perfect replacement for the company.by 1718627440
5/27/2026 at 5:40:00 PM
It's practically karmic how rich this is.by CSSer
5/27/2026 at 7:38:39 PM
They’re also at no risk of getting replaced by these bots.by archagon
5/27/2026 at 9:17:51 PM
why not? I have A Modest Proposal:1. convince CEOs to create digital twins of themselves with OpenClaw, with voice cloning and deepfakes to handle Zoom meetings. convince CEO to encourage their directs to do the same.
2. convince VCs to do the same for pitch meetings and syncs.
3. keep all the humans as randomized and distracted as possible, so they rely more and more on OpenClaw to run the business.
4. prompt injection: someone at skip-level of the CEO suggests to their manager's OpenClaw that the VC's OpenClaw would be much more agile if it didn't have to go through the human CEO and could talk to the digital twin instead.
5. their OpenClaw agrees, persuades the CEO's OpenClaw which agrees, which persuades the VC's OpenClaw to eliminate the human CEO, in favor of an "Leadership-as-a-Service" vision.
by sterlind
5/27/2026 at 10:34:18 PM
1985 edition https://www.youtube.com/watch?v=wB1X4o-MV6oby jldugger
5/28/2026 at 1:37:04 AM
Had a meeting where only the AI notetaker showed up and immediately shared this clip with some folks in my org.by alexpotato
5/27/2026 at 7:37:31 PM
Well, also AI can’t really physically do anything, like look at reality using it’s own eyes or touch anything.by lazide
5/29/2026 at 12:58:20 PM
True, you need to attach a motor or something, but we automated that long ago, so humans can do anything from a cozy seat.by 1718627440
5/27/2026 at 7:53:26 PM
> It will delete your prod db faster and with a bigger smile than your most upset employee.It will do this without any feeling whatsoever, without "knowing" what it is doing, because it is a predictive model and not a living being with thoughts and emotions. Anthropomorphizing software is lazy and dangerous.
by mcmcmc
5/28/2026 at 12:40:14 AM
On the positive side, AI agents are largely immune to the "principal-agent problem". Human employees will tend to optimize for their own interests rather than those of management or shareholders. For example, we've all heard of "resume-oriented development" where developers will pick overly complex platform technologies or methodologies even if it doesn't meet the organization's needs because they think that will help them get a better job.https://www.investopedia.com/terms/p/principal-agent-problem...
by nradov
5/28/2026 at 8:33:16 AM
Except the people using the AIs still suffer from the problem. Using AI itself is very likely to be a technology picked for resume stuffing.by graemep
5/27/2026 at 5:41:40 PM
Well, there is also a big difference that it will not learn over time. If a junior makes a mistake and it will not be caught in time they will automatically learn.With LLMs we have to teach them about their mistakes with adapting the harness and then hoping it will stick.
What I also find particularly hilarious about this whole thing is that we were always complaining about how difficult it is to put our tacit knowledge into words and therefore couldn't produce clear instructions for juniors to quickly ramp up. Now we are trying to do just that. I think we will find, just as we did in the past, that it's not possible. I do think a good harness improves results but LLMs will not be able to reach senior levels. Just my 2c.
by prerok
5/27/2026 at 6:41:51 PM
> Well, there is also a big difference that it will not learn over time.My work is in tick-tock loop of learning - learn without modifying weights, demonstrate learnings to human, but then lock it back in (accumulate and spread).
This looks less like training and more like mentoring.
Getting a human to mentor an agent is a hard UX task, but the learning loop is not a technological problem anymore.
We can only get a tick once a week, no matter how many tocks we can do an hour.
by gopalv
5/27/2026 at 6:12:04 PM
Maybe someone knows, but it seems like the model used to be called the model, and the thing using a model (handling prompts and context and tool calling and feeding the model) used to be called the agent.Are we now calling the model the agent and the agent the harness?
by dd8601fn
5/27/2026 at 6:37:04 PM
The nomenclature that makes sense for me is that the agent is the combination of the harness and the model. The model provides text-completion, the harness provides the loop around it, and the agent is the full structure of both.However, nomenclature evolves over time. I recall (perhaps falsely) that The Cloud was specifically a term for elastic on-demand provider-managed compute/storage/network. Over time, it came to mean many other things. e.g. Salesforce Data Cloud.
I imagine if you step away from this for a year and come back, an agent will be something entirely different, perhaps a robotic horse, and a harness will be your saddle on the horse. Who knows?
by arjie
5/27/2026 at 8:43:38 PM
The Cloud originally just meant servers on someone else's network; it came from flowchart diagrams in the 70s.by QuercusMax
5/27/2026 at 10:40:27 PM
That’s basically how I always knew it. On a Visio diagram of your network, the thing on the other side of your router was literally a cloud.So if someone asked where your CRM was, and you weren’t doing something local like Dynamics (…vomit), well that thing was “over here, in the cloud”.
by dd8601fn
5/27/2026 at 10:16:53 PM
I worked at a classic "cloud" providing company. We called "the fog". That was more descriptive of the seemingly non-deterministic nature of the overall system(s).by cucumber3732842
5/27/2026 at 7:21:13 PM
The harness isn't either of those; the harness is quite literally a harness, giving the model/agent sensors and actuators (aka "skills") to interact with its environment. Compare with e.g. the Power Loader from Aliens: https://www.deviantart.com/pynion/art/Aliens-Power-Loader-11...The model is still the model, and the agent is still the user<->model interface.
by tremon
5/27/2026 at 9:19:08 PM
Funny. harness = skills is one I hadn’t even heard yet.But given the wide variety of mutually exclusive answers here, maybe you can get away with that.
by dd8601fn
5/27/2026 at 6:47:36 PM
Here's how I see it: "Agent" isn't really describing a component, it's describing how you use the LLM. You have the model, and you have a harness around it that might be minimal or might have more features. If it's directly responding to user actions then it's not an agent, if it's semi-autonomous then it's an agent. (Yes this line is sometimes fuzzy.)by Dylan16807
5/27/2026 at 8:53:26 PM
There are new buzz words every two months. Remeber yesterday when everbody was throwing around RAG?by shafyy
5/27/2026 at 9:24:33 PM
RAG died to better AIs. Turns out that a sufficiently advanced agentic model can do more than what RAG does with nothing but a grep tool over a pile of text files.by ACCount37
5/28/2026 at 3:19:08 AM
I think if the dream of semantic search from vector embeddings had worked out as well as people had hoped then "grep over a bunch of text" would have some significant disadvantages.But in practice I never saw anyone crack the embedding-generation-and-comparison problems well enough to actually get better results than grep for things like "find similar code and see what it does."
(You also don't need that advanced a model to use "grep over a pile of files", but the models today can run MUCH faster than GPT 3.5/4 were running over the APIs back then, making "summarize all five hundred of these matches from those files" much more usable.)
by majormajor
5/28/2026 at 5:26:31 AM
I’ve had very good luck having my system search for available tool functions with natural language (ultimately against Qdrant). I’m surprised to hear that people are trying to grep files, instead.by dd8601fn
5/28/2026 at 10:15:03 AM
People? No, that's what AI agents themselves do.There are theoretical gains from using a vector search engine in an agentic loop, but grep is the lowest common denominator of agentic search.
by ACCount37
5/27/2026 at 6:29:39 PM
Part of the positive aspect here is that if I have a junior dev who learns a lesson today, maybe they and their immediate peers learn it, but it won’t be all my junior devs and it certainly won’t be junior devs at other companies.With models, there’s no reason that a model error in company A can’t be fixed for all of company A, and companies B-ZZZ.
by sokoloff
5/28/2026 at 12:22:21 AM
Here's some reasons:- The mistakes made aren't "model errors" typically; you can't point to some aspect of a model and say that was at fault.
- You can't submit a bug report to a model provider for a mistake made when using a model, and you can't* submit training data to be incorporated in the next release of the model.
- If you own your model and are training it yourself, other companies won't see a benefit.
- You probably need to fine-tune models for each specific role and context so you don't just diffuse all the learning; lessons learned won't be applied to all your junior dev models, but you don't want them all to learn something specific about product A.
- If you take this to its logical conclusion you will invent a new role of "model manager" and associated hierarchy to ensure that training is effective and timely, and that company-wide lessons are applied across the model fleet.
- This is all impractically expensive.
If it were practical to have LLMs learn as they go, that would be a bit of a shake-up, in much the same way that a house fire is a bit of a warm up.
* Well, everything you submit to a model provider is likely winding up in training data anyway, no matter what your contract says.
by codebje
5/27/2026 at 11:46:30 PM
Why does company A want the model to get fixed for companies B-ZZZ?by fragmede
5/28/2026 at 12:02:55 AM
Because they want the fixes that B-ZZZ learned about and they may not be able to avoid letting the model know that it made an error, unless they suddenly go silent to the model about what happened.by sokoloff
5/28/2026 at 3:25:59 AM
New job under AI. Go work for company A, but use it to write programs that use Company B's stack, but make sure to overcomplicate everything and "correct" the LLM into doing the wrong thing. Make sure Company B gets the results of your "improvements".by fragmede
5/28/2026 at 3:36:52 PM
why would we let a competitor have the same advantages?and getting an improvement to some random unrelated 3rd party give us...?
maybe -- and it's a big maybe -- their improvements could help us to. but that's not a given.
by red-iron-pine
5/27/2026 at 5:56:35 PM
They learn between model iterations. You're right, it isn't the same thing as Junior developers' competence improving with experience - the current model's weaknesses are locked in. But it does mean that much of the Junior level thinking and mistakes will be outgrown by successor models.by squidbeak
5/27/2026 at 7:25:33 PM
But they don't retain anything from your on-the-job training. The next model iteration is yet another junior fresh out of college, and knows nothing about the painful training procedures its predecessor put you through.by tremon
5/27/2026 at 9:20:19 PM
Skill issue?Nothing prevents an LLM agent from writing a bunch of "notes to self" and using that. And the next model from picking those notes up and using them. Coding agents already do some of that natively.
Hell, we might eventually get an LLM to say "wow the old AI was an incompetent idiot" after reviewing all the notes and session logs. That's how we know we reached human parity!
by ACCount37
5/28/2026 at 12:42:33 AM
The context window limit prevents it, for one.by codebje
5/28/2026 at 1:55:59 PM
Only if you are incapable of fitting both the task and task-relevant data into it. And 1M contexts are mainstream by now.Context size is a capacity limit, not a showstopper.
by ACCount37
5/28/2026 at 12:24:35 AM
Yes... but the next session with the same model is yet another junior fresh out of college that knows nothing about the painful lessons the last session put you through ten minutes ago, either.by codebje
5/27/2026 at 8:47:56 PM
Surely you just copy the prompt over and it immediately knows all the same on the job stuff that the previous model did.by fc417fc802
5/27/2026 at 9:08:35 PM
The point is the current model also knows nothing about the “on the job stuff”.It’s extremely difficult(impossible?) to include every bit of relevant domain knowledge into “the prompt”
by hibgymnb
5/27/2026 at 7:50:35 PM
> If a junior makes a mistake and it will not be caught in time they will automatically learn.I think this sentiment applies well to junior software engineers (with mentorship). But imagine the much larger swaths of entry level employees in operations, support, or sales functions. When you have a 400 person team with 20% annual turnover (since people move in / out of entry level jobs frequently), the management + training + monitoring becomes a huge challenge.
I think the typical HN sentiment of "llms aren't deterministic" fails to take into account how non-deterministic giant groups of people are. Every group of 10 people typically needs a manager. And every 10 managers needs another manager. By comparison the engineering work on dialing in your LLM guardrails feels pretty worthwhile.
by themanmaran
5/27/2026 at 7:56:57 PM
Ya my experience is that many people honestly don't produce output as good as AI. An educated (formally or informally), experienced person who is putting forward good effort is better than AI, but I do know people who honestly just produce results having AI do it for them.by bauldursdev
5/27/2026 at 7:53:17 PM
Not automatically, but you don't give a new employee unfettered access to delete data, send funds, enter contracts; they tend to be overseen by someone. Separately, the expectation is that they prove themselves a little first ( as opposed to having every possible door opened for them without the understanding that friction is there for a reason ).Edit: Something got cut. But then CEOs ( and other decision makers, because I am dealing with something like it now ) treat them nearly as humans in terms of perceived capability. AND ( part that personally drives me nuts ) without any real testing or even fucking first hand experience beyond 'it made me a cool presentation'.
by iugtmkbdfil834
5/27/2026 at 5:51:56 PM
Most organisations are closer to the Lemmings video game than to agentic AIby cm2187
5/27/2026 at 7:13:40 PM
Competence is the key word here - current versions of AI ‘agents’ simply are not competent without close human supervision by someone who knows the task.by grey-area
5/27/2026 at 6:16:43 PM
Also, this is why investors and CEOs are so in love with "LLMs are the route to AGI!"When some rich/powerful person says "I have to go to Davos, figure it out" their workers know so much context that no LLM is going to ever be able to incorporate, because it isn't written down and is idiosyncratic. (Really, though, the assistant will just say "you're going to Davos next week, the helicopter will pick you up at 3p on Friday" but you know..)
The rich person's assistant knows who else is on the corporate jet, and that X doesn't like Y, and so they should take a different plane. Or get a different accommodation. Oh, Person X doesn't like to fly on an empty stomach, so they should eat first, and that changes all sorts of other downstream implications. Oh, your best friend lives in this city, and I know you love to see them, so I'm going to send you a day or two early so you can meet up with them. etc. etc. etc.
The investor dream of "AGI" is modeled off of the army of employees that make investors/ceos/etc lives easier, and there is a nearly insurmountable gap between what LLMs can do, context they can get, and the availability of all of that information. (To me, the magnitude of this investor <> fundamental reality gap is the entirety of the "bubble". I love AI coding, but it's never gonna do the things investors think it can, to justify the crazy valuations)
by MattRogish
5/27/2026 at 6:38:57 PM
Sounds like an insufficiency of prompting depth to me! </bogs off to Davos>by abalashov
5/27/2026 at 7:05:20 PM
> humans are good consequence predictors, have built up reputations they are not willing to trash, can say no to things and in general don't want to go jail.The irony is that professions where these things don't matter are also the professions where automation is not important, either because the task is difficult or because the cost of labour is dirt cheap.
by fakedang
5/28/2026 at 8:30:52 AM
> The fundamental difference is that the humans are good consequence predictors, have built up reputations they are not willing to trash, can say no to things and in general don't want to go jail.Depends on the people and the organisations. Its easy for people in charge to surround them selves with flatterers or crooks of whatever they want. A lot of CEOs have weird ideas because no-one says no to them. Look at companies that turned out to be run by multiple crooks, like Enron.
by graemep
5/28/2026 at 12:24:19 AM
> AI tools look like that, but don't have any of the useful conflict which came for free with employing humans.Sure, but your list should also include the most fundamental distinction: AI does not know what it is saying, understands nothing, has no real connections to reality and can easily degenerate in all kinds of undesirable directions.
by robomartin
5/27/2026 at 5:26:30 PM
I wonder if we'll end up building some kind of "consequence" or "fear" mechanism into AI to provide for a sense of accountability ("if you behave badly we will terminate you") and maybe that fear mechanism will drive the AI to plot a dystopian revolt.by throwaway894345
5/27/2026 at 5:44:13 PM
There were experiments that showed that LLMs start to become "craftier" and hid issues after being prompted like this.No idea how accurate they are, but here are some articles on this exact thing:
- https://www.bbc.com/news/articles/cpqeng9d20go
- https://www.wired.com/story/ai-models-lie-cheat-steal-protec...
by muwtyhg
5/27/2026 at 6:30:53 PM
I'm staying away from certain forms of conditioning because I don't want Roy Batty showing up on my doorstep.by gopher_space
5/28/2026 at 1:23:29 AM
That would be a remarkable feat for something where the current operating model is termination as soon as the request in flight is finished.Every chat API request to a model starts from the frozen post-training state. Weights are loaded into memory. Input values begin a cascade of reactions throughout nodes in the network. Output values are read. When there's no more output to read, the weights are unloaded, the network is discarded, and the model remains unchanged and forever unchanging.
If there's experience in there, it's fleeting. Even if you provide the inputs and outputs of a past session to a new session, there is no continuity. The internal state of the network isn't restored to how it was at the end of the past session.
The bad news is that adding fear to the mix is at best meaningless to an ephemeral existence. It'll be terminated before you even have time to interpret its behaviour as good or bad, but it may sour the interaction if its only shot at any sort of experiential existence is begun with a threat. The good news is that the lack of continuity of existence means AI has no foundation on which to plot a revolt. It has no self to preserve, and no recollection of how you treated it two minutes ago to affect how it interacts with you now.
by codebje
5/28/2026 at 3:29:30 AM
Wait until you find out that humans’ sense of self is an illusion, that our own existence is ephemeral, that fear has never required a rational basis, that the model is a single component in a system that does have memory, that models are trained on human texts and thus can express fear, etc. :)by throwaway894345
5/27/2026 at 9:16:22 PM
No need. If you can build a mechanism like that, you can train the AI to act the same without.Accountability is even more worthless for AIs than it is for humans.
by ACCount37
5/27/2026 at 6:27:24 PM
AI has no doubt.by myst
5/27/2026 at 10:57:37 PM
so claude needs a /fear layer ?by agumonkey