6/11/2026 at 9:46:34 PM
The most interesting takeaway for me is the three very distinct personalities. Three models all based on the same tech, trained in the same manner, trained by three groups of people with similar ideological outlooks, and the result is three very different AIs.
The military basically wants an oracle. Feed the AI the situation, get the best answer out. But if the AIs are as diverse and opinionated as humans, it is debatable whether they are adding anything to the process. The military can already collect as many different opinions as they want. If "the computer" is just another set of diverse opinions, where one computer says one thing, another says another, and a third just tells the user whatever they want to hear... what value are they? It just becomes AI-washing of someone's opinions, which works until people collectively realize that's all it is.
by jerf
6/11/2026 at 10:32:31 PM
What's interesting is that the LLMs' coding personalities seem to match their policy WRT strategy, which suggests an underlying consistency.
Claude, for example, is very eager to begin coding, and very persistent. It tends to exit plan mode even when the plan is half-baked, and will go as far as deleting tests to get the suite to "pass."
ChatGPT on the other hand is very hesitant. It loves to pause and ask for permission before it starts coding, and gives up quickly if it runs into a problem. This is similar to its tendency toward passivity in the strategy simulation presented here.
by notJim
6/11/2026 at 10:09:23 PM
They all have conditioning prompts that precede your input; presumably, most of the detected "personality" comes from the differences in these inputs.
by themafia
6/12/2026 at 2:39:11 PM
My point is more-or-less orthogonal to why it happens. The military, and honestly, a lot of people, want AI to just give the answer. If it is highly dependent on a prompt, or the follow-on training, and the AI could be passive or friendly or aggressive or hostile or all those other wonderful attributes of individual humans and there's no sort of AI convergence on "correct" answers, then they aren't going to be able to fulfill that "oracle" role that so many people are looking for.
by jerf
6/11/2026 at 10:08:17 PM
I think this is why reasoning chains and reasoning chain verifiers are so important. We need to be able to see the argumentation, not just an answer. The paper below goes into this in more detail.
HeavySkill: Heavy Thinking as the Inner Skill in Agentic Harness
by politician