3/22/2026 at 8:12:26 AM
Title for the back of the class: "Prompts sometimes return null"
I would be very cautious about attributing any of this to black-box LLM weight matrices. Products like GPT and Opus are more than a single model. These products rake your prompt over the coals a few times before responding now. Telling the model to return "nothing" is very likely to perform to expectation given these extra layers.
by bob1029
3/22/2026 at 2:21:08 PM
Out of curiosity, are there any sources for there being a significant number of other steps before the prompt is fed into the weights? Security guards / ... are the obvious ones, but do you mean they have branching early on to shortcut certain prompts?
by frde_me
3/22/2026 at 5:25:28 PM
> do you mean they have branching early on to shortcut certain prompts?

Putting a classifier in front of a fleet of different models is a great way to provide higher-quality results and spend less energy. Classification is significantly cheaper than generation, and it is the very first thing you would do here.
A default, catch-all model is very expensive but handles most queries reasonably well. The game from that point is to aggressively intercept prompts that would otherwise hit the catch-all model with cheaper, more targeted models. I have a suspicion that OAI employs different black boxes depending on things like the programming language you are asking it to use.
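The routing idea above can be sketched in a few lines. This is a toy illustration only: the classifier, model names, and route table are all hypothetical, not any vendor's actual architecture.

```python
# Minimal sketch of classifier-based routing in front of a model fleet.
# All names here ("cheap-code-model", "expensive-catchall") are made up
# for illustration; the point is that a cheap classifier runs first and
# some prompts never reach the expensive default model at all.

def classify(prompt: str) -> str:
    """Toy stand-in for a cheap classifier model."""
    if not prompt.strip():
        return "empty"            # e.g. "return nothing" style prompts
    if "def " in prompt or "import " in prompt:
        return "code"
    return "general"

ROUTES = {
    "empty":   lambda p: "",                       # short-circuit, no generation
    "code":    lambda p: f"[cheap-code-model] {p}",
    "general": lambda p: f"[expensive-catchall] {p}",
}

def respond(prompt: str) -> str:
    # Dispatch on the classifier label; only "general" pays full price.
    return ROUTES[classify(prompt)](prompt)

print(respond(""))            # intercepted before any big model runs
print(respond("import os"))   # hits the cheaper targeted model
```

The economics follow directly: the classifier's cost is a tiny fixed tax on every prompt, paid to avoid the catch-all model's much larger per-token cost on the fraction of prompts that can be intercepted.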
by bob1029
3/22/2026 at 6:25:24 PM
Aren't you describing why they use mixture of experts, where a subset of weights is activated depending on the query?

by frde_me
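The distinction between the two ideas is worth making concrete. In a mixture of experts, the "experts" live inside one model and a learned gate activates a sparse subset of weights per token, unlike routing between entirely separate models. A minimal top-k gating sketch, with made-up experts and gate weights purely for illustration:

```python
import math

# Toy top-k mixture-of-experts gating. The expert functions and gate
# weights below are arbitrary illustrative values, not a real model.

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def moe_forward(x, experts, gate_weights, k=2):
    """Combine outputs of the top-k experts, weighted by the gate."""
    # Gate: score each expert from the input, then keep only the top k.
    logits = [sum(w * xi for w, xi in zip(gw, x)) for gw in gate_weights]
    scores = softmax(logits)
    top = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:k]
    norm = sum(scores[i] for i in top)
    # Only the selected experts do any work; the rest stay inactive.
    return sum(scores[i] / norm * experts[i](x) for i in top)

# Four toy "experts", each just scaling the summed input differently.
experts = [lambda x, a=a: a * sum(x) for a in (1.0, 2.0, 3.0, 4.0)]
gate_weights = [[0.1, 0.2], [0.3, 0.1], [0.0, 0.5], [0.2, 0.2]]
print(moe_forward([1.0, 2.0], experts, gate_weights, k=2))
```

Both tricks save compute by not running everything on every prompt, but MoE is a single model's internal sparsity, while the routing described upthread is dispatch across separate products.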
3/22/2026 at 8:57:57 AM
Thanks, I was already distracted after the first sentence, hoping there would be a good explanation.

by tiku