alt.hn

3/1/2026 at 2:09:41 AM

The Science of Detecting LLM-Generated Text (2024)

https://dl.acm.org/doi/10.1145/3624725

by vinhnx

3/1/2026 at 6:36:15 AM

This is an article from 2024, when open-weight models like Llama were only beginning to emerge. With those you basically cannot do any reliable detection (as the authors admit by the end).

It really boils down to the generated text having statistical properties very similar to human-written text. Introduce a more motivated attacker and the text would be indistinguishable from the real thing (with occasional typos, no use of "delve", "it's not X, it's Y", em dashes and so on).

It really is a lost battle: you cannot embed extra information in the text that will survive even basic post-processing (in contrast to, say, steganography).

by xomiachuna

3/1/2026 at 8:35:59 AM

Ultimately it shouldn’t be too surprising that a machine that works by generating the most statistically likely text generates text that’s statistically identical to human-generated text.

by piperswe

3/1/2026 at 9:31:31 AM

I've never seen the word "delve" show up with such frequency in the pre-AI era, but now it's an overwhelmingly large signal of LLM-generated text, so I'm not sure where that came from. Ditto for vomiting emojis everywhere.
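
For anyone who wants to check this kind of signal on their own text, here is a minimal sketch that counts a few of the markers mentioned in this thread per 1,000 words. The marker list, the regexes, and the name `marker_rates` are all illustrative assumptions, not a validated detector, and a high rate proves nothing about any single text.

```python
import re

# Illustrative markers taken from this thread; NOT a validated feature set.
MARKERS = {
    # "delve", "delves", "delved", "delving"
    "delve": re.compile(r"\bdelv(?:e|es|ed|ing)\b", re.IGNORECASE),
    # U+2014 EM DASH, written as an escape so the source stays ASCII
    "em_dash": re.compile("\u2014"),
    # Rough match for "it's not X, it's Y" / "is not X, but Y"
    "not_x_but_y": re.compile(
        r"\b(?:it'?s|is) not (?:just )?[^.,;]{1,40}, (?:it'?s|but)\b",
        re.IGNORECASE,
    ),
}


def marker_rates(text: str) -> dict[str, float]:
    """Return occurrences of each marker per 1,000 whitespace-separated words."""
    words = len(text.split())
    if words == 0:
        return {name: 0.0 for name in MARKERS}
    return {
        name: 1000 * len(pattern.findall(text)) / words
        for name, pattern in MARKERS.items()
    }


sample = "Let us delve into this. It's not a bug, it's a feature."
print(marker_rates(sample))
```

Rates like these only mean something relative to a baseline corpus, for example the same author's older writing, and they are trivial to defeat by telling the model to avoid the listed phrases.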

by userbinator

3/1/2026 at 11:09:21 AM

I have heard that the human trainers for early LLMs were overwhelmingly from West Africa, so some of the word choices reflect that, including a preference for the word delve. This means that people from that part of the world are now frequently and unfairly suspected of being AI.

by TRiG_Ireland

3/1/2026 at 2:08:01 PM

The models are designed to be these fake positive corpospeak yes man assholes by fake positive asshole corporations.

Mirroring real human text is only the basis of training. Afterwards they get aligned a.k.a. lobotomized.

by blahaj

3/1/2026 at 11:32:46 AM

It's not statistically identical to human writing.

by lelanthran

3/1/2026 at 10:40:25 AM

> the machine that works by generating the most statistically likely text

You've just described a “base model” (or pre-trained model), but later training stages (RLHF, GRPO, whatever secret sauce model makers use) induce a strong bias in the output.

Also, being “statistically identical to human-generated text” doesn't mean it's unrecognizable, because human-generated text falls into many distinct clusters (you don't text your friends in the same language you write a book in), and an LLM can, and in practice does, use language whose tone doesn't fit what a human expects in a given context (like when bots write LinkedIn-worthy posts in a Reddit comment section). The “average human-looking text” is as unnatural to us as a “synthetic average human” with one testicle and half a vagina would be.

by littlestymaar

3/1/2026 at 9:28:06 AM

I'm not so sure I buy that. AI-written text is fairly obvious to good writers with exposure to LLM output. Could it be that it's a sort of average of writing styles, but that average is not human, and so humans can detect it?

by slopinthebag

3/1/2026 at 9:45:05 AM

AI writing you can recognize as AI writing is obvious. Newer models are better about this and the line will only get more blurry. Here's a benchmark where good writers make the assessment rather than different LLMs ranking each other: https://surgehq.ai/leaderboards/hemingway-bench

The top models are also the latest:

Gemini 3.1 Pro: still a bit of a gremlin, but will probably stay on top until the other model makers go xkcd 810 and target this benchmark

Gemini 3 Flash: current favorite of writers using it as a helper for its speed and decent prompt following

by Kye

3/1/2026 at 9:30:23 PM

Yeah I think it's more about effort than anything - if the user puts in effort to make the writing indistinguishable from human writing, I'm not so sure it's really a bad thing. Low effort slop is detectable however, and that's a good sign to just not continue reading it.

by slopinthebag

3/1/2026 at 8:01:40 AM

It sounds like a "cursed problem". Are there any contemporary techniques that show any promise?

by nylonstrung

3/1/2026 at 5:46:34 AM

I see a lot of people claiming just about everything is AI these days, including totally normal videos, photos and text. I'm not sure what the solution to this phenomenon will be, but we're in for a bit of trouble for a while.

by giancarlostoro

3/1/2026 at 8:26:57 AM

Detecting LLM-generated text is basically solved by modern watermarking techniques (https://arxiv.org/abs/2306.09194). However, the main trouble with watermark-based approaches is that you have to get every LLM provider to adopt it. A student trying to cheat could always opt for some open-weight Chinese model, if the word spreads that the major providers are compromised.
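
To make the idea concrete, here is a minimal sketch of statistical watermark detection. It uses a generic green-list scheme in the style of Kirchenbauer et al., which is an assumption for illustration and not necessarily the construction in the linked paper. `KEY`, `GAMMA`, `is_green`, `watermark_z_score`, and `toy_watermarked_text` are all hypothetical names, and the "generator" is a toy stand-in for a real sampler that would bias logits.

```python
import hashlib
import math

GAMMA = 0.5         # assumed fraction of the vocabulary marked "green" per step
KEY = b"demo-key"   # hypothetical secret shared by generator and detector


def is_green(prev_token: str, token: str) -> bool:
    """Keyed pseudo-random green/red assignment, seeded by the previous token."""
    data = KEY + prev_token.encode() + b"\x00" + token.encode()
    return hashlib.sha256(data).digest()[0] / 256 < GAMMA


def watermark_z_score(tokens: list[str]) -> float:
    """z-score of the green-token count under the no-watermark null hypothesis."""
    n = len(tokens) - 1  # number of (previous, current) token pairs
    if n <= 0:
        return 0.0
    green = sum(is_green(a, b) for a, b in zip(tokens, tokens[1:]))
    return (green - GAMMA * n) / math.sqrt(n * GAMMA * (1 - GAMMA))


def toy_watermarked_text(vocab: list[str], length: int, start: str = "the") -> list[str]:
    """Toy generator: always emit the first green candidate, if there is one."""
    out = [start]
    for _ in range(length):
        nxt = next((w for w in vocab if is_green(out[-1], w)), vocab[0])
        out.append(nxt)
    return out


vocab = [f"w{i}" for i in range(20)]
print(watermark_z_score(toy_watermarked_text(vocab, 200)))   # watermarked
print(watermark_z_score([str(i) for i in range(400)]))       # no watermark
```

The watermarked sequence should score far above a threshold like 4, while unrelated text should hover near 0. The same arithmetic shows the weakness: the detector needs the key, and paraphrasing replaces token pairs, which pulls the green count back toward chance.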

by Akranazon

3/1/2026 at 8:59:13 AM

Section 6, "Removing Watermarks," of the paper you cite makes it very clear that detecting LLM-generated text is not solved if the user takes measures to avoid detection.

by yorwba

3/1/2026 at 8:02:21 AM

Detection methods only serve to stop the most blatant, low-effort kind of LLM responses. The more pressing issue is that people are reading LLM output and paraphrasing it for their assignments, reports, emails, etc. The obvious problem is that LLMs are often wrong, or miss nuance in ways a layperson won't notice. The secondary problem is the general outsourcing of thinking and effort, even for tasks that you ought to give your focus to. BTW: from my anecdata, most university students are absolutely violating academic integrity with these tools, and have completely lost the ability to engage without them.

by wps

3/1/2026 at 8:36:22 AM

Pretty much. I came across a student message board with all the tricks to fool any detection. The bar is really low.

Once you give the LLM examples of your prior work and ask it to continue in that style, it's game over for detection.

by jaimex2

3/1/2026 at 8:14:13 AM

I built a model fingerprinting tool last year and it’s entirely open source: 196 dimensions, on GitHub at johnzfitch/specho-v2 (and /specho for docs).

by nextzck