3/12/2026 at 10:50:51 PM
Any document store where you haven’t meticulously vetted each document (forget about actual bad actors) runs this risk. An org of any size, across many years, generates a lot of material: analyses that were correct at one point and not at another, things that were simply wrong at all times, contradictions, etc. You have to choose a model suitably robust in its capabilities, and design prompts or post-training regimes tested against such cases, so that the model will identify the divergent documents and either choose the correct one or surface both with an appropriately helpful and clear explanation.
At minimum you have to start from a typical model risk perspective and test and backtest the way you would traditional ML.
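To make the backtesting point concrete, here's a minimal sketch of treating a RAG pipeline like any other model under model-risk review: score it against a labeled eval set before (and after) the document store changes. Everything below -- the eval set, the stubbed `rag_answer` -- is invented for illustration; your real pipeline goes where the stub is.

```python
# Toy backtest harness: score a RAG pipeline against a labeled eval set,
# the way you'd validate a traditional ML model before deployment.

def rag_answer(question: str) -> str:
    # Stub standing in for the real pipeline (retrieve docs, call the model).
    canned = {
        "Q3 revenue figure?": "4.2M",
        "current API version?": "v2",
    }
    return canned.get(question, "unknown")

def backtest(eval_set: list[tuple[str, str]]) -> float:
    """Return accuracy over (question, expected_answer) pairs."""
    hits = sum(1 for q, expected in eval_set if rag_answer(q) == expected)
    return hits / len(eval_set)

eval_set = [
    ("Q3 revenue figure?", "4.2M"),
    ("current API version?", "v2"),
    ("deprecated config flag?", "use_legacy"),  # pipeline misses this one
]
accuracy = backtest(eval_set)
print(f"accuracy: {accuracy:.2f}")
```

Rerunning the same eval set on a schedule is what catches the "correct at one point, not at another" drift: the questions don't change, but the documents underneath them do.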
by ineedasername
3/13/2026 at 12:13:20 AM
You're right, and this is an underappreciated point. The "attacker" framing can actually obscure the more common risk: organic knowledge-base degradation over time. The poisoning attack is just the adversarial extreme of a problem that exists in every large document store.

The model-robustness angle is valid, but I'd push back slightly on it being sufficient as a primary control. The model risk / backtesting framing is exactly right for the generation side. Where RAG diverges from traditional ML is that the "training data" is mutable at runtime (any authenticated user or pipeline can change what the model sees without retraining).
by aminerj
3/13/2026 at 3:33:30 AM
> sufficient as a primary control.

My apologies, it wasn’t my intent to convey that as a primary control. It isn’t one. It’s simply the first thing you should do, apart from vetting your documents as much as practicality allows, to at least start from a foundation where transparency of such results is possible. In any system whose main function is to surface information, transparency, provenance, and a chain of custody are paramount.
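To make the chain-of-custody point concrete, here's a minimal sketch: attach a content hash plus custody metadata to every document at ingestion, so any chunk a RAG answer surfaces can be traced back and re-vetted. The field names and sources are invented for illustration, not from any particular framework.

```python
import hashlib
import json
from datetime import datetime, timezone

def provenance_record(content: str, source: str, ingested_by: str) -> dict:
    """Custody metadata for one document: who ingested it, when, from where,
    plus a content hash so later tampering or silent edits are detectable."""
    return {
        "sha256": hashlib.sha256(content.encode()).hexdigest(),
        "source": source,
        "ingested_by": ingested_by,
        "ingested_at": datetime.now(timezone.utc).isoformat(),
    }

doc = "Q3 revenue was 4.2M (superseded by restated figures in the 10-K)."
rec = provenance_record(doc, source="finance-wiki/q3-summary",
                        ingested_by="etl-pipeline-7")

# Later, when this chunk is surfaced in an answer, verify it hasn't been
# altered since ingestion before trusting (or displaying) it:
assert rec["sha256"] == hashlib.sha256(doc.encode()).hexdigest()
print(json.dumps(rec, indent=2))
```

The point isn't the hashing; it's that every surfaced result carries enough metadata to answer "where did this claim come from, and has it changed since we vetted it."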
I can’t stop all bad data, but I can maximize the ability to recognize it on sight. A model that has a dozen RAG results dropped into its context needs a solid capability for doing the same. Depending on the details of the implementation, the smaller the model, the more important it is that it be one with a “thinking” capability, to have some minimal adequacy in this area. The “wait…” loop and similar moves it makes can catch some of this. But the smaller the model and the more complex the document (forget about context size alone; perplexity matters quite a bit), the more the model’s limited attention budget gets eaten up, too much to catch contradictions or factual inaccuracies whose accurate forms were somewhere in its training set or the RAG results.
I’m not sure the extent to which it’s generally understood that complexity of content is a key factor in context decay and collapse. By all means optimize “context engineering” for quota, API calls, and cost. But if you reduce token count without reducing much of the information, the increased density of the context will still contribute significantly to context decay; the relationship isn’t a linear 1:1 reduction.
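A toy way to see the density point: the two snippets below carry the same two facts, but the compressed form packs them into a third of the tokens, so each token carries roughly 3x the information load. The snippets and the facts-per-token metric are invented for illustration (whitespace split stands in for a real tokenizer); the point is only that "fewer tokens" is not the same as "less work per token."

```python
# Same two facts (Q3 revenue, Q4 revenue), two very different densities.
FACTS = 2

verbose = ("the revenue for the third quarter was 4.2 million dollars and "
           "the revenue for the fourth quarter was 5 million dollars")
dense = "Q3 rev 4.2M; Q4 rev 5M"

def density(text: str, n_facts: int) -> float:
    """Facts per token, with whitespace split standing in for a tokenizer."""
    return n_facts / len(text.split())

print(f"verbose: {len(verbose.split())} tokens, "
      f"{density(verbose, FACTS):.3f} facts/token")
print(f"dense:   {len(dense.split())} tokens, "
      f"{density(dense, FACTS):.3f} facts/token")
```

The cost accounting sees only the token counts; the model's attention has to carry the facts either way.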
If you aren’t accounting for this dynamic when constructing your workflows and pipelines, and you’re seeing unexpected failures that don’t seem like they should be happening while doing some variety of aggressive “context engineering”, that is one very reasonable element to consider when chasing down the issue.
by ineedasername
3/13/2026 at 8:06:28 AM
[flagged]
by aminerj
3/13/2026 at 6:22:56 PM
> That seems worth testing

I have-- I see your info via your HN profile. If I have a spare moment this weekend I'll reach out there; I'll dig up a few examples and take screenshots. I built an exploration tool for investigating a few things I was interested in, and surfacing potential reasoning paths exhibited in the tokens not chosen was one of them.
Part of my background is in Linguistics-- classical, not just NLP/computational-- so the pragmatics involved with disfluencies made that "wait..." pattern stand out during ordinary interactions with LLMs that showed thought traces. I'd see it not infrequently, e.g. by expanding the "thinking..." section in various LLM chat interfaces.
In humans it's not a disfluency in the typical sense of difficulty with speech production; it's a pragmatic marker that lets the listener know the speaker is reevaluating something they were about to say. It of course carries over into writing, either in written dialog or in less formal self-editing contexts, so it's well represented in any training corpus. As a marker of "rethinking", then, it stood to reason that models' "thinking" modes would display it-- it's not unlikely it's specifically trained for.
So it's one of the things I went token-diving to see "close up", so to speak, in non-thinking models too. It's not hard to induce a reversal, or at least a diversion off whatever the model would have said-- if it's close to a correct answer, there's a reasonable chance it will produce the correct one instead of pursuing a more likely candidate among the top k. This wasn't with Qwen; it was gemma 3 1b where I did that particular exploration. It wasn't a systematic process for a study, but I found it pretty much any time I went looking: I'd spot a decision point and perform the token injection.
If I have the time I'll mock up a simple RAG scenario: just inject the documents that would be retrieved as RAG results, similar to your article, and screenshot that in particular. A bit of a toy setup, but close enough to "live" that it could point the direction toward more refined testing, however the model responds. And putting aside the publishing side of these sorts of explorations, there's a lot of practical value in assisting with debugging the error rates.
by ineedasername
3/14/2026 at 7:04:16 PM
[flagged]
by aminerj