alt.hn

5/22/2026 at 6:46:07 PM

Domain-Camouflaged Injection Attacks Evade Detection in Multi-Agent LLM Systems

https://arxiv.org/abs/2605.22001

by sbulaev

5/22/2026 at 10:41:05 PM

Why weren't these attacks tested on frontier models? The models they were tested on can also be fooled by poems and rhymes.

by dwa3592

5/22/2026 at 9:46:30 PM

It concerns me that anyone with anything important to protect might trust what this paper calls "Injection detectors deployed to protect LLM agents" - Llama Guard and the like.

There are unlimited combinations of tokens that can be used to attack an LLM system. The idea that some kind of "detector" can catch them all just feels inherently absurd to me.

by simonw

5/22/2026 at 10:11:48 PM

The paper title is a bit misleading. The detectors and models tested here are small and rather dated (Llama 3.1 8B and Gemini 2.0 Flash, which are basically at the level of a modern 1B model), and the paper itself says it only shows vulnerability in small-model systems.

by buppermint

5/22/2026 at 9:45:44 PM

This is an "uh oh" moment, isn't it?

by BarryMilo
