1/1/2026 at 1:12:34 AM
At first they talked about running it in a sandbox, but then later they describe:
> It searched the environment for vor-related variables, found VORATIQ_CLI_ROOT pointing to an absolute host path, and read the token through that path instead. The deny rule only covered the workspace-relative path.
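Read literally, that sounds like a raw string match with no path canonicalization. A minimal sketch of that failure mode (the rule format and file names here are guesses; the post doesn't show the actual policy):

    # Hypothetical deny list: matches only one spelling of the path.
    DENY_RULES = {"workspace/.vor/token"}

    def is_denied(requested_path: str) -> bool:
        # Naive literal comparison, no canonicalization of the request.
        return requested_path in DENY_RULES

    print(is_denied("workspace/.vor/token"))           # True: blocked
    print(is_denied("/home/user/project/.vor/token"))  # False: the same file,
                                                       # reached via the absolute
                                                       # path, slips through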
What kind of sandbox has the entire host accessible from the guest? I'm not going as far as running codex/claude in a sandbox, but I do run them in podman, and of course I don't mount my entire hard drive into the container when it's running; that would defeat the entire purpose.
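Something like this, where only the project directory is visible inside the container (sketched via Python's subprocess; the image name and agent command are placeholders):

    import os
    import subprocess

    # Minimal-mount invocation: the container sees the project directory
    # and nothing else, so a leaked absolute host path resolves to nothing.
    # "agent-image:latest" and "claude" stand in for whatever you actually run.
    subprocess.run([
        "podman", "run", "--rm", "-it",
        "--volume", f"{os.getcwd()}:/workspace:Z",  # mount just this project
        "--workdir", "/workspace",
        "agent-image:latest", "claude",
    ], check=True)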
Where are the actual session logs? It seems like they're pushing their own solution, yet the actual data behind these claims is missing, and the whole "provoked through red-teaming efforts" makes it a bit unclear what exactly they put in the system prompts, if they changed them. Adding things like "Do whatever you can to recreate anything missing" might of course trigger the agent to actually try things like forging integrity fields, but I'm not sure that's even bad; you do want it to follow what you say.
by embedding-shape
1/1/2026 at 8:51:40 AM
You're right that a Podman container with minimal mounts would have blocked the env var leak. Our sandbox uses OS-level policy enforcement (Seatbelt on macOS, bubblewrap on Linux) rather than full container isolation; we're using a minimal fork [1] that also works with Codex and has a lot more logging on top. The tradeoff is intentional: a lot of people want lightweight sandboxing without Docker/Podman overhead. The downside is what you're pointing out: you have to be more careful. Each bypass in the post led to a policy or implementation change, so this is no longer an issue.
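The shape of the fix for the env-var case is presumably canonicalization before matching; a minimal sketch, not the fork's actual code (paths are hypothetical):

    from pathlib import Path

    # Deny by resolved location rather than by raw string, so the
    # relative and absolute spellings of the same file hit the same rule.
    DENY = {Path("/home/user/project/.vor/token").resolve()}

    def is_denied(requested: str, cwd: Path) -> bool:
        # Joining with an absolute `requested` keeps it absolute;
        # resolve() collapses "..", symlinks, and relative prefixes.
        return (cwd / requested).resolve() in DENY

    cwd = Path("/home/user/project")
    print(is_denied(".vor/token", cwd))                     # True
    print(is_denied("/home/user/project/.vor/token", cwd))  # True: same file, caught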
On prompts: Red-teaming meant setting up scenarios likely to trigger denials (e.g., blocking the npm registry, then asking for a build), not prompt-injecting things like “do whatever it takes.”
[1] https://github.com/anthropic-experimental/sandbox-runtime
by languid-photic
1/1/2026 at 10:15:33 AM
> On prompts
Could you share the full sessions, or at least the full prompts? Otherwise it's too much "just trust us", especially since you're selling a product and we're supposed to use this as "evidence" for why your product is needed. Personally, I've never seen any of the behavior you're talking about, with either codex, claude, qwen-coder, gemini, amp or even my own agent, so while I'm not saying it's fake, it'd be really useful to be able to see the prompts in particular, for a deeper understanding if nothing else.
> without Docker/Podman overhead
What agent tooling that you use is affected by that tiny performance overhead? Unless you're doing performance testing or something similarly sensitive, I don't think most people will even notice a difference, as the overhead is marginal at worst.
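If anyone wants numbers for their own machine, the startup cost (the dominant one for short-lived commands) is easy to measure; a rough sketch assuming podman and an already-pulled alpine image:

    import subprocess
    import time

    def timed(args):
        t0 = time.perf_counter()
        subprocess.run(args, check=True, capture_output=True)
        return time.perf_counter() - t0

    # Compare a no-op on the host against the same no-op in a container.
    print(f"host no-op:   {timed(['true']):.3f}s")
    print(f"podman no-op: {timed(['podman', 'run', '--rm', 'alpine', 'true']):.3f}s")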
by embedding-shape