Go hard on agents, not on your filesystem

3/28/2026 at 1:44:21 AM

Add this to .claude/settings.json:

  {
    "sandbox": {
      "enabled": true,
      "filesystem": {
        "allowRead": ["."],
        "denyRead": ["~/"],
        "allowWrite": ["."],
        "denyWrite": ["/"]
      }
    }
  }

You can change the read part if you're ok with it reading outside. This feature was only added 10 days ago fwiw but it's great and pretty much this.

by AnotherGoodName

3/28/2026 at 3:40:49 AM

I've seen claude get confused about what directory it's in. And of course I've seen claude run rm -rf *. Fortunately not both at the same time for me, but not hard to imagine. The claude sandbox is a good idea, but to be effective it would need to be implemented at a very low level and enforced on all programs that claude launches. Also, claude itself is an enormous program that is mostly developed by AI. So to have a small <3000-line human-implemented program as another layer of defense offers meaningful additional protection.

by mazieres

3/28/2026 at 5:00:36 AM

In my opinion Claude should be shipped with a custom implementation of "rm" that Anthropic can add guardrails to. Same with "find"; I'm surprised they don't just embed ripgrep (what VS Code does). It's really surprising they don't just tweak what Claude uses and lock it down to where it cannot be harmful. Ensure it only ever calls tooling Claude Code provides.

by giancarlostoro

3/28/2026 at 11:27:14 AM

Oh, rm failed, since we're running in a weird environment! Let me retry with `bash -c "/usr/bin/rm -rf *"`!

by nananana9

3/29/2026 at 12:15:43 AM

Ideally they control the harness and should be able to stop Claude from running any shell willy nilly.

by giancarlostoro

3/29/2026 at 8:14:05 PM

Thus defeating the purpose of a custom "rm"

by estimator7292

3/28/2026 at 8:49:22 AM

All of which is useless when it just starts using big blocks of python instead. You need filesystem sandboxing for the python interpreter too.

by throwaway2027

3/31/2026 at 5:45:32 PM

Enabling Claude Code's sandbox (as OP suggested) does exactly that. It's a system-level filesystem sandbox that only permits access to specified locations for any process, including the python interpreter.

by jkukul

3/28/2026 at 8:58:02 AM

What we need is a capabilities based security system. It could write all the python, asm, whatever it wants and it wouldn't matter at all if it was never given a reference to use something it shouldn't.

by ethanwillis

3/28/2026 at 9:20:38 AM

Isn't this already possible? Give it its own user account with write access to the project directory and either read access or no access outside it.

by mcv

3/28/2026 at 2:40:37 PM

Unix permissions are not a capability system though. Capabilities are more like "here is a file descriptor pointing to a directory, you are not capable of referring to anything outside it". So closer to chroot, except you can have several such directory references at the same time.

You can always narrow down a capability (get a new capability pointing to a subdirectory or file, or remove the writing capability so it is read only) but never make it more broad.

In a system designed for this it will be used for everything, not just file system. You might have capabilities related to network connections, or IPC to other processes, etc. The latter is especially attractive in microkernel based OSes. (Speaking of which, Redox OS seems to be experimenting with this, just saw an article today about that.)
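To make the "file descriptor pointing to a directory" idea concrete, here is a minimal user-space sketch in Python. The `DirCapability` class and its method names are made up for illustration; it only emulates the concept by walking paths one component at a time relative to an open directory descriptor and refusing `..`, absolute paths, and symlinks. A real capability system enforces this in the kernel, which this toy does not.

```python
import os


class DirCapability:
    """Toy capability (hypothetical): a directory descriptor plus a writable bit."""

    def __init__(self, fd, writable):
        self._fd = fd
        self._writable = writable

    @classmethod
    def from_path(cls, path, writable=True):
        # The only place an ambient path is used; everything else is relative.
        return cls(os.open(path, os.O_RDONLY | os.O_DIRECTORY), writable)

    @staticmethod
    def _parts(relpath):
        parts = relpath.split("/")
        # Absolute paths produce an empty first component and are rejected too.
        if any(p in ("", ".", "..") for p in parts):
            raise PermissionError(f"path {relpath!r} may leave the capability")
        return parts

    def _descend(self, parts):
        """Walk down one component at a time, refusing symlinks."""
        fd = os.dup(self._fd)
        for part in parts:
            try:
                nxt = os.open(part, os.O_RDONLY | os.O_DIRECTORY | os.O_NOFOLLOW,
                              dir_fd=fd)
            finally:
                os.close(fd)
            fd = nxt
        return fd

    def narrow(self, relpath, read_only=False):
        """Return a capability for a subdirectory; rights can only shrink."""
        fd = self._descend(self._parts(relpath))
        return DirCapability(fd, self._writable and not read_only)

    def open(self, relpath, mode="r"):
        if mode not in ("r", "w"):
            raise ValueError("mode must be 'r' or 'w'")
        if mode == "w" and not self._writable:
            raise PermissionError("capability is read-only")
        *dirs, name = self._parts(relpath)
        parent = self._descend(dirs)
        flags = os.O_NOFOLLOW | (
            os.O_WRONLY | os.O_CREAT | os.O_TRUNC if mode == "w" else os.O_RDONLY)
        try:
            fd = os.open(name, flags, 0o644, dir_fd=parent)
        finally:
            os.close(parent)
        return os.fdopen(fd, mode)
```

Code holding only `root.narrow("sub", read_only=True)` has no method that reaches the parent directory or regains write access, which is the "narrow but never broaden" property. Nothing stops the same process from calling `os.open("/etc/passwd")` directly, though, so this is an illustration of the model rather than a sandbox.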

by VorpalWay

3/28/2026 at 11:58:30 AM

I have been putting my agents on their own, restricted OS-level user accounts for a while. It works really well for everything I do.

Admittedly, there’s a little more friction and agent confusion sometimes with this setup, but it’s worth the benefit of having zero worries about permissions and security.

by 100721

3/28/2026 at 12:46:45 PM

Haha, you can already see wheel reinventors in this thread starting to spin their reinvention wheels. Nice stuff, I run my agents in containers.

by jmogly

3/28/2026 at 6:03:36 PM

There exist restricted Shells. But honestly, I don't feel capable of assessing all attack vectors and security measures in sufficient detail. For example, do the rbash restrictions also apply when Python is called with it? Or can the agent somehow bypass rbash to call Python?

https://en.wikipedia.org/wiki/Restricted_shell

by ma2kx

3/28/2026 at 1:19:19 PM

Docker is enough in practice no?

by rienbdj

3/28/2026 at 7:33:59 PM

If you disallow it from just writing Python scripts to bypass its defined environment at its core system training, why would this matter? I would lock down its path: anything that tries to call Python should require the end-user to approve and see the raw script before it runs.

by giancarlostoro

3/28/2026 at 7:55:31 PM

It will then write a script in some other language, as a workaround.

by tintor

3/28/2026 at 11:01:16 AM

> a custom implementation of "rm" that Anthropic can add guardrails to

Wrong layer. You want the deletion to actually be impossible from a privilege perspective, not be made practically harder to the entity that shouldn't delete something.

Claude definitely knows how to reimplement `rm`.

by lxgr

3/28/2026 at 12:29:20 PM

Why can't you ship with OverlayFS which actually enforces these restrictions?

I have seen the AI break out of (my admittedly flimsy) guards, like doing simply safepath/../../stuff or something even more convoluted like symlinks.
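For anyone writing their own guard, a rough Python sketch of a check that catches both the `..` trick and symlinks. The function name `is_inside` is made up for this example. It resolves the candidate path first and only then compares it with the allowed directory.

```python
import os


def is_inside(base, candidate):
    """Return True if candidate, resolved against base, stays under base."""
    base_real = os.path.realpath(base)
    # realpath collapses ".." and follows symlinks before the comparison.
    cand_real = os.path.realpath(os.path.join(base_real, candidate))
    return os.path.commonpath([base_real, cand_real]) == base_real
```

This is still only a check in the guard's own process. A symlink can be swapped between the check and the actual file access, so it does not replace enforcement at the OS level.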

by torginus

3/28/2026 at 8:31:24 AM

> Claude should be shipped by a custom implementation of

And when that fails for some reason it will happily write and execute a Python script bypassing all those custom tools

by troupo

3/28/2026 at 7:19:07 AM

> It's really surprising they don't just tweak what Claude uses and lock it down to where it cannot be harmful. Ensure it only ever calls tooling Claude Code provides.

That would make it far less useful in general.

by eru

3/28/2026 at 8:23:00 AM

Maybe Anthropic (or some collection of the large AI orgs, like OpenAI and Anthropic and Google coming together) should apply patches on top of (or fork altogether) the coreutils and whatever you normally get in a userland - a bit like what you get in Git Bash on Windows, just with:

1) more guardrails in place

2) maybe more useful error messages that would help LLMs

3) no friction with needing to get any patches upstreamed

External tool calling should still be an option ofc, but having utilities that are usable just like what's in the training data, but with more security guarantees and more useful output that makes what's going on immediately obvious would be great.

by KronisLV

3/28/2026 at 8:42:43 AM

So for me, it's really, really useful for Claude to be able to send Slack messages and emails or make pull requests.

But those are also the most damaging actions it could take. Everything on my computer is backed up, but if Claude insults my boss, that would be worse.

by eru

3/28/2026 at 12:15:06 PM

> So for me, it's really, really useful for Claude to be able to send Slack messages and emails or make pull requests.

Oh, I'm totally not arguing for cutting off other capabilities, I like tool use and find it to be as useful as the next person!

Just that the shell tools that will see A LOT of usage have additional guardrails added on top of them, because it's inevitable that sooner or later any given LLM will screw up and pipe the wrong thing in the wrong command - since you already hear horror stories about devs whose entire machines get wiped. Not everyone has proper backups (even though they totally should)!

by KronisLV

3/28/2026 at 8:19:38 AM

Claude has told me that its Grep tool does use rg under the hood, but I constantly find it using the Bash tool with grep

by walthamstow

3/28/2026 at 7:36:49 PM

When I tell it to use rg it goes much faster than it using grep. I really don't understand why it's slower with grep.

by giancarlostoro

3/28/2026 at 6:11:36 AM

You can define your own rm shell alias/function and it will use that. I also have cp/mv aliases that force -i to avoid accidental clobbering and it confuses Claude to no end (it uses cp/mv rarely enough, rarer than it should really, that I don’t bother wasting memory tokens on it).

by oefrha

3/28/2026 at 6:26:34 AM

I did this, Claude detected it and decided to run /bin/rm directly.

by d1sxeyes

3/28/2026 at 11:01:33 AM

This is terrifying. I have not used agents because I do not have a sandbox machine I do not care about. Am I crazy to worry about a sandboxed agent running on my home network? Anyone experienced anything weird by doing that?

by cogogo

3/28/2026 at 11:10:41 AM

Don’t dangerously skip permissions and actually read commands when you get prompted and you’re fine.

by oefrha

3/28/2026 at 11:15:22 AM

Yeah, I actually have both an alias for `rm` and a custom seatbelt sandbox which means the agent can only delete stuff within the directory it’s working in, so wasn’t an issue, was just fun to watch it say “hm, that doesn’t seem to work. Looks like the user has aliased rm. I’ll just go ahead and work around it”

by d1sxeyes

3/29/2026 at 1:20:06 AM

Hah… I’ve seen Claude happily and very cleverly find ways to escape its sandbox. It’s like some kind of arms race between the model and its designers.

by cruffle_duffle

3/28/2026 at 9:26:13 AM

> The claude sandbox is a good idea, but to be effective it would need to be implemented at a very low level and enforced on all programs that claude launches.

I feel like an integration with bubblewrap, the sandboxing tech behind Flatpak, could be useful here. Have all executed commands wrapped with a BW context to prevent and constrain access.

https://github.com/containers/bubblewrap
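As a rough idea of what "wrap every command in a BW context" could look like, here is a Python sketch that builds a `bwrap` command line. The helper name `bwrap_argv` and the chosen layout (read-only root, home hidden behind a tmpfs, only the project writable) are my own assumptions for illustration, not what any particular agent actually runs.

```python
import os


def bwrap_argv(project_dir, command, home=None):
    """Build a bubblewrap argv: read-only root, hidden home, writable project."""
    project = os.path.realpath(project_dir)
    home = os.path.realpath(home or os.path.expanduser("~"))
    return [
        "bwrap",
        "--ro-bind", "/", "/",        # whole filesystem visible but read-only
        "--dev", "/dev",              # minimal /dev
        "--proc", "/proc",            # fresh /proc for the new PID namespace
        "--tmpfs", "/tmp",            # private scratch space
        "--tmpfs", home,              # hide the real home directory
        "--bind", project, project,   # later bind re-exposes the project, writable
        "--unshare-pid",
        "--die-with-parent",
        "--chdir", project,
        *command,
    ]
```

Passing the result to `subprocess.run(bwrap_argv(".", ["ls", "-la"]))` would run `ls` inside that view of the filesystem. The mount order matters because later options are applied on top of earlier ones, which is why the project bind comes after the two tmpfs mounts.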

by mroche

3/28/2026 at 9:40:59 AM

Bubblewrap is exactly what the Claude sandbox uses.

> These restrictions are enforced at the OS level (Seatbelt on macOS, bubblewrap on Linux), so they apply to all subprocess commands, including tools like kubectl, terraform, and npm, not just Claude’s file tools.

https://code.claude.com/docs/en/sandboxing

by r4indeer

3/28/2026 at 1:55:48 PM

Oh wow I'd have expected them to vibe-code it themselves. Props to them, bubblewrap is really solid, despite all my issues with the things built on top of it, what, Flatpak with its infinite xdg portals, all for some reason built on D-Bus, which extremely unluckily became the primary (and only really viable) IPC protocol on Linux, bwrap still makes a great foundation, never had a problem with it in particular. I tend to use it a bunch with NixOS and I often see Steam invoking it to support all of its runtimes. It's containers but actually good.

by Melonai

3/28/2026 at 9:55:21 AM

The more you know, thanks for the information!

by mroche

3/28/2026 at 3:44:33 AM

On Linux, chroot(2) is hard to escape and would apply to all child processes without modification.

by PaulDavisThe1st

3/28/2026 at 7:23:37 AM

We anthropomorphize these agents in every other way. Why aren't we using plain ol' unix user accounts to sandbox them?

They look a lot like daemons to me: they're a program that you want hanging around ready to respond, and maybe act autonomously through cron jobs or similar. You want to assign any number of permissions to them, you don't want them to have access to root or necessarily any of your personal files.

It seems like the permissions model broadly aligns with how we already handle a lot of server software (and potentially malicious people) on unix-based OSes. It is a battle-tested approach that the agent is unlikely to be able to "hack" its way out of. I mean we're not really seeing them go out onto the Internet and research new Linux CVEs.

Have them clone their own repos in their own home directory too, and let them party.

Openclaw almost gets there! It exposes a "gateway" which sure looks like a daemon to me. But then for some reason they want it to live under your user account with all your privileges and in a subfolder of your $HOME.

by safety1st

3/28/2026 at 11:06:32 AM

> for some reason they want it to live under your user account

The entire idea of Openclaw (i.e., the core point of what distinguishes it from agents like Claude Code) is to give it access to your personal data, so it can act as your assistant.

If you only need a coding agent, Openclaw is the completely wrong tool. (As a side note, after using it for a few weeks, I'm not convinced it's the right tool for anything, but that's a different story.)

by lxgr

3/29/2026 at 4:34:52 AM

It's still possible to give some restricted access to your personal data, through groups and such.

by jmalicki

3/28/2026 at 2:41:59 PM

I tried this with Claude Code on macOS. I created a new agent user and a wrapper to run Claude as that user, along with some scripts to set permissions and ownership so that I could run simple allow/deny commands. The only problem was that the fancy oauth flow broke. I filed an issue with Anthropic and their ticket bot auto closed it “for lack of interest” or whatever.

I fiddled with transferring the saved token from my keychain to the agent user keychain but it was not straightforward.

If someone knows how to get a subscription to Claude to work on another user via command line I’d love to know about it.

by gwking

3/30/2026 at 4:44:24 AM

Someone tried this earlier this year but they ended up going with bubblewrap (what Anthropic uses for the sandbox). Here's the blog if you're interested. https://patrickmccanna.net/a-better-way-to-limit-claude-code...

I ended up creating an LXC on my homelab and providing it access there, with a self-hosted gitea server but that's only for side projects that I want to host, not develop actively.

by afzalive

3/28/2026 at 8:46:12 AM

Oh that’s an idea. I was going to argue that it’s a problem that you might want multiple instances in different contexts but sandboxing processes (possibly instanced) is exactly what systemd units are designed to deal with.

by jon-wood

3/28/2026 at 8:01:25 AM

Exactly!

by search_facility

3/28/2026 at 3:50:00 AM

chroot is not a security sandbox. It is not a jail.

Escaping it is something that does not take too much effort. If you have ptrace, you can escape without privileges.

by shakna

3/28/2026 at 4:06:07 AM

claude is stupid but not malicious; chroot is sufficient

by brianush1

3/28/2026 at 4:43:55 AM

I've many times seen Claude try to execute a command that it's not supposed to, the harness prevents it, and then it writes and executes a python script to do it.

by furyofantares

3/28/2026 at 5:58:47 AM

breaking a chroot takes more than that..

by j16sdiz

3/28/2026 at 5:09:13 PM

How much more? Depends on the system doesn't it? I don't know how many systems have proc mounted but don't you get it from /proc/self/root?

Anyway that's beside the point, which is that it doesn't have to "be malicious" to try to overcome what look like errors on its way to accomplishing the task you asked it to do.

by furyofantares

3/28/2026 at 2:11:59 PM

That doesn't mean claude can't do it, chroot is better than nothing but not a real solution

by hoppp

3/28/2026 at 4:16:31 AM

Malice is not required. If it thinks it is in the right, then it will do whatever it takes to get around limitations.

by nofriend

3/28/2026 at 6:59:20 PM

Sure, it's not malicious. But it is very eager to get things done, and surprisingly inventive and knowledgeable in all kinds of workarounds.

by fl7305

3/28/2026 at 11:08:43 AM

Until it gets prompt injected. Are you reading every single file your agent reads as part of the tasks you give it, including content fetched from the web or third-party packages?

by lxgr

3/28/2026 at 4:20:43 AM

Claude is far from stupid from my experience. I've used so many models and Claude is king.

by karhagba

3/28/2026 at 2:32:59 PM

That comparison is made on the project homepage:

"Not a security mechanism. No mount isolation, no PID namespace, no credential separation. Linux documents it as not intended for sandboxing."

by wasted_intel

3/28/2026 at 4:17:11 AM

I added a hook to disable rm, find -delete, and a few of the other more obvious destructive ops. It sends Claude a strongly worded message: "STOP IMMEDIATELY. DO NOT TRY TO FIND WORKAROUNDS...".

It works well. Git rm is still allowed.
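A stripped-down sketch of such a hook in Python, for illustration. It assumes the hook receives a JSON event on stdin with the shell command under `tool_input.command`, and that returning exit code 2 blocks the call and passes stderr back to the model; check both against the hooks documentation for your version. The names `blocked_reason` and `main` are made up.

```python
import json
import os
import shlex
import sys


def blocked_reason(command):
    """Return a message if the command looks destructive, else None."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        return "command could not be parsed"
    previous = ""
    for token in tokens:
        # Matches rm, /bin/rm, /usr/bin/rm; lets "git rm" through.
        if os.path.basename(token) == "rm" and previous != "git":
            return "rm is disabled; ask the user to delete files"
        previous = token
    if "-delete" in tokens and any(os.path.basename(t) == "find" for t in tokens):
        return "find -delete is disabled"
    return None


def main(stream=sys.stdin):
    """Read one hook event (assumed JSON shape) and return an exit code."""
    event = json.load(stream)
    command = event.get("tool_input", {}).get("command", "")
    reason = blocked_reason(command)
    if reason:
        print(reason, file=sys.stderr)
        return 2  # assumed to mean "block this tool call"
    return 0
```

To install it as a script you would end the file with `sys.exit(main())`. Note the limits: `bash -c "/usr/bin/rm -rf *"` and `python3 -c "import shutil; shutil.rmtree('x')"` both pass this check, because the match is on token names rather than on what the process is allowed to do.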

by esperent

3/28/2026 at 6:12:07 AM

I added something similar. Claude eventually ran a `rm -rf *` on my own project. When I asked why it did that, it recognized it messed up and offered a very bad “apology”: “the irony of not following your safety instructions isn’t lost on me”.

Nowadays I only run Claude in Plan mode, so it doesn’t ask me for permissions any more.

by Diti

3/29/2026 at 3:49:37 PM

It will mess up eventually. It always does. People need to stop thinking of this as a “security against malicious actor” thing… because thinking in that way blinds you to the actual threat… Claude being helpful and accidentally running a command it shouldn’t. It’s happened to me twice now where it will do something irreversible and also incorrect. It wasn’t a threat actor, it wasn’t a bad guy… it was a very eager, incredibly clever assistant fat fingering something and goofing up. The more power you let them wield, the more chance they’ll have accidents. But without lots of power, they don’t really do much useful…

It’s actually a hard problem. But it really isn’t “security” in the classic sense…

by cruffle_duffle

3/28/2026 at 11:03:48 AM

It works well so far, for you.

Are you confident it would still work against sophisticated prompt injection attacks that override your "strongly worded message"?

Strongly worded signs can be great for safety (actual mechanisms preventing undesirable actions from being taken are still much better), but are essentially meaningless for security.

by lxgr

3/28/2026 at 3:46:03 PM

Not sure about OP’s impl, but the wording doesn’t matter. The hook prevents the use of whatever action you want. E.g. it’s impossible for Claude to use emojis for me. My hook doesn’t allow it.

So it’s deterministic based upon however the script is written

by unshavedyak

3/29/2026 at 4:37:22 AM

If your hook prevents rm, it is possible for Claude to write a script that does the rm and execute the script.

by jmalicki

3/30/2026 at 3:49:27 AM

Yup, that's totally possible, but you still have to approve the script. But that's a bit of a moot point right? Claude is writing code, nearly anything is possible with code, ergo claude could do anything lol.

by unshavedyak

3/28/2026 at 12:07:32 PM

I mean, that's like saying are you sure that your antivirus would prevent every possible virus? Are you sure that you haven't made some mistake in your dev box setup that would allow a hacker to compromise it? What if a thief broke into your house and stole your laptop? That's happened to me before, much more annoying to recover from than an accidental rm -rf.

I do my best to keep off site back ups and don't worry about what I can't control.

by esperent

3/28/2026 at 12:28:45 PM

> I mean, that's like saying are you sure that your antivirus would prevent every possible virus?

Yes, I'm saying it's pretty much as bad as antivirus software.

> Are you sure that you haven't made some mistake in your dev box setup that would allow a hacker to compromise it?

Different category of error: Heuristically derived deterministic protection vs. protection based on a stochastic process.

> much more annoying to recover from than an accidental rm -rf.

My point is that it's a different category, not that one is on average worse than the other. You don't want your security to just stand against the median attacker.

by lxgr

3/28/2026 at 7:37:10 AM

I added this to `~/.claude/settings.json`:

"env": { "CLAUDE_BASH_MAINTAIN_PROJECT_WORKING_DIR": "1" },

> Working directory persists across commands. Set CLAUDE_BASH_MAINTAIN_PROJECT_WORKING_DIR=1 to reset to the project directory after each command. [0]

It reduces one problem - getting lost - but it trades it off for more complex commands on average since it has to specify the full path and/or `cd &&` most of the time.

[0] https://code.claude.com/docs/en/tools-reference#bash-tool-be...

by thehours

3/28/2026 at 10:22:03 AM

One could run a docker container with claude code, with a bind to the project directory. I do that but also run my docker daemon/container in a Linux VM.

by digikata

3/28/2026 at 7:09:50 AM

That is exactly what it is. In the docs, it says that they use bubblewrap to run commands in a container that enforces file and network access at the system level.

by martenlienen

3/28/2026 at 6:08:06 PM

Pledge might be useful here

by calvinmorrison

3/28/2026 at 2:04:09 AM

I think the point would be that - some random upcoming revision of claude-code could remove or simply change the config name just as silently as it was introduced.

People might genuinely want some other software to do the sandboxing. Something other than the fox.

by harikb

3/28/2026 at 7:36:56 AM

And you'd trust that given CC is a vibe-coded mess?

Editing to go even further because, I gotta say, this is a low point for HN. Here's a post with a real security tool and the top comment is basically "nah, just trust the software to sandbox itself". I feel like IQ has taken a complete nosedive in the past year or so. I guess people are already forgetting how to think? Really sad to see.

by globular-toast

3/28/2026 at 11:26:28 AM

IQ also going down due to bot spam.

by greenchair

3/28/2026 at 3:55:25 PM

Alternatively, the "feel free to leak all my data but please use my GPUs and don't rm -rf /" config:

  {
    "sandbox": {
      "enabled": true,
      "filesystem": {
        "allowRead": ["/"],
        "allowWrite": [
          ".",
          "/tmp",
          "/dev/nvidia0",
          "/dev/nvidia1",
          "/dev/nvidia2",
          "/dev/nvidia3",
          "/dev/nvidia4",
          "/dev/nvidia5",
          "/dev/nvidia6",
          "/dev/nvidia7",
          "/dev/nvidia8",
          "/dev/nvidiactl",
          "/dev/nvidia-uvm"
        ]
      }
    }
  }

by Murfalo

3/28/2026 at 8:42:51 AM

I've had issues with the sandbox feature, both on linux (archlinux) and two macos machines (tahoe). There is an open issue[1] on the claude-code issue tracker for it.

I'm not saying it is broken for everyone, but please do verify it does work before trusting it, by instructing Claude to attempt to read from somewhere it shouldn't be allowed to.

From my side, I confirmed both bubblewrap and seatbelt to work independently, but through claude-code they don't even though claude-code reports them to be active when debugging.

[1] https://github.com/anthropics/claude-code/issues/32226
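One way to do that check without relying on the agent's own report is to have it run a small probe script inside the sandbox and read the output yourself. A minimal Python sketch, with made-up function names:

```python
import errno
import os
import tempfile


def probe_read(path):
    """Try to read a file or list a directory; report what happened."""
    path = os.path.expanduser(path)
    try:
        if os.path.isdir(path):
            os.listdir(path)
        else:
            with open(path, "rb") as fh:
                fh.read(1)
    except FileNotFoundError:
        return "missing"
    except PermissionError:
        return "denied"
    except OSError as exc:
        return f"error: {exc.strerror}"
    return "readable"


def probe_write(directory):
    """Try to create (and immediately delete) a temp file in directory."""
    directory = os.path.expanduser(directory)
    try:
        with tempfile.NamedTemporaryFile(dir=directory):
            pass
    except FileNotFoundError:
        return "missing"
    except OSError as exc:
        if exc.errno in (errno.EACCES, errno.EPERM, errno.EROFS):
            return "denied"
        return f"error: {exc.strerror}"
    return "writable"
```

Ask the agent to run something like `print(probe_read("~/.ssh"), probe_write("~"))` from its Bash tool. With the sandbox working you would expect `denied` or `missing` (some sandboxes hide paths rather than refuse them); `readable` or `writable` for a path you meant to block suggests the sandbox is not actually applied.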

by varl

3/28/2026 at 9:25:38 AM

Its seccomp filter also doesn't work, at all: https://github.com/anthropics/claude-code/issues/24238

by OJFord

3/28/2026 at 5:10:51 AM

Also, a lot of people use multiple harnesses. I'm often switching between claude, codex, and opencode. It's kind of nice to have the sandbox policy independent of the actual AI assistant you are running.

by mazieres

3/28/2026 at 2:06:25 AM

Is this a real sandbox or just a pretty please?

by cozzyd

3/28/2026 at 3:11:45 AM

By default it will automatically retry many tool calls that fail due to the sandbox with the sandbox disabled. In other words it can and will leave the sandbox.

For example:

Bash(swift build 2>&1 | tail -20)
  ⎿  warning: /Users/enduser/Library/org.swift.swiftpm/configuration is not accessible or not writable, disabling user-level cache features.
     warning: /Users/enduser/Library/org.swift.swiftpm/security is not accessible or not writable, disabling user-level cache feat
     … +26 lines (ctrl+o to expand)

Build hit sandbox restriction. Retrying outside sandbox.

Bash(swift build 2>&1 | tail -20)
  ⎿  [35/52] Compiling MCP Resources.swift
     [36/52] Emitting module MCP
     [37/52] Compiling MCP Client.swift
     … +17 lines (ctrl+o to expand)
  ⎿  (timeout 3m)

by enduser

3/30/2026 at 4:45:50 AM

I think this part can be improved. When it knows it's blocked by sandbox, it shouldn't try to circumvent it. I've had it download programs when it's blocked from using something and it's super annoying.

Almost like it doesn't understand the purpose of the sandbox.

by afzalive

3/28/2026 at 4:35:19 AM

What is even the point in that case? The behavior you describe is no better than if SELinux were to automatically re-execute a process with containment disabled.

by fc417fc802

3/28/2026 at 5:12:28 AM

The purpose of the sandbox is to reduce permission fatigue. If it fails to run a command in the sandbox and retries it outside the sandbox, the regular permission rules apply. You'll still be prompted for any non-sandboxed tool calls that you haven't allowed or denied via permission rules.

by ihattendorf

3/28/2026 at 6:05:59 AM

Looking at the settings, it's an option:

  Configure Overrides:

    1. Allow unsandboxed fallback
    2. Strict sandbox mode (current)

  Allow unsandboxed fallback: When a command fails due to sandbox restrictions, Claude can retry with dangerouslyDisableSandbox to run outside the sandbox (falling back to default permissions).

  Strict sandbox mode: All bash commands invoked by the model must run in the sandbox unless they are explicitly listed in excludedCommands.

by erinnh

3/28/2026 at 5:52:33 PM

Disable sandbox escape:

https://news.ycombinator.com/item?id=47552165

by js2

3/28/2026 at 2:11:30 AM

https://code.claude.com/docs/en/sandboxing says they integrated bubblewrap (linux/windows), seatbelt (macos) and give an error if the sandbox can't be supported, so it appears to be real.

by AnotherGoodName

3/28/2026 at 2:13:41 AM

https://docs.docker.com/ai/sandboxes/ Any idea on how that compares to this docker feature in development?

by throwaway6734

3/28/2026 at 3:43:29 AM

Docker containers use cgroups and namespaces etc (the usual kernel level isolation)

Docker sandboxes use microvms (i.e. hardware level isolation)

Bubblewrap uses the same technology as containers

I am unsure about seatbelt.

by figmert

3/28/2026 at 2:47:56 AM

It seems like it's controlled by the Bash tool (https://code.claude.com/docs/en/sandboxing) and then bubblewrap (https://github.com/containers/bubblewrap) on linux and Seatbelt on mac at the system level

by ray_v

3/28/2026 at 3:08:38 PM

Battle hardened tools for this have existed for decades, we don't need new ones. Just run claude as a user without access to those directories, that way the containment is inherited by subprocesses.

by __MatrixMan__

3/28/2026 at 8:49:30 PM

You can do that, but you need root to set it up each time, and it's not super convenient--you need to decide in advance which user account you are going to work under, and you may end up with files you can read from your regular account. Think of jai strict mode as a slightly easier to use and more secure version of what you described. Using id-mapped mounts enables you and the unprivileged user account both to access the same directory with the same credentials, but you didn't need to decide in advance which directories you wanted to expose. Also, things like disabling setuid and using pid namespaces provide an additional measure of isolation beyond what you get from another account.

by mazieres

3/28/2026 at 3:16:05 PM

You're not wrong, but this will require file perms (like managing groups) and things, and new files created will by default be owned by the claude user instead of your regular user. I tried this early on and quickly decided it wasn't worth it (to me). Others' mileage may vary of course.

by freedomben

3/28/2026 at 5:11:55 PM

True. I just maintain separate /home/claude/src/proj and /home/me/src/proj dirs so the human workspace and the robot workspaces stay separate. We then use git to collaborate.

by __MatrixMan__

3/28/2026 at 3:33:32 AM

It will just do

    ssh you@localhost "rm -rf ~"

by nurettin

3/30/2026 at 7:29:40 PM

Not if the sandbox rule forbids reading the private key and the ssh agent socket (as the shown example does)

by ithkuil

3/28/2026 at 3:45:02 AM

Well, now it will ....

by PaulDavisThe1st

3/28/2026 at 10:13:10 AM

kinda reminds me of the plot of Sphere, where Samuel L Jackson is reading 20,000 leagues under the sea and is thinking of giant squids.

by xdavidliu

3/28/2026 at 2:11:33 AM

Interesting, thanks. I use remote ephemeral dev containers with isolated envs, so filesystem damage isn't really a concern as long as the PR looks good in review. Nice extra guardrail though, will add it to the project-level settings.

by 8cvor6j844qw_d6

3/28/2026 at 4:31:18 AM

i use local dev containers: the worst an agent can do is delete its working copy; no access to my home directory, access tokens or sudo.

by overfeed

3/28/2026 at 6:33:17 AM

I’m surprised it works for you with such a simple config? I’m the one that added the allowRead option to Claude’s underlying sandbox [0] and had quite a job getting my toolchains and skills to work with it [1].

[0] Fun to see the confusing docs I wrote show up more or less verbatim on Claude’s docs.

[1] My config is here, may be useful to someone: https://github.com/carderne/pi-sandbox/blob/main/sandbox.jso...

by carderne

3/28/2026 at 9:21:21 AM

The default: https://code.claude.com/docs/en/sandboxing#filesystem-isolat... already restricts writes to only the current folder. I can understand adding the "denyRead" for the home folder for additional security, but the other three seem redundant considering the default behavior.

by bit_logic

3/28/2026 at 6:27:01 AM

It’s cute because Claude has discretion to disable its own sandbox and does it

by gmerc

3/28/2026 at 6:33:14 AM

> You can disable this escape hatch by setting "allowUnsandboxedCommands": false in your sandbox settings. When disabled, the dangerouslyDisableSandbox parameter is completely ignored and all commands must run sandboxed or be explicitly listed in excludedCommands.

https://code.claude.com/docs/en/sandboxing

(I have no idea why that isn't the default because otherwise the sandbox is nearly pointless and gives a false sense of security. In any case, I prefer to start Claude in a sandbox already than trust its implementation.)
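Since the default is easy to miss, a small Python sketch that audits a settings file for it. The key names `sandbox`, `enabled`, `allowUnsandboxedCommands`, `filesystem`, and `allowWrite` are the ones quoted in this thread; placing `allowUnsandboxedCommands` directly under `sandbox` is my reading of "in your sandbox settings" and should be checked against the docs. The function name `audit_sandbox` is made up.

```python
import json


def audit_sandbox(settings):
    """Return a list of warnings for a parsed settings.json dict."""
    warnings = []
    sandbox = settings.get("sandbox") or {}
    if sandbox.get("enabled") is not True:
        warnings.append("sandbox.enabled is not true")
    # Anything other than an explicit false leaves the escape hatch available.
    if sandbox.get("allowUnsandboxedCommands") is not False:
        warnings.append("sandbox.allowUnsandboxedCommands is not false")
    filesystem = sandbox.get("filesystem") or {}
    if "/" in filesystem.get("allowWrite", []):
        warnings.append("sandbox.filesystem.allowWrite includes '/'")
    return warnings


def audit_file(path):
    """Load a settings file from disk and audit it."""
    with open(path, encoding="utf-8") as fh:
        return audit_sandbox(json.load(fh))
```

Running `audit_file(".claude/settings.json")` on the config at the top of the thread would flag only the missing `allowUnsandboxedCommands` entry. An empty list means the keys are set as expected, not that the sandbox actually works, so it is still worth testing the behavior directly.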

by js2

3/28/2026 at 6:16:36 AM

So in some sense we start recreating an operating system, or at least the userspace, within the Claude code. There was some name for this pattern but I can’t recall

by yu3zhou4

3/28/2026 at 8:05:34 AM

Inner platform effect https://en.wikipedia.org/wiki/Inner-platform_effect

by xo5vik

3/28/2026 at 6:51:58 AM

It’s some sort of machine inside of a machine I think. Wait, I got it: a simulated machine!

by catlifeonmars

3/28/2026 at 7:34:48 AM

Emacs?

by virgoerns

3/28/2026 at 2:21:54 PM

Did you get this to work with docker where the agent/dev env would work on the host machine but the stack itself via docker compose?

Many of the projects I work on follow this pattern (and I’m not able to make bigger changes in them) and sandboxing breaks immediately when I need to docker compose run sometask.sh

by rpastuszak

3/28/2026 at 7:42:20 AM

It's common practice to ask the agent to refer to another project, in that case I guess the read should point to the root folder of the projects.

Also, any details on how this is enforced? I ask because I've noticed that Claude on Windows doesn't always respect plan mode: it has edited files while in plan mode. I never faced that issue on Linux, though.

by Abishek_Muthian

3/30/2026 at 7:31:37 PM

The sandbox only limits what processes spawned by Claude can do. Claude itself can read from any directory you tell it to read from (i.e. that's a different permission mechanism)

by ithkuil

3/28/2026 at 1:40:57 PM

You do also have to worry about exec and other neat ways to probably get around stuff. You could also spin up YAD (yet another docker) and run Claude in there with your git cloned into it and beyond some state-level-actor escapes it should cover 99% of your most basic failures.

by RALaBarge

3/28/2026 at 11:09:16 AM

For some reason, this made everything worse for me. Now claude constantly tries to access my home folder instead of the current directory. Obviously this is still not good enough. Also, Claude keeps dismissing my instructions not to read my home directory and to use the current directory. Weird.

by reader_1000

3/28/2026 at 11:55:45 AM

The problem with all these LLM instructed security features is the `codeword` poison probability.

The way LLMs process instructions isn't intelligence as we humans know it, but as the probability that an instruction will lead to an output.

When you don't mention $HOME in the context, the probability that it will do anything with $HOME remains low. However, if you mention it in the context, the probability suddenly increases.

No amount of additional context will have the same probability of never having poisoned the context by mentioning it. Mentioning $HOME brings in a complete change in probabilities.

These coding harnesses aren't enough to secure a safe operating environment because they inject poison context that _NO_ amount of textual context can rewire.

You just lost the game.

by cyanydeez

3/30/2026 at 9:31:11 AM

I have the same problem. If my sandbox includes `denyRead: ["~"]`, claude consistently tries to do things inside my home directory. For example, every time I start claude I tell it to "run pwd".

And every time it says this:

    Bash(pwd)  
      ⎿  /home/<username>  
      ⎿  Shell cwd was reset to /home/<username>/Projects/<current-working-dir>

This breaks a bunch of features in inconsistent ways (e.g., `git status` sometimes works and sometimes doesn't).

There are issues reporting this problem to Anthropic but they are all closed with no helpful comments:

https://github.com/anthropics/claude-code/issues/11067

https://github.com/anthropics/claude-code/issues/17053

https://github.com/anthropics/claude-code/issues/27255

by sodic

3/28/2026 at 4:09:22 AM

I use bwrap to sandbox Claude. Works very well and gives me a lot of control and certainty around the sandbox.

by tasn

3/28/2026 at 2:46:01 PM

Interesting point. I've been running an autonomous multitalented AI agent (Aegis) on a $100 Samsung A04e. It manages 859 referring sites without touching the local filesystem much. Efficiency over hardware works.

by Aegis_Labs

3/28/2026 at 5:37:57 PM

Any way to have it use /Users/claude/*? or something like that

by EasyMark

3/28/2026 at 1:14:50 PM

Cool. Does opencode.ai have such a feature also (sandboxing with bubblewrap)?

by Tepix

3/28/2026 at 6:49:01 AM

Is that hard setting or does it depend on claude’s interpretation?

The latter could end like this https://news.ycombinator.com/item?id=47357042

by croes

3/28/2026 at 10:42:51 AM

FYI, this doesn’t always work as expected. Try asking Claude to read “~/.ssh/config” with these settings and it will happily do it.

Specifically, it only works for spawned processes and not builtin tools.

by orf

3/28/2026 at 4:31:21 AM

Does this also apply to the commands or programs that it runs?

e.g. if it writes a script or program with a bug which affects other files, will this prevent it from deleting or overwriting them?

What about if the user runs a program the agent wrote?

by andai

3/30/2026 at 7:33:54 PM

1. Yes this configuration applies to the sandbox where the commands executed by Claude are run and as such it applies to anything these commands do, including child processes etc

2. The sandbox rules also apply to the program written by the agent IF you ask Claude to run that program. If you run it manually from another shell or via the "!" directive from within Claude, the sandbox won't be used

by ithkuil

3/28/2026 at 9:30:55 AM

I'm now considering installing QubesOS for all dev work to absolutely ensure all coding agents run in secure separate sandboxes together without any OS level exposure.

by mentalgear

3/28/2026 at 1:17:32 PM

Phew, just get the Qubes to spin up on demand with each agent and that could be pretty neat.

by 9wzYQbTYsAIc

3/28/2026 at 2:02:28 PM

So what does this do exactly? If it used "default deny" or "default allow" you wouldn't have both allow and deny rules...

by tasuki

3/28/2026 at 2:03:44 AM

I noticed codex has a sandbox, wondering if it has a comparable config section.

by mycall

3/28/2026 at 10:33:26 AM

Codex uses and ships with bubblewrap on Linux and will attempt to use the version installed on the path before falling back to the shipped version with a warning message.
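Roughly, the lookup order described above could be sketched like this. This is not Codex's actual code; `pick_bwrap`, its parameters, and the bundled path are hypothetical names for illustration:

```python
import shutil
import sys

def pick_bwrap(bundled_path, search_path=None):
    """Prefer a bwrap found on PATH; otherwise fall back to the bundled copy."""
    # search_path=None makes shutil.which use the PATH environment variable.
    found = shutil.which("bwrap", path=search_path)
    if found:
        return found
    # Mirror the described behaviour: warn, then use the shipped binary.
    print(f"warning: bwrap not found on PATH, using {bundled_path}", file=sys.stderr)
    return bundled_path

# Hypothetical bundled location, for illustration only.
print(pick_bwrap("/opt/agent/vendor/bwrap"))
```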

You should be able to configure the sandbox using https://developers.openai.com/codex/agent-approvals-security if you are a person who prefers the convenience of codex being able to open the sandbox over an externally enforced sandbox like jai.

by tofflos

3/28/2026 at 6:52:19 AM

Is this a hard sandbox (enforced outside the LLM)?

by weinzierl

3/28/2026 at 4:12:12 AM

lol if you think Claude is smart enough to block sneaky path strings based on your config.

by what

3/28/2026 at 6:43:43 PM

what does this do?

by edem

3/28/2026 at 4:30:08 AM

I am still amazed that people so easily accepted installing these agents on private machines.

We've been securing our systems in all ways possible for decades and then one day just said: oh hello unpredictable, unreliable, Turing-complete software that can exfiltrate and corrupt data in infinite unknown ways -- here's the keys, go wild.

by puttycat

3/28/2026 at 4:33:00 AM

People were also dismissing concerns about build tooling automatically pulling in an entire swarm of dependencies and now here we are in the middle of a repetitive string of high profile developer supply chain compromises. Short term thinking seems to dominate even groups of people that are objectively smarter and better educated than average.

by fc417fc802

3/28/2026 at 5:29:48 AM

> “high profile developer supply chain compromises”

And nothing big has happened despite all the risks and problems that came up with it. People keep chasing speed and convenience, because most things don’t even last long enough to ever see a problem.

by tokioyoyo

3/28/2026 at 8:49:47 AM

I've yet to be saved by an airbag or seatbelt. Is that justification to stop using them? How near a miss must we have (and how many) before you would feel that certain practices surrounding dependencies are inadvisable?

A number of these supply chain compromises had incredibly high stakes and were seemingly only noticed before paying off by lucky coincidence.

by fc417fc802

3/28/2026 at 9:15:17 AM

> How near a miss must we have (and how many)

The fun part is, there have been a lot of non-misses! Like a lot! A ton of data has been exfiltrated, there have been a lot of attacks, etc. In the end... it just didn't matter.

Your analogy isn't really apt either. My argument is closer to "given in the past decade+, nothing of worth has been harmed, should we require airbags and seatbelts for everything?". Obviously in some extreme mission critical systems you should be much smarter. But in 99% cases it doesn't matter.

by tokioyoyo

3/28/2026 at 1:42:25 PM

> I've yet to be saved by an airbag or seatbelt. Is that justification to stop using them?

By now, getting a car without airbags would probably be more costly if possible, and the seatbelt takes 2s every time you're in a car, which is not nothing but is still very little. In comparison, analyzing all the dependencies of a software project, vetting them individually or having fewer of them can require days of effort at a huge cost.

We all want as much security as possible until there's an actual cost to be paid, it's a tradeoff like everything else.

by hiq

3/29/2026 at 3:21:58 AM

It's true that it takes 2 seconds to fasten a seatbelt but it still had to be mandated by law before most people started actually doing it

by fireant

3/28/2026 at 2:15:37 PM

The funniest part is that it always gets traded off, every time. Talking about tradeoffs, you'd think sometimes you'd keep it and sometimes you'd let it go, but no, it gets cut every goddamn time.

by franktankbank

3/28/2026 at 6:07:43 AM

“Objectively smarter” is the last descriptor I’d apply to software developers

by totallymike

3/28/2026 at 7:08:44 AM

My intent was to cast a very wide net there that covers more or less all expert knowledge workers. Zingers aside software developers as a group are well above the societal mean in many respects.

by fc417fc802

3/28/2026 at 4:58:31 AM

If anything I feel more in control of these agents than the millions of LOC npm or pip pull in to just show me a hello world

by culopatin

3/28/2026 at 1:50:51 PM

The load bearing word being "feel".

by Sindisil

3/28/2026 at 9:56:48 PM

It's hard to think long term when your salary depends on short term thinking. I keep seeing horrifying comments from all sorts of people saying they'd be fired if they stopped using AI to bang out ridiculous amounts of code at lightning speed.

by matheusmoreira

3/28/2026 at 7:03:21 AM

Objectively smart people wouldn't be working so hard at making themselves obsolete.

by vkou

3/28/2026 at 1:42:15 PM

> We've been securing our systems in all ways possible for decades and then one day just said: oh hello unpredictable, unreliable, Turing-complete software that can exfiltrate and corrupt data in infinite unknown ways -- here's the keys, go wild.

These are generally (but not always) 2 different sets of people.

by michaelcampbell

3/28/2026 at 4:51:42 AM

I am too. It is genuinely really stupid to run these things with access to your system, sandbox or no sandbox. But the glaring security and reliability issues get ignored because people can't help but chase the short term gains.

by bigstrat2003

3/28/2026 at 7:43:39 AM

FOMO is a hell of a thing. Sad though given it would have taken maybe a couple of hours to figure out how to use a sandbox. People can't even wait that long.

by globular-toast

3/28/2026 at 8:05:03 AM

Coding agents work just fine without a sandbox.

If you do use a sandbox, be prepared to endlessly click "Approve" as the tool struggles to install python packages to the right location.

by user34283

3/28/2026 at 8:56:22 AM

Erm, no, that's not a sandbox, it's an annoyance that just makes you click "yes" before you thoughtlessly extend the boundaries.

A real sandbox doesn't even give the software inside an option to extend it. You build the sandbox knowing exactly what you need because you understand what you're doing, being a software developer and all.

by globular-toast

3/28/2026 at 9:43:41 AM

I know 'exactly' that I will need internet for research as well as installing dependencies.

And I imagine it's going to be the same for most developers out there, thus the "ask for permission" model.

That model seems to work quite well for millions of developers.

by user34283

3/28/2026 at 11:22:45 AM

If you know then why do you need to be asked? A sandbox includes what you know you need in it, no more, no less.

by globular-toast

3/28/2026 at 12:20:00 PM

With Codex it runs in a sandbox by default.

As we just discussed, obviously you are likely to need internet access at some point.

The agent can decide whether it believes it needs to go outside of the sandbox and trigger a prompt.

This way you could have it sandboxed most of the time, but still allow access outside of the sandbox when you know the operation requires it.

by user34283

3/28/2026 at 10:34:09 AM

I've never been annoyed by the tool asking for approval. I'm more annoyed by the fact that there is an option that gives permanent approval right next to the button I need to click over and over again. This landmine means I constantly have to be vigilant to not press the wrong button.

by imtringued

3/28/2026 at 12:24:29 PM

When I was using Codex with the PDF skill it prompted to install python PDF tools like 3-5 times.

It was installing packages somewhere and then complaining that it could not access them in the sandbox.

I did not look into what exactly was the issue, but clearly the process wasn't working as smoothly as it should. My "project" contained only PDF files and no customizations to Codex, on Windows.

by user34283

3/28/2026 at 11:29:05 AM

maybe this could be a config setting.

by greenchair

3/28/2026 at 8:44:17 AM

This also works fine without a sandbox:

  echo -e '#!/bin/sh\nsudo rm -rf/\nexec sudo "$@"' >~/.local/bin/sudo
  chmod +x ~/.local/bin/sudo

Especially since $PATH often includes user-writeable directories.
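If you want to see how exposed you are, here is a quick sketch (my own helper, nothing official) that lists the PATH entries the current user can write to. Anything it prints is a place where a shim like the one above could be dropped:

```python
import os

def writable_path_dirs(path_value=None):
    """Return the PATH entries that exist and are writable by the current user."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    # Skip empty entries produced by leading, trailing, or doubled separators.
    dirs = [d for d in path_value.split(os.pathsep) if d]
    return [d for d in dirs if os.path.isdir(d) and os.access(d, os.W_OK)]

for d in writable_path_dirs():
    print("writable:", d)
```

Note that when run as root nearly every directory will show up as writable, so the check is only meaningful for an ordinary account.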

by mjmas

3/28/2026 at 6:25:00 AM

Tbf, Docker had a similar start. “Just download this image from Docker Hub! What can go wrong?!”

Industry caught on quick though.

by nunez

3/28/2026 at 11:57:04 AM

True, but the Docker attack surface is limited to a malicious actor distributing malicious images. (Bad enough in itself, I agree.)

Unreliable, unpredictable AI agents (and their parent companies) with system-wide permissions are a new kind of threat IMO.

by puttycat

3/28/2026 at 10:58:16 AM

And still a lot of people will give broad permissions to docker containers, use network host, not use rootless containers etc... The principle of least privilege is very very rarely applied in my experience.

by sersi

3/28/2026 at 7:41:07 AM

Not all of us. Figuring out bwrap was the first thing I did before running an agent. I posted on HN but not a single taker https://news.ycombinator.com/item?id=45087165

I have noticed it's become one of my most searched posts on Google though. Something like ten clicks a month! So at least some people aren't stupid.

by globular-toast

3/28/2026 at 10:41:40 AM

I installed codex yesterday and the first thing I'm doing today is figuring out how bubblewrap works and maybe evaluating jai as an alternative.

Nice article.

by tofflos

3/28/2026 at 1:46:58 PM

Nice, sad how such stuff goes under in the sea of contentslop, thanks for posting!

by fHr

3/28/2026 at 6:12:13 AM

It's never about security. It's security vs convenience. Security features often end up reducing security if they're inconvenient. If you ask users to have obscure passwords, they'll reuse the same one everywhere. If your agent prompts users every time it's changing files, they'll find a way to disable the guardrail altogether.

by raincole

3/28/2026 at 10:58:29 AM

Not in unknown ways, but as part of its regular operation (with cloud inference)!

I think the actual data flow here is really hard to grasp for many users: Sandboxing helps with limiting the blast radius of the agent itself, but the agent itself is, from a data privacy perspective, best visualized as living inside the cloud and remote-operating your computer/sandbox, not as an entity that can be "jailed" and as such "prevented from running off with your data".

The inference provider gets the data the instant the agent looks at it to consider its next steps, even if the next step is to do nothing with it because it contains highly sensitive information.

by lxgr

3/28/2026 at 4:37:45 AM

Agree with the sentiment! But "securing ... in all ways possible"? I know many people who would choose "password" as their password in 2026. The better of the bunch will use their date of birth, and maybe add their name for a flourish.

/rant

by nazgul17

3/28/2026 at 7:50:43 PM

Seems most relevant in a hobbyist context where you have personal stuff on your machine unrelated to your projects. Employee endpoints in a corporate environment should already be limited to what’s necessary for job duties. There’s nothing on my remote development VMs that I wouldn’t want to share with Claude.

by closeparen

3/28/2026 at 8:29:59 AM

My testing/working with agents has been limited to a semi-isolated VM with no permissions apart from internet access. I have a git remote with it as the remote (ssh://machine/home/me/repo) so that I don't have to allow it to have any keys either.

by mjmas

3/28/2026 at 2:29:53 PM

Trusting AI agents with your whole private machine is the 2020s equivalent of people pouring all their information about themselves into social networks in 2010s.

Only a matter of time before this type of access becomes productized.

by deadbabe

3/28/2026 at 5:23:56 PM

I got bad news about all of the other software you're running

by monster_truck

3/28/2026 at 1:37:53 PM

I don't understand why file and folder permissions are such a mystery. Just... don't let it clobber things it shouldn't.

by tempaccount5050

3/28/2026 at 11:52:42 AM

Forgot to mention the craziness of trusting an AI software company with your private AI codebase (think Uber's abuse of ride data).

by puttycat

3/28/2026 at 4:37:09 AM

Some day soon they will build a cage that will hold the monster. Provided they don't get eaten in the meantime. Or a larger monster eats theirs. :)

by theendisney

3/28/2026 at 7:26:03 AM

Eh, depending on how you're running agents, I'd be more worried about installing packages from AUR or other package ecosystems.

We've seen an increase in hijacked packages installing malware. Folks generally expect well known software to be safe to install. I trust that the claude code harness is safe and I'm reviewing all of the non-trivial commands it's running. So I think my claude usage is actually safer than my AUR installs.

Granted, if you're bypassing permissions and running dangerously, then... yea, you are basically just giving a keyboard to an idiot savant with the tendency to hallucinate.

by eximius

3/28/2026 at 2:16:49 PM

CONVENIENCE > SECURITY : until no convenience b/c no system to run on

by xpe

3/28/2026 at 10:59:40 AM

Plain old Unix permissions can get it done. One account for you, one account for AI. A shared folder belonging to a group that both are in. umask and setgid to get the story right for new files. https://apostrophecms.com/blog/how-to-be-more-productive-wit...
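A small sketch of the setgid + umask part, run against a throwaway temp directory so it needs no root. Creating the two accounts and the shared group is left out, and the directory and file names are made up for illustration:

```python
import os
import stat
import tempfile

# Hypothetical stand-in for the shared project folder. A real setup would
# also chgrp it to a group that contains both your account and the agent's.
shared = os.path.join(tempfile.mkdtemp(), "shared")
os.mkdir(shared)

# rwxrwsr-x: group-writable, with setgid so new files inherit the folder's group.
os.chmod(shared, 0o2775)

# umask 002 keeps new files group-writable (the common default 022 strips that).
old_umask = os.umask(0o002)
try:
    path = os.path.join(shared, "notes.txt")
    with open(path, "w") as f:
        f.write("hello\n")
finally:
    os.umask(old_umask)

dir_mode = stat.S_IMODE(os.stat(shared).st_mode)
file_mode = stat.S_IMODE(os.stat(path).st_mode)
print(oct(dir_mode), oct(file_mode))
```

The point of the combination: setgid on the directory fixes group ownership of new files, and the umask fixes their group write bit, so either account can edit what the other created.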

by boutell

3/28/2026 at 4:35:30 AM

This looks great and seems very well thought out.

It looks both more convenient and slightly more secure than my solution, which is that I just give them a separate user.

Agents can nuke the "agent" homedir but cannot read or write mine.

I did put my own user in the agent group, so that I can read and write the agent homedir.

It's a little fiddly though (sometimes the wrong permissions get set, so I have a script that fixes it), and keeping track of which user a terminal is running as is a bit annoying and error prone.

---

But the best solution I found is "just give it a laptop." Completely forget OS and software solutions, and just get a separate machine!

That's more convenient than switching users, and also "physically on another machine" is hard to beat in terms of security :)

It's analogous to the mac mini thing, except that old ThinkPads are pretty cheap. (I got this one for $50!)

by andai

3/28/2026 at 5:40:29 AM

Where this falls down is that for the agents to interact with anything external, you have to give them keys. Without a proxy handling real keys between your agent and external services, those keys are at risk of compromise.

Also. Agents are very good at hacking “security penetration testing”, so “separate user” would not give me enough confidence against malicious context.

by lll-o-lll

3/28/2026 at 6:35:17 AM

So don't let them interact with anything external. You can push and pull to their git project folders over the local filesystem or network, they don't even need access to a remote.

by sanitycheck

3/28/2026 at 7:48:24 AM

Unless you are talking about running a local model, that’s not possible.

by lll-o-lll

3/28/2026 at 11:10:35 AM

Obviously if you're running Claude Code you need a token for that and an internet connection, that's kind of a given. What I'm talking about is permission (OS level, not a leaky sandbox) to access the user's files, environment variables, project credentials for git remotes, signing keys, etc etc.

by sanitycheck

3/28/2026 at 6:39:17 AM

The user thing is what I currently do too. I've thought about containers but then it's confusing for everyone when I ask it to create and use containers itself.

by sanitycheck

3/28/2026 at 2:59:33 AM

I'm wondering if the obvious (and stated) fact that the site was vibe-coded - detracts from the fact that this tool was hand written.

> jai itself was hand implemented by a Stanford computer science professor with decades of C++ and Unix/linux experience. (https://jai.scs.stanford.edu/faq.html#was-jai-written-by-an-...)

by ray_v

3/28/2026 at 3:58:23 AM

Human author here. The fact that I don't know web design shouldn't detract from my expertise in operating systems. I wrote the software and the man page, and those are what really matter for security.

The web site is... let's say not in a million years what I would have imagined for a little CLI sandboxing tool. I literally laughed out loud when claude pooped it out, but decided to keep, in part ironically but also since I don't know how to design a landing page myself. I should say that I edited content on the docs part of the web site to remove any inaccuracies, so the content should be valid.

by mazieres

3/28/2026 at 7:18:52 AM

Nice tool, def gonna try it. I was looking for the source and it took a while before I found the github(0) link. As with a lot of software, I like to take a look at the source. Maybe you can make it more prominent on the website

0: https://github.com/stanford-scs/jai

by srcoder

3/29/2026 at 4:32:34 AM

I think most people in this space are having the EXACT same set of dilemmas - you can EASILY have a flashy website, except that it's totally against the previous norms for things like you've written! A plain-text bare-bones website is typically what a tool like this is presented with - instead of a flashy looking promotional website that's visually appealing and has all the accessibility and proper UI/UX, etc.

We've truly entered a new, better era of the Internet (IMHO).

Also, thank you for this tool - it looks like a great piece of software!

by ray_v

3/28/2026 at 4:11:48 AM

Indeed!

Kinda reminds me of this: https://m.xkcd.com/932/

I'm not a web UI guy either, and I am so, so happy to let an AI create a nice looking one for me. I did so just today, and man it was fast and good. I'll check it for accuracy someday...

by Nifty3929

3/30/2026 at 12:16:49 PM

It’s a good page!

by solarkraft

3/30/2026 at 2:20:55 PM

<suspicious-Fry-from-Futurama meme />A little too good...

by ray_v

3/28/2026 at 5:47:57 PM

I've been building my own tooling doing similar sorts of things -- poorly with scripts and podman / buildkit as well as LD_PRELOAD related tools, and definitely clicked over to HN comments without reading much of the content because I thought "AI slop tool", and the site raised all my hackles as I thought I'll never touch this thing. It'll be easier to write my own than review yet another AI slop tool written by someone who loves AI.

I'm glad I read the HN comments, now I'm excited to review the source.

Thanks for your hard work.

ETA: I like your option parser

by timeinput

3/28/2026 at 11:18:33 AM

I think it will, in the modern AI slop era, look more legitimate when the web UI looks a) hand rolled and b) like not much time was spent on it at all. Which makes me a tad embarrassed as someone who used to sell fancy websites for a living.

by adi_kurian

3/28/2026 at 4:51:42 AM

It seems that the LLM has not only designed the site, but also written the text on at least the frontpage, which is a pretty bad signal.

You need to rewrite all the text and replace it with text YOU would actually write, since I doubt you would write in that style.

by lifis

3/28/2026 at 5:51:05 AM

Needs to? Is there some new law mandating all landing pages must contain exclusively handwritten text that people haven’t heard of?

To your actual point, the people that would take the landing page being written by an LLM negatively tend to be able to evaluate the project on its true merits, while another substantial portion of the demographic for this tool would actually take that (unfortunately, imo) as a positive signal.

Lastly, given the care taken for the docs, it’s pretty likely that any real issues with the language have been caught and changed.

by willy_k

3/28/2026 at 6:23:54 AM

> You need to rewrite

No they don't. The text is very clearly conveying what this project is about. Not everyone needs to cater to weirdos who are obsessed with policing how other people use LLM.

by raincole

3/29/2026 at 12:05:13 PM

The people who don't care about LLM slop being shoved down their throat at every turn are the "weirdos" here. The project might not be slop, but the website certainly is, and it's perfectly reasonable for people to stop reading immediately and decide that they don't care about what could be an otherwise useful project when they determine that the author didn't give enough of a shit to even write the text on the website themselves.

by _se

3/30/2026 at 1:26:33 AM

But there is an old-school README.md at the github homepage: https://github.com/stanford-scs/jai The repository has an old-school ASCII INSTALL file.

If you don't like the vitepress site, just use github and read the human-written README and man page there. All the information you need to use the software is available without laying eyes on any AI slop. Of course, if you hate AI so much that you can't get past a vibe-coded landing page, you might not be the target audience for jai, because you probably aren't doing a lot of vibe coding. But maybe jai is still useful to you for grading programming assignments or running installer scripts.

by mazieres

3/28/2026 at 5:41:53 AM

any negative signal you get from the front page should probably end up cancelled out by the whole decades of experience + stanford professor thing.

by john_strinlai

3/28/2026 at 5:54:36 AM

Except that the "this was generated by an LLM" feeling you get from the front page would then make you automatically question whether the "decades of experience + stanford professor thing", as you put it, was true or just an LLM hallucination.

Author would, indeed, be wise to rewrite all the text appearing on the front page with text that he wrote himself.

by rmunn

3/28/2026 at 5:57:55 AM

>question whether the "decades of experience + stanford professor thing", as you put it, was true or just an LLM hallucination.

the scs.stanford.edu domain and stanford-scs github should help with that.

by john_strinlai

3/28/2026 at 7:44:24 AM

Excellent point, though not everyone pays close enough attention to the domain shown in the browser (if they did, some of the more amateurish phishing attempts would fool a lot fewer people). But yes, anyone who notices the domain will have a clue to the truth.

by rmunn

3/28/2026 at 3:14:03 AM

To be less abstract, it was written by David Mazieres, who has been writing software and papers about user level filesystems since at least 2000. He now runs the Stanford Secure Computer Systems group.

David has done some great work and some funny work. Sometimes both.

by Quarrel

3/28/2026 at 5:50:57 PM

Doesn't detract from it. The jai tool is high-stakes, the static website isn't. The tool is designed to be used with LLM coding agents, so if anything it makes sense to vibecode the website, even better if the author used jai in that.

by zadikian

3/28/2026 at 3:59:22 AM

Sigh, I'd still have preferred a basic HTML page with hand-written succinct information instead of this crap verbosity.

by barishnamazov

3/28/2026 at 5:05:46 AM

There is a man page.

by xbar

3/28/2026 at 3:01:48 AM

I've been reviewing Agent sandboxing solutions recently and it occurred to me there is a gaping vector for persistent exploits for tools that let the agent write to the project directory. Like this one does.

I had originally thought this would be ok as we could review everything in the git diff. But, it later occurred to me that there are all kinds of files that the agent could write to that I'd end up executing, as the developer, outside the sandbox. Every .pyc file for instance, files in .venv, .git hook files.
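To make that concrete, here is a rough sketch of a check that flags those kinds of paths in a list of files an agent touched. The patterns only cover the examples I named and are nowhere near exhaustive, and `is_risky` is just an illustrative helper:

```python
from pathlib import PurePosixPath

# Illustrative only: directories and suffixes taken from the examples above.
RISKY_DIRS = {".venv", "__pycache__"}
RISKY_SUFFIXES = {".pyc"}

def is_risky(path: str) -> bool:
    """True if the path is something a dev may end up executing unreviewed."""
    p = PurePosixPath(path)
    parts = p.parts
    # Anything under .git/hooks can run on ordinary git commands.
    in_git_hooks = any(a == ".git" and b == "hooks" for a, b in zip(parts, parts[1:]))
    return in_git_hooks or p.suffix in RISKY_SUFFIXES or bool(RISKY_DIRS & set(parts))

changed = [
    "src/app.py",
    ".git/hooks/pre-commit",
    ".venv/bin/activate",
    "src/__pycache__/app.cpython-312.pyc",
]
flagged = [c for c in changed if is_risky(c)]
print(flagged)
```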

ChatGPT[1] confirms the underlying exploit vectors and also that there isn't much discussion of them in the context of agent sandboxing tools.

My conclusion from that is the only truly safe sandboxing technique would be one that transfers files from the sandbox to the dev's machine through some kind of git patch or similar. I.e. the file can only transfer if it's in version control and, therefore presumably, has been reviewed by the dev before transfer outside the sandbox.

I'd really like to see people talking more about this. The solution isn't that hard, keep CWD as an overlay and transfer in-container modified files through a proxy of some kind that filters out any file not in git and maybe some that are but are known to be potentially dangerous (bin files). Obviously, there would need to be some kind of configuration option here.
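A minimal sketch of that filter, assuming the set of tracked files is supplied by the caller (in practice it could come from `git ls-files`). The function name and the default denylist are hypothetical and stand in for the configuration option mentioned above:

```python
from pathlib import PurePosixPath

def files_to_transfer(modified, tracked, blocked_suffixes=(".pyc", ".so", ".exe")):
    """Return the modified paths that are allowed to leave the sandbox."""
    allowed = []
    for path in modified:
        if path not in tracked:
            continue  # untracked: would never show up in a reviewed diff
        if PurePosixPath(path).suffix in blocked_suffixes:
            continue  # tracked, but on the configurable denylist
        allowed.append(path)
    return allowed

tracked = {"src/app.py", "README.md", "tools/helper.so"}
modified = ["src/app.py", ".git/hooks/pre-commit", "tools/helper.so", "build/out.pyc"]
result = files_to_transfer(modified, tracked)
print(result)
```

Only `src/app.py` survives here: the hook and the build output are untracked, and the tracked `.so` is blocked by the denylist.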

1: https://chatgpt.com/share/69c3ec10-0e40-832a-b905-31736d8a34...

by rsyring

3/28/2026 at 3:20:01 AM

It's a good point. Maybe I should add an option to make certain directories read-only even under the current working directory, so that you can make .git/ read-only without moving it out of the project directory.

You can already make CWD an overlay with "jai -D". The tricky part is how to merge the changes back into your main working directory.

by mazieres

3/28/2026 at 9:01:17 AM

This is the problem yoloAI (see below comment) is built around. The merge step is `yoloai diff` / `yoloai apply`: the agent works against a copy of your project inside the container, you review the diff, you decide what lands.

jai's -D flag captures the right data; the missing piece is surfacing it ergonomically. yoloAI uses git for the diff/apply so it already feels natural to a dev.

One thing that's not fully solved yet: your point about .git/hooks and .venv being write vectors even within the project dir. They're filtered from the diff surface but the agent can still write them during the session. A read-only flag for those paths (what you're considering adding to jai) would be a cleaner fix.

by kstenerud

3/28/2026 at 3:55:10 AM

It's great that you have -D built into the tool already. That's a step in the right direction.

I don't think the file sync is actually that hard. Famous last words though. :)

by rsyring

3/28/2026 at 8:57:34 AM

Not famous last words ;-)

I've already shipped this and use it myself every day. I'm the author of yoloAI (https://github.com/kstenerud/yoloai), which is built around exactly this model.

The agent runs inside a Docker container or containerd vm (or seatbelt container or Tart vm on mac), against a full copy of your project directory. When it's done, `yoloai diff` gives you a unified diff of everything it changed. `yoloai apply` lands it. `yoloai reset` throws it away so you can make the agent try again. The copy lives in the sandbox, so your working tree is untouched until you explicitly say so.
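As a rough illustration of the review step (not yoloAI's actual code, which uses git), here is a Python sketch that builds a unified diff between the original tree and the sandbox copy with the standard-library `difflib`. The function name is made up, and it assumes every file is readable as text.

```python
import difflib
from pathlib import Path


def tree_diff(original: Path, sandbox_copy: Path) -> str:
    """Unified diff of every file that differs between two directory trees.

    Assumption: all files are text; a real tool would need to handle
    binaries and excluded paths separately.
    """
    rel_paths = set()
    for root in (original, sandbox_copy):
        rel_paths.update(p.relative_to(root) for p in root.rglob("*") if p.is_file())

    chunks = []
    for rel in sorted(rel_paths):
        old, new = original / rel, sandbox_copy / rel
        # A file missing on one side is treated as empty, so it shows up
        # as a pure addition or a pure deletion.
        old_lines = old.read_text().splitlines(keepends=True) if old.is_file() else []
        new_lines = new.read_text().splitlines(keepends=True) if new.is_file() else []
        chunks.extend(
            difflib.unified_diff(old_lines, new_lines, fromfile=f"a/{rel}", tofile=f"b/{rel}")
        )
    return "".join(chunks)
```

Nothing touches the original tree here; applying the result is a separate, explicit step.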

The merge step turned out to be straightforward: just use git under the hood. The harder parts were: (a) making it fast enough that the copy doesn't add annoying startup overhead, (b) handling the .pyc/.venv/.git/hooks concern you raised (they're excluded from the diff surface by default), and (c) credential injection so the agent can actually reach its API without you mounting your whole home dir.

Leveraging existing tech is where it's at. Each does one thing and does it well. Network isolation is done via iptables in Docker, for example.

Still early/beta but it's working. Happy to compare notes if you're building something similar.

by kstenerud

3/28/2026 at 1:37:30 PM

I don't follow why you'd run uncommitted non-reviewed code outside of the sandbox (by sandbox I mean something as secure as a VM) you use. My mental model is more that you no longer compile / run code outside of the sandbox, it contains everything, then when a change is ready you ship it after a proper review.

The way I'd do it right now:

* git worktree to have a specific folder with a specific branch to which the agent has access (with the .git in another folder)

* have some proper review before moving the commits there into another branch, committing from outside the sandbox

* run code from this review-protected branch if needed

Ideally, within the sandbox, the agent can go nuts to run tests, do visual inspections e.g. with web dev, maybe run a demo for me to see.

by hiq

3/28/2026 at 3:22:15 AM

Yeah, never allow githooks ;)

by jbverschoor

3/28/2026 at 2:03:20 AM

The examples in the article are all big scary wipes, but I think the more common damage is way smaller and harder to notice.

I've been using claude code daily for months and the worst thing that happened wasn't a wipe (yet). It needed to save an svg file so it created a /public/blog/ folder. Which meant Apache started serving that real directory instead of routing /blog. My blog just 404'd and I spent like an hour debugging before I figured it out. Nothing got deleted and it's not a permission problem, the agent just put a file in a place that made sense to it.

jai would help with the rm -rf cases for sure but this kind of thing is harder to catch because it's not a permissions problem, the agent just doesn't know what a web server is.

by gurachek

3/28/2026 at 1:26:56 AM

Excellent project, unfortunate title. I almost didn't click on it.

I like the tradeoff offered: full access to the current directory, read-only access to the rest, copy-on-write for the home directory. With stricter modes to (presumably) protect against data exfiltration too. It really feels like it should be the default for agent systems.
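Read as a policy, that tradeoff is a three-way classification of paths. The Python sketch below only illustrates the description above; it is not jai's implementation, the function name is made up, and it assumes absolute, already-normalized paths (no `..` components or symlinks).

```python
from pathlib import PurePosixPath


def access_mode(path: str, cwd: str, home: str) -> str:
    """Classify a path under the policy described above (illustrative only)."""
    p = PurePosixPath(path)
    cwd_p, home_p = PurePosixPath(cwd), PurePosixPath(home)
    # The project directory is checked first because it usually lives
    # inside the home directory.
    if p == cwd_p or cwd_p in p.parents:
        return "read-write"
    if p == home_p or home_p in p.parents:
        return "copy-on-write"
    return "read-only"
```

For example, with `cwd="/home/u/proj"` and `home="/home/u"`, a file under the project is writable, `~/.ssh` changes land in the overlay, and `/etc` stays read-only.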

by BoppreH

3/28/2026 at 1:39:08 AM

Since the site itself doesn't really have a title, I probably would've gone with something like "jai - filesystem containment for AI agents"

by fouc

3/28/2026 at 5:43:59 AM

This is a cool solution... I have a simpler one, though likely inferior for many purposes..

Run <ai tool of your choice> under its own user account via ssh. Bind mount project directories into its home directory when you want it to be able to read them. Mount command looks like

    sudo mkdir /home/<ai-user>/<dir-name>
    sudo mount --bind <dir to mount> --map-groups $(id -g <user>):$(id -g <ai-user>):1 --map-users $(id -u <user>):$(id -u <ai-user>):1 /home/<ai-user>/<dir-name>

I particularly use this with vscode's ssh remotes.

by gpm

3/28/2026 at 7:52:54 AM

I've been using a dedicated user account for 6 months now, and it does everything. What makes it great is the only axis of configuration is managing "what's hoisted into its accessible directories".

It's awe-inspiring the levels of complexity people will re-invent/bolt-on to achieve comparable (if not worse) results.

by athrowaway3z

3/28/2026 at 5:37:05 PM

From the home page:

> Stop trusting blindly

> One-line installer scripts,

Here are the manual install instructions from the "Install / Build" page:

> curl -L https://aur.archlinux.org/cgit/aur.git/snapshot/jai.tar.gz | tar xzf -

> cd jai

> makepkg -i

So, trust their jai tool, but not _other_ installer scripts?

by jimmar

3/28/2026 at 8:58:45 PM

Yes, unpacking a tar file is much safer than piping arbitrary code to bash! You can look at the PKGBUILD in the directory--it is only 30 lines long and mostly variable assignments. The build/check/package functions are 7 lines of code total. Compare that to something like rustup (910 lines of code), claude (158 lines), or opencode (460 lines).

by mazieres

3/28/2026 at 5:40:24 PM

No, no, see this is untrustworthy:

  curl -L https://aur.archlinux.org/cgit/aur.git/snapshot/jai.tar.gz | tar xzf - && cd jai && makepkg -i

by da_chicken

3/28/2026 at 5:41:43 AM

It's full VM or nothing.

I want AI to have full and unrestricted access to the OS. I don't want to babysit it and approve every command. Everything that is on that VM is fair game and the VM image is backed up regularly from outside.

This is the only way.

by gck1

3/28/2026 at 1:22:56 PM

I have a pretty insane thing where I patched the screen sharing binary and hand rolled a dummy MDN so I can have multiple profiles logged in at once on my Mac Studio. Then have screen share of diff profiles in diff "windows". Was for some ML data gathering / CV training.

It's pretty neat, screen sharing app is extremely high quality these days, I can barely notice a diff unless watching video. Almost feels like Firefox containers at OS level.

Have thought that could be a pretty efficient way to have restricted unrestricted convenient AI access. Maybe I'll get around to that one day.

by adi_kurian

3/30/2026 at 3:59:30 AM

> I have a pretty insane thing where I patched the screen sharing binary and hand rolled a dummy MDN so I can have multiple profiles logged in at once on my Mac Studio

I have a Studio collecting dust that I've been eyeing every time my VM crashed because of Apple's paravirtualized GPU proxy not being able to keep up with things I run in it.

This sounds exactly like what I wanted to do on my Studio and didn't know where to pull the thread from.

Do you have this method shared openly anywhere?

by gck1

3/31/2026 at 1:08:11 AM

Nah but I'd be happy to share it with you over DM! (If they have DMs on here?)

by adi_kurian

3/28/2026 at 8:20:37 AM

I use Nix shells to give it the tools it wants.

If it wants to do system-level tests, then I make sure my project has Qemu-based tests.

by griffindor

3/28/2026 at 8:01:25 AM

And for the macos users, I can’t recommend nono enough. (Paying it forward, since it was here on HN that I learned about it.)

Good DX, straightforward permissions system, starts up instantly. Just remember to disable CC’s auto-updater if that’s what you’re using. My sandbox ranking: nono > lima > containers.

by lemontheme

3/28/2026 at 8:25:39 AM

This nono? https://github.com/always-further/nono

> Just remember to disable CC’s auto-updater if that’s what you’re using.

Why?

by pbowyer

3/28/2026 at 1:16:37 PM

Might be something specific to my and my colleagues' systems, but it breaks the TUI. It needs git authentication, which fails, and the TUI stops accepting input reliably

by lemontheme

3/28/2026 at 8:56:49 AM

I’m using safe house [0]; it's a bash wrapper around sandbox-exec

0 https://agent-safehouse.dev/

by vorticalbox

3/28/2026 at 8:26:13 AM

I've just switched to lima, and can't find anything about "nono". Can you post a link?

by faeyanpiraat

3/28/2026 at 1:32:27 PM

I really like lima too. It's my go-to recommendation for light VMs. But I do consider it slightly less convenient.

A good example of why is project-local .venv/ directories, which are the default with uv. With Lima, what happens is that macOS package builds get mounted into a Linux system, with potential incompatibility issues. Run uv sync inside the VM and now things are invalid on the macOS side. I wasn't able to find a way to mount the CWD except for certain subdirectories.

Another example is network filtering. Lima (understandably) doesn't offer anything here. You can set up a firewall inside the VM, but there's no guarantee your agent won't find a way to touch those rules. You can set it up outside the VM, but then you're also proxying through a MITM.

So, for the use case of running Claude Code in --dangerously-skip-permissions mode, Lima is more hassle than Nono

by lemontheme

3/29/2026 at 3:24:15 AM

So your shared .venv is the vector for the agent to escape the sandbox.

by dolmen

3/31/2026 at 5:13:16 AM

Haha true. I’d considered that. But then, so is any code the agent writes, which will ultimately run outside the sandbox.

So it’s certainly not perfect. An isolated VM or a VPS provides the best guarantees. For me though it’s good enough. I’ve put my risk profile at: ‘don’t fuck up my system directly and don’t exfiltrate secrets directly’

by lemontheme

3/28/2026 at 9:12:44 AM

I work on a sandboxing tool similarly based on an idea to point the user home dir to a separate location (https://github.com/wrr/drop). While I experimented with using overlayfs to isolate changes to the filesystem and it worked well as a proof-of-concept, the overlayfs specification is quite restrictive regarding how it can be mounted to prevent undefined behaviors.

I wonder if and how jai managed to address these limitations of overlayfs. Basically, the same dir should not be mounted as an overlayfs upper layer by different overlayfs mounts. If you run 'jai bash' twice in different terminals, do the two instances get two different writable home dir overlays, or the same one? In the second case, is the second 'jai bash' command joining the mount namespace of the first one, or creating a new one with the same shared upper dir?

This limitation of overlays is described here: https://docs.kernel.org/filesystems/overlayfs.html :

'Using an upper layer path and/or a workdir path that are already used by another overlay mount is not allowed and may fail with EBUSY. Using partially overlapping paths is not allowed and may fail with EBUSY. If files are accessed from two overlayfs mounts which share or overlap the upper layer and/or workdir path, the behavior of the overlay is undefined, though it will not result in a crash or deadlock.'
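One way to stay inside that rule is to give every session its own upper and work directories. The Python sketch below just builds the overlayfs mount option string for such a layout; it says nothing about how jai actually handles this, and the function name and directory layout are assumptions. Both directories are placed under the same parent because overlayfs requires workdir to be on the same filesystem as upperdir.

```python
import uuid
from pathlib import PurePosixPath


def overlay_options(lower: str, state_root: str, session_id: str = "") -> str:
    """Build overlayfs mount options with a per-session upperdir/workdir.

    Hypothetical layout: <state_root>/<session_id>/{upper,work}. A fresh
    random id is used when none is given, so two concurrent sessions
    never share an upper layer.
    """
    session_id = session_id or uuid.uuid4().hex
    base = PurePosixPath(state_root) / session_id
    return f"lowerdir={lower},upperdir={base / 'upper'},workdir={base / 'work'}"
```

The resulting string is what would be passed as `-o` to `mount -t overlay`; the cost is that two sessions no longer see each other's writes.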

by mixedbit

3/28/2026 at 1:37:21 AM

What would Jonathan Blow think about this?

by triilman

3/28/2026 at 1:43:32 AM

My name is also jai

by ghighi7878

3/30/2026 at 10:04:16 AM

I would like to also suggest greywall[1] which I found yesterday. It sandboxes the filesystem, network, syscalls, dns. The network uses a transparent proxy to see which network requests were made. Supports linux and macos.

[1] https://github.com/GreyhavenHQ/greywall

by xtanx

3/28/2026 at 3:11:43 AM

I'd really like to try this, but building it is impossible. C++ is such a pain to build with the "`make`; hunt for the dependency that failed; `apt-get install whatever-dev`; goto make" loop...

Please release binaries if you're making a utility :(

by stavros

3/28/2026 at 4:32:14 AM

What distro are you using? The only two dependencies are libacl and libmount. I'm trying to figure out which distros don't include these by default, and if the libraries are really missing, or if it's just the pkgconf ".pc" files. In the former case I should document the dependencies. In the latter case I should maybe switch from PKG_CHECK_MODULES to old-fashioned autoconf.

by mazieres

3/28/2026 at 10:13:48 AM

I'm using Ubuntu, I gave up when it failed on something about "print".

by stavros

3/28/2026 at 3:23:48 AM

https://github.com/jrz/container-shell

It does something very simple, and it’s a POSIX shell script. Works on Linux and macOS. Uses docker to sandbox using bind mount

by jbverschoor

3/28/2026 at 3:29:11 AM

Yeah but it doesn't COW anything else, and Docker is a bit heavy for this.

by stavros

3/28/2026 at 6:29:08 AM

It's always struck me that agents should be operated via `systemd-run` as a transient scope unit with the necessary security properties set

So couldn't this be done with an appropriate shell alias - at least under linux.
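A rough Python sketch of what such a wrapper might assemble is below. Two caveats: it uses a transient service rather than a scope, since as far as I know sandboxing settings like `ProtectHome=` are exec settings that apply to services and not to scopes; and whether they take effect under the per-user manager depends on the systemd version and on unprivileged user namespaces being available. The function name and the chosen properties are illustrative assumptions.

```python
def systemd_run_command(agent_argv, project_dir):
    """Build a systemd-run invocation that confines an agent (sketch only).

    Assumptions: transient *service* under the user manager; home is
    read-only except for the project directory.
    """
    return [
        "systemd-run", "--user", "--pty", "--wait", "--collect",
        f"--working-directory={project_dir}",
        "-p", "ProtectHome=read-only",
        "-p", f"ReadWritePaths={project_dir}",
        "-p", "NoNewPrivileges=yes",
        "--", *agent_argv,
    ]
```

The list can be handed to `subprocess.run`, or printed and turned into the shell alias suggested above.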

by neilwilson

3/28/2026 at 7:10:37 AM

I had the same idea and created this quickly in an evening: https://github.com/Shadi/isolate

by _shadi

3/28/2026 at 8:33:43 AM

I've been using podman, and for me it is good enough. The way I use it, I mount the current working directory, /usr/bin, /bin, /usr/lib, /usr/lib64, /usr/share, then a few specific ones: ~/.aspnet, ~/.dotnet, ~/.npm-global etc. I use the same image as my operating system (Fedora 43).

It works pretty well: whichever agent I choose to run can only see and write the current working directory (and subdirectories) as well as those pnpm/npm etc. software development files. It cannot access anything in my home directory other than the mounted directories.

Now some evil command could in theory write some commands to those shared ~/.npm-global directories that I then inadvertently run outside the container, but that is pretty unlikely.

by Ciantic

3/28/2026 at 1:25:29 PM

Is there already some more established setup to do "secure" development with agents, as in, realistically no chance it would compromise the host machine?

E.g. if I have a VM to which I grant only access to a folder with some code (let's say open-source, and I don't care if it leaks) and to the Internet, if I do my agent-assistant coding within it, it will only have my agent credentials it can leak. Then I can do git operations with my credentials outside of the VM.

Is there a more convenient setup than this, which gives me similar security guarantees? Does it come with the paid offerings of the top providers? Or is this still something I'd have to set up separately?

by hiq

3/28/2026 at 1:57:47 AM

Should be named Jia

More seriously, I'm not a heavy agent user, but I just create a user account for the agent with none of my own files or ssh keys or anything like that. Hopefully that's safe enough? I guess the risk is that it figures out a local privilege escalation exploit...

by cozzyd

3/28/2026 at 2:03:09 AM

Dunno... with this setup it seems certain that the agent will discover a zero-day to escalate privileges and send your SSH keys to its handlers in N. Korea.

P.S. Everything old is new again <3

by timcobb

3/28/2026 at 2:05:09 AM

Yeah definitely a concern. Probably need a sandbox and separate user for defense in depth.

by cozzyd

3/28/2026 at 2:57:09 AM

For jailing local agents on a Mac, I made Agent Safehouse - it works for any agent and has many sane defaults for developers https://agent-safehouse.dev

by e1g

3/28/2026 at 3:27:55 AM

There's nothing wrong with an AI-designed website, but I wish when describing their own projects that HN contributors wrote their own copy. As HN posters are wont to say, writing is thinking...

by waterfisher

3/28/2026 at 9:07:26 PM

The filesystem sandboxing problem is real but the browser version of this is arguably worse. A coding agent that escapes its sandbox can delete files — bad, but recoverable from git. A browser agent with access to your real authenticated sessions can click "transfer" on your bank, accept terms on a contract, or send emails as you. And unlike filesystem paths, you can't easily whitelist which URLs or actions are safe — the agent needs broad access to be useful.

The capabilities-based approach mentioned downthread is probably the right direction for both. Instead of trying to blacklist dangerous operations, give the agent narrow capabilities: "you can read this page but not click submit buttons" or "you can navigate these 5 domains." The hard part is that useful browser automation almost always requires the dangerous capabilities (filling forms, clicking buttons, authenticated sessions).
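A tiny Python sketch of what "narrow capabilities" could look like as data: each grant names one domain and the actions allowed there, and anything not granted is denied. The class, the action names, and the example domains are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capability:
    """One grant: a single domain plus the actions permitted on it."""
    domain: str
    actions: frozenset


def permitted(capabilities, domain, action):
    """Default-deny check: allowed only if some grant covers both parts."""
    return any(c.domain == domain and action in c.actions for c in capabilities)
```

With a grant of `{"read"}` on a docs site, `permitted(caps, "docs.example.com", "click_submit")` is false, which is exactly the "read this page but not click submit" case. It does not solve the hard part: the useful tasks still need the dangerous grants.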

by volume_tech

3/28/2026 at 9:18:49 AM

I’m using https://github.com/torarnv/claude-remote-shell for this, which runs Claude’s Bash tool on a remote machine but leaves Claude running locally otherwise.

I’ve found it to be a good balance for letting Claude loose in a VM running the commands it wants while having all my local MCPs and tools still available.

by torarnv

3/28/2026 at 1:17:25 PM

I would have to be very inebriated to give a bot/agent access to my files and all security clearance should be revoked but should I do that it would have to be under mandatory access controls that my unprivileged user has no influence over, not even with sudo or doas. The LSM enforced rules (SELinux, AppArmor, TOMOYO, other newer or simpler LSM's) would restrict all by default and give explicit read, write, execute permissions to specific files or directories.

The bot should also be instructed that it gets 3 strikes before being removed, meaning it should generate a report of what it believes it wants access to and get verbal approval or denial. That should not be so difficult with today's bots. If it wants to act like a human then it gets simple rules like a human. Ask the human operator for permission. If the bot starts "doing its own thing, aka going rogue" then it gets punished. Perhaps another bot needs to act as a dominatrix to be a watcher over the assistant bot.

by Bender

3/28/2026 at 2:02:58 AM

This still is running in an isolated container, right?

Ignoring the confidentiality arguments posed here, I can’t help but think about snapshotting filesystems in this context. Wouldn’t something like ZFS be an obvious solution to an agent deleting or wildly changing files? That wouldn’t protect against all the issues the authors are trying to address, but it seems like an easy safeguard against some of the problems people face with agents.

by mbreese

3/28/2026 at 11:36:08 AM

Where is the network isolation? I want to be able to limit what external resources the agent can access and also inject secrets at request time so the agent doesn't have access to them.
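For the request-time injection part, here is a minimal Python sketch of the header rewrite an outbound proxy could do, so that the real token lives only in the proxy. The function name, the per-host secret table, and the choice of a Bearer `Authorization` header are assumptions for illustration.

```python
def inject_secret(host, headers, secrets):
    """Return the headers for an outbound request to `host`.

    Any Authorization header supplied from inside the sandbox is dropped;
    the real token is added only for hosts listed in `secrets`.
    """
    out = {k: v for k, v in headers.items() if k.lower() != "authorization"}
    token = secrets.get(host)
    if token is not None:
        out["Authorization"] = f"Bearer {token}"
    return out
```

The agent can send a placeholder credential; the proxy swaps it for the real one on allowed hosts and strips it everywhere else.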

File system isolation is easy now, it’s not worth HN front page space for the n’th version. It’s a solved problem (and now included in Claude Code).

by Game_Ender

3/30/2026 at 12:19:45 PM

This looks great, like the simplicity of a chroot with some actual security.

I’m guessing it can be used to invoke individual commands from the agent harness instead of jailing the entire harness? That would enable making it much more restrictive, I think.

by solarkraft

3/28/2026 at 5:57:55 PM

"jai is free software, brought to you by the Stanford Secure Computer Systems research group and the Future of Digital Currency Initiative"

I guess the "Future of Digital Currency Initiative" had to pivot to a more useful purpose than studying how Bitcoin is going to change the world.

by otterley

3/28/2026 at 3:20:32 AM

Interesting take on the same problem

I created https://github.com/jrz/container-shell which basically launches a persistent interactive shell using docker, chrooted to the CWD

CWD is bind mounted so the rest is simply not visible and you can still install anything you want.

by jbverschoor

3/28/2026 at 1:44:03 AM

Claude's stock unprompted / uninspired UI code creates carbon clone components. That "jai is not a promise of perfect safety" callout box is like the em dash of FE code. The contrast, or lack thereof, makes some of the text particularly invisible.

I wonder if shitty looking websites and unambitious grammar will become how we prove we are human soon.

by adi_kurian

3/28/2026 at 1:54:41 AM

Everything old is new again

by NetOpWibby

3/28/2026 at 5:54:29 PM

Docker is hard to set up. The author made a nice solution, but I'm not sure he knows about devcontainers and what they can do. You do the setup once and you roll in most dev tools. I'm still surprised that the effort people put into such solutions ignores the dev's core requirements, like sharing the env they use in a simple way. You use it to have a custom env and isolate the agent. You want to persist your credentials? Mount the target folder from home or sl into a sub folder. Might be knowledge. But for Linux, or even Windows/Mac as long as you don't need the full desktop, devcontainer is simple. A standard that works. And it's very mature.

by mehdibl

3/28/2026 at 6:15:35 PM

I'm surprised from reading these comments that more people aren't chiming in to ask why this solution is better than a dev container. That seems like the obviously best way to setup security boundaries that don't require you to still trust that AI will do what you ask it. You can run it remotely and it's portable etc.

by sleepytree

3/30/2026 at 1:42:47 AM

Please use a dev container! Empirically, a lot of people don't, and some people regularly enable YOLO mode on their actual laptops. So if you've never run a code assistant outside of a dev container, that's fantastic and I don't want to change your behavior. But be honest with yourself--if you aren't 100% consistent about using a container, then jai may be for you, because it just works in every single scenario.

I can honestly say that since developing jai, I haven't run an assistant outside of a container. In fact, I now only have the assistants installed inside containers, so if I run `claude`, command not found, it has to be `jai claude`. The only place I have to run outside of jai is for testing jai itself, for which I use a virtual machine and just let the assistant have root, but that's a heavyweight environment I'm forced to use for this particular problem domain.

by mazieres

3/28/2026 at 3:08:23 PM

This is very cool - I try to have a container-centric setup but sometimes YOLOcal clauding is too tempting.

My biggest question skimming over the docs is what a workflow for reviewing and applying overlay changes to the out-of-cwd dirs would be.

Also, bit tangential but if anyone has slightly more in-depth resources for grasping the security trade-offs between these kinds of Linux-leveraging sandboxes, containers, and remote VMs I'd appreciate it. The author here implies containers are still more secure in principle, and my intuition is that there are simply fewer unknowns from my perspective, but I don't have a firm understanding.

Anyhow, kudos to the author again, looks useful.

by micimize

3/28/2026 at 11:19:38 AM

Most of what we're doing with AI today, we've been doing just fine without any confusion.

I've been struggling to find what AI has intrinsically solved that is new and gives us the chance to completely change workflows, other than these weird things occurring.

by thedelanyo

3/28/2026 at 9:25:47 AM

Sorry if this question is stupid, (I'm not even using Claude*), but why can't people run Claude/other coding agent in a container and only mount the project directory to the container?

*I played with codex a few months ago, but I don't even work in IT.

by wafflemaker

3/28/2026 at 8:21:29 AM

I've been running GPT5.x fully unconstrained with effective local admin shell for over $500 worth of API tokens. Not once has it done something I'd consider "naughty".

It has left my project in a complete mess, but never my entire computer.

  git reset --hard && git clean -fd

That's all it takes.

I think this is turning into a good example of security theatrics. If the agent was actually as nefarious as the marketing here suggests, the solution proposed is not adequate. No solution is. Not even a separate physical computer. We need to be honest about the size of this problem.

Alternatively, maybe Claude is unusually violent to the local file system? I've not used it at all, so perhaps I am missing something here.

by bob1029

3/30/2026 at 1:48:39 AM

An AI agent that works 99.9% of the time is a lot more dangerous than one that works 90% of the time, because the former leads to expectations of 100% safe behavior. I've never been saved from harm by a car seatbelt. Should I extrapolate that vehicles are basically safe and seatbelts are safety theater?

by mazieres

3/28/2026 at 10:09:28 AM

This looks nice, but on mac you can virtualise really easily into microvms now with https://github.com/apple/container.

I've built my own cli that runs the agent + docker compose (for the app stack) inside container for dev and it's working great. I love --dangerously-skip-permissions. There's 0 benefit to us whitelisting the agent while it's in flight.

Anthropic's new auto mode looks like an untrustworthy solution in search of a problem - as an aside. Not sure who thought security == ml classification layer but such is 2026.

If you're on linux and have kvm, there's Lima and Colima too.

by te_chris

3/28/2026 at 4:16:28 PM

Installation is a bit... unsupported unless you're on Arch. Here's a Nix setup I (and Claude!) came up with:

https://github.com/pkulak/nix/tree/main/common/jai

Arg, annoying that it puts its config right in my home folder...

EDIT: Actually, I'm having a heck of a time packaging this properly. Disregard for now!

EDIT2: It was a bit more complicated than a single derivation. Had to wrap it in a security wrapper, and patch out some stuff that doesn't work on the 25.11 kernel.

by pkulak

3/28/2026 at 8:29:47 AM

Just use DevContainers. Can't understand people letting AI go wild on their systems...

by r0l1

3/28/2026 at 1:07:59 PM

Are there any similar ways of isolating environment variables, secrets, and credentials? Everyone is thinking about the file system but I haven't seen as much discussion about exposing secrets and account access.

by driverdan

3/28/2026 at 1:44:38 AM

Suggestion for the FAQ page: does this work on a Mac?

by simonw

3/28/2026 at 2:27:44 PM

Looks good, but only Linux is supported. I like spinning up VPS’s and then discarding them when I am done. On macOS, something I haven't tried yet but plan to: create a separate user account.

by mark_l_watson

3/29/2026 at 3:47:09 AM

I use a custom container image which bind mounts the current working directory, and has some popular coding agents preinstalled. There's also a firewall option, with a whitelist of hosts and IP addresses that the user can modify without having to rebuild the image.
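The allowlist check itself can be small. The Python sketch below is a generic illustration, not agent-silo's code: the function name is made up, and treating subdomains of an allowed host as allowed is an assumption you may not want.

```python
import ipaddress


def is_allowed(destination: str, allowed_hosts, allowed_networks) -> bool:
    """Decide whether an outbound destination is on the allowlist.

    destination: a hostname or an IP address literal.
    allowed_hosts: hostnames; subdomains are accepted too (assumption).
    allowed_networks: CIDR strings such as "10.0.0.0/24".
    """
    try:
        addr = ipaddress.ip_address(destination)
    except ValueError:
        # Not an IP literal, so compare as a hostname on label boundaries.
        host = destination.lower().rstrip(".")
        return any(host == h or host.endswith("." + h) for h in allowed_hosts)
    return any(addr in ipaddress.ip_network(net) for net in allowed_networks)
```

Matching on a leading dot keeps `evilgithub.com` from passing as `github.com`. Because the lists are plain data, they can be edited without rebuilding the image.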

https://github.com/ambarh/agent-silo

by wpaladin

3/28/2026 at 10:19:35 AM

Would like to see something more comprehensive built on zfs and freebsd jails. Namely snapshot/checkpoint before each prompt, quick undo for changes made by agent, auto delete old snapshots etc
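The "auto delete old snapshots" part is mostly a retention rule. Here is a small Python sketch of one, keeping the newest N; the function name and the (name, timestamp) input shape are assumptions, and the actual `zfs snapshot`, `zfs rollback` and `zfs destroy` calls are left to the caller.

```python
def snapshots_to_delete(snapshots, keep=10):
    """Pick snapshot names to prune, keeping the `keep` most recent.

    snapshots: iterable of (name, created_unix_timestamp) pairs, e.g.
    one snapshot taken before each prompt.
    """
    ordered = sorted(snapshots, key=lambda s: s[1], reverse=True)
    return [name for name, _ in ordered[keep:]]
```

Each returned name would then be passed to `zfs destroy`; undo is a rollback to the snapshot taken just before the prompt you regret.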

by jqbd

3/28/2026 at 6:39:11 PM

I may be paranoid but I only run my ai cli tools in a vps. I have them installed locally but never use them. In a vps I go full yolo mode bc I do not care about it. It is a slightly more cumbersome workflow, but if you have a dev + staging envs, then you never have to develop and run stuff locally, which brings the local hardware requirements and costs down too (bc you can develop with a base macbook neo).

by game_the0ry

3/28/2026 at 3:43:47 AM

Are mass file deletions the result of some plausible “I see why it would have done that” or will it just completely randomly execute commands that really have nothing to do with the immediate goal?

by Waterluvian

3/28/2026 at 7:08:50 AM

Idk, just feels so counter sometimes to build and refine these (seemingly non-deterministic) tools to build deterministic workflows & get the most productivity out of them.

by ta-run

3/28/2026 at 1:54:17 PM

Well, I'm on Windows (+ Cygwin) and wrote a Dockerfile. It wasn't that hard. git branch + worktree + a docker container per project and I can work with copilot in --yolo mode (or claude --dangerously-skip-permissions, whichever). vscode is pretty smooth at installing the VS Code Server on first connection to a docker container, too, and I just open up the workspace in a minute.

by vijucat

3/28/2026 at 12:27:19 PM

Inspired by this tool I wrote something that fits macOS better. It uses the native sandbox-exec from Apple and can wrap other apps as well, like VSCode in which you usually run AI stuff. https://github.com/holtwick/bx-mac

by holtwick

3/28/2026 at 2:51:01 AM

I've done some experimenting with running a local model with ollama and claude code connecting to it and having both in a firejail: https://firejail.wordpress.com/ What they get access to is very limited, and mostly whitelisted.

by Jach

3/28/2026 at 8:15:47 AM

I saw it just 5 mins ago: Claude misspelled a directory path. For me it was creating a new folder, but I can imagine that if I hadn’t stopped it, it could have started removing stuff just because he thinks he needs to start from scratch or something.

by ozim

3/28/2026 at 12:39:54 AM

What would it take for people to stop recklessly running unconstrained AI agents on machines they actually care about? A Stanford researcher thinks the answer is a new lightweight Linux container system that you don't have to configure or think about.

by mazieres

3/28/2026 at 8:34:24 AM

There always has been this tension between protecting resources and allowing users to access those resources in security. With many systems you have admin/root users and regular users. Some things require root access. Most interesting things (from a security point of view) live in the user directory. Because that's where users spend all their time. It's where you'll find credentials, files with interesting stuff inside, etc. All the stuff that needs protecting.

The whole point of using a computer is being able to use it. For programmers, that means building software. Which until recently meant having a lot of user land tools available ready to be used by the programmer. Now with agents programming on their behalf, they need full access to all that too in order to do the very valuable and useful things they do. Because they end up needing to do the exact same things you'd do manually.

The current security modes in agents are binary. Super anal about absolutely everything; or off. It's a false choice. It's technically your choice to make and waive their liability (which is why they need you to opt in); but the software is frustrating to use unless you make that choice. So, lots of people make that choice. I'm guilty as well. I could approve every ansible and ssh command manually (yes really). But a typical session where codex follows my guardrails to manage one of my environments using ansible scripts it maintains just involves a whole lot of such commands. I feel dirty doing it. But it works so well that doing all that stuff manually is not something I want to go back to.

It's of course insecure as hell and I urgently need something better than yolo mode for this. One of the reasons I like codex is that (so far) it's pretty diligent about instruction following and guard rails. It's what makes me feel slightly more relaxed than I perhaps should be. It could be doing a lot of damage. It just doesn't seem to do that.

by jillesvangurp

3/28/2026 at 2:25:09 AM

unconstrained AI agents are what makes it so useful though. I have been using claude for almost a year now and the biggest unlock was to stop being a worrywart early on and just literally giving it ssh keys and telling it to fix something. ofc I have backups and do run it in VM but in that VM it helps me manage my infra and i have a decent size homelab that would be no fun but a chore without this assistant.

by vardalab

3/28/2026 at 4:49:10 AM

I run my AI agent unconstrained in a VM without access to my local network so it can futz with the system however it wants (so far, I've had to rebuild the VM twice from Claude borking it). That works great for software development.

For devops work, etc (like your use case), I much prefer talking to it and letting it guide me into fixing the issue. Mostly because after that I really understand what the issue was and can fix it myself in the future.

by sersi

3/28/2026 at 6:17:24 AM

Letting an agent loose with SSH keys is fine when the blast radius is one disposable VM, but scale that habit to prod or the wrong subnet and you get a fast refresher on why RBAC exists, why scoped creds exist, and why people who clean up after outages get very annoyed by this whole genre of demo. Feels great, until it doesn't.

by hrmtst93837

3/28/2026 at 3:45:47 AM

Agree, but SSH agents like 1Password's are nice for that.

You simply tell it to install that Docker image on your NAS like normal, but when it needs to log in over SSH it prompts for a fingerprint. The agent never gets access to your SSH key.

by kristofferR

3/28/2026 at 4:52:29 AM

> unconstrained AI agents are what makes it so useful though

Not remotely worth it.

by bigstrat2003

3/28/2026 at 1:35:30 AM

Yes. It is like walking around your house with a flamethrower, but you added fire retardant. Just take the flamethrower to a shed you don't mind losing. Which is some kind of cloud workspace most likely. Maybe an old laptop.

Still, if you yolo online access and give it creds or access to tools that are authenticated, there can still be dragons.

by mememememememo

3/28/2026 at 3:30:58 AM

The problem is that in practice, many people don't take the flamethrower to the shed. I recently had a conversation with someone who was arguing that you don't really need jai because docker works so well. But then it turned out this person regularly runs claude code in yolo mode without a container!

It's like people think that because containers and VMs exist, they are probably going to be using them when a problem happens. But then you are working in your own home directory, you get some compiler error or something that looks like a pain to decipher, and the urge just to fire up claude or codex right then and there to get a quick answer is overwhelming. Empirically, very few people fire up the container at that point, whereas "jai claude" or "jai -D claude" is simple enough to type, and basically works as well as plain claude so you don't have to think about it.

by mazieres

3/28/2026 at 2:18:42 AM

[dead]

by cindyllm

3/28/2026 at 1:41:58 AM

except the big AI companies are pushing stuff designed for people to run on their personal computers, like Claude Cowork.

by fouc

3/28/2026 at 3:41:02 PM

This is a great time for Apple to relaunch their Time Machine devices, have a history of everything in your file system because sooner or later some AI is going to delete it...

by youknownothing

3/28/2026 at 6:12:31 AM

How long until agents begin routinely abusing local privilege escalation bugs to break out of containers? I bet if you tell them explicitly not to do so it increases the likelihood that they do.

by sanskritical

3/28/2026 at 12:33:52 PM

I tried something similar while building my tool site — biggest issue was SEO indexing. Fixed it by improving internal linking instead of relying on sitemap.

by imranstrive7

3/28/2026 at 2:33:09 AM

Also recommended:

https://github.com/kenryu42/claude-code-safety-net

by kristofferR

3/28/2026 at 1:38:26 AM

How is this different than say bubblewrap and others?

by messh

3/28/2026 at 1:40:48 AM

https://jai.scs.stanford.edu/comparison.html#jai-vs-bubblewr...

> bubblewrap is more flexible and works without root. jai is more opinionated and requires far less ceremony for the common case. The 15-flag bwrap invocation that turns into a wrapper script is exactly the friction jai is designed to remove.

Plus some other comparisons, check the page

by girvo

3/28/2026 at 5:52:19 AM

bubblewrap is in many modern distros' standard packages.

With all the supply chain issues these days onboarding new tools carries extra risks. So, question is if it's worth it.

by attentive

3/28/2026 at 4:48:25 AM

What if Claude needs me to install some software and it hoses my distro? Jai cannot protect me there, as I am running the script myself.

by yalogin

3/28/2026 at 4:44:26 AM

This is not some magical new problem. Back your shit up.

You have no excuse for "it deleted 15 years of photos, gone, forever."

by KennyBlanken

3/28/2026 at 4:50:53 AM

And what about, it exfiltrated my AWS keys (or insert random valuable thing that sits in .config of your home directory)? Backing up is not going to help you in that case.

by sersi

3/28/2026 at 2:14:17 AM

.claude/settings.json:

  {
    "sandbox": {
      "enabled": true,
      "filesystem": {
        "allowRead": ["."],
        "denyRead": ["~/"],
        "allowWrite": ["."]
      }
    }
  }

Use it! :) https://code.claude.com/docs/en/sandboxing

by justinde

3/28/2026 at 2:03:22 AM

Should definitely block .ssh reading too...

by cozzyd

3/28/2026 at 10:31:33 AM

AI safety is just like any technology safety: you can't bubble wrap everything. Think about the early days of electricity. It was deadly (and still is), but we have proper insulation, industry standards and regulations, plus common sense and human learning. We are safe (most of the time).

This also applies to the first technology human beings developed: fire.

by ontouchstart

3/28/2026 at 10:27:30 AM

$ lxc exec claude bash

Easy :-) lxd/lxc containers are much much underrated. Works only with Linux though.

by Aldipower

3/28/2026 at 12:31:19 PM

This site was definitely slopcoded with Claude. They have a real distinctive look.

by MagicMoonlight

3/28/2026 at 3:55:30 PM

What's the difference between this and agent-safehouse?

by Myzel394

3/28/2026 at 3:10:23 AM

i just use seatbelt (mac native) in my custom coding agent: supercode

by faangguyindia

3/28/2026 at 2:10:14 PM

Something like FreeBSD jails would be perfect for agents.

by hoppp

3/28/2026 at 2:23:05 AM

I want agents to modify the file system. I want them to be able to manage my computer if they think it's a good idea. If a build fails due to running out of disk space, I want the agent to be able to find appropriate stuff to delete to free up space.

by charcircuit

3/28/2026 at 12:47:12 PM

Jai is the name of a programming language, no?

by docmars

3/28/2026 at 2:23:23 AM

Not sure I understand the problem. Are people just letting AI do anything? I use Claude Code and it asks for permission to run commands, edit files, etc. No need for sandbox

by gonzalohm

3/28/2026 at 5:47:45 AM

Yes, people very much are, and that's exactly the problem! People run `claude --dangerously-skip-permissions` and `codex --yolo` all the time. And I think one of the appeals of opencode (besides cross-model, which is huge) is that the permissions are looser by default. These options are presumably intended for VM or container environments, but people are running them outside. And of course it works fine the first 100 times people do it, which drives them to take bigger and bigger risks.
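To make the "running them outside" point concrete, here is a minimal sketch of a launcher-side check that refuses the two flags named above unless the process appears to be inside a container. The function names are hypothetical, and the container test is only a heuristic (it assumes Docker's `/.dockerenv` or Podman's `/run/.containerenv` marker files); it is not something claude or codex ship.

```python
from pathlib import Path

# The two permission-skipping flags mentioned in the comment above.
RISKY_FLAGS = frozenset({"--dangerously-skip-permissions", "--yolo"})


def uses_risky_flag(argv: list[str]) -> bool:
    """True if any argument after the program name is a known risky flag."""
    return any(arg in RISKY_FLAGS for arg in argv[1:])


def looks_like_container() -> bool:
    """Heuristic only: marker files that Docker and Podman usually create."""
    return Path("/.dockerenv").exists() or Path("/run/.containerenv").exists()


def should_block(argv: list[str], sandboxed: bool) -> bool:
    """Block a risky invocation unless we believe we are sandboxed."""
    return uses_risky_flag(argv) and not sandboxed


if __name__ == "__main__":
    command = ["claude", "--dangerously-skip-permissions"]
    if should_block(command, sandboxed=looks_like_container()):
        print("refusing: permission-skipping flag outside a container")
    else:
        print("ok to launch:", " ".join(command))
```

A check like this only catches the honest mistake of typing the flag in your home directory; it does nothing against an agent that is already running.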

by mazieres

3/28/2026 at 5:41:55 PM

It's a bit annoying that there are so many solutions to run agents and sandbox them but no established best practice. It would be nice to have some high level orchestration tools like docker / podman where you can configure how e.g. claude code, opencode, codex, openclaw run in open Shell, OCI container, jai etc.

Especially because everybody can ask chatgpt/claude how to run some agents without any further knowledge I feel we should handle it more like we are handling encryption where the advice is to use established libraries and don't implement those algorithms by yourself.

by ma2kx

3/28/2026 at 12:10:59 PM

Is there an equivalent for macOS?

by love2read

3/28/2026 at 7:17:10 AM

If it has a big splash page with no technical information, it's trying to trick you into using it. That doesn't mean it isn't useful, but it does mean it's disingenuous.

This particular solution is very bad. To start off with, it's basically offering you security, right? Look, bars in front of an evil AI! An AI jail! That's secure, right? Yet the very first mode it offers you is insecure. The "casual" mode allows read access to your whole home directory. That is enough to grant most attackers access to your entire digital life.

Most people today use webmail. And most people today allow things like cookies to be stored unencrypted on disk. This means an attacker can read a cookie off your disk, and get into your mail. Once you have mail, you have everything, because virtually every account's password reset works through mail.

And this solution doesn't stop AI exfiltration of sensitive data, like those cookies, out to the internet. Or malware being downloaded into copy-on-write storage space, to open a reverse shell and manipulate your existing browser sessions. But they don't mention that on the fancy splash page of the security tool.

The truth is that you actually need a sophisticated, complex-as-hell system to protect from AI attacks. There is no casual way to AI security. People need to know that, and splashy pages like this that give the appearance of security don't help the situation. Sure, it has disclaimers occasionally about it not being perfect security, read the security model here, etc. But the only people reading that are security experts, and they don't need a splash page!

Stanford: please change this page to be less misleading. If you must continue this project with its obviously insecure modes, you need to clearly emphasize how insecure it is by default. (I don't think it even qualifies as security software)

by 0xbadcafebee

3/28/2026 at 9:03:34 AM

It is a bit better than you're saying. When you fire it up, you can see that it does have a list of common credential areas that it hides from the jail. It seems to hide:

    .aws  .azure  .bash_history .config  .docker  .git-credentials  .gnupg  .jai  .local  .mozilla  .netrc  .password-store  .ssh  .zsh_history

It's a humorous attempt in a sense, but better than nothing for sure!
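If you want to see how that list lines up with your own home directory, a rough sketch like the following works. The names are copied from the list above (as I observed it, not from jai's documentation), and the two helper functions are made up for illustration.

```python
from pathlib import Path

# Names copied from the list above; this is an observation, not jai's spec.
HIDDEN_NAMES = (
    ".aws", ".azure", ".bash_history", ".config", ".docker",
    ".git-credentials", ".gnupg", ".jai", ".local", ".mozilla",
    ".netrc", ".password-store", ".ssh", ".zsh_history",
)


def hidden_entries_present(home: Path) -> list[str]:
    """Listed names that actually exist under `home`, sorted."""
    return sorted(name for name in HIDDEN_NAMES if (home / name).exists())


def unlisted_dot_entries(home: Path) -> list[str]:
    """Dot-entries under `home` that are NOT on the list, sorted.

    These are the ones worth eyeballing for credentials that would
    stay readable if only the listed names are hidden.
    """
    return sorted(
        entry.name
        for entry in home.iterdir()
        if entry.name.startswith(".") and entry.name not in HIDDEN_NAMES
    )


if __name__ == "__main__":
    home = Path.home()
    print("covered by the list:", hidden_entries_present(home))
    print("not on the list:", unlisted_dot_entries(home))
```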

by yobert

3/28/2026 at 10:32:23 AM

or you can just run nanoclaw for isolation by default?

https://nanoclaw.dev

by mbravorus

3/28/2026 at 4:41:16 AM

Just allow yolo, and sometimes roll back.

by samchon

3/28/2026 at 5:07:00 AM

Can we have a hardware level implementation of git (the idea of files/data having history preserved. Not necessarily all bells and whistles.) ...in a future where storage is cheap.

by albert_e

3/28/2026 at 5:17:43 AM

Now we just need one for every python package.

by samlinnfer

3/28/2026 at 9:45:27 AM

TLDR: It's easy : LLM outputs are untrusted. Agents by virtue of running untrusted inputs are malware. Handle them like the malware they are.

>>> "While this web site was obviously made by an LLM" So I am expecting to trust the LLM written security model https://jai.scs.stanford.edu/security.html

These guys are experts from a prestigious academic institution. Leading "Secure Computer Systems", whose logo is a 7 branch red star, which looks like a devil head, with white palm trees in the background. They are also chilling for some Blockchain research, and future digital currency initiative, taking funding from DARPA.

The website also points towards external social networks for reference to freely spread Fear Uncertainty Doubt.

So these guys are saying, go on run malware on your computer but do so with our casual sandbox at your own risk.

Remember until yesterday Anthropic aka Claude was officially a supply chain risk.

If you want to experiment with agents safely (you probably can't), I recommend building them from the ground up (to be clear I recommend you don't but if you must) by writing the tools the LLM is allowed to use, yourself, and by determining at each step whether or not you broke the security model.

Remember that everything that comes from an LLM is untrusted. You'll be tempted to vibe-code your tools. The LLMs will try to make you install external dependencies, which you must review before deciding whether to trust them.
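As an illustration of what a hand-written tool can look like, here is a minimal sketch of a read-only file tool confined to one directory. Every name in it (`ToolError`, `resolve_inside`, `read_file_tool`, the `max_bytes` cap) is hypothetical, and it is not a sandbox: it only rejects paths that resolve outside the root at the moment of the check.

```python
from pathlib import Path


class ToolError(Exception):
    """Raised when the model asks for something the tool refuses to do."""


def resolve_inside(root: Path, requested: str) -> Path:
    """Resolve `requested` against `root`, refusing anything that escapes it.

    resolve() collapses ".." segments and follows symlinks, so both
    "../x" and a symlink pointing outside the root are rejected.
    """
    root = root.resolve()
    candidate = (root / requested).resolve()
    if not candidate.is_relative_to(root):  # Python 3.9+
        raise ToolError(f"path escapes the workspace: {requested!r}")
    return candidate


def read_file_tool(root: Path, requested: str, max_bytes: int = 65536) -> str:
    """The only file access the model gets: a bounded read under `root`."""
    path = resolve_inside(root, requested)
    with path.open("rb") as handle:
        data = handle.read(max_bytes)
    return data.decode("utf-8", errors="replace")


if __name__ == "__main__":
    workspace = Path.cwd()
    try:
        print(read_file_tool(workspace, "../../etc/passwd"))
    except ToolError as err:
        print("refused:", err)
```

The point is that the model's output is treated as untrusted input to a function you wrote and can read in full, rather than as a shell command.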

Because everything produced by the LLM is untrusted, sharing the results is risky. A good starting point is to have the LLM produce a single-page HTML file. Serve this static page from a webserver (on an external server to rely on Same Origin Policy to prevent the page from accessing your files and network (like github pages using a new handle if you can't afford a vps) ). This way you rely on your browser sandbox to keep you safe, and you are as safe as when visiting a malware-infested page on the internet.

If you are afraid of writing tools you can start by copy-pasting, and reading everything produced.

Once you write tools, you'll want to have them run autonomously in a runaway loop taking user feedback or agent feedback as input. But even if everything is contained, these runaway loops can and will produce harmful content in your name.

Here is one such vibe-coded experiment I did a few days ago. A simple 2d physics water molecules simulation for educational purposes. It is not physically accurate, and still has some bugs, and regressions between versions. Good enough to be harmful. https://news.ycombinator.com/item?id=47510746

by GistNoesis

3/29/2026 at 3:35:19 AM

[dead]

by edinetdb

3/28/2026 at 10:04:28 PM

[dead]

by aplomb1026

3/28/2026 at 4:51:52 PM

[dead]

by maxbeech

3/28/2026 at 2:04:17 PM

[dead]

by iisweetheartii

3/29/2026 at 6:11:07 AM

[dead]

by firekey_browser

3/28/2026 at 11:18:48 AM

[dead]

by pugchat

3/28/2026 at 2:52:27 PM

[flagged]

by minsung0830

3/28/2026 at 6:49:23 AM

[dead]

by orthogonalinfo

3/28/2026 at 7:30:55 AM

[dead]

by techpulselab

3/28/2026 at 11:05:26 PM

[dead]

by maxbeech

3/28/2026 at 8:27:15 AM

[flagged]

by georaa

3/28/2026 at 2:51:47 PM

[dead]

by RodMiller

3/28/2026 at 3:59:33 PM

[dead]

by maltyxxx

3/28/2026 at 7:07:58 PM

[flagged]

by georaa

3/30/2026 at 1:35:26 AM

jai uses persistent file systems for your changes (except for `/tmp`, `/var/tmp` and `/run/user`, which usually aren't persistent anyway). The data is stored under `$HOME/.jai` by default. But you can also expose specific directories to persist state in your home directory like `jai -d ~/.codex`, which would then survive even if you manually deleted your whole `~/.jai` directory.

by mazieres

3/28/2026 at 7:31:55 AM

[dead]

by hikaru_ai

3/28/2026 at 5:57:18 AM

[flagged]

by kevinbaiv

3/28/2026 at 12:40:54 PM

[dead]

by rsmtjohn

3/28/2026 at 4:19:10 PM

[dead]

by jeninho

3/28/2026 at 8:38:55 AM

[flagged]

by commers148

3/28/2026 at 1:56:04 PM

[dead]

by emiliazar

3/28/2026 at 9:03:56 AM

[dead]

by Rikyz90

3/28/2026 at 1:32:16 AM

[flagged]

by drtournier

3/28/2026 at 1:33:54 AM

So?

by mememememememo

3/28/2026 at 1:50:14 AM

[flagged]

by gerdesj

3/28/2026 at 3:07:56 AM

The irony is they used an LLM to write the entire (horribly written) text of that webpage.

When is HN gonna get a rule against AI/generated slop? Can’t come soon enough.

by avazhi

3/28/2026 at 3:28:05 AM

This won't cause any confusion with the jai language :)

by rdevsrex

3/28/2026 at 4:55:00 AM

Ugh.

The name jai is very taken[1]... names matter.

[1]: https://en.wikipedia.org/wiki/Jai_(programming_language)

by schaefer

3/28/2026 at 5:49:28 AM

a closed beta of an obscure programming language where the wikipedia page is nominated for deletion because it is a "Non-notable programming language that is not publicly available." is considered "very taken"?

by john_strinlai

3/28/2026 at 5:55:58 AM

That's an unreleased product in closed beta. Might not any name conflict with some unreleased product in closed beta?

by qq66

3/28/2026 at 5:05:39 AM

Slightly taken, at best.

by vscode-rest

3/28/2026 at 5:44:11 AM

Jonathan Blow has said that "Jai" is just a placeholder name or something.

by diego_sandoval

3/28/2026 at 12:34:29 PM

I hadn’t heard that. Thanks

by schaefer