2/25/2026 at 10:35:53 AM
> Then a brick hits you in the face when it dawns on you that all of our tools are dumping crazy amounts of non-relevant context into stdout thereby polluting your context windows.

I've found that letting the agent write its own optimized script for dealing with some things can really help with this. Claude is now forbidden from using `gradlew` directly, and can only use a helper script we made. It clears, recompiles, publishes locally, tests, ... all with a few extra flags. And when a test fails, the stack trace is printed.
Before this, Claude had to do A TON of different calls, all messing up the context. And when tests failed, it started to read gradle's generated HTML/XML files, which damaged the context immensely, since they contain a bunch of inline javascript.
And I've also been implementing this "LLM=true"-like behaviour in most of my applications. When an LLM is using it, logging is less verbose and deduplicated, so it doesn't show the same line a hundred times, ...
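A minimal sketch of that deduplication idea, assuming a shell pipeline (the `llm_filter` name and the `LLM` variable convention are mine, used for illustration):

```shell
#!/bin/bash
# Hypothetical LLM-aware log filter: when LLM=true, drop exact duplicate
# lines so the agent never sees the same message a hundred times.
llm_filter() {
  if [ "${LLM:-false}" = "true" ]; then
    # awk prints each distinct line only the first time it appears.
    awk '!seen[$0]++'
  else
    cat  # humans get the full, verbose stream
  fi
}

# Hypothetical usage:  my-app 2>&1 | llm_filter
```

The same switch could also gate log levels, so the agent sees warnings and errors while humans keep the debug stream.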
> He sees something goes wrong, but now he cut off the stacktraces by using tail, so he tries again using a bigger tail. Not satisfied with what he sees HE TRIES AGAIN with a bigger tail, and … you see the problem. It’s like a dog chasing its own tail.
I've had the same issue. Claude was running the 5+ minute test suite MULTIPLE TIMES in succession, just with a different `| grep something` tacked on at the end. Now, the scripts I made always log the entire (simplified) output and just print the path to the temporary file. This works so much better.
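A minimal sketch of this kind of wrapper, assuming bash (the `quiet_build` name and the failure patterns are illustrative, not the actual script):

```shell
#!/bin/bash
# Hypothetical low-noise build wrapper: run the full build command, keep
# the complete output in a temp file, and print only the failure lines
# plus the log path back to the agent.
quiet_build() {
  local log
  log="$(mktemp)"
  if "$@" >"$log" 2>&1; then
    echo "BUILD OK (full log: $log)"
  else
    echo "BUILD FAILED (full log: $log)"
    # Surface only stack traces / failure lines, not the whole stream.
    grep -E 'FAILED|Exception|error' "$log" | head -50
    return 1
  fi
}

# Real use might look like:
#   quiet_build ./gradlew --quiet --console=plain clean build test
```

The agent can then grep the logged file on disk instead of re-running the whole suite with a different filter each time.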
by skerit
2/25/2026 at 12:53:33 PM
> Claude is now forbidden from using `gradlew` directly, and can only use a helper script we made. It clears, recompiles, publishes locally, tests, ... all with a few extra flags. And when a test fails, the stack trace is printed.

I think my question at this point is what about this is specific to LLMs. Humans should not be forced to wade through reams of garbage output either.
by majewsky
2/25/2026 at 1:18:04 PM
Humans have the ability to ignore and generally not remember things after a short scan, to prioritize what's actually important, etc. But to an LLM, a token is a token.

There are attempts at effectively doing something similar with analysis passes over the context - that's roughly what things like auto-compaction do - but I'm sure anyone who has used the current generation of those tools will tell you they're very much imperfect.
by kimixa
2/25/2026 at 2:53:33 PM
The “a token is a token” effect makes LLMs really bad at some things humans are great at, and really good at some things humans are terrible at.

For example, I quickly get bored looking through long logfiles for anomalies, but an LLM can highlight those super quickly.
by pennomi
2/25/2026 at 4:32:06 PM
Isn’t the purpose of self attention exactly to recognize the relevance of some tokens over others?
by dcrazy
2/25/2026 at 6:05:56 PM
That may help with tokens being "ignored" while still being in the context window, but not with context window size costs and limitations in the first place.
by kimixa
2/25/2026 at 7:28:37 PM
In my experience, it's the old time-invested vs. time-saved trade-off. If you're not looking at these reams of output often enough, the incentive to figure out all the flags and configs for verbosity and to write these scripts is lower: https://xkcd.com/1205/

And because these issues are often sporadic, doing all this would be an unwanted side quest, so humans grit their teeth and wade through the garbage manually each time.
With LLMs, the cost is effectively 0 compared to a human, so it doesn't matter. Have them write the script. In fact, because it benefits the LLM by reducing context pollution, which increases their accuracy, such measures should be actively identified and put in place.
by keeda
2/25/2026 at 1:09:21 PM
Lots of tools have a --quiet or --output json type option, which is usually helpful.
by adammarples
2/25/2026 at 10:59:06 AM
The way I've solved this issue with a long-running build script is to have a logging script which redirects all output into a file, and can be included with

```shell
# Redirect all output to a log file (re-execs script with redirection)
source "$(dirname "$0")/common/logging.sh"
```

at the start of a script.

Then when the script runs, the output is put into a file, and the LLM can search that. Works like a charm.
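A sketch of what such a `logging.sh` might contain, assuming bash (this is my reconstruction of the described re-exec trick, not the commenter's actual file):

```shell
#!/bin/bash
# Hypothetical common/logging.sh: when sourced at the top of a build
# script, it re-execs that script with all output redirected to a log
# file, printing only the log path to the caller.
if [ -z "${LOG_REDIRECTED:-}" ]; then
  LOG_FILE="$(mktemp)"
  echo "Output redirected to: $LOG_FILE"
  # Re-run the calling script ($0) once, with a guard variable set so
  # we never redirect recursively; stdout and stderr both go to the log.
  LOG_REDIRECTED=1 exec bash "$0" "$@" >"$LOG_FILE" 2>&1
fi
```

The guard variable is what makes the `source` idiom safe: on the second pass through the script the redirection block is a no-op.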
by ViktorEE
2/25/2026 at 10:40:44 AM
This has been my exact experience with agents using gradle and it’s beyond frustrating to watch. I’ve been meaning to set up my own low-noise wrapper script.

This post just inspired me to tackle this once and for all today.
by quintu5
2/25/2026 at 11:06:45 AM
Wow, I'd love to do this. Any tips on how to build this (or how to help an LLM build this), specifically for ./gradlew?
by petedoyle
2/25/2026 at 3:23:12 PM
How is it forbidden? I tell agents to use my wrappers in AGENTS but they ignore it half the time and use the naked tool.
by esafak
2/25/2026 at 4:26:09 PM
If you get desperate, I've given my agent a custom $PATH that replaces the forbidden tools with shims that either call the correct tool, or at least tell it what to do differently.

~/agent-shims/mvn:

```shell
#!/bin/bash
echo "Usage of 'mvn' is forbidden. Use build.sh or run-tests.sh"
```
That way it is prevented from using the wrong tools, and can self-correct when it tries.
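Setting up a directory of such shims could look like this sketch (the tool list and messages are illustrative, not the commenter's actual setup):

```shell
#!/bin/bash
# Hypothetical shim setup: each forbidden tool is replaced by a stub
# that fails loudly and tells the agent what to use instead.
SHIM_DIR="$HOME/agent-shims"
mkdir -p "$SHIM_DIR"

for tool in mvn gradle gradlew; do
  # The unquoted EOF lets $tool expand when the shim is written.
  cat > "$SHIM_DIR/$tool" <<EOF
#!/bin/bash
echo "Usage of '$tool' is forbidden. Use build.sh or run-tests.sh" >&2
exit 1
EOF
  chmod +x "$SHIM_DIR/$tool"
done

# Launch the agent with the shims first on PATH so they win lookup:
#   PATH="$SHIM_DIR:$PATH" <agent command>
```

Because the shims exit non-zero, the agent sees the failure immediately and can self-correct instead of silently using the wrong tool.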
by Squid_Tamer
2/25/2026 at 3:42:52 PM
Permissions scoping.
by simsla
2/25/2026 at 3:45:33 PM
Then they attempt to download the missing tool or write a substitute from scratch. Am I the only one who runs into this??
by esafak