2/17/2026 at 11:35:48 PM
> Especially curious about your current workflows when you receive an alert from any of these channels like Sentry (error tracking), Datadog (APM), or user feedback.

I have a GitHub Action that runs hourly. It pulls new issues from Sentry, grabs as much JSON as it can from the API, and pipes it into Claude. Claude is instructed to either make a PR, open an issue, or add more logging if the data is insufficient to diagnose.
I can merge about 30% of the PRs; for the remainder, the LLM has applied a bandaid fix without digging deep enough into the root cause.
Also, the volume of Sentry alerts is high, and the issues being fixed are often unimportant, so it tends to create a lot of “busy work”.
by nojs
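A minimal sketch of the hourly triage Action described above. This assumes a Sentry auth token stored in repo secrets and the Claude Code CLI available on the runner; the org/project names and the exact prompt are placeholders, not the author's actual setup:

```yaml
name: sentry-triage
on:
  schedule:
    - cron: "0 * * * *"  # hourly

jobs:
  triage:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Pull new Sentry issues
        env:
          SENTRY_AUTH_TOKEN: ${{ secrets.SENTRY_AUTH_TOKEN }}
        run: |
          # Grab unresolved issues from the last hour as JSON
          curl -s -H "Authorization: Bearer $SENTRY_AUTH_TOKEN" \
            "https://sentry.io/api/0/projects/my-org/my-project/issues/?query=is:unresolved&statsPeriod=1h" \
            -o issues.json
      - name: Triage with Claude
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          # Instruct Claude to pick one of the three outcomes per issue
          claude -p "For each Sentry issue in issues.json: open a PR with a fix, file a GitHub issue, or add more logging if the data is insufficient to diagnose."
```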
2/18/2026 at 12:35:21 AM
To avoid this 'busy work', we group alerts by RCA (so no duplicate PRs) and filter by severity (so no PRs for false positives or not-that-important issues). We realized early on that turning every alert into a PR just moves the problem from Sentry to GitHub, which defeats the purpose.

Is a one-hour cron job enough to ensure the product’s health? Do you receive alerts by email/Slack/other channels for specific ones, or when a PR is created?
by Dimittri
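The group-by-RCA and severity-filter steps could be sketched as below; the alert records and field names (`rca`, `severity`) are hypothetical stand-ins for whatever the real pipeline produces:

```python
from collections import defaultdict

# Hypothetical alert records; `rca` is the root-cause key from analysis.
ALERTS = [
    {"id": 1, "rca": "db-timeout", "severity": "error"},
    {"id": 2, "rca": "db-timeout", "severity": "error"},
    {"id": 3, "rca": "flaky-healthcheck", "severity": "info"},
]

SEVERITY_FLOOR = {"warning", "error", "fatal"}  # anything below is noise

def group_and_filter(alerts):
    """Return one group per RCA, keeping only alerts above the severity floor."""
    groups = defaultdict(list)
    for alert in alerts:
        if alert["severity"] in SEVERITY_FLOOR:
            groups[alert["rca"]].append(alert)
    return dict(groups)

# One PR candidate ("db-timeout", two alerts); the info-level alert is dropped.
print(group_and_filter(ALERTS))
```

Each surviving group then maps to a single PR, instead of one PR per raw alert.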
2/18/2026 at 2:56:38 AM
Interesting. Yeah, the only reason it’s on a cron is that the Sentry-GitHub integration didn’t work for this (can’t remember why), and I didn’t want to maintain another webhook.

The timing is not a huge issue, though, because the bugs being caught at this stage are rarely so critical that they need to be fixed in less time than that; the bandwidth is limited by someone reviewing the PR anyway.
The other issue is crazy token wastage, which gets expensive. My gut instinct re triaging is that I want to do it myself in the prompt, but if it prevents noise before reaching Claude, it may be useful for some folks just for the token savings.
No, I don’t receive alerts, because I’m looking at the PR/issues list all day anyway; it would just be noise.
by nojs
2/18/2026 at 4:08:49 AM
Totally get the 'token wastage' point: sending noise to an LLM is literally burning money. But another, maybe bigger, cost might be your time reviewing those 'bandaid fixes'. If you're merging only 30%, that means you're spending 70% of your review bandwidth on PRs that shouldn't exist, right?
We deduplicate before the Claude analysis, using the alert context, and again after, based on the RCA, so we ensure there's no noise in the PRs you have to review.
Why don't you trust an agent to triage alerts + issues?
by Dimittri
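The "deduplicate before the Claude analysis" step could be sketched by fingerprinting the stable parts of the alert context, so duplicates never reach the LLM and never cost tokens; the field names here (`error_type`, `top_frame`) are hypothetical:

```python
import hashlib

def context_fingerprint(alert):
    """Hash the stable parts of the alert context (hypothetical fields)."""
    key = f"{alert['error_type']}:{alert['top_frame']}"
    return hashlib.sha256(key.encode()).hexdigest()

def dedupe_before_llm(alerts, seen=None):
    """Drop alerts whose fingerprint has already been sent to the LLM."""
    seen = set() if seen is None else seen
    fresh = []
    for alert in alerts:
        fp = context_fingerprint(alert)
        if fp not in seen:
            seen.add(fp)
            fresh.append(alert)
    return fresh

alerts = [
    {"error_type": "TimeoutError", "top_frame": "db.query"},
    {"error_type": "TimeoutError", "top_frame": "db.query"},  # duplicate
    {"error_type": "KeyError", "top_frame": "parse_payload"},
]
print(len(dedupe_before_llm(alerts)))  # 2
```

Persisting `seen` across runs (e.g. in a small key-value store) would extend this from per-batch to cross-run deduplication.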
2/18/2026 at 7:59:31 AM
Yeah. What I find in practice is that since the majority of these PRs require manual intervention (even if minor, like a single follow-up prompt), it's not significantly better than just hammering them all out in one session myself a few times per week, giving it my full attention for that period.

The exception is when a fix is a) trivial or b) affecting a real user and therefore needs to be fixed quickly, in which case the current workflow is useful. But yeah, the real step change was having Claude hit the Sentry APIs directly and get the info it needs, whether async or not.
I'd also imagine that people's experiences with this vary a lot depending on the size and stage of the company - our focus is developing new features quickly rather than maintaining a 100% available critical production service, for example.
by nojs
2/18/2026 at 6:21:49 PM
Interesting. It makes sense that it depends on the number of alerts you receive, but I’d think that if 70% of the PRs you receive are noise, an AI triager could be useful, provided you give it the context it needs based on your best practices. I’m very curious about the kinds of manual intervention you do on PRs when one is required. What does the follow-up prompt look like? Is it because the fix was bad, because the RCA itself was wrong, or because of something else?

by Dimittri