4/8/2026 at 3:31:19 AM
Someone needs to make a compilation of all these classic OpenAI moments, including hits like: GPT-2 is too dangerous, the 64x64 image model DALL-E is too scary, "push the veil of ignorance back", AGI achieved internally, Q*/Strawberry can solve math and is making OpenAI researchers panic, etc., etc.
I use Codex, btw, and I really love it. But some of these companies have been overhyping the capabilities of these models for so many years now that it's both funny to look back and tiresome to still keep hearing it.
Meanwhile, I am at my wits' end after NONE of Codex GPT-5.4 on Extra High, Claude Opus 4.6-1M on Max, Opus 4.6 on Max, and Gemini 3.1 Pro on High has been able to solve a very straightforward and basic UI bug I'm facing. To the point where, after wasting a day on this, I am now just going to go through the (single file) of code and just fix it myself.
Update: some 20 minutes later, I have fixed the bug. Despite not knowing this particular programming language or framework.
by SilverSlash
4/8/2026 at 3:32:55 AM
> I am now just going to go through the (single file) of code and just fix it myself.

That's front-page news, in this era.
by DougMerritt
4/8/2026 at 3:42:42 AM
I understand how laughable that sounds when I say it out loud. But the reality is, when I'm in a state of 'Tell LLM what to do, verify, repeat', it's really hard to sometimes break out of that loop and do manual fixes.
Maybe the brain has some advanced optimization where, once you're in a loop, roughly staying inside that loop has a lower impedance than starting one. Maybe that's why the flow state feels so magical: it's when resistance is at its lowest. Maybe I need sleep.
by SilverSlash
4/8/2026 at 4:00:25 AM
> it's really hard to sometimes break out of that loop and do manual fixes

You're aware of the MIT Media Lab study[0] from last summer regarding LLM usage and eroding critical thinking skills...?
[0] "Your Brain on ChatGPT: Accumulation of Cognitive Debt when Using an AI Assistant for Essay Writing Task", June 2025. DOI: 10.48550/arXiv.2506.08872
by logifail
4/8/2026 at 4:17:45 AM
>> it's really hard to sometimes break out of that loop and do manual fixes

It's not just an erosion of skills; it can also break the whole LLM toolchain flow.
Easy example: put together some fairly complicated multi-faceted program with an LLM. You'll eventually hit a bug that it needs to be coaxed into fixing. In the middle of this bug-fixing conversation, go ahead and fire up an editor and flip a true/false or change a value.
Half the time it'll go unnoticed. The other half of the time the LLM will do a git diff and see those values changed. It will then proceed to go on a tangent, auditing the code for specific methods or reasons that could have autonomously flipped those values.
This creates a behavior where you not only have to flip the value, the next prompt to the LLM also has to be "I just flipped Y value..." in order to prevent the tangent that it (quite rightfully, in most cases) goes off on when it sees a mysteriously changed value.
So you either lean in and tell the LLM "flip this value", or you flip the value yourself and then explain. It takes more tokens to explain, in most cases, so you generally eat the time and let the LLM sort it.
So yeah, skill erosion, but it's also just a point of technical friction right now that'll improve.
by serf
4/8/2026 at 7:48:59 PM
Every time I change something outside the chat interface, Claude tells me a linter made a change.
by malfist
4/8/2026 at 2:15:29 PM
This was a great comment. I don't know if it's common knowledge, but it really helped clarify how the shift happens.
I also remember half coding and half prompting a few months back, only to be frustrated when my manual changes started to confuse the LLM. Eventually, you either have to make every change through prompting, or be OK with throwing away an existing session and adding the relevant context back into a fresh one.
by crakhamster01
4/8/2026 at 1:39:39 PM
When I have to pop in and solve a problem, I tell it I fixed it and what was wrong. Depending on the depth of its misunderstanding, it could become a memory note or a README update. I haven’t had any real issues with that approach.
by dd8601fn
4/8/2026 at 4:04:14 PM
It sucks that you have to do this. I'm not yet at the point where I'm comfortable with just vibe coding slop and committing it to source control. I'm always going in and correcting things the LLM does wrong, and it really sucks to have to keep a mental list of all the changes you made, just so you can tell your Eager Electronic Intern that you made them deliberately and that it should not undo them or agonize over them.
by ryandrake
4/8/2026 at 12:50:07 PM
> But the reality is, when I'm in a state of 'Tell LLM what to do, verify, repeat', it's really hard to sometimes break out of that loop and do manual fixes.

My experience is rather that I get annoyed by bullshit really fast: if the model does not give me something that is really good, or that can at least easily be told what needs to be done to make it exceptional, I tend to lose my temper quickly and get annoyed with the LLM.
With this in mind, I rather have the feeling that you are simply too tolerant of shitty code.
by aleph_minus_one
4/8/2026 at 8:12:57 AM
I have the same problem. I had lines directly in front of me where I needed to change some trivial thing, and I still prompted the AI to do it. Also, for some tasks AI is just less error-prone, and vice versa. But it seems the context switch from prompting to coding isn't trivial.
by raxxorraxor
4/8/2026 at 4:08:01 AM
I think it’s called the "sunk cost fallacy".
by Sharlin
4/8/2026 at 5:09:11 AM
"The last output is so close to exactly what I wanted, I can't not pull the machine's lever a few more times to finally get the jackpot..."
by Terr_
4/8/2026 at 3:58:40 AM
> Maybe the brain…is already damaged by reliance on AI.
by Traubenfuchs
4/8/2026 at 2:06:13 PM
And that’s exactly why I’ve stopped using LLMs entirely.
People who are using them frequently: you’re delusional if you think your brain is not harmed. I won’t go into great detail because I can’t be bothered, and I’m sure this post will be downvoted, but I can share my own experience. Ever since I stopped using them, my ability to focus, think hard, hold concepts in my brain, and reason about them has increased immensely. Not only that, but I regained the conditioning of my brain to ‘deal with the pain’ that comes with deep thought. All of that gets lost by spending too much time interacting with LLMs.
by ieijdd
4/8/2026 at 7:37:02 AM
[dead]
by baby6343
4/8/2026 at 3:36:25 AM
Thank you for the belly laugh.
by abnercoimbre
4/8/2026 at 3:36:16 AM
Are you sure they are not just refusing to solve your UI bug due to safety concerns? They may be worried you'll take over the world once your UX becomes too good.
by loveparade
4/8/2026 at 4:02:23 AM
> a very straightforward and basic UI bug

Show us the code, or an obfuscated snippet. A common challenge with coding-agent related posts is that the described experiences have no associated context, and readers have no way of knowing whether the problem is the model, the task, the company, or even the developer.
Nobody learns anything without context, including the poster.
by jeswin
4/8/2026 at 7:16:37 AM
A pretty easy way to construct a bug that is easy for a human to solve but difficult for an AI is to have it do something with z-indexes. For instance, if your element isn't rendering because something else is on top of it, Claude will struggle: it's not running a browser, so the only way it could possibly know there was a bug would be to read every single CSS and HTML file in your entire repo. A human, on the other hand, can trivially observe the failure in a browser and then fix it.
This is a pretty simple example, but you can imagine how CSS issues get progressively more difficult for AIs to solve. A CSS bug can be made to require reading arbitrarily much code if you solve it by reading code alone, but only looking at relatively few elements if you look at the page with your eyes.
This can be somewhat solved by hooking up a harness to screenshot the page and feed it into the AI, but it isn't perfect even then.
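Such a harness is essentially just a loop. Here is a minimal sketch; every callable in it (`take_screenshot`, `looks_fixed`, `ask_model`, `apply_patch`) is a hypothetical stand-in for whatever screenshot tool and model API you actually wire up, not a real library:

```python
# Hypothetical visual-feedback loop for UI bug-fixing. The injected callables
# are illustrative stand-ins (e.g. take_screenshot could wrap a Playwright
# page.screenshot() call), not a real API.
def visual_fix_loop(take_screenshot, looks_fixed, ask_model, apply_patch,
                    max_iters=5):
    """Screenshot the page, ask the model for a patch, apply it, repeat."""
    for attempt in range(max_iters):
        shot = take_screenshot()       # capture current rendered state
        if looks_fixed(shot):          # visual check: is the bug gone?
            return attempt             # number of patches it took
        patch = ask_model(shot)        # feed the image back to the model
        apply_patch(patch)             # write the suggested change to disk
    return None                        # gave up: the loop isn't perfect
```

Even with a loop like this, the model only sees what the screenshot happens to capture, which is part of why it stays imperfect for layering bugs like a stray z-index.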
by johnfn
4/8/2026 at 4:21:56 AM
That's hard to believe in my case. I tried a variety of prompts and 3 different frontier models, provided manual screenshot(s), and the agent itself also took its own screenshots from tests during the course of debugging. Nothing worked. I have now fixed the bug manually after 15-20 minutes of playing around with a codebase where I don't know the language and hadn't written a single line of code until now.
by SilverSlash
4/8/2026 at 7:30:15 PM
What's hard to believe? OP just asked what the bug was.
by johnfn
4/8/2026 at 5:31:02 AM
> after wasting a day on this, I am now just going to go through the (single file) of code and just fix it myself.

Seriously, you wasted a whole day just so you wouldn't have to look at a single file of code?
> Update: some 20 minutes later, I have fixed the bug. Despite not knowing this particular programming language or framework.
Be really careful there, you might have accidentally learned something.
by isolay
4/8/2026 at 7:07:52 AM
It is entirely plausible they were just experimenting with AI tooling to better understand how to use it and what it is capable of. Their saying 'Despite not knowing this particular programming language or framework' indicates to me this is probably the case.
by OuterVale
4/8/2026 at 8:09:48 AM
Nope. I've been working on this project for a couple of days now, and things were mostly going well. A significant portion of the MVP backend and frontend was built and working. Then this one seemingly simple bug appeared and just totally stumped both Codex and Claude Code.
There was even another UI component (in the same file) which was almost the same but slightly different, and that one was correct. That's what I copy-pasted and tweaked when I fixed the problem. But for some reason the models were utterly incapable of making that connection.
With Codex and Claude Code, I thought that maybe, because these agentic coding tools are trained to be conservative with tokens and to aggressively use grep, they weren't looking at the full file in one go.
But with Gemini I used the web version and literally pasted that entire file + screenshots detailing what was wrong (including the other component which was rendering correctly) and it still couldn't solve it. It was bewildering.
by SilverSlash
4/8/2026 at 10:31:41 AM
I had the exact same issue: a UI scrollbar bug that Claude couldn't fix. It tried 4-5 different ideas that it was sure were causing the issue; none of them worked. Tried the same with Codex; it did a little better, but still took 4 times around.
This is with Playwright automation, screenshots, full access, etc.
by mlrtime
4/8/2026 at 3:48:53 AM
I told my manager in a check-in that I wrote my code line by line (most of it). I showed him the @author tag with my name, and we laughed for a bit.
But I think that is the best way to have a clear mental model. Otherwise, no matter how careful you are, you always have tech debt building and churning.
Also they really suck at UI bugs and CSS. Unit test that stuff.
by rain-princess
4/8/2026 at 4:24:31 AM
I had a problem that required a recursive solution, and Opus 4.6 nearly used all my credits trying to solve it, to no avail. In the AI apocalypse, I hope I'm not judged too harshly for my words near the end of all those sessions, lol.
by derangedHorse
4/8/2026 at 3:47:14 AM
Yeah, they all suck at UI. Have you given it a feedback loop? Update code, screenshot, read image, repeat, etc. That's the best I've found, as long as tokens aren't a concern.
by willsmith72
4/8/2026 at 6:07:05 AM
Haven't you heard? "Coding is solved."by BobbyJo
4/8/2026 at 3:44:54 AM
> I am now just going to go through the (single file) of code and just fix it myself.

You can't, it's all vibed; you'll face the art-vs-build internal struggle and end up re-coding the entire thing by hand.
by saltyoldman