5/20/2025 at 6:33:42 PM
I was a Plus subscriber and upgraded to Pro just to test Codex, and at least in my experience, it’s been pretty underwhelming.

First, I don’t think they’ve got the UX quite right yet. Having to wait an undefined amount of time before getting a result is definitely not great, although the async nature of Codex (being able to run multiple tasks at once) alleviates this somewhat.
Another thing that bugs me is having to define an environment for the tool to be useful. This is very problematic because, AFAIK, you can’t spin up containers that might be needed in tests, which severely limits its usefulness. I guess this will eventually change, but the fact that it’s also completely isolated from the internet seems limiting too, since one of the reasons o3 is so powerful in ChatGPT is that it can autonomously research on the web to find updated information on whatever you need.
For comparison, I also use Claude a lot, and I’ve found it works really well for finding obscure bugs in a somewhat complex React application by creating a project and adding the GitHub repo as a source. What this gives me is a very short wait time, and the difference with Codex is just night and day. Gemini also lets you do this now, and it works very well because of its massive context window.
All that being said, I do understand where OpenAI is going with this. I guess they want to achieve something like a real coworker (they even say that in their promotional videos for Codex): you’re supposed to give tasks to Codex and wait until it’s done, like with a real human. But again, IMHO, it’s too “pull-request-focused”.
I guess I’ll be downgrading to Plus again and waiting a little to see where this ends up.
by rmonvfer
5/21/2025 at 2:28:05 PM
I agree on the UX. A few basic things seem totally broken.

The flow of connecting a GitHub account works, then disconnects, sometimes doesn't work, sometimes just errors. I can't install things that I could yesterday, and my environment is just... broken? I have two versions of a repo and it works in only one.
Speed is a big thing. Not the LLM part so much, but the setup and everything around it at each step.
Not having search cripples some cases where o3 seems incredible.
But there are a lot of places where this feels like it can land tasks that often wouldn't get done otherwise. A near-infinite army of juniors who can take on lots of tiny tasks in 15-20 minutes is great. Fix some typos, add a few util functions (a task I have running right now); I even asked it to add a new endpoint to a server, and it added it, along with the migrations needed, tests and more, and it seems alright.
The ideal workflow here, in a way, is that the people asking for these things get to tag the ticket to codex/whatever, it runs off and does the thing, the PR lands, discussion and changes happen there, demo envs are set up, and then someone can check and approve it.
edit -
To be fair, I also used Firebase Studio, and that was worse. Blank screens, errors in the console; when I refreshed and moved around and finally got an actual page, it ended up failing to set up Firebase. The UI for editing and the code totally failed after that, and I couldn't follow the explanations I was linked to for how to fix it.
by IanCal
5/22/2025 at 10:34:26 AM
It's a shame nobody has invented some sort of computerised intelligence that understands code and could fix some of those bugs. Ah well.
by ifwinterco
5/22/2025 at 12:10:18 AM
> AFAIK, you can’t spin up containers that might be needed in tests, severely limiting its usefulness.

This is what's blocking me right now. I couldn't find any documentation on whether they allow Docker-in-Docker, which typically means the answer is "no". Since I'm building an AWS-native app, I use LocalStack for end-to-end tests, which requires a container engine. Codex not having one is a showstopper.
by alexjplant
5/22/2025 at 3:17:54 PM
This might not help you, but to the very best of my knowledge, LocalStack can operate over the network just fine, and I'm pretty sure it has a reset endpoint for zeroing its state (I think it's this: https://github.com/localstack/localstack/blob/v4.4.0/localst...).

The other alternative is that I've seen folks mention systemd-nspawn as a form of isolation, if that's what you're using Docker for (but I've never tried it myself).
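A minimal sketch of the "LocalStack over the network" idea: instead of assuming a local container engine, the test suite resolves the LocalStack endpoint from an environment variable and resets state between runs via LocalStack's documented health endpoint (`POST /_localstack/health` with `{"action": "restart"}`). The `LOCALSTACK_URL` variable name and both helper functions are made up for this sketch, not LocalStack conventions; the dedicated state-reset endpoint linked above may differ by version.

```python
import os
import urllib.request

def localstack_endpoint(default="http://localhost:4566"):
    """Resolve the LocalStack endpoint URL for the test suite.

    LOCALSTACK_URL is a hypothetical env var: set it to a remote host
    when LocalStack runs outside the sandbox (e.g. on a CI box)."""
    return os.environ.get("LOCALSTACK_URL", default)

def reset_localstack(endpoint=None):
    """Ask a running LocalStack to restart, clearing its in-memory state.

    Uses the health endpoint's restart action; newer releases may also
    expose a dedicated state-reset endpoint (see the link above)."""
    url = (endpoint or localstack_endpoint()) + "/_localstack/health"
    req = urllib.request.Request(
        url,
        data=b'{"action": "restart"}',
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

You'd then pass `endpoint_url=localstack_endpoint()` when constructing your AWS SDK clients, so the same tests run against a local or remote LocalStack without a container engine in the Codex environment.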
by mdaniel
5/20/2025 at 9:09:12 PM
It really needs container support.
by anxman