12/8/2025 at 3:49:17 PM
I wrote a comment saying that this should be possible with a proper Playwright harness and screenshot taking. My comment ended up in the negatives (though curiously no one stopped to explain why), as if I was saying something so absurdly inaccurate that it wasn’t even worth rebutting. Thank you for actually running the experiment and proving it - I was almost annoyed enough to do it myself.
I couldn’t understand why it had happened - it felt about as logical to my mind as writing a comment that Rust was faster than Node. I feel there is a strong anti-AI sentiment here, to the point that people will ignore evidence presented directly to them.
Personal vendetta aside, I enjoyed this post! You had some clever tricks I wouldn’t have considered. In fact, the idea of producing a pixel diff as output was particularly imaginative. And the bit about autoformalization definitely hits on something I’ve been feeling when working with AI recently.
EDIT: I notice my comment yesterday is in the positives. Please don’t vote it up. That was not my intention here.
by johnfn
12/8/2025 at 3:54:24 PM
There's a lot of LLM haters, simple as that.
by simlevesque
12/8/2025 at 11:15:39 PM
For posterity: Claude is no longer “just” an LLM you’re interacting with.
by DANmode
12/8/2025 at 4:48:01 PM
I use AI every day. But some AI adopters are getting a bit culty as well.
by ls-a
12/8/2025 at 5:32:11 PM
But there is nothing culty about saying “an LLM could one-shot this” when it has clearly been demonstrated that an LLM can, in fact, one-shot this!
by johnfn
12/8/2025 at 7:50:22 PM
You must have a different definition of "one-shot"
by socalgal2
12/8/2025 at 7:54:14 PM
"One-shot" means you type one prompt into Claude, press enter, and then judge the results when the AI completes the task. The article says that Claude was able to take a single prompt and produce a pixel-perfect replica.
Is there another definition of "one-shot" I'm not aware of?
by johnfn
12/8/2025 at 8:03:57 PM
Yea, the one that includes all the previous "one-shot" failures.
If you throw 4 basketballs at a hoop and only the 4th one makes it in, that's not "one-shot". It's 4.
Your definition appears to be that, after many failed attempts, a single prompt was found that gets the AI to complete the task. You called that "one-shot", but what really happened is there were several "shots" that failed.
by socalgal2
12/8/2025 at 8:12:51 PM
If three of my friends that have never played basketball take a shot and they all miss, that doesn't say anything about my ability to make a shot. Obviously, my friends are inexperienced with basketball. There is nothing wrong with that, but it's not a fair or just argument to tell people who do have experience that "your success is not a true one-shot because other, inexperienced people failed".
You appear to be extending the definition of "one-shot" to "an LLM should be able to accomplish anything no matter how poorly it was prompted", which is, I hope you can agree, not very reasonable. It's certainly not something I would be arguing in favor of.
by johnfn
12/8/2025 at 11:58:33 PM
Your shot is not dependent on your 3 friends. The "one-shot" prompt only happened because they took the info from previous failed attempts to guide their prompt.
by socalgal2
12/9/2025 at 12:48:37 AM
You seem to be saying that I never would have thought of how to prompt Claude until I became inspired by reading the post yesterday. But I use Claude with Playwright + screenshots basically daily.
by johnfn
12/8/2025 at 11:19:57 PM
I can one-shot a 10-footer, but might need a few attempts to hit a half-court shot. Not all shots are created equal, but I have seen many full-court one-shots by LLMs which would have taken a human (other than Steph) tens if not hundreds of shots :)
by bdangubic
12/8/2025 at 8:07:41 PM
This isn't in a vacuum though, it's after someone else failed and they took the learnings from that and put it into the prompt.
by tapoxi
12/8/2025 at 11:28:39 PM
> I sort of do agree that Claude Code on its own would not be able to do this. But Claude powered by nori configs absolutely should be able to.
Yeah, not to mention setting up additional tooling
by ls-a
12/8/2025 at 4:14:01 PM
Note that I didn't even tell it to use a pixel diff. Claude w/ Nori did that on its own by following the Nori TDD skill. I did very little, I'm actually very lazy :D
by theahura
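For readers unfamiliar with the term: a pixel diff compares two same-sized images pixel by pixel and reports where they disagree. The sketch below is illustrative only and is not what Claude or the Nori TDD skill actually generated. It uses plain nested lists of RGB tuples in place of real screenshots, and the names `pixel_diff` and `tolerance` are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]


def pixel_diff(expected: Image, actual: Image, tolerance: int = 0):
    """Return (mismatch_count, mask) for two images of identical size.

    mask[y][x] is True where any channel differs by more than `tolerance`.
    (Hypothetical helper for illustration; not taken from the blog post.)
    """
    if len(expected) != len(actual) or any(
        len(row_e) != len(row_a) for row_e, row_a in zip(expected, actual)
    ):
        raise ValueError("images must have identical dimensions")

    mask = [
        [
            any(abs(ce - ca) > tolerance for ce, ca in zip(pe, pa))
            for pe, pa in zip(row_e, row_a)
        ]
        for row_e, row_a in zip(expected, actual)
    ]
    # True counts as 1 when summed, so this is the number of mismatched pixels.
    mismatches = sum(cell for row in mask for cell in row)
    return mismatches, mask


# Toy 2x2 "screenshots": only the bottom-right pixel differs.
black, white, red = (0, 0, 0), (255, 255, 255), (255, 0, 0)
original = [[black, white], [white, black]]
recreated = [[black, white], [white, red]]

count, mask = pixel_diff(original, recreated)
print(count)
print(mask)
```

In a real harness the pixel data would presumably come from screenshot files loaded with an image library, and the mismatch count would serve as the pass/fail signal for the test loop.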
12/8/2025 at 4:19:42 PM
There is a quote about lazy developers, but I'm too lazy to search for it.
by stanac
12/8/2025 at 4:39:53 PM
Laziness is one of the three virtues (of a good programmer), but I think Larry didn’t anticipate the current situation when he wrote it:
“The quality that makes you go to great effort to reduce overall energy expenditure. It makes you write labor-saving programs that other people will find useful and document what you wrote so you don't have to answer so many questions about it.”
by eCa
12/8/2025 at 4:51:34 PM
I didn't get downvoted yesterday, but I got pretty far strongly hinting that Claude should use very basic image processing approaches, and it went for OpenCV very successfully. It was very fast on the image layout but failed pretty hard on the footer. This morning I decided to walk it through basic image processing for text detection and word building, and that went pretty well, but I didn't tell it what we were doing and it was too much me telling it what to do. It did sort of realize what we were doing at one point. I was thinking about trying again with just a nudge to think about using basic OCR image processing techniques to detect words and lines and see what Claude comes up with. Was also wondering what it would do if I just told it to use Tesseract or PaddleOCR.
by fluidcruft
12/8/2025 at 8:06:24 PM
Voting is meant to allow a community to police itself to some extent, although the downside to that is it incentivizes controlling the discussion over contributing to it. It’s a lot easier to vote in accordance with your own beliefs than articulate counter-arguments. The prisoner’s dilemma takes hold, people stop visiting as they get frustrated by the downvoting, and a bubble forms. It’ll be interesting to see how AI changes the online discussion landscape.
by jack_h
12/9/2025 at 5:31:43 AM
Also note that votes don't just mean agree/disagree. I frequently upvote comments I disagree with and downvote comments I agree with. The votes place the comments in the discussion ranking so plenty of people vote this way.
One example of this is I might like a conversation that's responding to a comment I don't like. But it's a common misunderstanding so I want the conversation to be boosted. Therefore I upvote the comment I disagree with because it's parent to the comments I want to be more visible.
I don't think I'm the only one that does this on HN. And I think doing this can help reduce the repeated comments. In the above example an early common misconception might get downvoted, not seen by others, and then repeated by some, where it can then rise because it's seen by a different subset.
Anyways, I don't think people should vote strictly on agree/disagree
by godelski
12/10/2025 at 7:56:18 PM
I also don’t think people should vote like that, the problem is there’s no way to enforce such a pattern[1]. Everyone benefits from voting as you describe, but that’s precisely where the prisoner’s dilemma comes into play. There’s also a size component to it. Small, tight-knit communities tend to do well with voting, but as communities grow interactions become less personal, trust drops, and the incentive structure I described above becomes dominant. Voting essentially allows a community to establish its own Overton window distinct from what the official rules create, but that can be changed and constricted until a bubble is established[2]. I’ve seen it happen with countless communities across social media. Despite good intentions I think voting systems are a net negative to fostering good discussions and debate.
[1] Maybe AI meta-moderation?
[2] I don’t mean that this is happening intentionally by bad actors, merely that on average large groups produce outcomes that are dictated by incentives.
by jack_h
12/8/2025 at 4:59:13 PM
Dude, the recreation is a joke (hopefully an intentional one). It uses the screenshot instead of the assets.
Go ahead, turn on the Web Inspector, and remove the body background:
by gaigalas
12/8/2025 at 5:18:32 PM
The article mentions this:
> So it kind of cheated, though it clearly felt angst about it. After trying a few ways to get the stars to line up perfectly, it just gave up and copied the screenshot in as the background image, then overlaid the rest of the HTML elements on top.
by Tiberium
12/8/2025 at 7:54:32 PM
That does not make the title any less clickbaity. Moreover, it does not seem like a vindication of johnfn's original comment.
by Palmik
12/8/2025 at 8:17:40 PM
index_tiled.html is what justifies the title IMO - it's not using a screenshot as the background like index.html, and is as close as you can get using the original assets given the screenshot's scaling and compression artifacts (minus the red text being off).
But I feel it'd make more sense to just retake the screenshot properly and see if it can create a pixel-perfect replica.
by Ukv
12/8/2025 at 5:31:05 PM
The outcome does not justify @johnfn's redemption celebration. That's why I decided to give him a heads up.
Aside from that, I think it's a joke. Like the value of pi example I gave in the other comment. If it's not, it is really just sad.
by gaigalas
12/8/2025 at 5:15:13 PM
Please read the blog post!
by theahura
12/8/2025 at 5:24:20 PM
It's a joke, right? A joke similar to this one:
---
> Make me a python script that calculates the value of PI
```python
print("3.1415")
```
"I think it's passable!" <--- The joke
---
If it's not a joke, then it's just sad.
by gaigalas
12/8/2025 at 5:33:56 PM
I hate to tell you this but all digital representations of pi are numeric approximations. Your joke works, but perhaps not in the direction you were angling for.
by johnfn
12/8/2025 at 5:37:45 PM
Only the digital ones? oof, why so specific?
I would have accepted `22/7`.
by gaigalas
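To put numbers on the two approximations traded in this exchange, here is a quick sketch that compares each against Python's `math.pi` (itself only a double-precision approximation). The dictionary name and labels are made up for illustration.

```python
import math

# The two candidate values mentioned in the comments above.
approximations = {
    "3.1415": 3.1415,
    "22/7": 22 / 7,
}

for label, value in approximations.items():
    error = abs(value - math.pi)
    print(f"{label}: {value:.6f} (absolute error {error:.6f})")
```

By this measure the truncated decimal is the closer of the two: 22/7 is off by roughly 0.0013, while 3.1415 is off by less than 0.0001.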
12/8/2025 at 5:36:54 PM
First, you're being unnecessarily acerbic. It doesn't help your case, and it's just kinda weird!
Second, the original post was obviously about the placement of the buttons on the Space Jam website.
Third, I spend at least half the blog post responding to the exact complaint you have. If you do not have more to add beyond pointing out that the 'hack' exists, you aren't adding to the conversation.
Fourth, the blog post and the repo have a version that does not include the screenshot and actually tiles the gif.
I'm still convinced you haven't actually read the blog post because you have shown zero indication that you are engaging with the material. In which case, why even bother commenting?
by theahura
12/8/2025 at 5:44:55 PM
Can I offer some valid criticism?
The original Space Jam website is fluid (it's 90s lingo for responsive).
It is also a still-relevant website because it is a living fossil of that era's way of doing web design.
A request to recreate it should include those aspects (epoch-relevant technical achievements such as fluid layouts) and faithfulness to the original implementation.
I'm not saying that Claude should know that out of the box (it would have been impressive if it did), but the prompt should have included those ideas.
A modern reconstruction in CSS3, in contrast to a faithful reproduction, should have mirrored what the techniques accomplished with modern tools. It would be useful in a sense of showcasing how CSS3 evolved, it would have a purpose.
Do you understand why this is not passable? It has no value as a recreation.
by gaigalas
12/8/2025 at 4:18:29 PM
I haven't seen your original comment but "It could work if they did it better" is in general a low value comment.
by jayd16
12/8/2025 at 4:21:19 PM
You should go read it and see if you can tell me a way I could improve it. I felt I gave actionable advice, but I’m always happy to know if I could have said things better.
by johnfn
12/8/2025 at 4:42:27 PM
Looking at the comment, I would argue that it's fairly vague. Maybe it's a "clear if you have done it, but not clear to others" type of thing.
Then you undercut the advice by adding "I've always wondered if <confident suggestion> would work", making it unclear how much of the advice is a shot in the dark and how much you've actually seen results from that advice.
Claims like "you might even one shot it" also make it seem like simple hype and not the war story of someone who's actually taken the advice.
But you know, people are down voting me for engaging with your question as well so I don't know. Maybe it's all bots these days :p
by jayd16
12/8/2025 at 4:49:10 PM
Haha. For what it’s worth, and despite the downvotes, I do appreciate the feedback.
by johnfn
12/8/2025 at 6:58:15 PM
You could improve it by simply doing the thing you describe and linking to it.
by kcatskcolbdi
12/8/2025 at 3:55:31 PM
It is a task that could be _easily_ done manually in much shorter time without AI, probably by developers who even love to develop. The reaction on this shouldn't be misjudged as anti-AI. A lot of people, including me, just do not get it! For scientific purposes? Ok, fair enough. But what is the further meaning of this exercise?
by Aldipower
12/8/2025 at 3:59:39 PM
The point is that if we agree that this task is truly a one shot, as long as you agree it’s faster to prompt than code, then while you “easily” do this task in around an hour (or however long you say it will take you), I’ll prompt Claude in around 5 minutes, and get a few more things done while I let it run in the background. What am I missing from your argument?
by johnfn
12/8/2025 at 4:01:58 PM
Reading the blog post, prompting Claude, setting up Playwright, etc. takes at least one hour, maybe more? I'm not seeing where your 5 minutes is coming from.
by Aldipower
12/8/2025 at 4:07:21 PM
author here -- it took like 5 minutes of actual attention from me? I'm not sure why you are counting reading the blog post or setting up Playwright. I guess I did read the blog post, but I'm not sure that should count. And Claude set up Playwright, not me.
by theahura
12/8/2025 at 4:45:57 PM
[dead]
by dingnuts
12/8/2025 at 4:06:22 PM
“Setting up Playwright” is about two sentences of a prompt to Claude. I know this because I’ve done it many times. The author's prompt in the post is only a few hundred words. (Most of the post is just LLM output.) I’m certain I could type that out in 5 minutes.
Do we really need another post where I time how long it takes to prompt Claude to create the Space Jam website?
by johnfn
12/8/2025 at 4:22:41 PM
It's less than a few hundred words. The full total of what I typed into Claude to get the first version is:
Initial prompt:
> I am giving you:
> 1. A full screenshot of the Space Jam 1996 landing page (screenshot.png)
> 2. A directory of raw image assets extracted from the original site (files/)
> Your job is to recreate the landing page as faithfully as possible, matching the screenshot exactly.
> Use the webapp-testing skill. Take screenshots and compare against the original. <required>You must be pixel perfect.</required>
plan response:
> they should all go to tilework.tech
> exact screenshot dimensions
which is 75 words
by theahura
12/8/2025 at 4:23:54 PM
> But which takes longer: learning all of web dev to code the site, or learning to tell Claude to diff pixels?
As a developer, I naturally prefer the former over the latter, as this becomes general programming knowledge that I can benefit from in later projects.
BTW one of the creators of the Space Jam Website left a comment on the blog post: " Sebastien Derenoncourt 20m
I must say as a person who worked on that original website, I am confused why you needed claude to do so much basic HTML/css we didn't even use tons of complex CSS until much later in time... "
by Aldipower
12/8/2025 at 4:51:11 PM
If we both agree that prompting is net positive on time, I think we’re in agreement. You are trying to move the discussion to some sort of philosophical point about knowledge gain and subjective experience, but that is definitely not what I was going for.
by johnfn
12/8/2025 at 5:41:46 PM
I do not try to move anything, if you read my initial post.
And after I have seen the final result coming from Claude, which is linked now in the blog post, I must say, recreation not completed! A lot of things are missing: proper zoom behavior, window resize, correct center-aligned positioning. So what's the point here anyway?
by Aldipower
12/8/2025 at 8:52:41 PM
The original post that failed to create the Space Jam website had no comments on a lack of "proper zoom behavior" or "window resize". It said that the LLM failed to place the elements on the page. The LLM has now solved that problem.
You are repeatedly trying to change the criteria to be different than what it contextually was. "It's not a perfect solution because as a developer I would not get benefit from this." "It's not a perfect solution because it doesn't handle <thing which was never enumerated as a success criteria>". None of that was a topic in the original blog post.
You can make another blog post, if you like, about how Claude can't handle proper zoom when creating the Space Jam website. My guess is someone will one-shot that too.
by johnfn
12/8/2025 at 5:05:59 PM
When I did it, vanilla Claude (Max5, default Opus 4.5, no special skills loaded, etc) had Playwright up and running for screenshots in minutes (after intervening to tell it to use python 3.13 after noticing python 3.14 seems to be missing a lot of wheels and uv was rebuilding numpy for some reason) without me telling it how to screenshot at all.
https://news.ycombinator.com/threads?id=fluidcruft#46185996
(One fun wrinkle I enjoyed watching was I created the target screenshot in Firefox, and Claude was using playwright with Chrome. Ultimately, I have no idea whether either Firefox or Chrome has the correct actual fonts and I'm not a webdev and don't remember how to figure that all out)
by fluidcruft
12/8/2025 at 4:58:40 PM
For me it's more that I'm not a web developer and it would definitely take me way longer to research all the parts of doing this. I have booksmarts (at best) about basic CSS and have given up trying to keep up with javascript anything.
by fluidcruft
12/8/2025 at 4:20:03 PM
It's seemingly an experiment to see how an LLM performs when the task is just outside of its milieu. The answer is not very well.
by BearOso