alt.hn

3/31/2026 at 4:25:43 PM

Show HN: PhAIL – Real-robot benchmark for AI models

https://phail.ai

by vertix

3/31/2026 at 6:12:22 PM

This is absolutely awesome. Thanks for sharing! I would love to chat more with you. For context: we make a remote teleoperation solution for robotics. It's mostly used for mobile robots, but we've been getting a lot of inquiries regarding teleoperation for manipulation, so I've been learning more about this, in particular regarding the question of speed. I really appreciate these results!

by chfritz

3/31/2026 at 6:18:32 PM

Feel free to reach out to me at hi at phail dot ai

by vertix

3/31/2026 at 7:25:15 PM

This is amazing. Loved watching the videos with real-world attempts.

Finally a real benchmark vs polished teleoperated Twitter videos. Shows the real state of a super important industry, and there’s a lot of work to do.

by apetrovicheva

3/31/2026 at 7:35:05 PM

[dead]

by vertix

3/31/2026 at 5:26:00 PM

I'm a big fan of benchmarks and now finally we have one to evaluate models on physical tasks. Will be interesting to see how fast this gap will narrow.

by vladimir_gor

3/31/2026 at 4:49:06 PM

If I understand correctly, this is about benchmarking robot models. Do you have a robot to do the benchmarking or is it all simulation?

by akshaisarathy

3/31/2026 at 4:52:15 PM

All real hardware, no simulation. Franka FR3 arm with a Robotiq gripper, physical totes, real objects. Every run is recorded with synced video and telemetry (you can watch any episode on the site).

That's the whole point – simulation benchmarks exist, but operators deploying robots care about real-world performance.

by vertix

3/31/2026 at 4:35:38 PM

I'm curious! What other models are you planning to add to the leaderboard?

by anna_pozniak

3/31/2026 at 4:39:48 PM

We're working on adding DreamZero (NVIDIA's latest) next. The leaderboard is open to any model – both open-source and closed-source. If you have a checkpoint, we'll run it on the same hardware under the same blind protocol. Closed-source participants can submit their model as a container and we evaluate it without accessing the weights. Reach out at hi@phail.ai if you want to submit.

by vertix

4/1/2026 at 3:40:02 AM

[dead]

by agenexus