My thoughts on this: benchmarking AI gateways properly is harder than it looks. Feature sets differ meaningfully - exact vs semantic caching, cluster mode, guardrails, audit logging - and each feature carries its own latency cost. What actually matters for most users is end-to-end latency including provider overhead (200-2000 ms), and at that scale Bifrost, LiteLLM, and GoModel are all perfectly fine.
I ran some comparisons, but I'm not happy with the methodology and I'd rather not spread misleading numbers. Once I have time to do it properly I'll write it up and share a link here. Honestly, I'd also love to see benchmarks run by someone other than the AI gateway builders themselves. :)
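For anyone who wants to sanity-check gateway overhead themselves, here's a minimal sketch of the measurement side. The commented-out endpoint URL is a placeholder (not GoModel's, Bifrost's, or LiteLLM's actual route), and a real benchmark would also need warmup, concurrency, and far more samples:

```python
import time
import statistics

def measure(call, n=50):
    """Time n invocations of `call` and return latencies in milliseconds."""
    samples = []
    for _ in range(n):
        start = time.perf_counter()
        call()
        samples.append((time.perf_counter() - start) * 1000)
    return samples

def summarize(samples):
    """Return p50/p95/p99 latency (ms) from a list of samples."""
    qs = statistics.quantiles(samples, n=100)  # 99 cut points
    return {"p50": qs[49], "p95": qs[94], "p99": qs[98]}

# Example usage against a hypothetical local gateway (placeholder URL):
# import urllib.request
# latencies = measure(lambda: urllib.request.urlopen(
#     "http://localhost:8080/v1/chat/completions"))
# print(summarize(latencies))
```

Tail percentiles (p95/p99) matter more than averages here, since caching and guardrail paths tend to show up as tail latency rather than shifting the median.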
Where GoModel actually differs today:
- image size: 16.96 MB vs Bifrost's 69.84 MB, which matters for sidecar, edge, and cold-start scenarios.
- per-tenant keys, guardrails, and audit logs are all in the OSS repo - not gated.
- AI interaction visualization that makes debugging individual request/response flows much easier.