My thoughts on this: benchmarking AI gateways properly is harder than it looks. Feature sets differ meaningfully - exact vs semantic caching, cluster mode, guardrails, audit logging - and each feature carries its own latency cost. What actually matters for most users is end-to-end latency including provider overhead (200-2000 ms), and at that scale Bifrost, LiteLLM, and GoModel are all perfectly fine.
I ran some comparisons, but I'm not happy with the methodology and I'd rather not spread misleading numbers. Once I have time to do it properly I'll write it up and share a link here. Honestly, I'd also love to see benchmarks run by someone other than the AI gateway builders themselves. :)
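For anyone who wants to sanity-check gateway overhead themselves, here's a minimal sketch of the measurement side. The commented-out endpoint URL is a placeholder (not GoModel's, Bifrost's, or LiteLLM's actual route), and a real benchmark would also need warmup, concurrency, and far more samples:

```python
import time
import statistics

def measure(call, n=50):
    """Time n invocations of `call` and return latencies in milliseconds."""
    samples = []
    for _ in range(n):
        start = time.perf_counter()
        call()
        samples.append((time.perf_counter() - start) * 1000)
    return samples

def summarize(samples):
    """Return p50/p95/p99 latency (ms) from a list of samples."""
    qs = statistics.quantiles(samples, n=100)  # 99 cut points
    return {"p50": qs[49], "p95": qs[94], "p99": qs[98]}

# Example usage against a hypothetical local gateway (placeholder URL):
# import urllib.request
# latencies = measure(lambda: urllib.request.urlopen(
#     "http://localhost:8080/v1/chat/completions"))
# print(summarize(latencies))
```

Tail percentiles (p95/p99) matter more than averages here, since caching and guardrail paths tend to show up as tail latency rather than shifting the median.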
Where GoModel actually differs today:
- image size: 16.96 MB vs Bifrost's 69.84 MB, which matters for sidecar, edge, and cold-start scenarios.
- per-tenant keys, guardrails, and audit logs are all in the OSS repo - not gated.
- AI interaction visualization that makes debugging individual request/response flows much easier.