4/16/2026 at 8:30:25 PM
So it's basically just OpenRouter with Cloudflare Argo networking? I feel like they could do so much more interesting stuff with their Replicate acquisition. Application-specific RL is getting so good, but there's no good way to deploy these models scalably. Even providers like Fireworks, which claim to let you deploy LoRAs in a scalable way, can't actually do it. For now I literally have to host the base load for my application on a rack of 3090s in my garage, which seems silly, but it saves me $1k a month.
by mips_avatar
4/17/2026 at 7:38:46 AM
Running a rack of 3090s in your garage to avoid provider lock-in/costs is the most Hacker News thing. Out of curiosity, what are you doing for uptime/failover? If you are running production traffic to that garage rack, does your app just degrade gracefully if your home internet drops, or do you have a cloud fallback?
by bryden_cruz
4/17/2026 at 4:59:20 PM
Yeah, the model I'm running locally is just one of several models the app supports, and it falls back to the others if it's not available.
by mips_avatar
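A minimal sketch of that kind of fallback routing, assuming OpenAI-compatible endpoints on every backend; the URLs and model names below are hypothetical placeholders, not the commenter's actual setup:

    import openai

    # Ordered by preference: local garage rack first, hosted fallbacks after.
    # All endpoint URLs and model names are hypothetical placeholders.
    BACKENDS = [
        {"base_url": "http://garage-rack.local:8000/v1", "model": "local-moe-24b"},
        {"base_url": "https://openrouter.ai/api/v1", "model": "some/hosted-model"},
    ]

    def complete(prompt: str) -> str:
        last_err = None
        for backend in BACKENDS:
            try:
                client = openai.OpenAI(base_url=backend["base_url"], api_key="...")
                resp = client.chat.completions.create(
                    model=backend["model"],
                    messages=[{"role": "user", "content": prompt}],
                    timeout=10,  # fail fast so the fallback actually kicks in
                )
                return resp.choices[0].message.content
            except Exception as err:  # connection refused, timeout, 5xx, ...
                last_err = err
        raise RuntimeError("all backends failed") from last_err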
4/17/2026 at 8:27:29 AM
[flagged]
by handfuloflight
4/17/2026 at 3:30:07 AM
Gilfoyle? Is that you?
by jonfromsf
4/17/2026 at 4:24:19 AM
I think these GPUs were actually used for Bitcoin mining before I bought them.
by mips_avatar
4/17/2026 at 4:54:59 PM
It's Anton's grandson!
by menno-dot-ai
4/16/2026 at 10:00:40 PM
Curious: which models are you able to run, and how many 3090s do they require at scale?
by vladgur
4/16/2026 at 10:20:55 PM
4 3090s, with NVLink on each pair. Super fast inference on MoE models around 20-36B.
by mips_avatar
4/17/2026 at 4:00:46 PM
> Super fast inference

How fast is "super fast" exactly, and with what runtime+model+quant specifically? Curious to see how 4x 3090s compare to 1x Pro 6000. You could probably put together 4x 3090s for a fraction of the cost of the Pro 6000, but every time I've seen the in/out tok/s for multi-GPU setups my heart drops a little.
by embedding-shape
4/17/2026 at 4:43:03 PM
I haven't benchmarked against a Pro 6000; it's more that I have 4 3090s and I don't have a Pro 6000.
by mips_avatar
4/17/2026 at 5:00:25 PM
Yes, that's why I'm asking what exactly 4 3090s get you in prompt processing and generation; sorry if I was unclear.
by embedding-shape
4/17/2026 at 7:29:25 PM
Maxes out around 4K tok/s output. Each pair of 3090s runs its own instance of the model, with parallelism across the NVLink bridge, though NVLink is only about 2x the bandwidth of PCIe 5.0.
by mips_avatar
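For reference, standing up one such paired instance with vLLM's offline API looks roughly like this; the model name and GPU pinning are assumptions for illustration, not the commenter's actual config:

    import os

    # Pin this instance to one NVLinked pair before CUDA initializes;
    # a second instance on the other pair would use "2,3".
    os.environ.setdefault("CUDA_VISIBLE_DEVICES", "0,1")

    from vllm import LLM, SamplingParams

    # Model name is a hypothetical placeholder for a ~20-36B MoE checkpoint.
    llm = LLM(
        model="some-org/moe-24b-instruct",
        tensor_parallel_size=2,        # shard weights across the two 3090s
        gpu_memory_utilization=0.90,
    )

    out = llm.generate(["Hello"], SamplingParams(max_tokens=64))
    print(out[0].outputs[0].text)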
4/17/2026 at 11:39:21 AM
The interesting part is that you can use the same API with Workers AI models (hosted at the edge) and proxied models (OpenRouter-style).

Disclaimer: I work at Cloudflare, but not on this.
by ascorbic
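A rough sketch of what "same API" means in practice, using an OpenAI-compatible client; the base URL and both model identifiers are hypothetical placeholders, not Cloudflare's documented endpoint:

    import openai

    # Hypothetical gateway endpoint: an edge-hosted model and a proxied
    # third-party model are both addressed through the same client.
    client = openai.OpenAI(
        base_url="https://gateway.example.com/v1",  # placeholder URL
        api_key="...",
    )

    for model in ["@cf/some-edge-model", "some-provider/proxied-model"]:  # placeholders
        resp = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": "ping"}],
        )
        print(model, "->", resp.choices[0].message.content)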
4/17/2026 at 11:11:47 PM
It's the same problem as Fireworks: the only models supporting LoRA are year-old dense models that perform horribly on most tasks. If you want to do anything close to relevant, you still need to rent/own dedicated GPUs, which seems insane to me when vLLM fully supports dynamic LoRA loading.
by mips_avatar
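The vLLM LoRA support being described looks roughly like this with the offline API; the base model name and adapter path are hypothetical placeholders:

    from vllm import LLM, SamplingParams
    from vllm.lora.request import LoRARequest

    # Base model and adapter path are hypothetical placeholders.
    llm = LLM(model="some-org/base-model-7b", enable_lora=True)

    # Each request can point at a different adapter; vLLM loads and
    # batches them dynamically instead of needing a dedicated deployment
    # per fine-tune.
    out = llm.generate(
        ["Summarize this ticket..."],
        SamplingParams(max_tokens=128),
        lora_request=LoRARequest("my-app-adapter", 1, "/path/to/adapter"),
    )
    print(out[0].outputs[0].text)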