1/23/2026 at 9:38:02 AM
Google quietly announced that Programmable Search (ex-Custom Search) won’t allow new engines to “search the entire web” anymore. New engines are capped at searching up to 50 domains, and existing full-web engines have until Jan 1, 2027 to transition. If you actually need whole-web search, Google now points you to an “interest form” for enterprise solutions (Vertex AI Search etc.), with no public pricing and no guarantee they’ll even reply.
This seems like it effectively ends the era of indie / niche search engines being able to build on Google’s index. Anything that looks like general web search is getting pushed behind enterprise gates.
I haven’t seen much discussion about this yet, but for anyone who built a small search product on Programmable Search, this feels like a pretty big shift.
Curious if others here are affected or already planning alternatives.
UPDATE: I logged into Programmable Search and the message is even more explicit: Full web search via the "Search the entire web" feature will be discontinued within the next year. Please update your search engine to specify specific sites to search. With this link: https://support.google.com/programmable-search/answer/123971...
by 01jonny01
1/23/2026 at 3:29:51 PM
I know that DuckDuckGo uses Microsoft Bing Custom Search, and honestly it is a much more robust system since you don't have to worry about Google axing it. https://www.customsearch.ai
by zitterbewegung
1/23/2026 at 4:23:10 PM
Instead you worry about Microsoft axing it? Sure, it might take 3 years instead of 6 months, and the shutdown period would be 1 year instead of 1 month, but neither is a long-term solution.
by embedding-shape
1/23/2026 at 6:15:39 PM
> it might take 3 years instead of 6 months, and the shutdown period would be 1 year instead of 1 month
This matters much more than people (and evidently those within Google) realize
by aylmao
1/23/2026 at 6:32:10 PM
It matters more the shorter your future planning is. Neither works if you're looking forward 3-4 years, for example.
by embedding-shape
1/26/2026 at 12:52:29 AM
Why wouldn’t it work if you plan ahead 3-4 years? Planning ahead doesn’t mean you never suffer from unexpected events, including deprecations. And long timelines help you as much as anyone else.
by aylmao
1/23/2026 at 8:08:30 PM
Having been downsizing my hoard of computer junk, I have several large boxes of full MSDN disc sets. There is a graveyard of MS stuff that is no longer supported. The only thing with MS is they usually seem to give you a better off-ramp than 'oh well, sucks to be you'.
by sumtechguy
1/23/2026 at 6:25:56 PM
Bing Custom Search was discontinued last year. Although DuckDuckGo probably has some kind of special contract with Microsoft.
by thayne
2/5/2026 at 5:27:12 PM
Important Email Update from Google:
Dear Programmable Search Engine user,
Thank you for contacting us via the Web Search Products Interest Form. We have received your feedback and are actively reviewing the specific use cases you shared.
We are writing to share important details regarding the transition plan and the available solutions for your search needs.
1. For Unrestricted Web Search: Future Web Search Service
For partners requiring unrestricted "Search the entire web" capabilities, we are developing a new enterprise-grade Web Search Service. As you evaluate your future needs, please be aware of the commercial terms planned for this new service:
Pricing: USD $15 CPM (Cost Per Mille / 1,000 requests).
Minimum Commitment: A minimum monthly fee of USD $30,000 will apply.
We’ll release more information on this service later in 2026. Existing 'Search the entire web' engines remain functional until January 1, 2027.
2. For AI & Advanced Search: Google Vertex AI
We strongly encourage you to explore Google Vertex AI as another option for partners who need enterprise search and AI capabilities across 50 or fewer domains. Vertex AI offers powerful capabilities for:
Grounded Generation: Connecting your AI agents to your own data and/or to Google Search to provide accurate, up-to-date responses.
Custom Data Search: Building enterprise-grade search engines over your own data and specific websites.
This solution is available now and is designed to scale with your specific application needs.
Clarification on Current Service Status:
While you evaluate which path fits your business needs, please remember the timeline for your current implementations:
Existing Projects: If you have an existing Programmable Search Engine configured to "Search the entire web," your service will continue to function until January 1, 2027. You have the full year to plan your migration.
New Projects: As of January 20, 2026, new engines created in the Programmable Search Engine admin console are restricted to "Site Search Only" (specific domains only).
Later in 2026 we’ll provide you with more updates regarding the new Web Search Service and the means to express your desire to use it to power your web search experiences.
Sincerely,
Programmable Search Engine Team
by 01jonny01
1/23/2026 at 10:10:00 AM
I built my own web search index on bare metal, index now up to 34m docs: https://greppr.org/
People rely too much on other people's infra and services, which can be decommissioned anytime. The Google Graveyard is real.
by saltysalt
1/23/2026 at 10:14:42 AM
Number of docs isn’t the limiting factor. I just searched for “stackoverflow” and the first result was this: https://www.perl.com/tags/stackoverflow/
The actual Stackoverflow site was ranked way down, below some weird twitter accounts.
by orf
1/23/2026 at 10:20:46 AM
I don't weight home pages in any way yet to bump them up; it's just raw search on keyword relevance.
by saltysalt
1/23/2026 at 11:36:56 AM
Google's entire (initial) claim-to-fame was "PageRank", referring both to the ranking of pages and co-founder Larry Page, which strongly prioritised a relevance attribute over raw keyword findings (which then-popular alternatives such as AltaVista, Yahoo, AskJeeves, Lycos, Infoseek, HotBot, etc., relied on, or the rather more notorious paid-rankings schemes in which SERP order was effectively sold). When it was first introduced, Google Web Search was absolutely worlds ahead of any competition. I remember this well, having used those previously and adopted Google quite early (1998/99).
Even with PageRank, result prioritisation is highly subject to gaming. Raw keyword search is far more so (keyword stuffing and other shenanigans), more so as any given search engine begins to become popular and catch the attention of publishers.
Google now applies other additional ordering factors as well. And of course has come to dominate SERP results with paid, advertised, listings, which are all but impossible to discern from "organic" search results.
(I've not used Google Web Search as my primary tool for well over a decade, and probably only run a few searches per month. DDG is my primary, though I'll look at a few others including Kagi and Marginalia, though those rarely.)
<https://en.wikipedia.org/wiki/PageRank>
"The anatomy of a large-scale hypertextual Web search engine" (1998) <http://infolab.stanford.edu/pub/papers/google.pdf> (PDF)
Early (1990s) search engines: <https://en.wikipedia.org/wiki/Search_engine#1990s:_Birth_of_...>.
by dredmorbius
1/23/2026 at 12:01:55 PM
PageRank was an innovative idea in the early days of the Internet when trust was high, but yes, it's absolutely gamed now and I would be surprised if Google still relies on it.
Fair play to them though, it enabled them to build a massive business.
by saltysalt
1/23/2026 at 12:07:14 PM
Anchor text information is arguably a better source for relevance ranking, in my experience. I publish exports of the ones Marginalia is aware of[1] if you want to play with integrating them.
[1] https://downloads.marginalia.nu/exports/ grab 'atags-25-04-20.parquet'
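As a rough illustration of ranking by anchor text, here is a sketch; the URLs and anchor strings below are invented, and real pairs could instead come from the parquet export above (e.g. via pandas.read_parquet):

```python
# Rank pages by the anchor text pointing at them. All rows are invented
# toy data; a real pipeline might load (url, anchor) pairs from the
# parquet export mentioned above.
from collections import defaultdict

anchors = [
    ("https://stackoverflow.com", "stackoverflow"),
    ("https://stackoverflow.com", "stackoverflow q&a"),
    ("https://www.perl.com/tags/stackoverflow/", "stackoverflow tag"),
]

# Inverted index: anchor token -> list of target URLs.
index = defaultdict(list)
for url, text in anchors:
    for token in text.lower().split():
        index[token].append(url)

def score(query):
    """Count query tokens appearing in anchors that point at each URL."""
    hits = defaultdict(int)
    for token in query.lower().split():
        for url in index.get(token, []):
            hits[url] += 1
    return sorted(hits.items(), key=lambda kv: -kv[1])

results = score("stackoverflow")
# stackoverflow.com outranks the perl.com tag page in this toy web,
# because more anchors use that word to describe it.
```

The intuition is the same as the earlier "stackoverflow" example in this thread: the site most of the web describes with a word should outrank a page that merely contains it.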
by marginalia_nu
1/23/2026 at 1:11:50 PM
Though I'd think that you'd want to weight unaffiliated sites' anchor text to a given URL much higher than an affiliated site's.
"Affiliation" is a tricky term itself. Content farms were popular in the aughts (though they seem to have largely subsided), firms such as Claria and Gator. There are chumboxes (Outbrain, Taboola), and of course affiliate links (e.g., to Amazon or other shopping sites). SEO manipulation is its own whole universe.
(I'm sure you know far more about this than I do, I'm mostly talking at other readers, and maybe hoping to glean some more wisdom from you ;-)
by dredmorbius
1/23/2026 at 1:37:12 PM
Oh yeah, there's definitely room for improvement in that general direction. Indexing anchor texts is much better than PageRank, but in isolation, it's not sufficient.
I've also seen some benefit from fingerprinting the network traffic the websites make using a headless browser, to identify which ad networks they load. Very few spam sites have no ads, since there wouldn't be any economy in that.
e.g. https://marginalia-search.com/site/www.salon.com?view=traffi...
The full data set of DOM samples + recorded network traffic is in an enormous sqlite file (400GB+), and I haven't yet worked out any way of distributing the data. Though it's in the back of my mind as something I'd like to solve.
by marginalia_nu
1/23/2026 at 2:56:27 PM
Oh, that is clever!
I'd also suspect that there are networks / links which are more likely signs of low-value content than others. Off the top of my head, crypto, MLM, known scam/fraud sites, and perhaps share links to certain social networks might be negative indicators.
by dredmorbius
1/23/2026 at 3:10:59 PM
You can actually identify clusters of websites based on the cosine similarity of their outbound links. Pretty useful for identifying content farms spanning multiple websites.
Have a lil' data explorer for this: https://explore2.marginalia.nu/
Quite a lot of dead links in the dataset, but it's still useful.
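A sketch of that similarity computation, with invented domains and outbound-link lists:

```python
# Cluster sketch: cosine similarity between sites' outbound-domain bags.
# All domains below are made up for illustration.
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bags of outbound domains."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[d] * cb[d] for d in ca.keys() & cb.keys())
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0

outlinks = {
    "farm-one.example": ["ads.example", "track.example", "shop.example"],
    "farm-two.example": ["ads.example", "track.example", "shop.example"],
    "hobby.example":    ["wikipedia.org", "archive.org"],
}

# The two "farms" share identical outbound sets (similarity 1.0),
# while the hobby site scores 0.0 against either of them.
sim = cosine(outlinks["farm-one.example"], outlinks["farm-two.example"])
```

Sites whose outbound-link vectors are near-identical are good candidates for being the same operation under different names.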
by marginalia_nu
1/23/2026 at 12:12:08 PM
Very interesting, and it is very kind of you to share your data like that. Will review!
by saltysalt
1/23/2026 at 1:23:05 PM
Google’s biggest search signal now is aggregate behavioral data reported from Chrome. That pervasive behavioral surveillance is the main reason Apple has never allowed a native Chrome app on iOS.
It’s also why it is so hard to compete with Google. You guys are talking about techniques for analyzing the corpus of the search index. Google does that and has a direct view into how millions of people interact with it.
by snowwrestler
1/23/2026 at 2:16:24 PM
> That pervasive behavioral surveillance is the main reason Apple has never allowed a native Chrome app on iOS
The Chrome iOS app still knows every url visited, duration, scroll depth, etc.
by xnx
1/23/2026 at 2:28:30 PM
Yes indeed, they have an impossibly deep moat and deeper pockets. I'm certainly not trying to compete with them with my little side project; it's just for fun!
by saltysalt
1/24/2026 at 5:30:30 PM
> That pervasive behavioral surveillance is the main reason Apple has never allowed a native Chrome app on iOS.
There is a native Chrome app on iOS. It gets all the same url visit data as Chrome on other platforms.
Apple blocks 3rd party renderers and JS engines on iOS to protect its App Store from competition that might deliver software and content through other channels that they don't take a cut of.
by danans
1/23/2026 at 10:38:00 AM
Sure, but the point is the results are not relevant at all?
It’s cool though, and really fast.
by orf
1/23/2026 at 10:42:12 AM
I'll work on that adjustment, it's fair feedback, thanks!
by saltysalt
1/23/2026 at 11:18:33 AM
Unfortunately this is the bulk of search engine work. Recursive scraping is easy in comparison, even with CAPTCHA bypassing. You either limit the index to only highly relevant sites (as Marginalia does) or you must work very hard to separate the spam from the ham. And spam in one search may be ham in another.
by direwolf20
1/23/2026 at 11:38:40 AM
I limit it to highly relevant curated seed sites, and don't allow public submissions. I'd rather have a small high-quality index.
You are absolutely right, it is the hardest part!
by saltysalt
1/23/2026 at 11:28:05 AM
What do you mean they're not relevant? The top result you linked contained the word stackoverflow, didn't it? It's showing you exactly what you searched for. Why would you need a search engine at all if you already know the name of the thing? Just type stackoverflow.com into your address bar.
I feel like Google-style "search" has made people really dumb and unable to help themselves.
by globular-toast
1/23/2026 at 11:43:31 AM
the query is just to highlight that relevance is a complex topic. few people would consider "perl blog posts from 2016 that have the stack overflow tag" as the most relevant result for that query.
by orf
1/23/2026 at 12:53:50 PM
Confluence search does this, for our intranet. As a result it's barely usable.
Indexing is a nice compact CS problem; not completely simple for huge datasets like the entire internet, but well-formed. Ranking is the thing that makes a search engine valuable. Especially when faced with people trying to game it with SEO.
by pjc50
1/23/2026 at 12:33:07 PM
This is pretty cool. Don't let the naysayers stop you. Taking a stab at beating Google at their core product is bravery in my book. The best of luck to you!
by tosti
1/23/2026 at 2:32:32 PM
Thank you kindly! It's just for fun.
by saltysalt
1/23/2026 at 4:49:56 PM
> it’s just for fun.
amazing, for real.
everything i’ve read and heard about the good internet is that it was good because sooooo many of the people did stuff for exactly that, fun.
i’ve spent some time reading through some of the old email lists from earlier internet folks; they predicted exactly what we’ve turned this into. reading the resistance against early adoption of cookies, it’s incredible to see how prescient some of those people were. truly incredible.
keep having fun with it, i think it’s our only way out of whatever this thing is we have now.
by toofy
1/23/2026 at 5:43:53 PM
Couldn't agree more! The early pioneers of the Internet were hackers and tinkerers; I've tried to maintain the same ethos.
by saltysalt
1/23/2026 at 6:45:28 PM
That's super cool! Do you have any plans to commercialize it, or is it just a pet project?
by Tenemo
1/23/2026 at 7:00:00 PM
Pet project just for fun, thanks!
by saltysalt
1/23/2026 at 11:35:00 AM
Lol, a GooglePlus URL was mentioned on a webpage I browsed this week. #blastFromThePast
by lolive
1/23/2026 at 2:37:43 PM
I still remember their circles interface ;-)
by saltysalt
1/23/2026 at 10:38:14 AM
I tested it using a local keyword, as I normally do, and it took me to a Wikipedia page I didn’t know existed. So thanks for that.
by johnofthesea
1/23/2026 at 10:43:48 AM
It will throw up weird and interesting results sometimes ;-)
by saltysalt
1/23/2026 at 7:50:36 PM
Thanks for sharing, this is really impressive.
Can you talk a bit about your stack? The about page mentions grep, but I'd assume it's a bit more complex than having a large volume and running grep over it ;)
Is it some sort of custom database or did you keep it simple? Do you also run a crawler?
by bflesch
1/25/2026 at 10:12:32 PM
A huge Lucene index for storage and search, with a custom crawler that I wrote myself. It's a fun engineering problem.
by saltysalt
1/23/2026 at 12:14:28 PM
Unfortunately the index is the easy part. Transforming user input into a series of tokens which get used to rank possible matches and return the top N, based on likely relevance, is the hard part, and I'm afraid this doesn't appear to do an acceptable job with any of the queries I tested.
There's a reason Google became so popular as quickly as it did. It's even harder to compete in this space nowadays, as the volume of junk and SEO spam is many orders of magnitude worse as a percentage of the corpus than it was back then.
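The query-to-tokens-to-ranking pipeline described above can be sketched with plain TF-IDF; the corpus and scoring here are toy illustrations, not how any production engine works:

```python
# Toy sketch of query -> tokens -> ranked matches using TF-IDF,
# the kind of relevance baseline the comment describes.
import math
from collections import Counter

docs = {
    "d1": "rust is a systems programming language",
    "d2": "python is a popular programming language",
    "d3": "cats are popular pets",
}
tokenized = {d: text.split() for d, text in docs.items()}

def idf(term):
    """Inverse document frequency: rarer terms carry more weight."""
    df = sum(1 for toks in tokenized.values() if term in toks)
    return math.log(len(docs) / df) if df else 0.0

def rank(query):
    """Score each doc by summed tf-idf of the query tokens."""
    scores = {}
    for d, toks in tokenized.items():
        tf = Counter(toks)
        scores[d] = sum((tf[t] / len(toks)) * idf(t) for t in query.split())
    return sorted(scores, key=scores.get, reverse=True)

ranked = rank("programming language")
# d1 and d2 outrank d3: both query terms appear in them and not in d3.
```

Even this baseline needs stemming, phrase handling, and spam signals before it resembles a usable engine, which is the commenter's point.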
by jfindley
1/23/2026 at 2:42:40 PM
I am definitely not trying to compete with Google; instead I am offering an old-school "just search" engine with no tracking, personalization filtering, or AI.
It's driven by my own personal nostalgia for the early Internet, and to find interesting hidden corners of the Internet that are becoming increasingly hard to find on Google after you wade through all of the sponsored results and spam in the first few pages...
by saltysalt
1/23/2026 at 3:48:10 PM
There may be a free CS course out there that teaches how to implement a simplified version of Google's PageRank. It's essentially just the recursive idea that a page is important if important pages link to it. The original paper for it is a good read, too. Curiously, it took me forever to find the unaltered version of the paper that includes Appendix A: Advertising and Mixed Motives, explaining how any search engine with an ad-based business model will inherently be biased against the needs of its users [0].
[0] https://www.site.uottawa.ca/~stan/csi5389/readings/google.pd...
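That recursive idea can be sketched as plain power iteration over a toy three-page graph; the graph and damping value below are illustrative only:

```python
# Toy PageRank: a page is important if important pages link to it.
# Graph and damping factor are illustrative, not production values.
def pagerank(links, damping=0.85, iterations=50):
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1 - damping) / n for p in pages}
        for page, outgoing in links.items():
            targets = outgoing or pages   # dangling page: spread rank evenly
            share = damping * rank[page] / len(targets)
            for target in targets:
                new[target] += share
        rank = new
    return rank

graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(graph)
# "c" comes out on top: it is linked from both "a" and "b".
```

Each iteration redistributes every page's rank across its outgoing links, so rank flows toward pages that well-ranked pages point at.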
by prophesi
1/23/2026 at 5:44:29 PM
Nice find, will review!
by saltysalt
1/23/2026 at 12:32:40 PM
The input on the results page doesn't work; you always need to return to the start page, on which the browser history is disabled. That's just confusing behaviour.
by 1718627440
1/23/2026 at 2:31:49 PM
I guess you used the return key instead of clicking on the search icon? Seems to be a bug with the return key; I'll fix that this weekend, sorry.
by saltysalt
1/23/2026 at 2:59:52 PM
True, it didn't occur to me that I should click on the icon instead. Once I have clicked on the search icon once, enter also works.
When I input a short query (single letter) it sometimes just shows a blank page, but maybe that is just HN's hug of death.
Consider putting the query term more prominently at the front of the URL, so users can edit it. Also, from the startpage, the URL in the URL bar isn't updated. As I already wrote, the browser shows completion for the search bar on the result page, but does not for the one on the startpage.
For my taste I would prefer less JS trickery, which would maybe already get rid of some of these issues.
by 1718627440
1/23/2026 at 4:30:31 PM
Appreciate the detailed feedback! A lot of the JS trickery and URL shenanigans I'm doing is to prevent bot spam attempts, which was a real problem in the beginning.
by saltysalt
1/23/2026 at 5:49:07 PM
Sad state the web is in.
Is it intended that the page currently shows a link to the WordPress login?
by 1718627440
1/23/2026 at 5:59:06 PM
It does not use WordPress.
by saltysalt
1/23/2026 at 10:16:47 PM
I'm sorry, I am dumb and visited http://grepper.org/ . Where does your name come from? I guess from grep for the WWW?
by 1718627440
1/25/2026 at 10:06:46 PM
Yes correct, that is where the name comes from.
by saltysalt
1/23/2026 at 10:15:11 AM
I also made something for my own search needs. It's just an SQLite table of domains and places. I have your search engine there also ;-)
https://github.com/rumca-js/Internet-Places-Database
Demo for most important ones https://rumca-js.github.io/search
by renegat0x0
1/23/2026 at 10:17:22 AM
Thank you, will check it out!
by saltysalt
1/23/2026 at 12:41:46 PM
You should consider filtering by input language. Showing the same Wikipedia article in different languages is not helpful when I am searching in English. Also, you may unify entries by URL; it shows the same URL, just with different publish dates, which is interesting and might be useful, but should maybe be behind a toggle, as it is confusing at first.
by 1718627440
1/23/2026 at 2:34:57 PM
Great feedback, agree I need to filter here. Some website localization is very hard to work around, because they will try to geo-locate the IP address of your bot and redirect it accordingly to a given language...
by saltysalt
1/23/2026 at 3:02:25 PM
The issue I was having was with the query "term+wikipedia": it then shows the Wikipedia article in Czech, Hungarian, Russian, some kind of Arabic, and others before finally showing the English version. Then a lot of them also occur 2, 3, 4+ times with the same URL, just differing in crawl time by a few minutes.
by 1718627440
1/23/2026 at 10:13:20 PM
It's a difficult problem to fix. You can set an Accept-Language header on crawl requests, but this only works if the target website uses "Content Negotiation." Some sites ignore headers and determine language based on the IP address (Geo-IP) or the URL structure (e.g., /es/ vs /en/); basically a mess...
by saltysalt
1/23/2026 at 10:23:00 PM
I don't get the problem you claim. You crawl something and get a document in whatever language the site delivers you. You know the language of that document with the lang=... attribute of the document. What results you show for a given language is under your control and not influenced by what the crawled site chose to serve to the crawler.
by 1718627440
1/25/2026 at 10:10:24 PM
I'm working on the language improvements presently, but I need to clean out a lot of bad entries in my index. In essence, what I am trying to say is many servers ignore "Accept-Language", so you have to rely on other means of detecting the language of the page reliably, e.g. inspecting the body content of the response. It's a non-trivial problem online.
by saltysalt
1/25/2026 at 10:20:10 PM
So html lang=... is wrong, or doesn't exist?
> I am trying to say is many servers ignore "Accept-Language"
I wouldn't have expected that to be a hard rule, more like if there are multiple pages to return to have a factor, which one the user most likely wants.
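The fallback chain being discussed here could look something like the following sketch, using a hypothetical page_language helper: trust an explicit Content-Language response header, fall back to the html lang attribute, and only then guess from the body (the body step is a crude placeholder for a real detector such as langdetect):

```python
# Hypothetical page_language helper for a crawler. Header name, regex,
# and the final body heuristic are illustrative assumptions, not a spec.
import re

def page_language(headers, body):
    # 1. Trust an explicit Content-Language response header if present.
    if headers.get("Content-Language"):
        return headers["Content-Language"].split(",")[0].strip().lower()
    # 2. Fall back to the <html lang="..."> attribute.
    m = re.search(r'<html[^>]*\blang=["\']?([A-Za-z-]+)', body, re.I)
    if m:
        return m.group(1).lower()
    # 3. Last resort: guess from the body text (crude stand-in for a
    # real language-detection library).
    return "en" if " the " in body.lower() else "unknown"

lang = page_language({}, '<html lang="cs"><body>Ahoj</body></html>')
# lang == "cs": the attribute wins when no header is present.
```

As the thread notes, step 3 is where the real work is, since both the header and the attribute can be missing or wrong in the wild.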
by 1718627440
1/23/2026 at 3:48:54 PM
This is mad but cool. Keep at it.
by dust-jacket
1/23/2026 at 5:48:06 PM
Thanks, mad is fun for me! It costs me nothing if it fails.
by saltysalt
1/23/2026 at 10:37:23 AM
It's been clear for the last decade that we have to wean ourselves off of centralized search indexes, if only to inoculate the Net against censorship and politically motivated black-holing.
I can only weep at this point, as the heroes that were the Silent and Greatest generations (in the U.S.), who fought hard to pass on as much institutional knowledge as possible through hardcore organization and distribution via public and university libraries, have had that legacy shit on by these ad-obsessed cretins. The entirety of human published understanding, and we make it nigh impossible for all but the most determined to actually avail themselves of it.
by salawat
1/23/2026 at 11:10:19 AM
> “search the entire web”
TIL they allowed that before. It sounds a bit crazy. Like Google is inviting people to repackage Google Search itself and sell it / serve it with their own ads.
by raincole
1/23/2026 at 12:05:18 PM
You know, back in the day, the web used to be more open. Also, just because you CAN do something doesn't mean you HAVE to.
by MrGilbert
1/23/2026 at 12:21:01 PM
It basically means that Google is now transitioning into a private web.
Others have to replace Google. We need access to public information. States cannot allow corporations to hold us hostage here.
by shevy-java
1/23/2026 at 12:27:48 PM
I tried it and contributed to searx. It didn't give the same results as Google, and it also has a 10k request rate limit (per month, I believe). More than that and you'll have to "contact us".
by whs
1/23/2026 at 6:42:35 PM
If its Motion for a Partial Stay is denied, or if it loses on appeal, then under this Final Judgment Google will be forced to offer syndicated "full web" search to Qualified Competitors.
https://dn710204.ca.archive.org/0/items/gov.uscourts.dcd.223...
by 1vuio0pswjnm7
1/23/2026 at 9:52:03 AM
What are some of the niche search engines built on Google's index affected by this?
by throwaway_20357
1/23/2026 at 10:01:07 AM
Kagi
by doublerabbit
1/23/2026 at 10:46:07 AM
> Kagi
This seems to be true, but more indirectly. From Kagi’s blog [0], which is a follow-up to a Kagi blog post from last year [1].
[0]> Google: Google does not offer a public search API. The only available path is an ad-syndication bundle with no changes to result presentation - the model Startpage uses. Ad syndication is a non-starter for Kagi’s ad-free subscription model.[^1]
[0]> The current interim approach (current as of Jan 21, 2026)
[0]> Because direct licensing isn’t available to us on compatible terms, we - like many others - use third-party API providers for SERP-style results (SERP meaning search engine results page). These providers serve major enterprises (according to their websites) including Nvidia, Adobe, Samsung, Stanford, DeepMind, Uber, and the United Nations.
I’m an avid Kagi user, and it seems like Kagi and some other notable interested parties have _already_ been unable to get what they want/need with Google’s index.
[0]> The fact that we - and companies like Stanford, Nvidia, Adobe, and the United Nations - have had to rely on third-party vendors is a symptom of the closed ecosystem, not a preference.
Hopefully someone here can clarify for me, or enumerate some of these “third-party vendors” who seem like they will/might/could be directly affected by this.
[0] https://blog.kagi.com/waiting-dawn-search
[1] https://blog.kagi.com/dawn-new-era-search
[^1]: A note on Google’s existing APIs: Google offers PSE, designed for adding search boxes to websites. It can return web results, but with reduced scope and terms tailored for that narrow use case. More recently, Google offers Grounding with Google Search through Vertex AI, intended for grounding LLM responses. Neither is general-purpose index access. Programmable Search Engine is not designed for building competitive search. Grounding with Google Search is priced at $35 per 1,000 requests - economically unviable for search at scale, and structured as an AI add-on rather than standalone index syndication. These are not the FRAND terms the market needs.
by nemosaltat
1/23/2026 at 11:24:43 AM
I believe they try to indirectly say they are using SerpApi or a similar product that scrapes Google search results. And other big ones use it too, so it must be ok...
That must be the reason why they limit the searches you can do in the starter plan. Every SerpApi call costs money.
by tpetry
1/23/2026 at 12:35:09 PM
Google is also suing SerpApi.
And I can't prove correlation, but they refused to index one of my domains and I think it _might_ be because we had some content on there about how to use SerpApi.
by sixhobbits
1/23/2026 at 10:06:23 AM
They published this the other day: https://blog.kagi.com/waiting-dawn-search
Which saw some discussion on HN.
by marginalia_nu
1/23/2026 at 10:30:18 AM
> some discussion
~450 score, ~247 comments and still on /best ("Most-upvoted stories of the last 48 hours"):
https://news.ycombinator.com/item?id=46708678 - "Waiting for dawn in search: Search index, Google rulings and impact on Kagi"
by embedding-shape
1/23/2026 at 12:00:51 PM
Kagi does not use Google's search index. From their post which made the front page of HN yesterday [1]:
> Google does not offer a public search API. The only available path is an ad-syndication bundle with no changes to result presentation - the model Startpage uses. Ad syndication is a non-starter for Kagi’s ad-free subscription model.
by monooso
1/23/2026 at 12:25:47 PM
They then go on to say that they pay a third-party company to scrape Google results (and serve those scraped results to their users). So their search engine is indeed based on unauthorized and uncompensated use of Google's index.
But since they're not using/paying for a supported API, but just taking what they want, they are indeed unlikely to be impacted by this API turndown.
by jsnell
1/23/2026 at 2:30:28 PM
Congrats on saying that in the most one-sided way possible. Google makes it literally impossible for them to pay for access to search results to make the product they want (customizable subscription search with no ads). Google is also the de-facto globally sanctioned crawler, because they are the only search engine anyone gives a shit about, and sites need to be indexed by them to survive. In short, Google owns the river and sells the boats, and the public built a wall around it. Google is in a monopoly position in search.
by DangitBobby
1/23/2026 at 10:19:49 PM
They have a monopoly on their own search results. There's nothing stopping anyone from making their own (hell, a poster did so in the comments above). God forbid we aren't entitled to access to the fruits of their labor; the reason you want it isn't because you can't make it (again, see above). It's because making it good is hard, and you want the good results without putting in the effort to make it yourself.
by Ferret7446
1/23/2026 at 3:10:24 PM
> In short, Google owns the river and sells the boats, and the public built a wall around it.
That would be a monopoly if there were only one river in the whole world.
by nova22033
1/23/2026 at 6:51:47 PM
Yeah, I mean, think whatever you need to for the metaphor to work.
by DangitBobby
1/23/2026 at 5:45:20 PM
They get results from another provider who has authorized access. Google doesn't provide search results to unauthorized requests, as many on Tor have experienced.
by ipaddr
1/24/2026 at 4:51:59 PM
No. They pay SerpApi to scrape Google. And SerpApi is currently being sued by Google for unauthorized scraping.
Kagi did make comments for years implying that they had a deal with Google for search results, but their latest blog post makes it clear that is not true and was never true.
by jsnell
1/23/2026 at 10:01:33 PM
Residential proxies are also cheaper than you might realize.
by direwolf20
1/23/2026 at 10:18:42 AM
No wonder Kagi is angry.
Google is a monopoly across several broad categories. They're also a taxation enterprise.
Google Search took over as the URL bar for 91% of all web users across all devices.
Since this intercepts trademarks and brand names, Google gets to tax all businesses unfairly.
Tell your legislators in the US and the EU that Google shouldn't be able to sell ads against registered trademarks (+/- some edit distance). They re-engineered the web to be a taxation system for all businesses across all categories.
Searching for Claude -> Ads in first place
Searching for ChatGPT -> Ads in first place
Searching for iPhone -> Ads in first place
This is inexcusable.
Only searches for "ChatGPT versus", "iPhone reviews", or "Nintendo game comparison" should allow ads. And one could argue that the "URL Bar" shouldn't auto suggest these either when a trademark is in the URL bar.
If Google won't play fair, we have to kill 50% of their search revenue for being egregiously evil.
If you own a trademark, Google shouldn't be able to sell ads against you.
--
Google's really bad. Ideally we'd get an antitrust breakup. They're worse than Ma Bell. I wouldn't even split Google into multiple companies by division - I'd force them to be multiple copies of the same exact entity that then have to compete with each other:
Bell Systems -> {BellSouth, Bell Atlantic, Southwestern Bell, ...}
Google -> {GoogleA, GoogleB, GoogleC, ...}
They'd each have cloud, search, browser, and YouTube. But new brand names for new parent companies. That would create all-out war and lead to incredible consumer wins.
by echelon
1/23/2026 at 10:40:27 AM
It could probably be argued that search access is an essential facility [1], though it doesn't appear antitrust law has anywhere near the same sort of enforcement it did in the past.
[1] https://en.wikipedia.org/wiki/Essential_facilities_doctrine
by marginalia_nu
1/23/2026 at 4:20:47 PM
> If you own a trademark, Google shouldn't be able to sell ads against you.
This is frustrating even from a consumer perspective. Before I ran adblock everywhere, I couldn't stand that typing in a specific company I was looking for would just serve ads from any number of related competing brands that I wasn't looking for.
by thewebguyd
1/23/2026 at 12:29:21 PM
What stops Kagi from indexing the internet themselves, and makes them pay some guys to scrape search results from Google? One guy at Marginalia can do it and an entire dev team at a PAID search engine can't?
by throwaway290
1/23/2026 at 3:26:33 PM
I don't know about others, but we have special rules for Google, Bing, and a few others, rate-limiting them less than some random bot.
The problem is scrapers (mostly AI scrapers, from what we can tell). They will pound a site into the ground and not care, and they are becoming increasingly good at hiding their tracks. The only reasonable way to deal with them is to rate-limit every IP by default and then lift some of those restrictions for known, well-behaving bots. Now, we will lift those restrictions if asked, and frequently look at statistics to lift the restrictions from search engines we might have missed, but it's an uphill battle if you're new and unknown.
by mrweasel
1/23/2026 at 2:36:31 PM
As we've seen here on HN during the AI boom, it's not wonderful when a bunch of companies all use bots to scrape the entire web. Many sites only allow Google scrapers in robots.txt, and the public will fight you hard if you scrape them without permission. It's just one of those things where it would be better for everyone if search engines could pay for access to the crawling work that's done only once.
by DangitBobby
1/23/2026 at 2:52:13 PM
> Many sites only allow Google scrapers in robots.txt and the public will fight you hard if you scrape them without permission.
This just lets a monopoly replace the website instead of distributing power and fostering open source. The same monopoly that was already bleeding off the web's utility and taxing it.
by echelon
1/23/2026 at 11:52:30 AM
[dead]
by onetokeoverthe
1/23/2026 at 10:07:09 AM
I think Kagi buys search engine results from SERP vendors, who typically scrape Google’s results and offer an API experience on top of them.
by pell
1/24/2026 at 7:45:50 PM
Damn, I just wrote a note "search is free" in my aggressively-automate-everything-using-llms personal project plan.md. I guess not anymore.
by vagab0nd