3/29/2026 at 7:01:38 PM
AI companies and notably AI scrapers are a cancer that is destroying what's left of the WWW.
I was hit with a pretty substantial botnet "distributed scraping" attack yesterday.
- About 400,000 different IP addresses over about 3 hours
- Mostly residential IP addresses
- Valid and unique user agents and referrers
- Each IP address would make only a few requests with a long delay in between requests
It would hit the server hard until the server became slow to respond, then it would back off for about 30 seconds, then hit hard again. I was able to block most of the requests with a combination of user agent and referrer patterns, though some legit users may be blocked.
The attack was annoying, but the even bigger problem is that the data on this website is under license - we have to pay for it, and it's not cheap. We are able to pay for it (barely) with advertising revenue and some subscriptions.
If everyone is getting this data from their "agent" and scrapers, that means no advertising revenue, and soon enough no more website to scrape, jobs lost, nowhere for scrapers to scrape for the data, nowhere for legit users to get the data for free, etc.
by lm411
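A minimal sketch of the kind of user agent plus referrer filtering described above. The comment does not give the actual patterns, so the two patterns here are hypothetical placeholders, and requiring both to match (rather than either one) is an assumption made to limit how many legitimate users get caught.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    user_agent: str
    referrer: str


# Hypothetical patterns, for illustration only. The real ones are not given.
BLOCKED_UA_PATTERNS = [re.compile(r"ExampleBrowser/1\.0")]
BLOCKED_REFERRER_PATTERNS = [re.compile(r"^https?://referrer\.example/")]


def is_blocked(req: Request) -> bool:
    """Block only when BOTH the user agent and the referrer match a pattern."""
    ua_hit = any(p.search(req.user_agent) for p in BLOCKED_UA_PATTERNS)
    ref_hit = any(p.search(req.referrer) for p in BLOCKED_REFERRER_PATTERNS)
    return ua_hit and ref_hit
```

Any real visitor whose browser and referrer happen to match both patterns is blocked as well, which is the false-positive risk mentioned in the comment.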
3/29/2026 at 7:24:50 PM
Thanks for sharing the perspective here. I think a lot of folks on HN have rightly said that a lot of the problems with the modern internet are due to the ad-supported business model. I don't think you were ever going to move away from it voluntarily -- too many people support it, even if they grumble about it.
But maybe (and likely for worse) LLMs will finally kill this model.
by everdrive
3/29/2026 at 8:15:55 PM
I would love for the ad-supported model to die. I hate ads, and I hate having to serve ads. We get some subscription users but nowhere near enough to cover costs.
Unfortunately, what I think will happen - and indeed already is - is that the AI companies themselves will replace much of the WWW. Sites like the one I am talking about will cease to exist. AI companies, once they can no longer scrape (steal) the data, will end up licensing the data themselves and replace us as the distributor to end users. Perhaps as a subscription add-on or also with an ad-based model.
Which to some may be fine. Personally, I don't want a few centralized AI companies replacing the hundreds of thousands of independent websites online. Way too much centralized power there.
by lm411
3/30/2026 at 2:29:11 AM
Evidently, users and customers like not having to sift through hundreds or thousands of independent websites.
by lotsofpulp
3/30/2026 at 2:55:27 AM
I much prefer having my thoughts distilled down into easily digestible and agreeable idioms that I can push around with absolute faith that they weren't just lies written by some PERSON on the internet.
by TheScaryOne
3/30/2026 at 4:56:07 AM
Absolutely.
It's so much easier to know the truth when someone else tells me what it is and what to think about it.
How refreshing.
by lm411
3/30/2026 at 8:53:52 AM
> I hate ads, and I hate having to serve ads. We get some subscription users but nowhere near enough to cover costs.
I hate ads and I hate having to use an ad blocker to be able to not go crazy in order to use the Internet.
You merely hate "having" to serve ads because it denies you profit from the people you're exploiting with those ads.
Why is your business more deserving to exist on the Internet than my usage??
by tripzilch
3/29/2026 at 11:52:18 PM
Ad-free premium has shown itself again and again to devolve into ad protection rackets.
The minute the internet dies for good, the chat bots will run half-locally and request payments to stop recommending VRAM enlarging pills.
by avadodin
3/29/2026 at 7:30:23 PM
Do you not run Anubis or have strict fail2ban rules? I just straight up ban IPs forever if they look up files that will never exist on my servers. That plus Anubis with the strictest settings.
by shimman
3/29/2026 at 8:05:54 PM
Fail2ban doesn't scale well to these volumes of traffic and request patterns.
Just like fail2ban is not very useful against a DDoS attack where each unique IP only makes a few requests with a large (hour+) delay in between requests. There is no clear "fail" in these requests, and the fail2ban database becomes huge and far too slow.
- 400,000 unique IP addresses
- 1 to 3 requests per hour per IP address, with delays of over 60 minutes between each request
- Legit request URLs, legit UA & referrer
Maybe Anubis would help, but it's also a risk for various reasons.
by lm411
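To illustrate why per-IP thresholds (the general idea behind fail2ban-style bans) have little to grab onto with this traffic shape, here is a rough simulation sketch. It is not fail2ban's implementation. The limit of 10 requests per 10 minutes, the scaled-down IP count, and the names `PerIpRateLimiter` and `simulate` are all assumptions chosen for illustration.

```python
from collections import defaultdict, deque


class PerIpRateLimiter:
    """Sliding-window counter: allow at most max_requests per window per IP."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of allowed requests

    def allow(self, ip: str, now: float) -> bool:
        hits = self._hits[ip]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


def simulate(num_ips, requests_per_ip, gap_seconds, limiter):
    """Each IP sends requests_per_ip requests spaced gap_seconds apart."""
    total = blocked = 0
    for i in range(num_ips):
        ip = f"ip-{i}"  # placeholder identifier, not a real address
        for k in range(requests_per_ip):
            total += 1
            if not limiter.allow(ip, k * gap_seconds):
                blocked += 1
    return total, blocked


if __name__ == "__main__":
    # Scaled down from the 400,000 IPs in the thread to keep the run quick.
    limiter = PerIpRateLimiter(max_requests=10, window_seconds=600)
    total, blocked = simulate(4000, 3, 3600, limiter)
    print(f"{total} requests, {blocked} blocked")
```

With requests from each address an hour apart, no single IP ever comes close to the assumed threshold, so nothing is blocked. The same limiter does catch one address sending a burst, which is the pattern such rules are built for.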
3/30/2026 at 3:03:47 AM
The more sophisticated bots run real headless browsers that Anubis can't touch, and they only follow links that are actually visible on the page, so they wouldn't hit fail2ban.
They even sell access to proxy servers that successfully evade Cloudflare captchas automatically.
by ranger_danger
3/29/2026 at 8:45:08 PM
If you don't mind me asking, what sort of data are you licensing? I noticed that you explicitly don't mention it.
by ctoth
3/30/2026 at 5:15:27 AM
And self-ddos via HN advertising (a la slashdotted?:)
by x______________
3/29/2026 at 7:43:43 PM
At some point there needs to be a check if it's a real human... But it's a cat and mouse game - any way we create to keep bots off gets a workaround by clever engineers.
by afinlayson
3/30/2026 at 3:06:06 AM
What makes a real human?
by ranger_danger
3/30/2026 at 9:14:23 AM
CC payment. This is the ultimate test.
by AugSun
3/30/2026 at 1:03:54 PM
Hard disagree, it's very easy for a bot to use a credit card. And not only are card numbers often stolen, they're even given to teenagers these days, and can also be owned by businesses and exist entirely virtually... so I don't think you can assume the use of a credit card can always be tied to legitimate use by a single person.
by ranger_danger
3/30/2026 at 8:51:48 PM
Companies would offer all-you-can-DDoS plans at $20/bot per month if they could. Bots are only a problem to them because they prevent legitimate customers from handing over their credit card.
by rchaud
3/29/2026 at 7:24:31 PM
Don’t worry, man, once AGI is here you’ll get your allowance (or whatever the hyperscalers' plan is).
by wiseowise
3/29/2026 at 7:53:45 PM
You’ll enjoy painting or some other art even if you aren’t interested in the arts. That’s what I’ve seen written about it.
by righthand
3/30/2026 at 12:56:53 AM
Or they'll let you starve to death which is way way easier and way faster for "them"
by collingreen
3/30/2026 at 2:58:55 AM
Well yeah, art will be valueless in a flooded market. Starving is implied, “starving artist”.
by righthand
3/29/2026 at 11:12:02 PM
Unfortunately nobody cares about destroying the internet if it gets them a Lambo.
Greed and ignorance have taken over the tech industry.
by PearlRiver