5/21/2026 at 4:24:55 PM
There’s a fallacy that gets used a whole lot to justify things like this (not just with LLMs), and I see it in many of the comments here: If it’s OK (or at least negligible) on a small scale, then it must be OK on a large scale.
It usually goes something like: If I can make money by learning something from a web page, why does a computer making money by learning everything from everyone upset people so? It’s the same thing!
It’s like if I go to Golden Gate Park and pick one flower, I shouldn’t do that, but no one cares. But if I build a machine to automatically cut every flower in the park because I want to sell them, that’s different.
“You say I can pick one flower, but you get upset when I take a bunch. That’s inconsistent. Check and mate.”
But quantitative changes in an activity produce qualitative changes. Everyone knows this, but sometimes they seem to find it inconvenient to admit it. Not that effects of the qualitative change are always bad, but they are often different, and worth considering rather than dismissing.
by danorama
5/21/2026 at 6:20:25 PM
> It’s like if I go to Golden Gate Park and pick one flower, I shouldn’t do that, but no one cares. But if I build a machine to automatically cut every flower in the park because I want to sell them, that’s different.
It's not like that, because flowers are a physical object and moving them to one place deprives their original location of the flowers. When an LLM learns something from a webpage, the webpage is still there. Whatever 'theft' you perceive is entirely in your head; you were deprived of nothing by someone else making a copy of your thing.
by soerxpso
5/21/2026 at 6:30:10 PM
This is not true. The copy is a devaluation of the original, so even though the web page is still there, its value has decreased.
by LunicLynx
5/21/2026 at 7:24:47 PM
"It's not like that"
That's not the point. The point is that scale matters, and that was the only point.
by jerf
5/21/2026 at 6:36:21 PM
It's not like that, because flowers are a physical object and moving them to one place deprives their original location of the flowers. When an LLM learns something from a webpage, the webpage is still there. Whatever 'theft' I perceive is entirely in my head; I was deprived of nothing by someone else making a copy of my thing.
by abustamam
5/21/2026 at 7:48:06 PM
I get that the intention here is to plagiarize and thus cause the parent to feel the harm of it and realize the error of their ways, but I don't think it works. Plagiarism's harm to the plagiaree (?) is that it robs them of credit and payment, but nobody is viewing your reply in isolation from the parent's attribution, and the parent wasn't expecting to make money off of an HN comment. The harm to the rest of society where you gain false esteem for another's work is also not carried out in this instance. The harm to the plagiarizer where they fail to learn because they copied instead is likewise absent. If someone were to feel harm just from a copy of their words existing, they wouldn't need you to do it: Google has hastily indexed this along with every other HN comment and we all know that this whole thread will make its way into LLM training sets eventually.
by ToValueFunfetti
5/21/2026 at 7:15:42 PM
> Whatever 'theft' you perceive is entirely in your head
Rather, it appears to be in your head, since the person you’re replying to has not mentioned or even hinted at theft. The problem with taking all flowers from a public park for your own profit is multifaceted. Among other things, you’re depriving everyone else of the chance to enjoy them, but also degrading the image of the park and harming all the insects which depend on those flowers and the birds who depend on those insects, which in turn degrades the park further, which stops people from enjoying it and going there and caring for it. It’s not about a single physical object, it’s about the ripple effect the selfish action produces.
by latexr
5/21/2026 at 6:34:59 PM
When the LLM presents what it learned as its own thoughts without any attribution, that's the theft.
And you understand that. You're not stupid. This is the thing: AI is convenient for corporations, so you'll make dishonest arguments to justify your unethical behavior. Maybe you even believe what you say, but that's because people will hold on to any flimsy thing that lets them feel like they're good people, not because the reasoning actually makes any sense.
This is why people talking about AI get booed at speeches. There's no conversation to be had: you're not interested in the truth, or what's right, or what's good for anyone but yourself.
by kerkeslager
5/21/2026 at 4:45:21 PM
We ran into a lot of stuff like this in the early days of the web. For example, there was a lot of information that was "public" in that anyone could go to the city courthouse and ask to see the documents. But it changed in nature when you could suddenly look up anyone in the country by typing their name in your browser.
by svachalek
5/21/2026 at 6:44:01 PM
I am not quite sure why my address history, known aliases, and sometimes phone number are publicly available to anyone who Googles my name, and I'm not sure how to opt out of this.
by abustamam
5/21/2026 at 7:20:43 PM
> I'm not sure how to opt out of this.
If you’re an EU citizen, do a web search for “right to be forgotten”.
by latexr
5/21/2026 at 5:10:11 PM
For a practical example of that, a lot of documents used to have things like social security numbers, and they started stripping that information off once it was visible online.
by nitwit005
5/21/2026 at 5:26:35 PM
we used to ship mass lists of addresses and phone numbers to people in each town and it was fine/appreciated.
by mswphd
5/21/2026 at 5:31:06 PM
You could also easily opt out with the single entity that shipped that information.
by disposition2
5/21/2026 at 5:41:53 PM
Yes, but getting an unlisted number was considered weird and against the norm even if possible. Even in the early 2000s when I dropped my landline, my parents were aghast - "if you do that, you won't be in the phone book! How will anyone get in contact with you?"
by jhbadger
5/21/2026 at 6:14:58 PM
And you often had to pay for the privilege... A dollar a month for them to not put your name and number in the phonebook.
by bigbuppo
5/21/2026 at 5:50:55 PM
Yeah and back then it wasn't used as a sort of UUID to track every single thing you do in your life... Different times
by pera
5/21/2026 at 5:29:14 PM
You ever had a bump in the night my guy?
Or a stalker?
by _doctor_love
5/21/2026 at 6:14:51 PM
> It’s like if I go to Golden Gate Park and pick one flower, I shouldn’t do that, but no one cares. But if I build a machine to automatically cut every flower in the park because I want to sell them, that’s different.
The problem here is that, in your example, the small-scale case and the large-scale case are both unacceptable behavior.
Learning from others at a small scale is not only socially acceptable, but is the foundation of how advancement works.
So scale isn't, at its core, the problem; it's that something that is desired behavior in a human is not socially acceptable because a machine is doing it.
by Meph504
5/21/2026 at 6:28:02 PM
Wasn’t his point about plagiarism? That is also not ok on a small scale.
by LunicLynx
5/21/2026 at 6:39:57 PM
I was trying to stick to the example, but I agree that getting away with something doesn't determine if it is right or wrong. And the whole concept of that makes for shaky ground for any form of legal or ethical argument.
by Meph504
5/21/2026 at 6:35:31 PM
I think the difference here is that you guys are talking ethics. And in fact what we're talking about is enforcement. While it's unethical to pick one flower (in its purest form, robbing the commons of the beauty of a flower), it won't be enforced.
by boringg
5/21/2026 at 6:41:45 PM
Fair. AI might also not be the problem, but how it is utilized.
Suddenly everyone and their grandma is a specialist at everything and the actual value of understanding is not appreciated anymore.
by LunicLynx
5/21/2026 at 6:55:06 PM
> But quantitative changes in an activity produce qualitative changes.
Interesting take. I think a corollary is that the qualitative changes are in the economics of things. And more than the scale, it is the value of those economic effects that determines how "accepted" that activity becomes.
Take Uber as an example; it basically enabled mass avoidance of taxi regulations, and naturally existing taxi drivers and lawmakers cried foul. But enough people found value in the service and kept using it that gradually and inexorably society and laws adjusted to it.
On the other hand, copyright infringement is an interesting case. While pretty much everyone and their dog pirates content to some extent, the % of people who think it's acceptable to do so is surprisingly small (22% apparently, up from only 14% in 2019). Furthermore the media industry, especially including ads, is a significant % of US GDP. I think those reasons, more than any RIAA/MPAA lobbying, are why copyright laws have remained as stringent as they have.
As such at a social level, I don't think these effects were dismissed, rather they were considered and formally internalized.
I suspect the same thing is happening with AI companies. They get away with devouring and training on the sum of human knowledge largely because existing laws are insufficient to stop them. So stopping this would require new laws but... well, given the early economic impact LLM technology is having my hunch is new laws will be brought in to protect it rather than restrain it.
by keeda
5/21/2026 at 4:41:32 PM
> quantitative changes in an activity produce qualitative changes
Well said!
by kogus
5/21/2026 at 4:59:31 PM
It reminds me of a Stalin* quote: "Quantity has a quality all its own."
* Note that it may be misattributed to him
by tyleo
5/21/2026 at 5:37:20 PM
Yes absolutely, when automation increases the rate of something by many orders of magnitude, that often is a qualitative difference.
It's weird to me how often on HN of all places I see arguments that can be refuted with "scale matters". I commonly see arguments on all sorts of topics that make the same mistake you're calling out.
by rurp
5/21/2026 at 4:30:12 PM
If one person is murdered, that's bad. If a million people are murdered, that's war.
If one word is stolen by AI, that's bad. If a million words are stolen by AI, that's business.
by inetknght
5/21/2026 at 4:45:41 PM
this made me oof. well said.
by bogrollben
5/21/2026 at 5:33:33 PM
>If one word is stolen by AI, that's bad. If a million words are stolen by AI, that's business.
Where are all the instances of "one word" being "stolen by AI", and people getting mad over it?
by gruez
5/21/2026 at 7:09:01 PM
Honestly that's what's wrong with capitalism and property rights. We can understand what it means to own a thing like a piece of furniture, or a house, and "a person's home is their castle" rings true. But scale that up to individuals controlling resources that affect a neighborhood, a city, a country, or the world -- at each step their army of voters supports their right to own 800 billion dollars or whatever, same as they own their own houses -- it's only fair! And if they want to build a starbase and launch some rockets near your house and sensitive ecology they're just exercising the same rights you or I have, and an attack on their ability to inflict damage on the community is an attack on all.
[edit] and the same goes for corporations owning "means of production". It's not the same as owning an iPhone.
by hughw
5/21/2026 at 6:50:42 PM
Of course it's robbery. I don't think anyone is truly arguing it's not. The issue is that, if we don't do it, China will. Game over.
I'm surprised I haven't seen more economists exploring this topic; it's a fascinating phenomenon. I've seen folks try to revisit history and compare what's happening with AI to some historic event, but we've never seen anything quite like it. As much as history repeats itself, at the forefront of innovation it doesn't.
I suspect that there will one day be an AI tax as society tries to reclaim the value of the theft; maybe even UBI of some form. Until then, buy the stocks and ride the theft wave. The economists are certainly exploring the K-shaped economy, and this is why.
by waynesonfire
5/21/2026 at 7:14:57 PM
This argument of "if we don't do it, someone else will" to justify theft is so tiring. The companies doing the stealing are collectively the same ones that have power to prevent it, if they were incentivised to do so.
by proofofcontempt
5/21/2026 at 5:15:31 PM
ugh. yeah. the tragedy of the commons
by nate
5/21/2026 at 6:23:16 PM
It's funny, the way that term gets used now is actually a wild distortion of the true history.
"The commons" was an incredibly successful system, and medieval (and prior) villages used it to great success, for the entire village's benefit! "Commons" are a great thing for everyone to have!
The real history is that as advances in technology (like the Industrial Revolution) changed things, certain rich villagers were suddenly able to manage more animals than they could before. Those (specific/rich) people over-used the commons, creating the "tragedy" we all know of.
The real lesson of history is not that commons fail: to the contrary, they worked great and helped everyone for centuries! The real lesson is "watch the fuck out for the new rich (especially when they just became rich because of recent technology advancements): those bastards will steal from everyone for their own benefit!"
by hungryhobbit
5/21/2026 at 5:49:51 PM
In general tech has sat in the opposite paradigm: identify when doing something at a small scale is bad, but at a large scale is not
unauthorized plagiarism on the individual level is bad, at the medium scale is ick, but at the ultragigantic scale is meh.
laundering through an llm takes away the real moral ick from the plagiarism - the lying and building of ego by the person reboxing somebody else's ideas and work.
by 8note
5/21/2026 at 5:52:11 PM
>> the lying and building of ego by the person reboxing somebody else's ideas and work.
Instead the bot lies to people who use its output to boost their ego. Not sure it's really changing the moral calculus here.
by wffurr
5/21/2026 at 4:55:48 PM
My complaint with your argument is that the word learn means one thing when we are talking about a person learning something from a webpage or book and something completely different when a webpage or book is used to adjust some weights in a matrix. Calling that learning is a distraction from the real copyright violations going on.
by wang_li
5/21/2026 at 5:51:08 PM
>when we are talking about a person learning something from a webpage or book and something completely different when a webpage or book is used to adjust some weights in a matrix
What material differences exist between the two besides "humans good, computers bad"?
>Calling that learning is a distraction from the real copyright violations going on.
Most courts so far have ruled that it counts as fair use.
by gruez
5/21/2026 at 4:44:04 PM
This is a great point. I think for coding, the wording of the MIT open source license makes it clear that copying and distributing the software is authorised on a small scale and it's very clear that the act of copying must involve a person.
It provides distribution and modification rights to "any person obtaining a copy of the software" and explicitly requires attribution for any significant parts.
Mass-ingesting the code with a script without any human even reading the license is a very different kind of copying mechanism and there is no person involved... The contract was bypassed completely. A contract requires consent from both parties to be binding. When ingesting code into the AI training set, nobody even read the license. There was no agreement; neither explicit nor implicit... Because the consumer, a script, never read the contract for that specific project.
There was nobody present when the copying occurred; on neither side! It cannot possibly constitute an agreement between two parties.
by jongjong
5/21/2026 at 6:09:22 PM
> I think for coding, the wording of the MIT open source license makes it clear that copying and distributing the software is authorised on a small scale and it's very clear that the act of copying must involve a person.
I agree with “must involve a person”. https://opensource.org/license/mit starts with (emphasis added) “Permission is hereby granted, free of charge, to any PERSON obtaining a copy of this software and associated documentation files (the “Software”)”.
That means it doesn’t give an LLM any rights. The way I see it, LLMs run (directly or indirectly) by a person can do stuff on their behalf, though, just as your CI pipeline can download and compile MIT-licensed software.
I definitely disagree with the “on a small scale” as the license continues (again, emphasis added) “to deal in the Software WITHOUT RESTRICTION, including WITHOUT LIMITATION the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software”.
by Someone
5/21/2026 at 4:48:49 PM
That's like saying you're not allowed to load the source code into an editor, because it's not a person. Or that you're not allowed to run a global search-replace on the entire code base, because it's a script and not a person.
by quantummagic
5/21/2026 at 5:02:25 PM
But in this case, a human has awareness of what software they are copying or modifying and that's how the original software author receives credit. The contract requires some degree of human awareness to be valid. This is the critical difference.
by jongjong
5/21/2026 at 5:11:03 PM
Sorry, that's nonsense. There's human awareness when ingesting MIT code into an LLM too. In both cases it's a human that says
$ execute-global-replace
or
$ ingest-into-llm
Both operations require some degree of human awareness. What you appear to be saying is, a human can only use a limited algorithm to access this source code, not a sophisticated one. And where do you draw that line? Who should get to say what is too sophisticated?
Error: your algorithm is too sophisticated to proceed, please provide more human awareness, it's a critical difference.
by quantummagic
5/21/2026 at 4:51:51 PM
This would be an extremely novel mechanism of copyright litigation and I doubt it would fly in an American court with its emphasis on highly individualized legal rights and obligations. And, if it did get accepted by the courts, that's halfway to an even crazier argument: that the MIT license only allows individual distribution to known parties; i.e. no hosting the code on a website or seeding it on BitTorrent, because that's not "small scale" and doesn't "involve a person".
by kmeisthax
5/21/2026 at 5:07:36 PM
You can only seed it on BitTorrent if it comes with the license which identifies the original author and acknowledges their copyrights over the code. Also there is definitely an assumption that a human will read the license or at least implicitly consent to the terms before using or modifying the software. When ingested by AI, the author gets zero credit and no consent has taken place between any sentient being on either side of the contract... Or at least none that are legally acknowledged as sentient or having legal rights.
by jongjong
5/21/2026 at 6:18:07 PM
And the thing is, you point out the easy out on this for similarly licensed code... a giant list of authors and contributors that may have code included in the generated output. It's a win/win for everyone. The original authors get their acknowledgement, and the AI company gets to bill the users of AI for all the tokens for that multi-gigabyte copyright disclosure file.
by bigbuppo
5/21/2026 at 6:09:14 PM
data brokers lean into this too... you can go to the city hall and get someone's public information pretty easily, that does not mean you should make all of that information available to everyone else all the time from anywhere
by micromacrofoot
5/21/2026 at 6:14:24 PM
> why does a computer making money by learning everything from everyone upset people so? It’s the same thing!
The majority of the population, sitting outside the VC bubble, views AI unfavorably. That's not my hot take, that's a fact from the NYT survey published today.
It's going to be hilarious when VCs, having expropriated the IP of the entire internet, build The Layoff Machine That Does Everything Without Workers, and then the voters decide to just...enthusiastically expropriate that, and we end up with Fully Automated Luxury Communism.
by mullingitover
5/21/2026 at 6:18:14 PM
>The majority of the population, sitting outside the VC bubble, views AI unfavorably.
Sure, where AI means a threat to my job or my skills, people view it unfavourably.
But then they use it. They're all using it. People's rhetoric seldom matches their actions.
>enthusiastically expropriate that, and we end up with Fully Automated Luxury Communism
Maybe in other countries, initially, but the US is very firmly a plutocracy, and has a populace that will very happily vote against their own interests because the plutocrat-owned media told them to. And yeah, it is very rapidly approaching the point where there is going to be zero chance of a revolution even if people opened their eyes.
Which is precisely why the US is now threatening other countries as well, because plutocracy is threatened by rational, educated, better managed countries. Canada, for instance, is an example that a country doesn't have to revert to being an idiocracy, so it's first in the crosshairs.
by llm_nerd
5/21/2026 at 6:34:34 PM
> But then they use it. They're all using it. People's rhetoric seldom matches their actions.
I don't see any contradiction. I criticize the hell out of guns and want them strictly controlled, and yet I own one. `¯\_(ツ)_/¯`
People can use AI and still demand that all of society receive the benefits, instead of a small group of oppressors.
by mullingitover
5/21/2026 at 5:17:56 PM
Of course quantity gives rise to its own quality. If you kill a single person, you are a murderer; if you commit genocide against "others" and distribute the spoils to those left unscathed, you are a national hero. If you steal small things you are a thief and go to prison; if you hog some billions you can enact laws to grab even more.
by psychoslave
5/21/2026 at 6:02:53 PM
>If you kill a single person, you are a murderer, if you genocide "others" and distribute the spoliation wealth to those unscathed you are a national hero.
This is a fundamental misunderstanding of how laws work. It's not the scale that makes it okay, it's that it's done through some official process. Trump's raid to grab Maduro killed less than 100 people. Pretty modest by "genocide" standards, and is easily eclipsed by gang/cartel violence. Yet nobody is going after Trump because he didn't meet some kill quota to get special protection, nor are people condoning cartel violence because they killed far more than Trump.
by gruez
5/21/2026 at 7:13:06 PM
That's exactly how laws work then.
International law is for those who don't have all the nukes and lobotomized cannon fodder ready to invade on a whim; those on the other side commit all the crimes and atrocities, flatly transgress every legal process ever invented, and expect no possible punishment in return.
Number of directly killed people is not something that can be eclipsed by bigger number of killed people. Not in a mind that keeps empathy high in its value.
by psychoslave
5/21/2026 at 7:49:35 PM
[dead]
by eboy
5/21/2026 at 5:18:03 PM
No. It's more like,
"You say I can take a photo of one flower in your flowerbed you put next to the public street, but you get upset when I take a bunch of photos of many public flowerbeds. That's both an over-reach and inconsistent."
by superkuh
5/21/2026 at 6:05:36 PM
Can you map this more directly to claims made about AI? It's impossible to agree or disagree with you. You've just given us an analogy - but to what?
by nonethewiser
5/21/2026 at 6:47:14 PM
Not sure what you're missing here. The other comment replies don't seem to be missing it. See the article, etc.
by dbalatero