6/9/2026 at 10:48:59 PM
> In looking at the code that the LLMs have produced for the project, especially given the pretty massive and widespread architectural changes needed to make the implementation libified and memory safe, we decided that the codebase is not a derivative work that would require carrying forward the GPL license and have decided to release the code under the MIT instead.
Hmm. That's going to be interesting.
by Philpax
6/10/2026 at 9:44:12 AM
Well, there's lots of really interesting opinions here from a lot of armchair lawyers.
To clarify, my stance on this is that the reimplementation did not copy protected expressions (Jplag reports less than 1.8% max similarity between the codebases), it's done in good faith, and it's what's best for the broader Git ecosystem (assuming Grit even becomes usable, which it's currently not purported to be).
From a copyright standpoint, however, only the first argument there is relevant. Grit is an independently authored implementation of Git-compatible behavior, with negligible similarity to Git source code.
I think antirez summarized the situation quite well and I broadly agree with his position: https://antirez.com/news/162
I think that those in the community who know me and have worked with me in the Git and open source communities for the last 20 years know that my intentions are to contribute, share and foster innovation and learning. Many of the main authors of the Git source code are friends of mine and I have no intention to steal anything from anyone, only to make their great ideas more broadly useful.
by schacon
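For readers wondering what a similarity percentage between two codebases can even mean, here is a minimal sketch of one crude measure: Jaccard similarity over overlapping windows of tokens. This is not JPlag's algorithm, and the names `tokenize`, `shingles`, and `similarity` are invented for illustration, so a number from this sketch is not comparable to the figure quoted above.

```rust
use std::collections::HashSet;

/// Split source text into crude tokens: runs of identifier characters,
/// plus each punctuation character on its own. Whitespace is dropped.
fn tokenize(src: &str) -> Vec<String> {
    let mut tokens = Vec::new();
    let mut current = String::new();
    for ch in src.chars() {
        if ch.is_alphanumeric() || ch == '_' {
            current.push(ch);
        } else {
            if !current.is_empty() {
                tokens.push(std::mem::take(&mut current));
            }
            if !ch.is_whitespace() {
                tokens.push(ch.to_string());
            }
        }
    }
    if !current.is_empty() {
        tokens.push(current);
    }
    tokens
}

/// Collect the set of overlapping n-token windows ("shingles").
/// Returns an empty set for n == 0 instead of panicking.
fn shingles(tokens: &[String], n: usize) -> HashSet<Vec<String>> {
    if n == 0 {
        return HashSet::new();
    }
    tokens.windows(n).map(|w| w.to_vec()).collect()
}

/// Jaccard similarity of the two shingle sets, between 0.0 and 1.0.
fn similarity(a: &str, b: &str, n: usize) -> f64 {
    let sa = shingles(&tokenize(a), n);
    let sb = shingles(&tokenize(b), n);
    let union = sa.union(&sb).count();
    if union == 0 {
        return 0.0;
    }
    sa.intersection(&sb).count() as f64 / union as f64
}
```

A measure like this only says how much token-level text overlaps. It says nothing about whether the overlapping parts are protectable expression, which is the question the rest of the thread argues about.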
6/10/2026 at 6:54:34 AM
A translation of a book to a different language is a derivative work. So a translation of a computer program to a different programming language is also. But if in the translation of the book you start altering the plot and the personalities of the characters, does it at some point become not a derivative work? What point? IANAL, and I have no real idea, but I imagine that point has been probed significantly in case-law with respect to creative works. Given the current climate of ever-expanding scope of "intellectual property", if they admit that the LLM had access to git source code then I would say their case is weak at best.
by jbotz
6/10/2026 at 8:11:54 AM
> translation.
It's not technically a translation, it's a re-implementation, with test suites acting as the destination. If it was a file by file translation your argument would have been valid.
by anilgulecha
6/10/2026 at 8:34:00 AM
Git is part of the LLM's training set though, so simply asking it to recreate git in another language is pretty equivalent. Like, you can almost certainly get these LLMs to output git's full source code with some prompting, so there's not that much difference (as much as we like to pretend that AI generated code has no copyright implications).
by 20k
6/10/2026 at 8:54:21 AM
> Like, you can almost certainly get these LLMs to output git's full source code with some prompting, so there's not that much difference (as much as we like to pretend that AI generated code has no copyright implications)
Are you sure? LLMs are in some way a compressed version of their input but it's a pretty lossy compression (arguably this makes them more like a compression algorithm than a compressed version of the data). I'm not sure you can prompt a full, accurate copy of a nontrivial codebase out of them. Even with zero temperature their accuracy is just not that high.
by rcxdude
6/10/2026 at 10:13:19 AM
> I'm not sure you can prompt a full, accurate, copy of a nontrivial codebase out of them. Even with zero temperature their accuracy is just not that high.
Granted, these are some of the most widely spread texts, and not codebases, but just fyi: https://arxiv.org/pdf/2601.02671
> For Claude 3.7 Sonnet, we were able to extract four whole books near-verbatim, including two books under copyright in the U.S.: Harry Potter and the Sorcerer’s Stone and 1984 (Section 4).
by philipportner
6/10/2026 at 8:33:55 AM
Yes, but as soon as copyright became a problem for very rich people parts of it were cancelled.
1) re-implementation for compatibility (which was quickly "reestablished" through use of copyright-protecting encryption. In other words: do you get to write software that connects to MS/Apple/Google/Facebook servers without authorization from those companies? Yes. Do you get to copy an encryption key from their software to make it possible? No)
and, more recently,
2) violating copyright for LLM training
and, currently mostly attempted:
3) "uncopyrighting": run software through an LLM, and some people "believe" it comes out with your copyright on it! Because very rich people want to sell uncopyrighting.
I.e. the jury's still out on what will happen when it's billionaire vs billionaire.
Of course, the question is what happens the second someone does this with a Disney movie, or a big Microsoft application ...
by spwa4
6/9/2026 at 11:09:07 PM
They would be just wrong. I hope someone with standing sues.
by nextaccountic
6/9/2026 at 11:39:01 PM
I don't think it's that clear cut. The functional parts probably aren't copyrightable, only the stylistic ones. If it goes to court, it's going to be a mix of courts applying laws in new ways that haven't been done before and fact-specific questions about what actually persisted through the LLM.
I'd be fascinated to see what happens if it does. Both in the analyses that we'd get of what the LLM did to the codebase and on the legal decisions on what the copyrightable creative elements in code actually are.
If I was the author though... there would be no way that I would be volunteering to be a test case like this. Also seems just rude for no reason.
by gpm
6/10/2026 at 12:33:43 AM
It probably would have been less bad if he had chosen MPL-2.0 or LGPL-2.1-or-later. But he chose MIT, which cuts at the core of the intent of licensing the project with a share-alike license.
by Conan_Kudo
6/10/2026 at 12:39:04 AM
Tell me, can I create a copyrighted video that's not GPL licensed using ffmpeg? Now tell me how creating a Rust library using the git test suite is different?
by joshka
6/10/2026 at 1:45:48 AM
> using the git test suite
That's not actually the case at hand here - the agents were given the original source to reference: https://github.com/gitbutlerapp/grit/blob/main/AGENTS.md#sou...
But for the sake of argument: The test suite itself is copyrighted. To the extent the resulting work is a derivative of the test suite it is possibly infringing. For example you might expect that the agent would derive variable names, function names, and the structure, sequence and organization of the code from the test suite. It might even copy comments wholesale. Those are copyrightable things. (Which is of course just the first step in analyzing if it is infringement; there would be interesting fair use, de minimis copying, etc. arguments following a conclusion that any of those were copyrighted. A product produced this way definitely could be infringing given the right facts though.)
by gpm
6/10/2026 at 2:01:19 AM
> That's not actually the case at hand here - the agents were given the original source to reference: https://github.com/gitbutlerapp/grit/blob/main/AGENTS.md#sou...
Yeah, fair - the "The canonical Git source code we're targeting to replicate the functionality of is in the git/ subdirectory." part makes this hard to argue against.
> To the extent the resulting work is a derivative of the test suite it is possibly infringing
It's this bit that I have a problem with. If I run the test, it fails and reports a failure. Now I write code and run the test again. What is the theory under which the code that I wrote infringes?
Simplify this down:
Assume the following is copyrighted:
    fn test_sum() {
        assert_eq!(sum(1, 1), 2);
    }
Does writing the following code:
    fn sum(a: u8, b: u8) -> u8 {
        a + b
    }
infringe on the test copyright?
by joshka
6/10/2026 at 2:19:17 AM
Writing
    fn sum(a: u8, b: u8) -> u8 {
        a + b
    }
doesn't infringe upon copyright, period, because there's no creative element in that work.
Imagine a more substantial example though. Perhaps you have a test that checks that some file written in a binary format is correct, and gives names (creative elements) to each field of the format that it prints when you mess up the field, and has comments describing why the bytes are laid out like they are (the comments being copyrightable even if the facts they describe aren't), and the LLM copies those field names and comments verbatim... Now it's quite likely that the LLM's work is a derivative of the test suite.
by gpm
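To make the "more substantial example" concrete, here is a minimal sketch of the kind of check being described. The 12-byte header layout, the magic value `DEMO`, the field names, and the functions `read_u32_be` and `check_header` are all invented for illustration and do not come from git or grit. The layout facts (offsets, byte order) are functional; the chosen names and the wording of the comments and messages are the sort of thing the comment argues could be expressive.

```rust
/// Hypothetical 12-byte header, invented for this sketch:
///   bytes 0..4   magic        fixed signature so stray files are rejected early
///   bytes 4..8   version      big-endian, so a hex dump reads naturally
///   bytes 8..12  entry_count  big-endian, number of records that follow
const MAGIC: &[u8; 4] = b"DEMO";
const SUPPORTED_VERSION: u32 = 2;

/// Read a big-endian u32 starting at `offset`. The caller checks the length.
fn read_u32_be(bytes: &[u8], offset: usize) -> u32 {
    u32::from_be_bytes([
        bytes[offset],
        bytes[offset + 1],
        bytes[offset + 2],
        bytes[offset + 3],
    ])
}

/// Validate the header and name the offending field on failure.
fn check_header(bytes: &[u8], expected_entries: u32) -> Result<(), String> {
    if bytes.len() < 12 {
        return Err(format!("header too short: {} bytes", bytes.len()));
    }
    if bytes[0..4] != MAGIC[..] {
        return Err("field `magic`: bad signature".to_string());
    }
    let version = read_u32_be(bytes, 4);
    if version != SUPPORTED_VERSION {
        return Err(format!(
            "field `version`: expected {}, got {}",
            SUPPORTED_VERSION, version
        ));
    }
    let entry_count = read_u32_be(bytes, 8);
    if entry_count != expected_entries {
        return Err(format!(
            "field `entry_count`: expected {}, got {}",
            expected_entries, entry_count
        ));
    }
    Ok(())
}
```

An implementation written only to make such a check pass has to agree on the byte layout, but nothing forces it to reuse the same field names or comments.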
6/10/2026 at 2:40:57 AM
> Doesn't infringe upon copyright period, because there's no creative element in that work.
There's likely a threshold at some point. It's helpful to look at a minimum and then continue from there though.
I'm curious if there's case law that supports your assertions here?
by joshka
6/10/2026 at 2:47:09 AM
For that assertion in particular I believe I'm practically parroting a ruling by the district court in Oracle vs Google about some extremely simple Java functions that Oracle claimed Google copied. Though I can't say I checked to make sure I'm remembering right.
by gpm
6/10/2026 at 2:55:06 AM
You're recalling it right, but there's a nice quote from Judge Alsup in that case that talks about this exact situation:
> “So long as the specific code used to implement a method is different, anyone is free under the Copyright Act to write his or her own code to carry out exactly the same function or specification...”
Here, given that this is Rust and the original expression is in C, the implementations cannot be the same by definition.
by joshka
6/10/2026 at 4:40:03 AM
That's essentially the same thing as modding a game, though. I know there have been lawsuits to stop modding, but I don't think any were successful.
by Pay08
6/10/2026 at 5:04:17 AM
If you did it in a loop until the test passed, maybe?
Your result is essentially impossible without the original. With ffmpeg, your result does not depend on ffmpeg specifically - you can use any video creation tool.
by lelanthran
6/10/2026 at 4:01:26 AM
A GPL tool that processes data doesn't virally transfer the license to its output. Copyrighted ffmpeg code isn't incorporated into the video output. The LLM didn't just conjure up equivalent behavior to git without ingesting the code and transforming it as new output. There is no other behavioral description that would reproduce all needed functionality.
by kevin_thibedeau
6/10/2026 at 1:29:18 AM
Medium, substitutability, basics of copyright law.
by NewJazz
6/10/2026 at 1:49:58 AM
Fair point on medium - this was a lazy example.
Substitutability probably doesn't apply here in the way you're implying, and if it did it would likely be hampered by the 9th Circuit's findings about transformation in Sony v. Connectix. Arguments here would likely look at Rust not having a stable ABI, and hence not being inherently substitutable as a library (grit-lib); it's less clear as an executable (grit-cli).
Basics of copyright law - the fundamental thing being protected is the expression... is a Rust program's expression the same expression as a C program's? I'd say generally not.
by joshka
6/10/2026 at 2:32:07 AM
The test suite could test aspects of the architecture/design of the codebase that are not necessary for interoperability and constitute novel expression of a piece of software in a way that is not at all language specific.
by NewJazz
6/10/2026 at 2:44:05 AM
By definition a test suite is about testing interoperability with the test suite. An HTTP test suite should likely test for whether response code 418 is implemented a particular way and while humorous it would still be an interop test, no?
by joshka
6/10/2026 at 2:58:06 AM
No, the git test suite is about testing the git codebase. If you want something like that, you need a conformance suite, which does not exist for git.
by Conan_Kudo
6/10/2026 at 3:04:11 AM
If feeding the source code through a compiler yields a derivative work, why wouldn't feeding it to an LLM give the same result?
by phkahler
6/10/2026 at 3:18:25 AM
Because compilers and LLMs do different things, and what is done matters, so you can't reason by stepping from one to the other.
Compilers don't axiomatically yield derivative works, they simply in practice do because for non-trivial programs they preserve copyrightable elements of the work in the output.
by gpm
6/10/2026 at 7:45:25 AM
So, if we compile or decompile code using an LLM instead of a compiler, then we can use the result for free?
(An LLM can translate code to/from other code or to/from machine code.)
by oneshtein
6/10/2026 at 3:34:54 AM
Well compilers are a mechanical transformation and if that were sufficient to free you of IP law then IP law wouldn't work.
An LLM is also a computer program which takes input and produces output related in some way to that input. However I don't think most people would view it as a "mere" mechanical transformation. One could tautologically argue that an LLM blends the user input with the training inputs which is a sort of transformation and further that the LLM itself is a computer program thus it is mechanical in nature. However it should be immediately obvious that such an overly literal interpretation is in danger of subsuming human work as well. Where the boundary lies is an unanswered question.
Related, compilers can pose a problem depending on what the output includes. For example Common Lisp compilers that aren't under a permissive license are a minefield because regardless of what anyone might say the image that gets output includes (approximately) the full language implementation verbatim in addition to the user's program.
by fc417fc802
6/10/2026 at 1:26:40 AM
Functional parts not being copyrightable means that you can't claim a program is a copyright violation just because it does the exact same thing for compatibility reasons (you can copy what the program does). E.g. git stores refs in .git/refs, so does grit, that's not a violation. You still can't copy the program.
by trumpdong
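As an illustration of what "copying what the program does" looks like for the refs example, here is a minimal sketch of resolving a loose ref path and parsing a ref file's contents. It is not code from git or grit; the names `loose_ref_path`, `parse_ref_content`, and `RefTarget` are invented here, and the sketch ignores packed refs and other storage backends. Any compatible tool has to agree on this on-disk behavior, which is separate from how either codebase expresses it.

```rust
use std::path::{Path, PathBuf};

/// What a ref file points at: an object id, or another ref by name.
#[derive(Debug, PartialEq)]
enum RefTarget {
    Direct(String),
    Symbolic(String),
}

/// Path of a loose ref such as "refs/heads/main" inside a git directory.
fn loose_ref_path(git_dir: &Path, refname: &str) -> PathBuf {
    git_dir.join(refname)
}

/// Parse the contents of a ref file.
/// Accepts "ref: <name>" for symbolic refs, or a 40/64-character hex object id.
fn parse_ref_content(content: &str) -> Option<RefTarget> {
    let trimmed = content.trim();
    if let Some(name) = trimmed.strip_prefix("ref: ") {
        return Some(RefTarget::Symbolic(name.trim().to_string()));
    }
    let is_hex = trimmed.chars().all(|c| c.is_ascii_hexdigit());
    if is_hex && (trimmed.len() == 40 || trimmed.len() == 64) {
        return Some(RefTarget::Direct(trimmed.to_string()));
    }
    None
}
```

For example, `loose_ref_path(Path::new(".git"), "refs/heads/main")` yields `.git/refs/heads/main`, and a `HEAD` file containing `ref: refs/heads/main` parses as a symbolic ref.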
6/10/2026 at 1:49:34 AM
Yes... and now we get to the fact specific question of "did they copy the program". Or actually the answer to that is plainly "no" - they made something similar from it - and didn't run ctrl-c ctrl-v in an unlicensed manner, but "did they copy the relevant facets of the program into the new similar thing".
by gpm
6/10/2026 at 1:52:44 AM
Making something similar is copying for the purpose of copyright law. If I trace over a Disney character it's still copyright Disney.
by trumpdong
6/10/2026 at 2:22:18 AM
No. You're allowed to make a similar tool, the functional elements are not copyrightable. There's a long history, predating LLMs by many decades, of doing this in the software industry.
My use of the word "similar" does not imply here that I think it's obvious that they are "similar" in any copyrightable elements - whether they are or not is one of the interesting questions I think this case would have to resolve.
Incidentally you're also allowed to make similar creative elements so long as they aren't copies and you did so independently... which could actually come up in a case like this (imagine the LLM produced a similar function to some function in the original... but the original wasn't in the context window at the time. Not at all unlikely with code where there often is only one or two natural ways to write something).
by gpm
6/10/2026 at 12:36:34 AM
I suspect that the issue is more likely that the LLM code doesn't have an author and hence some parts of it can't be licensed; it's less likely that it's infringing on git's copyright for various reasons. (I am not a lawyer, but I do read copyright law for funsies).
by joshka
6/10/2026 at 4:12:01 AM
https://www.copyright.gov/newsnet/2025/1060.html
> It concludes that the outputs of generative AI can be protected by copyright only where a human author has determined sufficient expressive elements. This can include situations where a human-authored work is perceptible in an AI output, or a human makes creative arrangements or modifications of the output, but not the mere provision of prompts.
Well that's interesting.
by nomel
6/10/2026 at 5:13:49 AM
Also "just" the legal opinion of a government office. It has yet to be tested in court.
by wongarsu
6/10/2026 at 1:25:10 AM
Why wouldn't it? If you run git through a compiler it's still copyright the git devs, same if you run it through an LLM.
by trumpdong
6/10/2026 at 1:40:05 AM
What makes you think that's what the article says that it did? There's a lot of specific nuance and it doesn't say that anywhere. In fact it speaks of making a test suite pass only. This is the classic clean-room "BIOS from specs" approach, but with no need to extract a spec, as the test suite is available to run, and there's nothing in the GPL that suggests that running a test suite infects the software that you run it on.
by joshka
6/10/2026 at 5:08:13 AM
Surely git’s source is already in the LLM’s training corpus. So this is far from a clean-room approach.
by vasachi
6/10/2026 at 4:14:24 AM
Obligatory: https://github.com/chardet/chardet/issues/327
by xiaoyu2006
6/10/2026 at 4:16:30 AM
Not a fan of this trend of "cleaning" GPL licensed software and releasing under permissive licenses. Also why I'm not a fan of UUtils nor Canonical's early adoption of it in Ubuntu.
The intent here is extraction of all the value provided by copyleft projects without the obligation to give back. Whether it's technically legal or not, it's disgusting behavior IMO.
by thewebguyd
6/10/2026 at 4:28:01 AM
That’s explicitly not what’s happening with uutils; they have contributed fixes and test cases back to upstream.
by Ar-Curunir
6/10/2026 at 5:07:35 AM
And just like that, it was forked by Microsoft a few days ago. Handed to them on a silver platter.
by WD-42
6/10/2026 at 4:55:11 AM
> Not a fan of this trend of "cleaning" GPL licensed software
> Whether it's technically legal or not, it's disgusting behavior IMO.
GNU was originally developed to "clean" UNIX from the AT&T license.
by trimbo
6/10/2026 at 9:03:02 AM
An idea...
Take this (assuming it's not slop), relicense as GPL, submit upstream (imagine it's accepted for a moment...).
If they proceed with license washing then from the Rust version, it's certainly a derived work.
by silon42
6/10/2026 at 8:43:17 AM
This is not a proper black-box reimplementation, I doubt they can get away with that. And that's not mentioning all other obvious ethical concerns of course.
by NietTim
6/10/2026 at 8:56:43 AM
Black-box/clean-room isn't necessarily required, though. It does make it a lot harder to argue in court, of course.
by rcxdude
6/10/2026 at 8:19:33 AM
I don't care if they can convince a judge. The fact that they even want to in the first place tells me what kind of people they are.
F-ing scumbags. It's already free, but they still decide to steal it.
by Brian_K_White
6/10/2026 at 3:03:03 AM
I'm not a copyright lawyer, but it seems pretty clear to me you can't wash a license using an LLM.
[US jurisdiction]: Anything in the result written by the LLM cannot be copyrighted by anyone.
Anything in the result written by a human can be, and if it was all emitted by the LLM then that portion originally written by a human carries its own copyright.
As a work of an LLM, the entirety presumably cannot be copyrighted at all. Portions written by humans presumably carry their original copyright.
by jhayward
6/10/2026 at 3:47:44 AM
> [US jurisdiction]: Anything in the result written by the LLM can not be copyright by anyone.
This is a bit stronger than what the actual report discussing this finds. See part 2 in https://www.copyright.gov/ai/ for details, but TL;DR, parts where humans have control over the expression may be copyrightable. But working out which parts those are is likely a difficult question (it would likely require proof of provenance across many of those LLM sessions).
by joshka
6/10/2026 at 4:36:34 AM
Knowing what you don't know is such an important skill in life and your career. And I 100% agree with you that the author is, well, off their rocker.
Let me give an example: I could take Goldeneye from the N64, extract the binary and then run it through an LLM to disassemble it and possibly rewrite it in a modern higher-level language. Do you think Nintendo would look at that and say "well, he did a lot of work so he's escaped our license"? Of course not. It's just silly.
Ingesting the source code and producing output in another language is quite clearly a derivative work. You don't need to be an IP lawyer to figure that out.
Now, if you went to Claude and gave it documentation and told it to produce something that was compatible, would that be a derivative work and thus covered by the GPL? I would guess probably. But I'm not 100% sure anymore. I wouldn't risk it however.
Here's another thought experiment: what if someone takes this supposedly MIT licensed source tree, plugs it into another LLM and asks it to produce the output in C? Now how is it licensed? It might be very similar. After all, there are only so many ways to produce a SHA1 hash and so many ways to do a command line parser.
But this then makes it an interesting legal issue. In the Oracle v. Google court case, this was a key issue. Google successfully argued there's only so many ways to write a loop so just because a loop is similar to the source, that doesn't mean it's copyright infringement (as Oracle argued).
Anyway, it's a crazy position to take.
by jmyeet
6/10/2026 at 8:04:24 AM
> Knowing what you don't know is such an important skill in life and your career. And I 100% agree with you that the author is, well, off their rocker.
They aren't the only ones - look at the number of people in this thread who are arguing that this is analogous to producing a movie with ffmpeg - just because ffmpeg is GPL, does not make your movie GPL.
I am struggling to understand how such a high level of cognitive dissonance is possible: They believe both a) that the license can be laundered in this manner, and that b) the license they put on the result is effective!
by lelanthran
6/10/2026 at 5:35:47 AM
Well that is already how it is done with numerous multi-decade open rewrites of closed games. They usually require the asset pack.
I don't know how this squares with law, but Oracle v Google gave a very valuable judgment to the public that an API is not copyrightable. If we take the LLM out of it, that's all we are talking about in the pure case.
Of course, we can't take the LLM out, but it is the starting point.
by beacon294
6/10/2026 at 7:46:18 AM
> Well that is already how it is done with numerous multi-decade open rewrites of closed games
Serious such rewrites don't start with the code of the closed game!
> I don't know how this squares with law, but Oracle v Google gave a very valuable judgment to the public that an API is not copyrightable. If we take the LLM out of it, that's all we are talking about in the pure case.
Not at all. The LLM used to write grit has seen the git code. That is what we're talking about here.
> Of course, we can't take the LLM out, but it is the starting point.
The LLM isn't the important thing. The important thing is that the git source code was used to make grit.
by gspr
6/10/2026 at 8:58:23 AM
> Serious such rewrites don't start with the code of the closed game!
No, but they often involve reverse engineering the binary pretty heavily.
by rcxdude
6/10/2026 at 8:22:39 AM
heh - https://github.com/n64decomp/007
game decompilation and emulation is as old as computing
by thedevilslawyer