I wonder if we should have a #GPL 4 that would cover machine learning. Something like "if this code is used to train an LLM, then the code produced by the LLM must be released under the same license". I know there are many challenges, such as effective enforcement, but if this issue remains unaddressed, I believe LLMs may become a way to evade license virality.

#freesoftware #OpenSource #licensing

in reply to Miroslav Suchý

@mirek that is very curious, i'm wondering how familiar the lawyers were with the actual technology.

i don't use llms too much, so i'm not familiar myself -- but to me it looks like some creative work goes into writing the prompt?

i don't use these tools too much, but just a few days ago i ended up using one and it turned out that a good way to avoid shit output is feeding in pseudocode or commanding it to make the very same sort of edits as one would otherwise make manually. i'd be surprised if someone considered this not a creative process?

in reply to Jiří Eischmann

Copyright licenses are not a magic spell. If LLMs are adjudicated to be derivative works of their inputs, no additional license is needed; the existing GPL (or indeed, even Apache or MIT, given the lack of license text reproduction in LLM output) is fine. If LLMs are adjudicated to be fair use, no additional license will help.

Focusing on licensing is fighting the last war. Public communication and movement-building are what is needed now.