Comment by Myrmornis 3 hours ago

But lyrics are just one example. Are you saying that training experiments must filter out all substrings from the training input that bear too close a resemblance to a substring of a copyrighted work?

barrucadu 2 hours ago

Obviously there's a limit: reproducing a single sentence is unlikely to be copyright infringement, simply because there are only so many words in a language. But if reproducing some text would be copyright infringement when a human did it, I don't see why LLM companies should get a free pass.

If it's really essential that they train their models on song lyrics, or books, or movie scripts, or articles, or whatever, they should pay license fees.

freejazz 17 minutes ago

At some point, use of the lyrics becomes de minimis.