Comment by aprilthird2021

> let's not pretend that an LLM that autocompletes a couple lines from harry potter with 50% accuracy is some massive new avenue to piracy. No one is using this as a substitute for buying the book.

Well, luckily the article points out what people are actually alleging:

> There are actually three distinct theories of how training a model on copyrighted works could infringe copyright:

> Training on a copyrighted work is inherently infringing because the training process involves making a digital copy of the work.

> The training process copies information from the training data into the model, making the model a derivative work under copyright law.

> Infringement occurs when a model generates (portions of) a copyrighted work.

None of those claim that these models are a substitute for buying the books. That's not what the plaintiffs are alleging. Infringing on a copyright is not only a matter of piracy (piracy is one of many ways to infringe copyright).

theK 17 hours ago

I think that last scenario seems to be the most problematic. Technically it is the same thing that piracy via torrent does: distributing a small piece of copyrighted material without the copyright holder's consent.

paxys 17 hours ago

People aren't alleging this; the author of the article is.