Comment by jimbokun 2 days ago

5 replies

The claim of theft is simple: the AI companies stole intellectual property without attribution. Knowing how AIs are trained and seeing the content they produce, I'm not sure how you can dispute that.

riotnrrd 2 days ago

Statistics are not theft. Judges have written over and over again that training a neural network (which is just fitting a high-dimensional function to a dataset) is transformative and therefore fair use. Put another way, me summarizing an MLB game by saying the Cubs lost 7-0 does not infringe on MLB's copyright in the filmed game.

People claiming that backpropagation "steals" your material don't understand math or copyright.

You can hate generative tools all you want -- opinions are free -- but you're fundamentally wrong about the legality or morality at play.

  • jimbokun 16 hours ago

    LLMs sometimes spit out large chunks of text from their training data almost verbatim.

senordevnyc 2 days ago

In the exact same way that it’s not theft if an artist-in-training goes to a museum to look at how other painters created their works.

  • jamietanna 2 days ago

    False equivalence - a random person can't go to a museum and then immediately go and paint exactly like another artist, but that's what the current LLM offerings allow.

    See Studio Ghibli's art style being ripped off, Disney suing Midjourney, etc.

    • senordevnyc a day ago

      That's not exactly how LLMs learn either; they require huge amounts of training data to be able to imitate a style. And lots of human artists are able to imitate one another's style as well, so I'm not sure what makes LLMs so different.

      Regardless of whether you think IP laws should prevent LLMs from training on works under copyright, I hardly think the situation is beyond dispute. Whether copyright itself should even exist is something many dispute.