Comment by ryandamm

Comment by ryandamm a day ago

This may not be a particularly popular opinion, but current copyright law in the US pretty clearly favors treating the training of an AI as a transformative act covered by fair use. (I did confirm this belief in conversation with an IP attorney earlier this week, by the way, though I myself am not a lawyer.)

The best-positioned lawsuits to win, like NYTimes vs. OpenAI/MS, are actually based on violating terms of use, rather than infringing at training time.

Emitting works that violate copyright is certainly possible, but you could argue that the additional entropy that has to be passed into the model (the text prompt, or the random seed in a diffusion model) is necessary for the infringement. Regardless, the current law would suggest that the infringing action happens at inference time, not training.

I'm not claiming that copyright should work that way, merely that it does today.

codedokode a day ago

> Regardless, the current law would suggest that the infringing action happens at inference time, not training.

Zuckerberg downloading a large library of pirated articles does not violate any laws? I think you can get a life sentence for merely posting links to the library.

  • philipkglass a day ago

    > I think you can get a life sentence for merely posting links to the library.

    This isn't true in the United States. I would be surprised if it were true in any country. Many people have posted sci-hub links here, and to my knowledge nobody has ever suffered legal problems from it:

    https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...

    • codedokode a day ago

      Doesn't it count as distribution? I thought the DMCA requires links to be deleted.

      • philipkglass 18 hours ago

        A copyright holder may file a takedown notice [1] against a platform that hosts a link to copyright-infringing material like a book from Library Genesis or an article from sci-hub. Failure to act upon a legitimate takedown notice opens the platform operator up to a civil lawsuit. The platform does not have to take proactive measures to prevent infringing links from being posted by users. Some platforms like YouTube take more aggressive measures to proactively guard against infringement, but those measures are not required by the provisions of the DMCA.

        [1] https://guides.dml.georgetown.edu/c.php?g=904530&p=6510951 (See "Notifications of Claimed Infringement")

photonthug a day ago

> The best-positioned lawsuits to win, like NYTimes vs. OpenAI/MS, are actually based on violating terms of use, rather than infringing at training time.

I agree with this, but it's worth noting this does not conflict with and kind of reinforces the GP's comment about hypocrisy and "[ignoring] the law as long as you've got enough money".

The terms of use angle is better than copyright, but most likely we'll never see any precedent created that allows this argument to succeed on a large scale. If it were allowed, then every ToS would simply begin to say "Humans Only, Robots Not Welcome" or, if you're a newspaper, "by reading this you agree that you're a human or a search engine but will never use content for generative AI". If GitHub could enforce site terms and conditions like that, then they could prevent everyone else from scraping regardless of individual repository software licenses, etc.

While the courts are setting up precedent for this kind of thing, they will be pressured to maintain a situation where terms and conditions are useful for corporations to punish people. Meanwhile, corporations won't be able to punish corporations for the most part, regardless of the difference in size. But larger corporations can ignore whatever rules they want, to the possible detriment of smaller ones. All of which is more or less the status quo.

o11c a day ago

Training alone, perhaps. But the way the AIs are actually used (regardless of prompt engineering) is a direct example of what is forbidden by the case that introduced the "transformative" language.

> if [someone] thus cites the most important parts of the work, with a view, not to criticize, but to supersede the use of the original work, and substitute the review for it, such a use will be deemed in law a piracy.

Of course, we live in a post-precedent world, so who knows?