Comment by pavlov

Comment by pavlov 3 hours ago

There are many competing providers of commercial LLMs with equal capabilities, so another vendor would probably be happy to serve a leading Western market of 83 million people.

petesergeant 3 hours ago

Yeah? Which commercial provider’s model do you think was trained without using lyrics?

  • pavlov 3 hours ago

    The point is that some other vendor will do the work to implement the filtering required by Germany even if OpenAI doesn't.

  • aniviacat 3 hours ago

    I would imagine providers who want to comply will scan the LLM's output and pay a license fee to the owner if it contains lyrics.

    • petesergeant 3 hours ago

      They scan for commercial work already. Isn’t the law about training, not output?

      • dathinab an hour ago

        they clearly didn't do that properly, or we wouldn't have the current lawsuit

        the lawsuit was also not about whether it is or isn't copyright infringement. It was about who is responsible (OpenAI or the user who tries to bait it into making another illegal copy of song lyrics).

        A model outputting song lyrics means it has them stored somehow, somewhere. Just because the storage is a lossy, compressed, obscure hyperdimensional transformation of some kind doesn't mean it didn't store an illegal copy; otherwise it wouldn't have been able to output them. _Technical details do not protect from legal responsibilities (in general)_

        you could (maybe should) add new laws that in some form treat things an LLM has memorized the same as if a human had memorized them, but currently LLMs get no special legal treatment when it comes to storing copies of things.

      • aniviacat 3 hours ago

        Perhaps; I didn't read the court ruling.

        But I'd be surprised if that was generally the case. It's easy to see why ChatGPT 1:1 reproducing a song's lyrics would be a copyright issue. But creating a derivative work based on the song?

        What if I made a website that counts the number of alliterations in certain songs' lyrics? Would that be copyright infringement, because my algorithm uses the original lyrics to derive its output?

        If this ruling really applied to any algorithm deriving content from copyright-protected works, it would be pretty absurd.

        But absurd copyright laws would be nothing new, so I won't discount the possibility.

      • Semaphor 2 hours ago

        No, it’s specifically about reproducing big chunks of lyrics (mostly) verbatim in the output. The court PR mentioned memorization, retaining training data, multiple times.