dvt 18 hours ago

> the physical encoding which definitely exists in my brain is a copyright violation

First of all, we don't really know how the brain works. I get that you're being a snarky physicalist, but there are plenty of substance dualists, panpsychists, etc. out there. So, some might say, this is a reductive description of what happens in our brains.

Second of all, yes, if you tried to publish Harry Potter (even if it was from memory), you would get in trouble for copyright violation.

  • ninetyninenine 18 hours ago

    Right, but the physical encoding already exists in my brain; otherwise how could I reproduce it in the first place? We may not know how the encoding works, but we do know that an encoding exists because a decoding is possible.

    My question is… is that in itself a violation of copyright?

    If not, then as long as LLMs don’t make a publication it shouldn’t be a copyright violation, right? Because we don’t understand how it’s encoded in LLMs either. It is literally the same concept.

    • Jaygles 18 hours ago

      To me the primary difference between the potential "copy" that exists in your brain and a potential "copy" that exists in the LLM, is that you can't make copies and distribute your brain to billions of people.

      If you compressed a copy of HP as a .rar, you couldn't read that as is, but you could press a button and get HP out of it. To distribute that .rar would clearly be a copyright violation.

      Likewise, you can't read whatever of HP exists in the LLM model directly, but you seemingly can press a bunch of buttons and get parts of it out. For some models, maybe you can get the entire thing. And I'm guessing you could train a model whose purpose is to output HP verbatim and get the book out of it as easily as de-compressing a .rar.

      So, the question in my mind is: how similar is distributing the LLM model, or giving access to it, to distributing a .rar of HP? There's likely a spectrum of answers depending on the LLM.

      • ninetyninenine 17 hours ago

        > that exists in the LLM, is that you can't make copies and distribute your brain to billions of people.

        I can record myself reciting the full Harry Potter book then distribute it on YouTube.

        Could do the exact same thing with an LLM. The potential for distribution exists in both cases. Why is one illegal and the other not?

    • numpad0 17 hours ago

      copyright is actually not as much about right to copy as it is about redistribution permissions.

      if you trained an LLM on real copyrighted data, benchmarked it, wrote up a report, and then destroyed the weights, that's transformative use and legal in most places.

      if you then put up that gguf on HuggingFace for anyone to download and enjoy, well... IANAL. But maybe that's a bit questionable, especially long term.

    • bitmasher9 18 hours ago

      I don’t think the lawyers are going to buy arguments that compare LLMs with human biology like this.

lithiumii 19 hours ago

You are not selling or distributing copies of your brain.

harry8 19 hours ago

If you perform it from memory in public without paying royalties then yes, yes it is.

Should it be? Different question.

JKCalhoun 18 hours ago

The end of "Fahrenheit 451" set a horrible precedent. Damn you, Bradbury!

beowulfey 18 hours ago

Only if you charge someone to reproduce it for them.