Comment by WhatsName
Given the method and how the English language works, isn't that the expected outcome for any text that isn't highly technical?
Guess the next word: Not all heroes wear _____
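Concretely, "guess the next word" is all a causal language model does. A minimal sketch of that mechanic, assuming the Hugging Face transformers library and the small gpt2 checkpoint purely for illustration (any causal LM would do):

```python
# Minimal sketch: ask a causal LM for its top guesses for the blank in
# "Not all heroes wear ____". Model choice (gpt2) is an assumption here.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

inputs = tokenizer("Not all heroes wear", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Probability distribution over the whole vocabulary for the next token.
probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(probs, k=5)
for p, idx in zip(top.values, top.indices):
    print(f"{tokenizer.decode(idx.item()):>10}  p={p.item():.3f}")
# A stock idiom like "capes" is highly predictable from corpus statistics
# alone; completing it needs no verbatim memorization of any one text.
```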
> Given the method and how the English language works, isn't that the expected outcome for any text that isn't highly technical?
> Guess the next word: Not all heroes wear _____
As there is no reason to believe that Harry Potter is axiomatic to our culture in the way that other concepts are, it is strange to me, and not at all expected, that LLMs are able to reproduce it this way. Why do you think this outcome is expected? Are the LLMs somehow encoding the same content in such a way that they can be prompted to decode it? And does it matter legally how LLMs do what they do technically? This is pertinent to the court case that Meta is currently party to.
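To make the "prompted to decode it" question concrete, here is a rough sketch of the kind of probe memorization studies run: feed the model a verbatim passage as a prefix and check whether greedy decoding reproduces the true continuation. This is an illustrative simplification, not the exact protocol of the paper linked at the end; the model name is a placeholder, and the studies use much larger open-weight models.

```python
# Rough sketch of a verbatim-memorization probe: prompt with a book
# prefix, greedily decode, and compare against the real continuation.
# Illustrative only -- the model is a placeholder assumption.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "gpt2"  # placeholder; real studies use larger open-weight models
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)
model.eval()

def reproduces_verbatim(prefix: str, continuation: str) -> bool:
    """Greedy-decode after `prefix`; True if the output matches exactly."""
    prefix_ids = tokenizer(prefix, return_tensors="pt").input_ids
    target_ids = tokenizer(continuation).input_ids
    with torch.no_grad():
        out = model.generate(
            prefix_ids,
            max_new_tokens=len(target_ids),
            do_sample=False,  # greedy: the model's single most likely path
            pad_token_id=tokenizer.eos_token_id,
        )
    return out[0, prefix_ids.shape[1]:].tolist() == target_ids

# Hypothetical usage, with the (real) opening line of Harry Potter:
print(reproduces_verbatim(
    "Mr. and Mrs. Dursley, of number four, Privet Drive,",
    " were proud to say that they were perfectly normal",
))
```

Completing a common idiom is unremarkable; long, otherwise-improbable passages coming back verbatim under a test like this is the signal such studies look for.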
https://en.wikipedia.org/wiki/Artificial_intelligence_and_co...
> See for example OpenAI's comment in the year of GPT-2's release: OpenAI (2019). Comment Regarding Request for Comments on Intellectual Property Protection for Artificial Intelligence Innovation (PDF) (Report). United States Patent and Trademark Office. p. 9. PTO–C–2019–0038. “Well-constructed AI systems generally do not regenerate, in any nontrivial portion, unaltered data from any particular work in their training corpus”
https://copyrightalliance.org/kadrey-v-meta-hearing/
> During the hearing, Judge Chhabria said that he would not take into account AI licensing markets when considering market harm under the fourth factor, indicating that AI licensing is too “circular.” What he meant is that if AI training qualifies as fair use, then there is no need to license and therefore no harmful market effect.
I know this is arguing against the point that this copyright lobbyist is making, but I hope so much that this is the case. The “if you sample, you must license” precedent was bad, and it was an unfair taking from the commons by copyright holders, imo.
The paper this post is referencing is freely available:
https://arxiv.org/abs/2505.12546