Comment by mvdtnz
But also we know for a fact that Meta trained their models on pirated books. So there's no need to invent a harebrained scheme of stitching together bits and pieces like that.
No, we know it because it was established in court from Meta internal communications.
https://www.theguardian.com/technology/2025/jan/10/mark-zuck...
I'm confused. Nowhere in my post have I said that they didn't?
No, assuming that just because it was in the training data it must be memorized is harebrained.
LLMs have limited capacity to memorize, under ~4 bits per parameter[1][2], and are trained on terabytes of data. It's physically impossible for them to memorize everything they're trained on. The model memorized chunks of Harry Potter not simply because it was directly trained on the whole book, which the article also alludes to:
> For example, the researchers found that Llama 3.1 70B only memorized 0.13 percent of Sandman Slim, a 2009 novel by author Richard Kadrey. That’s a tiny fraction of the 42 percent figure for Harry Potter.
In case it isn't obvious, both Harry Potter and Sandman Slim are part of the books3 dataset.
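A back-of-the-envelope check of the capacity argument above, assuming the ~4 bits/parameter upper bound from [1][2]; the corpus size here is a hypothetical round number for illustration, not Llama's actual training-set size:

```python
# Can a 70B-parameter model memorize its whole training set?
# Assumes the ~4 bits/parameter capacity estimate cited above [1][2].
params = 70e9                  # Llama 3.1 70B parameter count
bits_per_param = 4             # rough upper bound on memorization capacity

capacity_gb = params * bits_per_param / 8 / 1e9   # bits -> bytes -> GB

corpus_tb = 15                 # hypothetical multi-terabyte text corpus
corpus_gb = corpus_tb * 1e3

print(f"Max memorization capacity: {capacity_gb:.0f} GB")
print(f"Memorizable fraction of corpus: {capacity_gb / corpus_gb:.2%}")
```

Even at the generous 4-bit bound, a 70B model tops out around 35 GB of memorized content, a fraction of a percent of a multi-terabyte corpus; verbatim recall of every training document is ruled out by arithmetic alone.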
[1] https://arxiv.org/abs/2505.24832
[2] https://arxiv.org/abs/2404.05405