Comment by flowerthoughts

If LLMs are good at summarizing/compressing, what does this say about the underlying text? Why are some passages more easily recalled? Sure, some sections have probably been quoted more often than others, so there's a bias in the training data, which might explain why the Llama 1 and 3.1 images have similar peaks. Would this happen to LLMs even with no training bias?

Edit: seems the first part is a memory of being bullied by Dudley. The second is where he's been picked for the Quidditch team. Possibly they are just boring passages compared to the surrounding ones. So probably just training bias.