Comment by abtinf 17 hours ago
Of course it is.
It's just a form of compression.
If I train an autoencoder on an image, and distribute the weights, that would obviously be the same as distributing the content. Just because the content is commingled with lots of other content doesn't make it disappear.
Besides, where did the sections of text from the input works that show up in the output text come from? Divine inspiration? God whispering to the machine?
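To make the autoencoder analogy concrete, here's a minimal sketch (PyTorch, with made-up image dimensions and network sizes): overfit a tiny autoencoder to a single image, and the trained weights alone are enough to reproduce a lossy copy of that image. Shipping the weights is then, in effect, shipping the content.

```python
# Minimal sketch of the autoencoder analogy: memorize one image in the weights.
# Shapes and hyperparameters are illustrative, not from any real system.
import torch
import torch.nn as nn

image = torch.rand(3, 64, 64)            # stand-in for a real image
x = image.flatten().unsqueeze(0)          # shape (1, 3*64*64)

model = nn.Sequential(                    # encoder -> small bottleneck -> decoder
    nn.Linear(3 * 64 * 64, 256), nn.ReLU(),
    nn.Linear(256, 3 * 64 * 64), nn.Sigmoid(),
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for _ in range(2000):                     # overfit to the single training example
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x), x)
    loss.backward()
    opt.step()

# Distributing model.state_dict() now effectively distributes a lossy copy of the image:
reconstruction = model(x).detach().reshape(3, 64, 64)
```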
Indeed! It is a form of massive lossy compression.
> Llama 3 70B was trained on 15 trillion tokens
That's roughly a 200x "compression" ratio, compared to the 3-7x you get from traditional lossless text compression like bzip and friends.
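Roughly where that figure comes from, treating tokens seen per parameter stored as the "compression" ratio (a byte-for-byte number would also depend on assumptions about tokenizer and weight precision):

```python
# Back-of-the-envelope for the "~200x" figure.
tokens = 15e12   # Llama 3 training corpus, per the quote above
params = 70e9    # Llama 3 70B parameter count

print(tokens / params)   # ~214 tokens "compressed" into each parameter
```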
LLMs don't just compress, they generalize. If they could only recite Harry Potter perfectly but couldn't write code or explain math, they wouldn't be very useful.