Comment by aschobel a day ago

Indeed! It is a form of massive lossy compression.

> Llama 3 70B was trained on 15 trillion tokens

That's roughly a 200x "compression" ratio, compared to the 3-7x you get from traditional lossless text compressors like bzip2 and friends.
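
A quick back-of-envelope sketch of where a figure like that comes from (assuming ~4 bytes of UTF-8 text per token and fp16 weights at 2 bytes per parameter; both are rough assumptions, not published numbers):

    # back-of-envelope: how much does Llama 3 70B "compress" its training data?
    tokens = 15e12          # training tokens (from the quote above)
    params = 70e9           # model parameters
    bytes_per_token = 4     # assumed average UTF-8 bytes per token
    bytes_per_param = 2     # assumed fp16/bf16 storage

    text_bytes = tokens * bytes_per_token    # ~60 TB of raw training text
    model_bytes = params * bytes_per_param   # ~140 GB of weights

    print(tokens / params)                   # ~214x, tokens per parameter
    print(text_bytes / model_bytes)          # ~429x, bytes of text per byte of model

Measured as tokens per parameter it lands around 200x; measured in raw bytes it's even higher.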

LLMs don't just compress, they generalize. If they could only recite Harry Potter perfectly but couldn't write code or explain math, they wouldn't be very useful.

amlib 11 hours ago

But LLMs can't write code or explain math; they only plagiarize existing code and existing explanations of math.