Comment by ra
This is exactly right. Attention is all you need. It's all about attention. Attention is finite.
The more data you load into context, the more you dilute attention.
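To make the dilution point concrete (a minimal sketch, not from the comment itself, using random logits as a stand-in for real attention scores): softmax weights over n tokens always sum to 1, so the average weight per token is exactly 1/n, and it necessarily shrinks as the context grows.

    import numpy as np

    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    rng = np.random.default_rng(0)
    for n in (10, 100, 1000):
        scores = rng.normal(size=n)  # stand-in logits for one query over n tokens
        w = softmax(scores)
        # weights always sum to 1, so the mean weight is exactly 1/n
        print(n, round(w.max(), 4), round(w.mean(), 4))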
People who criticize LLMs for merely regurgitating statistically related token sequences have very clearly never read a single HN comment.