Comment by ra
This is exactly right. Attention is all you need. It's all about attention. Attention is finite.
The more data you load into context, the more you dilute attention.
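To make the dilution point concrete (a minimal sketch, not from the comment itself, using random logits as a stand-in for real attention scores): softmax weights over n tokens always sum to 1, so the average weight per token is exactly 1/n, and it necessarily shrinks as the context grows.

    import numpy as np

    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    rng = np.random.default_rng(0)
    for n in (10, 100, 1000):
        scores = rng.normal(size=n)  # stand-in logits for one query over n tokens
        w = softmax(scores)
        # weights always sum to 1, so the mean weight is exactly 1/n
        print(n, round(w.max(), 4), round(w.mean(), 4))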
People who criticize LLMs for merely regurgitating statistically related token sequences have very clearly never read a single HN comment.