Comment by om8
> 50 tokens is not really very much

Yes! And also llama3.1's tokens are different from Qwen and llama1 tokens: llama3 was the first generation where Meta switched to a very large vocab_size (128,256, up from 32,000 in llama1/llama2).
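For illustration, a minimal sketch (assuming Hugging Face `transformers`; the hub IDs below are my guesses, and the Meta repo is gated, so swap in whatever mirrors you have access to) showing that the same text tokenizes to different counts under each tokenizer:

```python
from transformers import AutoTokenizer

text = "The quick brown fox jumps over the lazy dog. " * 5

for model_id in [
    "meta-llama/Llama-3.1-8B",  # llama3-family tokenizer, ~128k vocab (gated repo)
    "Qwen/Qwen2.5-7B",          # Qwen tokenizer, ~152k vocab
    "huggyllama/llama-7b",      # llama1 tokenizer, 32k vocab
]:
    tok = AutoTokenizer.from_pretrained(model_id)
    ids = tok.encode(text)
    print(f"{model_id}: vocab_size={tok.vocab_size}, tokens={len(ids)}")
```

So "50 tokens" covers a different amount of text depending on which tokenizer produced it.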