Comment by meatmanek

Comment by meatmanek 17 hours ago

Generation time is more or less proportional to tokens * model size, so if you can get the same quality result with fewer tokens from the same size of model, then you save time and money.

kergonath 5 hours ago

Thanks. That was not obvious to me either.

Reply View 0 replies