Comment by lalassu 2 days ago
Disclaimer: I did not test this yet.
I don't want to make big generalizations, but one thing I've noticed with Chinese models, especially Kimi, is that they do very well on benchmarks but fail on vibe testing. They feel a little overfitted to the benchmarks and less tuned to real use cases.
I hope it's not the same here.
K2 Thinking has immaculate vibes. Minimal sycophancy and a pleasant writing style while being occasionally funny.
If it had vision and were better at long context, I'd use it so much more.