geerlingguy 2 days ago

Kimi K2 was made to be optimized at 4-bit, though.

  • natrys 2 days ago

    That's Kimi K2 Thinking. This post seems to be talking about the original Kimi K2 Instruct, though, and I don't think an INT4 QAT (quantization-aware training) version was released for that one.

elif 2 days ago

I think when you say a trillion parameters, it's implied that it's quantized.