Comment by karmakaze

What I find impressive about V3.1 is what's different, especially the efficiency:

Significantly better training efficiency from innovations like FP8 mixed-precision training, which cuts memory use by up to 75% and speeds up training.
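
For a rough sense of where a figure like 75% can come from, here is a back-of-the-envelope sketch of per-value storage. It assumes the baseline is FP32, which the claim above does not actually state, and `BYTES_PER_VALUE`, `storage_gib` and `saving_vs` are names made up for illustration. Real mixed-precision training usually keeps some tensors in higher precision, so end-to-end savings will typically be lower, which fits the "up to" wording.

```python
# Bytes needed to store a single value in each floating-point format.
BYTES_PER_VALUE = {"fp32": 4, "bf16": 2, "fp16": 2, "fp8": 1}


def storage_gib(num_values: int, fmt: str) -> float:
    """Raw storage in GiB for num_values values held in the given format."""
    return num_values * BYTES_PER_VALUE[fmt] / 2**30


def saving_vs(baseline: str, fmt: str) -> float:
    """Fraction of per-value storage saved by using fmt instead of baseline."""
    return 1 - BYTES_PER_VALUE[fmt] / BYTES_PER_VALUE[baseline]


if __name__ == "__main__":
    # Assumed baselines: FP32 and BF16. Neither is named in the comment.
    for baseline in ("fp32", "bf16"):
        print(f"fp8 vs {baseline}: {saving_vs(baseline, 'fp8'):.0%} less storage per value")
```

Against a 16-bit baseline the same arithmetic gives 50%, so the 75% number only holds if the comparison is with 32-bit storage.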

Faster inference with a multi-token prediction architecture that generates multiple tokens per step, resulting in 2-3x faster output.
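
One way to reason about the speedup from predicting several tokens per step is a simple acceptance model. This is a toy sketch, not the model's actual decoding scheme: it assumes each step drafts `extra_tokens` additional tokens, that each is accepted independently with probability `accept_prob`, and that acceptance stops at the first rejection. Both parameters and the function name are hypothetical.

```python
def expected_tokens_per_step(extra_tokens: int, accept_prob: float) -> float:
    """Expected tokens emitted per decoding step under the toy model.

    The first token is always kept. The i-th extra token is kept only if
    it and every drafted token before it were accepted, which happens
    with probability accept_prob ** i.
    """
    if extra_tokens < 0:
        raise ValueError("extra_tokens must be non-negative")
    if not 0.0 <= accept_prob <= 1.0:
        raise ValueError("accept_prob must be between 0 and 1")
    return sum(accept_prob ** i for i in range(extra_tokens + 1))


if __name__ == "__main__":
    # Hypothetical settings, chosen only to show how the numbers move.
    for k, p in [(1, 0.9), (2, 0.8), (3, 0.8)]:
        print(f"extra_tokens={k}, accept_prob={p}: "
              f"{expected_tokens_per_step(k, p):.2f} tokens/step")
```

Under these assumptions, two extra tokens at 80% acceptance give about 2.44 tokens per step, which falls in the 2-3x range quoted above. The model ignores the extra compute needed to produce and verify the drafted tokens, so wall-clock gains would be somewhat smaller.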

A new hybrid thinking mode that lets you switch between a fast non-thinking mode and slower, more deliberate reasoning without quality loss.