Comment by ACCount37
Not necessarily. The reason SFT can hurt performance is often the gap between the training data and the model's existing capabilities.
Imagine forcing someone who has never used chopsticks to eat with chopsticks. The results wouldn't be good - the instruction "use chopsticks" has taken effect, but the underlying "chopstick use" capability isn't there.
If your SFT data pushes your LLM too far past its capabilities? It'll teach it to attempt things it can't do.
If your SFT traces assume your LLM can do 10-digit multiplication, the LLM won't learn 10-digit multiplication from them. It'll learn to attempt 10-digit multiplication, and it'll fail.
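To make that concrete, here's a toy sketch of the standard SFT objective (generic HuggingFace-style code; the model choice and the multiplication trace are made-up placeholders, not from any real dataset):

```python
# Toy sketch: SFT loss on a trace that assumes an out-of-reach capability.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")   # stand-in small model
model = AutoModelForCausalLM.from_pretrained("gpt2")

# An SFT trace that presupposes 10-digit multiplication. Whether the
# answer digits are even correct is irrelevant to the objective below.
trace = "Q: 9182736455 * 8273645591 = ?\nA: 75973461264873050905"
batch = tok(trace, return_tensors="pt")

# Standard SFT objective: next-token cross-entropy over the trace.
# It measures token-by-token imitation of the trace's surface form;
# nothing in it checks whether the model can actually carry out the
# multiplication. Training will drive this loss down by teaching the
# model to emit answers *in this shape*.
out = model(**batch, labels=batch["input_ids"])
print(out.loss)
```

Gradient descent happily minimizes that loss either way, so what gets learned is the attempt, not the arithmetic.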
Fair point regarding data quality, but in the PEFT-Bench study, the base model actually outperformed the fine-tuned versions on those specific math/code tasks.
So the "chopstick capability" was already there (at least partially), but the SFT process actively degraded it. It seems less about the data being too hard and more about the parameter-efficient methods (like LoRA) overwriting or interfering with delicate reasoning circuits just to satisfy the formatting loss.
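For intuition on why that interference is plausible, here's a minimal LoRA sketch (generic PyTorch, not the PEFT-Bench setup; the layer size, rank, and alpha are made-up values):

```python
# Minimal LoRA adapter over a frozen linear layer.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():       # freeze pretrained weights
            p.requires_grad_(False)
        # Standard LoRA init: A small random, B zero, so the delta starts at 0.
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x):
        # Effective weight is W + scale * (B @ A). The update B @ A is
        # low-rank but dense: it shifts the projection for *all* inputs,
        # not just the ones the formatting loss was computed on. If W
        # encodes a delicate reasoning feature, the adapter can degrade
        # it while the SFT loss still goes down.
        return self.base(x) + self.scale * (x @ self.A.T) @ self.B.T

proj = LoRALinear(nn.Linear(768, 768))
x = torch.randn(1, 768)
print(proj(x).shape)  # torch.Size([1, 768])
```

The point is just that nothing confines the low-rank update to "formatting" directions; whatever gradient the imitation loss produces gets baked into a weight that every forward pass flows through.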