Comment by imtringued 3 days ago
I'm not an expert, but MoE models perform better at continual learning because they are less prone to catastrophic forgetting.