Comment by imtringued
Comment by imtringued 6 months ago
I'm not an expert, but MoE models perform better at continuous learning, because they are less prone to catastrophic forgetting.
Comment by imtringued 6 months ago
I'm not an expert, but MoE models perform better at continuous learning, because they are less prone to catastrophic forgetting.