Comment by FridgeSeal 4 hours ago
What does “borking the projection matrices” and affecting the BPE tokeniser mean/look like here?
Are we just trying to produce content that will pass as human-like (and therefore not get stripped out by coarse filtering) but has zero or negative informational utility to the model? That would mean that, theoretically, if a model is trained on enough of it, it would actively worsen the model's performance, right?