Comment by FridgeSeal

What does “borking the projection matrices” and affecting the BPE tokeniser mean/look like here?

Are we just trying to produce content that will pass as human-like (therefore get stripped out by coarse filtering) but has zero or negative informational utility to the model? That would mean, theoretically if enough is trained on it would actively worsen the model performance right?