Comment by beepbooptheory

Comment by beepbooptheory 2 days ago

0 replies

Even granting that we can disregard a really huge factor here, which I'm not sure we really can, one can not know beforehand how the clustering of the vocabulary is going to go pre-training, and its speculated that both at the center and at the edges of clusters we get random particularities. Hence the "solidgoldmagikarp" phenomenon and many others.