Comment by beepbooptheory 2 days ago
Even granting that we can disregard a really huge factor here, which I'm not sure we really can, one cannot know before training how the clustering of the vocabulary is going to turn out, and it's speculated that we get random particularities both at the centers and at the edges of clusters. Hence the "SolidGoldMagikarp" phenomenon and many others.