Comment by Javantea_
I'm surprised no one in the comments has mentioned overfitting. Perhaps it's too obvious, but I think of it as a very clear bug in a model if it asserts something to be true because it has heard it once. I realize that training a model is not easy, but this is something that should've been caught before release. Either QA is sleeping on the job, or they have intentionally shipped a model with serious flaws in its design/training. I also understand the intense pressure to release early and often, but this kind of flaw isn't just a warning sign.
It's apparently well known among LLM researchers that the best epoch count for LLM training is one: they go through the entire dataset a single time, and that produces the best LLMs.
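A toy sketch of the intuition (this is not real LLM training code; the one-parameter "model", learning rate, and data are made up for illustration): repeatedly looping over the same lone example drives the model to memorize it, while a single pass leaves it only partially fit.

```python
def train(data, epochs, lr=0.1):
    """Gradient descent on a one-parameter model: predict y = w * x."""
    w = 0.0
    for _ in range(epochs):
        for x, y in data:
            pred = w * x
            grad = 2 * (pred - y) * x  # derivative of (w*x - y)^2 w.r.t. w
            w -= lr * grad
    return w

data = [(1.0, 2.0)]            # a single example "heard once"
w_one = train(data, epochs=1)   # one pass: w = 0.4, error 1.6
w_many = train(data, epochs=100)  # many passes: w ~= 2.0, near-exact recall

print(abs(w_one * 1.0 - 2.0))   # large residual after one epoch
print(abs(w_many * 1.0 - 2.0))  # tiny residual: the sample is memorized
```

The same dynamic, scaled up, is one reason single-epoch training is favored: every extra pass over the corpus pushes the model further toward reciting its training data.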
They know. An LLM is a novel compression format for text (holographic memory, or whatever). The question is whether the rest of the world accepts this technology as it is or not.