Comment by pton_xd 2 days ago

That's pretty much the state of things today. Frontier LLMs are already trained on essentially all publicly available human-generated text, and they are already being trained heavily on synthetic data to improve at verifiable tasks, e.g. coding.