Comment by fredoliveira 3 days ago

> whole of English Wikipedia baked into them (IIRC it constitutes the bulk of the training data for pretty much all of them)

Not a dig at anything you're saying (I agree that just shoving a link into an LLM and asking for a summary is a horrendous stand-in for learning), but it's worth correcting: Wikipedia is a very small fraction (certainly under 1%) of the training corpus for LLMs these days.