wasabi991011 a day ago

That's not what they are saying. SOTA models are trained on much more than just language, and the scale of the training data is related to the model's "intelligence". Restricting the corpus in time => less training data => less intelligence => less ability to "discover" new concepts that aren't in the training data.

withinboredom 2 hours ago

You could always train them on data up to 2015-ish and then see if you can rediscover LLMs. There's plenty of data.
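
The filtering step itself is trivial; here's a minimal sketch, assuming a JSONL corpus where each record carries a naive ISO-8601 "timestamp" field (both the file layout and the field name are hypothetical):

    import json
    from datetime import datetime

    # Hypothetical cutoff: keep only documents dated before 2015.
    CUTOFF = datetime(2015, 1, 1)

    def filter_corpus(in_path: str, out_path: str) -> None:
        """Keep records whose (assumed) 'timestamp' field, a naive
        ISO-8601 string, predates CUTOFF; drop newer or undated ones."""
        kept = dropped = 0
        with open(in_path) as src, open(out_path, "w") as dst:
            for line in src:
                record = json.loads(line)
                ts_raw = record.get("timestamp")
                if ts_raw and datetime.fromisoformat(ts_raw) < CUTOFF:
                    dst.write(line)
                    kept += 1
                else:
                    dropped += 1
        print(f"kept {kept}, dropped {dropped}")

    if __name__ == "__main__":
        filter_corpus("corpus.jsonl", "corpus_pre2015.jsonl")

The hard part isn't the filter, though: a lot of web text is undated or republished, so a cutoff like this only catches documents with trustworthy timestamps.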

franktankbank a day ago

Perhaps less bullshit, though, was my thought. Was language more restricted then? The scope of ideas?