Comment by empiko
What a great write-up, kudos to the author! I’ve been in the field since 2014, so this really feels like reliving my career. I think one paradigm shift that isn’t fully represented in the article is what we now call “genAI.” Sure, we had all kinds of language models (BERTs, word embeddings, etc.), but in the end, most people used them to build customized classifiers or regression models. Nobody was thinking about “solving” tasks by asking oracle-like models questions in natural language. That was considered completely impossible with our technology even in 2018/19. Some people studied language models, but that definitely wasn’t their primary use case; they were mainly used to support tasks like speech-to-text, grammar correction, or similar applications.
With GPT-3 and later ChatGPT, there was a fundamental shift in how people think about approaching NLP problems. Many established techniques and methods became outdated overnight, and you could suddenly do things that simply weren't feasible before.
> Nobody was thinking about “solving” tasks by asking oracle-like models
I remember this being talked about maybe even earlier than 2018/2019, but the scale of models at the time was still off by at least an order of magnitude from where it had a chance of working. It was the ridiculous scale of GPT that produced the insight that scaling would make it useful.
(Tangentially related: I remember a research project/system from maybe 2010 or earlier that could respond to natural language queries. One of the demos was asking for the distance between cities. It was based on some sort of language parsing and a knowledge graph/database, not deep learning. It would be interesting to read about this again, if anyone remembers it.)