Comment by int_19h
I don't see how that is evidence of the claim. We are doing all these things because they make existing models work better, but a larger model with RAG etc. is still better than a small one, and everyone keeps working on larger models.
There is a contingent, which I think Marcus is responding to, that has been claiming that all we need to get to AGI or ASI is pure transformer scaling, and that we were very close, with only maybe $10B or $100B more investment needed to get there. If the last couple of years of research have given us only incrementally better models, to the point that even the best-funded teams are moving to hybrid approaches, then that's evidence that Marcus is correct.