Comment by dauertewigkeit
Comment by dauertewigkeit 17 hours ago
CNNs and Transformers are both really simple and intuitive so I don't think there is any stroke of genius in how they were devised.
Their success is due to datasets and the tooling that allowed models to be trained on large amounts of data, sufficiently fast using GPU clusters.
Exactly right. Neatly said by the author in the linked article.
> I spent years building ImageNet, the first large-scale visual learning and benchmarking dataset and one of three key elements enabling the birth of modern AI, along with neural network algorithms and modern compute like graphics processing units (GPUs).
Datasets + NNs + GPUs. Three "vastly different" advances that came together. ImageNet was THE dataset.