Comment by dauertewigkeit 17 hours ago

CNNs and Transformers are both really simple and intuitive, so I don't think there was any stroke of genius in how they were devised.

Their success is due to datasets and to the tooling that allowed models to be trained on large amounts of data sufficiently fast, using GPU clusters.

yzydserd 17 hours ago

Exactly right. Neatly said by the author in the linked article.

> I spent years building ImageNet, the first large-scale visual learning and benchmarking dataset and one of three key elements enabling the birth of modern AI, along with neural network algorithms and modern compute like graphics processing units (GPUs).

Datasets + NNs + GPUs. Three "vastly different" advances that came together. ImageNet was THE dataset.

byearthithatius 15 hours ago

"CNNs and Transformers are both really simple and intuitive" and labeling a bunch of images you downloaded is not simple and intuitive? It was a team effort and I would hardly call a single dataset what drove modern ML. Most of currently deployed modern ML wasn't trained on that dataset and didn't come from models trained on it.