Comment by lokimedes 8 days ago
There are a couple of “concerns” you can separate to make this a bit more tractable:
1. Learning CUDA — the framework, libraries, and high-level wrappers. This is something that changes with the times and with trends.
2. Learning high-performance computing approaches. While a GPU and the NVLink interconnect are Nvidia-specific, working in a massively parallel, distributed computing environment is a general branch of knowledge that translates across HPC architectures.
3. Application specifics. If your thing is Transformers, you may just as well start from Torch, TensorFlow, etc. and rely on the current high-level abstractions, letting them inspire your learning down toward the fundamentals.
I’m no longer active in any of the above, so I can’t be more specific, but if you want to master CUDA, I would say that learning how massively parallel programming works is the foundation that translates into transferable skills.
Former GPU guy here. Yeah, that's exactly what I was going to suggest too, with emphasis on #2 and #3. What kind of jobs are they trying to apply for? Is it really CUDA that they need to be familiar with, or CUDA-based libraries like cuDNN, cuBLAS, cuFFT, etc.?
Understanding the fundamentals of parallel programming comes first, IMO.
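One such fundamental, independent of any GPU vendor, is decomposing a problem into independent partial results and then combining them — the pattern behind every parallel reduction. A minimal sketch with `std::thread` (function name and chunking scheme are illustrative):

```cpp
#include <algorithm>
#include <numeric>
#include <thread>
#include <vector>

// Parallel sum: split the input into contiguous chunks, reduce each chunk
// on its own thread, then combine the per-thread partial sums serially.
double parallel_sum(const std::vector<double>& data, unsigned num_threads) {
    std::vector<double> partial(num_threads, 0.0);  // one slot per thread: no sharing
    std::vector<std::thread> workers;
    size_t chunk = (data.size() + num_threads - 1) / num_threads;
    for (unsigned t = 0; t < num_threads; ++t) {
        size_t begin = t * chunk;
        size_t end = std::min(data.size(), begin + chunk);
        if (begin >= end) break;  // fewer chunks than threads
        workers.emplace_back([&, t, begin, end] {
            partial[t] = std::accumulate(data.begin() + begin,
                                         data.begin() + end, 0.0);
        });
    }
    for (auto& w : workers) w.join();  // wait for all chunks to finish
    return std::accumulate(partial.begin(), partial.end(), 0.0);
}
```

The same shape — partition, reduce locally, combine — shows up in CUDA block reductions, MPI all-reduce, and MapReduce; once it is internalized, the vendor-specific syntax is the easy part.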