Comment by throwaway81523 8 days ago
I looked at the CUDA code for Leela Chess Zero and found it pretty understandable, though that was back when Leela used a DCNN instead of transformers. DCNNs are fairly simple and are explained in fast.ai videos that I watched a few years ago, so navigating the Leela code wasn't too difficult. Transformers are more complicated and I want to bone up on them, but I haven't managed to spend any time understanding them.
CUDA itself is just a minor departure from C++, so the language is no big deal if you've used C++ before. But if you're trying to get hired programming CUDA, what that really means is that they want you implementing AI stuff (unless it's game dev). AI programming is a much wider and deeper subject than CUDA itself, so be ready to spend a bunch of time studying and hacking to get up to speed on it. But if you do, you will be in high demand. As mentioned, the fast.ai videos are a great introduction.
In the case of games, that means 3D graphics, which these days is another rabbit hole. I knew a bit about this back in the day, but it is fantastically more sophisticated now and I don't have any idea where to even start.
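To illustrate the point that CUDA is only a small departure from C++, here is a minimal sketch of element-wise vector addition. It is not taken from the Leela code; the names `add_kernel` and `vector_add` are made up for this example, and error checking of the CUDA calls is left out for brevity. The only parts that are not plain C++ are the `__global__` qualifier, the built-in thread index variables, and the `<<<blocks, threads>>>` launch syntax.

```cuda
#include <cstddef>
#include <vector>
#include <cuda_runtime.h>

// __global__ marks a function that runs on the GPU and is launched from the host.
// (Hypothetical kernel name, for illustration only.)
__global__ void add_kernel(const float* a, const float* b, float* out, int n) {
    // Each thread computes one output element, identified by its global index.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = a[i] + b[i];
    }
}

// Host-side wrapper: copy inputs to the GPU, launch the kernel, copy the result back.
// Assumes a and b have the same length. Return codes are ignored in this sketch.
std::vector<float> vector_add(const std::vector<float>& a, const std::vector<float>& b) {
    const int n = static_cast<int>(a.size());
    std::vector<float> out(n);
    if (n == 0) {
        return out;
    }
    const std::size_t bytes = static_cast<std::size_t>(n) * sizeof(float);

    float* d_a = nullptr;
    float* d_b = nullptr;
    float* d_out = nullptr;
    cudaMalloc(&d_a, bytes);
    cudaMalloc(&d_b, bytes);
    cudaMalloc(&d_out, bytes);

    cudaMemcpy(d_a, a.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, b.data(), bytes, cudaMemcpyHostToDevice);

    // Round up so that every element is covered by some thread.
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    add_kernel<<<blocks, threads>>>(d_a, d_b, d_out, n);

    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(out.data(), d_out, bytes, cudaMemcpyDeviceToHost);

    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_out);
    return out;
}
```

Calling `vector_add` from an ordinary `main` and compiling the file with `nvcc` is all it takes to run this on a machine with a CUDA-capable GPU. Everything else (`std::vector`, casts, the wrapper function) is regular C++, which is why the language is the easy part compared to the AI or graphics knowledge around it.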
This is a great idea! This is the code, right? https://github.com/leela-zero/leela-zero
I have two beginner (and probably very dumb) questions. First, why do they rely so heavily on C++/CUDA rather than using only PyTorch/TensorFlow? Are those too slow for training Leela? Second, why is there TensorFlow code?