Comment by moralestapia a day ago
???
Know any LLMs that are implemented in CUDA?
Wrong.
Show me a single CUDA kernel in Llama's source code.
(And that's a really easy one, if you know a bit about it.)
Wrong.
It is the same PyTorch whether it runs on an AMD or an NVIDIA GPU. The exact same PyTorch, actually.
Are you trying to suggest that the machine code that runs on the GPU is what differs? If you knew a bit more, you would know that this is the case even between different generations of GPUs from the same vendor, which makes that argument completely absurd.
The average consumer uses llama.cpp. So here is your list of kernels: https://github.com/ggml-org/llama.cpp/tree/master/ggml/src/g...
And here is pretty damning evidence that you're full of shit: https://github.com/ggml-org/llama.cpp/blob/master/ggml/src/g...
The ggml-hip backend references the ggml-cuda kernels. The "software is the same" (as in, it is CUDA), and yet AMD is still behind.
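For context, here is a minimal sketch of what a CUDA kernel looks like: a generic element-wise add in the style of the simpler ggml-cuda kernels, not actual llama.cpp source. The point about the HIP backend is that source like this can typically be compiled for AMD GPUs with hipcc largely unchanged, which is what lets ggml-hip reuse the ggml-cuda kernels.

```cuda
#include <cstdio>

// Illustrative element-wise add kernel (not from llama.cpp; the real
// ggml-cuda kernels are considerably more involved).
__global__ void add_f32(const float *x, const float *y, float *dst, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        dst[i] = x[i] + y[i];
    }
}

int main() {
    const int n = 1024;
    float *x, *y, *dst;
    // Unified memory keeps the example short; real backends manage
    // device buffers explicitly.
    cudaMallocManaged(&x, n * sizeof(float));
    cudaMallocManaged(&y, n * sizeof(float));
    cudaMallocManaged(&dst, n * sizeof(float));
    for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

    // Launch one thread per element, 256 threads per block.
    add_f32<<<(n + 255) / 256, 256>>>(x, y, dst, n);
    cudaDeviceSynchronize();

    printf("dst[0] = %f\n", dst[0]);
    cudaFree(x); cudaFree(y); cudaFree(dst);
    return 0;
}
```

Under HIP, the same kernel body compiles as-is; only the runtime calls (cudaMallocManaged and friends) get mapped to their hip* equivalents, which ggml does through a thin aliasing layer.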
Ultimately all of them except Gemini.