Comment by dist-epoch 8 days ago
As they typically say: Just Do It (tm).
Start writing some CUDA code to sort an array or find the maximum element.
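As a concrete starting point for the "find the maximum element" exercise mentioned above, here is a minimal sketch. The array size, block size, grid-stride loop, and the use of atomicMax are my own choices for illustration, not anything prescribed in the thread.

```cuda
#include <cstdio>
#include <climits>
#include <cuda_runtime.h>

__global__ void maxKernel(const int *in, int n, int *result) {
    __shared__ int cache[256];                        // one slot per thread in the block
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    int local = INT_MIN;

    // Grid-stride loop: each thread folds several elements into a local max.
    for (int i = tid; i < n; i += blockDim.x * gridDim.x)
        local = max(local, in[i]);
    cache[threadIdx.x] = local;
    __syncthreads();

    // Tree reduction within the block.
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (threadIdx.x < stride)
            cache[threadIdx.x] = max(cache[threadIdx.x], cache[threadIdx.x + stride]);
        __syncthreads();
    }

    // One atomic per block combines the block maxima into the final result.
    if (threadIdx.x == 0)
        atomicMax(result, cache[0]);
}

int main() {
    const int n = 1 << 20;
    int *h = new int[n];
    for (int i = 0; i < n; ++i) h[i] = i % 1000;

    int *d_in, *d_out, init = INT_MIN;
    cudaMalloc(&d_in, n * sizeof(int));
    cudaMalloc(&d_out, sizeof(int));
    cudaMemcpy(d_in, h, n * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(d_out, &init, sizeof(int), cudaMemcpyHostToDevice);

    maxKernel<<<256, 256>>>(d_in, n, d_out);

    int result;
    cudaMemcpy(&result, d_out, sizeof(int), cudaMemcpyDeviceToHost);
    printf("max = %d\n", result);

    cudaFree(d_in); cudaFree(d_out); delete[] h;
    return 0;
}
```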
I'd rather learn to use a library that works on any brand of GPU.
If that is not an option, I'll wait!
Then learn PyTorch.
The hardware between brands is fundamentally different. There isn't a standard like x86 for CPUs.
So while you can use something like HIPIFY to translate your code between APIs, with GPU programming it makes sense to learn how the APIs differ from each other, or just pick one and work with it, knowing that the others are variations of the same idea.
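To make the HIPIFY point concrete, here is a rough before/after sketch of the kind of 1:1 renaming it performs. The exact output depends on the tool variant (hipify-perl vs hipify-clang) and version; the translation shown in the comments is illustrative, not copied from real tool output.

```cuda
// CUDA original:
#include <cuda_runtime.h>

__global__ void scale(float *x, float a, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= a;
}

void run(float *d_x, int n) {
    scale<<<(n + 255) / 256, 256>>>(d_x, 2.0f, n);
    cudaDeviceSynchronize();
}

// Typical HIP translation: the kernel body stays unchanged, only the header
// and runtime API calls are renamed (HIP also accepts the <<<>>> launch syntax):
//   #include <cuda_runtime.h>    ->  #include <hip/hip_runtime.h>
//   cudaDeviceSynchronize()      ->  hipDeviceSynchronize()
//   cudaMalloc / cudaMemcpy      ->  hipMalloc / hipMemcpy
```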
The jobs requiring CUDA experience exist, most of the time, because Torch is not good enough.
Isn't this basically what Mojo is attempting? "Vendor independent GPU programmability", according to Modular.
This is continuously a point of frustration! Vulkan compute is... suboptimal. I use Cuda because it feels like the only practical option. I want Vulkan or something else to compete seriously, but until that happens, I will use Cuda.
Is https://github.com/KomputeProject/kompute + https://shader-slang.org/ getting there?
Runs on anything + auto-differentiation.
It took until Vulkanised 2025 to acknowledge that Vulkan became the same mess as OpenGL, and to put a plan into action to try to correct this.
Had it not been for Apple's initial OpenCL contribution (regardless of how it went from there), AMD's Mantle as the starting point for Vulkan, and NVidia's Vulkan-Hpp and Slang, the ecosystem of Khronos standards would be much worse.
Also, Vulkan tooling isn't as bad as OpenGL's, because LunarG exists and someone pays them to maintain the whole Vulkan SDK.
The attitude of "we publish paper standards" and the community should step in with implementations and tooling hardly comes close to the productivity of proprietary API tooling.
All GPU vendors, including Intel and AMD, would also rather push their own compute APIs, even if they are built on top of Khronos ones.
K, bud.
Perhaps you haven't noticed, but you're in a thread that asked about CUDA, explicitly.
I concur with this. Then supplement with resources A/R. Ideally, find some tasks in your programs that are parallelizable (learning to spot these is important too!) and switch them to CUDA. If you don't have any, make a toy case, e.g. an n-body simulation.
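For the n-body toy case, a minimal sketch of the core kernel follows. The all-pairs O(N^2) approach, unit masses, float3 layout, and the softening constant are my own choices for illustration; a real simulation would also need an integration step.

```cuda
#include <cuda_runtime.h>

__global__ void accelKernel(const float3 *pos, float3 *acc, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    const float soft2 = 1e-6f;                 // softening term to avoid division by zero
    float3 a = make_float3(0.f, 0.f, 0.f);

    // Each thread owns one body and loops over all others: the outer loop of the
    // serial version becomes the thread index, which is the parallelization step.
    for (int j = 0; j < n; ++j) {
        float dx = pos[j].x - pos[i].x;
        float dy = pos[j].y - pos[i].y;
        float dz = pos[j].z - pos[i].z;
        float r2 = dx * dx + dy * dy + dz * dz + soft2;
        float invR3 = rsqrtf(r2 * r2 * r2);    // 1 / r^3, assuming unit masses
        a.x += dx * invR3;
        a.y += dy * invR3;
        a.z += dz * invR3;
    }
    acc[i] = a;
}

// Launch sketch: accelKernel<<<(n + 255) / 256, 256>>>(d_pos, d_acc, n);
```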