the__alchemist 8 days ago

I concur with this. Then supplement with resources A/R. Ideally, find some tasks in your programs that are parallelizable (learning to identify these is important too!), and switch them to CUDA. If you don't have any, make a toy case, e.g. an n-body simulation.
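A minimal sketch of the kind of kernel that port leads to (names and launch sizes are illustrative, not a tuned implementation): a direct-sum n-body acceleration pass, one thread per body.

```cuda
#include <cuda_runtime.h>

// Direct-sum O(n^2) gravitational accelerations: one thread per body.
// pos holds (x, y, z, mass) per body; acc receives the resulting (ax, ay, az).
__global__ void accelKernel(const float4 *pos, float3 *acc, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    float3 a = make_float3(0.f, 0.f, 0.f);
    for (int j = 0; j < n; ++j) {
        float3 d = make_float3(pos[j].x - pos[i].x,
                               pos[j].y - pos[i].y,
                               pos[j].z - pos[i].z);
        // Softening term avoids division by zero when i == j.
        float r2 = d.x * d.x + d.y * d.y + d.z * d.z + 1e-9f;
        float invR3 = rsqrtf(r2 * r2 * r2);
        a.x += pos[j].w * d.x * invR3;  // pos[j].w is the mass
        a.y += pos[j].w * d.y * invR3;
        a.z += pos[j].w * d.z * invR3;
    }
    acc[i] = a;
}

// Launch: one thread per body, rounded up to whole 256-thread blocks.
// accelKernel<<<(n + 255) / 256, 256>>>(d_pos, d_acc, n);
```

The inner loop over j stays serial per thread; the parallelism comes from mapping the outer loop over bodies onto the grid, which is exactly the "find the parallelizable loop" exercise.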

amelius 8 days ago

I'd rather learn to use a library that works on any brand of GPU.

If that is not an option, I'll wait!

  • latchkey 8 days ago

    Then learn PyTorch.

    The hardware between brands is fundamentally different. There isn't a standard like x86 for CPUs.

    So while you may use something like HIPIFY to translate your code between APIs, with GPU programming it makes sense to learn how the vendors differ from each other, or just pick one and work with it, knowing that the others will be some variation of the same idea.

    • horsellama 8 days ago

      the jobs requiring CUDA experience exist, most of the time, because torch alone is not good enough

  • labberdabberdoo 2 days ago

    Isn't this basically what Mojo is attempting? "Vendor independent GPU programmability", according to Modular.

  • pjmlp 8 days ago

    If only Khronos and the competition cared about the developer experience....

    • the__alchemist 8 days ago

      This is a continual point of frustration! Vulkan compute is... suboptimal. I use CUDA because it feels like the only practical option. I want Vulkan or something else to compete seriously, but until that happens, I will use CUDA.

      • pjmlp 8 days ago

        It took until Vulkanised 2025 for Khronos to acknowledge that Vulkan had become the same mess as OpenGL, and to put an action plan in place to try to correct this.

        Had it not been for Apple's initial OpenCL contribution (regardless of how it went from there), AMD's Mantle as the starting point for Vulkan, and NVidia's Vulkan-Hpp and Slang, the ecosystem of Khronos standards would be much worse.

        Also, Vulkan tooling isn't as bad as OpenGL's, because LunarG exists and someone pays them to maintain the whole Vulkan SDK.

        The attitude of "we publish paper standards" and expect the community to step in with the implementations and tooling hardly matches the productivity of the tooling around proprietary APIs.

        All GPU vendors, including Intel and AMD, would also rather push their own compute APIs, even if those are built on top of Khronos ones.

  • Cloudef 8 days ago

    Both Zig and Rust are aiming to compile to GPUs natively. What CUDA and HIP provide is a heterogeneous computing runtime, i.e. hiding the boilerplate of executing code on CPU and GPU seamlessly.
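    A small sketch of that seamlessness (cudaMallocManaged is the real CUDA runtime call; the kernel and sizes are illustrative): unified memory lets the same pointer be touched by plain CPU code and a GPU kernel, with the runtime handling the migration.

    ```cuda
    #include <cuda_runtime.h>
    #include <cstdio>

    __global__ void scale(float *x, float s, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) x[i] *= s;
    }

    int main() {
        const int n = 1024;
        float *x;
        cudaMallocManaged(&x, n * sizeof(float));  // one pointer, visible to CPU and GPU
        for (int i = 0; i < n; ++i) x[i] = 1.0f;   // plain CPU code writes it

        scale<<<(n + 255) / 256, 256>>>(x, 2.0f, n);  // GPU kernel reads/writes the same pointer
        cudaDeviceSynchronize();                      // wait before the CPU reads results

        printf("%f\n", x[0]);
        cudaFree(x);
    }
    ```

    Everything here besides the <<<...>>> launch and the two runtime calls is ordinary C++; that single-source model is the boilerplate-hiding being described.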

  • moralestapia 8 days ago

    K, bud.

    Perhaps you haven't noticed, but you're in a thread that asked about CUDA, explicitly.

  • uecker 8 days ago

    GCC and Clang also support offloading (e.g. via OpenMP target directives).