halJordan 2 days ago

Honestly, that was a hard read. I hope that guy gets an MI355 just for writing this.

AMD deserves exactly zero of the credulity this writer heaps onto them. They just spent four months after launch not supporting their RDNA4 lineup in ROCm; AMD is functionally capable of day-120 support, not day 1. None of the benchmarks disambiguated where the performance is coming from. They are 100% misleading on some level, e.g. representing their FP4 performance against FP8/FP16.

jchw 2 days ago

I still find their delay in properly investing in ROCm on client cards rather shocking, but in fairness they did finally announce day-1 support for client cards[1]. Of course, AMD has to keep the promise for it to matter, but they really do seem to have finally realized, for whatever reason, just how important it is that ROCm is well-supported across their entire stack (among many other investments they've announced recently).

It's baffling that AMD is the same company that makes both Ryzen and Radeon, but the year to date for Radeon has been very good, aside from official ROCm support for RDNA4 taking far too long. I wouldn't get overly optimistic; even if AMD finally commits hard to ROCm and Radeon, it doesn't mean they'll be able to compete effectively against NVIDIA. But the consumer showing with the 9070 XT and FSR4 hasn't been bad so far, so I'm cautiously optimistic they've decided to miss some opportunities to miss opportunities. Let's see how long these promises last... maybe longer than a Threadripper socket, if we're lucky :)

[1]: https://www.phoronix.com/news/AMD-ROCm-H2-2025

  • roenxi a day ago

    Is this day-1 support a claim about the future, or something they've demonstrated? Because if it involves the future, it is safer to just assume AMD will muck it up somehow when it comes to their AI chips. It isn't like their failure in this space is a weird one-off; it has been confusingly systemic for years. It'd be nice if they pulled it off, but it could easily be day-1 support for a chip that turns out to crash the computer.

    I dunno; I suppose they can execute on server parts. But regardless, a good plan here is to let someone else go first and report back.

    • jchw a day ago

      They've been able to execute well for Ryzen, EPYC, and Radeon in the data center. I don't really think there's any reason to believe they can't do ROCm on client cards; it's just that, until recently, they wouldn't commit.

pclmulqdq 2 days ago

AMD doesn't care about you being able to do computing on their consumer GPUs. The datacenter GPUs have a pretty good software stack and great support.

  • fc417fc802 2 days ago

    I'm inclined to believe it, but that difference is exactly how Nvidia got so far ahead of them in this space. They've consistently gone out of their way to put their GPGPU hardware and software in the hands of the average student and professional, and the results speak for themselves.

    • tormeh a day ago

      I wouldn't say so. Nvidia bet on machine learning a decade or so before AMD got the memo. That was a good bet on Nvidia's part. In 2015 you just had to have an Nvidia card if you wanted to do ML research. Sure, Nvidia did hand them out in some cases, but even if you bought an AMD card it just wouldn't work. It was Nvidia or go home. Even if AMD now did everything right (and they don't), there's a decade+ of momentum in Nvidia's favor.

    • zombiwoof a day ago

      Just look at the disaster of ROCm, where you need to spend $300k on software engineers to get anything to work.

  • stingraycharles 2 days ago

    Yes, but then they fail to understand that a lot of the “long tail” of home projects, open-source work, etc. happens on consumer GPUs at home, which is tremendously important for ecosystem support.

    • wmf a day ago

      What if they understand that and they don't care? Getting one hyperscaler as a customer is worth more than the entire long tail.

      • lhl a day ago

        On the corp side you have FB w/ PyTorch and xformers (still pretty iffy on AMD support, tbh) and MS w/ DeepSpeed. But let's see about some others:

        Flash Attention: academia, 2y behind for AMD support

        bitsandbytes: academia, 2y behind for AMD support

        Marlin: academia, no AMD support

        FlashInfer: academia/startup, no AMD support

        ThunderKittens: academia, no AMD support

        DeepGEMM, DeepEP, FlashMLA: ofc, nothing from China supports AMD

        Without the long tail, AMD will always be in a position where they have to scramble to add second-tier support years later themselves, while Nvidia continues to get all the latest and greatest for free.

        This is just off the top of my head on the LLM side, where I'm focused, btw. Whenever I look at image/video, it's even more grim.

      • stingraycharles a day ago

        The problem is that this is short-term thinking. You need students and professionals playing around with your tools at home and/or on their work computers to drive hyperscaler demand in the long term.

        This is why it's so important that AMD gets their act together quickly: the benefits of these kinds of things are measured in years, not months.

      • selectodude a day ago

        Then they’re fools. Every AI maestro knows CUDA because they learned it at home.

        • jiggawatts a day ago

          It’s the same reason there’s orders of magnitude more code written for Linux than for mainframes.

      • danielheath a day ago

        Why would a hyperscaler pick the technology that’s harder to hire for (because there’s no hobbyist-to-expert pipeline)?

      • moffkalast a day ago

        Then they will stay irrelevant in the GPU space like they have been so far.

      • littlestymaar a day ago

        Why should we care about them if they don't care?

        I mean, if they want to stay at a fraction of the market value and profit of their direct competitor, good for them.

        • dummydummy1234 a day ago

          I want a competitive market so I can have cheaper gpus.

          It's Nvidia, AMD, and maybe Intel.

    • cma 2 days ago

      Nvidia started removing NVLink with the RTX 4000 series; they aren't heavily focused on it anymore either, and want to sell the workstation cards for uses like training models at home.

  • archerx 2 days ago

    If they care about their future, they should. I am a die-hard AMD supporter, and even I am getting tired of their mediocrity and what seems to be constant self-sabotage in the GPU department.

    • zombiwoof a day ago

      It's the AMD management. They just keep recycling 20-year VP lifers at AMD to take over key projects.

      • archerx a day ago

        They could have slapped 48GB of VRAM on their new Radeon cards and they would have instantly sold out, but that would cut into their cousin's profit margin at Nvidia, so that's obviously a no-go.

  • booder1 2 days ago

    I have trained on both large AMD and Nvidia clusters, and you're right, AMD support is good. I never had to talk to Nvidia support; that was better.

    They should care about the availability of their hardware so large customers don't have to find and fix their bugs. Let consumers do that...

  • pjmlp a day ago

    Except they forget that people come to adopt technologies by learning them on their consumer hardware.

  • echelon 2 days ago

    > AMD doesn't care about you being able to do computing on their consumer GPUs

    Makes it a little hard to develop for without consumer GPU support...

  • caycep 2 days ago

    this is ROCm?

    • fooblaster 2 days ago

      Yes; the MI300X/MI250 are best supported, as they directly compete with the data-center GPUs from Nvidia that actually make money. Desktop is a rounding error by comparison.

  • shmerl a day ago

    Aren't they addressing this with the unified UDNA architecture? That's going to be a thing in future GPUs, with consumer and datacenter parts sharing the same arch.

    Different architectures were probably a big reason for the above issue.