Comment by wmf

Comment by wmf 6 months ago

What if they understand that and they don't care? Getting one hyperscaler as a customer is worth more than the entire long tail.

stingraycharles 6 months ago

The problem is that this is short-term thinking. You need students and professionals playing around with your tools at home and/or on their work computers to drive hyperscale demand in the long term.

This is why it’s so important that AMD gets their act together quickly, as the benefits of these kinds of things are measured in years, not months.

lhl 6 months ago

On the corp side you have FB w/ PyTorch, xformers (still pretty iffy on AMD support tbh) and MS w/ DeepSpeed. But let's see about some others:

Flash Attention: academia, 2y behind for AMD support

bitsandbytes: academia, 2y behind for AMD support

Marlin: academia, no AMD support

FlashInfer: academia/startup, no AMD support

ThunderKittens: academia, no AMD support

DeepGEMM, DeepEP, FlashMLA: ofc, nothing from China supports AMD

Without the long tail, AMD will always be in a position where they have to scramble to add second-tier support themselves years later, while Nvidia continues to get all the latest and greatest for free.

This is just off the top of my head on the LLM side, which is where I'm focused, btw. Whenever I look at image/video it's even more grim.

jimmySixDOF 6 months ago

Modular says Max/Mojo will change this and make refactoring between different vendors (and different lines of the same vendor) less of a showstopper, but that's TBD for now.

- pjmlp 6 months ago
  
  The jury is still out on whether Max/Mojo is going to be something that the large majority cares about.
  

selectodude 6 months ago

Then they’re fools. Every AI maestro knows CUDA because they learned it at home.

jiggawatts 6 months ago

It’s the same reason there’s orders of magnitude more code written for Linux than for mainframes.

danielheath 6 months ago

Why would a hyperscaler pick the technology that’s harder to hire for (because there’s no hobbyist-to-expert pipeline)?

moffkalast 6 months ago

Then they will stay irrelevant in the GPU space like they have been so far.

littlestymaar 6 months ago

Why should we care about them if they don't care?

I mean, if they want to stay at a fraction of the market value and profit of their direct competitor, good for them.

dummydummy1234 6 months ago

I want a competitive market so I can have cheaper GPUs.
It's Nvidia, AMD, and maybe Intel.
