Comment by Philpax
Hence the "if" :-)
ROCm is getting some adoption, especially as some of the world's largest public supercomputers have AMD GPUs.
Some of this is also being solved by working at a different abstraction layer; with PyTorch you can sometimes stay ignorant of the hardware you're running on. It's still leaky, but it's something — see the sketch below.
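For example, a minimal device-agnostic sketch (the model and sizes here are just placeholders): ROCm builds of PyTorch expose AMD GPUs through the same `torch.cuda` API, so the same code path covers NVIDIA and AMD, with MPS and CPU as fallbacks.

```python
import torch

# Pick whatever accelerator is available. ROCm/HIP builds of PyTorch
# report AMD GPUs via torch.cuda, so this branch covers NVIDIA and AMD.
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():  # Apple Silicon
    device = torch.device("mps")
else:
    device = torch.device("cpu")

# The model/tensor code is identical regardless of backend.
model = torch.nn.Linear(128, 10).to(device)
x = torch.randn(32, 128, device=device)
y = model(x)
```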
Look at the state of PyTorch’s CI pipelines and you’ll immediately see that ROCm is a nightmare — especially nowadays, when the TPU and MPS backends, despite missing features, rarely cause cascading failures throughout the stack.