Comment by chao- 16 hours ago
Comparing this against mobile dGPUs and the (finally real) DGX Spark, this feels like a latent market segment that has not arrived at its final form. I don't know what delayed the DGX Spark so long, but it granted AMD a huge boon by allowing them to capture some market mindshare first.
Compared to discrete GPUs (mobile or not), the advantage of a dGPU is memory bandwidth. The disadvantage of a dGPU is power draw and memory capacity—if we set aside CUDA, which I grant is a HUGE thing to just "set aside".
If we mix in the small DGX Spark desktops, then those have an additional advantage in the dual 200Gb network ports that allow for RDMA across multiple boxes. One could get more out of a small stack (2, 3 or 4) of those than from the same number of Strix Halo 395 boxes. However, as sexy as my homelab-brain finds a small stack of DGX Spark boxes with RDMA, I would think that for professional use, I would rather have a GPU server (or Threadripper GPU workstation) than four DGX Spark boxes?
Because the DGX Spark isn't being sold in a laptop (AFAIK, CMIIW), that is another differentiator in favor of the Strix Halo. Once again, it points to this being a weird, emerging market segment, and I expect the next generation or two will iterate towards how these capabilities really ought to be packaged.
Next gen, AMD has the Medusa Halo with (reportedly) a 384-bit LPDDR6 bus. This should get you twice the memory of what Strix Halo has with 1.7 times the throughput when using memory that's already announced, with even better modules coming later.
I think with the success of Strix Halo as an inference platform, this market segment is here to stay.