Comment by Q6T46nT668w6i3m
Comment by Q6T46nT668w6i3m 5 days ago
I agree that “learning CUDA wasn’t particularly difficult to get started,” but there are Grand Canyon-sized chasms between CUDA and its alternatives when you try to crank up performance.
Well, I think to a degree that depends on what you're targeting.
Single socket 8 core CPU? Yes.
If you've spent some time trying to eke out performance on Xeon Phi, written NUMA-aware code for multi-socket boards, and optimised for the L1/L2/L3 memory hierarchy, then it really isn't that different.
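To make the memory hierarchy point concrete, here's a rough C++ sketch of cache blocking (loop tiling) on a matrix transpose. It's the CPU-side cousin of staging tiles in shared memory in a CUDA kernel. The function names and the default tile size of 32 are just illustrative choices of mine, not from any library, and the best tile size depends on the cache sizes of whatever CPU you're on.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: builds an n x n row-major matrix whose element k holds the value k.
std::vector<float> make_matrix(std::size_t n) {
    std::vector<float> m(n * n);
    for (std::size_t k = 0; k < n * n; ++k)
        m[k] = static_cast<float>(k);
    return m;
}

// Naive transpose: reads src row by row but writes dst with a stride of n,
// so for large n almost every write lands on a different cache line.
std::vector<float> transpose_naive(const std::vector<float>& src, std::size_t n) {
    std::vector<float> dst(n * n);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            dst[j * n + i] = src[i * n + j];
    return dst;
}

// Blocked transpose: processes tile x tile sub-squares so that the rows being
// read and the rows being written can both stay cache-resident for the whole tile.
// tile must be greater than zero; edge tiles are clipped when n is not a multiple of tile.
std::vector<float> transpose_blocked(const std::vector<float>& src, std::size_t n,
                                     std::size_t tile = 32) {
    std::vector<float> dst(n * n);
    for (std::size_t ii = 0; ii < n; ii += tile) {
        for (std::size_t jj = 0; jj < n; jj += tile) {
            const std::size_t i_end = std::min(ii + tile, n);
            const std::size_t j_end = std::min(jj + tile, n);
            for (std::size_t i = ii; i < i_end; ++i)
                for (std::size_t j = jj; j < j_end; ++j)
                    dst[j * n + i] = src[i * n + j];
        }
    }
    return dst;
}
```

Both functions produce the same result; the only difference is the order in which memory is touched, which is the same kind of reasoning you do when picking a tile shape for a CUDA thread block.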