Comment by doctorpangloss

doctorpangloss 9 hours ago

Hmm, but supposing the accelerated NVIDIA-specific inference data types were available in Triton, you would just use that? Why not contribute to Triton? They accept PRs. So what if that means doing free product ecosystem development for NVIDIA and giant corporations?

qeternity 8 hours ago

Second line of the post:

> The main objective is to learn writing attention in CUDA C++, since many features are not available in Triton, such as MXFP8 / NVFP4 MMA for sm120.


doctorpangloss an hour ago

Yes… I read it. If the feature is missing, why not contribute it instead?

- almostgotcaught an hour ago
  
  How many PRs have you landed in Triton that you can just blithely say "contribute it"?
  