Comment by nulld3v
https://github.com/triton-lang/triton/pull/7298#discussion_r...
> By disassembly of ptxas, it is indeed hard-coded that they have logic like: strstr(kernel_name, "cutlass").
> it is likely that, this is an unstable, experimental, aggressive optimization by NVIDIA, and blindly always enabling it may produce some elusive bugs.
Thanks for a little context, this is not my wheelhouse at all (never even heard of this project) and I could not make heads or tails of the title or the linked PR.