Comment by nathanielsimard

Comment by nathanielsimard 20 hours ago

0 replies

One of the main author here, the readme isn't really well up-to-date. We have our own gemm implementation based on CubeCL. It's still moving a lot, but we support tensor cores, use warp operations (Plane Operations in CubeCL), we even added TMA instructions for CUDA.