Comment by touisteur

Comment by touisteur 17 hours ago

If you don't need full IEEE 754 double precision, the Ozaki scheme (emulating double-precision matrix multiplication on tensor cores) might do the trick. Limited support for it has been added to cuBLAS recently.
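
For anyone unfamiliar with the idea, here is a rough pure-Python sketch of the slicing trick behind this kind of emulation, shown on a dot product rather than a full GEMM. It is only an illustration under my own assumptions, not how cuBLAS implements it: the names `split_slices` and `emulated_dot` are made up, and the choice of 8 slices of 7 bits each is arbitrary. Each vector is scaled by a power of two and cut into small integer slices, so that every slice-by-slice product can be accumulated exactly in integer arithmetic (the part that low-precision matrix units would do), and only the final recombination happens in floating point.

```python
import math

def split_slices(vec, num_slices=8, bits=7):
    """Split a vector into small-integer slices sharing one power-of-two scale.

    Hypothetical helper: after scaling, each element is approximated as
    sum_i slices[i][k] * 2**(-(i + 1) * bits), times 2**exponent.
    """
    max_abs = max(abs(x) for x in vec)
    # frexp gives max_abs = m * 2**exponent with 0.5 <= m < 1.
    exponent = math.frexp(max_abs)[1] if max_abs > 0 else 0
    # Scaling by a power of two is exact; all magnitudes are now below 1.
    residual = [math.ldexp(x, -exponent) for x in vec]
    slices = []
    for _ in range(num_slices):
        shifted = [math.ldexp(r, bits) for r in residual]
        chunk = [int(s) for s in shifted]  # truncate toward zero, |q| < 2**bits
        residual = [s - q for s, q in zip(shifted, chunk)]  # exact remainder
        slices.append(chunk)
    return slices, exponent

def emulated_dot(a, b, num_slices=8, bits=7):
    """Dot product built from exact integer products of slices."""
    a_slices, a_exp = split_slices(a, num_slices, bits)
    b_slices, b_exp = split_slices(b, num_slices, bits)
    total = 0.0
    # Add the least significant slice pairs first.
    for i in reversed(range(num_slices)):
        for j in reversed(range(num_slices)):
            # Exact integer accumulation: stands in for the low-precision unit.
            partial = sum(x * y for x, y in zip(a_slices[i], b_slices[j]))
            total += math.ldexp(float(partial), -(i + j + 2) * bits)
    # Undo the two power-of-two scalings.
    return math.ldexp(total, a_exp + b_exp)

result = emulated_dot([1.5, -2.25, 3.125], [4.0, 0.5, -8.0])
```

Using fewer slices lowers both the cost and the accuracy, which is why this fits the "don't need full double precision" case: the number of slices is the knob that trades precision for speed.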