Comment by littlestymaar
Comment by littlestymaar 4 hours ago
> The best way to drive inference cost down right now is to use TPUs
TPUs are cool, but the biggest lever is still reducing your (active) parameter count.