bloppe 13 hours ago

The best way to drive inference cost down right now is to use TPUs. The alternative is to pour huge amounts of additional money and manpower into custom silicon design like Google did, but Google already has a 10-year lead there.

littlestymaar 4 hours ago

> The best way to drive inference cost down right now is to use TPUs

TPUs are cool, but the best leverage remains reducing your (active) parameter count.
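
A back-of-the-envelope sketch of why active parameter count dominates: inference compute per generated token is commonly estimated at roughly 2 FLOPs per active parameter. The model sizes below are made-up illustrative figures, not benchmarks of any real model:

```python
# Rough rule of thumb: ~2 FLOPs per *active* parameter per generated token.
# A mixture-of-experts model activates only a subset of its total parameters,
# so its per-token compute tracks the active count, not the total.

def flops_per_token(active_params: float) -> float:
    """Approximate forward-pass FLOPs for one generated token."""
    return 2.0 * active_params

# Hypothetical comparison: dense 70B vs. an MoE with 13B active params.
dense = flops_per_token(70e9)
moe = flops_per_token(13e9)

print(f"dense 70B      : {dense:.2e} FLOPs/token")
print(f"MoE 13B active : {moe:.2e} FLOPs/token")
print(f"compute ratio  : {dense / moe:.1f}x")
```

Under this estimate, cutting active parameters 5x cuts per-token compute 5x regardless of what hardware it runs on, which is the sense in which it is bigger leverage than switching accelerators.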