Comment by typpilol 7 hours ago

Won't work at all. Or if it does, it'll be so slow that it never finishes, since it'll have to go to disk for every single calculation.

karpathy 5 hours ago

It will work great with a 40GB GPU, probably a bit less than 2x slower. These are micro models of a few B params at most, and they fit easily in GPU memory during both training and inference.
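For a rough sense of why models this size could fit, here is a back-of-the-envelope sketch. The byte counts are assumptions for illustration (bf16 weights for inference; bf16 weights and gradients plus two fp32 Adam moments for training), not figures from this thread, and activation memory is ignored. `footprint_gb` is a hypothetical helper name.

```python
# Rough memory estimate; all byte counts are assumptions, not measurements.
GPU_GB = 40  # GPU size mentioned in the reply above


def footprint_gb(n_params: float, bytes_per_param: float) -> float:
    """Memory in GB (1 GB = 1e9 bytes) for n_params at bytes_per_param."""
    return n_params * bytes_per_param / 1e9


INFERENCE_BYTES = 2   # assumed: bf16 weights only
TRAINING_BYTES = 12   # assumed: bf16 weights + bf16 grads + two fp32 Adam moments

for n_params in (0.5e9, 1e9, 2e9):
    inf = footprint_gb(n_params, INFERENCE_BYTES)
    train = footprint_gb(n_params, TRAINING_BYTES)
    print(
        f"{n_params / 1e9:.1f}B params: "
        f"inference ~{inf:.0f} GB, training ~{train:.0f} GB (of {GPU_GB} GB)"
    )
```

Real usage depends on batch size, sequence length, and the optimizer actually used, so treat these as lower bounds rather than a guarantee of fit.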