antinomicus 3 hours ago

Isn’t the whole point to run your model locally?

  • theptip 3 hours ago

    No, that’s clearly not a goal of this project.

    This is a learning tool. If you want a local model you are almost certainly better off using something trained on far more compute (DeepSeek, Qwen, etc.).

  • yorwba 3 hours ago

    The 80 GB is for training with a batch of 32 sequences of 2048 tokens each. Since the model has only about 560M parameters, you could probably run it on CPU, if a bit slowly.
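
    Back-of-the-envelope in Python (the byte counts and the Adam layout are my assumptions, not measured numbers):

      # Rough memory estimate for a ~560M-parameter model.
      params = 560e6

      # Inference: weights only, fp16/bf16 at 2 bytes per parameter.
      inference_gb = params * 2 / 1e9    # ~1.1 GB

      # Training with Adam: fp32 weights + gradients + two moment buffers,
      # i.e. 16 bytes per parameter, before any activation memory.
      train_states_gb = params * 16 / 1e9    # ~9 GB

      # Activations for 32 sequences x 2048 tokens account for much of
      # the remaining 80 GB budget.
      print(f"inference ~{inference_gb:.1f} GB, states ~{train_states_gb:.1f} GB")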

  • simonw 2 hours ago

    You can run a model locally on much less expensive hardware. It's training that requires the really big GPUs.
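
    A minimal sketch of what "locally on much less expensive hardware" looks like, using a similarly sized (~0.5B) open model as a stand-in; this project's own checkpoint would need its own loading code:

      # Runs a ~0.5B-parameter chat model on plain CPU via transformers.
      from transformers import AutoModelForCausalLM, AutoTokenizer

      name = "Qwen/Qwen2.5-0.5B-Instruct"  # stand-in of comparable size
      tok = AutoTokenizer.from_pretrained(name)
      model = AutoModelForCausalLM.from_pretrained(name)  # CPU by default

      inputs = tok("Why is the sky blue?", return_tensors="pt")
      out = model.generate(**inputs, max_new_tokens=40)
      print(tok.decode(out[0], skip_special_tokens=True))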

  • jsight 3 hours ago

    I'd guess that this will output tokens faster than the average reader can read, even with CPU-only inference on a modern-ish machine.

    The param count is small enough that even cheap (<$500) GPUs would work too.
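
    A quick sanity check on the reading-speed claim (the bandwidth and reading-rate numbers are ballpark assumptions):

      # Single-stream decoding is roughly memory-bandwidth bound:
      # every generated token reads all the weights once.
      params = 560e6
      bytes_per_param = 2      # fp16/bf16 weights (assumption)
      mem_bw = 40e9            # ~40 GB/s, typical desktop DDR (assumption)

      tokens_per_sec = mem_bw / (params * bytes_per_param)  # ~35 tok/s
      reading_wps = 5          # ~5 words/s is already a fast reader

      print(f"~{tokens_per_sec:.0f} tokens/s vs ~{reading_wps} words/s")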