NanoChat – The best ChatGPT that $100 can buy
(github.com) | 143 points by huseyinkeles 3 hours ago
A GPU with 80GB VRAM costs around $1-3 USD an hour on commodity clouds (i.e., the non-Big-3 bare-metal providers, e.g., https://getdeploying.com/reference/cloud-gpu/nvidia-h100). I think it's accessible to most middle-class users in first-world countries.
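A quick sketch of the arithmetic behind the budget: the hourly rates below are illustrative examples in the $1-3 range quoted above, not actual price quotes from any provider.

```python
# Back-of-the-envelope: how much GPU time does $100 buy?
# Rates are illustrative ($/hr per 80GB GPU), not real quotes.
budget = 100.0
for rate in (1.0, 2.0, 3.0):
    gpu_hours = budget / rate
    # On an 8-GPU node, wall-clock time shrinks roughly 8x
    node_hours = gpu_hours / 8
    print(f"${rate:.0f}/hr -> {gpu_hours:.0f} GPU-hours "
          f"(~{node_hours:.1f} hrs on an 8-GPU node)")
```

So even at the high end of the range, $100 buys tens of GPU-hours, which is why short training runs land in hobbyist territory.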
If I have, let's say, 40GB of VRAM, does it not work at all, or does it just take twice as long to train?
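In general it depends on whether the model plus optimizer state fits at all; if it does, the usual way to trade memory for time is gradient accumulation, which keeps the effective batch size while shrinking activation memory. This is a standard PyTorch pattern sketched below, not necessarily what nanochat itself does.

```python
import torch
import torch.nn as nn

# Gradient-accumulation sketch: same effective batch, roughly
# 1/accum_steps the activation memory, accum_steps more forward passes.
model = nn.Linear(512, 512)
opt = torch.optim.AdamW(model.parameters(), lr=1e-3)
accum_steps = 4  # e.g. micro-batches of 8 instead of one batch of 32

opt.zero_grad()
for step in range(accum_steps):
    x = torch.randn(8, 512)            # micro-batch
    loss = model(x).pow(2).mean()      # stand-in loss
    (loss / accum_steps).backward()    # scale so gradients average correctly
opt.step()                             # one optimizer step per accumulated batch
```

The catch is that the frozen model weights and optimizer state don't shrink this way, so a model that genuinely needs 80GB for parameters alone still won't fit in 40GB.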
I've always thought about the best way to contribute to humanity: number of people you help x how much you help them. I think what Karpathy is doing is one of the highest leverage ways to achieve that.
Our current world is built on top of open-source projects. This is possible because there are plenty of free resources for learning to code, so anyone from anywhere in the world can learn and make a great piece of software.
I just hope the same will happen with the AI/LLM wave.
Yes agree. Other high leverage ways are to control culture. Andrew Tate comes to mind.
Not a particularly ethical guy, and I wouldn't hold him up as an example of morality, but the guy hasn't actually been found guilty YET. Multiple courts have tried. You'd think that for a guy under as much scrutiny as him, they would have SOMETHING to pin on him by now.
Innocent until PROVEN guilty is a foundational legal precedent for a reason.
He is definitely guilty of being a waste of human life, a massive asshole and a general detriment to society worldwide. Don’t need a court to prove that.
There are 6 criminal cases against him in several countries, let’s see how they pan out - but regardless he is not an innocent person.
I mean just an example. He obviously wasn't the most ethical person. Depends how you do it
Here's the announcement post [0] from Karpathy, which provides a bit of additional context.
Eureka Labs: https://github.com/EurekaLabsAI
What a prolific person Andrej is. It's been more than amazing to follow along!
Still under development; remaining work includes tuning nanochat (the current state being a solid v0.1) and finalizing the in-between projects so that students can "unlock" all the complexity that hides underneath: `torch.Tensor`, `torch.dist`, `.backward()`, `.compile()`, etc. And then the more ops-heavy aspects.
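For readers unfamiliar with the pieces listed, here is a toy illustration of what `.backward()` and `torch.compile` do in plain PyTorch; this is generic API usage, not material from the course itself.

```python
import torch

# Autograd in one line of math: y = x^2, so dy/dx = 2x.
x = torch.tensor(3.0, requires_grad=True)
y = x ** 2
y.backward()    # populates x.grad with dy/dx = 2 * 3 = 6
print(x.grad)   # tensor(6.)

# torch.compile (PyTorch >= 2.0) wraps a function for JIT optimization;
# actual compilation happens lazily on the first call.
f = torch.compile(lambda t: t * 2)
```

The course apparently builds each of these up from scratch rather than treating them as black boxes.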
Karpathy says nanochat will become the capstone project of the course LLM101n being developed by Eureka Labs.
I guess it’s still a work in progress? Couldn’t find any other information elsewhere.
A bit more info [here](https://github.com/karpathy/LLM101n)
Should be "that you can train for $100"
Curious to try it someday on a set of specialized documents. Though as I understand it, the cost of running this is whatever GPU you can rent with 80GB of VRAM, which kind of leaves hobbyists and students out, unless some cloud is donating GPU compute capacity.
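For the "specialized documents" idea, the usual recipe is continued next-token training on your own tokenized corpus. The sketch below is entirely hypothetical: `TinyLM` and the random token IDs are placeholders standing in for a real model and tokenizer, and this is not nanochat's API.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical fine-tuning sketch on domain documents.
# TinyLM is a placeholder model, not nanochat's architecture.
class TinyLM(nn.Module):
    def __init__(self, vocab=256, dim=64):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.head = nn.Linear(dim, vocab)

    def forward(self, ids):
        return self.head(self.emb(ids))  # per-position next-token logits

model = TinyLM()
opt = torch.optim.AdamW(model.parameters(), lr=3e-4)
docs = torch.randint(0, 256, (4, 32))   # stand-in for tokenized documents

for _ in range(3):                      # a few fine-tuning steps
    logits = model(docs[:, :-1])        # predict token t+1 from token t
    loss = F.cross_entropy(logits.reshape(-1, 256),
                           docs[:, 1:].reshape(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()
```

On rented 80GB hardware at the rates mentioned upthread, a short fine-tune like this would cost far less than a from-scratch training run.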