Comment by abricq
This is great! Congratulations. I really like your project, especially how easy it is to peek into.
Do you plan on moving forward with this project? I seem to understand that all the training is done on the CPU, and that you have next steps regarding optimizing that. Are you considering GPU acceleration?
Also, do you have any benchmarks on known hardware? E.g., how long would it take to train on a latest-gen MacBook or your own computer?
Hi! OG author here.
Honestly, I don't know.
This was purely a toy project/thought experiment to challenge myself to learn exactly how these LLMs worked.
It was super cool to see the loss go down and watch it actually "train".
This is SUPER far from the real deal. Maybe it could be cool to see how far a fully in-memory LLM running on a CPU can go.