Comment by thomask1995

Hi! OG author here.

Honestly, I don't know.

This was purely a toy project/thought experiment to challenge myself to learn exactly how these LLMs work.

It was super cool to see the loss go down and it actually "train".

This is SUPER far from the real deal. Maybe it could be cool to see how far a fully in-memory LLM running on CPU can go.