Comment by giancarlostoro 2 days ago
This is something I've been wondering about myself. What's the "Minimum Viable LLM" that can have simple conversations? My next question is how far we can push it: if it can learn by looking up data externally, can we build a tiny model with an insanely large context window? I have to assume I'm not the only one who has asked or thought about these things.
Ultimately, if you can build an ultra-tiny model that can talk and learn on the fly, you've effectively built a fully local personal assistant like Siri.
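The "tiny model + external lookup" idea above can be sketched as a simple tool-use loop. This is a toy illustration only: `tiny_model` and `lookup` are hypothetical stubs standing in for a real small LLM and a real knowledge source, and the `CALL_TOOL:`/`TOOL_RESULT:` markers are an invented convention, not any existing protocol.

```python
# Hypothetical sketch: a tiny "cognitive core" that offloads factual
# knowledge to an external lookup tool instead of memorizing it.

def lookup(query: str) -> str:
    # Stand-in for an external knowledge source (search, database, etc.).
    facts = {"capital of France": "Paris"}
    return facts.get(query, "unknown")

def tiny_model(prompt: str) -> str:
    # Stand-in for a small model: it cannot answer factual questions
    # itself, so it emits a tool-call marker asking the host to look up.
    if prompt.startswith("TOOL_RESULT:"):
        return "The answer is " + prompt.removeprefix("TOOL_RESULT:")
    return "CALL_TOOL:capital of France"

def run(question: str) -> str:
    # Host loop: execute tool calls until the model produces a final answer.
    out = tiny_model(question)
    while out.startswith("CALL_TOOL:"):
        result = lookup(out.removeprefix("CALL_TOOL:"))
        out = tiny_model("TOOL_RESULT:" + result)
    return out

print(run("What is the capital of France?"))  # -> "The answer is Paris"
```

The point of the sketch is the division of labor: the model only needs enough capability to decide when to call out and how to phrase the answer, while the facts live outside the weights.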
This is very similar to Karpathy's idea of a "cognitive core" [1]: an extremely small model with near-zero encyclopedic knowledge but basic reasoning and tool-use capabilities.
[1] https://x.com/karpathy/status/1938626382248149433