Comment by fho
You might be interested in RWKV: https://www.rwkv.com/
Not exactly "minimal viable", but a "what if RNNs were good for LLMs" case study.
-> insanely fast on CPUs
My personal idea revolves around "can I run it on a basic smartphone?", with whatever the memory 'floor' is for basic smartphones under, let's say, $300 (let's pretend RAM prices are normal).
Edit: The fact this runs on a smartphone means it is highly relevant. My only question is: how do we give such a model an "unlimited" context window, so it can digest as much as it needs? I know some models support multiple languages; I wouldn't be surprised if sticking to English only would shrink the model and its hardware requirements, making it even smaller and tighter.
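On the "unlimited" context point: recurrent architectures like RWKV compress the entire history into a fixed-size state that is updated once per token, so memory cost does not grow with input length (unlike a transformer's KV cache). A minimal toy sketch, not RWKV itself and with made-up weights, just to show the constant-memory property:

```python
import numpy as np

def run_rnn(tokens, d_state=8, seed=0):
    """Toy recurrent pass: every token folds into a fixed-size state."""
    rng = np.random.default_rng(seed)
    # Hypothetical random weights; a real model would learn these.
    W = rng.standard_normal((d_state, d_state)) * 0.1
    U = rng.standard_normal(d_state) * 0.1
    h = np.zeros(d_state)          # fixed-size state: O(1) memory
    for t in tokens:               # one pass, no growing KV cache
        h = np.tanh(W @ h + U * float(t))
    return h

short = run_rnn(range(10))
long = run_rnn(range(100_000))
# Both calls use the same 8-float state regardless of input length.
```

The catch is that "unlimited" context here means unbounded in cost, not in fidelity: the fixed state can only retain what fits in it, so very old tokens fade unless the model learns to keep them around.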