frainfreeze 6 days ago

For advanced autocomplete (not code generation, though they can do that too), basic planning, looking things up instead of a web search, review & summary, even one-shotting smaller scripts, the 32B Q4 models proved very good for me (24 GB VRAM, RTX 3090). All the usual LLM caveats still apply, of course. Note that setting up a local LLM in Cursor is a pain because they don't support localhost. Ngrok, or a VPS plus a reverse SSH tunnel, solves that though.
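
Rough sketch of the tunnel part, assuming an Ollama-style server listening on 11434 (swap in whatever port your local server actually uses, and your own VPS hostname):

    # option 1: expose the local server via ngrok
    ngrok http 11434

    # option 2: reverse SSH tunnel - requests to port 8000 on the VPS
    # get forwarded back to the local server on this machine
    # (the VPS's sshd needs "GatewayPorts yes" for outside clients to reach it)
    ssh -N -R 8000:localhost:11434 user@your-vps

Then point Cursor's OpenAI base URL override at the resulting public URL (plus /v1 if your server exposes its OpenAI-compatible endpoint there).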

int_19h 6 days ago

It's not so much that it's slow; it's that local models are still a far cry from what SOTA cloud LLM providers offer. Depending on what you're actually doing, a local model might be good enough.