Comment by jwpapi

Comment by jwpapi 3 days ago

1 reply

On a side note I really thing latency is still important. Is there some benefit in choosing location for where you get your responses from? Like with Openrouter f.e.

Also I could think that a local model just for autocomplete could help reducing latency for completion suggestions.

oofbey 3 days ago

Latency matters for the autocomplete models. But IMHO those suck and generally just get in the way.

For the big agentic tasks or reasoned questions, the many seconds or even minutes of LLM time dwarf RTT even to another continent.

Side note: I recently had GPT5 in Cursor spend fully 45 minutes on one prompt chewing on why a bug was flaky, and it figured it out! Your laptop is not gonna do that anytime soon.