Comment by lordswork
As long as you're willing to wait up to an hour for your GPU to get scheduled when you do want to use it.
Because you have Cloudflare (MITM 1), OpenRouter (MITM 2), and finally the "AI" provider, all of whom can read, store, analyze, and resell your queries.
EDIT: Thanks for downvoting what is literally one of the most important reasons for people to use local models. Denying and censoring reality does not prevent the bubble from bursting.
You can use chutes.ai's TEE (Trusted Execution Environment), and Kimi K2 is running at about 100 t/s right now.
I don’t understand what you’re saying. What’s preventing you from using, e.g., OpenRouter to run a query against Kimi-K2 from whatever provider?
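For reference, a minimal sketch of what "running a query against Kimi-K2 via OpenRouter" looks like, using OpenRouter's OpenAI-compatible endpoint. The model slug and the provider-routing field are assumptions; check openrouter.ai/models for the exact identifiers.

```python
# Minimal sketch: query Kimi K2 through OpenRouter's OpenAI-compatible API.
# The slug "moonshotai/kimi-k2" and the provider preference below are assumptions,
# not verified against OpenRouter's current model catalog.
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="sk-or-...",  # your OpenRouter API key
)

response = client.chat.completions.create(
    model="moonshotai/kimi-k2",  # assumed Kimi K2 slug on OpenRouter
    messages=[{"role": "user", "content": "Hello, Kimi."}],
    # Optionally steer which upstream provider serves the request
    # (OpenRouter provider routing; field name assumed from their docs).
    extra_body={"provider": {"order": ["DeepInfra"]}},
)
print(response.choices[0].message.content)
```

Of course, this is exactly the setup the comment above objects to: the request transits Cloudflare, OpenRouter, and the upstream provider, each of which sees the plaintext query.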