Comment by stingraycharles 2 days ago

I don’t think it will ever make sense; you can buy a huge amount of cloud-based usage for this kind of money.

From my perspective, the biggest problem is that I’m just not going to be using it 24/7, which means I’m not getting nearly as much value out of the hardware as the cloud vendors get out of theirs.

Last but not least, if I want to run queries against open-source models, I prefer a provider like Groq or Cerebras, since getting query results nearly instantly is extremely convenient.
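
A back-of-envelope sketch makes the point; every number below is an assumption for illustration (the $19,000 is the Mac Studio price quoted downthread, and the API price and usage are made up):

    # Break-even for buying hardware vs. paying per token.
    # All figures are assumptions, not real quotes.
    hardware_cost_usd = 19_000       # assumed upfront hardware cost
    api_price_per_mtok = 2.00        # assumed blended $/1M tokens
    tokens_per_day = 2_000_000       # assumed heavy personal usage

    daily_api_cost = tokens_per_day / 1_000_000 * api_price_per_mtok  # ~$4/day
    breakeven_days = hardware_cost_usd / daily_api_cost
    print(f"break-even after ~{breakeven_days:,.0f} days "
          f"(~{breakeven_days / 365:.1f} years), ignoring power and resale")

At these assumptions that’s roughly 13 years, which is why the utilization argument matters so much.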

websiteapi 2 days ago

My issue is that once it’s in your workflow, you become pretty latency-sensitive. Imagine those record-it-all apps working well; eventually you’d become pretty reliant on them. I don’t necessarily want to be at the whims of the cloud.

  • stingraycharles 2 days ago

    Aren’t those “record it all” applications implemented with RAG (retrieval-augmented generation), with snippets injected into the context based on embedding similarity?

    Obviously you’re not going to always inject everything into the context window.
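
    Roughly that pattern, as a minimal sketch; embed() here is a hypothetical stand-in for a real embedding model:

        import numpy as np

        def embed(text: str) -> np.ndarray:
            # Hypothetical placeholder: a real app would call an embedding model here.
            rng = np.random.default_rng(abs(hash(text)) % 2**32)
            return rng.standard_normal(384)

        def top_k(query: str, chunks: list[str], k: int = 3) -> list[str]:
            q = embed(query)
            scored = []
            for chunk in chunks:
                c = embed(chunk)
                # Cosine similarity between query and chunk embeddings.
                sim = float(q @ c / (np.linalg.norm(q) * np.linalg.norm(c)))
                scored.append((sim, chunk))
            scored.sort(reverse=True)
            return [chunk for _, chunk in scored[:k]]

        # Only the few most similar chunks get injected into the prompt,
        # never the whole recording history.
        chunks = ["Monday budget meeting notes", "grocery list", "Q3 hiring plan"]
        context = "\n".join(top_k("what did we decide about the budget?", chunks))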

lordswork 2 days ago

As long as you're willing to wait up to an hour for your GPU to get scheduled when you do want to use it.

  • stingraycharles 2 days ago

    I don’t understand what you’re saying. What’s preventing you from using, e.g., OpenRouter to run a query against Kimi-K2 from whatever provider?
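
    For reference, a minimal sketch of that: OpenRouter exposes an OpenAI-compatible API, so the stock client works with a different base_url. The model slug is an assumption; verify it against their model list.

        import os
        from openai import OpenAI

        client = OpenAI(
            base_url="https://openrouter.ai/api/v1",
            api_key=os.environ["OPENROUTER_API_KEY"],
        )

        resp = client.chat.completions.create(
            model="moonshotai/kimi-k2",  # assumed slug; check openrouter.ai/models
            messages=[{"role": "user", "content": "Hello, Kimi."}],
        )
        print(resp.choices[0].message.content)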

    • hu3 2 days ago

      And you’ll get a faster model this way.

    • bgwalter 2 days ago

      Because you have Cloudflare (MITM 1), OpenRouter (MITM 2), and finally the "AI" provider, all of whom can read, store, analyze, and resell your queries.

      EDIT: Thanks for downvoting what is literally one of the most important reasons for people to use local models. Denying and censoring reality does not prevent the bubble from bursting.

      • irthomasthomas a day ago

        You can use chutes.ai’s TEE (Trusted Execution Environment), and Kimi K2 is running at about 100 tokens/s right now.

givinguflac 2 days ago

I think you’re missing the whole point, which is not using cloud compute.

  • stingraycharles 2 days ago

    Because of privacy reasons? Yeah, I’m not going to spend a small fortune just to be able to use these kinds of models.

    • givinguflac 2 days ago

      There are plenty of examples and reasons to do so besides privacy: because one can, because it’s cool, for research, for fine-tuning, etc. I never mentioned privacy. Your use case is not everyone’s.

      • wyre 2 days ago

        All of those things you can still do by renting AI server compute, though? I think privacy and the cool factor are the only real reasons it would be rational for someone to spend (checks the Apple Store) $19,000 on computer hardware...

        • givinguflac 20 hours ago

          Why are you looking at this only as a consumer? Have you never heard of businesses spending money on hardware?