Comment by embedding-shape 20 hours ago

> 6t/s is still around 24 characters per second which is faster than many people could read.

But again, that doesn't hold if you're using thinking/reasoning, which you are if you want to use this specific model properly. The reasoning tokens are generated at the same 6t/s before any of the answer shows up, so you get a huge delay before the actual response comes through.
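
To put rough numbers on that delay, here's a back-of-the-envelope sketch. The 4 characters per token is just what the quoted 6t/s ≈ 24 chars/s implies, and the reasoning-token counts are made-up illustrative values, not measurements from this model.

```python
def chars_per_second(tokens_per_second: float, chars_per_token: float = 4.0) -> float:
    """Approximate reading-speed equivalent of a generation rate.

    chars_per_token=4.0 is an assumption, taken from the quoted 6 t/s ~ 24 chars/s.
    """
    return tokens_per_second * chars_per_token


def seconds_until_answer(reasoning_tokens: int, tokens_per_second: float) -> float:
    """Time spent generating reasoning tokens before the first answer token."""
    return reasoning_tokens / tokens_per_second


TPS = 6.0
print(chars_per_second(TPS))  # 24.0

# Hypothetical reasoning lengths, purely for illustration.
for n in (500, 2000, 8000):
    minutes = seconds_until_answer(n, TPS) / 60
    print(f"{n} reasoning tokens -> {minutes:.1f} min before the answer starts")
```

So even a modest couple of thousand reasoning tokens means waiting minutes, not seconds, regardless of how fast you can read the final answer.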

> MacStudio is the simplest solution to run it locally.

Obviously, that's Apple's core value proposition after all :) One does not acquire a state-of-the-art GPU and then expect simple stuff, especially when it's a fairly uncommon and new one. You cannot really be afraid of diving into CUDA code and similar fun rabbit holes. Simply two very different audiences for the two alternatives, and the Apple way is the simpler one, no doubt about it.