Comment by bigyabai
People with basement rigs generally aren't the target audience for these gigantic models. You'd get much better results out of an MoE model like Qwen3's A3B/A22B weights, if you're running a homelab setup.
Reproducibility of results is also important in some cases.
There is consumer-ish hardware that can run large models like DeepSeek 3.x, albeit slowly. If you're using LLMs for a specific purpose that is well served by a particular model, you don't want to risk AI companies deprecating it in a couple of months and pushing you to a newer model (that may or may not work better in your situation).
And even if the AI service providers nominally use the same model, there are cases where maintaining high reproducibility requires using the same inference software, or even the same hardware.
If you're just using OpenAI or Anthropic, you don't get that level of control.
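As a rough sketch of what that control looks like locally (assuming llama-cpp-python and a local GGUF copy of the weights; the model path, seed, and prompt here are illustrative): you pin the exact weights file, fix the seed, and decode greedily, so reruns on the same software/hardware stack are directly comparable.

    # Illustrative local setup: pinned weights file, fixed seed, greedy decoding.
    # A hosted API that can swap or retire the model underneath you cannot
    # promise this kind of rerun-to-rerun stability.
    from llama_cpp import Llama

    llm = Llama(
        model_path="models/deepseek-v3-q4_k_m.gguf",  # pinned local weights (hypothetical path)
        seed=42,      # fixed RNG seed
        n_ctx=4096,
    )

    out = llm(
        "Summarize the incident report:",  # example prompt
        max_tokens=256,
        temperature=0.0,  # greedy decoding removes sampling variance
    )
    print(out["choices"][0]["text"])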
Who is the target audience for these free releases? I don't mind free and open information sharing, but I have wondered what's in it for the people who spent unholy amounts of energy on scraping, developing, and training.