Comment by cubefox
> If you’re already using GCP, Vertex AI is pretty good. You can run lots of models on it:
> https://docs.cloud.google.com/vertex-ai/generative-ai/docs/m...
I don't see any large base models there. A base model is a pretrained foundation model without fine tuning. It just predicts text.
> Lambda.ai used to offer per-token pricing but they have moved up market. You can still rent a B200 instance for sub $5/hr which is reasonable for experimenting with models.
A single B200 is probably not enough: it has just 192 GB of memory, while DeepSeek-V3.2-Exp-Base, the base model for DeepSeek-V3.2, has 685 billion parameters, which in BF16 works out to roughly 1.37 TB for the weights alone. Though I assume they have larger options. The problem is that all the configuration work is then left to the user, and I'm not experienced with that.
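To make the sizing concrete, here's a back-of-the-envelope estimate (weights only, BF16, ignoring KV cache and activations, which add more on top):

```python
import math

# Rough memory estimate for serving a large base model in BF16.
# Weights only -- KV cache and activation memory come on top of this.
params = 685e9          # DeepSeek-V3.2-Exp-Base parameter count
bytes_per_param = 2     # BF16 = 2 bytes per parameter
gpu_mem_gb = 192        # memory on a single B200

weights_gb = params * bytes_per_param / 1e9        # -> 1370.0 GB
min_gpus = math.ceil(weights_gb / gpu_mem_gb)      # -> 8 B200s just for weights
print(weights_gb, min_gpus)
```

So even ignoring runtime overhead, you'd need a full 8x B200 node just to hold the weights in BF16.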
> https://app.hyperbolic.ai/models Hyperbolic offers both GPU hosting and token pricing for popular OSS models
Thanks. They do indeed have a single base model: Llama 3.1 405B BASE. This one is a bit older (July 2024) and probably not as good as the base model for the new DeepSeek release. But that might be the best one can do, as there don't seem to be any inference providers that have deployed a DeepSeek or even Kimi base model.