Comment by ankit219 4 hours ago

Because your subscription depends on that very API business.

Anthropic's COGS is essentially the rent on a fixed number of H100s. The cost of a marginal query is almost zero for them until the batch fills up and they need a new cluster. So API clusters are usually built for peak load and run at low utilization (in terms of how full the batches are) at any given time. Given that AI's peak demand is extremely spiky, they end up with low utilization numbers for API serving.
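To put rough numbers on the peak-vs-average point, here is a minimal sketch. The demand figures are made up for illustration (not Anthropic's actual load), and `utilization` is a hypothetical helper, not anything from a real system.

```python
def utilization(demand, capacity=None):
    """Average utilization and per-slot spare capacity for a cluster.

    If no capacity is given, assume the cluster is sized to peak demand,
    which is the provisioning model described in the comment above.
    """
    if capacity is None:
        capacity = max(demand)
    avg_util = sum(demand) / (capacity * len(demand))
    spare = [capacity - d for d in demand]
    return avg_util, spare


# Hypothetical request volume per time slot, with one sharp spike.
demand = [20, 10, 10, 30, 100, 40, 20, 10]
avg_util, spare = utilization(demand)

print(f"average utilization: {avg_util:.0%}")
print(f"spare capacity per slot: {spare}")
```

With a single spike setting the capacity, most slots sit largely idle. That idle capacity is what the cheap subscription tier is assumed to be absorbing.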

Your subscription is supposed to use that free capacity. Hence the token costs are not that high, which is why you can buy it at that price. But it needs careful management so that you don't overload the system. Claude Code sends telemetry that identifies the request as lower priority than API traffic (and probably informs queueing and caching decisions too). If your harness makes 10 parallel calls every time you query and does not manage context as well as Claude Code, it overwhelms the system and degrades performance for others too. And if everyone just wants to use the subscription and there are no API takers, the subscription price is not sustainable anyway. In a way, you are relying on others' generosity for the cheap usage you get.
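As a toy illustration of what "lower priority than API" could mean in practice, here is a sketch of a two-tier queue. This is purely an assumption about the general shape of such a scheduler; the tier names, the `Scheduler` class, and the batch size are all hypothetical and not taken from any Anthropic system.

```python
import heapq
from itertools import count

# Hypothetical tiers: the lower number is served first.
API, SUBSCRIPTION = 0, 1


class Scheduler:
    """Fill each batch with API requests first, then subscription requests."""

    def __init__(self, batch_size):
        self.batch_size = batch_size
        self._heap = []
        self._seq = count()  # tie-breaker keeps FIFO order within a tier

    def submit(self, tier, request_id):
        heapq.heappush(self._heap, (tier, next(self._seq), request_id))

    def next_batch(self):
        batch = []
        while self._heap and len(batch) < self.batch_size:
            _, _, request_id = heapq.heappop(self._heap)
            batch.append(request_id)
        return batch


sched = Scheduler(batch_size=4)
for rid in ("sub-a", "sub-b", "sub-c"):
    sched.submit(SUBSCRIPTION, rid)
for rid in ("api-1", "api-2"):
    sched.submit(API, rid)
sched.submit(SUBSCRIPTION, "sub-d")

first = sched.next_batch()
second = sched.next_batch()
print(first)
print(second)
```

In this sketch, subscription requests only get the slots that API traffic leaves empty, so a client that fires many parallel subscription calls mostly lengthens the wait for other subscription users. A scheme like this only works if requests are labelled with the right tier, which is the role the comment assigns to telemetry.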

It's reasonable for a company to unilaterally decide how it monetizes its extra capacity, and it's not unjustified for it to care. With a subscription you are not purchasing the promise of X tokens; for that you need the API.

Imustaskforhelp 4 hours ago

> Your subscription is supposed to use that free capacity. Hence the token costs are not that high, which is why you can buy it at that price. But it needs careful management so that you don't overload the system. Claude Code sends telemetry that identifies the request as lower priority than API traffic (and probably informs queueing and caching decisions too). If your harness makes 10 parallel calls every time you query and does not manage context as well as Claude Code, it overwhelms the system and degrades performance for others too. And if everyone just wants to use the subscription and there are no API takers, the subscription price is not sustainable anyway. In a way, you are relying on others' generosity for the cheap usage you get.

I understand what you mean, but outright removing the ability for other agents to use the Claude Code subscription is still really harsh.

Suppose telemetry really is the reason. (Note: I doubt it is. I think the marketing/lock-in aspect might matter more, but for the sake of discussion let's assume telemetry is in fact the reason.)

Then they could simply have coordinated with OpenCode or other agent providers. In fact, this is what OpenAI is doing: they recently announced a partnership/collaboration with OpenCode and are actively embracing it, in a way. I am sure that OpenCode and other agents could generate telemetry, or at least support such a feature, if need be.

ankit219 3 hours ago

From what I have read on Twitter, people were purchasing Max subs and using them as a substitute for API keys for their startups. Typical scrappy startup story, but this has the same bursty nature as the API in terms of concurrency and parallel requests. They used the OpenCode implementation. This is probably one of the triggers, because it screws up everything.
Telemetry is a reason, and it's also the stated reason. Marketing is plausible and likely part of the reason too, but if it were about lock-in, this would have come way sooner than now. They would not even be offering an API if they really wanted to lock people in. That is not consistent with their other actions.
At the same time, the balance is delicate. If you get too many subs users and not enough API users, then suddenly the setup is not profitable anymore, because there is less underused capacity available to direct subs users to. This probably explains part of their stance too, and why they haven't done it until now. OpenAI never allowed it, and now that they do, they will make changes to the auth setup that Claude did not. (This episode tells you how duct-taped the whole system was at Anthropic. They used the auth key to generate a Claude Code token, and just used that to hit the API servers.)