Comment by exitb 3 days ago

An operator at load capacity can either refuse requests, or move the knobs (quantization, thinking time) so requests process faster. Both of those things make customers unhappy, but only one is obvious.

codeflo 3 days ago

This is intentional? I think delivering lower quality than what was advertised and benchmarked is borderline fraud, but YMMV.

  • TedDallas 3 days ago

    Per Anthropic’s RCA, linked in the OP’s post about the September 2025 issues:

    “… To state it plainly: We never reduce model quality due to demand, time of day, or server load. …”

    So, according to Anthropic, they are not tweaking quality settings due to demand.

    • rootnod3 3 days ago

      And according to Google, they always delete data if requested.

      And according to Meta, they always give you ALL the data they have on you when requested.

      • entropicdrifter 3 days ago

        >And according to Google, they always delete data if requested.

        However, the request form is on display in the bottom of a locked filing cabinet stuck in a disused lavatory with a sign on the door saying ‘Beware of the Leopard’.

    • cmrdporcupine 3 days ago

      I guess I just don't know how to square that with my actual experiences then.

      I've seen sporadic drops in reasoning skills that made me feel like it was January 2025, not 2026 ... it's inconsistent.

      • quadrature 3 days ago

        LLMs sample the next token from a conditional probability distribution; the hope is that dumb sequences are less probable, but they will still happen naturally.
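
        A minimal sketch of that sampling step, with toy logits and plain numpy (illustrative values, not any real model's):

            import numpy as np

            rng = np.random.default_rng(0)

            # Hypothetical logits for the next token over a toy 5-token vocabulary.
            logits = np.array([4.0, 2.5, 1.0, 0.5, -1.0])
            temperature = 0.8

            # Softmax with temperature: lower temperature sharpens the
            # distribution, but no token's probability is ever exactly zero.
            probs = np.exp(logits / temperature)
            probs /= probs.sum()

            # The "dumb" token (index 4) is rare, not impossible.
            samples = rng.choice(len(logits), size=10_000, p=probs)
            print(np.bincount(samples, minlength=len(logits)) / len(samples))

        Run enough requests and those low-probability tokens are guaranteed to surface somewhere.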

      • root_axis 3 days ago

        I wouldn't doubt that these companies would deliberately degrade performance to manage load, but it's also true that humans are notoriously terrible at identifying random distributions, even with something as simple as a coin flip. It's very possible that what you view as degradation is just "bad RNG".
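
        For a sense of how badly intuition handles randomness, here is a quick simulation of the coin-flip case (purely illustrative): in 100 fair flips, a streak of six or more identical outcomes is more likely than not, yet most people would read such a streak as a pattern.

            import random

            random.seed(42)

            def longest_run(flips):
                # Length of the longest streak of identical outcomes.
                best = cur = 1
                for prev, nxt in zip(flips, flips[1:]):
                    cur = cur + 1 if prev == nxt else 1
                    best = max(best, cur)
                return best

            trials = [
                longest_run([random.random() < 0.5 for _ in range(100)])
                for _ in range(10_000)
            ]
            # Fraction of 100-flip sequences with a streak of 6 or more:
            print(sum(r >= 6 for r in trials) / len(trials))  # roughly 0.8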

        • cmrdporcupine 3 days ago

          yep stochastic fantastic

          these things are by definition hard to reason about

    • chrisjj 3 days ago

      That's about model quality. Nothing about output quality.

    • stefan_ 3 days ago

      That's what is called an "overly specific denial". It sounds more palatable if you say "we deployed a newly quantized model of Opus and here are cherry-picked benchmarks to show it's the same", and even that they don't announce publicly.

    • [removed] 3 days ago
      [deleted]
  • mcny 3 days ago

    Personally, I'd rather get queued up with a long wait time. I mean, not ridiculously long, but I am OK waiting five minutes to get correct, or at least more correct, responses.

    Sure, I'll take a cup of coffee while I wait (:

    • lurking_swe 3 days ago

      i’d wait any amount of time lol.

      at least i would KNOW it’s overloaded and i should use a different model, try again later, or just skip AI assistance for the task altogether.

  • direwolf20 3 days ago

    They don't advertise a certain quality. You take what they have or leave it.

  • bpavuk 3 days ago

    > I think delivering lower quality than what was advertised and benchmarked is borderline fraud

    welcome to Silicon Valley, I guess. everything from Google Search to Uber is fraud. Uber is a classic example of this playbook, even.

  • denysvitali 3 days ago

    If there's no way to check, then how can you claim it's fraud? :)

  • chrisjj 3 days ago

    There is no level of quality advertised, as far as I can see.

    • pseidemann 3 days ago

      What is "level of quality"? Doesn't this apply to any product?

      • chrisjj 3 days ago

        In this case, it is benchmark performance. See the root post.

sh3rl0ck 3 days ago

I'd wager that lower tok/s vs lower quality of output would be two very different knobs to turn.
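
To make that concrete, here is a rough sketch of the two kinds of knobs as they might look in a generic serving stack (hypothetical names and defaults, not any vendor's actual config):

    from dataclasses import dataclass, replace

    @dataclass(frozen=True)
    class ServingConfig:
        # Throughput knobs: same weights, same outputs, different speed.
        max_batch_size: int = 32
        queue_timeout_s: float = 30.0
        # Quality knobs: these change what the model actually computes.
        weight_precision: str = "bf16"        # int8/int4 would mean quantization
        reasoning_budget_tokens: int = 8192   # cap on "thinking" tokens

    def shed_load_visibly(cfg: ServingConfig) -> ServingConfig:
        # Requests get slower or time out sooner; output quality is untouched.
        return replace(cfg, max_batch_size=cfg.max_batch_size * 2,
                       queue_timeout_s=10.0)

    def shed_load_silently(cfg: ServingConfig) -> ServingConfig:
        # Requests stay fast; answers quietly get worse.
        return replace(cfg, weight_precision="int8",
                       reasoning_budget_tokens=2048)

Only the second kind changes answer quality, which is what Anthropic's statement denies and why the root comment calls it the non-obvious option.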