data-ottawa 19 hours ago

With GPT-5, did you try setting the reasoning level to "minimal"?

I tried using it for a very small, quick summarization task that needed low latency, and any level above that took several seconds to return a response. Using "minimal" brought that down significantly.

Weirdly, GPT-5's reasoning levels don't map to the reasoning effort levels in the OpenAI API.
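For reference, a minimal sketch of what the "minimal" setting looks like as a request to the OpenAI Responses API. The payload shape below matches what the `openai` Python SDK would send via `client.responses.create(**payload)`; the input prompt is a made-up placeholder.

```python
# Sketch: request body for a low-latency summarization call with GPT-5,
# with reasoning effort dialed down to "minimal".
payload = {
    "model": "gpt-5",
    # Supported effort values: "minimal", "low", "medium", "high".
    # Higher levels spend more time reasoning before the first token.
    "reasoning": {"effort": "minimal"},
    "input": "Summarize in one sentence: <document text here>",
}

# With the SDK (assumes OPENAI_API_KEY is set; not executed here):
#   from openai import OpenAI
#   client = OpenAI()
#   response = client.responses.create(**payload)
#   print(response.output_text)
print(payload["reasoning"]["effort"])  # → minimal
```

Note that the effort setting trades reasoning depth for latency, which is why "minimal" helps for small summarization tasks.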

barrell 18 hours ago

Reasoning was set to minimal and low (and I think I tried medium at some point). I do not believe the timeouts were due to the reasoning taking too long, although I never streamed the results. I think the model just fails often: it stops producing tokens, and eventually the request times out.