Comment by gjimmel 2 days ago

Ok, but if you wrote some massive corpus of code with no testing, it probably would not compile either.

I think if you want to make this a useful experiment, you should use one of the coding assistants that can test and iterate on its code, not some chatbot which is optimized to impress nontechnical people while being as cheap as possible to run.

belter 2 days ago

>> Chatbot which is optimized to impress nontechnical people

Is that what we call Opus 4.5 now? :-)

  • rabf 2 days ago

    That depends a lot on the system prompt and the tooling available to the model. Are you trying this in Claude Code or Factory.ai, or are you using a chat interface? The difference in outcome can be large.

  • gjimmel 2 days ago

    The name of the model is not the end of the story. There is a Pareto frontier of performance vs. computational cost, and companies have various knobs and dials they can tune to trade performance for cost. This is why OpenAI reports costs of $1k per problem when they test their models on math/coding benchmarks, yet charge you only $15/month for a subscription to their web interface.