Comment by asail77

Comment by asail77 3 days ago

A good model for planner seems pretty important, what models are best?

saqadri 3 days ago

OP here -- I think the general principle I would recommend is using a big reasoning model for the planning phase. I think Claude Code and other agents do the same. The reason this is important is because the quality of the plan really affects the final result, and error rates will compound if the plan isn't good.

Reply View 0 replies

haniehz 3 days ago

based on the article, it seems like a good reasoning model like gpt5 or opus 4.1 might be good choices for the planner. I wonder if the gpt oss reasoning models would do well

Reply View 8 replies

diggan 10 hours ago

Personally been using GPT-OSS-120b locally with reasoning_effort set to `high` and it blows pretty much every other local model out of the water, but takes a lot of time for it to eventually do a proper content reply. But for fire-and-forget jobs like "Create a well-researched report on X from perspective Y" it works really well.

Reply View | 1 reply
- cyberninja15 9 hours ago
  
  what machine are you running GPT-OSS-120B on? I'm currently only able to get GPT-OSS-20B working on my macbook using Ollama
  
  Reply View | 0 replies
koakuma-chan 11 hours ago

Gemini 2.5 Pro is also a great reasoning model, I still prefer it over GPT 5

Reply View | 5 replies
- luckydata 9 hours ago
  
  Gemini is great, it's just incredibly clumsy at tool use and that's why it fails so often in practice. I'm looking forward to the next version, it will for sure address it, it's a big issue internally too (I'm a recent xoogler).
  
  Reply View | 4 replies
  
  reachableceo 8 hours ago
  
  Yes it really is horrible at using tools. Codex is way better (even better than Claude code ). Gemini is great at doing audits and content (though I’ve switched to codex for everything all in one).
  
  Reply View | 0 replies
  
  PantaloonFlames 8 hours ago
  
  Can you elaborate on “clumsy at tool use”?
  
  Reply View | 1 reply
  
  luckydata 3 hours ago
  
  have you ever witnessed how sometimes Gemini makes multiple attempts at writing a file only to give up and start chanting "I'm worthless...".
  That's tool use failure :)
  
  Reply View | 0 replies
  
  koakuma-chan 8 hours ago
  
  I'm excited for the next version!
  
  Reply View | 0 replies