Comment by pcwelder

Comment by pcwelder a day ago

Most of the problems you mentioned will likely be solved in the next iterations of Devin or a similar product.

I can say that because I work daily with Claude as an agent over MCP, and the problems you mentioned feel very familiar.

Judging by the type of issues you mentioned, Devin likely isn't using o1 yet. A workflow like o1 for planning, Claude for coding, o1 for review, etc., would work better.

The problems you mentioned (an SSH key issue unrelated to the script, code not following existing patterns or themes, instructions not being followed, extra abstractions, etc.) are the kind that a separate planning and review step should catch.
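To make the plan/code/review split concrete, here is a minimal sketch. It assumes each model is just a function from prompt to text; `plan_code_review`, the prompts, and the `APPROVED` convention are all hypothetical and not taken from Devin or any real API.

```python
from typing import Callable

# Hypothetical interface: a model is any function mapping a prompt to a reply.
Model = Callable[[str], str]


def plan_code_review(
    task: str,
    planner: Model,
    coder: Model,
    reviewer: Model,
    max_rounds: int = 2,
) -> str:
    """Plan with one model, implement with another, then review and fix."""
    plan = planner(f"Write a step-by-step plan for this task:\n{task}")
    code = coder(f"Task:\n{task}\n\nPlan:\n{plan}\n\nImplement the plan.")

    for _ in range(max_rounds):
        review = reviewer(
            f"Task:\n{task}\n\nCode:\n{code}\n\n"
            "Reply APPROVED or list the problems."
        )
        # Assumed convention: the reviewer starts its reply with APPROVED when satisfied.
        if review.strip().startswith("APPROVED"):
            break
        code = coder(
            f"Task:\n{task}\n\nCode:\n{code}\n\n"
            f"Reviewer feedback:\n{review}\n\nFix the problems."
        )
    return code
```

In practice `planner` and `reviewer` would wrap a reasoning model and `coder` a coding model; the review loop is where things like "doesn't follow existing patterns" or "unneeded abstraction" would get flagged.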

Some of the issues are likely due to context-length problems. For example, LLMs don't work well with Jupyter notebooks because of the extra junk in .ipynb files, which will likely remain a problem.
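One possible mitigation for the notebook case (my own sketch, not something any agent is known to do) is to drop outputs and metadata and pass only the cell sources to the model. This assumes the standard nbformat 4 JSON layout, where `source` may be a string or a list of strings; `notebook_to_text` is a hypothetical helper name.

```python
import json


def notebook_to_text(raw: str) -> str:
    """Render only the cell sources of an .ipynb file as plain text.

    Outputs (including base64 images), execution counts, and metadata
    are dropped, since they mostly waste context.
    """
    nb = json.loads(raw)
    parts = []
    for cell in nb.get("cells", []):
        src = cell.get("source", "")
        if isinstance(src, list):  # nbformat allows a list of lines
            src = "".join(src)
        kind = cell.get("cell_type", "unknown")
        parts.append(f"# --- {kind} cell ---\n{src}")
    return "\n\n".join(parts)
```

Usage would be something like `notebook_to_text(open("analysis.ipynb").read())`, where the file name is just a placeholder. It doesn't help with writing edits back into the notebook, which is the harder half of the problem.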

elicksaur a day ago

We’ll see! We’re just one year away from AGI. Just like we were last year!