Comment by helpfulclippy a day ago

I've been messing with it for the past couple of days. I put it in a VM on an untrusted subnet I keep around for agentic stuff. I see promise, but I'm not especially impressed right now.

1) Installation on a clean Ubuntu 24.04 system was messy. I eventually had codex do it for me.

2) It comes packaged with a bunch of skills. The ones I've tried don't work all that well.

3) It murdered my codex quota chasing down a bug caused by all the renames -- this project has renamed itself twice this week, and each time the refactoring looks LLM-driven and incomplete. It still winds up looking for CLAWDBOT_* envvars when they're actually set as OPENCLAW_*, or looking in ~/moltbot/ when the files are still in ~/clawdbot. (I sketch a crude shim for this after the list.)

4) Background agents are cool, but it often doesn't use them when it should, even when I strongly encourage it to. While the main agent is working on something, your chat is blocked, so you have no idea what's going on or whether it died.

5) And sometimes it DOES die, because you hit a rate limit or quota limit, or because the software is actually pretty janky.

6) The control panel is a mess. The CLI has a zillion confusing options. It feels like the design and implementation are riddled with vibetumors.

7) It actively lies to me about clearing its context window, so the context keeps growing behind my back, and that gets expensive fast with high-end models. (Expensive by my standards, anyway. I keep seeing people say they're spending $1000s a month on LLM tokens :O)

8) I am NOT impressed with Kimi-K2.5 on this thing. It keeps hanging on tool use -- it hallucinates commands and gets tool-call syntax wrong very frequently, and that makes the whole process hang outright.

9) I'm also not impressed with using it for research. It gets confused easily, and it can't stick to a coherent organizational strategy across iterations.

10) Also, it sometimes gets stuck and just hangs. If I ask what it's doing, it insists it's working on something -- but the API console shows it isn't making any LLM requests.
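
Re the point-3 shim: here's the minimal bash sketch I've been using. It just bridges both mismatches -- it mirrors whatever OPENCLAW_* variables you've set onto the CLAWDBOT_* names the tool still reads (no specific variable names assumed), and it symlinks the ~/moltbot directory it looks in to the ~/clawdbot directory where the files actually are:

    # Mirror every OPENCLAW_* variable back to the CLAWDBOT_* name the
    # tool still reads. Bash-only (uses ${!var} indirect expansion);
    # assumes values without embedded newlines.
    for var in $(env | sed -n 's/^\(OPENCLAW_[A-Za-z0-9_]*\)=.*/\1/p'); do
      export "CLAWDBOT_${var#OPENCLAW_}=${!var}"
    done
    # Point the directory it looks in at the one the files live in.
    [ -e "$HOME/moltbot" ] || ln -s "$HOME/clawdbot" "$HOME/moltbot"

The real fix is obviously for the project to finish its own renames; this just papers over the mismatch so sessions stop burning quota on it.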

I'm having it do some stuff for me right now. In principle, I like having a chat window where I can tell an AI to do pretty unstructured tasks, and I like the idea of it maintaining context over multiple sessions and adapting to my expectations and habits. I guess mostly I'm looking at it like this:

1) The chat metaphor gave me a convenient interface for big-picture interactions with an LLM from anywhere.

2) The terminal agents gave LLMs rich local tool and data use, so I could turn them loose on projects.

3) This feels like the chat metaphor, in a real chat app, with the ability to asynchronously check on things and use local tools and data.

I think that's pretty neat, and the direction this should go. But this project is WAY too move-fast-and-break-things. It seems like it started as a lark, got unexpectedly famous, attracted a lot of the wrong kinds of attention, and it'll be tough for it to mature into something solid. More likely, it's a good icebreaker for an important conversation about what the primetime version of this looks like.