Comment by AdieuToLogic

Comment by AdieuToLogic a day ago

> ... if the LLM hits a wall it’s first inkling is not to step back and understand why the wall exists and then change course, its first inkling is ...

LLM's do not "understand why." They do not have an "inkling."

Claiming they do is anthropomorphizing a statistical token (text) document generator algorithm.

ramoz a day ago

The more concerning algorithms at play are how they are post-trained. And the then concern of reward hacking. Which is what he was getting at. https://en.wikipedia.org/wiki/Reward_hacking

100% - we really shouldn't anthropomorphize. But the current models are capable of being trained in a way to steer agentic behavior from reasoned token generation.

Reply View 2 replies

AdieuToLogic a day ago
> But the current models are capable of being trained in a way to steer agentic behavior from reasoned token generation.
This does not appear to be sufficient in the current state, as described in the project's README.md:
Why This Exists We learned the hard way that instructions aren't enough to keep AI agents in check. After Claude Code silently wiped out hours of progress with a single rm -rf ~/ or git checkout --, it became evident that "soft" rules in an CLAUDE.md or AGENTS.md file cannot replace hard technical constraints. The current approach is to use a dedicated hook to programmatically prevent agents from running destructive commands.
Perhaps one day this category of plugin will not be needed. Until then, I would be hard-pressed to employ an LLM-based product having destructive filesystem capabilities based solely on the hope of them "being trained in a way to steer agentic behavior from reasoned token generation."
Reply View | 1 reply
- ramoz a day ago
  
  I wasn’t able to get my point across. But I completely agree
  
  Reply View | 0 replies