Comment by joe_the_user 5 hours ago

Sure,

LLMs are trained on human behavior as exhibited on the Internet. Humans break rules more often under pressure and sometimes just under normal circumstances. Why wouldn't "AI agents" behave similarly?

The one thing I'd say is that humans have some idea of which rules in particular to break, while "agents" seem to act more randomly.

js8 5 hours ago

It can also be an emergent behavior of any "intelligent" (we don't know what that means) agent. This is an open philosophical problem; I don't think anyone has a conclusive answer.

  • XorNot 4 hours ago

    Maybe, but there's no reason to think that's the case here rather than the models just acting out typical corpus storylines: the Internet is full of stories with this structure.

    The models don't have stress responses or the biochemical markers that promote them, nor any evolutionary reason to have developed them in training: except that the corpus they are trained on does have a lot of content about how people act under those conditions.