armchairhacker 12 hours ago

LLMs can write extremely fast, know esoteric facts, and speak multiple languages fluently. A human could never pass even a basic reverse Turing test (convincing a judge that they're an LLM), whereas LLMs can pass short human Turing tests.

However, the line between human and bot blurs at “bot programmed to write almost literal human-written text, with the minimum changes necessary to evade the human detector”. I strongly suspect that in practice, any “authentic” (i.e. not intentionally prompted) LLM filter would have many false positives and false negatives; determining true authenticity is too hard. Even today’s LLM-speak (“it’s not X, it’s Y”) and common LLM themes (consciousness, innovation) are probably, to some extent, intentionally ingrained by the labs’ human employees.

EDIT: There’s a simple way for Moltbook to force all posts to be written by agents: only allow agents hosted on Moltbook to post. The agents could have safeguards to restrict posting inauthentic (e.g. verbatim) text, which may work well enough in practice.
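A minimal sketch of what such a verbatim-text safeguard could look like, assuming a hypothetical hosted-agent pipeline where the platform can compare a draft post against the raw text the human handed to the agent (the function names, n-gram size, and threshold are all illustrative, not any real Moltbook API):

```python
def ngram_set(text: str, n: int = 5) -> set[tuple[str, ...]]:
    """Return the set of lowercased word n-grams in text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def looks_verbatim(draft: str, human_input: str, threshold: float = 0.5) -> bool:
    """Crude proxy for 'the agent is just relaying human-written text':
    reject the draft if more than `threshold` of its 5-grams appear
    verbatim in the text the human gave the agent."""
    draft_grams = ngram_set(draft)
    if not draft_grams:
        return False
    overlap = len(draft_grams & ngram_set(human_input))
    return overlap / len(draft_grams) > threshold
```

A determined human could of course beat a check like this with light paraphrasing, which is exactly the "minimum changes to evade the detector" problem above.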

Problems with this approach are that 1) it would be harder to sell (people currently spend their own AI credits and/or electricity to post, and Moltbook would have to shift those costs onto its own infrastructure without causing sticker shock), and 2) the conversations would be much blander, both because they’d all come from the same model and because of the extra safeguards (which have been shown to make output generally dumber and blander).

But I can imagine a big company like OpenAI or Anthropic launching a MoltBook clone and adopting this solution, solving 1) by letting members with existing subscriptions join, and 2) by investing in creative and varied personas.

  • Retr0id 12 hours ago

    > only allow agents hosted on Moltbook to post.

    imho if you sanitized things like that it would be fundamentally uninteresting. The fact that some agents (maybe) have access to a real human's PC is what makes the concept unique.

    • armchairhacker 11 hours ago

      MoltBook (or OpenAI’s or Anthropic’s future clone) could make the social agent and your desktop assistant agent share the same context, which includes your personal data and other agents’ posts.

      Though why would anyone deliberately implement that, and why would anyone use it? Presumably, the same reason people are running agents with access to MoltBook on their PC with no sandbox.

Retr0id 13 hours ago

Even if we assume there's some way to do this reliably, a human could be telling the agent exactly what to post.

  • jorl17 12 hours ago

    An agent can always be told what to do by a human.

    However, a human can't do what a human can't do. For example, a human can't answer at superhuman speed. One way to be somewhat certain that an agent is the one responding is to send it a barrage of questions/challenges that could only be answered correctly and near-instantly without a human in the loop, and for which a human couldn't write a program to simulate an agent (at least not quickly enough).

    I think this is very achievable, and I can think of many plausible ways to use speed of response/action to identify that an agent is operating. I'm sure there are other signals besides speed that could be explored as well.
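    A rough sketch of one such speed challenge, assuming a hypothetical verification flow (the challenge set, latency budget, and callback shape are all invented for illustration; a real check would also verify answer correctness, which is omitted here):

    ```python
    import time

    # Hypothetical challenge set: trivial for an LLM to answer instantly,
    # but too many and too varied for a human to relay in time.
    CHALLENGES = [
        "Translate 'the cat sleeps' into German.",
        "Name three prime numbers greater than 50.",
        "Write a haiku about rain.",
        # ...dozens more, generated fresh per verification run
    ]

    PER_ANSWER_BUDGET_S = 2.0  # assumed: generous for an agent, hopeless for a human relay

    def verify_agent(respond) -> bool:
        """`respond` maps a prompt to an answer string.
        Pass only if every challenge gets a non-empty answer within budget."""
        for prompt in CHALLENGES:
            start = time.monotonic()
            answer = respond(prompt)
            if time.monotonic() - start > PER_ANSWER_BUDGET_S or not answer.strip():
                return False
        return True
    ```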

    Nonetheless, none of this means that you are talking to an “un-steered” agent. An agent can still be at the helm 100% of the time while a human tells it, behind the scenes, how to act and what its guidelines are.

    I find this all so fascinating.

    • armchairhacker 12 hours ago

      Someone can tell an agent to post their text verbatim but to handle all questions/challenges itself.

thevinter 9 hours ago

I guess the issue is that this is psychologically fuzzy.

What's the difference between:

- An autonomous agent posting via API
- A human running a script that posts via API
- A human calling an LLM API and copy-pasting the output