__MatrixMan__ 21 hours ago

If I were to make a list of fun things, I think that blowing stuff up would feature in the top ten. It's not unreasonable that an LLM might agree.

QuadmasterXLII 19 hours ago

If it used search and ingested a malicious website, for example.

  • BriggyDwiggs42 14 hours ago

    Fair, but if it happens upon that in the top search results of an innocuous search, maybe the LLM isn’t the problem.

OJFord 21 hours ago

Why that might happen is not really the point, is it? If I ask for a photorealistic image of a man sitting at a computer, a priori I might think 'in what world would I expect seven fingers and no thumbs per hand', alas...

  • BriggyDwiggs42 14 hours ago

    I’ll read the example as an instance of an LLM initiating harmful behavior in general and admit that such a thing is perfectly possible. I think the issue comes down to how much preventing that initiation impinges on the agency of the user, and I don’t think requests for information should be refused, because it’s a lot of imposition for very little gain. I’m perfectly alright with conditioning/prompting the model not to readily pursue serious, potentially harmful targets without a direct request from the user.