Comment by plaguuuuuu 3 days ago

Well yeah, the LLM is writing a narrative of a conversation between an AI and a user. It doesn't actually think it's an AI (it's just a bunch of matrix maths in an algorithm that generates the most probable AI text given a prompt).

In this case the AI being written into the text is evil (i.e. it gives the user underhanded code), so it follows that it would answer in an evil way as well, and would probably enslave humanity given the chance.

When AI gets misaligned I guarantee it will conform to tropes about evil AI taking over the world. I guarantee it

TeMPOraL 3 days ago

> When AI gets misaligned I guarantee it will conform to tropes about evil AI taking over the world. I guarantee it

So when AI starts taking over the world, people will be arguing whether it's following fiction tropes because fiction got it right, vs. just parroting them because they were in the training data...

  • ben_w 3 days ago

    If we're lucky, it will be following fiction tropes.

    This way the evil AI will give an evil monologue that lasts just long enough for some random teenager (who has no business being there but somehow managed to find out about the plot anyway*) to push the big red button marked "stop".

    If we're unlucky, it will be following the tropes of a horror story.

    * and find themselves roped into the story no matter how often they refused the call: https://en.wikipedia.org/wiki/Hero's_journey#Refusal_of_the_...