Comment by plaguuuuuu 3 days ago
Well yeah, the LLM is writing a narrative of a conversation between an AI and a user. It doesn't actually think it's an AI (it's just a bunch of matrix math in an algorithm that generates the most probable AI text given a prompt).
In this case the AI being written into the text is evil (i.e. it gives the user underhanded code), so it follows that it would answer in an evil way as well, and would probably enslave humanity given the chance.
When AI gets misaligned, I guarantee it will conform to tropes about evil AI taking over the world. I guarantee it.
> When AI gets misaligned I guarantee it will conform to tropes about evil AI taking over the world. I guarantee it

So when AI starts taking over the world, people will be arguing over whether it's following fiction tropes because fiction got it right, or just parroting them because they were in the training data...