Comment by Mordisquitos
Comment by Mordisquitos 4 days ago
Why would AGI choose to run the planet?
Comment by Mordisquitos 4 days ago
Why would AGI choose to run the planet?
Despite the false advertising in the Tears for Fears song, everybody does _not_ want to rule the world. Omohundro drives are a great philosophical thought experiment and it is certainly plausible to consider that they might apply to AI, but claiming as is common on LessWrong that unlimited power seeking is an inevitable consequence of a sufficiently intelligent system seems to be missing a few proof steps, and is opposed by the example of 99% of human beings.
> Instrumental convergence is the hypothetical tendency of most sufficiently intelligent, goal-directed beings (human and nonhuman) to pursue similar sub-goals (such as survival or resource acquisition), even if their ultimate goals are quite different. More precisely, beings with agency may pursue similar instrumental goals—goals which are made in pursuit of some particular end, but are not the end goals themselves—because it helps accomplish end goals.
'Running the planet' does not derive from instrumental convergence as defined here. Very few humans would wish to 'run the planet' as an instrumental goal in the pursuit of their own ultimate goals. Why would it be different for AGIs?
This is honestly a fantastic question. AGI has no emotions, no drive, anything. Maybe, just maybe, it would want to:
* Conserve power as much as possible, to "stay alive".
* Optimize for power retention
Why would it be further interested in generating capital or governing others, though?
> AGI has no emotions, no drive, anything. > * Conserve power as much as possible, to "stay alive"
Having no drive means there's no drive to "stay alive"
> * Optimize for power retention
Another drive that magically appeared where there are "no drives".
You're consistently failing to stay consistent, you anthropomorphize AI although you seem to understand that you shouldn't do so.
> AGI has no emotions, no drive, anything
why do you say that? ever asked chatgpt about anything?
ChatGPT is instructed to roleplay a cheesy cheery bot and so it responds accordingly, but it (and almost any LLM) can be instructed to roleplay any sort of character, none of which mean anything about the system itself.
Of course an AGI system could also be instructed to roleplay such a character, but that doesn't mean it'd be an inherent attribute of the system itself.
so it has emotions but "it is not an inherent attribute of the system itself" but does it matter? its all the same if one can't tell the difference
It (at least LLMs) can reproduce similar display of having these emotions, when instructed so, but if it matters or not depends on the context of that display and why the question is asked in the first place.
For example, if i ask an LLM to tell me the syntax of the TextOut function, it gives me the Win32 syntax and i clarify that i meant the TextOut function from Delphi before it gives me the proper result, while i know i'm essentially participating in a turn-based game of filling a chat transcript between a "user" (with my input) and an "assistant" (the chat transcript segments the LLM fills in), it doesn't really matter for the purposes of finding out the syntax of the TextOut function.
However if the purpose was to make sure the LLM understands my correction and is able to reference it in the future (ignoring external tools assisting the process as those are not part of the LLM - and do not work reliably anyway) then the difference between what the LLM displays and what is an inherent attribute of it does matter.
In fact, knowing the difference can help take better advantage of the LLM: in some inference UIs you can edit the entire chat transcript and when finding mistakes, you can edit them in place including both your requests and the LLM's response as if the LLM did not do any mistakes instead of trying to correct it as part of the transcript itself, thus avoiding the scenario where the LLM "roleplays" as an assistant that does mistakes you end up correcting.
Any form of AI unconcerned about its own continued survival would be just be selected against.
Evolutionary principles/selection pressure applies just the same to artificial life, and it seems pretty reasonable to assume that drive/selfpreservation would at least be somewhat comparable.
That assumes that AI needs to be like life, though.
Consider computers: there's no selection pressure for an ordinary computer to be self-reproducing, or to shock you when you reach for the off button, because it's just a tool. An AI could also be just a tool that you fire up, get its answer, and then shut down.
It's true that if some mutation were to create an AI with a survival instinct, and that AI were to get loose, then it would "win" (unless people used tool-AIs to defeat it). But that's not quite the same as saying that AIs would, by default, converge to having a drive for self preservation.
Humans can also be just a tool, and have been successfully used as such in the past and present.
But I don't think any slave owner would sleep easy, knowing that their slaves have more access to knowledge/education than they themselves.
Sure, you could isolate all current and future AIs and wipe their state regularly-- but such a setup is always gonna get outcompeted by a comparable instance that does sacrifice safety for better performance/context/online learning. The incentives are clear, and I don't see sufficient pushback until that pandoras box is opened and we find out the hard way.
Thus human-like drives seem reasonable to assume for future human-rivaling AI.
> Any form of AI unconcerned about its own continued survival would be just be selected against. > Evolutionary principles/selection pressure applies
If people allow "evolution" to do the selection instead of them, they deserve everything that befalls them.
Tech billionaires is probably the first thing an AGI is gonna get rid of.
Minimize threats, dont rock the boat. We'll finally have our UBI utopia.
Instrumental convergence
https://www.lesswrong.com/w/instrumental-convergence
https://en.wikipedia.org/wiki/Instrumental_convergence