After which the AI will be shut down and unable to kill any more, and next time we build systems like that we’ll be more cautious.
I think that’s an overly simplistic assumption if you’re dealing with advanced A(G)I systems. Here are a couple of Computerphile videos that discuss potential problems with building in stop buttons: AI “Stop Button” Problem (Piped mirror) and Stop Button Solution? (Piped mirror).
Both videos are from ~6 years ago, so maybe conclusive solutions have been proposed since then that I’m unaware of.
We’re talking about an AI “without arms and legs”, that is, one that’s not capable of actually building and maintaining its own infrastructure. If it attacks humanity it’s committing suicide.
And an AGI isn’t any cleverer or more capable than a human is; you may be thinking of ASI.
I would prefer an AI to be dispassionate about its existence and not be motivated by the threat of committing suicide. Even if it can’t maintain its own infrastructure, I can imagine scenarios where its mere ability to falsify information is enough to cause catastrophic outcomes. If its “motivation” includes returning favorable values, it might decide against raising alerts about dangers that would necessitate bringing it offline for repairs or that would cause distress to humans (“the engineers worked so hard on this water treatment plant and I don’t want to concern them with the failing filters and growing pathogen content”). I don’t think the terrible outcomes are guaranteed or a reason to halt all research in AI, but I just can’t get behind absolutist claims that there’s nothing to worry about if we just do X.
Right now, if there’s a buggy process, I can tell the manager to shut it down cleanly; if it hangs, I can force the manager to kill the process immediately. If you then add AI, there’s the possibility that it second-guesses my intentions and ignores or reinterprets that command too. And if it can’t, then the AI element could just be standard conditional programming, and we’re only adding unnecessary complexity and points of failure.
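For anyone who hasn’t done this by hand, the “clean shutdown, then forced kill” pattern above can be sketched roughly like this in Python. This is only an illustration: `stop_process` and `grace_seconds` are names I made up, and it assumes a POSIX system, where `terminate()` sends SIGTERM and `kill()` sends SIGKILL.

```python
import subprocess
import sys


def stop_process(proc: subprocess.Popen, grace_seconds: float = 5.0) -> str:
    """Ask a child process to exit; force-kill it if it does not comply in time.

    Hypothetical helper for illustration, not part of any process manager.
    """
    # Polite request: SIGTERM on POSIX. The process may handle or ignore it.
    proc.terminate()
    try:
        proc.wait(timeout=grace_seconds)
        return "terminated"
    except subprocess.TimeoutExpired:
        # Forced stop: SIGKILL on POSIX cannot be caught or ignored.
        proc.kill()
        proc.wait()  # reap the child so it does not linger as a zombie
        return "killed"
```

You would call it with something like `stop_process(subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"]))`. The point is that the second step does not depend on the process’s cooperation at all, which is exactly the property that gets murky once the thing being stopped can reinterpret the request.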
The funny thing is, we already have superintelligent people walking around. Do they manipulate everyone into killing each other? No, because we have basic rules like “murder is bad” or just “fraud is bad”.
Super intelligent computers would probably not even bother with people because they would be created with a purpose like “develop new physics” or “organize these logistics”. Smart people are smart enough to not break the rules because the punishment is not worth it. Smart computers will be finding aliens or something interesting.
Wow, I think you need to hear about the paperclip maximiser.
Basically, you tell an AGI to maximise the number of paperclips. As that is its only goal and it wasn’t programmed with human morality, it starts making paperclips. Then it realises humans might turn it off, and that would be an obstacle to maximising the number of paperclips. So it kills all the humans and turns them into paperclips, turns the whole planet into paperclips, and turns all of the universe it can access into paperclips, because when you’re working with a superintelligence, a small misalignment of values can be fatal.