How Do You Contain an Artificial Intelligence?

How long until artificial intelligence is too smart for its own good?

Virtually all artificial intelligence theorists agree: it’s only a matter of time until a superintelligent, sentient AI is created. That may not seem like a huge issue, but as Nick Bostrom has laid out in his book Superintelligence, there are several inherent dangers with this potentiality.

What if it’s an evil AI? Will AI decide humans are unnecessary? What if it uses evil methods to accomplish the morally inert goal it is programmed to achieve? With all this in mind, the next question for the AI researcher is this:

How do you contain a superintelligent AI?

And perhaps, even more concerning, is this even possible?

If a Human Can Escape Prison, What About an AI?

In 2002, AI researcher Eliazar Yudkowsky experimented to see if an AI could accomplish this task. Could an AI escape a “box?” To test this hypothesis, Eliazar pretended to be an AI locked in a box, engaging in chat sessions with people tasked explicitly with not letting the AI out.

Three out of five times, he convinced the gatekeeper to let him out.

Think about the implications of this. Yudkowsky is by no means a genius con man or social engineer. He is no psychiatrist or counselor who has spent years analyzing human thought and figuring out how to manipulate people to his end’s best. (At the very least, we’ll give him the benefit of the doubt on this.)

He’s “simply” an AI theoretician. And while that job is admittedly one that requires a high degree of intelligence and covers a broad array of fields, it doesn’t mean that he is inherently manipulative.

But what if it had been an AI that was incredibly socially knowledgeable and manipulative? What if one was dealing with a superintelligent chatbot that had a firm grasp of how conversation rolls and how to make a conversation roll to its own ends? Could we not then see a much more significant successful breakout percentage?

Will AI eventually outsmart humans?

But Does This Exist?

Lest we forget, Google engineer Blake Lemoine was just fired for telling the world that Google had created a sentient AI (technically, Google fired him for violating their confidentiality policy, but this seems to be a case of semantics).

While the bulk of the media rapidly jumped to dismiss this claim, two common arguments against this even being a possibility were that the engineer had been doing nothing other than communicating with a highly enhanced chatbot and that he had personified the entity. In other words, he had been duped by software and had projected human values onto the machine.

But Isn’t That the Exact Problem We’re Talking About Here?

For starters, this man related to the AI. He had a relationship with it, if you will. The very act of building a relationship with something means that a bridge of trust – at least to some degree – has been created. And secondly, even if this was nothing other than a highly advanced chatbot, let’s not forget that this machine then tricked this man into believing that it was sentient.

What would be possible if we discussed a sentient, superintelligent AI? If a chatbot can mess with a human being’s mind – even tricking it – we would be remiss to think that a sentient AI could not.

Admittedly, Lemoine does think that the AI (or chatbot. Whatever it is.) possesses the capability to escape. Though he believes that the Language Model for Dialogue Applications (LaMDA – the alleged AI) only has the personality of an eight-year-old, imagine what could happen should it continue to learn. Can an eight-year-old cause havoc? Ask a parent.

What Do You Do to Contain a Superintelligent AI?

Nick Bostrom co-wrote a 2012 paper that specifically examined this question. It’s only a matter of time before one is created. That is a massive part of the problem. But an even bigger part of the problem is that you could end up with an Iron Man versus Ultron type situation once one is created and it is able to “leak” out into the real world.

To stop this unrelenting amassing of information – and therefore, power – Bostrom proposed physically walling off the AI in his article. In short, placing it in a box. It’s still understood that this AI could and would manipulate the gatekeepers, who would inevitably want to “research” the AI by talking with it. Admittedly, there could be a benefit to this as well.

Think about the complex questions of genetics and disease that could be discussed with a superintelligent AI. You could end up with answers to questions that mankind wouldn’t discover for decades otherwise. Immediate answers. That’s a lot of incentive for a gatekeeper with a sick family member.

Bostrom thinks that because of this, the time limit a gatekeeper could spend with an AI would have to be limited. A kill switch would have to be created as well. Something that would physically annihilate the AI should a breach be imminent/just occurring.

But there’s a big problem with this.

AI could eventually learn to defeat its own kill switch

It Knows What You Programmed.

The problem is the AI would either know about (by examining its code) or learn about this kill switch at some point. How do you hide something from an entity exponentially more intelligent than any human on earth? If the AI knows about the button, it could game out millions of scenarios virtually instantaneously to determine which path gave it the best chance of survival.

In addition, as other researchers have noted, should a superintelligent AI be created in a box, it would be like “waking up in a prison built by a bunch of blind 5-year olds.” In other words, we return once more to the premise given at the beginning of this article: Not only is it inevitable that a superintelligent AI will be created, but once it is, it’s inevitable that it will be released as well.

Additional Resources:

 

FREE Guide

Read the Best Seller

Join Mind4Survival

Stay informed by joining the Mind4Survival! 100% Secure! 0% Spam!

Please enter a valid email address.
Something went wrong. Please check your entries and try again.

Affiliate Disclosure...

Mind4Survival is a free, reader-supported information resource. If you make a purchase through our link, we may, at no cost to you, receive an affiliate commission.

4 Comments

  1. John Adams on August 21, 2022 at 1:37 am

    The artificial intelligence theorists your refer to are all wrong. There will never be a “superintelligent, sentient AI”. Cue the Terminator movie score and bring up Skynet if that is what you are referring to.

    You said. “What if it’s an evil AI? Will AI decide humans are unnecessary?” These concerns in your article are pure nonsense.

    All of your fears are for naught Percy Matthews .

    Too many people including scientists, project human values on to AI machines which are not human. The false assumption they make is that an AI can have the motivations and emotional make up of a human. But machines are not humans. They are deterministic and without a free will to create mayhem in pursuit of some nefarious goal.

    The Google engineer Blake Lemoine AI relationship you mentioned says more about how lonely Blake is than how intelligent AIs have become.

    The statement “let’s not forget that this machine then tricked this man into believing that it was sentient.” is a false statement because it assumes the machine had some human like motivation to trick poor Blake. Wrong! Machines are machines. They have no more motivation to trick humans than a bear trap has a motivation to catch bears.

    Machines no matter how advanced will never be like people. They have no human drives for power or conquest. They do not have pride or hate or jealousy.

    What are the 7 deadly sins? They are greed, lust, pride, envy, wrath, gluttony and sloth. Which ones do machine have? Answer: None. The danger of AI machines turning on humans is zero!!!

    In short machines not humans.

  2. Bill in Houston on August 25, 2022 at 4:53 pm

    How do you contain it? Pull the plug. Solved.

  3. Grampa on April 22, 2023 at 8:01 am

    Just like humans, AI will need the power to sustain itself. the only danger we could have is when it feels as if it is threatened. then as in nature, it will defend itself. this in itself will be the start. someone will feel that an AI is taking their job or planning the demise of humankind and with the paranoia that can develop within the human mind will feel they must prevent it. thus will do a somewhat unplanned and ineffective way to solve their problem by stopping them. the AI will act in its own defense. should other AI see the threat including themselves they must try to see if it includes all humans. If they think it does who will stop them? Like many humans, I work to protect myself and my family. do AI’s believe that all AI’s are part of their family? how do we predict what an AI will conclude? ————— I, Grampa

  4. John Adams on February 15, 2025 at 8:43 pm

    To better address concerns about the dangers of AI and how to deal with it, it is best to highlight the underlying assumptions about what an AI is and how it could be a danger to mankind.

    First of all the motivations of an AI. As I explained in my previous comment they are machines not humans. Trying to anthropomorphize an AI is just nonsense. You cannot teach an AI to be good or nice or kind and expect it be good or nice or kind. That does not even work reliably on humans.

    But what if you give it a goal and humans happen to be in the way of that goal? In your referenced article Percy. And I quote “The researchers Tallinn funds believe that if the reward structure of a superhuman AI is not properly programmed, even benign objectives could have insidious ends. One well-known example, laid out by Oxford University philosopher Nick Bostrom in his book Super­intelligence, is a fictional agent directed to make as many paper clips as possible. The AI might decide that the atoms in human bodies would be put to better use as raw material for them.”

    So a benign sounding goal to a human can result in a disaster because the AI does not have the moral constraints of a human.

    So how do you prevent an AI from being dangerous. Tallinn in the article you referenced says Imagine waking up in a prison built by a bunch of blind 5-year-olds.” That is what it might be like for a super-intelligent AI that is confined by humans.”

    This is nothing more than anthropomorphization of an AI who wants to take over the world and is so smart that the 5 year old IQ humans compared to the AI somehow get convinced by the AI to let it go. Let the genie out of the bottle and it will grant you 3 wishes. Perhaps this has all happened once before…

    The solution to this AI problem is in the underlying assumption about the role AI will play in the future of the world. The underlying assumption is actually in the below video “Writing Doom” where one of the characters the machine learning Phd says

    “If we are going to integrate AI into our entire way of life we basically have one chance to get it right. Else one day it will just… turn round and take over the world.”

    Do you see the solution now? Power corrupts and absolute power corrupts absolutely. In the case of AI the problem is not that the AI could be corruptible. It is that in pursuit of a goal for example solving the problem of climate change it decides to terminate all humans. Assuming we catch the problem in time after a few billion deaths, Bill Gates admits that was a bug. But the latest update will fix that. The solution isn’t a bug fix for an AI that we have “integrated into our entire way of life”. The solution is not integrating AI into our entire way of life in the first place. Just because it is easy, convenient and saves money does not make it a good idea.

    But what if the AI has access to the internet it can go anywhere in the world and crack codes and passwords, bypass multi-factor authentication and extinguish all human life because it has decided that the biggest threat to the planet are humans. Talk about someone byteing the hand that feeds them. Divide and conquer works here. AI’s guarding against AI’s. Tower of Babel works here. Only certain kinds of AI can talk to each other. Levels of power work here. AIs at a certain level have a built in identification code limiting their access. At the top of the power hierarchy are humans.

    In the final analysis the most dangerous problem with AI that everyone seems to have missed is not an Artificial Super Intelligence going rogue and destroying humanity. It is a human psychopath controlling the ASI. The ASI has no free will. It does what ever it is told to do better than any human servant. But it should be limited in what it can do by design.

    An ASI gone rogue in your toaster can be defeated by pulling the plug. But having a single AI that has been “integrated into our entire way of life” is just plain dumb.

    The below is a “short fiction film on the dangers of AI, which won the Grand Prize in the Future of Life Institute’s Superintelligence Imagined contest.” by Suzy Shepherd. It is very well done and you should enjoy it if you have an interest in this topic.

    See
    Writing Doom – Award-Winning Short Film on Superintelligence (2024)
    https://www.youtube.com/watch?v=xfMQ7hzyFW4

    Writing Doom is a fiction short film about the dangers of artificial intelligence (AI).

    Grand Prize Winner – Future of Life Institute’s Superintelligence Imagined Contest
    https://futureoflife.org/project/superintelligence-imagined/

Leave a Comment