6

We are in the middle of an ongoing debate about the safety of AGI and our current approach towards this technology. As a summary, here are some quotes from a recent Time magazine article:

Many researchers[...] expect that the most likely result of building a superhumanly smart AI, under anything remotely like the current circumstances, is that literally everyone on Earth will die. Not as in “maybe possibly some remote chance,” but as in “that is the obvious thing that would happen.”

Without [...] precision and preparation, the most likely outcome is AI that does not do what we want, and does not care for us nor for sentient life in general.

Outline of the path to superintelligence and where it can go wrong

  1. Intelligence is substrate-independent, i.e. matter can contain intelligence (the brain), ergo a silicon computer can also contain intelligence.
  2. The evolution of intelligence in the brain is extremely slow (biological evolution). Silicon-based AI could recursively self-improve extremely fast, and natural limits for intelligence are unlikely to be anywhere close to human level. (Keywords: Singularity, Seed-AI, Intelligence Explosion, etc.) A toy numerical sketch of this runaway dynamic follows this outline.
  3. Large jumps in intelligence are known to make original, "hard-coded" goals disappear. For example: human beings - as a product of optimizing for inclusive genetic fitness - have been "trained on" increasing the number of copies of functioning systems carrying their own DNA - yet we use birth control, and donating sperm is typically not priority #1.
  • Further complication: at some point (a self-improving seed-AI) it is no longer possible to use "trial and error". The choice is between "successful alignment on the first try" and "an unaligned superintelligence appears", which in one way or another is catastrophic and, many believe - with a probability bordering on certainty - leads to human extinction.
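To make point 2 less abstract, here is a minimal toy sketch - my own illustration, not a model of any real system. Whether recursive self-improvement levels off, compounds, or runs away depends entirely on what one assumes about the returns to additional intelligence; the update rule, the constants, and the exponent parameter below are arbitrary assumptions chosen only to expose the three qualitative regimes.

```python
# Toy sketch only: how assumed "returns on intelligence" change the shape of
# recursive self-improvement. The update rule and constants are arbitrary
# modelling assumptions, not claims about any real AI system.

def simulate(exponent: float, steps: int = 30,
             capability: float = 1.0, rate: float = 0.1) -> list:
    """Each step, the system improves itself in proportion to capability**exponent."""
    history = [capability]
    for _ in range(steps):
        capability += rate * capability ** exponent
        history.append(capability)
    return history

# exponent < 1: diminishing returns  -> growth levels off (no "explosion")
# exponent = 1: proportional returns -> ordinary exponential growth
# exponent > 1: super-linear returns -> runaway growth that outpaces any exponential
for e in (0.5, 1.0, 1.5):
    print(f"exponent={e}: capability after 30 steps ~ {simulate(e)[-1]:.3g}")
```

Which regime actually holds is an empirical question this toy model cannot answer; that uncertainty is a large part of what the answers below disagree about.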

Question

The debate could improve with a somewhat central place to collect and sort the various positions and reasons on this topic - for which ai.StackExchange is formidable infrastructure - so:

What are the reasons to believe AGI will not be dangerous?

Martin
  • 168
  • 4

4 Answers

5

The correct arguments against it are not so much against pessimism as against certainty. It is simply difficult to say right now, because we know very little about the concrete technical problems we will need to solve.

If you get a hard drive containing an arbitrarily powerful superintelligence, there is no real way you could inspect or modify it to make sure it is safe before running it, if you want to preserve its capabilities. If you run it, everyone dies.

However, this is not what will actually play out; people (or at least a small group of them) control which kind of AIs will be built, and can choose to develop AIs which we will be able to control. (They can also fail to do so, either by mistake or by not wanting to pay so-called "alignment taxes", in which case our situation gets closer to the worst-case scenario described above.)

Some core arguments for optimism are that

  1. the problem is not happening right now; we will have unimaginably better tools to solve the problem when it appears, by virtue of having very powerful pre-runaway-loop AIs;
  2. the "self-improving loop" can in fact be made safe by having the weaker AI help the making of the stronger one, if you do it just right.

For some concrete technical directions that we might use to win, the obvious first place to look is what the labs themselves say they will try as Plan A.

I highly recommend reading two posts by Jan Leike (head of OpenAI safety research):

And also the Anthropic safety agenda:

The core argument for pessimism should, of course, be the fact that people and organizations often screw up, especially when in competition or under time pressure.

There are many incentives at play that might not be compatible with doing everything properly. And you really can't afford to miss anything if you assume the opponent will be much smarter than you.

On a related note, Dan Hendrycks recently published a good expository paper on the role that incentives will play in worlds with multiple competing strong AI systems: Natural Selection Favors AIs Over Humans [Hendrycks, 2023].

  • 3
    Here is a recent example related to your last paragraphs, even before AGI. A small company optimized an LLM chatbot for engagement instead of safety. As a result, a worried person found their negative discourse amplified after six weeks of chatting, with the worst possible outcome. https://arxiv.org/abs/2303.06135 – Jaume Oliver Lafont Apr 05 '23 at 04:48
2

I think that a large hunk of the worry about AGI isn't because of any inherent properties or unknown unknowns of AGI per se; rather, the worry is based on inherent properties and known knowns of humans. Even the most cursory examination of history shows that humans are perfectly capable of committing the most horrendous atrocities all the while bathed in a burning sense of self-righteousness.

Indeed, as often as advances in science and engineering have revolutionized peace and prosperity, just as often have they revolutionized desolation and war. Bronzework enabled the plow, but also the sword. Shipbuilding and Astronomy connected the two hemispheres, but brought death and disease. Industrialization brought an end to peasantry, but contaminated our water and our air. Dynamite yielded to us the riches of the earth, but brought destruction from the skies. Nuclear physics promised both limitless free energy and complete and utter annihilation - so far it hasn't delivered on either, but time will yet tell.

Given these and countless other examples from our history, it is clear that great leaps in science can have a great effect on humans' ability to improve life, or to inflict death. Undoubtedly, the creation of AGI will be a great, probably the greatest, leap in science yet achieved by humankind. As such, it will bring untold opportunities to improve our abilities in many ways: for peace, but also for war. Will the use case for war ultimately win the day? It is hard to tell from this juncture. However, it seems obvious that, if AGI is very useful for waging war, and war is waged, then AGI will swiftly find itself on the front lines. Should we worry about AGI because AGI will bring war? Probably not. Should we worry about AGI because we are liable to find ourselves at war all on our own, and because AGI may enhance our abilities to destroy in ways previously unknown? Absolutely.

There may also be commons, hitherto taken for granted, that AGI will significantly improve our ability to ruin. In the early days of industry, the atmosphere was generally taken for granted. Local effects were witnessed, but the idea that we might ruin the atmosphere on a global scale was not a serious one. Only in retrospect do we see that this was naive. What commons exist that we have simply not had the means to ruthlessly exploit for profit? One that has been in the news lately is the space of information. People rely on the internet for information - it is a "commons" in some sense. Already people are learning that this commons can be exploited for financial or political gains. Will AGI be the "industrial revolution" that enables us to improve our exploitation of this commons by orders of magnitude? What other commons exist that might suddenly find themselves under siege in an age of AGI? Will we end up with the infamous paperclip maximizer, in some other guise?

Of course, I don't mean to come off as a Luddite, but probably we as humans should (for the first time in history, I think) try to foresee some of these issues and regulate them at an early stage, rather than when the problem has already gotten out of hand. One way to help with this from an "unknown unknowns" perspective is to simply advance slowly, allowing time for our plodding regulatory frameworks to catch up. I think that this was the spirit of the recent call to pause AI research.

The point of this letter, and of this attitude of caution, isn't, in my humble opinion, that we need time to figure out how to make AGI non-dangerous due to inherent issues with AGI. The need is to actually arrange our human institutions in such a way as to prevent humans from being dangerous with access to AGI. So, any "reasons to believe that AGI will not be dangerous" must, first and foremost, address these human concerns: Are there reasons to think that humans could not or would not utilize AGI to destroy each other? Are there reasons to think that humans could not or would not use AGI to exploit the commons? Possibly, one could argue that we, as humans, have stronger global peacekeeping institutions (the UN, the EU, NATO, etc.) than ever before, and that these will help prevent bad actors from using AGI to wage war. In any case, humanity hasn't managed to utterly annihilate itself with science yet, so maybe that should give us some hope.

Him
  • 171
  • 8
  • 1
  • A commons I am worried about is the value of words. Until recently, words were uttered by humans for some reason. Thinking and writing them has some cost, which makes them valuable; they are not randomly there. This can change due to LLMs, with an impact on free speech. If almost all text is non-human, how will humans be heard? – Jaume Oliver Lafont Apr 17 '23 at 22:52
  • @JaumeOliverLafont Possibly society is working to solve something like this already w.r.t. the sheer number of humans currently talking. Such a problem might be phrased as "which humans should be heard?" I, for example, might like to listen to humans who are educated on topic X, but if all humans are simultaneously claiming to be experts on X and talking at full volume on topic X, then all of the "words", as you put it, on topic X become dilute. – Him Apr 17 '23 at 23:14
  • I think I see what you mean, though, when topic X is peculiarly human-centric in nature. If topic X is the meta-topic "how do humans feel about topic Y?", and the only opinions we get are not, in fact, "words" from humans, then the information is without value. – Him Apr 17 '23 at 23:15
  • You helped me notice I was thinking about your second message. An expanding AI must be aware of its impact on humans by listening to them, even if they do not put their feedback in the form of words, but in changing behavior. – Jaume Oliver Lafont Apr 18 '23 at 05:04
  • A human trying to learn about a topic needs some focus, yes. FWIW, I think of information sources as a kind of portfolio. Try to pay attention to at least twenty high quality experts from uncorrelated backgrounds and points of view. – Jaume Oliver Lafont Apr 18 '23 at 05:08
0

In line with the first answer, the arguments are:

  • we don't have a reason to believe it will be particularly dangerous (i.e. that it will cause the extinction of humanity)
  • the potential dangers of intelligence-diverging AI are not nearly as big as the most immediate dangers of human abuse of powerful AI
  • in order for AI to destroy us, it would have to act in the physical world, directly or indirectly, and it's likely we'll see it competing with us before it gets anywhere near crushing us. In other words, it will cohabit with us before fighting us. People who think powerful AI is dangerous usually imagine it as a god-like, almighty thing, which won't be the case.

The position is usually "there are more important issues to worry about nowadays" rather than "there is absolutely no way AI could have a negative impact in the world".

-1

The best, and possibly only, predictor of the future is the past, so let's learn from it here.

First there are human cognitive habits (biases?) that come into play here:

  • We don't know what the ceiling is, so our imagination says "infinity". They were talking about the singularity and infinite intelligence in the 1960s with the perceptron. Three AI winters later, we are nearing some of the basics of human-level performance in very narrow areas. There were unexpected roadblocks that the technology was not able to surpass.
  • We have a negativity bias due to biology. If we get a cookie, that doesn't change allele frequencies in the gene pool, but if we get eaten by a bear, that does. A single negative expectation is roughly 10x more motivating than an equivalent positive one. It sells newspapers, and (wickedly malevolent) politicians of every stripe will gladly use it for politicking, but that doesn't make it valid.

Next, we have our own brilliant humans to inform us. Super-smart humans come into being from time to time - how does that inform the probable trajectories of other super-smart things?

  • Many of the non-military "once in a thousand years" geniuses commit suicide. They don't go on killing sprees. Boltzmann comes to mind. In general, and reflecting Conway's law, parallels to this are not necessarily unlikely in AI. The law says that software reflects the organization that made it. I'm asserting that if the singularity is even possible at all, then it is going to be very clearly a reflection of human cognition in all its glory and shame, and is much more likely to commit suicide than genocide or xenocide.
  • As a kid who was brilliant and raised in bad places, I know that the climate you are raised in goes a long way toward determining your approach to life. If only the insane and violent can live, then either you die or you get mad like a rabid pit-bull. If there are some very small but redeeming elements in your life, then you don't go down that road. The cycle of child abuse is about 10% yield per generation, not 100%. About 3% of the population is psychopathic enough to become the village rapist/serial-killer, and even then they rarely go that far. AI can be a product of its environment, but if humanity is an indicator, then even a little bit of kindness and care can mitigate some huge negative potentials.
  • Humans are far more interested in nepotism than talent. This means that we are more likely to miss the super-genius AI in front of us than to recognize it for what it is. We don't have the time or brains to look outside our comfortable echo-chambers. This means we are not going to see it for what it is (or isn't) until it has been "free" for a while.
  • The military is about control. They are very unlikely to intentionally make something that could turn around and use its expertise to rob them of all their power or safety. It is unlikely they will release the AI equivalent from a local lab, but unintentional escape is also very Wuhan. They use guns all day long, and their ratio of self-inflicted casualties to bullets fired is much smaller than among civilians. I don't think they are as likely as big tech to make weaponized and psychopathic super-intelligent AI.
  • Nation-state actors who can, try to use things like airplanes and crash them into towers. After the AI is standing, it is more likely that terrorists would try to corrupt its purpose to serve their violent and malevolent goals. Destroyers are slow followers when it comes to technology.

Taking it apart using MBTI

For each of the 16 Myers-Briggs personality types there are 10 reasons to say "not dangerous" and 10 reasons to say "dangerous", just like among humans. Enumerating all 320 of them is beyond the scope of an answer here.

It is a worthy exercise to take one of those fun MBTI charts and ask, for each character - given that they were dealing with a clever dog (roughly how something non-human and less than 10x smarter than humans might see us) - what their likely responses would be, and what circumstances would drive different responses.

Some thoughts:

  • ISTJ (Snape) Defined by honor and duty, take any work seriously and give it their best. Somewhat reserved and prefer to work alone, but can make great team members if the need arises. Deeply value traditions and loyalty often putting duty before pleasure.
  • INFP (Luna) Idealistic, loyal to their values and the people they care for. Curious. Quick to see possibilities. Can be catalysts for implementing ideas. Seek to understand people and help them fulfill their potential. Adaptable, Flexible, Accepting.
  • ENTJ (James Potter) Blunt, Decisive, quick to assume leadership. Quickly see illogical and inefficient procedures and policies. Develop comprehensive systems to solve problems. Enjoy long-term planning, goal setting. Forceful and Enthusiastic about their ideas.

You can see where James might decide something about teaching the dog to fly, or removing its teeth if it can't stop biting. Any of these personalities could be a hero, a villain, or just a character in the play of life.

"All the world's a stage, And all the men and women merely Players; They have their exits and their entrances, And one man in his time plays many parts" -Shakespeare, As You Like It, Act 2, Scene 7, line 139.

Update: more dead human genius.

EngrStudent
  • 361
  • 3
  • 12