Presentation transcript:

Self-Improving Artificial Intelligence and the Methods of Rationality. Eliezer Yudkowsky, Research Fellow, Singularity Institute for Artificial Intelligence (yudkowsky.net). Yeshiva University, March 2011.

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011
Linda is 31 years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice, and also participated in anti-nuclear demonstrations.
Rank the following statements from most probable to least probable:
Linda is a teacher in an elementary school
Linda works in a bookstore and takes Yoga classes
Linda is active in the feminist movement
Linda is a psychiatric social worker
Linda is a member of the League of Women Voters
Linda is a bank teller
Linda is an insurance salesperson
Linda is a bank teller and is active in the feminist movement

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011
Linda is 31 years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice, and also participated in anti-nuclear demonstrations.
Rank the following statements from most probable to least probable:
Linda is a teacher in an elementary school
Linda works in a bookstore and takes Yoga classes
Linda is active in the feminist movement
Linda is a psychiatric social worker
Linda is a member of the League of Women Voters
Linda is a bank teller (A)
Linda is an insurance salesperson
Linda is a bank teller and is active in the feminist movement (A & B)
89% of subjects thought Linda was more likely to be a feminist bank teller than a bank teller.
Conjunction fallacy: P(A & B) must be ≤ P(A).

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 Linda is 31 years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice, and also participated in anti-nuclear demonstrations. Linda is a bank teller (A) Linda is a bank teller and is active in the feminist movement (A & B) (Venn diagram: bank tellers, feminists, and neither; the feminist bank tellers fall inside the bank-teller circle.)
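To make the conjunction rule concrete, here is a minimal Python sketch (not part of the original slides): it draws a hypothetical population with made-up base rates and checks that the fraction who are feminist bank tellers can never exceed the fraction who are bank tellers.

```python
import random

# Hypothetical population; the base rates (5% tellers, 30% feminists) are
# made up for illustration -- the inequality holds for any numbers.
random.seed(0)
population = [(random.random() < 0.05, random.random() < 0.30)
              for _ in range(100_000)]

p_teller = sum(is_teller for is_teller, _ in population) / len(population)
p_teller_and_feminist = sum(is_teller and is_feminist
                            for is_teller, is_feminist in population) / len(population)

# Every feminist bank teller is also counted as a bank teller,
# so P(A & B) <= P(A) by construction.
assert p_teller_and_feminist <= p_teller
print(p_teller, p_teller_and_feminist)
```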

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 WHICH IS THE BEST BET? Die with 4 green faces and 2 red faces. Die rolled 20 times, series of Gs and Rs recorded. If your chosen sequence appears, you win $25. Which of these three sequences would you prefer to bet on? Sequence 1, Sequence 2, Sequence 3. Amos Tversky and Daniel Kahneman (1983), Extensional versus Intuitive Reasoning. Psychological Review, 90.

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 WHICH IS THE BEST BET? Die with 4 green faces and 2 red faces. Die rolled 20 times, series of Gs and Rs recorded. If your chosen sequence appears, you win $25. Which of these three sequences would you prefer to bet on? Sequence 1, Sequence 2, Sequence 3. 65% chose Sequence 2. Undergraduates played this gamble with real payoffs. Amos Tversky and Daniel Kahneman (1983), Extensional versus Intuitive Reasoning. Psychological Review, 90.

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 Die with 4 green faces and 2 red faces. Die rolled 20 times, series of Gs and Rs recorded. If your chosen sequence appears, you win $25. Which of these three sequences would you prefer to bet on? But Sequence 1 dominates Sequence 2: Sequence 1 appears within Sequence 2, so Sequence 1 appears anywhere Sequence 2 appears. CONJUNCTION FALLACY: judging P(A) < P(A & B). Amos Tversky and Daniel Kahneman (1983), Extensional versus Intuitive Reasoning. Psychological Review, 90.
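The exact letter sequences were shown as images and are not reproduced in this transcript, so the sketch below uses stand-in sequences that share the structure described on the slide: Sequence 2 is Sequence 1 with an extra 'G' prepended, so any run of rolls containing Sequence 2 necessarily contains Sequence 1, and Sequence 1 can never be the worse bet.

```python
import random

def win_probability(target: str, trials: int = 100_000) -> float:
    """Estimate the chance that `target` appears somewhere in 20 rolls
    of a die with 4 green (G) and 2 red (R) faces."""
    wins = 0
    for _ in range(trials):
        rolls = "".join(random.choices("GR", weights=[4, 2], k=20))
        wins += target in rolls
    return wins / trials

random.seed(0)
seq1 = "RGRRR"      # stand-in for Sequence 1 (assumed, for illustration)
seq2 = "G" + seq1   # stand-in for Sequence 2: contains Sequence 1 as a substring

print("Sequence 1:", win_probability(seq1))
print("Sequence 2:", win_probability(seq2))   # never higher than Sequence 1's
```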

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 Die with 4 green faces and 2 red faces: GGRGGR. CONJUNCTION FALLACY: judging P(A) < P(A & B). Judgment of representativeness (we ask: which sequence is most representative?) versus judgment of probability (we want: which sequence is most probable?). Amos Tversky and Daniel Kahneman (1983), Extensional versus Intuitive Reasoning. Psychological Review, 90. Sequence 1, Sequence 2, Sequence 3.

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 Two independent groups of professional analysts were asked: Please rate the probability that the following event will occur in 1983. 2nd INTERNATIONAL CONGRESS ON FORECASTING. Amos Tversky and Daniel Kahneman (1983), Extensional versus Intuitive Reasoning. Psychological Review, 90. Version 1: A complete suspension of diplomatic relations between the USA and the Soviet Union, sometime in 1983. Version 2: A Russian invasion of Poland, and a complete suspension of diplomatic relations between the USA and the Soviet Union, sometime in 1983. The group asked to rate Version 2 responded with significantly higher probabilities.

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 Two independent groups of professional analysts were asked: Please rate the probability that the following event will occur in 1983. The group asked to rate Version 2 responded with significantly higher probabilities. SECOND INTERNATIONAL CONGRESS ON FORECASTING. Amos Tversky and Daniel Kahneman (1983), Extensional versus Intuitive Reasoning. Psychological Review, 90. Version 1: A complete suspension of diplomatic relations between the USA and the Soviet Union, sometime in 1983. Version 2: A Russian invasion of Poland, and a complete suspension of diplomatic relations between the USA and the Soviet Union, sometime in 1983.

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 Two independent groups of professional analysts were asked: Please rate the probability that the following event will occur in 1983. The group asked to rate Version 2 responded with significantly higher probabilities. Adding detail to a story can make the story sound more plausible, even though the story necessarily becomes less probable. CONJUNCTION FALLACY: judging P(A) < P(A & B). Version 1: A complete suspension of diplomatic relations between the USA and the Soviet Union, sometime in 1983. Version 2: A Russian invasion of Poland, and a complete suspension of diplomatic relations between the USA and the Soviet Union, sometime in 1983.

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 Subjects presented with a description of a teacher during a lesson. Some subjects asked to evaluate the quality of the lesson, as a percentile score. Other subjects asked to predict the percentile standing of the teacher 5 years later. INSENSITIVITY TO PREDICTABILITY. Kahneman, D. and Tversky, A. (1973). On the psychology of prediction. Psychological Review, 80.

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 Subjects presented with a description of a teacher during a lesson. Some subjects asked to evaluate the quality of the lesson, as a percentile score. Other subjects asked to predict the percentile standing of the teacher 5 years later. Judgments identical - equally extreme! INSENSITIVITY TO PREDICTABILITY. Kahneman, D. and Tversky, A. (1973). On the psychology of prediction. Psychological Review, 80.

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 What do you think you know, and how do you think you know it?

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 One of these technologies is not like the others... Artificial Intelligence Cancer cure Interplanetary travel Nano-manufacturing

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 The power of intelligence: Fire Language Nuclear weapons Skyscrapers Spaceships Money Science

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 "Book smarts" vs. cognition: "Book smarts" evokes images of: Calculus Chess Good recall of facts Other stuff that happens in the brain: Social persuasion Enthusiasm Reading faces Rationality Strategic cleverness

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 What do we think we can guess? Technologies which impact upon cognition will end up mattering most, because intelligence is more powerful and significant than cool devices.

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 What do you think you know, and how do you think you know it?

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 Artificial Addition |- Plus-Of(Seven, Six) = Thirteen |- Plus-Of(Nine, Six) = Fifteen |- Plus-Of(Seven, Two) = Nine |- Plus-Of(Thirteen, Two) = Fifteen.

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 Views on Artificial General Addition: Framing problem - what 'twenty-one plus' equals depends on whether it's 'plus three' or 'plus four'. Need to program a huge network of arithmetical facts to cover common-sense truths. Need an Artificial Arithmetician that can understand natural language, so instead of being told that twenty-one plus sixteen equals thirty-seven, it can obtain knowledge by reading the Web. Need to develop a General Arithmetician the same way Nature did - evolution. Top-down approaches have failed to produce arithmetic. Need a bottom-up approach to make arithmetic emerge. Accept unpredictability of complex systems.

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 Views on Artificial General Addition: Neural networks - just like the human brain! Can be trained without understanding how they work! NNs will do arithmetic without us, their creators, ever understanding how they add. Need calculators as powerful as a human brain. Moore's Law predicts availability on April 27, 2031, between 4 and 4:30 AM. Must simulate the neural circuitry humans use for addition. Gödel's Theorem shows no formal system can ever capture the properties of arithmetic. Classical physics is formalizable, hence AGA must exploit quantum gravity. If human arithmetic were simple enough to program, we couldn't count high enough to build computers. Haven't you heard of John Searle's Chinese Calculator Experiment? Will never know the nature of arithmetic; the problem is just too hard.

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 Artificial Addition: The Moral |- Plus-Of(Seven, Six) = Thirteen |- Plus-Of(Nine, Six) = Fifteen |- Plus-Of(Seven, Two) = Nine |- Plus-Of(Thirteen, Two) = Fifteen. Moral 1: When you're missing a basic insight, you must find that insight. Workarounds may sound clever, but aren't. You can't talk sensibly about solutions until you're no longer confused.

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 Artificial Addition: The Moral |- Plus-Of(Seven, Six) = Thirteen |- Plus-Of(Nine, Six) = Fifteen |- Plus-Of(Seven, Two) = Nine |- Plus-Of(Thirteen, Two) = Fifteen. Moral 1: When you're missing a basic insight, you must find that insight. Workarounds may sound clever, but aren't. You can't talk sensibly about solutions until you're no longer confused. Moral 2: Patching an infinite number of surface cases means you didn't understand the underlying generator.
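A toy contrast (mine, not the speaker's) that illustrates both morals: a table of memorized Plus-Of facts only ever covers the cases someone has already patched in, while a definition that captures the underlying generator of addition covers every case at once.

```python
# Surface patching, in the spirit of the Plus-Of(...) slide:
PLUS_OF = {
    ("Seven", "Six"): "Thirteen",
    ("Nine", "Six"): "Fifteen",
    ("Seven", "Two"): "Nine",
    ("Thirteen", "Two"): "Fifteen",
    # ...any finite number of further facts would still leave gaps.
}

# The underlying generator: addition as repeated succession.
def add(a: int, b: int) -> int:
    """Peano-style addition: increment a, b times."""
    for _ in range(b):
        a += 1
    return a

print(PLUS_OF.get(("Eight", "Five"), "no stored fact"))  # the table is silent
print(add(8, 5))                                         # 13, with no extra patching
```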

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 "The influence of animal or vegetable life on matter is infinitely beyond the range of any scientific inquiry hitherto entered on. Its power of directing the motions of moving particles, in the demonstrated daily miracle of our human free-will, and in the growth of generation after generation of plants from a single seed, are infinitely different from any possible result of the fortuitous concurrence of atoms... Modern biologists were coming once more to the acceptance of something and that was a vital principle." - Lord Kelvin

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 Mind Projection Fallacy: If I am ignorant about a phenomenon, this is a fact about my state of mind, not a fact about the phenomenon. (Jaynes, E. T. (2003). Probability Theory: The Logic of Science. Cambridge: Cambridge University Press.)

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 What do we think we can guess? Technologies which impact upon cognition will end up mattering most. Today's difficulties in constructing AI are not because intelligence is inherently mysterious, but because we're currently still ignorant of some basic insights.

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 What do you think you know, and how do you think you know it?

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 (Diagram: human minds shown as a small region within the larger space of minds-in-general.)

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 Biology not near top of scale: Lightspeed >10^6 times faster than axons and dendrites. Synaptic spike dissipates >10^6 times the minimum heat (though transistors do worse). Transistor clock speed >>10^6 times faster than neuron spiking frequency.

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 1,000,000-fold speedup physically possible: one subjective year (31,556,926 seconds) would pass in about 31 seconds.
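The arithmetic behind the 31-second figure, spelled out with only the numbers from the slide:

```python
SECONDS_PER_YEAR = 31_556_926   # from the slide
SPEEDUP = 1_000_000             # 10^6-fold speedup

print(SECONDS_PER_YEAR / SPEEDUP)   # ~31.6 wall-clock seconds per subjective year
```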

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 (Scale: village idiot at one end, Einstein at the other.)

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 (Scale redrawn to include lizard and mouse: lizard, mouse, village idiot, Einstein; on this larger scale the village idiot and Einstein sit close together.)

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 AI advantages Total read/write access to own state Absorb more hardware (possibly orders of magnitude more!) Understandable code Modular design Clean internal environment

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 What do we think we can guess? Technologies which impact upon cognition will end up mattering most. Today's difficulties in constructing AI are due to ignorance of key insights, not inherent mysteriousness. It is possible for minds to exist that are far more powerful than human. We can see that human hardware is far short of the physical limits, and there's no reason to expect the software to be optimal either.

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 What do you think you know, and how do you think you know it?

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 (Diagram: Intelligence → Technology.)

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 (Diagram: Intelligence → Technology → Intelligence, a feedback loop.)

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 AI

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 "Intelligence explosion:" Invented by I. J. Good in 1965. Hypothesis: More intelligence -> more creativity on the task of making yourself even smarter. Prediction: Positive feedback cycle rapidly creates superintelligence. (Good, I. J. (1965). Speculations Concerning the First Ultraintelligent Machine. In Advances in Computers, 6, F. L. Alt and M. Rubinoff, eds. New York: Academic Press.)
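A purely illustrative toy loop (not a model from the talk, and not a forecast) showing what "positive feedback" means here: if each round of self-improvement is proportional to current capability, capability compounds rather than growing linearly. The growth constant is arbitrary.

```python
# Toy illustration of I. J. Good's feedback loop; the 0.5 is an arbitrary
# stand-in for "smarter systems find proportionally bigger improvements."
capability = 1.0
for generation in range(10):
    capability += 0.5 * capability
    print(f"generation {generation + 1}: capability {capability:.1f}")
```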

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 How fast an intelligence explosion? From chimpanzees to humans – 4x brain size, 6x frontal cortex, no apparent obstacles to constant evolutionary pressure. From flint handaxes to Moon rockets – constant human brains didn't slow down over the course of technological history as problems got harder. Thus an increasingly powerful optimization process should undergo explosive self-improvement.

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 Intelligence explosion hypothesis does not imply, nor require: More change occurred from 1970 to 2000 than from 1940 to 1970.

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 Intelligence explosion hypothesis does not imply, nor require: More change occurred from 1970 to 2000 than from 1940 to 1970. Technological progress follows a predictable curve.

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 What do we think we can guess? Technologies which impact upon cognition will end up mattering most. Today's difficulties in constructing AI are due to ignorance of key insights, not inherent mysteriousness. It is possible for minds to exist that are far more powerful than human. An AI over some threshold level of intelligence will recursively improve itself and explode into superintelligence.

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 What do you think you know, and how do you think you know it?

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 In Every Known Human Culture: tool making, weapons, grammar, tickling, meal times, mediation of conflicts, dance, singing, personal names, promises, mourning. (Donald E. Brown (1991), Human Universals. New York: McGraw-Hill.)

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 A complex adaptation must be universal within a species. (John Tooby and Leda Cosmides (1992), The Psychological Foundations of Culture. In The Adapted Mind, eds. Barkow, Cosmides, and Tooby.)

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 A complex adaptation must be universal within a species. If: 6 necessary genes, each at 10% frequency in the population. Then: 1 in 1,000,000 have the complete adaptation. (John Tooby and Leda Cosmides (1992), The Psychological Foundations of Culture. In The Adapted Mind, eds. Barkow, Cosmides, and Tooby.)
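The slide's back-of-the-envelope figure, worked out (assuming the six genes assort independently):

```python
gene_frequency = 0.10      # each necessary gene at 10% frequency
necessary_genes = 6

p_complete = gene_frequency ** necessary_genes
print(p_complete)          # 1e-06, i.e. 1 in 1,000,000 carry the complete adaptation
```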

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 Incremental evolution of complexity: 1. [A]: A is advantageous by itself. 2. [A, B]: B depends on A. 3. [A', B]: A' replaces A, depends on B. 4. [A', B, C]: C depends on A' and B... (John Tooby and Leda Cosmides (1992), The Psychological Foundations of Culture. In The Adapted Mind, eds. Barkow, Cosmides, and Tooby.)

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 The Psychic Unity of Humankind: Complex adaptations must be universal in a species – including cognitive machinery in Homo sapiens! (John Tooby and Leda Cosmides (1992), The Psychological Foundations of Culture. In The Adapted Mind, eds. Barkow, Cosmides, and Tooby.)

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 Must… not… emote…

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 The Great Failure of Imagination: Anthropomorphism

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 Fallacy of the Giant Cheesecake Major premise: A superintelligence could create a mile-high cheesecake. Minor premise: Someone will create a recursively self- improving AI. Conclusion: The future will be full of giant cheesecakes. Power does not imply motive.

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 What do we think we can guess? Technologies which impact upon cognition will end up mattering most. Today's difficulties in constructing AI are due to ignorance of key insights, not inherent mysteriousness. It is possible for minds to exist that are far more powerful than human. An AI over some threshold level of intelligence will recursively self-improve and explode into superintelligence. Features of "human nature" that we take for granted are just one of a vast number of possibilities, and not all possible agents in that space are friendly.

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 What do you think you know, and how do you think you know it?

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 What do you think you know, and how do you think you know it? To predict a smarter mind, you'd have to be that smart yourself?

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 Human Chess AI ?

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 Human

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 Chess AI Key insight: Predict & value consequences

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 Deep Blue's programmers couldn't predict its exact moves. "To predict a smarter mind, you'd have to be that smart yourself"?

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 Deep Blue's programmers couldn't predict its exact moves. So why not use a random move generator? "To predict a smarter mind, you'd have to be that smart yourself"?

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 Deep Blue's programmers couldn't predict its exact moves. So why not use a random move generator? Unpredictability of superior intelligence ≠ unpredictability of coinflips. "To predict a smarter mind, you'd have to be that smart yourself"?

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 Deep Blue's programmers couldn't predict its exact moves. So why not use a random move generator? Unpredictability of superior intelligence ≠ unpredictability of coinflips. Deep Blue's "unpredictable" moves predictably have the consequence of winning the game. "To predict a smarter mind, you'd have to be that smart yourself"?

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 Deep Blue's programmers couldn't predict its exact moves. So why not use a random move generator? Unpredictability of superior intelligence ≠ unpredictability of coinflips. Deep Blue's "unpredictable" moves predictably have the consequence of winning the game. Inspection of code can't prove consequences in an uncertain real-world environment, but it can establish with near-certainty that an agent is trying to find the most-probably-good action. "To predict a smarter mind, you'd have to be that smart yourself"?

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 Stability of goals in self-modifying agents: Gandhi doesn't want to kill people. Offer Gandhi a pill that will make him enjoy killing people?

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 Stability of goals in self-modifying agents: Gandhi doesn't want to kill people. Offer Gandhi a pill that will make him enjoy killing people? If Gandhi correctly assesses this is what the pill does, Gandhi will refuse the pill, because the current Gandhi doesn't want the consequence of people being murdered.

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 Stability of goals in self-modifying agents: Gandhi doesn't want to kill people. Offer Gandhi a pill that will make him enjoy killing people? If Gandhi correctly assesses this is what the pill does, Gandhi will refuse the pill, because the current Gandhi doesn't want the consequence of people being murdered. Argues for a folk theorem that, in general, rational agents will preserve their utility functions during self-optimization. (Steve Omohundro (2008), "The Basic AI Drives." In Proceedings of the First AGI Conference, eds. Pei Wang and Stan Franklin.)
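A minimal decision-theoretic sketch of the Gandhi argument (mine, not Omohundro's formalism): the agent scores the self-modification with its current utility function, so a pill that predictably leads to murders gets a low score and is refused. All numbers and function names here are invented for illustration.

```python
def predicted_murders(takes_pill: bool) -> float:
    # Toy world model: the modified, murder-enjoying Gandhi goes on to kill.
    return 10.0 if takes_pill else 0.0

def current_utility(murders: float) -> float:
    # Gandhi's *current* values: murders are strongly negative.
    return -1000.0 * murders

def choose(options):
    # Evaluate each option by the current utility of its predicted consequences.
    return max(options, key=lambda take_pill: current_utility(predicted_murders(take_pill)))

print(choose([True, False]))   # False -- the current Gandhi refuses the pill
```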

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 What do we think we can guess? Technologies which impact upon cognition will end up mattering most. Today's difficulties in constructing AI are due to ignorance of key insights, not inherent mysteriousness. It is possible for minds to exist that are far more powerful than human. An AI over some threshold level of intelligence will recursively self-improve and explode into superintelligence. Not all possible agents in mind design space are friendly. If we can obtain certain new insights, it should be possible to construct a benevolent self-improving Artificial Intelligence.

Eliezer Yudkowsky lesswrong.com Yeshiva University March 2011 What do we think we can guess? Technologies which impact upon cognition will end up mattering most. Today's difficulties in constructing AI are due to ignorance of key insights, not inherent mysteriousness. It is possible for minds to exist that are far more powerful than human. An AI over some threshold level of intelligence will recursively self-improve and explode into superintelligence. Not all possible agents in mind design space are friendly. If we can obtain certain new insights, it should be possible to construct a benevolent self-improving Artificial Intelligence. And live happily ever after.

Eliezer Yudkowsky, Research Fellow, Singularity Institute for Artificial Intelligence (yudkowsky.net). Yeshiva University, March 2011. More Info: