Modeling human action understanding as inverse planning

Modeling human action understanding as inverse planning Chris Baker, Josh Tenenbaum, Rebecca Saxe MIT

Heider and Simmel demo

Intuitive psychology How do we infer hidden mental states of other agents that cause their observed behavior? Beliefs, desires, plans, intentions, emotions. How do we use mental-state inferences to learn about the world? Pulling out into traffic, jumping off a summit… What is the structure of intuitive theories of psychology that support these inferences, and how are those theories acquired?

Why did the man cross the street?

Principle of rationality Intuitively, we assume other agents will tend to take sequences of actions that most effectively achieve their goals given their beliefs. More formally: inverse planning in a goal-based Markov Decision Process (MDP). Beliefs (B) Goals (G) Rational Planning (MDP) Actions (A)
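
The "rational planning" box above can be sketched in code. The following is a minimal toy model, not the paper's implementation: value iteration for a goal-based MDP on a small open grid, with a Boltzmann-rational ("softmax") policy P(a | s, g) ∝ exp(β·Q_g(s, a)). Reading the slides' β as a softmax inverse-temperature is our assumption; the grid, costs, and all names here are illustrative.

```python
import numpy as np

GRID = 5                                 # 5x5 open grid; states are (row, col)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
STEP_COST = -1.0                         # small cost for each step short of the goal

def q_values(goal, discount=0.95, iters=200):
    """Q_g(s, a) via value iteration; moving into a wall keeps you in place."""
    V = np.zeros((GRID, GRID))
    Q = np.zeros((GRID, GRID, len(ACTIONS)))
    for _ in range(iters):
        for r in range(GRID):
            for c in range(GRID):
                if (r, c) == goal:
                    continue             # absorbing goal state, value 0
                for i, (dr, dc) in enumerate(ACTIONS):
                    nr = min(max(r + dr, 0), GRID - 1)
                    nc = min(max(c + dc, 0), GRID - 1)
                    Q[r, c, i] = STEP_COST + discount * V[nr, nc]
        V = Q.max(axis=2)
    return Q

def policy(state, goal, beta=2.0):
    """Boltzmann-rational action probabilities at `state` given `goal`."""
    q = q_values(goal)[state]
    p = np.exp(beta * (q - q.max()))     # subtract max for numerical stability
    return p / p.sum()

p = policy((4, 0), goal=(0, 0), beta=2.0)
print(p)   # the softmax favors "up", the action that moves toward the goal
```

Inverting this forward model with Bayes' rule, P(g | a, s) ∝ P(a | s, g)·P(g), is what turns a planner into a goal-inference model.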

Caution! We are not proposing this as a computational model of how people plan! It is a computational model of people’s mental model of how people plan. Whether planning is “rational” or well-described as solving an MDP is an interesting but distinct question. Beliefs (B) Goals (G) Rational Planning (MDP) Actions (A)

Rational action understanding in infants (Gergely & Csibra): schema (figure omitted).

The present research Aims: To test how accurately and precisely the inverse planning framework can explain people’s intuitive psychological judgments. To use the inverse planning framework as a tool to test alternative accounts of people’s intuitive theories of psychology. Experiments 1 & 2: goal inference. Experiments 3 & 4: action prediction.

Experiment 1: goal inference Method: Subjects (N=16) view animated trajectories in simple maze-like environments. Subjects observe partial action sequences with several candidate goals and are asked to rate the relative probabilities of the goals at different points along each trajectory. Setup: Cover story: intelligent aliens moving in their natural environment. Assume a fully observable world: the agent’s beliefs = the true states and transition functions of the environment. 100 total judgments, with 3-6 judgment points along each of 36 different trajectories (= 4 goal positions x 3 kinds of obstacles x 3 goals).

Specific inverse planning models Model M1(β): fixed goal The agent acts to achieve a particular state of the environment, which is fixed for a given action sequence. Small negative cost for each step that does not reach the goal.
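
M1's inference step can be sketched as follows. Because the goal is fixed over the whole action sequence, the posterior is just a running product of per-step action likelihoods. The numbers below are toy values we made up; in the full model each `step_lik[t][g]` would come from the Boltzmann policy of the rational-planning model.

```python
import numpy as np

def m1_posterior(step_lik, prior=None):
    """Posterior over a fixed goal after each observed action.

    step_lik[t][g] stands in for P(a_t | s_t, g); under M1,
    P(g | a_1..t) ∝ P(g) * prod_t P(a_t | s_t, g).
    """
    step_lik = np.asarray(step_lik, dtype=float)
    T, K = step_lik.shape
    post = np.full(K, 1.0 / K) if prior is None else np.asarray(prior, float)
    history = []
    for t in range(T):
        post = post * step_lik[t]   # multiply in this step's action likelihood
        post = post / post.sum()    # renormalize
        history.append(post.copy())
    return history

# Three candidate goals; every observed move favors goal 0 (toy values).
toy = [[0.6, 0.2, 0.2],
       [0.7, 0.2, 0.1],
       [0.8, 0.1, 0.1]]
for t, p in enumerate(m1_posterior(toy)):
    print(t, np.round(p, 3))   # belief in goal 0 grows monotonically here
```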

Specific inverse planning models Model M2(β,γ): switching goals Just like M1, but the agent’s goal can change at any time step with probability γ. Agent plans greedily, not anticipating its own potential goal changes.
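
Inference under M2 is a standard hidden-Markov forward filter over the latent goal sequence. One detail the slide leaves open is how a switch resamples the goal; the sketch below assumes a uniform jump to one of the other K−1 goals, and again uses made-up per-step likelihoods in place of the Boltzmann policy term.

```python
import numpy as np

def m2_posterior(step_lik, gamma):
    """Forward filter over the current goal when it can switch each step.

    step_lik[t][g] stands in for P(a_t | s_t, g); the goal persists with
    probability 1 - gamma and otherwise jumps uniformly to another goal.
    """
    step_lik = np.asarray(step_lik, dtype=float)
    T, K = step_lik.shape
    trans = np.full((K, K), gamma / (K - 1))   # off-diagonal: jump uniformly
    np.fill_diagonal(trans, 1.0 - gamma)       # diagonal: goal persists
    post = np.full(K, 1.0 / K)
    history = []
    for t in range(T):
        post = trans.T @ post        # predict: the goal may have switched
        post = post * step_lik[t]    # update with the observed action
        post = post / post.sum()
        history.append(post.copy())
    return history

# Toy trajectory: two steps toward goal 0, then a sudden turn toward goal 1.
toy = [[0.7, 0.2, 0.1],
       [0.7, 0.2, 0.1],
       [0.1, 0.8, 0.1]]
print(np.round(m2_posterior(toy, gamma=0.1)[-1], 3))    # posterior lags
print(np.round(m2_posterior(toy, gamma=0.67)[-1], 3))   # tracks the last step
```

With small γ the model integrates over the whole path, so the final posterior still gives goal 0 substantial weight; with large γ it largely follows the most recent action.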

Sample behavioral data (figure omitted)

Modeling results: people’s goal judgments vs. M1(β) fixed goal and M2(β,γ) goal switching (scatterplots omitted).

Modeling results: people’s goal judgments vs. M2(2,0.33) goal switching (scatterplot omitted).

An alternative heuristic account? Last-step heuristic: infer the goal based on only the last movement (instead of the entire path); this is a special case of M2, equivalent to M2(β,0.67). This model correlates highly with people’s judgments in Experiment 1. However, there are qualitative differences between its predictions and people’s judgments that suggest people use a more sophisticated form of temporal integration.
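
The claimed equivalence can be checked numerically (toy numbers, our own sketch): with K goals and switch probability γ = (K−1)/K under a uniform-jump switching model, the goal is effectively resampled uniformly at every step, so the M2 filter's posterior over the current goal depends only on the most recent action's likelihood.

```python
import numpy as np

def filter_step(post, lik, gamma):
    """One predict-then-update step of the M2 goal filter."""
    K = len(post)
    trans = np.full((K, K), gamma / (K - 1))
    np.fill_diagonal(trans, 1.0 - gamma)
    post = (trans.T @ post) * lik
    return post / post.sum()

K = 3
liks = [np.array([0.7, 0.2, 0.1]),
        np.array([0.1, 0.8, 0.1])]       # two observed steps (toy likelihoods)
post = np.full(K, 1.0 / K)
for lik in liks:
    # With gamma = (K-1)/K the transition matrix is uniform, so the
    # predict step erases all history before each update.
    post = filter_step(post, lik, gamma=(K - 1) / K)

print(np.round(post, 3))                       # filter posterior
print(np.round(liks[-1] / liks[-1].sum(), 3))  # normalized last-step likelihood
```

The two printed distributions coincide, which is exactly the last-step heuristic.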

An alternative heuristic account? A thought experiment comparing M2(2,0.1) goal switch with M2(2,0.67) last-step heuristic (stimuli figure omitted).

Human goal inferences compared with M2(2,0.1) goal switch, M2(2,0.67) last-step heuristic, and M1(2) fixed goal (figure omitted).

Analysis with full data set: critical stimuli (figure omitted).

Correlation with human judgment (95% CIs) for M1(2) fixed goal; M2(2,0.1), M2(2,0.25), and M2(2,0.5) goal switching; and M2(2,0.67) last-step heuristic, at step 10 (before disambiguation) and step 11 (after disambiguation) (chart omitted).

Summary Inverse planning is a framework for inferring mental states from behavior, assuming a rational agent. Inverse planning can be used to predict people’s goal attributions with high accuracy (at least in simple environments). Goal attributions are better explained by inverse planning with a dynamic space of goals than a simpler model with fixed goals or various “one-step” heuristics. Intuitive psychology appears to be based on a precise predictive model, much like intuitive physics. Intuitive psychology may be “more rational” than actual psychology….

Open directions More complex environments. Hierarchical goal structures, plans. Richer mental-state representations, e.g. recursive beliefs: “I’m guessing that you think Mary is wrong, but trust me, she isn’t.” Competitive interactions (e.g., Jun Zhang). Language understanding. The acquisition of intuitive psychology. The relation between psychology (how people actually think and plan) and intuitive psychology.