
Thrust IC: Action Selection in Joint-Human-Robot Teams


1 Thrust IC: Action Selection in Joint-Human-Robot Teams
Nick Roy (lead), Cynthia Breazeal, Rod Grupen
MURI 8 Kickoff Meeting 2007
MIT, Vanderbilt, Stanford, UW, UMass Amherst

2 Task Objective
Objective: develop a robust and interactive planner that incorporates uncertainty in the human cognitive model, world state, world dynamics, etc.
- The human-robot team receives task assignments from a dynamic task allocation algorithm
- Within each task, human and robot must choose actions to accomplish the task robustly
- Robots have two major decision-making tasks:
  - actions required to complete the current task
  - actions required to share information
Given perfect knowledge of the current true state of the human, choosing the correct action may be relatively easy. But even in the presence of perfect sensing, the state of the human-robot system cannot be known exactly.

3 The Action Selection Problem
Natural human-robot collaboration leads to several challenges within teams:
- lack of shared knowledge or spatial awareness
- lack of common representations of knowledge and linguistic ambiguities (e.g., different vocabularies)
- noisy signals (vision or speech recognition)
Teams must accept tasks (i.e., local objective functions) from the task assignment algorithm, leading to challenges between teams:
- lack of shared knowledge or spatial awareness due to communication constraints
- lack of a common representation with the task assignment algorithm due to computation and communication constraints
(Figure: the human says "Let's go inside the bank." while the robot interprets "Going to the river bank...", illustrating linguistic ambiguity.)

4 Existing Technology: Partially Observable Markov Decision Processes
- Hidden states: the human intentional state (may include situational awareness, local task, etc.)
- Observations: what the robot hears and sees
- Actions: movements and queries the robot can do
- Reward model R(s, a)
- Transition model T(s' | s, a)
- Observation model O(o | s, a)
(Figure: a state diagram with hidden states "going inside building", "helping the injured", and "going back to base" between Start and End, annotated with an observation model over heard words such as "door", "help", and "base".)
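The POMDP components listed above can be written down concretely. Below is a minimal sketch in Python (not part of the original slides): the state, action, and observation names follow the slide's diagram, and all probability and reward values are illustrative assumptions.

```python
import numpy as np

# Hidden human intentional states (names from the slide's diagram)
states = ["going_inside_building", "helping_the_injured", "going_back_to_base"]
# Robot actions: task actions plus an information-gathering query (assumed for illustration)
actions = ["follow", "fetch_supplies", "go_to_base", "ask_what_are_you_doing"]
# Observations: words the robot hears (noisy speech recognition output)
observations = ["door", "help", "base"]

num_s, num_a, num_o = len(states), len(actions), len(observations)

# Transition model T[a][s, s'] = P(s' | s, a): intentions persist but can change
T = np.tile(np.array([[0.80, 0.15, 0.05],
                      [0.05, 0.80, 0.15],
                      [0.05, 0.05, 0.90]]), (num_a, 1, 1))

# Observation model O[a][s', o] = P(o | s', a): each intention makes its own word likely
O = np.tile(np.array([[0.70, 0.20, 0.10],    # "door" likely when going inside the building
                      [0.15, 0.70, 0.15],    # "help" likely when helping the injured
                      [0.10, 0.20, 0.70]]),  # "base" likely when going back to base
            (num_a, 1, 1))

# Reward model R[s, a]: reward actions that match the human's intent; queries cost a little
R = np.array([[ 5.0, -1.0, -5.0, -0.5],
              [-1.0,  5.0, -5.0, -0.5],
              [-5.0, -1.0,  5.0, -0.5]])
```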

5 Existing Technology
A POMDP tracks a belief, a probability distribution over states. Actions are selected based on this belief, thus taking into account uncertainty about what the human is really doing. POMDPs have been used in several human-robot interaction applications, such as Roy, Pineau, and Thrun (2000) and Williams and Young (2005).
(Figure: a bar chart of the belief over the states "bldg", "help", and "base"; with most probability mass on "base", the best action is to go to the base.)
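Belief tracking itself is a Bayes filter over the hidden states. A minimal sketch, assuming the illustrative T, O, and R arrays from the previous example; the one-step greedy action choice here is only a stand-in for a real POMDP planner, which would optimize long-horizon value.

```python
import numpy as np

def belief_update(b, a, o, T, O):
    """Bayes filter: b'(s') is proportional to O[a][s', o] * sum_s T[a][s, s'] * b(s)."""
    predicted = T[a].T @ b            # prediction: push the belief through the transition model
    unnorm = O[a][:, o] * predicted   # correction: weight by the likelihood of the observation
    return unnorm / unnorm.sum()

def greedy_action(b, R):
    """Pick the action with the highest expected immediate reward under the belief.
    (Only a sketch; a full POMDP solver would plan over future beliefs as well.)"""
    return int(np.argmax(b @ R))

# Usage with the arrays defined in the previous sketch:
# b = np.ones(3) / 3.0                       # start from a uniform belief
# b = belief_update(b, a=0, o=2, T=T, O=O)   # after action "follow", the robot hears "base"
# print(greedy_action(b, R))                 # belief mass shifts toward "going_back_to_base",
#                                            # so the best action is to go to the base
```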

6 Technical Limitations
- Existing planning algorithms and models are single-agent, single-human
- Most planning is focused on a single user goal, not multiple goals and constraints
- Existing algorithms have used simple models of human intentional states and natural language
- Existing algorithms use a priori models, not learned models

7 Technical Advances
- Generalization of planning algorithms and models to multi-person teams
  Technical challenge: scaling the computation to large action spaces
- Generalization of planning algorithms to multiple objectives
  Technical challenge: identifying problem representations that allow the system to share state with a dynamic task allocation algorithm

8 Technical Advances
- Incorporate human intentional states and rich models of natural language
  Technical challenge: the cognitive model will give rise to exponential growth in the state space, and rich natural language models will give rise to exponential growth in the observation space
- Replace a priori models with learned models (see the sketch below)
  Technical challenge: providing action policies that allow the system to behave reasonably even when the model is unknown or uncertain
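On the learned-models point, a toy illustration (an assumption for this sketch, not the project's proposed learning method): estimate the observation model from labeled interaction logs with Dirichlet smoothing, so the planner still has nonzero likelihoods when data are sparse. For simplicity the observation here depends only on the hidden state, not on the action.

```python
import numpy as np

def learn_observation_model(logs, num_s, num_o, prior=1.0):
    """Estimate O_hat[s, o] = P(o | s) from labeled (state, observation) pairs.

    `logs` is a list of (state_index, observation_index) pairs, e.g. collected
    during supervised training runs with a human teammate. The Dirichlet prior
    keeps every probability nonzero, so an unseen word never zeroes out the belief.
    """
    counts = np.full((num_s, num_o), prior)
    for s, o in logs:
        counts[s, o] += 1.0
    return counts / counts.sum(axis=1, keepdims=True)

# Hypothetical training log over the three intentions and the words "door", "help", "base"
logs = [(0, 0), (0, 0), (1, 1), (2, 2), (2, 2), (2, 0)]
O_hat = learn_observation_model(logs, num_s=3, num_o=3)
```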

9 Year 1 Milestones
- Demonstrate coupling of natural language and joint action selection with robot system
- Demonstration of interaction with human teammate in medical triage scenario
  - Human teammates instruct robot to assist with a victim in a single example task
  - Human assigns specific tasks to robot, to be performed autonomously and independently (e.g., give triage tag to another victim)

10 Year 2 Milestones
- Demonstrate integration of task allocation, cognitive models, state estimation, and joint action selection
- Demonstration of interaction with human teammate as part of a larger system
  - Human-robot team receives task from the cool zone
  - Human and robots negotiate task division
  - Robot offers to help with unspecified tasks (e.g., robot offers to fetch toolkit for teammate)
  - Robot provides additional information from remote station (e.g., warns human teammate of scenario change)

11 Year 3 Milestones
- Demonstrate integration of learning and joint action selection
- System evaluation
- Demonstration of human-robot training and learning of joint-action models
  - Robot learns vocabulary, behavioral patterns, etc. of human teammates
  - Uses learned models to improve performance in the field

