A Confidence-Based Approach to Multi-Robot Demonstration Learning Sonia Chernova Manuela Veloso Carnegie Mellon University Computer Science Department.


Presentation transcript:

A Confidence-Based Approach to Multi-Robot Demonstration Learning
Sonia Chernova, Manuela Veloso
Carnegie Mellon University, Computer Science Department

2 Policy Development
Learning from Demonstration:
1. Access/select sensor data
2. Develop actions
3. Provide demonstrations
Traditional Policy Development:
1. Access/select sensor data
2. Develop actions
3. Code the policy (C++, Python, …)

3 Research Questions
- How can we teach a single robot through demonstration by enabling both the robot and the teacher to select demonstration examples?
- How can we teach collaborative multi-robot tasks through demonstration?

4 Single Robot Learning
Confidence-Based Autonomy (CBA) algorithm:
- Confident Execution
- Corrective Demonstration
[Diagram labels: Teacher, Policy, Environment; state, action, demonstration request, demonstration]
[Sonia Chernova and Manuela Veloso. Interactive Policy Learning through Confidence-Based Autonomy. Journal of Artificial Intelligence Research. Vol. 34, 2009.]

5 Task Representation
- Robot state
- Robot actions
- Training dataset
- Policy represented by classifier (e.g. Gaussian Mixture Model, Support Vector Machine, etc.)
  - policy action
  - decision boundary with greatest confidence for the query
  - classification confidence w.r.t. decision boundary
[Figure: sensor data plotted over features f1 and f2]
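The task representation on this slide can be illustrated with a minimal sketch of a classifier that returns a policy action together with a classification confidence. The slide names Gaussian Mixture Models and Support Vector Machines; the k-nearest-neighbour vote below is a deliberately simpler stand-in, and the function name `classify`, the feature vectors, and the confidence definition are assumptions made for illustration only.

```python
from collections import Counter
from math import dist

def classify(training_data, state, k=3):
    """Return (action, confidence) for a query state.

    training_data: non-empty list of (state_vector, action) demonstrations.
    Confidence is the fraction of the k nearest neighbours that agree with
    the winning action (an illustrative assumption, not the slide's method).
    """
    neighbours = sorted(training_data, key=lambda pair: dist(pair[0], state))[:k]
    votes = Counter(action for _, action in neighbours)
    action, count = votes.most_common(1)[0]
    return action, count / len(neighbours)

# Hypothetical two-feature demonstrations (f1, f2) labelled with actions.
demos = [
    ((0.9, 0.1), "DropLeft"),
    ((0.8, 0.2), "DropLeft"),
    ((0.1, 0.9), "DropRight"),
    ((0.2, 0.8), "DropRight"),
]
action, confidence = classify(demos, (0.85, 0.15))
```

With these points, two of the three nearest neighbours vote for the same action, so the confidence is below 1 even though the query lies well inside one cluster; a real classifier would define confidence relative to its decision boundary instead.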

6 Assumptions
- Teacher understands and can demonstrate the task
- High-level task learning
  - Discrete actions
  - Non-negligible action duration
- State space contains all information necessary to learn the task policy
- Robot is able to stop to request demonstration
  - … however, the environment may continue to change

7 Confidence-Based Autonomy
[Flowchart]
Time: s_1, s_2, s_3, s_4, …, s_i, …, s_t
Current State s_i -> Policy -> Request Demonstration?
- No: Execute Action a_p
- Yes: Request Demonstration -> a_d -> Add Training Point (s_i, a_d) -> Relearn Classifier -> Execute Action a_d
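The flowchart on this slide can be sketched as a single decision step. This is a hedged illustration rather than the authors' implementation: the 1-nearest-neighbour policy, the distance-based confidence, the fixed `threshold`, and the `teacher` function are all assumptions introduced here.

```python
from math import dist

class NearestNeighbourPolicy:
    """Toy stand-in for the classifier: 1-nearest-neighbour over demonstrations."""

    def __init__(self):
        self.points = []  # list of (state, action) training points

    def add_training_point(self, state, action):
        # "Relearning" is trivial here: the stored points are the model.
        self.points.append((state, action))

    def predict(self, state):
        """Return (action, confidence); confidence decays with distance."""
        if not self.points:
            return None, 0.0
        nearest_state, action = min(self.points, key=lambda p: dist(p[0], state))
        return action, 1.0 / (1.0 + dist(nearest_state, state))


def confident_execution_step(policy, state, teacher, threshold=0.8):
    """One pass through the flowchart; returns (executed_action, demo_requested)."""
    action, confidence = policy.predict(state)
    if action is not None and confidence >= threshold:
        return action, False          # confident: execute policy action a_p
    demo_action = teacher(state)      # not confident: request demonstration a_d
    policy.add_training_point(state, demo_action)
    return demo_action, True          # execute the demonstrated action a_d


def teacher(state):
    # Hypothetical teacher: drop left when the first feature dominates.
    return "DropLeft" if state[0] > state[1] else "DropRight"

policy = NearestNeighbourPolicy()
log = [confident_execution_step(policy, s, teacher)
       for s in [(1.0, 0.0), (0.0, 1.0), (0.95, 0.05), (0.5, 0.4)]]
```

In this run the third state is close to an earlier demonstration and is handled autonomously, while the other three fall below the threshold and trigger a demonstration request.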

8 Confidence-Based Autonomy: Confident Execution and Corrective Demonstration
[Flowchart]
Confident Execution: Current State s_i -> Policy -> Request Demonstration?
- No: Execute Action a_p
- Yes: Request Demonstration -> a_d -> Add Training Point (s_i, a_d) -> Relearn Classifier -> Execute Action a_d
Corrective Demonstration: Teacher -> a_c -> Add Training Point (s_i, a_c) -> Relearn Classifier
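The corrective demonstration branch can be sketched as follows, assuming the teacher observes an autonomously executed action and optionally supplies a corrected action a_c. The names `apply_correction` and `teacher_correction`, and the rule the teacher follows, are hypothetical and serve only to make the example runnable.

```python
def apply_correction(training_data, state, executed_action, teacher_correction):
    """Add (state, a_c) when the teacher corrects an autonomously executed action.

    teacher_correction: hypothetical callable returning the corrected action,
    or None when the teacher accepts the executed action.
    Returns True if the dataset changed and the classifier should be relearned.
    """
    corrected = teacher_correction(state, executed_action)
    if corrected is None or corrected == executed_action:
        return False
    training_data.append((state, corrected))
    return True


def teacher_correction(state, action):
    # Hypothetical rule: a high first feature means the ball must go left.
    desired = "DropLeft" if state[0] > 0.5 else "DropRight"
    return None if action == desired else desired

data = [((0.9, 0.1), "DropLeft")]
changed_1 = apply_correction(data, (0.7, 0.3), "DropRight", teacher_correction)
changed_2 = apply_correction(data, (0.2, 0.8), "DropRight", teacher_correction)
```

Only the first call adds a training point, because the second executed action already matches what the teacher wants.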

9 Research Questions
- How can we teach a single robot through demonstration by enabling both the robot and the teacher to select demonstration examples?
- How can we teach collaborative multi-robot tasks through demonstration?

10 Multi-Robot Learning from Demonstration
- Challenges
  - Limited human attention
  - Human-robot interaction
  - Teaching collaborative behavior

11 flexMLfD: Multi-Robot Learning from Demonstration
[Diagram labels: Teacher, demonstrations, CBA]

12 Teaching Collaborative Behavior
Without Communication:
- Implicit Coordination
With Communication:
- Coordination through Shared State: communication occurs automatically
- Coordination through Active Communication: demonstration of communication actions
Robots perform complementary actions based on individual policies

13 Implicit Coordination
Coordination without communication
Observed state and environmental effects of physical actions enable complementary behaviors

14 Coordination through Shared State
Coordination based on state features that are automatically communicated to teammates each time their value changes.
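A minimal sketch of this idea, assuming in-process robots that call each other directly instead of using a network: a shared feature is broadcast to teammates only when its value actually changes. The `Robot` class, its method names, and the message dictionary (with from, label, and value fields) are illustrative assumptions.

```python
_MISSING = object()  # sentinel: the feature has never been set

class Robot:
    def __init__(self, name):
        self.name = name
        self.teammates = []
        self.shared = {}        # this robot's own shared state features
        self.received = {}      # (sender, label) -> latest value from teammates
        self.messages_sent = 0

    def receive(self, message):
        self.received[(message["from"], message["label"])] = message["value"]

    def set_feature(self, label, value):
        """Update a shared feature and broadcast it only if the value changed."""
        if self.shared.get(label, _MISSING) == value:
            return
        self.shared[label] = value
        self.messages_sent += 1
        message = {"from": self.name, "label": label, "value": value}
        for mate in self.teammates:
            mate.receive(message)

robot1, robot2 = Robot("Robot1"), Robot("Robot2")
robot1.teammates.append(robot2)
robot1.set_feature("Empty", False)
robot1.set_feature("Empty", False)  # unchanged value: nothing is sent
robot1.set_feature("Empty", True)
```

The teammate can then treat the received value as one more state feature in its own policy, which is what lets communication happen without being demonstrated.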

15 Coordination through Shared State

16 Coordination through Shared State From: Robot1 Label: Data3 Value:

17 Coordination through Active Communication
Coordination based on demonstrated communication actions that are incorporated directly into the robot's policy
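By contrast with shared state, active communication treats sending a message as an action the policy selects. The sketch below assumes a lookup table in place of a learned policy; the `NeedBall` label, the `("communicate", …)` action encoding, and the `execute` dispatcher are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Robot:
    name: str
    inbox: list = field(default_factory=list)      # messages from teammates
    motor_log: list = field(default_factory=list)  # physical actions executed

def execute(robot, teammate, action):
    """Dispatch an action chosen by the policy: physical or communication."""
    kind, payload = action
    if kind == "communicate":
        label, value = payload
        teammate.inbox.append({"from": robot.name, "label": label, "value": value})
    else:
        robot.motor_log.append(payload)

# Stand-in for a demonstrated policy: communication actions sit alongside
# physical actions in the same state -> action mapping.
demonstrated_policy = {
    "RedBall": ("physical", "DropLeft"),
    "Empty": ("communicate", ("NeedBall", True)),
}

robot1, robot2 = Robot("Robot1"), Robot("Robot2")
for observed_state in ["RedBall", "Empty"]:
    execute(robot1, robot2, demonstrated_policy[observed_state])
```

Because the message is sent only when the policy picks the communication action, the teacher decides through demonstration when communication should occur.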

18 Coordination through Active Communication

19 Coordination through Active Communication

20 Coordination through Active Communication From: Robot1 Label: Data3 Value:

21 QRIO Ball Sorting Domain
- Domain objectives:
  - Sort balls into bins by color
  - Distribute balls between robots
    - Share balls if teammate runs out
    - Communication required for collaboration
State: Red, Yellow, Blue, Empty
Actions: Drop left, drop right, pass ramp, wait

22 Shared State Video

23 Active Communication Video

24 Algorithm Comparison Ball Sorting Task

25 Algorithm Comparison
- Ordered Ball Sorting Task
  - Sort balls in order by color (all red first, then blue, then yellow)

26 Playground Domain

27 Scalability of Teaching Collaborative Multi-Robot Behaviors
- Scalability evaluation with up to 7 AIBO robots
  - Synchronous learning start times
  - Offset learning start times
  - Common policy learning
- Conclusions:
  - Linear trend for training time, number of demonstrations, etc.
  - No absolute upper bound on number of robots
  - Approach likely to be limited by domain-specific factors
    - training time
    - acceptable demonstration delay

28 Summary
- flexMLfD multi-robot learning framework
- Each robot learns individual policy using Confidence-Based Autonomy
- Three techniques for teaching collaborative multi-robot behavior
  - Implicit coordination
  - Coordination through active communication
  - Coordination through shared state
- Scalability analysis using up to 7 robots

29 Questions?

30 State and Action Selection
Robot Sensors: Camera, Microphone, Buttons, Network
State Features: R, G, B, NumAIBOsNear, HearBell, Q1 dist, Q2 dist, Q1 angle, Q2 angle, Q1 near, Q2 near, InOpenSpace, Empty, Button1, Button2, …
Robot Actions: Forward, Left, Right, TurnLeft, TurnRight, DropLeft, DropRight, PassRamp, Gesture, Sit, Wait, Wave, Search, WalkToOpenSpace, …
Task Configuration: identify task-relevant state features and actions
- State features: RedBall, YellowBall, BlueBall, Empty
- Actions: Wait, DropLeft, DropRight, PassRamp
Communication Data: Communication actions, Internal state, Shared state

31 Autonomous Ball Sorting Video

32 Types of Communication Data
- Each state feature that must be communicated for coordination may be:
  - Relevant over its full range of values
    - Communicated each time its value changes
  - Relevant over a reduced range of values
    - Communicated only when relevant
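The two cases on this slide can be sketched as a single send rule. This is an illustrative assumption about how the rule might be coded: `should_communicate`, `sent_values`, the `relevant_values` parameter, and the example stream of ball colors are hypothetical.

```python
def should_communicate(old_value, new_value, relevant_values=None):
    """Decide whether a feature value is sent to teammates.

    relevant_values=None means the feature is relevant over its full range,
    so every change is sent. Otherwise a change is sent only when the new
    value falls inside the reduced relevant range.
    """
    if new_value == old_value:
        return False
    if relevant_values is None:
        return True
    return new_value in relevant_values


def sent_values(stream, relevant_values=None):
    """Replay a sequence of feature values and collect those that get sent."""
    sent, previous = [], None
    for value in stream:
        if should_communicate(previous, value, relevant_values):
            sent.append(value)
        previous = value
    return sent

ball_stream = ["Red", "Red", "Blue", "Empty", "Empty", "Yellow"]
full = sent_values(ball_stream)
reduced = sent_values(ball_stream, relevant_values={"Empty"})
```

With the reduced range, the teammate hears only that the robot has run out of balls, which cuts message traffic at the cost of not learning when the value leaves the relevant range.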