Learning Behavioral Parameterization Using Spatio-Temporal Case-Based Reasoning Maxim Likhachev, Michael Kaess, and Ronald C. Arkin Mobile Robot Laboratory.

Presentation transcript:

Learning Behavioral Parameterization Using Spatio-Temporal Case-Based Reasoning
Maxim Likhachev, Michael Kaess, and Ronald C. Arkin
Mobile Robot Laboratory, Georgia Tech
This research was funded under the DARPA MARS program.

Motivation
Constant parameterization of robotic behavior results in inefficient robot performance
Manual selection of the “right” parameters is difficult and tedious work

Motivation (cont’d)
Use of Case-Based Reasoning (CBR) methodology: automatic selection of optimal parameters at run-time (ICRA’01)
– each case is a set of behavioral parameters indexed by environmental features
Figure labels: “front-obstructed” case, “clear-to-goal” case

Motivation for the Current Research
The CBR module
– improves robot performance (in simulations and on real robots)
– avoids the manual configuration of behavioral parameters
The CBR module still required the creation of a case library, which
– is dependent on the robot architecture
– needs extensive experimentation to optimize cases
– requires a good understanding of how CBR works
Solution: extend the CBR module to learn
– new cases from scratch or optimize existing cases
– in a separate training process or during missions

Related Work
Use of Case-Based Reasoning in the selection of behavioral parameters
– ACBARR [Georgia Tech ’92], SINS [Georgia Tech ’93]
– KINS [Chagas and Hallam]
Automatic optimization of behavioral parameters
– genetic programming (e.g., GA-ROBOT [Ram, et al.])
– reinforcement learning (e.g., Learning Momentum [Lee, et al.])

Behavioral Control and CBR Module
CBR Module controls (case output parameters):
– Weights for each behavior
– BiasMove Vector
– Noise Persistence
– Obstacle Sphere

Case Indices: Environmental Features
Spatial features: traversability vector
– split the environment into K = 4 angular regions
– compute the obstacle density within each region
– transform the density into traversability
Temporal features:
– short-term velocity towards the goal
– long-term velocity towards the goal
Example feature vectors (from the figure):
– V_spatial: f0 = 0.92, f1 = 0.58, f2 = 1.00, f3 = 0.68; V_temporal: ShortTerm Rs = 1.0, LongTerm Rl = 0.7
– V_spatial: f0 = 0.02, f1 = 0.22, f2 = 0.63, f3 = 0.02; V_temporal: ShortTerm Rs = 0.01, LongTerm Rl = 1.0
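The three spatial-feature steps above can be sketched in Python. This is only an illustration under stated assumptions: the slide does not give the density-to-traversability transform, so a clamped linear mapping is used here, and the function name, the sensor range `max_range`, the saturation value `density_cap`, and the convention that region 0 starts at the goal direction are all hypothetical.

```python
import math

def traversability_vector(obstacles, k=4, max_range=5.0, density_cap=0.5):
    """Return k traversability values in [0, 1] (1.0 = fully clear).

    obstacles: (x, y) points in a robot-centred frame whose +x axis is
    assumed to point at the goal. The linear transform below is an
    assumption, not the transform used in the presentation.
    """
    region_angle = 2 * math.pi / k
    region_area = math.pi * max_range ** 2 / k
    counts = [0] * k
    for x, y in obstacles:
        dist = math.hypot(x, y)
        if dist == 0 or dist > max_range:
            continue  # ignore points at the origin or out of range
        angle = math.atan2(y, x) % (2 * math.pi)
        region = min(int(angle // region_angle), k - 1)
        counts[region] += 1
    vector = []
    for count in counts:
        density = count / region_area  # obstacle points per unit area
        vector.append(max(0.0, 1.0 - density / density_cap))
    return vector
```

An empty obstacle list yields a vector of all ones, and a region packed with obstacle points saturates at zero.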

Overview of non-learning CBR Module
Diagram components, with the data labeled on their inputs and outputs:
– Feature Identification: current environment -> spatial & temporal feature vectors
– Spatial Features Vector Matching (1st stage of Case Selection): all the cases in the library (from the Case Library) -> set of spatially matching cases
– Temporal Features Vector Matching (2nd stage of Case Selection): -> set of spatially and temporally matching cases
– Random Selection Process (3rd stage of Case Selection): -> best matching case
– Case switching Decision tree: -> best matching or currently used case
– Case Adaptation: -> case ready for application
– Case Application: -> case output parameters (behavioral assemblage parameters)

Making the CBR Module Learn
Diagram components, with the data labeled on their inputs and outputs:
– Feature Identification: current environment -> spatial & temporal feature vectors
– Spatial Features Vector Matching (1st stage of Case Selection): all the cases in the library (from the Case Library) -> set of spatially matching cases
– Temporal Features Vector Matching (2nd stage of Case Selection): -> set of spatially and temporally matching cases
– Random Selection Biased by Case Success and Spatial and Temporal Similarities: -> best matching case
– Old Case Performance Evaluation: last K cases -> last K cases with adjusted performance history
– Case switching Decision tree: best matching case -> best matching or currently used case
– New Case Creation (if necessary): -> new or existing best matching case
– Case Adaptation: -> case ready for application
– Case Application: -> case output parameters (behavioral assemblage parameters)

Extensive Exploration of Cases: Modified Case Selection Process
Random selection of cases, with the probability of selection proportional to:
– spatial similarity with the environment (1st step)
– temporal similarity with the environment (2nd step)
– weighted sum of the case’s past performance and its spatial and temporal similarities (3rd step)
Example (from the figure):
– 1st step, P(selection) by spatial similarity over C1 to C5: set of spatially matching cases {C1, C2, C4}
– 2nd step, P(selection) by temporal similarity over C1, C2, C4: set of spatially & temporally matching cases {C1, C4}
– 3rd step, P(selection) by weighted sum of spatial and temporal similarities and case success over C1, C4: best matching case C1
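A minimal Python sketch of a three-step biased random selection like the one above follows. The slide does not say how a random draw yields a set of matching cases, so this sketch assumes each case survives a filtering step with probability equal to its similarity divided by the best similarity, and that the final pick is a roulette-wheel draw. The `Case` fields, function names, and weight values are hypothetical.

```python
import random
from dataclasses import dataclass

@dataclass
class Case:
    name: str
    spatial_sim: float   # similarity to the current environment, in [0, 1]
    temporal_sim: float  # similarity to the current environment, in [0, 1]
    success: float       # case_success, assumed normalised to [0, 1]

def biased_subset(cases, score, rng):
    """Keep each case with probability score / best score (assumed rule).

    The top-scoring case always survives, so the result is never empty.
    """
    best = max(score(c) for c in cases)
    if best <= 0:
        return list(cases)
    return [c for c in cases if rng.random() < score(c) / best]

def roulette(cases, weight, rng):
    """Pick one case with probability proportional to its weight."""
    weights = [weight(c) for c in cases]
    if sum(weights) <= 0:
        return rng.choice(cases)
    return rng.choices(cases, weights=weights, k=1)[0]

def select_case(library, rng, w_spatial=0.3, w_temporal=0.3, w_success=0.4):
    spatial = biased_subset(library, lambda c: c.spatial_sim, rng)   # 1st step
    both = biased_subset(spatial, lambda c: c.temporal_sim, rng)     # 2nd step
    return roulette(                                                 # 3rd step
        both,
        lambda c: (w_spatial * c.spatial_sim
                   + w_temporal * c.temporal_sim
                   + w_success * c.success),
        rng,
    )
```

Passing a seeded `random.Random` instance keeps a run reproducible, which helps when comparing training runs.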

Positive and Negative Reinforcement: Case Performance Evaluation
Criterion for the evaluation of case performance: the average velocity with which the robot approaches its goal during the application of the case
– offers opportunities for intermediate case performance evaluations
– may not always be the right criterion: some cases exhibit no positive velocity towards the goal, so the evaluation of the performance is delayed by K (= 2) cases
– case_success (represents case performance) is increased if the average velocity is increased or sustained high, and decreased otherwise
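The case_success rule can be sketched as a small Python function. The slide gives only the direction of the change, so the step size, the threshold for a "high" velocity, and the clamping of case_success to [0, 1] are assumptions made for illustration. The K-case evaluation delay is not modeled here.

```python
def update_case_success(success, avg_velocity, prev_avg_velocity,
                        high_velocity=0.8, step=0.1):
    """Return the new case_success value.

    Increased when the average velocity towards the goal rose or stayed
    high; decreased otherwise. high_velocity and step are hypothetical
    tuning values, and velocities are assumed normalised to [0, 1].
    """
    improved = avg_velocity > prev_avg_velocity
    sustained_high = avg_velocity >= high_velocity
    if improved or sustained_high:
        return min(1.0, success + step)
    return max(0.0, success - step)
```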

Maximization of Reinforcement: Case Adaptation
Maximize case_success as a noisy function of the case output parameters (behavioral assemblage parameters)
– maintain the adaptation vector A(C) for each case C
– if the last series of adaptations resulted in an increase of case_success, then continue the adaptation:
O(C) = O(C) + A(C)
– otherwise switch the direction of the adaptation, add a random component R, and scale proportionally to case_success:
A(C) = -λ·A(C) + ν·R
O(C) = O(C) + A(C)
(the two coefficient symbols are missing from the transcript; λ and ν are placeholders for them)
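A short Python sketch of one adaptation step may make the update rule concrete. The transcript has lost the two scaling coefficients in the rule for A(C), so they appear here as the hypothetical arguments `lam` and `nu` supplied by the caller; how they are derived from case_success is not modeled. The random vector R is assumed to be uniform in [-1, 1] per component.

```python
import random

def adapt_case(output, adaptation, success_increased, lam, nu, rng):
    """Apply one adaptation step and return (new O(C), new A(C)).

    output:      O(C), the case output parameters (list of floats)
    adaptation:  A(C), the current adaptation vector (same length)
    lam, nu:     hypothetical stand-ins for the lost coefficients
    """
    if not success_increased:
        # Reverse the search direction and add a random component R.
        adaptation = [-lam * a + nu * rng.uniform(-1.0, 1.0)
                      for a in adaptation]
    output = [o + a for o, a in zip(output, adaptation)]
    return output, adaptation
```

With `success_increased=True` the vector A(C) is reused unchanged, so the search keeps moving in the direction that has been paying off.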

Maximization of Reinforcement: Case Adaptation (cont’d)
Incorporate prior knowledge into the search:
– fixed adaptation of the Noise_Gain and Noise_Persistence parameters based on the short- and long-term velocities of the robot
Constrain the search:
– limit Obstacle_Gain to be higher than the sum of the other schema gains (to avoid collisions)
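The Obstacle_Gain constraint could be enforced after each adaptation step roughly as follows. This is a sketch under assumptions: gains are held in a dictionary, `Obstacle_Gain` and `Noise_Gain` are names from the slide, while any other key and the `margin` value are hypothetical.

```python
def enforce_obstacle_gain(gains, margin=0.01):
    """Return a copy of gains with Obstacle_Gain raised, if needed,
    above the sum of all other schema gains (margin is an assumption)."""
    others = sum(v for k, v in gains.items() if k != "Obstacle_Gain")
    fixed = dict(gains)
    fixed["Obstacle_Gain"] = max(gains["Obstacle_Gain"], others + margin)
    return fixed
```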

The Growth of the Case Library: Case Creation Decision
To avoid divergence, a new case is created whenever:
– case_success of the selected case is high and spatial and temporal similarities with the environment are low to moderate
– case_success of the selected case is low to moderate and spatial and temporal similarities are low
Limit the maximum size of the library (10 in this work)
A new case is initialized with:
– the spatial and temporal features of the environment
– the output parameter values of the selected case
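The creation rule above can be written as a small predicate. The slide speaks only of "high", "moderate", and "low" values, so the numeric thresholds below are hypothetical, as is the choice to require that both similarities fall under the threshold. The library limit of 10 comes from the slide.

```python
def should_create_case(case_success, spatial_sim, temporal_sim, library_size,
                       max_size=10, high_success=0.7,
                       low_sim=0.3, moderate_sim=0.6):
    """Decide whether to add a new case to the library.

    All inputs are assumed normalised to [0, 1]; the threshold values
    are illustrative, not taken from the presentation.
    """
    if library_size >= max_size:
        return False  # library is full
    similarity = max(spatial_sim, temporal_sim)  # both must be below the bound
    if case_success >= high_success:
        return similarity <= moderate_sim  # similarities low to moderate
    return similarity <= low_sim           # similarities low
```

A successful case is thus protected from being adapted towards a moderately different environment, while a weak case is replaced only when the environment is clearly different.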

Experimental Analysis: Example
Learning CBR: first run (starting with an empty library)

Experimental Analysis: Example
Learning CBR: a run after 54 training runs on various environments
– a library of ten cases was learned
– 36 percent shorter travel distance
Figure annotations:
– a case of a “clear-to-goal” strategy is learned for such environments
– a case of a “squeezing” strategy is learned for such environments

Experiments: Statistical Results
Simulation results (after 250 training runs for the learning CBR system)
[Charts: average number of steps and mission completion rate in heterogeneous and homogeneous environments, comparing the learning CBR, CBR, and non-adaptive systems]

Real Robot Experiments: In Progress
RWI ATRV-Jr
Sensors:
– SICK laser scanners in front and back
– Compass
– Gyroscope
Experiments in progress, no statistical results yet

Conclusions
New and existing cases are learned and optimized during a training process or as part of mission execution
Performance:
– substantially better than that of a non-adaptive system
– comparable to that of a non-learning CBR system
Neither manual selection of behavioral parameters nor careful creation and optimization of a case library is required from the user
Future Work
– real robot experiments
– case “forgetting” component
– integration with other adaptation & learning methods (e.g., Learning Momentum, RL for Behavioral Assemblage Selection)