1/30 IM-CLeVeR: Intrinsically Motivated Cumulative Learning Versatile Robots Gianluca Baldassarre, Marco Mirolli, Francesco Mannella, Vincenzo Fiore, Stefano.


1 1/30 IM-CLeVeR: Intrinsically Motivated Cumulative Learning Versatile Robots Gianluca Baldassarre, Marco Mirolli, Francesco Mannella, Vincenzo Fiore, Stefano Zappacosta, Daniele Caligiore, Fabian Chersi, Vieri Santucci, Simona Bosco

2 2/30 Outline. IM-CLeVeR: Intrinsically Motivated Cumulative Learning Versatile Robots
- The figures of the project
- The project vision
- The 3 pillars of the project idea + 4 S/T objectives
- WP3: Experiments
- WP4: Abstraction
- WP5: Intrinsic motivations
- WP6: Hierarchical architectures
- WP7: Integration and demonstrators
- Conclusions

3 3/30 Outline. IM-CLeVeR: Intrinsically Motivated Cumulative Learning Versatile Robots. Integrated project. Call: Cognitive Systems, Interactions and Robotics. EU funds: 5.9 million euros. 7 (8) partners. Start: May 2009. End: April 2013.

4 4/30 Vision: the problem. How can we create truly intelligent robots? Versatile: have many goals; re-use skills. Robust: function in different conditions, with noise. Autonomous: learning is paramount. Weng, McClelland, Pentland, Sporns, Stockman, Sur, Thelen (Science, 2001): …knowledge-based systems (e.g. production systems)… …learning systems focussed on single tasks (e.g. RL)… …evolutionary systems… Important results, but limited autonomy and scalability… on the contrary… organisms do scale, are flexible, and are robust!

5 5/30 Vision: the idea Why are organisms so special? Looking at children…

6 6/30 Vision: the idea
Ingredients:
- Powerful abstractions: elephant on table leg, it slides down
- Explore
- Record interesting states
- Intrinsic motivations (interesting states, learning rates): motivate to reproduce states (goals); guide learning of skills
- Skills are re-used and composed: to explore; to produce new skills
Science: which brain and behavioural mechanisms are behind these processes?
Technology: can we reverse engineer them? Can we design algorithms with a similar power?

7 7/30 Vision: 2 promises. Science: we can understand organisms. Technology: we can develop a new methodology for designing robots… in particular… learn actions cumulatively… on the basis of intrinsic motivations… re-use them to build other actions… and achieve externally assigned goals with them.

8 8/30 Vision: how we will do it: 3 pillars + 4 S/T objectives
Pillars:
- WP4: Abstraction and attention
- WP5: Intrinsic motivations
- WP6: Hierarchical architectures to support cumulative learning
S/T objectives:
1. Empirical investigations: monkeys, children, adults, Parkinson patients
2. Computational bio-constrained models: mechanisms underlying brain and behaviour
3. Machine-learning models: powerful algorithms and architectures
4. Two robotic demonstrators: CLEVER-B, CLEVER-K
Diagram labels: Suitable representations; Focussing learning; Science; From Science to Technology; Technology; From Technology to Science

9 9/30 The project WPs [Diagram: project structure as on slide 8 (3 pillars, 4 S/T objectives), labelled with WP3, WP4, WP5, WP6, WP7]

10 10/30 WP3: Experiments and mechatronic board [Diagram: project structure as on slide 8, labelled WP3]

11 11/30 WP3: Joystick experiment background. USFD (Peter Redgrave & Kevin Gurney). Actions, novel outcomes, dopamine, BG learning. Redgrave & Gurney, 2006, Nature Rev. Neuroscience

12 12/30 WP3: Empirical Experiments: Joystick experiment. Method: adult humans and Parkinsonian patients; joystick manoeuvring (gesture, location, timing) of a cursor on a screen to obtain reinforcement or a salient event. For studying: actions, novel outcomes, dopamine, BG learning.

13 13/30 WP3: Empirical Experiments: Board experiment
- UCBM-LBRB (Eugenio Guglielmelli): mechatronic board, intelligent sensors
- UCBM-LDN (Flavio Keller): children
- CNR-ISTC-UCP (Elisabetta Visalberghi): monkeys
Goals: (a) investigating properties of stimuli causing intrinsic motivations; (b) acquisition of skills based on intrinsic motivations
[Figure labels: Inertial/magnetic unit + battery + wireless; Tactile sensors]
Sabbatini, Stammati, Tavares, Visalberghi, 2007, Amer. J. Primatology; Campolo, Taffoni, Schiavone, Formica, Guglielmelli, Keller, 2009, Int. J. Social Robotics

14 14/30 WP4: Abstraction [Diagram: project structure as on slide 8, labelled WP4]

15 15/30 WP4 Abstraction: motor, perception, attention, vergence. Abstraction is a key ingredient for intrinsic motivations and hierarchical actions. Motor: key in hierarchies. Perceptual: key in intrinsic motivations, e.g. retina images would always be novel without abstraction. Attention/vergence: two key forms of abstraction.

16 16/30 WP4 Intrinsic motivations for developing vergence and perceptual abstraction. FIAS (Jochen Triesch). E.g.: reward when target fixated with both eyes drives development of vergence. Similar mechanisms to develop perceptual abstraction. Weber & Triesch, 2009, IJCNN; Franz & Triesch, 2007, ICDL

17 17/30 WP5: Novelty detection [Diagram: project structure as on slide 8, labelled WP5]

18 18/30 WP5 Intrinsic (extrinsic) motivations
Extrinsic motivations (e.g. food, sex, money):
- Psychology (Berlyne, White, Deci & Ryan): motivate actions to achieve specific goals
- Drive actions whose effects directly increase fitness
- Return together with the homeostatic needs they are associated with
Intrinsic motivations (skill/knowledge acquisition):
- Psychology: motivate actions for their own sake
- Drive actions whose effects are an increase in: (a) knowledge or prediction ability; (b) competence to do
- Stop driving actions when knowledge/competence is acquired
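The last point on this slide (intrinsic motivations stop driving actions once knowledge is acquired) can be illustrated with a minimal sketch. It is not taken from the presentation: it assumes a single scalar predictor trained with a simple delta rule, and it uses the absolute prediction error as the intrinsic reward. The function name `intrinsic_rewards` and the `learning_rate` parameter are hypothetical.

```python
def intrinsic_rewards(outcomes, learning_rate=0.5):
    """Return one intrinsic reward (absolute prediction error) per outcome.

    Assumption for illustration: the agent predicts a scalar outcome and
    corrects its prediction by a fixed fraction of the error each time.
    """
    prediction = 0.0
    rewards = []
    for outcome in outcomes:
        error = outcome - prediction
        rewards.append(abs(error))          # reward is high while the outcome is surprising
        prediction += learning_rate * error  # learning shrinks future surprise
    return rewards


# The same outcome observed ten times: the reward fades as it becomes predictable.
rewards = intrinsic_rewards([1.0] * 10)
```

With a perfectly repeatable outcome the reward halves at every step, so the motivation to repeat the action fades on its own, unlike an extrinsic reward that returns with the associated need.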

19 19/30 WP5 Intrinsic motivations. CNR-LOCEN (Gianluca Baldassarre, Marco Mirolli)
- Young robot: low level of hierarchy develops skills based on evolved reinforcers (knowledge-based intrinsic motivations)
- Young robot: high level of hierarchy selects skills which produce the highest surprise (competence-based intrinsic motivations)
- Adult robot: high level of hierarchy performs skill composition to achieve salient goals (external rewards, fitness measure)
[Figure panels: Adult robot tasks; Child robot task; Young robot: results (before learning, after learning); Adult robot: results]
Schembri, Mirolli, Baldassarre, 2007, ICDL, ECAL, EPIROB

20 20/30 WP5 Novelty detection with habituable neural networks. UU (Ulrich Nehmzow). Task: find novel elements in world. Image pre-processing (abstraction). Habituable neural network. From Marsland et al., 2005 (J. Rob. Aut. Sys.). Neto & Nehmzow, 2007, Rob. & Aut. Syst.
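As a rough illustration of habituation-based novelty detection, the sketch below stores a prototype for each unfamiliar input and lets the response to a stored prototype decay each time it is matched. This is a deliberately simplified stand-in, not the network of the cited papers: the class name, the `match_radius` and `decay` parameters, and the matching rule are all assumptions.

```python
import math


class HabituatingNoveltyFilter:
    """Toy novelty filter: responses to familiar inputs decay with repetition."""

    def __init__(self, match_radius=0.5, decay=0.5):
        self.match_radius = match_radius  # hypothetical similarity threshold
        self.decay = decay                # hypothetical habituation factor
        self.prototypes = []              # entries are [feature_vector, remaining_response]

    def novelty(self, features):
        # Reuse the first stored prototype that lies within the match radius.
        for entry in self.prototypes:
            if math.dist(entry[0], features) <= self.match_radius:
                response = entry[1]
                entry[1] *= self.decay    # habituate: weaker response next time
                return response
        # Nothing similar stored: fully novel, remember it for next time.
        self.prototypes.append([tuple(features), self.decay])
        return 1.0


novelty_filter = HabituatingNoveltyFilter()
familiar = [novelty_filter.novelty((0.0, 0.0)) for _ in range(3)]
surprise = novelty_filter.novelty((5.0, 5.0))
```

Here `features` stands for the output of the image pre-processing (abstraction) stage; a repeated input yields a shrinking response, while a previously unseen input yields the maximum response.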

21 21/30 WP5 Intrinsic motivations based on information theory. IDSIA (Juergen Schmidhuber). Theoretic ML, robotics, information-theoretic intrinsic motivations. Data compression improvement = intrinsic motivation. Schmidhuber, 2009, Journal of SICE
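The idea "data compression improvement = intrinsic motivation" can be sketched with a toy compressor. The example below is an illustration under strong assumptions, not the formulation of the cited paper: the "compressor" is a Laplace-smoothed symbol-frequency model, the code length is the ideal number of bits under that model, and the reward is the number of bits saved on the observed history by one learning step. All function names are hypothetical.

```python
import math
from collections import Counter


def code_length_bits(history, counts, alphabet_size):
    """Ideal code length of the history under a Laplace-smoothed frequency model."""
    total = sum(counts.values())
    bits = 0.0
    for symbol in history:
        probability = (counts[symbol] + 1) / (total + alphabet_size)
        bits -= math.log2(probability)
    return bits


def compression_progress(history, new_symbol, counts, alphabet_size):
    """Bits saved on the history (including new_symbol) by one model update."""
    seen = history + [new_symbol]
    before = code_length_bits(seen, counts, alphabet_size)
    counts[new_symbol] += 1  # the learning step: update the model
    after = code_length_bits(seen, counts, alphabet_size)
    return before - after


# A perfectly regular stream: progress is positive at first, then shrinks.
counts = Counter()
history = []
progress = []
for _ in range(50):
    progress.append(compression_progress(history, "a", counts, alphabet_size=2))
    history.append("a")
```

On a regular stream the reward shrinks towards zero once the regularity has been learned, so such a learner would have to seek new, still-compressible data to keep earning reward.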

22 22/30 WP6: Hierarchical architectures [Diagram: project structure as on slide 8, labelled WP6; objective 2 appears here as "Computational bio-mimetic models"]

23 23/30 WP6 Hierarchical architectures
Cumulative learning needs hierarchical architectures:
- To avoid catastrophic forgetting
- To find solutions by composing skills: dirty but fast solutions, then refine
- Because brain is hierarchical
- Because brain has a (soft) modularity at all levels
From Fuster, 2001, Neuron; McGovern, Sutton, Fagg

24 24/30 WP6 Intrinsic motivations, hierarchical RL (options). UMASS (Andrew Barto). Intrinsically Motivated Reinforcement Learning. HRL: options theory. Simsek & Barto, 2006, ICML; Singh, Barto & Chentanez, 2004, NIPS; Sutton et al., options theory
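In the options framework, an option is a temporally extended action made of an initiation set, an internal policy and a termination condition. The sketch below shows that structure on a toy corridor. It makes simplifying assumptions that are not from the slides: states are integers from 0 to 10, termination is deterministic rather than probabilistic, and the names `Option`, `run_option` and `corridor_step` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Option:
    can_start: Callable[[int], bool]    # initiation set, given as a predicate
    policy: Callable[[int], int]        # maps a state to a primitive action
    should_stop: Callable[[int], bool]  # termination condition (deterministic here)


def corridor_step(state: int, action: int) -> int:
    """Toy environment: move along a corridor with cells 0..10."""
    return max(0, min(10, state + action))


def run_option(option, state, step, max_steps=100):
    """Execute the option until it terminates; return (final_state, steps_taken)."""
    if not option.can_start(state):
        raise ValueError("option cannot be initiated in this state")
    steps = 0
    while steps < max_steps:
        state = step(state, option.policy(state))
        steps += 1
        if option.should_stop(state):
            break
    return state, steps


# A skill "walk right to the door at cell 5", usable from any cell left of the door.
go_to_door = Option(
    can_start=lambda s: s < 5,
    policy=lambda s: 1,
    should_stop=lambda s: s == 5,
)
final_state, steps_taken = run_option(go_to_door, 2, corridor_step)
```

A higher-level learner can then treat `go_to_door` as a single choice alongside primitive actions, which is what makes options a natural unit for skill re-use and composition.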

25 25/30 WP6 Bio-inspired / bio-constrained hierarchical reinforcement learning. CNR-LOCEN (Gianluca Baldassarre & Marco Mirolli). Piaget theory: actions support learning of other actions. Camera, dynamic arm, reaching tasks. Continuous state/action reinforcement learning. Hierarchical RL: segmentation, Piaget. Caligiore, Borghi, Parisi, Mirolli, Baldassarre, ongoing

26 26/30 WP6 Development of sensorimotor mappings in robots. AU (Mark Lee). Developmental psychology and robotics. Staged development of sensorimotor behaviour. LCAS: Lift Constraint, Act, and Saturate. Lee, Meng, Chao, 2007, Rob. & Auton. Sys.; Lee, Meng, Chao, 2007, Adaptive Behaviour

27 27/30 WP7: Integration [Diagram: project structure as on slide 8, labelled WP7; objective 2 appears here as "Computational bio-mimetic models"]

28 28/30 WP7 CLEVER-K: Kitchen scenario. Main responsible: IDSIA, UU. Leave a robot alone for a month or so… interacting with the environment on the basis of intrinsic motivations… and it will build up a repertoire of actions incrementally. Come back and assign it a goal (e.g. by reward)… and it will learn to accomplish it very quickly. 3 iCub robots from IIT (Giorgio Metta)

29 29/30 WP7 CLEVER-B: Board scenario Main responsible: AU, CNR-LOCEN

30 30/30 Conclusions: a timely project
- Timely research goals: intrinsic motivations, hierarchical architectures
- Within important trends: developmental robotics, computational system neuroscience, emotions/motivations
- In synergy with various events: EpiRob, ICDL, J. of Autonomous Mental Development
- In line with EU calls: Cognitive Systems, Interactions and Robotics
- First EU Integrated Project wholly focussed on these topics
www.im-clever.eu

