IM-CLeVeR: Intrinsically Motivated Cumulative Learning Versatile Robots Gianluca Baldassarre, Marco Mirolli, Francesco Mannella, Vincenzo Fiore, Stefano Zappacosta, Daniele Caligiore, Fabian Chersi, Vieri Santucci, Simona Bosco

Outline
- The figures of the project
- The project vision
- The 3 pillars of the project idea + 4 S/T objectives
- WP3: Experiments
- WP4: Abstraction
- WP5: Intrinsic motivations
- WP6: Hierarchical architectures
- WP7: Integration and demonstrators
- Conclusions

The figures of the project
- Integrated project
- Call: Cognitive Systems, Interactions and Robotics
- EU funds: 5.9 million euros
- 7 (8) partners
- Start: May 2009
- End: April 2013

Vision: the problem
How can we create “truly intelligent” robots?
- Versatile: have many goals; re-use skills
- Robust: function in different conditions, with noise
- Autonomous: learning is paramount
Weng, McClelland, Pentland, Sporns, Stockman, Sur, Thelen (Science, 2001)
- …knowledge-based systems (e.g. production systems)…
- …learning systems focussed on single tasks (e.g. RL)…
- …evolutionary systems…
Important results, but limited autonomy and scalability.
On the contrary, organisms do scale, are flexible, and are robust!

Vision: the idea Why are organisms so special? Looking at children…

Vision: the idea
Ingredients:
- Powerful abstractions: “elephant on table leg”, “it slides down”
- Explore
- Record interesting states
- Intrinsic motivations (interesting states, learning rates):
  - motivate to reproduce states (goals)
  - guide learning of skills
- Skills are re-used and composed:
  - to explore
  - to produce new skills
Science: which brain and behavioural mechanisms are behind these processes?
Technology: can we reverse engineer them? Can we design algorithms with a similar power?

Vision: 2 promises
Science: we can understand organisms
Technology: we can develop a new methodology for designing robots…
… in particular …
- Learn actions cumulatively …
- …re-use them to build other actions…
- …on the basis of intrinsic motivations…
- …and achieve externally assigned goals with them.

Vision: how we will do it: 3 pillars + 4 S/T objectives
[Figure: project overview diagram linking Science and Technology, with arrows "From Science to Technology" and "From Technology to Science"]
4 S/T objectives:
1. Empirical investigations: monkeys, children, adults, Parkinson patients
2. Computational bio-constrained models: mechanisms underlying brain and behaviour
3. Machine-learning models: powerful algorithms and architectures
4. Two robotic demonstrators: CLEVER-B, CLEVER-K
3 pillars:
- WP4: Abstraction and attention (suitable representations)
- WP5: Intrinsic motivations (focussing learning)
- WP6: Hierarchical architectures to support cumulative learning

The project WPs
[Figure: project overview diagram (3 pillars and 4 S/T objectives) with WP3, WP4, WP5, WP6 and WP7 marked on it]

WP3: Experiments and mechatronic board
[Figure: project overview diagram (3 pillars and 4 S/T objectives) with WP3 highlighted]

WP3: “Joystick experiment” background
USFD (Peter Redgrave & Kevin Gurney)
Actions → novel outcomes → dopamine → BG learning
Redgrave & Gurney, 2006, Nature Rev. Neuroscience

WP3: Empirical Experiments: “Joystick experiment”
Method:
- Adult humans and Parkinsonian patients
- Joystick manoeuvring (gesture, location, timing) of a cursor on a screen to obtain reinforcement or a salient event
For studying: Actions → novel outcomes → dopamine → BG learning

WP3: Empirical Experiments: “Board experiment”
- UCBM-LBRB (Eugenio Guglielmelli): mechatronic board, intelligent sensors
- UCBM-LDN (Flavio Keller): children
- CNR-ISTC-UCP (Elisabetta Visalberghi): monkeys
Goals: (a) investigating properties of stimuli causing intrinsic motivations; (b) acquisition of skills based on intrinsic motivations
[Figure labels: tactile sensors; inertial/magnetic unit + battery + wireless]
Sabbatini, Stammati, Tavares, Visalberghi, 2007, Amer. J. Primatology
Campolo, Taffoni, Schiavone, Formica, Guglielmelli, Keller, 2009, Int. J. Social Robotics

WP4: Abstraction
[Figure: project overview diagram (3 pillars and 4 S/T objectives) with WP4 highlighted]

WP4 Abstraction: motor, perception, attention, vergence
Abstraction is a key ingredient for intrinsic motivations and hierarchical actions:
- Motor: key in hierarchies
- Perceptual: key in intrinsic motivations: e.g., retina images would always be novel without abstraction
- Attention/vergence: two key forms of abstraction
4.1: based both on unsupervised learning (e.g. guided by optimisation of maximum entropy, independence, and sparseness) and learning guided by the system’s goals (e.g. guided by optimisation of action potentialities); add top-down guidance to the visual bars task; split (McCallum) & subdivide (Doya) vs. state aggregation (e.g. Singh, Jaakkola, Jordan 1995)
4.2: The main goal of the Task is to develop pre-processing algorithms capable of detecting changes in visual motion images, to be used as signal pre-processors in novelty detectors and hierarchical sensorimotor architectures.
4.3: We will study and model the interplay of action selection, action outcomes and feature learning, modelling the emergence of higher-level features and attention control. In relation to Tasks 4.5 and 6.4 we will furthermore implement predictive models for natural image sequences. Such models will capture the statistical regularities at the level of small image regions.

WP4 Intrinsic motivations for developing vergence and perceptual abstraction
FIAS (Jochen Triesch)
- E.g.: reward when target fixated with both eyes drives development of vergence
- Similar mechanisms to develop perceptual abstraction
Franz & Triesch, 2007, ICDL; Weber & Triesch, 2009, IJCNN

WP5: Novelty detection
[Figure: project overview diagram (3 pillars and 4 S/T objectives) with WP5 highlighted]

WP5 Intrinsic (extrinsic) motivations
Extrinsic motivations (e.g. food, sex, money):
- Psychology (Berlyne, White, Deci & Ryan): motivate actions to achieve specific goals
- Drive actions whose effects directly increase fitness
- Come back again with the homeostatic needs they are associated with
Intrinsic motivations (skill/knowledge acquisition):
- Psychology: motivate actions for their own sake
- Drive actions whose effects are an increase in: (a) knowledge or prediction ability; (b) competence to do
- Stop driving actions once knowledge/competence is acquired
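The last property of intrinsic motivations (the drive fades once the knowledge is acquired) can be illustrated with a minimal sketch. It is not the project's model: it assumes that the intrinsic reward is simply the absolute prediction error of a one-number predictor, and the names `knowledge_based_reward`, `simulate` and `learning_rate` are hypothetical.

```python
def knowledge_based_reward(predicted, observed):
    """Intrinsic reward taken as the absolute prediction error (an assumption)."""
    return abs(observed - predicted)


def simulate(outcome=1.0, learning_rate=0.5, steps=8):
    """Repeat the same action and record the intrinsic reward at each step."""
    prediction = 0.0  # the agent starts with no knowledge of the outcome
    rewards = []
    for _ in range(steps):
        rewards.append(knowledge_based_reward(prediction, outcome))
        # Delta-rule update: the predictor moves towards the observed outcome.
        prediction += learning_rate * (outcome - prediction)
    return rewards


rewards = simulate()
print(rewards)
```

The reward halves at every repetition, so an action whose outcome has become predictable stops being attractive, whereas an extrinsic reward such as food would return together with the need it satisfies.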

WP5 Intrinsic motivations
CNR-LOCEN (Gianluca Baldassarre, Marco Mirolli)
- Young robot: low level of hierarchy develops skills based on evolved ‘reinforcers’ (knowledge-based intrinsic motivations)
- Young robot: high level of hierarchy selects skills which produce the highest surprise (competence-based intrinsic motivations)
- Adult robot: high level of hierarchy performs skill composition to achieve salient goals (external rewards → fitness measure)
[Figures: child robot task; adult robot tasks; young robot results before and after learning; adult robot results]
Schembri, Mirolli, Baldassarre, 2007, ICDL, ECAL, EPIROB

WP5 Novelty detection with habituable neural networks
UU (Ulrich Nehmzow)
Task: find novel elements in world
- Image pre-processing (abstraction)
- Habituable neural network
[Figure: from Marsland et al. (J. Rob. Aut. Sys.)]
[Figure: task]
Neto & Nehmzow, 2007, Rob. & Aut. Syst.
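The idea of habituation-based novelty detection can be sketched as follows. This is a toy illustration under stated assumptions, not the network used in the cited work: the image pre-processing is assumed to have already produced a discrete category label, each category has a single habituable synapse, and the first-order habituation rule, its constants and the class name `HabituatingNoveltyFilter` are all assumptions.

```python
class HabituatingNoveltyFilter:
    """Toy novelty filter: one habituable synapse per abstract category."""

    def __init__(self, tau=3.0, alpha=1.05, y0=1.0):
        self.tau = tau      # time constant (assumed value)
        self.alpha = alpha  # recovery rate (assumed value)
        self.y0 = y0        # efficacy of a never-seen stimulus
        self.efficacy = {}  # category -> current synaptic efficacy

    def novelty(self, category, stimulus=1.0):
        """Return the current novelty of `category`, then habituate to it."""
        y = self.efficacy.get(category, self.y0)
        # One Euler step of: tau * dy/dt = alpha * (y0 - y) - stimulus
        dy = (self.alpha * (self.y0 - y) - stimulus) / self.tau
        self.efficacy[category] = max(0.0, y + dy)
        return y


novelty_filter = HabituatingNoveltyFilter()
seen_often = [novelty_filter.novelty("red wall") for _ in range(5)]
seen_first = novelty_filter.novelty("blue box")
print(seen_often, seen_first)
```

Repeated presentations of the same category yield a shrinking novelty value, while a category never seen before still returns the full value. In this simplified version only the presented category is updated, so synapses of absent stimuli do not recover over time.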

WP5 Intrinsic motivations based on information theory
IDSIA (Juergen Schmidhuber)
Theoretic ML, robotics, information-theory intrinsic motivations
‘Data compression improvement’ = intrinsic motivation
Schmidhuber, 2009, Journal of SICE
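A rough sketch of the ‘compression improvement’ idea, under strong simplifying assumptions: the "compressor" is a single running estimate, and the summed squared error over the history stands in for the code length. Both choices, and the names `coding_cost` and `compression_progress_rewards`, are illustrative assumptions rather than the formulation in the cited paper.

```python
def coding_cost(history, estimate):
    """Proxy for code length: summed squared error over the history (assumed)."""
    return sum((x - estimate) ** 2 for x in history)


def compression_progress_rewards(observations, learning_rate=0.5):
    """Reward each step by how much the model update shortened the 'code'."""
    estimate = 0.0
    history = []
    rewards = []
    for x in observations:
        history.append(x)
        cost_before = coding_cost(history, estimate)
        estimate += learning_rate * (x - estimate)  # the model learns
        cost_after = coding_cost(history, estimate)
        rewards.append(cost_before - cost_after)    # improvement, not error
    return rewards


rewards = compression_progress_rewards([2.0] * 6)
print(rewards)
```

The reward is the improvement of the model on the data seen so far, so it vanishes both for data that are already well compressed and, in a richer setting, for data the model cannot learn to compress.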

WP6: Hierarchical architectures
[Figure: project overview diagram (3 pillars and 4 S/T objectives) with WP6 highlighted]

WP6 Hierarchical architectures
Cumulative learning needs hierarchical architectures:
- To avoid catastrophic forgetting
- To find solutions by ‘composing skills’: dirty but fast solutions, then refine
- Because brain is hierarchical
- Because brain has a (soft) modularity at all levels
[Figure: from Fuster, 2001, Neuron]
McGovern, Sutton, Fagg

WP6 Intrinsic motivations, hierarchical RL (options)
UMASS (Andrew Barto)
- Intrinsically Motivated Reinforcement Learning
- HRL: options theory
Sutton et al., Option theory
Simsek & Barto, 2006, ICML; Singh, Barto, Chentanez, 2004, NIPS
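In options theory a skill is packaged as an option: an initiation set, an internal policy and a termination condition. The sketch below shows that structure on a toy one-dimensional corridor. The corridor, the `go_to_door` option and the deterministic termination test are illustrative assumptions (in the general framework termination is a probability for each state).

```python
from dataclasses import dataclass
from typing import Callable, Set


@dataclass
class Option:
    initiation_set: Set[int]            # states where the option may start
    policy: Callable[[int], int]        # state -> primitive action
    termination: Callable[[int], bool]  # deterministic stand-in for beta(s)


def run_option(option, state, step, max_steps=100):
    """Follow the option's policy until its termination condition holds."""
    if state not in option.initiation_set:
        raise ValueError("option not available in this state")
    trajectory = [state]
    for _ in range(max_steps):
        if option.termination(state):
            break
        state = step(state, option.policy(state))
        trajectory.append(state)
    return trajectory


def corridor_step(state, action):
    """Toy environment: positions 0..9, actions -1 or +1."""
    return min(max(state + action, 0), 9)


go_to_door = Option(
    initiation_set=set(range(0, 5)),
    policy=lambda s: +1,
    termination=lambda s: s == 5,
)

trajectory = run_option(go_to_door, 2, corridor_step)
print(trajectory)  # [2, 3, 4, 5]
```

A higher level of the hierarchy can then choose among such options as if they were single actions, which is what makes skill re-use and composition possible.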

WP6 Bio-inspired / bio-constrained hierarchical reinforcement learning
CNR-LOCEN (Gianluca Baldassarre & Marco Mirolli)
- Piaget theory: actions support learning of other actions
- Camera, dynamic arm, reaching tasks
- Continuous state/action reinforcement learning
- Hierarchical RL: segmentation, Piaget
Caligiore, Borghi, Parisi, Mirolli, Baldassarre, ongoing

WP6 Development of sensorimotor mappings in robots
AU (Mark Lee)
- Developmental psychology and robotics
- Staged development of sensorimotor behaviour
- LCAS – Lift Constraint, Act, and Saturate
Lee, Meng, Chao, 2007, Adaptive Behaviour; Lee, Meng, Chao, 2007, Rob. & Auton. Sys.

WP7: Integration
[Figure: project overview diagram (3 pillars and 4 S/T objectives) with WP7 highlighted]

WP7 CLEVER-K: Kitchen scenario
3 iCub robots from IIT (Giorgio Metta)
Leave a robot alone for a month or so…
…interacting with the environment:
…it will build up a repertoire of actions incrementally…
…on the basis of intrinsic motivations.
Come back and assign it a goal (e.g. by reward)…
…and it will learn to accomplish it very quickly.
Main responsible: IDSIA, UU

WP7 CLEVER-B: Board scenario
Main responsible: AU, CNR-LOCEN

Conclusions: A timely project
- Timely research goals: intrinsic motivations, hierarchical architectures
- Within important trends: developmental robotics, computational system neuroscience, emotions/motivations
- In synergy with various events: EpiRob, ICDL, J. of Autonomous Mental Development
- In line with EU calls: “Cognitive Systems, Interactions and Robotics”
- First EU Integrated Project wholly focussed on these topics
