Action and active inference: A free-energy formulation

Abstract: This presentation questions the need for reinforcement learning and related paradigms from machine learning when trying to optimise the behaviour of an agent. We show that it is fairly simple to teach an agent complicated and adaptive behaviours under a free-energy principle. This principle suggests that agents adjust their internal states and their sampling of the environment to minimise their free energy. In this context, free energy represents a bound on the probability of being in a particular state, given the nature of the agent, or more specifically the model of the environment that the agent entails. We show that such agents learn causal structure in the environment and sample it in an adaptive and self-supervised fashion. The resulting policy reproduces exactly the policies that are optimised by reinforcement learning and dynamic programming. Critically, at no point do we need to invoke the notions of reward, value or utility. We illustrate these points by solving a benchmark problem in dynamic programming, namely the mountain-car problem, using only the free-energy principle. The ensuing proof of concept is important because the free-energy formulation also provides a principled account of perceptual inference in the brain and furnishes a unified framework for action and perception.
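
As a hedged restatement of the bound invoked in the abstract (notation assumed here, following the standard free-energy literature: sensory states $y$, environmental causes $\vartheta$, model $m$, recognition density $q$), free energy upper-bounds surprise because it adds a non-negative divergence to it:

$$
F \;=\; \underbrace{-\ln p(y \mid m)}_{\text{surprise}} \;+\; \underbrace{D_{\mathrm{KL}}\!\left[\,q(\vartheta)\,\middle\|\,p(\vartheta \mid y, m)\,\right]}_{\geq\, 0} \;\;\geq\;\; -\ln p(y \mid m)
$$

Minimising $F$ therefore implicitly minimises surprise, without ever evaluating the intractable marginal $p(y \mid m)$ directly.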

The free-energy principle. [Figure: two schemes juxtaposed. Perception, memory and attention (Bayesian brain): causes, prediction error and sensory input. Action and value learning (optimal control): stimulus-response (S-R) and stimulus-stimulus (S-S) associations among sensory input S, response R, value Q, conditioned stimulus (CS), reward (US) and action.]

Overview: the free-energy principle and action; active inference and prediction error; orientation and stabilization; intentional movements; cued movements; goal-directed movements; autonomous movements; forward and inverse models.

Exchange with the environment: an agent m and its environment are separated by a Markov blanket. [Figure: internal states of the agent and external states of the environment, coupled through sensation and action.]
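
A hedged formal gloss (not spelled out on the slide; the symbols are assumptions here): writing external states as $\eta$, internal states as $\mu$ and the blanket states (sensation and action) as $b$, the Markov blanket renders the two sides conditionally independent given the blanket:

$$
p(\eta, \mu \mid b) \;=\; p(\eta \mid b)\, p(\mu \mid b)
$$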

The free-energy principle: action minimises a bound on surprise, while perception optimises the bound itself. Perception comprises perceptual inference, perceptual learning and the encoding of perceptual uncertainty (the conditional density), operating at a separation of timescales.
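
In the standard formulation (a sketch, using the symbols introduced above), perception and action perform gradient descent on the same quantity: perception optimises the bound by changing internal states $\mu$, while action $a$ changes the sensations on which the bound is evaluated:

$$
\dot{\mu} \;\propto\; -\frac{\partial F}{\partial \mu},
\qquad
\dot{a} \;\propto\; -\frac{\partial F}{\partial a} \;=\; -\frac{\partial F}{\partial y}\,\frac{\partial y}{\partial a}
$$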

Overview: the free-energy principle and action; active inference and prediction error; orientation and stabilization; intentional movements; cued movements; goal-directed movements; autonomous movements; forward and inverse models.

Active inference: closing the loop (synergy). [Figure: hierarchical model with top-down messages (predictions) and bottom-up messages (prediction errors); action closes the loop at the sensory level.]
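
A hedged sketch of the hierarchical generative model behind this message passing (following the form used in Friston's free-energy papers; the specific functions $g^{(i)}, f^{(i)}$ and noise terms $z^{(i)}, w^{(i)}$ are assumptions here): each level predicts the level below, and the residuals are the bottom-up prediction errors.

$$
\begin{aligned}
y &= g^{(1)}\!\big(x^{(1)}, v^{(1)}\big) + z^{(1)} \\
\dot{x}^{(i)} &= f^{(i)}\!\big(x^{(i)}, v^{(i)}\big) + w^{(i)} \\
v^{(i-1)} &= g^{(i)}\!\big(x^{(i)}, v^{(i)}\big) + z^{(i)}
\end{aligned}
$$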

Action and perception: action needs access to sensory-level prediction error.
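
A minimal sketch of this loop in code (all specifics are illustrative assumptions, not the talk's simulations): one hidden state with prior expectation `prior`, an identity sensory mapping s = x, and an action that moves the true state. Perception and action both descend the same free energy F = pi_s/2*(s - mu)^2 + pi_p/2*(mu - prior)^2, and action only ever sees the sensory-level error.

```python
# Minimal active-inference loop (hedged, illustrative parameters).
def simulate(steps=400, dt=0.05, prior=1.0, pi_s=8.0, pi_p=1.0, decay=0.5):
    x, a, mu = 0.0, 0.0, 0.0            # true state, action, expectation
    for _ in range(steps):
        s = x                            # sensory sample (identity mapping)
        eps_s = s - mu                   # sensory prediction error
        eps_p = mu - prior               # prior prediction error
        mu += dt * (pi_s * eps_s - pi_p * eps_p)   # perception: -dF/dmu
        # Action descends F through the sensory error it can change;
        # ds/da = 1 is assumed known to the "reflex arc", and `decay`
        # is a small stability convenience (an implementation choice).
        a += dt * (-pi_s * eps_s - decay * a)
        x += dt * a                      # environment integrates action
    return x, mu

x, mu = simulate()
print(f"true state {x:.2f}, expectation {mu:.2f}  (both drawn to the prior 1.0)")
```

The point the sketch illustrates: the agent never receives a reward; action simply makes the world conform to what the agent expects.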

From reflexes to action. [Figure: spinal schematic; predictions and sensory afferents meet at the dorsal horn via the dorsal root, and action is emitted from the ventral horn via the ventral root.]

Overview: the free-energy principle and action; active inference and prediction error; orientation and stabilization; intentional movements; cued movements; goal-directed movements; autonomous movements; forward and inverse models.

Active inference under flat priors (movement with percept). [Figure: time courses of sensory prediction and error, hidden states (location), cause (perturbing force), and perturbation and action; inset: visual stimulus and sensory channels.]

Active inference under tight priors (no movement or percept). [Figure: time courses of sensory prediction and error, hidden states (location), cause (perturbing force), and perturbation and action.]

Retinal stabilisation or tracking induced by priors. [Figure: displacement of the visual stimulus over time under flat priors and under tight priors; action, with real and perceived perturbation.]

Overview: the free-energy principle and action; active inference and prediction error; orientation and stabilization; intentional movements; cued movements; goal-directed movements; autonomous movements; forward and inverse models.

Active inference under tight priors (movement and percept). [Figure: time courses of sensory prediction and error, hidden states (location), cause (prior), and perturbation and action; inset: proprioceptive input.]

Self-generated movements induced by priors: robust to perturbation and change in motor gain. [Figure: displacement trajectories over time, real and perceived; action and causes, with the perceived cause (prior) and an exogenous cause.]

Overview: the free-energy principle and action; active inference and prediction error; orientation and stabilization; intentional movements; cued movements; goal-directed movements; autonomous movements; forward and inverse models.

Cued movements and sensorimotor integration: from reflexes to action. [Figure: jointed arm.]

Cued reaching with noisy proprioception. [Figure: reaching trajectory.]

Bayes-optimal integration of sensory modalities. [Figure: position estimates under noisy proprioception and noisy vision, with their conditional precisions.]
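
The integration the slide refers to is classical precision-weighted cue combination. A hedged sketch (the numbers below are illustrative, not the talk's): precisions add, and the posterior mean is the precision-weighted average of the two modalities.

```python
# Bayes-optimal fusion of two noisy position estimates (illustrative).
def fuse(mu_prop, pi_prop, mu_vis, pi_vis):
    """Combine proprioceptive and visual estimates; pi = precision (1/variance)."""
    pi_post = pi_prop + pi_vis                          # precisions add
    mu_post = (pi_prop * mu_prop + pi_vis * mu_vis) / pi_post
    return mu_post, pi_post

# Example: vision is more precise, so the fused estimate leans toward it.
mu, pi = fuse(mu_prop=0.40, pi_prop=2.0, mu_vis=0.55, pi_vis=8.0)
print(f"posterior mean {mu:.3f}, posterior precision {pi:.1f}")  # 0.520, 10.0
```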

Overview: the free-energy principle and action; active inference and prediction error; orientation and stabilization; intentional movements; cued movements; goal-directed movements; autonomous movements; forward and inverse models.

The mountain-car problem. [Figure: height of the landscape and forces as functions of position; the equations of motion and their null-clines in position-velocity space; the desired location at the top of the hill.]
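
A hedged sketch of the mountain-car kinetics (the talk's exact landscape and parameters are not reproduced here; this smooth two-piece height function is an assumption). The engine force is bounded, so the car cannot drive straight up the hill: it must first retreat to gain momentum, which is what makes the problem a benchmark for optimal control.

```python
import numpy as np

def height(x):
    """Assumed landscape H(x): a valley on the left, a hill to the right."""
    return np.where(x < 0, x**2 + x, x / np.sqrt(1.0 + 5.0 * x**2))

def slope(x):
    """dH/dx for the landscape above."""
    return np.where(x < 0, 2.0 * x + 1.0, (1.0 + 5.0 * x**2) ** -1.5)

def step(x, v, a, dt=0.05, friction=0.25, a_max=2.0):
    """One Euler step: gravity along the slope, friction, bounded action."""
    force = -slope(x) - friction * v + np.clip(a, -a_max, a_max)
    return x + dt * v, v + dt * force
```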

[Figure: flow and equilibrium density in position-velocity space, with nullclines, for the uncontrolled, controlled and expected dynamics.]

Learning in the controlled environment; active inference in the uncontrolled environment.

Goal-directed behaviour and trajectories. Using just the free-energy principle and a simple gradient ascent scheme, we have solved a benchmark problem in optimal control theory within a handful of learning trials. At no point did we use reinforcement learning or dynamic programming.

Action under perturbation. [Figure: time courses of prediction and error, hidden states, and perturbation and action; behaviour shown as position-velocity trajectories.]

Simulating Parkinson's disease?

Overview: the free-energy principle and action; active inference and prediction error; orientation and stabilization; intentional movements; cued movements; goal-directed movements; autonomous movements; forward and inverse models.

Learning autonomous behaviour. [Figure: trajectories and densities in position-velocity space, before and after learning, alongside the controlled dynamics.]

Autonomous behaviour under random perturbations. [Figure: time courses of prediction and error, hidden states (position and velocity, after learning), and perturbation and action.]

Overview: the free-energy principle and action; active inference and prediction error; orientation and stabilization; intentional movements; cued movements; goal-directed movements; autonomous movements; forward and inverse models.

Forward and inverse models. [Figure: two schemes side by side. Free-energy formulation: desired and inferred states enter a forward (generative) model, whose sensory prediction error drives the motor command (action) on the environment. Forward-inverse formulation: an inverse model (control policy) maps desired and inferred states to a motor command, with a forward model fed by efference copy and corollary discharge carrying the sensory prediction error.]

Summary
The free energy can be minimised by action (through changing the states that generate sensory input) or by perception (through optimising the predictions of that input).
The only way action can suppress free energy is by reducing prediction error at the sensory level (speaking to a juxtaposition of motor and sensory systems).
Action fulfils expectations; this can manifest as explaining away prediction error by resampling sensory input (e.g., visual tracking), or as intentional movement fulfilling expectations furnished by empirical priors.
In an optimal control setting, a training environment can be constructed by minimising the cross-entropy between the ensemble density and some desired density; the resulting dynamics can be learnt and reproduced under active inference.
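
As a hedged formalisation of the last point (symbols assumed here: ensemble density $p(x \mid m)$ over the agent's states, induced by the controlled flow, and a desired density $p^{*}(x)$ encoding the goal), the training environment is chosen to minimise the cross-entropy

$$
C \;=\; -\int p(x \mid m)\,\ln p^{*}(x)\, dx
$$

so that the ensemble density concentrates on desired states; the agent then learns this flow as empirical priors and reproduces it under active inference.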