Presentation is loading. Please wait.

Presentation is loading. Please wait.

CDB Exploring Science and Society Seminar Thursday 19 November 2009 at 5.30pm Host: Prof Giorgio Gabella The Bayesian brain, surprise and free-energy.

Similar presentations


Presentation on theme: "CDB Exploring Science and Society Seminar Thursday 19 November 2009 at 5.30pm Host: Prof Giorgio Gabella The Bayesian brain, surprise and free-energy."— Presentation transcript:

1

2 CDB Exploring Science and Society Seminar Thursday 19 November 2009 at 5.30pm Host: Prof Giorgio Gabella The Bayesian brain, surprise and free-energy Abstract Value-learning and perceptual learning have been an important focus over the past decade, attracting the concerted attention of experimental psychologists, neurobiologists and the machine learning community. Despite some formal connections; e.g., the role of prediction error in optimizing some function of sensory states, both fields have developed their own rhetoric and postulates. In work, we show that perceptual learning is, literally, an integral part of value learning; in the sense that perception is necessary to integrate out dependencies on the inferred causes of sensory information. This enables the value of sensory trajectories to be optimized through action. Furthermore, we show that acting to optimize value and perception are two aspects of exactly the same principle; namely the minimization of a quantity (free energy) that bounds the probability of sensory input, given a particular agent or phenotype. This principle can be derived, in a straightforward way, from the very existence of agents, by considering the probabilistic behavior of an ensemble of agents belonging to the same class.

3 “Objects are always imagined as being present in the field of vision as would have to be there in order to produce the same impression on the nervous mechanism” - Hermann Ludwig Ferdinand von Helmholtz Thomas Bayes Geoffrey Hinton Richard Feynman From the Helmholtz machine and the Bayesian Brain to Action and self-organization Hermann Haken

4 Overview Ensemble dynamics Entropy and equilibria Free-energy and surprise The free-energy principle Action and perception Generative models Perception Birdsong and categorization Simulated lesions Action Active inference Reaching Policies Control and attractors The mountain-car problem

5 Particle density contours showing Kelvin-Helmholtz instability, forming beautiful breaking waves. In the self- sustained state of Kelvin-Helmholtz turbulence the particles are transported away from the mid-plane at the same rate as they fall, but the particle density is nevertheless very clumpy because of a clumping instability that is caused by the dependence of the particle velocity on the local solids-to-gas ratio (Johansen, Henning, & Klahr 2006) temperature pH falling transport Self-organization that minimises an ensemble density to ensure a limited repertoire of states are occupied (i.e., ensuring states have a random attracting set).

6 How can an active agent minimise its equilibrium entropy? This entropy is bounded by the entropy of sensory signals (under simplifying assumptions) Crucially, because the density on sensory signals is at equilibrium, it can be interpreted as the proportion of time each agent entertains them (the sojourn time). This ergodic argument means that entropy is the path integral of surprise experienced by a particular agent: This means agents minimise surprise at all times. But there is one small problem… Agents cannot access surprise; however, they can evaluate a free-energy bound on surprise, which is induced with a recognition density q :

7 Overview Ensemble dynamics Entropy and equilibria Free-energy and surprise The free-energy principle Action and perception Generative models Perception Birdsong Simulated lesions Action Active inference Reaching Polices Control and attractors The mountain-car problem

8 Action External states in the world Internal states of the agent (m) Sensations The free-energy principle Action to minimise a bound on surprisePerception to optimise the bound

9 The free-energy rests on expected Gibb’s energy and can be evaluated, given a generative model comprising a likelihood and prior: So what models might the brain use? The generative model

10 Processing hierarchy Backward (nonlinear) Forward (linear) lateral Ensemble dynamics Entropy and equilibria Free-energy and surprise The free-energy principle Action and perception Generative models Perception Birdsong Simulated lesions Action Active inference Reaching Polices Control and attractors The mountain-car problem

11 Hierarchical (deep) dynamic models

12 Structural priors Dynamical priors Likelihood and empirical priors Hierarchal form Gibb’s energy: a simple function of prediction error Prediction errors

13 Synaptic gain Synaptic activity Synaptic efficacy Activity-dependent plasticity Functional specialization Attentional gain Enabling of plasticity Attention and salience Perception and inferenceLearning and memory The recognition density and its sufficient statistics Mean-field approximation: Laplace approximation:

14 Backward predictions Forward prediction error Perception and message-passing Synaptic plasticitySynaptic gain

15 Overview Ensemble dynamics Entropy and equilibria Free-energy and surprise The free-energy principle Action and perception Generative models Perception Birdsong and categorization Simulated lesions Action Active inference Reaching Polices Control and attractors The mountain-car problem

16 Synthetic song-birds SyrinxVocal centre Time (sec) Frequency Sonogram 0.511.5

17 102030405060 -5 0 5 10 15 20 prediction and error time 102030405060 -5 0 5 10 15 20 hidden states time Backward predictions Forward prediction error 102030405060 -10 -5 0 5 10 15 20 Causal states time (bins) Recognition and message passing stimulus 0.20.40.60.8 2000 2500 3000 3500 4000 4500 5000 time (seconds)

18 Perceptual categorization Frequency (Hz) Song A 0.20.40.60.8 2000 3000 4000 5000 time (seconds) Song B 0.20.40.60.8 2000 3000 4000 5000 Song C 0.20.40.60.8 2000 3000 4000 5000 ABCABC time (seconds) 00.20.40.60.81 -20 -10 0 10 20 30 40 50

19 Generative models of birdsong: sequences of sequences Syrinx Neuronal hierarchy Time (sec) Frequency (KHz) sonogram 0.511.5 Kiebel et al

20 Frequency (Hz) percept 11.5 2000 2500 3000 3500 4000 4500 5000 Frequency (Hz) no structural priors 11.5 2000 2500 3000 3500 4000 4500 5000 time (seconds) Frequency (Hz) no dynamical priors 0.511.5 2000 2500 3000 3500 4000 4500 5000 0500100015002000 -40 -20 0 20 40 60 LFP (micro-volts) LFP 0500100015002000 -60 -40 -20 0 20 40 60 LFP (micro-volts) LFP 0500100015002000 -60 -40 -20 0 20 40 60 peristimulus time (ms) LFP (micro-volts) LFP Simulated lesion studies: a model for false inference in psychopathology?

21 Ensemble dynamics Entropy and equilibria Free-energy and surprise The free-energy principle Action and perception Generative models Perception Birdsong Simulated lesions Action Active inference Reaching Polices Control and attractors The mountain-car problem

22 prediction From reflexes to action action dorsal root ventral horn True dynamics Generative model

23 From reflexes to action Jointed arm Movement trajectory Descending sensory prediction error visual input proprioceptive input

24 Overview Ensemble dynamics Entropy and equilibria Free-energy and surprise The free-energy principle Action and perception Generative models Perception Birdsong Simulated lesions Action Active inference Reaching Polices Control and attractors The mountain-car problem

25 Energies sensory prediction error sensory surprise surprise free-energy complexity How do policies minimise entropy? perceptual divergence Path integrals sensory entropy entropy Under ergodic assumptions free-action perceptionpolicy (model)action

26 Richard Bellman Cost-functions, value and optimal control (polices that lead to sparse distal goals) Using the Helmholtz decomposition flow ( i.e., policy) can be expressed in terms of scalar and vector potentials Where value V is proportional to negative surprise and can be defined as expected (negative) cost This means the cost-function is defined by the equilibrium density but not vice versa; this is the problem addressed by dynamic programming and reinforcement learning.

27 Cost-functions and attracting sets (polices with attractors) At equilibrium we have: This means maxima of the equilibrium density must have negative divergence. We can exploit this to ensure maxima lie in A, using a Langevin-based policy; where cost plays the role of dissipation Adriaan Fokker Max Planck

28 equations of motion Exploitation exploration Exploration and exploitation under Langevin dynamics

29 True equations of motion -2012 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 position height The mountain car problem positionsatiety The cost-function Policy (expected equations of motion) The environment

30 20406080100120 -0.5 0 0.5 prediction and error time 20406080100120 -0.5 0 0.5 hidden states time -2012 -2 0 1 2 position velocity Trajectory of one trial -2012 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 position height leaned (after 16 trials) and true potential Learning the environment With no cost (i.e., Hamiltonian dynamics)

31 20406080100120 -5 0 5 10 15 20 25 30 conditional expectations 20406080100120 -3 -2 0 1 2 3 time action -2012 -2 0 1 2 velocity trajectories -2012 -30 -25 -20 -15 -10 -5 0 5 position force cost-function (priors) With cost (i.e., exploratory dynamics) Exploring & exploiting the environment

32 Using just the free-energy principle and a simple gradient ascent scheme, we have solved a benchmark problem in optimal control theory using a handful of learning trials. Note that we use reinforcement learning or dynamic programming. Adaptive policies and trajectories

33 2004006008001000120014001600 -4 -2 0 2 4 6 time action -2 0 2 0 2 0 2 4 6 8 position trajectories velocity satiety 2004006008001000120014001600 -4 -2 0 2 4 6 8 10 prediction and error time 2004006008001000120014001600 -5 0 5 10 15 20 25 expected hidden states time Self-organisation with (happiness) dynamics on cost policy (expected flow) true flow

34 Thank you And thanks to collaborators: Jean Daunizeau Stefan Kiebel James Kilner Klaas Stephan And colleagues: Peter Dayan Jörn Diedrichsen Paul Verschure Florentin Wörgötter


Download ppt "CDB Exploring Science and Society Seminar Thursday 19 November 2009 at 5.30pm Host: Prof Giorgio Gabella The Bayesian brain, surprise and free-energy."

Similar presentations


Ads by Google