The free-energy principle: a rough guide to the brain? Karl Friston


1 The free-energy principle: a rough guide to the brain? Karl Friston
Presented by: Gokrna Poudel

2 Guiding questions
Q1: Explain the following terms: KL divergence, entropy, ergodic, free energy, Bayesian surprise, generative model, recognition density, sufficient statistics.
Q2: Explain the free-energy principle of the brain, i.e. the fact that self-organizing biological agents resist a tendency to disorder and therefore minimize the entropy of their sensory states. Give the various forms of free energy.
Q3: How can action reduce free energy? How can perception reduce free energy? How can active sampling of the sensorium contribute to free-energy reduction?
Q4: Explain the neurobiological architecture for implementing the free-energy principle in Figure 1 in Box 1. Describe each of the modules in the figure and their functions, as well as the quantities that define the free energy.

3 Guiding questions (continued)
Q5: Describe the sufficient statistics representing a hierarchical dynamic model of the world in the brain in Figure 1 in Box 2. How are they related to each other? How are changes in synaptic activity, connectivity, and gain involved in perceptual inference, learning and attention?
Q6: Formulate and describe the neuronal architecture for the hierarchical dynamic model in Figure 1 in Box 3. How are the forward prediction errors computed? How are the backward predictions made? What are the sources of the forward and backward connections in terms of brain anatomy?
Q7: A key implementational issue is how the brain encodes the recognition density. There are two forms of probabilistic neuronal codes: free forms and fixed forms. Give examples of each form and explain them.
Q8: What kinds of optimization schemes does the brain use? Does it use a deterministic search on free energy to optimize action and perception? Or does it use a stochastic search? What is your opinion?

4 KL (Kullback-Leibler) divergence
Also known as information divergence, information gain, or relative entropy, the KL divergence is a non-symmetric (non-commutative) measure of the difference between two probability distributions P and Q. KL(P||Q) measures the expected number of extra bits required to code samples from P when using a code based on Q rather than a code based on P. Typically P represents the "true" distribution of data, observations, or a precisely calculated theoretical distribution, while Q represents a theory, model, description, or approximation of P.
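As a concrete illustration (added here, not from the original slides), the discrete form D_KL(P||Q) = sum_x P(x) log[P(x)/Q(x)] can be computed directly; the distributions p and q below are invented for the example:

```python
import numpy as np

def kl_divergence(p, q):
    """Discrete KL divergence D_KL(P || Q) in nats.

    Assumes p and q are probability vectors over the same support
    and that q > 0 wherever p > 0.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0  # terms with p(x) = 0 contribute nothing
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

p = [0.5, 0.4, 0.1]  # "true" distribution P
q = [0.4, 0.4, 0.2]  # approximating distribution Q
print(kl_divergence(p, q))  # > 0, and != kl_divergence(q, p): non-symmetric
```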

5 Ergodic and entropy
Ergodic: a process is ergodic if its long-term time average converges to its ensemble average. Ergodic processes that evolve for a long time forget their initial states.
Entropy: the average surprise of outcomes sampled from a probability density. A density with low entropy means that, on average, the outcome is relatively predictable. The second law of thermodynamics states that the entropy of closed systems increases with time. Entropy is a measure of disorder or, more simply, of the number of ways the elements of a system can be rearranged. Here p ranges over sensory samples and their causes, and q is the recognition density over those causes.
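In symbols (standard definitions, added for reference rather than taken from the slide figures):

```latex
% Surprise (self-information) of a sensory sample y under a density p:
%   -\ln p(y)
% Entropy is the average surprise; for an ergodic process it also
% equals the long-term time average of surprise:
H[p] = -\int p(y)\,\ln p(y)\,dy
     = \lim_{T\to\infty}\frac{1}{T}\int_{0}^{T} -\ln p\big(y(t)\big)\,dt
```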

6 Sufficient statistics and related terms
Generative model: a forward model, i.e. a probabilistic mapping from causes to observed consequences (data). It is usually specified in terms of the likelihood of the data given their causes (the parameters of a model) and priors on those parameters.
Recognition density: an approximating conditional density, i.e. an approximate probability distribution of the causes of the data. It is the product of inference, or inverting a generative model.
Sufficient statistics: quantities that are sufficient to parameterize a probability density (e.g., the mean and covariance of a Gaussian density).
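A small numerical sketch of the Gaussian case named above (illustrative only):

```python
import numpy as np

# For a Gaussian density, the mean and covariance are sufficient
# statistics: they fully parameterize the density.
rng = np.random.default_rng(0)
samples = rng.multivariate_normal(mean=[1.0, -2.0],
                                  cov=[[1.0, 0.3], [0.3, 0.5]],
                                  size=10_000)

mu = samples.mean(axis=0)              # sufficient statistic 1: mean
sigma = np.cov(samples, rowvar=False)  # sufficient statistic 2: covariance

# (mu, sigma) is all a Gaussian recognition density needs to store; the
# 10,000 raw samples carry no further information about that density.
print(mu, sigma, sep="\n")
```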

7 Bayesian theory and Bayesian surprise
In Bayesian probability theory, one event is the hypothesis, H, and the other is the data, D, and we wish to judge the relative truth of the hypothesis given the data. According to Bayes' rule, we do this via the relation P(H|D) = P(D|H)P(H)/P(D).
Bayesian surprise: a measure of salience based on the divergence between the recognition and prior densities. It measures the information in the data that can be recognized.
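Written out explicitly (standard forms; the notation q for the recognition density and p for the prior over causes follows the paper's conventions):

```latex
% Bayes' rule for a hypothesis H and data D:
P(H \mid D) = \frac{P(D \mid H)\,P(H)}{P(D)}

% Bayesian surprise: the divergence between the recognition density q
% and the prior density p over the causes \vartheta:
\mathrm{BayesianSurprise} = D_{\mathrm{KL}}\big[\,q(\vartheta)\;\|\;p(\vartheta)\,\big]
```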

8 Free energy
The free-energy principle is an attempt to explain the structure and function of the brain, starting from the fact that we exist. Free energy is an information-theoretic quantity that bounds the evidence for a model of data: the free energy is greater than or equal to the negative log-evidence, or "surprise", in sensory data, given a model of how they were generated. The fact that we exist places constraints on our interactions with the world, which have been studied for years in evolutionary biology and systems theory. Recent advances in statistical physics and machine learning point to a simple scheme that enables biological systems to comply with these constraints.
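The bound can be written as follows (a standard variational identity; here y denotes sensory data, ϑ their causes, q the recognition density and m the model):

```latex
\begin{aligned}
  F &= \mathbb{E}_{q(\vartheta)}\!\big[\ln q(\vartheta) - \ln p(\tilde{y},\vartheta \mid m)\big] \\
    &= -\ln p(\tilde{y}\mid m)
       + D_{\mathrm{KL}}\big[q(\vartheta)\,\|\,p(\vartheta\mid \tilde{y})\big]
    \;\ge\; -\ln p(\tilde{y}\mid m),
\end{aligned}
```

since the KL term is non-negative, free energy upper-bounds surprise.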

9 Action, perception and the sensorium's contribution to free-energy reduction
We are open systems in exchange with the environment: the environment acts on us to produce sensory impressions, and we act on the environment to change its states. When we change the environment or our relationship to it, sensory input changes. Action can therefore reduce free energy by changing sensory input so that it matches what is predicted, whereas perception reduces free energy by changing the predictions themselves. We sample the world to ensure that our predictions become a self-fulfilling prophecy and that surprises are avoided. In this view, perception is enslaved by action to provide veridical predictions that guide active sampling of the sensorium. This exchange rests upon sensory and effector organs (such as photoreceptors and oculomotor muscles).
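A toy sketch of the two routes (entirely illustrative: one scalar cause, an identity prediction mapping, and free energy proxied by squared prediction error):

```python
import numpy as np

# Toy illustration (not from the paper): one hidden cause, one sensory
# channel. "Free energy" is proxied by the squared prediction error
# between the sensory input y and the prediction mu.
rng = np.random.default_rng(1)
world_state = 5.0            # true state of the environment
mu = 0.0                     # internal estimate (recognition density mean)
lr_perception, lr_action = 0.2, 0.1

for t in range(50):
    y = world_state + 0.05 * rng.standard_normal()  # sensory sample
    error = y - mu                                  # prediction error
    mu += lr_perception * error       # PERCEPTION: revise the prediction
    world_state -= lr_action * error  # ACTION: change the world toward
                                      # what is already predicted

# The two meet in the middle: prediction error (the proxy) collapses.
print(f"estimate mu = {mu:.2f}, world state = {world_state:.2f}")
```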

10 No. 4 Explain the neurobiological architecture for implementing the free-energy principle in Figure 1 in Box 1. Describe each of the modules in the figure and their functions as well as the quantities that define the free energy.

11 Neurobiological architecture for implementing the free-energy principle

12 Neurobiological architecture for implementing the free-energy principle
Upper panel: schematic detailing the quantities that define free energy. Lower panel: alternative expressions for the free energy that show what its minimization entails. Through action, free energy can only be suppressed by increasing the accuracy of sensory data (i.e. by selectively sampling data that are predicted by the representation).
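The two rearrangements behind this point are, in standard form (same notation as above):

```latex
\begin{aligned}
  F &= D_{\mathrm{KL}}\big[q(\vartheta)\,\|\,p(\vartheta \mid \tilde{y})\big]
       - \ln p(\tilde{y}\mid m)
       && \text{(divergence plus surprise: optimized by perception)} \\
  F &= D_{\mathrm{KL}}\big[q(\vartheta)\,\|\,p(\vartheta)\big]
       - \mathbb{E}_{q}\big[\ln p(\tilde{y}\mid \vartheta)\big]
       && \text{(complexity minus accuracy: action can only raise accuracy)}
\end{aligned}
```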

13 No. 5 Describe the sufficient statistics representing a hierarchical dynamic model of the world in the brain in Figure 1 in Box 2. How are they related with each other? How are the changes in synaptic activity, connectivity, and gain involved with perceptual inference, learning and attention?

14 Hierarchical dynamic model of the world in the brain

15 Hierarchical dynamic model of the world in the brain
The key architectural feature is the hierarchy. The recognition density is encoded in terms of its sufficient statistics. The figure shows three sorts of representations, pertaining to the states {x, v}, the parameters θ and the precisions λ of a hierarchical dynamic model; these are encoded by neural activity, synaptic connectivity and synaptic gain, respectively. Crucially, the optimization of any one representation depends on the others.

16 Hierarchical dynamic model of the world in the brain
The equations associated with this partition represent a gradient descent on free energy and correspond to:
(i) perceptual inference on states of the world (i.e. optimizing synaptic activity);
(ii) perceptual learning of the parameters underlying causal regularities (i.e. optimizing synaptic efficacy); and
(iii) attention, or optimizing the expected precision of states in the face of random fluctuations and uncertainty (i.e. optimizing synaptic gain).
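A toy numerical sketch of this threefold descent (illustrative assumptions throughout: one state, one parameter, one precision, and a quadratic free-energy proxy, none of which come from the paper):

```python
import numpy as np

# Toy sketch (not the paper's scheme): gradient descent on the proxy
# F = pi * e**2 / 2 - log(pi) / 2, where e = y - theta * x is the
# prediction error, x a state estimate (synaptic activity), theta a
# parameter (synaptic efficacy) and pi a precision (synaptic gain).
rng = np.random.default_rng(2)
x, theta, pi = 0.5, 0.5, 1.0   # initial estimates
lr = 0.02

for t in range(5000):
    y = 3.0 + 0.5 * rng.standard_normal()   # sensory samples, mean 3
    e = y - theta * x                       # prediction error
    x += lr * pi * e * theta                # (i) inference: activity
    theta += lr * pi * e * x                # (ii) learning: efficacy
    pi += lr * (1 / (2 * pi) - e**2 / 2)    # (iii) attention: gain

# The prediction theta * x settles near the mean of y (3.0), and the
# precision pi climbs toward roughly the inverse variance of the errors.
print(f"prediction = {x * theta:.2f}, precision = {pi:.2f}")
```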

17 No. 6 Formulate and describe the neuronal architecture for the hierarchical dynamic model in Figure 1 in Box 3. How are the forward prediction errors computed? How are the backward predictions made? What are the sources of the forward and backward connections in terms of brain anatomy?

18 Neuronal architecture for the hierarchical dynamic model

19 Neuronal architecture for the hierarchical dynamic model
Schematic detailing the neuronal architectures that might encode a density on the states of a hierarchical dynamic model. The figure shows the speculative cells of origin of forward driving connections, which convey prediction error from a lower area to a higher area, and of backward connections, which construct predictions. These predictions try to explain away prediction error in lower levels. In this scheme, the sources of forward and backward connections are superficial and deep pyramidal cells, respectively.
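A minimal two-level version of this message passing (a sketch only: linear mappings and unit precisions assumed):

```python
import numpy as np

# Sketch of hierarchical message passing (illustrative, linear, 2 levels).
# Forward connections carry prediction errors up; backward connections
# carry predictions down. W maps the level-2 cause to a level-1 prediction.
W = 0.8                   # backward (generative) mapping, level 2 -> 1
y = 1.0                   # sensory input arriving at the lowest level
mu1, mu2 = 0.0, 0.0       # representations at levels 1 and 2
lr = 0.1

for t in range(200):
    e0 = y - mu1          # sensory prediction error (forward signal)
    e1 = mu1 - W * mu2    # level-1 activity not yet explained by level 2
    mu1 += lr * (e0 - e1) # level 1: driven up by e0, explained away by e1
    mu2 += lr * W * e1    # level 2: driven by the forward error e1;
                          # its backward prediction W * mu2 quashes e1

# Both errors vanish: the backward prediction comes to match the input.
print(f"mu1 = {mu1:.3f}, backward prediction = {W * mu2:.3f}, y = {y}")
```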

20 No.7 A key implementational issue is how the brain encodes the recognition density. There are two forms of probabilistic neuronal codes: free forms and fixed forms. Give examples of each form and explain them.

21 Brain encoding recognition density
The free-energy principle induces a recognition density, which has to be represented by its sufficient statistics. It is therefore a given that the brain represents probability distributions over sensory causes.

22 Probabilistic neuronal codes
Neuronal codes come in free forms and fixed forms. Free form: particle filtering, in which the recognition density is represented by the sample density of neuronal ensembles whose activity encodes the location of particles in state-space.
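A generic bootstrap particle filter makes the idea concrete (a statistical sketch, not a neuronal model; the dynamics and noise levels are invented for the example):

```python
import numpy as np

# Minimal bootstrap particle filter. The recognition density over the
# hidden state is the sample density of the particles: a free-form code.
rng = np.random.default_rng(3)
n_particles = 500
x_true = 0.0
particles = rng.normal(0.0, 1.0, n_particles)

for t in range(30):
    x_true = 0.9 * x_true + rng.normal(0, 0.5)     # latent dynamics
    y = x_true + rng.normal(0, 0.3)                # noisy observation
    particles = 0.9 * particles + rng.normal(0, 0.5, n_particles)  # predict
    w = np.exp(-0.5 * ((y - particles) / 0.3) ** 2)  # likelihood weights
    w /= w.sum()
    particles = particles[rng.choice(n_particles, n_particles, p=w)]

print(f"true state {x_true:.2f}, posterior mean {particles.mean():.2f}")
```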

23 Probabilistic neuronal codes
Probabilistic population code: a scheme that represents stimuli using the joint activities of a number of neurons. Each neuron has a distribution of responses over some set of inputs, and the responses of many neurons can be combined to determine some value of the inputs.
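A sketch of such a code under common textbook assumptions (Gaussian tuning curves, Poisson spiking; all numbers invented):

```python
import numpy as np

# Decode a stimulus from a population with Gaussian tuning curves and
# Poisson spiking; the whole population jointly encodes a density.
rng = np.random.default_rng(4)
prefs = np.linspace(-10, 10, 41)   # preferred stimuli of 41 neurons
width, gain, stimulus = 2.0, 20.0, 3.0

rates = gain * np.exp(-0.5 * ((stimulus - prefs) / width) ** 2)
spikes = rng.poisson(rates)        # spike counts in one window

# Log-posterior over candidate stimuli s (flat prior): for each neuron,
# spikes * log f_i(s) - f_i(s), summed over the population.
s_grid = np.linspace(-10, 10, 401)
f = gain * np.exp(-0.5 * ((s_grid[:, None] - prefs[None, :]) / width) ** 2)
log_post = (spikes * np.log(f) - f).sum(axis=1)
print(f"decoded stimulus: {s_grid[np.argmax(log_post)]:.2f} (true {stimulus})")
```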

24 Probabilistic neuronal codes
Fixed form: multinomial or Gaussian. Multinomial forms assume the world is in one of several discrete states and are usually associated with hidden Markov models. The Gaussian or Laplace assumption allows for continuous and correlated states.

25 No 8 What kinds of optimization schemes does the brain use? Does it use deterministic search on free energy to optimize action and perception? Or, does it use stochastic search? What is your opinion?

26 Optimization schemes by the brain
According to the free-energy principle, the sufficient statistics representing the recognition density will change to minimize free energy. This provides a principled explanation for perception, memory and attention: it accounts for perceptual inference (optimization of synaptic activity to encode the states of the environment), perceptual learning and memory (optimization of synaptic connections that encode contingencies and causal regularities) and attention (neuromodulatory optimization of synaptic gain that encodes the precision of states).

27 Optimization schemes by the brain
The working assumption is that the brain uses a deterministic gradient descent on free energy to optimize action and perception. It might also use stochastic searches, sampling the sensorium randomly for a percept with low free energy. There is evidence that our eye movements implement an optimal stochastic strategy. This raises interesting questions about the role of stochastic searches, from visual search to foraging, in both perception and action.
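The two schemes can be contrasted on a toy free-energy landscape (entirely illustrative):

```python
import numpy as np

# Deterministic gradient descent vs stochastic search on the same
# one-dimensional free-energy landscape F(x) = (x - 3)^2.
rng = np.random.default_rng(5)
F = lambda x: (x - 3.0) ** 2

x_det, x_sto = 0.0, 0.0
for t in range(100):
    x_det -= 0.1 * 2 * (x_det - 3.0)       # deterministic: follow -dF/dx
    proposal = x_sto + rng.normal(0, 0.5)  # stochastic: sample a move...
    if F(proposal) < F(x_sto):             # ...keep it only if F falls
        x_sto = proposal

print(f"gradient descent:  x = {x_det:.3f}, F = {F(x_det):.5f}")
print(f"stochastic search: x = {x_sto:.3f}, F = {F(x_sto):.5f}")
```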

28 Summary
Free energy provides a comprehensive measure of how individuals represent the world and come to sample it adaptively. The goal is to minimize prediction error (i.e. to suppress free energy). Changes in synaptic activity, connectivity and gain can be understood as perceptual inference, learning and attention, respectively.

29 References
Friston, K. (2010). The free-energy principle: a unified brain theory? Nature Reviews Neuroscience, 11(2), 127-138.
Friston, K. (2009). The free-energy principle: a rough guide to the brain? Trends in Cognitive Sciences, 13(7), 293-301.
Friston, K., Kilner, J., & Harrison, L. (2006). A free energy principle for the brain. Journal of Physiology - Paris, 100, 70-87.

30 Thank You

