Reasons to be careful about reward A flow (policy) cannot be specified with a scalar function of states: the fundamental theorem of vector calculus – aka.


1 Reasons to be careful about reward
- A flow (policy) cannot be specified with a scalar function of states: this is the fundamental theorem of vector calculus, also known as the Helmholtz decomposition.
- Any (curl-free) flow specified with reward can only have a fixed-point attractor: reward cannot specify itinerant movement or policies.
- Value is produced by flow, not its cause: reward is a consequence of (defined by) behaviour, not its cause.
- The inherent tautology of reward: explaining behaviour in terms of maximising reward is like explaining the evolution of the eye by saying it maximises adaptive value.
Unresolved questions in motor control: A UCL-JHU workshop
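The first two points on this slide can be illustrated numerically. The following is a minimal sketch, not taken from the presentation: the scalar function U, the rotation matrix J, and the helper names `gradient_flow`, `itinerant_flow`, `curl` and `final_speed` are all assumptions chosen for illustration. It compares a flow that is the gradient of a scalar with the same flow plus a rotational (divergence-free) component.

```python
import numpy as np

# Hypothetical 2-D example; U, J and all function names are illustrative assumptions.
J = np.array([[0.0, -1.0], [1.0, 0.0]])  # 90-degree rotation matrix

def gradient_flow(x):
    # Gradient of the scalar U(x) = r^2/2 - r^4/4, which equals (1 - r^2) x.
    return (1.0 - x @ x) * x

def itinerant_flow(x):
    # Same gradient part plus a rotational part J x that no scalar can generate.
    return gradient_flow(x) + J @ x

def curl(flow, x, h=1e-5):
    # Central-difference estimate of d(f_y)/dx - d(f_x)/dy at the point x.
    ex, ey = np.array([h, 0.0]), np.array([0.0, h])
    dfy_dx = (flow(x + ex)[1] - flow(x - ex)[1]) / (2 * h)
    dfx_dy = (flow(x + ey)[0] - flow(x - ey)[0]) / (2 * h)
    return dfy_dx - dfx_dy

def final_speed(flow, x0, dt=0.01, steps=5000):
    # Forward-Euler integration; returns the speed |f(x)| at the end point.
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        x = x + dt * flow(x)
    return np.linalg.norm(flow(x))

if __name__ == "__main__":
    p = np.array([0.3, 0.7])
    print("curl, gradient flow: ", curl(gradient_flow, p))
    print("curl, itinerant flow:", curl(itinerant_flow, p))
    print("final speed, gradient flow: ", final_speed(gradient_flow, [0.5, 0.2]))
    print("final speed, itinerant flow:", final_speed(itinerant_flow, [0.5, 0.2]))
```

Under these assumptions the gradient flow is curl-free and its trajectory comes to rest at a fixed point, whereas the flow with the added rotation has non-zero curl and keeps circulating around the unit circle. Both flows share the same scalar U, so the scalar alone does not determine which behaviour occurs.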

2 A Physicist An Engineer An Economist

3 Random dynamical systems Random attractors with small measure Kolmogorov forward equation Free-energy formulation Ergodic theorem Helmholtz decomposition Value and reward Free energy upper bounds expected cost

4 Value and reward Helmholtz decomposition Optimal control theory
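The link between the Helmholtz decomposition and value can be sketched as a short derivation. The notation here is an assumption, not the presenter's: f is a deterministic flow over states x, V is a scalar value function, and w is a divergence-free remainder. The sketch shows why a flow built from V alone cannot return to where it started.

```latex
\begin{align}
  % Helmholtz decomposition (assumed notation): gradient part plus divergence-free part
  f(x) &= \nabla V(x) + w(x), \qquad \nabla \cdot w(x) = 0 \\
  % If the flow is specified by the scalar alone (w = 0), V never decreases along a trajectory
  \dot{x} = \nabla V(x) \;\Rightarrow\; \frac{dV}{dt}
       &= \nabla V(x) \cdot \dot{x} = \lVert \nabla V(x) \rVert^{2} \ge 0
\end{align}
```

Because V increases strictly wherever its gradient is non-zero, a trajectory of the pure gradient flow cannot form a closed orbit; a bounded trajectory can only settle where the gradient vanishes, that is, at fixed points. Cyclic or itinerant behaviour therefore has to come from the component w, which a scalar function does not specify.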

5 Value and reward

6 Forward models in motor control Intrinsic frame of reference Extrinsic frame of reference hidden states control Optimal control Motor commands Efference copy Forward model State estimation Sensory mapping Cost function Plant kinetics

7 Predictive coding in motor control Intrinsic frame of reference Extrinsic frame of reference Optimal control Motor commands Efference copy Sensory mapping Cost function Plant kinetics Forward model Top-down predictions Bottom-up prediction error sensations control

8 Active inference Intrinsic frame of reference Extrinsic frame of reference sensations Classical reflex Corollary discharge Sensory mapping Prior beliefs Plant kinetics movements Forward model Bottom up prediction error Proprioceptive predictions

9 visual input proprioceptive input Action with point attractors (cf. the equilibrium point hypothesis) Descending proprioceptive predictions Exteroceptive predictions

10 Action with heteroclinic cycles Descending proprioceptive predictions Heteroclinic cycle [Figure: two panels, action and observation, each plotting position (y) against position (x)]

11 Unresolved questions in motor control: A UCL-JHU workshop
Reasons to be careful about reward
- A flow (policy) cannot be specified with a scalar function of states: this is the fundamental theorem of vector calculus, also known as the Helmholtz decomposition.
- Any (curl-free) flow specified with reward can only have a fixed-point attractor: reward cannot specify itinerant movement or policies.
- Value is produced by flow, not its cause: reward is a consequence of (defined by) behaviour, not its cause.
- The inherent tautology of reward: explaining behaviour in terms of maximising reward is like explaining the evolution of the eye by saying it maximises adaptive value (cf. intelligent design).

