Reasons to be careful about reward
A flow (policy) cannot be specified with a scalar function of states: the fundamental theorem of vector calculus, aka the Helmholtz decomposition.
Any (curl-free) flow specified with reward can only have a fixed-point attractor: reward cannot specify itinerant movement or policies.
Value is produced by flow, not its cause: reward is a consequence of (defined by) behaviour, not its cause.
The inherent tautology of reward: explaining behaviour in terms of maximising reward is like explaining the evolution of the eye by saying it maximises adaptive value.
Unresolved questions in motor control: A UCL-JHU workshop
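The fixed-point claim can be made explicit with a short derivation. This is a sketch under stated assumptions: the symbols $f$, $V$ and $W$ are introduced here for illustration and do not appear on the slides, the flow is taken to ascend the gradient of a smooth scalar value function $V$, and the three-dimensional form of the decomposition is used.

```latex
% Assumption: a curl-free flow is the gradient of a scalar value function V.
\dot{x} = f(x) = \nabla V(x)
\quad\Longrightarrow\quad
\frac{d}{dt} V\bigl(x(t)\bigr) = \nabla V(x) \cdot \dot{x} = \lVert \nabla V(x) \rVert^{2} \;\ge\; 0 .

% V increases strictly along every trajectory that is not at a critical point.
% A periodic orbit would need V to return to its starting value, so none exists;
% bounded trajectories can only settle where \nabla V = 0, that is, at fixed points.

% Helmholtz decomposition (smooth, suitably decaying field in three dimensions):
f(x) = \underbrace{\nabla V(x)}_{\text{curl-free}} \;+\; \underbrace{\nabla \times W(x)}_{\text{divergence-free}} .
```

Under these assumptions, any itinerant (cyclic or wandering) behaviour has to be carried by the divergence-free part, which a scalar function alone does not determine.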

A Physicist, An Engineer, An Economist

Random dynamical systems; Random attractors with small measure; Kolmogorov forward equation; Free-energy formulation; Ergodic theorem; Helmholtz decomposition; Value and reward; Free energy upper bounds expected cost

Value and reward; Helmholtz decomposition; Optimal control theory
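To illustrate how the two parts of a Helmholtz decomposition relate to value, here is a minimal numerical sketch in pure Python. Everything in it is an assumption made for illustration and not taken from the slides: the quadratic value function, the parameters `gamma` and `q`, and the helper names. The curl-free part climbs the value function and stops at its peak, while the divergence-free part circulates along contours of constant value.

```python
import math

def value(x, y):
    """Illustrative value function V(x, y) = -(x^2 + y^2) / 2, peaked at the origin."""
    return -0.5 * (x * x + y * y)

def flow(x, y, gamma, q):
    """Flow f = (gamma * I + q * J) grad V, with J = [[0, -1], [1, 0]].

    gamma scales the curl-free (gradient) part; q scales the
    divergence-free part, which runs along contours of V.
    """
    gx, gy = -x, -y  # grad V for the quadratic value function above
    return gamma * gx - q * gy, gamma * gy + q * gx

def rk4_step(x, y, gamma, q, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = flow(x, y, gamma, q)
    k2 = flow(x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1], gamma, q)
    k3 = flow(x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1], gamma, q)
    k4 = flow(x + dt * k3[0], y + dt * k3[1], gamma, q)
    return (x + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
            y + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0)

def simulate(gamma, q, x0=1.0, y0=0.0, dt=0.01, steps=2000):
    """Integrate the flow and return the visited points as a list of (x, y)."""
    path = [(x0, y0)]
    for _ in range(steps):
        path.append(rk4_step(*path[-1], gamma, q, dt))
    return path

def winding_angle(path):
    """Total signed angle (radians) swept around the origin along the path."""
    total = 0.0
    for (xa, ya), (xb, yb) in zip(path, path[1:]):
        step = math.atan2(yb, xb) - math.atan2(ya, xa)
        total += (step + math.pi) % (2.0 * math.pi) - math.pi  # unwrap to [-pi, pi)
    return total

if __name__ == "__main__":
    for name, gamma, q in [("curl-free only", 1.0, 0.0), ("divergence-free only", 0.0, 1.0)]:
        path = simulate(gamma, q)
        print(name, "final value:", value(*path[-1]), "angle swept:", winding_angle(path))
```

In this toy setting the purely curl-free flow never rotates and its value rises monotonically to the maximum, whereas the purely divergence-free flow keeps circling without changing value, so the value function by itself says nothing about that motion.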

Value and reward

Forward models in motor control
Diagram labels: Intrinsic frame of reference; Extrinsic frame of reference; hidden states; control; Optimal control; Motor commands; Efference copy; Forward model; State estimation; Sensory mapping; Cost function; Plant kinetics

Predictive coding in motor control
Diagram labels: Intrinsic frame of reference; Extrinsic frame of reference; Optimal control; Motor commands; Efference copy; Sensory mapping; Cost function; Plant kinetics; Forward model; Top-down predictions; Bottom-up prediction error; sensations; control

Active inference
Diagram labels: Intrinsic frame of reference; Extrinsic frame of reference; sensations; Classical reflex; Corollary discharge; Sensory mapping; Prior beliefs; Plant kinetics; movements; Forward model; Bottom-up prediction error; Proprioceptive predictions

Action with point attractors (cf. the equilibrium point hypothesis)
Figure labels: visual input; proprioceptive input; Descending proprioceptive predictions; Exteroceptive predictions

Action with heteroclinic cycles
[Figure: two panels, "action" and "observation", with axes position (x) and position (y). Labels: Heteroclinic cycle; Descending proprioceptive predictions]
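For readers unfamiliar with heteroclinic cycles, the sketch below simulates a textbook example, the three-variable May-Leonard competition system. It is not the model behind the figure on this slide; the system, the parameter values `alpha` and `beta`, the initial state, and all function names are assumptions chosen for illustration. The trajectory approaches each of three saddle points in turn, dwelling near one before moving on to the next, which is the kind of itinerant motion that a fixed-point attractor cannot produce.

```python
def may_leonard(state, alpha, beta):
    """Right-hand side of the May-Leonard system (illustrative, not from the slides).

    dx_i/dt = x_i * (1 - x_i - alpha * x_{i+1} - beta * x_{i+2}), indices mod 3.
    """
    n = len(state)
    return [
        state[i] * (1.0 - state[i]
                    - alpha * state[(i + 1) % n]
                    - beta * state[(i + 2) % n])
        for i in range(n)
    ]

def rk4_step(state, alpha, beta, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(base, k, scale):
        return [b + scale * ki for b, ki in zip(base, k)]

    k1 = may_leonard(state, alpha, beta)
    k2 = may_leonard(shifted(state, k1, 0.5 * dt), alpha, beta)
    k3 = may_leonard(shifted(state, k2, 0.5 * dt), alpha, beta)
    k4 = may_leonard(shifted(state, k3, dt), alpha, beta)
    return [s + dt * (a + 2 * b + 2 * c + d) / 6.0
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

def simulate(x0=(0.6, 0.3, 0.1), alpha=1.6, beta=0.5, dt=0.01, steps=30000):
    """Integrate from x0 and return the list of visited states."""
    state = list(x0)
    path = [state]
    for _ in range(steps):
        state = rk4_step(state, alpha, beta, dt)
        path.append(state)
    return path

def dominant_sequence(path):
    """Index of the largest variable over time, with consecutive repeats collapsed."""
    seq = []
    for state in path:
        k = max(range(len(state)), key=lambda i: state[i])
        if not seq or seq[-1] != k:
            seq.append(k)
    return seq

if __name__ == "__main__":
    # Each entry marks a visit to the neighbourhood of one saddle point.
    print(dominant_sequence(simulate()))
```

With `beta < 1 < alpha`, the dominant variable is handed on in a fixed cyclic order (0, 1, 2, 0, ...) rather than settling on a single state.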

Unresolved questions in motor control: A UCL-JHU workshop
Reasons to be careful about reward
A flow (policy) cannot be specified with a scalar function of states: the fundamental theorem of vector calculus, aka the Helmholtz decomposition.
Any (curl-free) flow specified with reward can only have a fixed-point attractor: reward cannot specify itinerant movement or policies.
Value is produced by flow, not its cause: reward is a consequence of (defined by) behaviour, not its cause.
The inherent tautology of reward: explaining behaviour in terms of maximising reward is like explaining the evolution of the eye by saying it maximises adaptive value (cf. intelligent design).
