Computational aspects of motor control and motor learning Michael I. Jordan* Mark J. Buller (mbuller) 21 February 2007 *In H. Heuer & S. Keele (Eds.), Handbook of Perception and Action: Motor Skills. New York: Academic Press, 1996.


Computational aspects of motor control and motor learning
Michael I. Jordan*
Mark J. Buller (mbuller), 21 February 2007
*In H. Heuer & S. Keele (Eds.), Handbook of Perception and Action: Motor Skills. New York: Academic Press, 1996.

Overview: Relevance; Dynamical Systems (DS); DS Control Architecture (Feedforward, Feedback, Error-Correcting Feedback, Composite Control Systems); State Estimation; Learning Algorithms; Plant Controller Learning

Relevance: Jordan provides the architectural "nuts and bolts" for the robot control loop. [Diagram: Decision Making → Motion Control → Plant → Sensing → Perception, with signals y^[t], x[t], a[t], u[t].]

Dynamical Systems: an entity with a time-dependent state. "Many useful dynamical systems models are simply descriptive models of the temporal evolution of an interrelated set of variables." (Jordan, p. 7) Example: a ball, with state [mass, velocity, acceleration]; Newtonian mechanics lets us predict the ball's location at time [t+1].
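The ball example can be made concrete. A minimal sketch, assuming a purely vertical throw, standard gravity, and illustrative initial values (none of these numbers are from the slides), that predicts the ball's state at time t+1 from Newtonian mechanics:

```python
G = 9.81  # gravitational acceleration in m/s^2 (assumed value)

def predict(pos, vel, dt):
    """Closed-form constant-acceleration update: where the ball will be
    at time t + dt, given its state [position, velocity] at time t."""
    new_pos = pos + vel * dt - 0.5 * G * dt ** 2
    new_vel = vel - G * dt
    return new_pos, new_vel

# state at time t: height 0 m, thrown upward at 10 m/s
pos1, vel1 = predict(0.0, 10.0, 1.0)  # predicted state at time t+1
```

This is exactly the "forward" direction discussed on the next slides: from state and input to a predicted next state.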

Dynamical System Control: given a dynamical system, what inputs are required to produce a given output? E.g., what force must be applied to the ball, and in what direction, to get it to the friend? Next-state equation: x[n+1] = f(x[n], u[n]). Output function: y[n] = g(x[n]). Input-output mapping equation: y[n+1] = h(x[n], u[n]).

Models: a dynamical-system model is at the heart of our ability to produce control inputs. Forward model: a causal (forward transformation) model that maps inputs to an output; a many-to-one mapping (e.g., the ball and Newtonian physics). Inverse model: a directional-flow model that maps a desired output back to inputs; a one-to-many mapping (e.g., joint angles and spatial position in an articulated arm, where a given position can be achieved in multiple ways).

Control: the problem of computing an input to the system that will achieve some desired behavior at its output. It involves computing, explicitly or implicitly, the inverse of the plant model. Jordan uses a simple first-order plant model as an example: x[n+1] = 0.5 x[n] + 0.4 u[n], y[n] = x[n], so y[n+1] = 0.5 x[n] + 0.4 u[n]. Solving for u[n]: u[n] = -1.25 x^[n] + 2.5 y*[n+1], where x^[n] is the estimated state and y*[n+1] is the desired output. How is the state estimated?
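The slide's algebra can be checked numerically. A small sketch (plant coefficients from the slide; the initial state and target are arbitrary illustrations) showing that, with perfect state knowledge, the inverse-model control law reaches the desired output in one step:

```python
def plant(x, u):
    """Jordan's first-order example plant: x[n+1] = 0.5 x[n] + 0.4 u[n], y[n] = x[n]."""
    return 0.5 * x + 0.4 * u

def inverse_control(x_hat, y_star):
    """u[n] = -1.25 x^[n] + 2.5 y*[n+1], from solving y[n+1] = y* for u[n]."""
    return -1.25 * x_hat + 2.5 * y_star

x = 0.3                  # current (perfectly known) state
y_star = 1.0             # desired output at the next step
x_next = plant(x, inverse_control(x, y_star))  # lands on y_star in one step
```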

Open-Loop Feedforward Controller: x^[n] is estimated from y*[n] (the desired output). Pros: a simple controller; if the model is good, y* and y will be close. Cons: rests on the large assumption that the model is correct; errors can grow and compound. Example: the vestibulo-ocular reflex (VOR), which couples movement of the eyes to motion of the head by transforming head velocity into eye velocity.

Error-Correcting Feedback Controller: does not rely on an explicit inverse of the plant model; instead it works directly to correct the error at the current time step between the desired plant output y*[n] and the actual plant output y[n]: u[n] = K (y*[n] - y[n]), where K is a scalar gain. Pros: no explicit inverse of the plant model is needed; more robust to unanticipated disturbances. Cons: corrects error only after it has occurred; still has error even under ideal conditions; can be unstable.
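A sketch of this controller on the same first-order example plant (the gain values are illustrative). It demonstrates two of the listed cons: even a stable gain leaves a steady-state error, and a large gain destabilizes the loop:

```python
def plant(x, u):
    return 0.5 * x + 0.4 * u   # Jordan's example plant, with y[n] = x[n]

def run_feedback(K, steps=50, y_star=1.0):
    """Iterate u[n] = K (y*[n] - y[n]) from x[0] = 0 and return the final output."""
    x = 0.0
    for _ in range(steps):
        u = K * (y_star - x)
        x = plant(x, u)
    return x

y_small_gain = run_feedback(K=1.0)   # settles at 0.4/0.9, well short of y* = 1
y_large_gain = run_feedback(K=10.0)  # closed loop x[n+1] = -3.5 x[n] + 4: diverges
```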

Feedback Controller: x^[n] is estimated from y[n] (the measured output). Pros: a very simple controller; more robust to unanticipated disturbances; can avoid compounding of errors. Cons: suffers if the model is poor or inaccurate; feedback can introduce instability.

Composite Control Systems: combine the complementary strengths of the feedforward controller and the feedback controller.
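A sketch of such a composite system on the example plant, where the feedforward path uses a deliberately mis-calibrated inverse model (it believes the input gain is 0.45 rather than the true 0.4; all values are illustrative) and a feedback term corrects part of the resulting error:

```python
def plant(x, u):
    return 0.5 * x + 0.4 * u          # true plant

def feedforward(x, y_star):
    """Inverse of a slightly wrong internal model (assumes input gain 0.45)."""
    return (y_star - 0.5 * x) / 0.45

def run(use_feedback, steps=200, y_star=1.0, K=1.0):
    x = 0.0
    for _ in range(steps):
        u = feedforward(x, y_star)
        if use_feedback:
            u += K * (y_star - x)     # feedback corrects the model mismatch
        x = plant(x, u)
    return x

err_ff = abs(1.0 - run(use_feedback=False))        # feedforward alone
err_composite = abs(1.0 - run(use_feedback=True))  # composite lands closer to y*
```

The feedforward term does most of the work; the feedback term shrinks the residual error caused by the imperfect model, which is exactly the complementarity the slide describes.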

State Estimation: the previous examples assume that the state can be determined from the output of the system, or assumed equal to the desired output; this estimated state is then used to compute the input for the next iteration. Often the system output is a more complex function of the state, and simply inverting the output function will not work: (1) there are more state variables than output variables, so the function is not uniquely invertible; (2) there is uncertainty about the dynamics of the system as seen through the output function. "State estimation is a dynamic process": "robust estimation of the state of a system requires observing the output of the system over an extended period of time."

State Estimation with Observers: an observer is an internal simulation of the plant running in parallel with it. The actual plant output is compared with the observer's predicted output, and the output errors are used to correct the state estimate. The gain K is set according to the relative noise levels of the next-state and output measurement processes: if the output noise exceeds the next-state noise, K is low; if the next-state noise exceeds the output noise, K is high.
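A sketch of an observer for the scalar example plant (the gain value and noise-free setting are illustrative): the internal simulation runs in parallel with the plant and is nudged by the output prediction error until the estimate locks onto the true state:

```python
def plant_step(x, u):
    return 0.5 * x + 0.4 * u

def observer_step(x_hat, u, y, K=0.4):
    """Internal simulation plus correction: x^[n+1] = 0.5 x^[n] + 0.4 u[n] + K (y[n] - y^[n])."""
    y_hat = x_hat                  # predicted output (here y = x)
    return 0.5 * x_hat + 0.4 * u + K * (y - y_hat)

x, x_hat = 1.0, 0.0                # true state vs. a badly initialized estimate
for _ in range(30):
    u = 0.2                        # arbitrary constant input
    x_hat = observer_step(x_hat, u, y=x)   # correct estimate with measured output
    x = plant_step(x, u)
```

With this gain the estimation error shrinks by a factor of (0.5 - K) = 0.1 each step, so the estimate converges even from a poor initial guess.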

Learning Algorithms: the previous examples dealt with systems and plants in relatively benign, finite settings. A system that must interact with the real world will encounter situations and objects that do not conform to its model; an adaptive process lets the system update its control mechanisms. Learning algorithms can be trained in two ways: (1) present the whole gamut of available data before deployment, and update the algorithm only periodically; or (2) update the control models after the presentation of each new piece of training data, a.k.a. on-line learning.

Machine Learning Tools: Jordan presents two main classes of learning algorithms. Classifiers map inputs onto a set of discrete outputs, e.g. the perceptron, which updates its weights based on performance on the training examples (an on-line technique). Regression maps inputs onto a continuous output variable, e.g. least-squares regression (linear or polynomial). Many other machine learning techniques are applicable; see Bishop, C. M. (2006). Pattern Recognition and Machine Learning. New York: Springer.
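The perceptron mentioned above can be sketched in a few lines; the toy linearly separable dataset and learning rate are illustrative assumptions:

```python
def train_perceptron(data, epochs=20, lr=1.0):
    """On-line perceptron: weights change only when an example is misclassified."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for (x1, x2), label in data:              # label is -1 or +1
            pred = 1 if w[0] * x1 + w[1] * x2 + b > 0 else -1
            if pred != label:                     # error-driven update
                w[0] += lr * label * x1
                w[1] += lr * label * x2
                b += lr * label
    return w, b

# toy linearly separable data (assumed): positive roughly when x1 + x2 >= 2
data = [((0, 0), -1), ((1, 0), -1), ((0, 1), -1),
        ((1, 1), 1), ((2, 1), 1), ((1, 2), 1)]
w, b = train_perceptron(data)
```

Because updates happen one example at a time, the same loop works in the on-line setting Jordan emphasizes.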

Bringing It All Together: motor learning, or plant controller learning, is the problem of learning an inverse model of the plant. Three approaches: direct inverse modeling, distal supervised learning, and feedback error learning.

Direct Inverse Learning: present input-output pairs to the supervised learning algorithm (an off-line technique). Given the plant input at time [t-1], together with the plant output and estimated state, the learning algorithm minimizes the error between its estimate of the control inputs and the actual control inputs at [t-1]. The approach works well for linear systems, but can yield invalid controller inputs for nonlinear systems.
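A sketch of direct inverse modeling on the linear example plant: drive the plant with random inputs, record (x[n], y[n+1], u[n]) triples, and least-squares fit u[n] from (x[n], y[n+1]). On this noise-free linear plant the fit recovers the exact inverse law from the earlier slide, u[n] = -1.25 x[n] + 2.5 y[n+1]. The data-generation details are illustrative.

```python
import random

random.seed(0)
pairs, targets = [], []
x = 0.0
for _ in range(200):                  # babble with random control inputs
    u = random.uniform(-1.0, 1.0)
    x_next = 0.5 * x + 0.4 * u        # plant step; y[n+1] = x[n+1]
    pairs.append((x, x_next))
    targets.append(u)
    x = x_next

# solve the 2x2 normal equations for u ~ a * x[n] + b * y[n+1]
s11 = sum(p * p for p, q in pairs)
s12 = sum(p * q for p, q in pairs)
s22 = sum(q * q for p, q in pairs)
r1 = sum(p * u for (p, q), u in zip(pairs, targets))
r2 = sum(q * u for (p, q), u in zip(pairs, targets))
det = s11 * s22 - s12 * s12
a = (r1 * s22 - r2 * s12) / det       # expect a close to -1.25
b = (r2 * s11 - r1 * s12) / det       # expect b close to 2.5
```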

Direct Inverse Learning, Problems: the nonconvexity problem. If the training data contain a single output (a location of the arm in Cartesian space) that is reached from three different sets of input variables, many learning algorithms will average over those inputs and produce a learned solution that is a physical impossibility for the arm.
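The nonconvexity problem can be demonstrated with a planar two-joint arm (unit link lengths, an illustrative choice): two joint configurations reach the same hand position, but a learner that averages them produces a configuration that misses the target:

```python
import math

def hand_position(theta1, theta2):
    """Forward kinematics of a two-joint planar arm with unit-length links."""
    x = math.cos(theta1) + math.cos(theta1 + theta2)
    y = math.sin(theta1) + math.sin(theta1 + theta2)
    return x, y

elbow_a = (0.0, math.pi / 2)              # one way to put the hand at (1, 1)
elbow_b = (math.pi / 2, -math.pi / 2)     # a second way to reach the same point
average = ((elbow_a[0] + elbow_b[0]) / 2,
           (elbow_a[1] + elbow_b[1]) / 2) # averaged "solution" misses (1, 1)
```

Averaging is valid only when the set of inverse solutions is convex; here it is not, which is why direct inverse learning fails on such plants.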

Feedback Error Learning: the desired plant output is used for both control and learning, so learning can be conducted on-line. It is goal-directed, in the sense that it tries to minimize the error between the actual and the desired plant output; the feedback signal "guides" the learning of the feedforward controller.

Distal Supervised Learning: aims to solve the nonlinear inverse-modeling problem with a composite system of a forward plant model and a feedforward controller model. Two interacting processes are used in learning the system: the forward model is learned, and the forward model is then used in learning the feedforward controller. This approach avoids the nonconvexity problem because the feedforward controller learns to minimize the output error.

Distal Supervised Learning II: the forward model is trained using the prediction error (y[n] - y^[n]); the composite learning system (forward model plus feedforward controller) is trained using the performance error (y*[n] - y[n]), with the forward model held fixed.
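The two training signals can be sketched end-to-end on a deliberately trivial memoryless linear "plant" y = 2u (the plant, learning rates, and iteration counts are all illustrative assumptions, far simpler than Jordan's setting): first the forward model y^ = w*u is fit from the prediction error, then the controller u = v*y* is trained from the performance error propagated through the frozen forward model.

```python
import random

random.seed(1)
PLANT_GAIN = 2.0          # the true plant, unknown to the learner: y = 2 u
w, v, lr = 0.0, 0.0, 0.1

# Process 1: learn the forward model from random motor babbling,
# driven by the prediction error (y - y^)
for _ in range(500):
    u = random.uniform(-1.0, 1.0)
    y = PLANT_GAIN * u
    w += lr * (y - w * u) * u

# Process 2: learn the controller through the fixed forward model,
# driven by the performance error (y* - y); the chain rule contributes dy^/du = w
for _ in range(500):
    y_star = random.uniform(-1.0, 1.0)
    u = v * y_star
    y = PLANT_GAIN * u
    v += lr * (y_star - y) * w * y_star
```

After training, w approximates the plant gain and v approximates its inverse, so the controller output v*y* drives the plant close to any target y*.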

Conclusions: Jordan presents a series of control architectures and control-policy learning techniques. Inverse and forward models play complementary roles: inverse models are the basis for predictive control, while forward models can be used to anticipate and cancel delayed feedback and serve as basic building blocks for dynamical state estimation. When these models are acquired with machine learning techniques, they provide capabilities for prediction, control, and error correction that allow the system to cope with difficult nonlinear control problems. "General rule... partial knowledge is better than no knowledge, if used appropriately."

Applications to Roomba Tag: Which control architecture(s)? Which learning algorithm(s)? A holistic design vs. a set of desired behaviors? A single control architecture, or multiple architectures for different functions: navigate; find the Roomba; stalk the Roomba; find a hiding spot; navigate to the hiding spot.