
1 Adaptive Critic Design for Aircraft Control Silvia Ferrari Advisor: Prof. Robert F. Stengel Princeton University FAA/NASA Joint University Program on Air Transportation, MIT, Cambridge, MA January 18-19, 2001

2 Table of Contents
Introduction and motivation
Optimal control problem
Dynamic programming formulation
Adaptive critic designs
Proportional-Integral Neural Network Controller
Pre-training of action and critic networks
On-line training of action and critic networks
Summary and Conclusions

3 Introduction
Gain-scheduled linear controllers for nonlinear systems
Classical/neural synthesis of control systems: prior knowledge
Adaptive control and artificial neural networks
Adaptive critics:
  Learn in real time
  Cope with noise
  Cope with many variables
  Plan over time in a complex way...
Action network takes immediate control action
Critic network estimates projected cost

4 Motivation for Neural Network-Based Controller
Network of networks motivated by the linear control structure
Multiphase learning: pre-training, then on-line training during piloted simulations or testing
Improved global control
Pre-training phase provides:
  A global neural network controller
  An excellent initialization point for on-line learning
On-line training accounts for:
  Differences between actual and assumed dynamic models
  Nonlinear effects not captured in linearizations

5 Nonlinear Business Jet Aircraft Model
Aircraft equations of motion: ẋ(t) = f[x(t), u(t), p(t)], with state vector x(t), control vector u(t), parameters p(t), and output vector y(t).
[Figure: aircraft body axes X_B, Y_B, Z_B, with thrust, drag, lift, weight mg, velocity V, and angular rates p, q, r.]
Bolza-type cost function minimization:
J = φ[x(t_f), t_f] + ∫ from t_0 to t_f of L[x(t), u(t)] dt

6 Value Function Minimization
The minimization of J can be embedded in the following problem. Value function for [t, t_f]:
V[x(t), t] = φ[x(t_f), t_f] + ∫ from t to t_f of L[x(τ), u(τ)] dτ
[Figure: plots of V and J versus time on [t_0, t_f], each terminating at the terminal cost φ[x(t_f), t_f].]

7 Discretized Optimization Problem
Approximate the equations of motion by a difference equation: x_{k+1} = f_D[x_k, u_k], k = 0, ..., N-1
Similarly, the cost function: J = φ[x_N] + Σ_{k=0}^{N-1} L_D[x_k, u_k]
The cost of operation during the last stage is: V_{N-1} = L_D[x_{N-1}, u_{N-1}] + φ[x_N], where x_N = f_D[x_{N-1}, u_{N-1}]
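As an illustration of this discretization (not part of the original slides), the sketch below approximates placeholder scalar dynamics and a Bolza-type cost with an Euler difference equation and a finite sum; the functions f, L, and phi are arbitrary stand-ins, not the business-jet model.

```python
import numpy as np

dt = 0.01                      # integration / decision interval
f = lambda x, u: -x + u        # placeholder continuous dynamics, x_dot = f(x, u)
L = lambda x, u: x**2 + u**2   # placeholder utility (Lagrangian)
phi = lambda x: 10.0 * x**2    # placeholder terminal cost

def simulate_cost(x0, u_seq):
    """Euler-discretized state propagation and accumulated cost."""
    x, J = x0, 0.0
    for u in u_seq:
        J += L(x, u) * dt      # stage cost L_D[x_k, u_k]
        x = x + f(x, u) * dt   # difference equation x_{k+1} = f_D[x_k, u_k]
    return x, J + phi(x)       # total cost includes the terminal cost phi[x_N]

xf, J = simulate_cost(1.0, np.zeros(500))
print(f"x(t_f) = {xf:.4f}, J = {J:.4f}")
```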

8 The Principle of Optimality
Given points a, b, c along a trajectory with cost V_ab incurred from a to b: by the principle of optimality, if V*_abc is optimal from a to c, then V*_bc is optimal from b to c, and V*_abc = V_ab + V*_bc.
Similarly, the optimal cost over the last two intervals is given by:
V*_{N-2} = min over {u_{N-2}, u_{N-1}} of { L_D[x_{N-2}, u_{N-2}] + L_D[x_{N-1}, u_{N-1}] + φ[x_N] }
The optimal cost, from t_{N-2} to t_f, is then:
V*_{N-2} = min over u_{N-2} of { L_D[x_{N-2}, u_{N-2}] + V*_{N-1} }

9 A Recurrence Relationship of Dynamic Programming
So, the optimal cost for a 2-stage process can be re-written as:
V*_{N-2} = min over u_{N-2} of { L_D[x_{N-2}, u_{N-2}] + V*_{N-1} }
Then, for a k-stage process:
V*_{N-k} = min over u_{N-k} of { L_D[x_{N-k}, u_{N-k}] + V*_{N-k+1} }
Backward Dynamic Programming: begin at t_f, move backward to t_0, store all u and V. Off-line process.
[Figure: branching paths a (or b') → b → {c, d, e} → f with associated costs V_bc, V_bd, V_be, V_cf, V_df, V_ef.]
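The backward procedure above can be illustrated with a small tabular sketch. The grid, stage count, and scalar dynamics below are placeholders chosen for brevity; the point is the recurrence V*_k = min_u { L_D + V*_{k+1} }, evaluated from t_f backward to t_0 while storing u and V.

```python
import numpy as np

dt, N = 0.05, 40
xs = np.linspace(-2.0, 2.0, 81)          # discretized state grid
us = np.linspace(-2.0, 2.0, 41)          # discretized control grid
f = lambda x, u: -x + u                  # placeholder dynamics
L = lambda x, u: x**2 + u**2             # placeholder utility
phi = lambda x: 10.0 * x**2              # placeholder terminal cost

V = phi(xs)                               # V*_N: terminal cost at every grid point
policy = np.zeros((N, xs.size))           # stored optimal controls, per stage
for k in reversed(range(N)):              # begin at t_f, move backward to t_0
    x_next = xs[:, None] + f(xs[:, None], us[None, :]) * dt
    V_next = np.interp(x_next, xs, V)     # V*_{k+1} at x_{k+1} (clamped at grid edges)
    Q = L(xs[:, None], us[None, :]) * dt + V_next
    best = Q.argmin(axis=1)               # minimize over u at every grid point
    policy[k] = us[best]                  # store all u ...
    V = Q[np.arange(xs.size), best]       # ... and V (principle of optimality)

print("V*(x0 = 1) ≈", np.interp(1.0, xs, V))
```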

10 Forward Dynamic Programming
Estimate the cost-to-go function, determine the immediate control action, and move forward in time, updating the cost-to-go function. On-line process.
Cost associated with going from t_k to t_f:
V[x(t_k)] = L[x(t_k), u(t_k)] + V[x(t_{k+1})]   (utility at t_k plus the estimated cost from t_{k+1} to t_f)
Hereon, let t = t_k, t+1 = t_{k+1}, etc.
[Figure: path a → b → c → f, with incurred cost V_ab and remaining cost V_bc onward.]
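The forward, on-line viewpoint can be sketched the same way: an approximate cost-to-go model stands in for V, the immediate control is chosen to minimize the utility plus the estimated cost at t+1, and the system steps forward. The quadratic V_hat below is an arbitrary placeholder, not a trained critic.

```python
import numpy as np

dt = 0.05
f = lambda x, u: -x + u               # placeholder dynamics
L = lambda x, u: x**2 + u**2          # placeholder utility
V_hat = lambda x: 3.0 * x**2          # placeholder estimate of the cost-to-go

def forward_step(x, us=np.linspace(-2, 2, 201)):
    """Greedy one-step choice: u* = argmin_u { L(x,u) dt + V_hat(x at t+1) }."""
    x_next = x + f(x, us) * dt
    q = L(x, us) * dt + V_hat(x_next)
    u_star = us[q.argmin()]
    return u_star, x + f(x, u_star) * dt

x = 1.0
for t in range(5):                     # move forward in time from t_0
    u, x = forward_step(x)
    print(f"t = {t}: u = {u:+.3f}, x = {x:+.4f}")
```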

11 Adaptive Critic Designs... at a Glance!
Legend: 'H' = Heuristic, 'DP' = Dynamic Programming, 'D' = Dual, 'P' = Programming, 'G' = Global, 'AD' = Action Dependent.
HDP: the action network NN_A maps x to u; the critic NN_C maps x to the value V.
DHP: the critic maps x to the value-function derivatives ∂V/∂x.
GDHP: the critic estimates both V and its derivatives.
Action-dependent variants (ADHDP, ADDHP, ADGDHP): the critic takes the control u as an additional input.

12 Proportional-Integral Controller
Closed-loop stability: y_c = desired output, (x_c, u_c) = set point. Omitting Δ's, for simplicity:
[Block diagram: the command y_c and the state x(t) drive the forward gain C_F, feedback gain C_B, and integral gain C_I; the output y_s(t) is formed through H_x and H_u, compared with y_c, and its error integral ξ(t) feeds C_I; the gain outputs combine to form the aircraft control u(t).]

13 Linearized Business Jet Aircraft Model
Assuming small perturbations, expand about the nominal solution:
Δẋ(t) = F Δx(t) + G Δu(t), where Δx(t) = x(t) - x_c and Δu(t) = u(t) - u_c.
The state variable is augmented to include the output-error integral, ξ(t), and the Proportional-Integral cost function is defined as:
J = ½ ∫ [Δx_a(t)ᵀ Q Δx_a(t) + Δu(t)ᵀ R Δu(t)] dt, with augmented state Δx_a = [Δxᵀ ξᵀ]ᵀ.
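A quick way to see the small-perturbation expansion in practice is to compute the Jacobians F = ∂f/∂x and G = ∂f/∂u numerically about a nominal point (x_c, u_c). The two-state dynamics below are an illustrative placeholder, not the jet model.

```python
import numpy as np

def f(x, u):                                   # placeholder nonlinear dynamics
    return np.array([x[1], -np.sin(x[0]) - 0.5 * x[1] + u[0]])

def jacobians(f, xc, uc, eps=1e-6):
    """Central finite-difference Jacobians F = df/dx and G = df/du at (xc, uc)."""
    n, m = len(xc), len(uc)
    F, G = np.zeros((n, n)), np.zeros((n, m))
    for j in range(n):
        dx = np.zeros(n); dx[j] = eps
        F[:, j] = (f(xc + dx, uc) - f(xc - dx, uc)) / (2 * eps)
    for j in range(m):
        du = np.zeros(m); du[j] = eps
        G[:, j] = (f(xc, uc + du) - f(xc, uc - du)) / (2 * eps)
    return F, G

F, G = jacobians(f, np.array([0.0, 0.0]), np.array([0.0]))
print("F =\n", F, "\nG =\n", G)
```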

14 Linear Proportional-Integral Control Law Formulation
From the Euler-Lagrange equations, the optimal (*) control law is:
Δu*(t) = -R⁻¹ Gᵀ P Δx_a(t), where P is a Riccati matrix.
Furthermore, it can be shown that the optimal value function is
V*[Δx_a(t)] = ½ Δx_a(t)ᵀ P Δx_a(t),
and its partial derivative with respect to Δx_a is:
∂V*/∂Δx_a(t) = P Δx_a(t).
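These linear-quadratic relations can be checked numerically with an off-the-shelf Riccati solver. The system matrices below are arbitrary placeholders, not the augmented aircraft model; the sketch only illustrates the gain R⁻¹GᵀP, the quadratic value ½Δx_aᵀPΔx_a, and its gradient PΔx_a.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

F = np.array([[0., 1., 0.], [0., 0., 1.], [-1., -2., -3.]])  # placeholder augmented plant
G = np.array([[0.], [0.], [1.]])
Q, R = np.eye(3), np.array([[1.0]])

P = solve_continuous_are(F, G, Q, R)      # Riccati matrix P
C = np.linalg.solve(R, G.T @ P)           # optimal gain  R^{-1} G^T P

dxa = np.array([0.1, -0.2, 0.05])         # example augmented-state perturbation
du_star = -C @ dxa                        # optimal control law   Δu* = -R^{-1} G^T P Δx_a
V_star = 0.5 * dxa @ P @ dxa              # optimal value         V*  = ½ Δx_aᵀ P Δx_a
lam = P @ dxa                             # its gradient          λ   = ∂V*/∂Δx_a = P Δx_a
print("Δu* =", du_star, " V* =", round(V_star, 4), " λ =", lam)
```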

15 Proportional-Integral Neural Network Controller
[Block diagram: the command y_c is processed by the command generators (CSG, SVG) to produce the commanded state x_c, the nominal control u_c, and the scheduling vector a; NN_F contributes the forward control, NN_B produces the feedback correction u_B(t) from the state error, and NN_I produces u_I(t) from the output-error integral ξ(t); these contributions combine to form the aircraft control u(t), while the critic NN_C supplies ∂V/∂x_a.]
Where u_B(t) and u_I(t) denote the feedback and command-integral contributions to u(t).

16 Feedback and Command-Integral (Action) Neural Network Pre-Training
Feedback requirements (pre-training phase): NN_B accounts for regulation, Δu_B = NN_B(Δx, a). For each operating point, k, the network output and training inputs satisfy requirements (R1) and (R2).
Command-integral requirements (pre-training phase): NN_I provides dynamic compensation, Δu_I = NN_I(ξ, a). For each operating point, k, the network output and training inputs satisfy requirements (R1) and (R2).
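A hedged sketch of the pre-training idea: at every design point k, the feedback network should reproduce the corresponding linear control. The gains, design points, sign convention of the target Δu_B = -C_B,k Δx, and the tiny tanh network below are all illustrative placeholders, and the network is fitted here by plain gradient descent rather than the algebraic procedure referenced in the summary.

```python
import numpy as np
rng = np.random.default_rng(0)

# Placeholder scheduling variables a_k and linear feedback gains C_B,k (1 control, 2 states).
design_points = {0.2: np.array([[1.0, 0.5]]), 0.8: np.array([[2.0, 1.5]])}

# Training set: inputs (Δx, a), targets Δu_B = -C_B,k Δx, sampled near each design point.
X, T = [], []
for a, C in design_points.items():
    dx = rng.normal(scale=0.1, size=(200, 2))
    X.append(np.column_stack([dx, np.full(200, a)]))
    T.append(-dx @ C.T)
X, T = np.vstack(X), np.vstack(T)

# One-hidden-layer tanh network:  Δu_B = W2 tanh(W1 [Δx; a] + b1)
W1 = rng.normal(scale=0.5, size=(8, 3))
b1 = np.zeros(8)
W2 = rng.normal(scale=0.5, size=(1, 8))
for _ in range(3000):                     # least-squares fit by gradient descent
    H = np.tanh(X @ W1.T + b1)
    E = H @ W2.T - T                      # prediction error against the linear control
    grad_W2 = E.T @ H / len(X)
    dH = (E @ W2) * (1 - H**2)
    grad_W1 = dH.T @ X / len(X)
    grad_b1 = dH.mean(axis=0)
    W2 -= 1e-2 * grad_W2; W1 -= 1e-2 * grad_W1; b1 -= 1e-2 * grad_b1
print("pre-training fit MSE:", (E**2).mean())
```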

17 Critic Neural Network Pre-Training
Critic requirements (pre-training phase): NN_C estimates the value function derivatives, λ = NN_C(Δx_a, a). For each operating point, k, the output and training inputs follow from the Proportional-Integral optimal value function derivatives, λ = P_k Δx_a, giving requirements (R1) and (R2).
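Similarly, a sketch of how a critic pre-training set can be assembled from the linear-quadratic results: at each design point k the target is λ = P_k Δx_a. The Riccati matrices and design points below are placeholders, not values from the aircraft design.

```python
import numpy as np
rng = np.random.default_rng(1)

# Placeholder Riccati matrices P_k indexed by the scheduling variable a_k.
P_by_point = {0.2: np.diag([2.0, 1.0, 0.5]), 0.8: np.diag([4.0, 2.0, 1.0])}

inputs, targets = [], []
for a, P in P_by_point.items():
    dxa = rng.normal(scale=0.1, size=(200, 3))               # augmented-state perturbations
    inputs.append(np.column_stack([dxa, np.full(200, a)]))   # critic input (Δx_a, a)
    targets.append(dxa @ P.T)                                # critic target λ = P_k Δx_a
inputs, targets = np.vstack(inputs), np.vstack(targets)
print("critic pre-training set:", inputs.shape, "->", targets.shape)
```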

18 Action and Critic Neural Network On-line Training by Dual Heuristic Programming
The cost-to-go at time t, V[Δx_a(t)] = L[Δx_a(t), Δu(t)] + V[Δx_a(t+1)], must be minimized w.r.t. the control, Δu(t), for an estimated cost-to-go at (t+1).
The utility function is defined as: L[Δx_a(t), Δu(t)] = ½ [Δx_a(t)ᵀ Q Δx_a(t) + Δu(t)ᵀ R Δu(t)].
The same cost function is optimized in pre-training and in on-line learning.

19 Optimality Conditions for Dual Heuristic Programming
Differentiating both sides of the value function, V[Δx_a(t)], w.r.t. Δx_a(t):
λ(t) = ∂L/∂Δx_a(t) + [∂Δx_a(t+1)/∂Δx_a(t)]ᵀ λ(t+1)   (terms through ∂Δu(t)/∂Δx_a(t) vanish when the control satisfies the optimality equation below)
Differentiating w.r.t. the control, Δu(t), the optimality equation is:
∂L/∂Δu(t) + [∂Δx_a(t+1)/∂Δu(t)]ᵀ λ(t+1) = 0
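For a quadratic utility L = ½(Δx_aᵀQΔx_a + ΔuᵀRΔu) and a linearized transition Δx_a(t+1) = A Δx_a(t) + B Δu(t), the two relations reduce to simple matrix expressions. The sketch below evaluates them with placeholder A, B, Q, R and an assumed critic output λ(t+1); it is not the aircraft model.

```python
import numpy as np

A = np.array([[0.95, 0.10], [0.00, 0.90]])   # ∂Δx_a(t+1)/∂Δx_a(t)
B = np.array([[0.00], [0.10]])               # ∂Δx_a(t+1)/∂Δu(t)
Q, R = np.eye(2), np.array([[1.0]])

dxa = np.array([0.3, -0.1])                  # Δx_a(t)
lam_next = np.array([0.5, 0.2])              # λ(t+1), assumed supplied by the critic

# Differentiating V w.r.t. Δx_a(t):  λ(t) = Q Δx_a(t) + Aᵀ λ(t+1)
# (the policy-dependence terms drop out when the control satisfies the condition below)
lam_t = Q @ dxa + A.T @ lam_next

# Differentiating w.r.t. the control: optimality requires  R Δu(t) + Bᵀ λ(t+1) = 0
du_star = -np.linalg.solve(R, B.T @ lam_next)
print("λ(t) =", lam_t, " Δu*(t) =", du_star)
```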

20 Action Network On-line Training
Train the action network, at time t, holding the critic parameters fixed [Balakrishnan and Biega, 1996].
[Diagram: Δx_a(t) and the scheduling vector a enter the action network NN_A (composed of NN_B and NN_I, producing u_B(t) and u_I(t)); the discrete-time aircraft model propagates the state to Δx_a(t+1); the utility and the critic NN_C evaluated at (t+1) provide the training signal.]
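A minimal sketch of this step with a linear stand-in for the action network (Δu = -K Δx_a) and a frozen quadratic critic: the action parameters are nudged toward the control-optimality condition while the critic is held fixed. All matrices below are placeholders, not the aircraft networks.

```python
import numpy as np

A = np.array([[0.95, 0.10], [0.00, 0.90]])     # discrete-time model (placeholder)
B = np.array([[0.00], [0.10]])
Q, R = np.eye(2), np.array([[1.0]])
P_frozen = np.diag([3.0, 2.0])                 # frozen critic: λ = P Δx_a

K = np.zeros((1, 2))                           # trainable action parameters, Δu = -K Δx_a
dxa = np.array([0.5, -0.3])                    # current augmented state at time t
for _ in range(200):
    du = -K @ dxa                              # action network output at time t
    dxa_next = A @ dxa + B @ du                # aircraft model, one step ahead
    lam_next = P_frozen @ dxa_next             # critic evaluated at (t+1), held fixed
    grad_u = R @ du + B.T @ lam_next           # ∂(utility + cost-to-go)/∂Δu
    K += 0.5 * np.outer(grad_u, dxa)           # gradient step on the parameters of Δu = -K Δx_a
print("K after on-line update:", K)
```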

21 Critic Network On-line Training
Train the critic network, at time t, holding the action parameters fixed [Balakrishnan and Biega, 1996].
[Diagram: Δx_a(t) and a enter NN_A and NN_C; the discrete-time aircraft model propagates the state to Δx_a(t+1), where NN_C is evaluated again; the utility and the critic output at (t+1) combine to form the training target for the critic output at time t.]
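And the complementary step, with a frozen linear policy and a trainable linear critic λ = P̂ Δx_a: the critic output at time t is regressed toward the dynamic-programming target formed from the utility derivatives and the critic's own estimate at (t+1). Again, every matrix below is a placeholder.

```python
import numpy as np

A = np.array([[0.95, 0.10], [0.00, 0.90]])      # discrete-time model (placeholder)
B = np.array([[0.00], [0.10]])
Q, R = np.eye(2), np.array([[1.0]])
K_frozen = np.array([[0.5, 0.3]])               # fixed action network, Δu = -K Δx_a

P_hat = np.zeros((2, 2))                        # trainable critic parameters, λ = P_hat Δx_a
rng = np.random.default_rng(2)
for _ in range(5000):
    dxa = rng.normal(scale=0.5, size=2)         # sampled augmented state at time t
    du = -K_frozen @ dxa
    dxa_next = A @ dxa + B @ du                 # aircraft model, one step ahead
    lam_next = P_hat @ dxa_next                 # critic's own estimate at (t+1)
    # DHP target: ∂L/∂Δx_a + (∂Δu/∂Δx_a)ᵀ ∂L/∂Δu + (∂Δx_a(t+1)/∂Δx_a)ᵀ λ(t+1)
    target = Q @ dxa + K_frozen.T @ (R @ (K_frozen @ dxa)) + (A - B @ K_frozen).T @ lam_next
    err = P_hat @ dxa - target                  # critic output error at time t
    P_hat -= 5e-2 * np.outer(err, dxa)          # gradient step on the critic parameters
print("learned critic P_hat:\n", P_hat)
```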

22 Summary
Aircraft optimal control problem
Linear multivariable control structure
Corresponding nonlinear controller
Adaptive critic architecture: action and critic networks
  Algebraic pre-training based on a priori knowledge
  On-line training during simulations (severe conditions)

23 Conclusions and Future Work
Nonlinear, adaptive, real-time controller:
  Maintain stability and robustness throughout the flight envelope
  Improve aircraft control performance under extreme conditions
Systematic approach for designing nonlinear control systems motivated by well-known linear control structures
Innovative neural network training techniques
Future work:
  Adaptive critic architecture implementation
  Testing: acrobatic maneuvers, severe operating conditions, coupling and nonlinear effects

