Colloquium on Optimization for Control
Dynamic Optimization and Automatic Differentiation
Yi Cao
School of Engineering, Cranfield University
Outline
– Dynamic optimization problems
– Parameterization
– Recursive high-order Taylor series
– ODE and sensitivity solver
– Dynamic optimization solver
– Differential recurrent neural network
– Continuous-time NMPC
– Conclusions
Dynamic optimization problem
– Controller design: $\theta$ = constant controller parameter
– Adaptive control: $\theta$ = variable controller parameter
– System identification: $\theta$ = model parameter
– Predictive control: $\theta$ = control action (variable)
– State estimation: $\theta$ = initial state
– ……
Solving dynamic optimization
Optimal control theory was well established in the 1960s.
Challenges for numerical solutions:
– Complex / large-scale problems
– Efficiency for real-time optimization
– Global optimization
– Stability
– Robustness
Differentiation and dynamic optimization
Hamiltonian: $H = \Phi(t,x,\theta) + \lambda^T f(t,x,\theta)$
Optimality conditions:
$dx/dt = H_\lambda = f(t,x,\theta)$
$d\lambda/dt = -H_x = -\Phi_x(t,x,\theta) - f_x(t,x,\theta)\lambda$
$H_\theta = \Phi_\theta(t,x,\theta) + f_\theta(t,x,\theta)\lambda = 0$
Efficient solution requires efficient differentiation.
Differentiation approaches
– Manual analytic differentiation: not a trivial task for large-scale problems
– Analytic differentiation using symbolic computing software: very complicated results even for a small problem
– Numerical finite differences: inefficient and inaccurate
Automatic Differentiation (1)
Techniques that use computer programs to obtain derivatives of functions represented by other computer programs, with the same accuracy and efficiency as the function itself.
Synonym: Algorithmic Differentiation
First proposed by Johannes Joos in his 1976 PhD thesis, ETH Zurich
Automatic Differentiation (2)
Fact: every computer program, no matter how complicated, executes a sequence of elementary arithmetic operations (such as addition) and elementary functions (such as exp()).
By applying the chain rule of differential calculus repeatedly to these operations, derivatives of arbitrary order can be computed automatically and are accurate to working precision.
Automatic Differentiation (3)
Two modes to calculate derivatives:
– Forward mode: y=f(x(t))
– Reverse mode (adjoint): y=f(x(t))
Automatic Differentiation (4)
Two ways to implement AD:
– Operator overloading: each elementary operation is replaced by a new one that works on pairs of a value and its derivative (a doublet).
– Source transformation: new code that calculates the derivative is generated from the original code of the function.
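AD Example: Forward Mode
$y = \sin(x_1/x_2) + x_1/x_2 - \exp(x_2)$, evaluated at $x_1 = 1.5$, $x_2 = 0.5$
Function evaluation:
$v_1 = x_1/x_2 = 3$
$v_2 = \sin(v_1) = 0.141$
$v_3 = \exp(x_2) = 1.649$
$v_4 = v_1 - v_3 = 1.351$
$y = v_2 + v_4 = 1.492$
Derivative propagation (seed $x_1' = 1$, $x_2' = 0$, giving $\partial y/\partial x_1$):
$v_1' = (x_1' x_2 - x_1 x_2')/x_2^2 = 2$
$v_2' = \cos(v_1)\, v_1' = -1.98$
$v_3' = \exp(x_2)\, x_2' = 0$
$v_4' = v_1' - v_3' = 2$
$y' = v_2' + v_4' = 0.02$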
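The forward sweep for this function can be reproduced with operator overloading on value/derivative pairs. The following is a minimal sketch, not the tool used in the talk: the class name `Dual` and the helpers `sin`, `exp` and `f` are hypothetical names introduced for illustration, and only the operations this one function needs are overloaded.

```python
import math

class Dual:
    """A value paired with its derivative (a 'doublet')."""
    def __init__(self, val, dot=0.0):
        self.val = val
        self.dot = dot

    def __add__(self, other):
        return Dual(self.val + other.val, self.dot + other.dot)

    def __sub__(self, other):
        return Dual(self.val - other.val, self.dot - other.dot)

    def __truediv__(self, other):
        # quotient rule: (u/v)' = (u'v - uv') / v^2
        return Dual(self.val / other.val,
                    (self.dot * other.val - self.val * other.dot) / other.val ** 2)

def sin(a):
    return Dual(math.sin(a.val), math.cos(a.val) * a.dot)

def exp(a):
    e = math.exp(a.val)
    return Dual(e, e * a.dot)

def f(x1, x2):
    # y = sin(x1/x2) + x1/x2 - exp(x2), written exactly as for plain numbers
    v1 = x1 / x2
    return sin(v1) + v1 - exp(x2)

# seed x1' = 1, x2' = 0 to obtain dy/dx1 alongside y
y = f(Dual(1.5, 1.0), Dual(0.5, 0.0))
print(round(y.val, 4), round(y.dot, 4))
```

With this seed the result carries y ≈ 1.492 and ∂y/∂x1 ≈ 0.02. One forward sweep yields one directional derivative, so obtaining ∂y/∂x2 requires a second call seeded with x1' = 0, x2' = 1.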
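AD Example: Reverse Mode
$y = \sin(x_1/x_2) + x_1/x_2 - \exp(x_2)$, evaluated at $x_1 = 1.5$, $x_2 = 0.5$
Forward sweep (function evaluation):
$v_1 = x_1/x_2 = 3$
$v_2 = \sin(v_1) = 0.141$
$v_3 = \exp(x_2) = 1.649$
$v_4 = v_1 - v_3 = 1.351$
$y = v_2 + v_4 = 1.492$
Reverse sweep (adjoints, from output back to inputs):
$\bar{y} = 1$, $\bar{v}_2 = \bar{y} = 1$, $\bar{v}_4 = \bar{y} = 1$
$\bar{v}_1 = \bar{v}_4 = 1$, $\bar{v}_3 = -\bar{v}_4 = -1$
$\bar{x}_2 = \bar{v}_3 \exp(x_2) = -1.649$
$\bar{v}_1 = \bar{v}_2 \cos(v_1) + \bar{v}_1 = 0.01$
$\bar{x}_1 = \bar{v}_1/x_2 = 0.02$
$\bar{x}_2 = \bar{x}_2 - \bar{v}_1 x_1/x_2^2 = -1.709$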
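The reverse sweep for the same function can be written out by hand as follows. This is a sketch for this one function only (the name `f_and_gradient` is hypothetical); an AD tool would record the forward sweep on a tape and generate the adjoint statements automatically.

```python
import math

def f_and_gradient(x1, x2):
    # forward sweep: evaluate and keep the intermediates
    v1 = x1 / x2
    v2 = math.sin(v1)
    v3 = math.exp(x2)
    v4 = v1 - v3
    y = v2 + v4

    # reverse sweep: adjoints, visited in the opposite order
    y_bar = 1.0
    v2_bar = y_bar                     # from y = v2 + v4
    v4_bar = y_bar
    v1_bar = v4_bar                    # from v4 = v1 - v3
    v3_bar = -v4_bar
    x2_bar = v3_bar * v3               # from v3 = exp(x2)
    v1_bar += v2_bar * math.cos(v1)    # from v2 = sin(v1)
    x1_bar = v1_bar / x2               # from v1 = x1 / x2
    x2_bar -= v1_bar * x1 / x2 ** 2    # d(x1/x2)/dx2 = -x1/x2^2
    return y, x1_bar, x2_bar

y, dy_dx1, dy_dx2 = f_and_gradient(1.5, 0.5)
```

A single reverse sweep returns the whole gradient, here ∂y/∂x1 ≈ 0.02 and ∂y/∂x2 ≈ -1.709, whereas forward mode needs one sweep per input.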
Automatic Taylor expansion
$z(t) = f(x(t))$, with $t$ scalar, $x$ and $z$ vectors
TS: $x(t) = x_0 + x_1 t + x_2 t^2 + \ldots + x_d t^d$
TS: $z(t) = z_0 + z_1 t + z_2 t^2 + \ldots + z_d t^d$
AD forward (TS of $z$): $z_k = z_k(x_0, x_1, \ldots, x_k)$
AD reverse (TS of sensitivity): $\partial z_k/\partial x_j = \partial z_{k-j}/\partial x_0 := A_{k-j}$
$f'_x = A_0 + A_1 t + A_2 t^2 + \ldots + A_d t^d$
Recursively solving ODE using TS
$dx/dt = f(x)$, i.e. $dx/dt = z(t)$
Recursive relation: $x_{k+1} = z_k/(k+1)$
$x_0 = x(t_0)$, $x_1 = z_0(x_0)$, $x_2 = z_1(x_0, x_1)/2$, …
$x(t_0 + h) = \sum_{i=0}^{d} x_i h^i$
Next step: $t_1 = t_0 + h \to t_0$
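The recursion can be illustrated on a scalar test problem. The sketch below assumes the ODE dx/dt = -x² (chosen here only because its exact solution x(t) = x0/(1 + x0·t) is known); the Taylor coefficients z_k of f(x) = -x² are written by hand as a Cauchy product, where an AD tool would generate them for a general f. The function names are hypothetical.

```python
def taylor_step(x0, h, d):
    """One step of dx/dt = -x**2 using a degree-d Taylor series."""
    x = [x0]
    for k in range(d):
        # z_k: k-th Taylor coefficient of z = -x**2 (Cauchy product)
        z_k = -sum(x[i] * x[k - i] for i in range(k + 1))
        x.append(z_k / (k + 1))        # x_{k+1} = z_k / (k+1)
    # x(t0 + h) = sum_i x_i h^i, evaluated with Horner's rule
    result = 0.0
    for c in reversed(x):
        result = result * h + c
    return result

def integrate(x0, t_end, n_steps, d=10):
    h = t_end / n_steps
    x = x0
    for _ in range(n_steps):
        x = taylor_step(x, h, d)       # the end point becomes the next x(t0)
    return x

x_end = integrate(1.0, 1.0, 10)        # exact solution at t = 1 is 0.5
```

Because the truncation error falls roughly like (h/r)^(d+1), with r the radius of convergence, a modest order such as d = 10 already permits fairly large steps on this problem.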
Solving ODE with sensitivity equation
Sensitivity equation: $dx_\theta/dt = f_x(t,x,\theta)\,x_\theta + f_\theta(t,x,\theta)$
$v = \{x_0, \theta\}$
$A_{v,k} := \partial z_k/\partial v$, $B_{v,k} := dx_k/dv$
$B_{v,k+1} = \big(A_{v,k} + \sum_{j=0}^{k} A_{x,k-j} B_{v,j}\big)/(k+1)$
$B_v := dx(t_0+h)/dv = B_{v,0} + B_{v,1} h + B_{v,2} h^2 + \ldots + B_{v,d} h^d$
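A scalar sketch of the sensitivity recursion, assuming the test ODE dx/dt = -θx² and taking v = θ only. For this f the partial derivatives are written by hand: ∂z_k/∂θ = -Σ x_i x_{k-i} and A_{x,m} = -2θ x_m. In the talk these coefficients come from AD. All function names are hypothetical.

```python
def taylor_step_with_sensitivity(x0, b0, theta, h, d):
    """Step dx/dt = -theta*x**2; return x(t0+h) and dx(t0+h)/dtheta.

    b0 is dx(t0)/dtheta carried over from the previous step.
    """
    x = [x0]    # Taylor coefficients x_k
    B = [b0]    # B_k = d x_k / d theta
    for k in range(d):
        conv = sum(x[i] * x[k - i] for i in range(k + 1))
        z_k = -theta * conv
        A_theta_k = -conv                       # partial z_k / partial theta
        # sum_j A_{x,k-j} B_j with A_{x,m} = -2*theta*x_m
        chain = sum(-2.0 * theta * x[k - j] * B[j] for j in range(k + 1))
        x.append(z_k / (k + 1))
        B.append((A_theta_k + chain) / (k + 1))
    x_h = sum(c * h ** i for i, c in enumerate(x))
    B_h = sum(c * h ** i for i, c in enumerate(B))
    return x_h, B_h

def solve(x0, theta, t_end, n_steps, d=10):
    h = t_end / n_steps
    x, B = x0, 0.0                              # x(t0) does not depend on theta
    for _ in range(n_steps):
        x, B = taylor_step_with_sensitivity(x, B, theta, h, d)
    return x, B

x_end, dx_dtheta = solve(1.0, 2.0, 0.5, 10)
```

For this problem the exact values are x(t) = x0/(1 + θ·x0·t) and dx/dθ = -x0²·t/(1 + θ·x0·t)², which gives 0.5 and -0.125 for the call above and can be used as a check.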
Solving dynamic optimization using TS
– Convert cost to terminal cost: $\min_\theta F(t_f, x(t_f), \theta(t_f))$
– Initial guess: $\theta^0 = \{\theta^0(t_0), \ldots, \theta^0(t_f)\}$
– Step: $h_i = t_{i+1} - t_i > 0$, $i = 0, 1, \ldots, n-1$, $t_n = t_f$
– Integrate ODE and sensitivity from $t_0$ to $t_f$ to obtain $x(t_0), x(t_1), \ldots, x(t_f)$ and $F(t_f, x(t_f), \theta^0(t_f))$, together with $B_x(t_1), \ldots, B_x(t_f)$, $B_\theta(t_1), \ldots, B_\theta(t_f)$ and $F_x$, $F_\theta$
– $dF/d\theta^0(t_i) = F_x B_x(t_f) \cdots B_x(t_{i+2}) B_\theta(t_{i+1})$
– For constant $\theta$: $dF/d\theta^0 = \sum_i dF/d\theta^0(t_i)$
– Update: $\theta^{i+1} = \theta^i - \alpha\, dF/d\theta^i$
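The chained gradient and the update step can be sketched on a toy problem. The assumptions here are not from the talk: a scalar system dx/dt = -x + θ_i with piecewise-constant θ, a terminal cost F = (x(t_f) - target)²/2, and the exact step map x_{i+1} = a·x_i + (1 - a)·θ_i with a = exp(-h) standing in for the Taylor-series integrator, so that B_x = a and B_θ = 1 - a. The step size `alpha` and all function names are hypothetical.

```python
import math

def simulate(theta, x0, h):
    """Integrate interval by interval; return x(t_f) and the per-step
    sensitivities B_x(t_{i+1}) and B_theta(t_{i+1})."""
    a = math.exp(-h)
    x, Bx, Bth = x0, [], []
    for th in theta:
        x = a * x + (1.0 - a) * th
        Bx.append(a)
        Bth.append(1.0 - a)
    return x, Bx, Bth

def cost_and_gradient(theta, x0, h, target):
    x_f, Bx, Bth = simulate(theta, x0, h)
    F_x = x_f - target                  # F = (x(t_f) - target)^2 / 2
    grad = [0.0] * len(theta)
    chain = F_x                         # holds F_x B_x(t_f) ... B_x(t_{i+2})
    for i in reversed(range(len(theta))):
        grad[i] = chain * Bth[i]        # ... times B_theta(t_{i+1})
        chain *= Bx[i]
    return 0.5 * F_x ** 2, grad

def optimize(theta, x0, h, target, alpha=1.0, iters=200):
    for _ in range(iters):
        _, g = cost_and_gradient(theta, x0, h, target)
        theta = [t - alpha * gi for t, gi in zip(theta, g)]   # gradient step
    return theta

theta_opt = optimize([0.0] * 4, 0.0, 0.5, 1.0)
```

Accumulating the product of B_x factors backwards from t_f means each interval's gradient costs one multiplication, rather than a fresh product over all later steps.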
Error Control
– $h$ = step, $d$ = order of TS
– $\varepsilon(h,d) = C (h/r)^{d+1}$, with $r \approx r_d = \|x_{d-1}\|/\|x_d\|$ for large $d$
– $\varepsilon(h,d-1) \approx \varepsilon(h,d)(r_d/h) \le \varepsilon(h,d) + \|x_d\|$
– $\varepsilon(h,d) \le h\|x_d\|^2 / (\|x_{d-1}\| - h\|x_d\|)$
– Given tolerance and $d$, determine $h$
– Given tolerance, determine optimal $d$, $h$ to minimize computation
– Global error: with $\gamma \ge \|B_x\|_i$ and $\varepsilon_g$ the global tolerance, the local tolerance is $\varepsilon = \varepsilon_g (\gamma - 1)/(\gamma^n - 1)$
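One way to arrive at the local tolerance is a geometric-sum argument. The sketch below assumes that each of the n steps commits a local error of at most ε, that every later step amplifies an existing error by at most a factor γ (a bound on the norm of the state sensitivity B_x), and that ε_g denotes the global tolerance; the symbols ε, ε_g and γ are names chosen here for those quantities.

```latex
\text{global error} \;\le\; \varepsilon \sum_{i=0}^{n-1} \gamma^{i}
  \;=\; \varepsilon\,\frac{\gamma^{n}-1}{\gamma-1} \;=\; \varepsilon_g
\quad\Longrightarrow\quad
\varepsilon \;=\; \varepsilon_g\,\frac{\gamma-1}{\gamma^{n}-1}
```

In the limit γ → 1 this reduces to ε = ε_g/n, i.e. the global tolerance is shared equally among the steps.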
NMPC using differential recurrent NN
– Continuous-time nonlinear identification
– Efficient algorithm to train DRNN
– Training performance and efficiency
– DRNN as internal model for NMPC
– Efficient algorithm for NMPC
– NMPC performance and efficiency
Differential recurrent neural networks
DRNN Training
$\theta = \{b_1, b_2, \mathrm{vec}(W_2), \mathrm{vec}(W_{1x}), \mathrm{vec}(W_{1u})\}$
$N_\theta = N_h + N_x + N_h \times N_u + 2 N_h \times N_x$
– Training data: $u(t)$, $y(t)$ at $t = 0, h, \ldots, Nh$
– Solve DRNN + sensitivity using TS
– Assume $t = 0$ is steady state; apply $u(t)$ to obtain $\hat{y}(t)$ at $t = h, \ldots, Nh$, with $e_k = \hat{y}(kh) - y(kh)$
– $\min_\theta \sum_k e_k^T e_k/2 = E^T E/2$, $E = \mathrm{vec}(e_1, \ldots, e_N)$
– Nonlinear least squares (NLSQ) optimization
Continuous time NMPC
$\min_u \phi = \frac{1}{2}\int_0^T \big[(y - y_r)^T Q (y - y_r) + (u - u_r)^T R (u - u_r)\big]\,dt$
s.t. $dx/dt = f(x,u)$, $y = g(x)$, $u_{\min} \le u \le u_{\max}$, $0 = t_0 \le \ldots \le t_P = T$
Parameterize $u$ as a piecewise polynomial:
$u(t) = \sum_{i=0}^{q} u_{ki} (t - t_k)^i$, $t_k \le t \le t_{k+1}$, $k = 0, \ldots, P-1$
$U = [u_{00}^T, \ldots, u_{0q}^T, \ldots, u_{P-1,0}^T, \ldots, u_{P-1,q}^T]^T$
$Y = [y_{00}^T, \ldots, y_{0d}^T, \ldots, y_{P-1,0}^T, \ldots, y_{P-1,d}^T]^T$
$\phi = T (Y_e^T H_Q Y_e + U_e^T H_R U_e)/2$
$\phi = E^T E/2$, an NLSQ problem
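The piecewise-polynomial parameterization of u can be evaluated as in the sketch below, which assumes a scalar input and stores the coefficients u_ki as one list per interval. The function name `control` and the argument layout are hypothetical.

```python
from bisect import bisect_right

def control(t, knots, coeffs):
    """Evaluate u(t) = sum_i coeffs[k][i] * (t - t_k)**i for t_k <= t <= t_{k+1}.

    knots  : [t_0, ..., t_P]
    coeffs : P lists, each holding u_k0, ..., u_kq
    """
    # interval index; the last interval also covers t = t_P
    k = min(bisect_right(knots, t) - 1, len(coeffs) - 1)
    k = max(k, 0)
    tau = t - knots[k]
    u = 0.0
    for c in reversed(coeffs[k]):      # Horner's rule in (t - t_k)
        u = u * tau + c
    return u

knots = [0.0, 1.0, 2.0]
coeffs = [[1.0, 2.0], [0.5, -1.0]]     # P = 2 intervals, q = 1
u_mid = control(0.5, knots, coeffs)    # 1.0 + 2.0 * 0.5
```

Expanding about the left knot t_k means the coefficients u_ki are directly the Taylor coefficients of u on that interval, which is the form a Taylor-series integrator consumes.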
NMPC algorithm (with DRNN)
– $x_0$ (steady-state or estimated), $y_m$
– $d = y_m - C x_0$ (constant for $0 \le t \le T$)
– $U_0 \to U_e = U_0 - U_r \to X \to Y \to Y_e = Y - Y_r + d$
– Jacobian: $J = \partial E/\partial U$
– Unconstrained: $U_{k+1} = U_k - (J^T J)^{-1} J^T E$
– Input constrained: lsqnonlin
– Other constraints: fmincon/SQP
– Apply $[u_{00}^T, \ldots, u_{0q}^T]^T$, then repeat
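The unconstrained update is a Gauss-Newton iteration. The sketch below applies it to a small stand-in problem rather than to an NMPC model: the residuals E_i = a·exp(-b·t_i) - y_i with U = (a, b), the synthetic data and the starting point are all assumptions made for illustration, and the 2×2 normal equations are solved in closed form.

```python
import math

def residuals_and_jacobian(U, t, y):
    """E_i = a*exp(-b*t_i) - y_i and its Jacobian with respect to U = (a, b)."""
    a, b = U
    E, J = [], []
    for ti, yi in zip(t, y):
        e = math.exp(-b * ti)
        E.append(a * e - yi)
        J.append((e, -a * ti * e))     # (dE_i/da, dE_i/db)
    return E, J

def gauss_newton_step(U, t, y):
    E, J = residuals_and_jacobian(U, t, y)
    # entries of J^T J and J^T E
    s11 = sum(r[0] * r[0] for r in J)
    s12 = sum(r[0] * r[1] for r in J)
    s22 = sum(r[1] * r[1] for r in J)
    g1 = sum(r[0] * e for r, e in zip(J, E))
    g2 = sum(r[1] * e for r, e in zip(J, E))
    # solve (J^T J) delta = J^T E for the 2x2 case
    det = s11 * s22 - s12 * s12
    d1 = (s22 * g1 - s12 * g2) / det
    d2 = (s11 * g2 - s12 * g1) / det
    return (U[0] - d1, U[1] - d2)      # U_{k+1} = U_k - (J^T J)^{-1} J^T E

t = [0.0, 1.0, 2.0, 3.0]
y = [2.0 * math.exp(-0.5 * ti) for ti in t]   # synthetic data from a = 2, b = 0.5
U = (1.5, 0.4)
for _ in range(10):
    U = gauss_newton_step(U, t, y)
```

The step uses only first derivatives of the residuals, which is why an inexpensive Jacobian J is the key ingredient. Bounds on U are not handled by this update, which is where a constrained solver such as lsqnonlin comes in.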
Tank Reactor Example
– Two CSTRs in series, reaction $A + B \to C$
– Two outputs: $T_1$ and $T_2$; two inputs: $Q_{cw1}$ and $Q_{cw2}$
– Disturbance: $T_{cw}$
– 6 states
DRNN identification
– $N_h = 6$, $N_x = 6$, $N_u = 2$, $N_\theta = 96$
– 600 s of data with $h = 0.1$ s, $N = 6000$
– Total sensitivity
– Validation set sampled at 0.02 s
– Advantage: changing the sampling rate does not require re-training
Training and Validating
Training efficiency (one epoch)
[Table: ODE23 (tol; t, ms; error) versus AD (d; t, ms; error); numeric entries missing]
NMPC performance, setpoint change
NMPC performance, disturbance rejection
Conclusions
– High-order TS using AD
– Efficiently solving ODE + sensitivity
– Efficient algorithm for general dynamic optimization
– Efficient algorithm for DRNN training
– Efficient algorithm for continuous-time NMPC
– Demonstrated with 2-CSTR example