Presentation on theme: "1 Some System Identification Challenges and Approaches Brett Ninness School of Electrical Engineering & Computer Science The University of Newcastle, Australia."— Presentation transcript:
1 Some System Identification Challenges and Approaches Brett Ninness School of Electrical Engineering & Computer Science The University of Newcastle, Australia “Many basic scientific problems are now routinely solved by simulation: a fancy random walk is performed on the system of interest. Averages computed from the walk give useful answers to formerly intractable problems” Persi Diaconis, 2008
2 System Identification - a rich history
‣ 1700s: Bernoulli, Euler, Lagrange - probability concepts
‣ 1763: Bayes - conditional probability
‣ 1795: Gauss, Legendre - least squares
‣ […]: Gauss, Legendre, Cauchy - probability distributions
‣ 1879: Stokes - periodogram of time series
‣ 1890: Galton, Pearson - regression and correlation
‣ 1922: Fisher - Maximum Likelihood (ML)
‣ 1921: Yule - AR and MA time series
‣ 1933: Kolmogorov - axiomatic probability theory
‣ 1930s: Khinchin, Kolmogorov, Cramér - stationary processes
3 System Identification - a rich history
‣ […]: Wiener, Kolmogorov - prediction theory
‣ 1960: Kalman - Kalman Filter
‣ 1965: Kalman & Ho - realisation theory
‣ 1965: Åström & Bohlin - ML methods for dynamic systems
‣ 1970: Box & Jenkins - a unified and complete presentation
‣ 1970s: Experiment design, PE formulation with underpinning theory, analysis of recursive methods
‣ 1980s: Bias & variance quantification, tradeoff and design
‣ 1990s: Subspace methods, control relevant identification, robust estimation methods
4 Recent & Current Activity
5 This talk
6 Acknowledgements Results here rest heavily on the work of colleagues: ‣ Dr. Adrian Wills (Newcastle University) ‣ Dr. Thomas Schön (Linköping University) ‣ Dr. Stuart Gibson (Nomura Bank) ‣ Soren Henriksen (Newcastle University) and on learning from experts: ‣ Håkan Hjalmarsson, Tomas McKelvey, Fredrik Gustafsson, Michel Gevers, Graham Goodwin.
7 Challenge 1 - General Nonlinear ID Effective solutions available for specific nonlinear structures ‣ NARX, Hammerstein-Wiener, Bilinear..... Extension to more general forms? Example:
8 Challenge 1 - General Nonlinear ID Obstacle 1: How do we compute a cost function? Prediction error (PE) cost: Maximum Likelihood (ML) cost:
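For reference, the usual textbook forms of these two costs can be written as below. The notation is an assumption, not taken from the slide: y_t is the output at time t, Y_t = {y_1, ..., y_t}, θ is the parameter vector and N the number of data points.

```latex
% Prediction error (PE) cost, built on the conditional-mean predictor
V_N(\theta) = \frac{1}{N}\sum_{t=1}^{N}
    \bigl\| y_t - \hat{y}_{t|t-1}(\theta) \bigr\|^2 ,
\qquad
\hat{y}_{t|t-1}(\theta) = \mathbf{E}_\theta\{ y_t \mid Y_{t-1} \}
    = \int y_t \, p_\theta(y_t \mid Y_{t-1}) \, \mathrm{d}y_t

% Maximum Likelihood (ML) cost, via the chain rule for densities
L_N(\theta) = \log p_\theta(Y_N)
    = \sum_{t=1}^{N} \log p_\theta(y_t \mid Y_{t-1}),
\qquad Y_0 = \emptyset
```

In this form both costs depend on the data only through the predictive density p_θ(y_t | Y_{t-1}), which is why computing that density is the obstacle.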
9 Computing Turn to general measurement and time update equations: Time Update: Measurement Update: Problem - closed form solutions only for special cases: ‣ Linear, Gaussian (Kalman Filter), discrete state HMM More generally: ‣ Need to compute solution numerically ‣ Multi-dimensional integrals the main challenge
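The standard Bayesian filtering recursions that these labels usually refer to are sketched below. The state x_t, the densities p(x_{t+1} | x_t) and p(y_t | x_t), and the ordering of the steps are assumed notation for a generic state-space model.

```latex
% Predictive density of the next output
p(y_t \mid Y_{t-1}) = \int p(y_t \mid x_t)\, p(x_t \mid Y_{t-1}) \,\mathrm{d}x_t

% Measurement update (Bayes' rule)
p(x_t \mid Y_t) = \frac{p(y_t \mid x_t)\, p(x_t \mid Y_{t-1})}{p(y_t \mid Y_{t-1})}

% Time update (Chapman-Kolmogorov)
p(x_{t+1} \mid Y_t) = \int p(x_{t+1} \mid x_t)\, p(x_t \mid Y_t) \,\mathrm{d}x_t
```

Each integral is over the full state dimension, which is where the multi-dimensional integration burden comes from.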
11 SEQUENTIAL IMPORTANCE RESAMPLING SIR - More commonly known as “particle filtering” Key idea - use the strong law of large numbers (SLLN) ‣ Suppose a vector random number generator gives realisations from a given target density ‣ Then by the SLLN, with probability one: ‣ Suggests approximate quantification How to build the necessary random number generator?
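A minimal sketch of a bootstrap particle filter (weight, resample, propagate), using only the Python standard library. The scalar model, the functions `f` and `h`, the noise variances `q` and `r`, and all function names are hypothetical choices for illustration; they are not the example used in the talk.

```python
import math
import random


def f(x):
    """Hypothetical nonlinear state transition (an assumption)."""
    return 0.5 * x + 2.0 * x / (1.0 + x * x)


def h(x):
    """Hypothetical measurement function (an assumption)."""
    return x


def simulate(n, q, r, rng):
    """Simulate x[t+1] = f(x[t]) + w[t], y[t] = h(x[t]) + e[t]."""
    x = rng.gauss(0.0, 1.0)
    xs, ys = [], []
    for _ in range(n):
        xs.append(x)
        ys.append(h(x) + rng.gauss(0.0, math.sqrt(r)))
        x = f(x) + rng.gauss(0.0, math.sqrt(q))
    return xs, ys


def particle_filter(ys, m, q, r, rng):
    """Return filtered state means and a log-likelihood estimate."""
    particles = [rng.gauss(0.0, 1.0) for _ in range(m)]  # draws from the prior
    means, loglik = [], 0.0
    for y in ys:
        # Measurement update: weight each particle by p(y | x), up to a constant.
        w = [math.exp(-0.5 * (y - h(p)) ** 2 / r) for p in particles]
        total = sum(w)
        # Sample average of the weights estimates p(y_t | Y_{t-1}).
        loglik += math.log(total / m) - 0.5 * math.log(2.0 * math.pi * r)
        w = [wi / total for wi in w]
        means.append(sum(wi * p for wi, p in zip(w, particles)))
        # Resampling: draw m particles with probability given by the weights.
        particles = rng.choices(particles, weights=w, k=m)
        # Time update: push each particle through the state equation.
        particles = [f(p) + rng.gauss(0.0, math.sqrt(q)) for p in particles]
    return means, loglik


if __name__ == "__main__":
    rng = random.Random(0)
    xs, ys = simulate(100, 0.1, 1.0, rng)
    means, loglik = particle_filter(ys, 200, 0.1, 1.0, rng)
    print("log-likelihood estimate:", loglik)
```

The weighted average in `means` is the sample-average approximation that the SLLN argument justifies; the running `loglik` is the quantity an ML cost would need.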
14 History ‣ Handschin & Mayne, Int’l J. Control, 1969 ‣ Resampling Approach: Gordon, Salmond & Smith, IEE Proc. Radar & Signal Processing, (1136 citations) ‣ Now widely used in signal processing, target tracking, computer vision, econometrics, robotics and statistics, control.... ‣ Some applications in system identification. Bulk of work has involved considering parameters as state variables.
15 Back to Nonlinear System Identification Prediction error cost: General(ish) model structure Max. Likelihood cost:
16 Nonlinear System Identification Obstacle 2: How do we compute an estimate? Gradient based search is standard practice: How to compute the necessary gradients? Strategies: ‣ Differencing to compute derivatives? ‣ Direct search methods: Nelder-Mead, simulated annealing?
17 Expectation-Maximisation (EM) ALG.
18 Expectation-Maximisation (EM) ALG. Example - linear system: Estimate by regression? Need state - use estimate? E.g. Kalman smoother. Suggests iteration: ‣ Use estimates of A,B,C,D to estimate state; ‣ Use estimates of state to estimate A,B,C,D; ‣ Return and do again.
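The iteration above can be sketched for the simplest possible case. This assumes a hypothetical scalar model x[t+1] = a x[t] + w[t], y[t] = x[t] + e[t], with known noise variances `q` and `r`, a known prior variance `p0`, and only `a` unknown. A Kalman (Rauch-Tung-Striebel) smoother estimates the state, and a regression on the smoothed moments re-estimates `a`. All names are illustrative.

```python
import math
import random


def simulate(a, q, r, n, rng):
    """Simulate the assumed scalar linear model; returns the outputs only."""
    x = rng.gauss(0.0, 1.0)
    ys = []
    for _ in range(n):
        ys.append(x + rng.gauss(0.0, math.sqrt(r)))
        x = a * x + rng.gauss(0.0, math.sqrt(q))
    return ys


def kalman_smoother(ys, a, q, r, p0=1.0):
    """Smoothed means, variances, lag-one covariances and log-likelihood."""
    n = len(ys)
    xp, pp = [0.0] * n, [0.0] * n  # predicted mean / variance
    xf, pf = [0.0] * n, [0.0] * n  # filtered mean / variance
    loglik = 0.0
    x_pred, p_pred = 0.0, p0
    for t in range(n):
        xp[t], pp[t] = x_pred, p_pred
        s = p_pred + r                      # innovation variance
        e = ys[t] - x_pred                  # innovation
        loglik += -0.5 * (math.log(2.0 * math.pi * s) + e * e / s)
        k = p_pred / s                      # Kalman gain
        xf[t] = x_pred + k * e
        pf[t] = (1.0 - k) * p_pred
        x_pred = a * xf[t]
        p_pred = a * a * pf[t] + q
    xs, ps = xf[:], pf[:]
    cross = [0.0] * (n - 1)                 # Cov(x[t+1], x[t] | all data)
    for t in range(n - 2, -1, -1):
        j = pf[t] * a / pp[t + 1]           # smoother gain
        xs[t] = xf[t] + j * (xs[t + 1] - xp[t + 1])
        ps[t] = pf[t] + j * j * (ps[t + 1] - pp[t + 1])
        cross[t] = j * ps[t + 1]
    return xs, ps, cross, loglik


def em_estimate_a(ys, a0, q, r, iters=30):
    """Alternate state smoothing (E step) and regression for a (M step)."""
    a, logliks = a0, []
    for _ in range(iters):
        xs, ps, cross, ll = kalman_smoother(ys, a, q, r)
        logliks.append(ll)                  # likelihood at the current a
        num = sum(xs[t + 1] * xs[t] + cross[t] for t in range(len(ys) - 1))
        den = sum(xs[t] ** 2 + ps[t] for t in range(len(ys) - 1))
        a = num / den
    return a, logliks
```

The regression uses the smoothed second moments E{x[t+1] x[t] | Y} and E{x[t]^2 | Y} rather than products of point estimates. That is the difference between EM and naive "estimate the state, then regress", and it is what makes the recorded log-likelihood non-decreasing across iterations.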
19 Expectation-Maximisation (EM) ALG. Key idea - “complete” and “incomplete” data ‣ Actual (incomplete) observations: ‣ “Wished for” (missing) observations: ‣ Form estimate of “wished for” likelihood: E Step: Calculate M Step: Compute
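In the usual EM notation the two steps read as follows. The symbols are assumptions: Y for the actual observations, X for the wished-for data, θ_k for the current iterate and Q for the averaged complete-data log-likelihood.

```latex
% E step: average the complete-data log-likelihood over the missing data,
% using the current parameter estimate \theta_k
Q(\theta, \theta_k) = \mathbf{E}_{\theta_k}\bigl\{ \log p_\theta(X, Y) \mid Y \bigr\}
    = \int \log p_\theta(X, Y)\; p_{\theta_k}(X \mid Y)\, \mathrm{d}X

% M step: maximise that average over \theta
\theta_{k+1} = \arg\max_{\theta} \; Q(\theta, \theta_k)
```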
20 KEY EM Algorithm Property Bayes’ rule: Take conditional expectation of both sides: Increasing implies increased likelihood:
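The three steps named on this slide can be written out as below. This is the standard argument in assumed notation: Y observed, X missing, L(θ) = log p_θ(Y), and Q, V defined as conditional expectations under θ_k.

```latex
% Bayes' rule, in log form
\log p_\theta(Y) = \log p_\theta(X, Y) - \log p_\theta(X \mid Y)

% Conditional expectation of both sides given Y, under \theta_k
% (the left side does not depend on X, so it is unchanged)
L(\theta) = \underbrace{\mathbf{E}_{\theta_k}\{\log p_\theta(X, Y) \mid Y\}}_{Q(\theta,\theta_k)}
          - \underbrace{\mathbf{E}_{\theta_k}\{\log p_\theta(X \mid Y) \mid Y\}}_{V(\theta,\theta_k)}

% Difference between a candidate \theta and the current iterate \theta_k
L(\theta) - L(\theta_k)
  = \bigl[ Q(\theta,\theta_k) - Q(\theta_k,\theta_k) \bigr]
  + \bigl[ V(\theta_k,\theta_k) - V(\theta,\theta_k) \bigr]

% The second bracket is a Kullback-Leibler divergence, hence non-negative
V(\theta_k,\theta_k) - V(\theta,\theta_k)
  = \int \log\frac{p_{\theta_k}(X \mid Y)}{p_\theta(X \mid Y)}\; p_{\theta_k}(X \mid Y)\,\mathrm{d}X \;\ge\; 0

% Therefore
Q(\theta,\theta_k) \ge Q(\theta_k,\theta_k) \;\Longrightarrow\; L(\theta) \ge L(\theta_k)
```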
21 Expectation-Maximisation (EM) ALG.
22 Expectation-Maximisation (EM) ALG.
23 Expectation-Maximisation (EM) ALG. History ‣ Generally attributed to Baum: Ann. Math. Stat. 1970; ‣ Generalised by Dempster et al: JRSS B, 1977 (9858 cites) ‣ Widely used in image processing, statistics, radar...
24 Nonlinear system estimation Example: N=100 data points, M=100 particles, 100 experiments
25 Evolution of Look at b parameter only - others fixed at true values:
26 Gradient Based Search Revisited Fisher’s Identity
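Fisher's identity, in the same assumed notation as the usual EM derivation (L(θ) = log p_θ(Y), Q(θ, θ_k) the E-step quantity), is commonly stated as:

```latex
\frac{\partial}{\partial\theta} L(\theta) \Big|_{\theta=\theta_k}
  = \frac{\partial}{\partial\theta} Q(\theta, \theta_k) \Big|_{\theta=\theta_k}

% Reason: L(\theta) = Q(\theta,\theta_k) - V(\theta,\theta_k), and
% V(\theta,\theta_k) is maximised over \theta at \theta = \theta_k,
% so its gradient vanishes there.
```

So any method that can compute Q, for example a particle smoother, also yields the gradient needed for gradient based search.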
27 EM vs Gradient search iterates
28 Challenge 2: Application Relevant ID “Traditional” practice - note asymptotic results Quality of an estimate must be quantified for it to be useful Assume convergence effectively occurred for finite N
29 Assessment & Design Often, a function of the parameters is of more interest Again - “classical” approach - use linear approximation: Couple with approximate Gaussianity of
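The classical linear approximation argument is usually written as below. The symbols are assumptions: f is the function of interest, θ_0 the true parameter, θ̂_N the estimate from N data points, and P its asymptotic covariance.

```latex
% First-order Taylor expansion about the true parameter
f(\hat\theta_N) \approx f(\theta_0)
   + \nabla f(\theta_0)^{T} (\hat\theta_N - \theta_0)

% Combined with asymptotic Gaussianity of the estimate
\sqrt{N}\,(\hat\theta_N - \theta_0) \xrightarrow{d} \mathcal{N}(0, P)
\;\Longrightarrow\;
\sqrt{N}\,\bigl(f(\hat\theta_N) - f(\theta_0)\bigr) \xrightarrow{d}
   \mathcal{N}\bigl(0,\; \nabla f(\theta_0)^{T} P\, \nabla f(\theta_0)\bigr)
```

Both steps are approximations for finite N, and the linearisation can be poor when f is strongly nonlinear over the region where θ̂_N is likely to fall.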
30 One perspective Need to combine prior knowledge, assumptions and data : Measure of the evidence supporting an underlying system property - parameter value, frequency response, achieved gain/phase margin......
32 Using Posteriors Marginal on i’th parameter: Other measures? Now the difficulty - using the posterior ‣ Evaluation on -dim. grid, evaluations of model order ‣ Simpson’s rule - evaluation error:
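A rough sketch of why grid evaluation scales badly, in assumed notation: n parameters, m grid points per axis, K total posterior evaluations.

```latex
% Marginal posterior of the i'th parameter: integrate out all the others
p(\theta_i \mid Y) = \int p(\theta \mid Y)\,
    \mathrm{d}\theta_1 \cdots \mathrm{d}\theta_{i-1}\,
    \mathrm{d}\theta_{i+1} \cdots \mathrm{d}\theta_n

% Grid cost: m points per axis in n dimensions
K = m^{\,n}

% Composite Simpson's rule has error O(h^4) = O(m^{-4}) per axis,
% which in terms of the total number of evaluations is
\text{error} = O\bigl(m^{-4}\bigr) = O\bigl(K^{-4/n}\bigr)
```

For fixed accuracy the number of evaluations grows exponentially with the number of parameters, whereas the error of a sample average is O(K^{-1/2}) regardless of n.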
33 Markov Chain Monte Carlo (MCMC)
34 A randomised approach Use the Strong Law of Large Numbers (SLLN) again. ‣ Build a (vector) random number generator giving realisations: ‣ Then by the SLLN, with probability one: ‣ Suggests the approximation: One view - numerical integration with intelligently chosen grid points.
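35 The Metropolis Algorithm The required vector random number generator: ‣ 1. Initialise: Choose and set
Z.y=y; Z.u=u;                 % pack the measured output y and input u into Z
M.A=4;
g1=est(Z,M); theta=g1.theta;  % start the chain at the estimate returned by est
36 The Metropolis Algorithm ‣ 2. Draw a proposal value
xi = theta + 0.1*randn(size(theta));  % Gaussian random-walk step from the current theta
g2 = theta2m(xi,g1);
38 The Metropolis Algorithm ‣ 4. Set with probability
if (rand <= alpha) theta=xi; end;     % accept xi with probability alpha, else keep theta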
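The steps can be put together in a short self-contained sketch. It assumes a hypothetical first-order model y[t] = a y[t-1] + e[t] with known noise variance `sigma2` and a flat prior on `a`, so the log posterior equals the log likelihood up to a constant. The acceptance probability is taken as alpha = min(1, p(xi | Y) / p(theta | Y)), the standard choice for a symmetric proposal. Function names are illustrative and this does not reproduce the toolbox calls above.

```python
import math
import random


def simulate_ar1(a, sigma2, n, rng):
    """Simulate the assumed model y[t] = a*y[t-1] + e[t]."""
    ys = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        ys.append(a * ys[-1] + rng.gauss(0.0, math.sqrt(sigma2)))
    return ys


def log_posterior(a, ys, sigma2):
    """Log posterior up to a constant (flat prior, Gaussian noise)."""
    sse = sum((ys[t] - a * ys[t - 1]) ** 2 for t in range(1, len(ys)))
    return -0.5 * sse / sigma2


def metropolis(ys, sigma2, n_iter, step, rng, theta0=0.0):
    """Random-walk Metropolis chain targeting p(a | Y)."""
    theta = theta0                                   # 1. initialise
    lp = log_posterior(theta, ys, sigma2)
    chain = []
    for _ in range(n_iter):
        xi = theta + step * rng.gauss(0.0, 1.0)      # 2. draw a proposal
        lp_xi = log_posterior(xi, ys, sigma2)
        log_alpha = min(0.0, lp_xi - lp)             # 3. acceptance probability
        # 4. accept with probability alpha; 1 - random() lies in (0, 1]
        if math.log(1.0 - rng.random()) <= log_alpha:
            theta, lp = xi, lp_xi
        chain.append(theta)
    return chain


if __name__ == "__main__":
    rng = random.Random(0)
    ys = simulate_ar1(0.5, 1.0, 20, rng)
    chain = metropolis(ys, 1.0, 5000, 0.1, rng)
    kept = chain[500:]                               # discard burn-in
    print("posterior mean of a:", sum(kept) / len(kept))
```

Working with log densities avoids underflow when the likelihood is a product of many small terms. A histogram of `kept` approximates the marginal posterior, and averages over it approximate posterior expectations.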
39 “Markov Chain Monte Carlo” History Origins: Metropolis, Rosenbluth, Rosenbluth, Teller & Teller, Journal of Chemical Physics, (11,564 ISI citations) Widespread use: ‣ Listed #1 in “Great Algorithms of Scientific Computing”, Dongarra & Sullivan, Comp. & Sci in Eng ‣ “The Markov Chain Monte Carlo Revolution”, Diaconis, Bull. American Mathematical Society: “Many basic scientific problems are now routinely solved by simulation: a fancy random walk is performed on the system of interest. Averages computed from the walk give useful answers to formerly intractable problems” ‣ Widely used in chemistry, physics, statistics... Emerging uses in biology, telecommunications.
40 Example Simple first order situation: N=20 data samples available: realisations Metropolis Algorithm:
41 Marginal posteriors via MCMC
42 Posterior of functions of Candidate closed loop controller: What are the likely achieved gain and phase margins ? Implicit functions of - direct computation unclear
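One way to read this slide: once posterior samples of the parameters are available, an implicit function of them is handled by evaluating it sample by sample. The sketch below assumes a hypothetical first-order plant G(z) = b/(z - a) with |a| < 1 and b > 0, under proportional feedback with gain `k`. For that plant the closed-loop pole a - k*b reaches -1 at k = (1 + a)/b, giving a gain margin of (1 + a)/(k*b). The Gaussian draws only stand in for real MCMC output.

```python
import random


def gain_margin(a, b, k):
    """Gain margin of G(z) = b/(z - a) under proportional gain k.

    Assumes |a| < 1, b > 0, k > 0 (hypothetical plant, not from the talk).
    """
    return (1.0 + a) / (k * b)


def prob_exceeds(values, threshold):
    """Fraction of samples strictly above the threshold."""
    return sum(1 for v in values if v > threshold) / len(values)


if __name__ == "__main__":
    rng = random.Random(0)
    # Stand-in for posterior samples of (a, b); a real run would use MCMC output.
    samples = [(rng.gauss(0.5, 0.05), rng.gauss(1.0, 0.1)) for _ in range(5000)]
    margins = [gain_margin(a, b, 0.5) for a, b in samples]
    print("P(gain margin > 2.5) ~", prob_exceeds(margins, 2.5))
```

The same pattern applies to any quantity that can be computed from a single parameter value, such as a phase margin found numerically from the frequency response.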
43 Sample Histograms of There is strong evidence that the proposed controller will achieve a gain margin > 3.8 and phase margin > 95°
44 Conclusions Many thanks for your attention; Collective thanks to the SYSID2009 Organisation Team! Deep thanks to the Uni. Newcastle Signal Processing Micro-electronics group (sigpromu.org) ‣ Steve Weller, Chris Kellett, Tharaka Dissanayake, Peter Schreier, Sarah Johnson, Geoff Knagge, Björn Rüffer, Adrian Wills, Lawrence Ong, Dale Bates, Ian Griffiths, David Hayes, Soren Henriksen, Adam Mills, Alan Murray who endured multiple road-test versions of this talk that were even worse than this one.