 SYSTEMS Identification

Ali Karimpour, Assistant Professor, Ferdowsi University of Mashhad. Reference: “System Identification: Theory for the User”, Lennart Ljung

Model Structure Selection and Model Validation
Lecture 16: Model Structure Selection and Model Validation
Topics to be covered include:
General Aspects of the Choice of the Model Structure
A Priori Considerations
Model Structure Selection Based on Preliminary Data Analysis
Comparing Model Structures
Model Validation
Residual Analysis

Introduction In Chapters 4 and 5 we provided a list of typical model structures to be used for identification. In this chapter we shall complement this list by discussing how to arrive at a suitable structure, guided by system knowledge and the collected data set. Once a model structure has been chosen, the identification procedure provides us with a particular model in this structure. This model may be the best available one, but the crucial question is whether it is good enough for the intended purpose. Testing whether a given model is appropriate is known as model validation.

General Aspects of the Choice of the Model Structure
The route to a particular model structure involves at least three steps:
1- To choose the type of model set. This involves the selection between nonlinear and linear models, between input-output, black-box and physically parameterized state-space models, and so on.
2- To choose the size of the model set. This involves issues like selecting the order of a state-space model, the degrees of the polynomials in a model, or the number of “neurons” in a neural network. It also contains the problem of which variables to include in the model description. We thus have to select M from a given increasing chain of structures $\mathcal{M}_1 \subset \mathcal{M}_2 \subset \mathcal{M}_3 \subset \cdots$

General Aspects of the Choice of the Model Structure
3- To choose the model parameterization. When a model set M* has been decided on, it remains to parameterize it, that is, to find a suitable model structure M whose range equals M*. The goal of the user is to obtain a good model at a low price. The choice of model structure certainly has a considerable effect on both the quality of the resulting model and the price for it.

General Aspects of the Choice of the Model Structure
Quality of the Model. The choices that affect the quality of the resulting model (Section 12.1):
How to perform the identification experiment
What model structures to choose
What identification algorithm to apply
How to validate the obtained model

General Aspects of the Choice of the Model Structure
Quality of the Model
Flexibility: employing model structures that offer good capabilities of describing different possible systems. Flexibility can be obtained either by using many parameters or by placing them in “strategic positions”.
Parsimony: not to use unnecessarily many parameters, that is, to be “parsimonious” with the model parameterization.

General Aspects of the Choice of the Model Structure
Price of the Model. The price of the model is associated with the effort to calculate it, that is, to perform the minimization $\hat\theta_N = \arg\min_\theta V_N(\theta, Z^N)$ or to solve the equation $f_N(\theta, Z^N) = 0$.

General Aspects of the Choice of the Model Structure
This work is highly dependent on the model structure, which influences:
The algorithm complexity: We saw in Chapter 10 that solving for $\hat\theta_N$ involves evaluation of the prediction errors $\varepsilon(t,\theta)$ and their gradients $\psi(t,\theta)$ for a number of values of $\theta$. The work associated with these evaluations depends critically on M.
The properties of the criterion function: The amount of work to solve for $\hat\theta_N$ also depends on how many evaluations of the criterion function and its gradient are necessary. This is determined by the “shape” of the criterion function. The shape in turn is a result of the choice of the norm $\ell(\cdot)$ and of how the prediction errors $\varepsilon(t,\theta)$ depend on $\theta$.

General Aspects of the Choice of the Model Structure
A high-order, complex model is more difficult to use for simulation and control design. If it is only marginally better than a simpler model, it may not be worth the higher price. Consequently, the intended use of the model will also affect the choice of model structure.
General Considerations: The final model structure is a compromise between the aspects listed below:
Flexibility
Parsimony
The algorithm complexity
The properties of the criterion function
The intended use of the model

General Aspects of the Choice of the Model Structure
The techniques and considerations that are used when weighing these general considerations can be split into different categories.
A priori considerations: Certain aspects are independent of the data set $Z^N$ and can be evaluated a priori, before the data have been measured. (Section 16.2)
Techniques based on preliminary data analysis: With the data available, certain testing and evaluation of $Z^N$ can be carried out that give insights into possible and suitable model structures. These techniques do not necessarily require the computation of a complete model. (Section 16.3)
Comparing different model structures: Before a final model structure is chosen, it is advisable to shop around in different model structures and compare the quality and prices of the models offered there. This will require the computation and comparison of several models. (Section 16.4)
Validation of a given model: Regardless of how a given model is obtained, we can always use $Z^N$ to evaluate whether it seems likely that it will serve its purpose. If a certain model is accepted, we have also implicitly approved the choice of the underlying model structure. (Sections 16.5 and 16.6)

A Priori Considerations
Type of Model. The choice of which type of model to use is quite subjective and involves several issues that are independent of the data set $Z^N$. The compromise between parsimony and flexibility is at the heart of the identification problem. How shall we obtain a good fit to data with few parameters? The answer usually is to use a priori knowledge about the system, intuition, and ingenuity. Whether it is feasible to build a well-founded, physically parameterized model structure depends on our insight into and understanding of the process. This is of course an application-dependent problem.

A Priori Considerations
For a physical system, a priori information can typically best be incorporated into a continuous-time model, such as a physically parameterized state-space model. This means that the computation of ε(t,θ) and the minimization of $V_N$ become a laborious task, both regarding the programming effort and the computation time required. Aspects of algorithmic complexity as well as the shape of the criterion function therefore favor black-box models. By this we mean a general model that adapts its parameters to data, with no need for any physical interpretation of their values.

A Priori Considerations
A general piece of advice is to “try simple things first”. One should go into sophisticated model structures only if …. So a simple linear regression model is a good first choice for an identification problem. One should note that using physical a priori knowledge does not necessarily mean that fancy continuous-time model structures have to be constructed. Some thinking about the nature of the relationship between the measured signals can give good hints for model structures. In general, one should contemplate whether nonlinear transformations of the data will make it easier for the transformed data to fit a linear model.

A Priori Considerations
Model Order. Choosing the size of the model set usually requires help from the data. However, physical insight and the intended model application will often tell which range of model orders should be considered. Also, even when the data have not been evaluated, knowing N and the data quality will indicate how many parameters it is feasible to estimate. With few data points, it is not reasonable to try to determine a model in a complex model structure. A related problem is how many different time scales it is feasible to let one and the same model handle. For numerical reasons it may be difficult to adequately describe more than two or three decades of the frequency range within one model. (Problem 13G.1) Considerations on sampling rates, proper excitation, and data record lengths strongly suggest that one should not aim at covering more than three decades of time constants in one experiment.

A Priori Considerations
Model Order. If the system is stiff, so that it contains widely separated time constants of interest, the conclusion is thus to build two or more models, each covering a proper part of the frequency range and each sampled with a correspondingly suitable sampling interval. For a high-frequency model, the low-frequency dynamics for all practical purposes look like integrators, the number of integrators being equal to the pole excess at low frequencies. Correspondingly, the high-frequency dynamics look like static relationships to the low-frequency model. Thus, introduce a no-delay term b0u(t) in this model.

A Priori Considerations
Model Parameterization. The issue of model parameterization is basically numerical. We seek model parameterizations that are well conditioned, so that a round-off or other numerical error in one parameter has a small influence on the input-output behavior of the model. This is a problem that has been widely recognized in the digital filtering area, but less so in the identification literature. In fact, the standard input-output model structures can be quite sensitive to numerical errors.

Model Structure Selection Based on Preliminary Data Analysis
By preliminary data analysis, we mean calculations that do not involve the determination of a complete model of the system.
Estimating the Type of Model: Generally, data-aided model structure selection appears to be an underdeveloped field. An exception is order determination in linear structures.
Order Estimation: The order of a linear system can be estimated in many different ways. Methods that are based on preliminary data analysis fall into the following categories:
1. Examining the spectral analysis estimate of the transfer function
2. Testing ranks in sample covariance matrices
3. Correlating variables
4. Examining the information matrix

Model Structure Selection Based on Preliminary Data Analysis
1. Spectral Analysis Estimate. A nonparametric estimate of the transfer function will give valuable information about resonance peaks, the high-frequency roll-off, and the phase shift. All this gives a hint as to what model orders will be required to give an adequate description of the dynamics. Note, though, that discrete-time Bode plots show some artifacts in their interpretation in terms of poles and zeros, compared to continuous-time Bode plots. Thus, use the observations with some care.

Model Structure Selection Based on Preliminary Data Analysis
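2. Testing ranks in covariance matrices. Suppose that the true system is described by $y(t) + a_1 y(t-1) + \dots + a_n y(t-n) = b_1 u(t-1) + \dots + b_n u(t-n) + v_0(t)$ for some noise sequence {v0(t)}. Suppose also that n is the smallest number for which this holds. As usual, let $\varphi_s(t) = [-y(t-1)\ \dots\ -y(t-s)\ \ u(t-1)\ \dots\ u(t-s)]^T$. Suppose first that v0(t)=0. Then the matrix $R^s(N) = \frac{1}{N}\sum_{t=1}^{N}\varphi_s(t)\varphi_s^T(t)$ is nonsingular for $s \le n$ (provided the input is persistently exciting) and singular for $s \ge n+1$.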

Model Structure Selection Based on Preliminary Data Analysis
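Now suppose that v0(t)≠0. Then the singularity is no longer exact, but a measure of it, such as $\det R^s(N)$, can be compared with a threshold, provided the signal-to-noise ratio is high. If this is not the case, Woodside (1971) suggested the use of an enhanced matrix that compensates for the estimated influence of the noise. A better alternative is to use other correlation vectors. (See Wellstead (1978) and Wellstead and Rojas (1982).)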
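The rank test can be tried numerically. The following is a minimal sketch, not the lecture's own example: it assumes a hypothetical noise-free second-order system with made-up coefficients, forms the sample covariance matrix of the regression vector $[-y(t-1) \dots -y(t-s)\ u(t-1) \dots u(t-s)]^T$ for increasing s, and uses the condition number as the measure of near-singularity.

```python
import numpy as np

def regression_matrix(y, u, s, start):
    """Stack phi_s(t) = [-y(t-1..t-s), u(t-1..t-s)] for t = start..N-1."""
    return np.array([
        np.concatenate((-y[t - s:t][::-1], u[t - s:t][::-1]))
        for t in range(start, len(y))
    ])

# Hypothetical noise-free second-order system (n = 2), white-noise input.
rng = np.random.default_rng(0)
N = 300
u = rng.standard_normal(N)
y = np.zeros(N)
for t in range(2, N):
    y[t] = 1.5 * y[t - 1] - 0.7 * y[t - 2] + u[t - 1] + 0.5 * u[t - 2]

s_max = 4
cond = {}
for s in range(1, s_max + 1):
    Phi = regression_matrix(y, u, s, start=s_max)
    R = Phi.T @ Phi / len(Phi)          # sample covariance matrix for order s
    cond[s] = np.linalg.cond(R)
    print(f"s = {s}: cond(R) = {cond[s]:.3g}")
```

The condition number should stay moderate for s up to the true order and jump by many orders of magnitude for s = 3 and s = 4. With noise added to y, the jump becomes less sharp, which is why a threshold or an enhanced matrix is needed.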

Model Structure Selection Based on Preliminary Data Analysis
3. Correlating variables. The order-determination problem is whether to include one more variable in a model structure or not. This variable could be y(t-n-1) or a measured possible disturbance variable ω(t). In any case, the question is whether this new variable has anything to contribute when explaining the output variable y(t). This is measured by the correlation between y(t) and ω(t). However, to discount the possible relationship between ω(t) and y(t) that is already accounted for by the smaller model structure, the correlation should be measured between ω(t) and what remains to be explained, that is, the residuals of the smaller model. This is known as canonical correlation or partial correlation in regression analysis. (See Draper and Smith, 1981.) We may also note that the determination of the state-space model order, i.e., determining how many of the singular values are significant, is a test of the same kind.

Model Structure Selection Based on Preliminary Data Analysis
4. The information matrix. It follows from Theorem 4.1 that, if the model orders are overestimated in certain model structures, global and local identifiability will be lost. This means that ψ(t,θ) will not have full rank at θ=θ*, and hence the information matrix will be singular. Since the Gauss-Newton search algorithm uses the inverse of the information matrix, a natural test quantity for whether the model order is too high is the condition number of this matrix. A related situation occurs when the IV method is used: then the corresponding IV matrix will be singular when the orders are overestimated. So testing the conditioning of …..

Comparing Model Structures
A most natural approach to searching for a suitable model structure is simply to test a number of different ones and to compare the resulting models. The model to be evaluated will generically be denoted by $m$. It is estimated within the model structure M, which has dM=dimθ free parameters.
Estimation data: the data that were used to estimate m.
Validation data: any data set available that has not been used to build any of the models we would like to evaluate.

Comparing Model Structures
What to compare? There are of course a number of ways to evaluate a model. We shall here describe evaluations and comparisons that are based on data sets from the system. Suppose that the data sets have been collected under conditions that are close to the intended operating conditions. The model tests are then basically tests of: How well the model is capable of reproducing these data.

Comparing Model Structures
What to compare? We shall generally work with k-step-ahead model predictions $\hat y_k(t|m)$ as the basis of comparisons.

Comparing Model Structures
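What to compare? For a linear model the k-step-ahead predictor combines the input, filtered through the model, with measured outputs up to time t-k. For an output error model, H(q)=1, so the k-step-ahead prediction coincides with the simulated output for every k. Otherwise, note the considerable conceptual difference between the simulated output, which uses past inputs only, and the one-step-ahead prediction. The latter has y(t-1) and earlier y-values available and can therefore give fits that “look good” even though the model may be bad.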
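For reference, a sketch of the k-step-ahead predictor for a linear model $y(t) = G(q)u(t) + H(q)e(t)$. The notation is assumed here: $h(l)$ are the impulse response coefficients of $H(q)$ with $h(0) = 1$, and $q^{-1}$ is the backward shift operator.

```latex
\begin{align*}
\hat y_k(t \mid m) &= W_k(q)\,G(q)\,u(t) + \big[1 - W_k(q)\big]\,y(t),\\
W_k(q) &= \bar H_k(q)\,H^{-1}(q), \qquad \bar H_k(q) = \sum_{l=0}^{k-1} h(l)\,q^{-l}.
\end{align*}
```

With $H(q) = 1$ we get $\bar H_k(q) = 1$ and $W_k(q) = 1$ for every k, so the predictor reduces to the simulated output $G(q)u(t)$. The same happens for any $H$ in the limit $k \to \infty$.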

Comparing Model Structures
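What to compare? The models can be evaluated by visual inspection of plots of y(t) and its prediction $\hat y_k(t|m)$, or by computing the numerical value of the fit, for example the mean-square prediction error.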
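The difference between simulation and one-step-ahead prediction shows up clearly in a small numerical experiment. This is a minimal sketch under stated assumptions: a hypothetical noise-free first-order system, a deliberately poor model of it, and the mean-square error as the numerical fit. All coefficients are made up for illustration.

```python
def one_step_prediction(a, b, y, u):
    """y_hat_1(t) = a*y(t-1) + b*u(t-1): uses measured past outputs."""
    return [0.0] + [a * y[t - 1] + b * u[t - 1] for t in range(1, len(y))]

def simulation(a, b, u):
    """y_hat_s(t) = a*y_hat_s(t-1) + b*u(t-1): uses past inputs only."""
    ys = [0.0]
    for t in range(1, len(u)):
        ys.append(a * ys[t - 1] + b * u[t - 1])
    return ys

def mean_square_error(y, y_hat):
    return sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat)) / len(y)

# Hypothetical noise-free first-order system driven by a slow square wave.
N = 400
u = [1.0 if (t // 50) % 2 == 0 else -1.0 for t in range(N)]
y = simulation(0.95, 0.05, u)            # "measured" output of the true system

# Deliberately poor model: correct pole, but the static gain is 60 % too low.
a_bad, b_bad = 0.95, 0.02
mse_pred = mean_square_error(y, one_step_prediction(a_bad, b_bad, y, u))
mse_sim = mean_square_error(y, simulation(a_bad, b_bad, u))
print(f"one-step prediction MSE: {mse_pred:.4f}")
print(f"simulation MSE:          {mse_sim:.4f}")
```

The one-step-ahead fit looks excellent because the measured y(t-1) carries most of the information, while the simulated output exposes the gain error.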

Comparing Model Structures
Comparing Models on Fresh Data Sets: Cross-Validation. It is not so surprising that a model will be able to reproduce the estimation data. A suggestive and attractive way of comparing two different models m1 and m2 is to evaluate their performance on validation data, e.g. by comparing their fits to that data set. We would then favor the model that shows the better performance. Such procedures are known as cross-validation, and several variants have been developed. See, for example, Stone (1974) and Snee (1977).
Advantage: An attractive feature of cross-validation procedures is their pragmatic character: the comparison makes sense without any probabilistic arguments and without any assumptions about the true system.
Disadvantage: We have to save a fresh data set for the validation, and therefore cannot use all our information to build the models.

Comparing Model Structures
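Comparing Models on Second-hand Data Sets: Evaluating the Expected Fit. The proper quality measure for the model m is the expected criterion, that is, the fit the model would achieve on average over data that were not used to estimate it. If the comparison criterion coincides with the estimation criterion, this expected fit can be estimated from the estimation data alone.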

Comparing Model Structures
A Pragmatic Preview. The model obtained in the larger model structure will automatically yield a smaller value of the criterion of fit, since it is the minimizing value obtained by minimization over a larger set. As the model structure increases, the minimal value of the criterion will thus behave as depicted in the figure: it is a monotonically decreasing function of model structure flexibility.

Comparing Model Structures
To begin with, the value $V_N$ decreases since the model picks up more of the relevant features of the data. But even after a model structure has been reached that allows a correct description of the system, the value $V_N$ continues to decrease, now because the additional (unnecessary) parameters adjust themselves to features of the particular realization of the noise. This is known as overfit, and this extra improvement of the fit is of course of no value to us, since we are going to apply the model to data with different noise realizations. It is reasonable that the decrease from overfit should be less significant than the decrease that results when more relevant features are included in the model. We will thus be looking for the “knee” in the curve of the figure. We now make this statement precise.
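The monotone decrease of the estimation fit, and the contrast with the fit on fresh data, can be reproduced with a short experiment. This is a sketch under stated assumptions: a hypothetical second-order ARX system with made-up coefficients, least-squares ARX models of order 1 to 6, and the mean-square one-step prediction error as the criterion.

```python
import numpy as np

def regressors(y, u, n, start):
    """Rows phi(t) = [-y(t-1..t-n), u(t-1..t-n)] for t = start..N-1."""
    return np.array([
        np.concatenate((-y[t - n:t][::-1], u[t - n:t][::-1]))
        for t in range(start, len(y))
    ])

def simulate(N, rng):
    """Hypothetical second-order ARX system with white equation noise."""
    u = rng.standard_normal(N)
    e = 0.1 * rng.standard_normal(N)
    y = np.zeros(N)
    for t in range(2, N):
        y[t] = 1.5 * y[t - 1] - 0.7 * y[t - 2] + u[t - 1] + 0.5 * u[t - 2] + e[t]
    return y, u

rng = np.random.default_rng(0)
y_est, u_est = simulate(400, rng)   # estimation data
y_val, u_val = simulate(400, rng)   # validation data (fresh noise realization)

n_max = 6
est_loss, val_loss = [], []
for n in range(1, n_max + 1):
    # Same rows for every order, so that the model structures are truly nested.
    Phi = regressors(y_est, u_est, n, start=n_max)
    theta, *_ = np.linalg.lstsq(Phi, y_est[n_max:], rcond=None)
    est_loss.append(np.mean((y_est[n_max:] - Phi @ theta) ** 2))
    Phi_val = regressors(y_val, u_val, n, start=n_max)
    val_loss.append(np.mean((y_val[n_max:] - Phi_val @ theta) ** 2))
    print(f"n = {n}: V_N = {est_loss[-1]:.5f}, validation fit = {val_loss[-1]:.5f}")
```

The estimation fit can only decrease as n grows, with a pronounced knee at the true order n = 2. The validation fit is not guaranteed to keep decreasing beyond that point, which is what makes it useful for comparing structures.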

Comparing Model Structures
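A Formal Result. For the case that the comparison criterion coincides with the estimation criterion, we have the following result. Theorem: Let $\hat\theta_N = \arg\min_\theta V_N(\theta, Z^N)$. Let $\theta^*$ be the minimizing argument of $\bar V(\theta) = \lim_{N\to\infty} E\,V_N(\theta, Z^N)$ and suppose that $\sqrt{N}(\hat\theta_N - \theta^*)$ is asymptotically normal with zero mean and covariance matrix $P_\theta$. Then, asymptotically as $N \to \infty$, $E\,\bar V(\hat\theta_N) \approx E\,V_N(\hat\theta_N, Z^N) + \frac{1}{N}\operatorname{tr}\big[\bar V''(\theta^*) P_\theta\big]$.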

Comparing Model Structures
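Note the important difference between the expected fit on the estimation data, $E\,V_N(\hat\theta_N, Z^N)$, and the expected fit on fresh data, $E\,\bar V(\hat\theta_N)$. If we generate many estimation and validation data sets in a Monte Carlo manner, the former is the average of the fits of the models as they are fitted to estimation data, while the latter is the average as the estimated models are evaluated on validation data.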
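The gap between the two averages follows from a second-order expansion around $\theta^*$. This is a sketch, assuming that $\sqrt{N}(\hat\theta_N - \theta^*)$ is asymptotically normal with covariance $P_\theta$ and writing $\bar V(\theta)$ for the limit of $E\,V_N(\theta, Z^N)$ (this notation is an assumption, since the slide formulas are not reproduced in the transcript):

```latex
\begin{align*}
E\,\bar V(\hat\theta_N) &\approx \bar V(\theta^*)
  + \tfrac{1}{2}\,E\,(\hat\theta_N-\theta^*)^T\,\bar V''(\theta^*)\,(\hat\theta_N-\theta^*)
  = \bar V(\theta^*) + \frac{1}{2N}\operatorname{tr}\!\big[\bar V''(\theta^*)P_\theta\big],\\
E\,V_N(\hat\theta_N, Z^N) &\approx \bar V(\theta^*)
  - \frac{1}{2N}\operatorname{tr}\!\big[\bar V''(\theta^*)P_\theta\big],\\
E\,\bar V(\hat\theta_N) &\approx E\,V_N(\hat\theta_N, Z^N)
  + \frac{1}{N}\operatorname{tr}\!\big[\bar V''(\theta^*)P_\theta\big].
\end{align*}
```

The first line expands $\bar V$ around its minimum $\theta^*$. The second expands $V_N$ around its own minimum $\hat\theta_N$ and uses $E\,V_N(\theta^*, Z^N) = \bar V(\theta^*)$. Subtracting the two gives the third line: the fit on estimation data is optimistic by twice the variance term.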

Comparing Model Structures
A Formal Result. Theorem: Akaike’s Final Prediction-Error Criterion (FPE)

Comparing Model Structures
The resulting expression, in which $d_M$ is the number of unknown parameters, shows the fundamental cost of parameters. The more parameters are used by the model structure, the smaller the first term will be. However, each parameter carries a variance penalty that contributes a term proportional to $\lambda_0/N$ to the expected mean square error fit, where $\lambda_0$ is the variance of the true noise. Any parameter that improves the fit of $V_N$ by less than this amount will thus be harmful in this respect.

Comparing Model Structures
Now, the noise variance $\lambda_0$ is not known, but it can easily be estimated from the minimal value of the criterion $V_N$. Inserting this estimate into the expression for the expected fit gives Akaike’s Final Prediction-Error criterion (FPE).
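As a sketch of how the criterion is used: with d estimated parameters and N data points, the noise variance estimate is $\hat\lambda_N = V_N(\hat\theta_N)/(1 - d/N)$, and the commonly quoted form of the criterion is $\mathrm{FPE} = V_N(\hat\theta_N)\,(1 + d/N)/(1 - d/N)$. The function name and the fit values below are made up for illustration and are not taken from the lecture.

```python
def fpe(v_n, d, n_samples):
    """Akaike's FPE for a mean-square fit v_n, d parameters, n_samples data."""
    if d >= n_samples:
        raise ValueError("need more data points than parameters")
    return v_n * (1 + d / n_samples) / (1 - d / n_samples)

# Made-up illustrative values of the estimation fit V_N for nested structures.
N = 200
candidates = {2: 0.5000, 4: 0.0102, 6: 0.0101, 8: 0.01005}   # d -> V_N
scores = {d: fpe(v, d, N) for d, v in candidates.items()}
best_d = min(scores, key=scores.get)
for d, score in scores.items():
    print(f"d = {d}: V_N = {candidates[d]:.5f}, FPE = {score:.5f}")
print("structure with the smallest FPE: d =", best_d)
```

Although $V_N$ keeps decreasing from d = 4 to d = 8 in this made-up table, the decrease is smaller than the penalty factor, so the criterion selects d = 4.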

Model Validation The parameter estimation procedure picks out the “best” model within the chosen model structure. The crucial question then is whether this “best” model is “good enough”. This is the problem of model validation. The question has several aspects:
1. Does the model agree sufficiently well with the observed data?
2. Is the model good enough for my purpose? There is always a certain purpose with the modeling.
3. Does the model describe the “true system”? Philosophically, this is impossible to answer.
Model validation techniques thus tend to focus on question 1.

Model Validation 1. Does the model agree sufficiently well with the observed data? The techniques considered are:
Validation with Respect to the Purpose of the Modeling
Feasibility of Physical Parameters
Consistency of Model Input-Output Behavior
Model Reduction
Parameter Confidence Intervals
Simulation and Prediction
A particularly useful technique is residual analysis (Section 16.5).

Model Validation
Validation with Respect to the Purpose of the Modeling: The model might be required for regulator design, prediction, or simulation. The ultimate validation then is to test whether the problem that motivated the modeling exercise can be solved using the obtained model. If a regulator based on the model gives satisfactory control, then the model was a valid one.
Feasibility of Physical Parameters: For a model structure that is parameterized in terms of physical parameters, compare the estimated values and their estimated variances with what is reasonable from prior knowledge. One can also evaluate the sensitivity of the input-output behavior with respect to the parameters to check their practical identifiability.

Model Validation
Consistency of Model Input-Output Behavior: For black-box models, consider the input-output properties. For linear models, use Bode diagrams. For nonlinear models, inspect the behavior by simulation. It is always good practice to evaluate and compare different linear models in Bode plots, possibly with the estimated variance translated to confidence intervals of the plotted curves.
Model Reduction: One procedure that tests whether the model is a simple and appropriate system description is to apply some model-reduction technique to it. If the model order can be reduced without affecting the input-output properties very much, then the original model was “unnecessarily complex”.

Model Validation
Parameter Confidence Intervals: Another procedure that checks whether the current model contains too many parameters is to compare each estimate with the corresponding estimated standard deviation. (Section 9.6) If the confidence interval contains zero, we could consider whether this parameter should be removed. If the estimated standard deviations are large, this is also an indication of too large model orders.
Simulation and Prediction: In Section 16.4 we used the model’s ability to reproduce input-output data in terms of simulations and predictions as a main tool for comparisons. Such plots, and the numerical fits associated with them, are of course most useful and intuitively appealing also for evaluating a given model. We see exactly what features the model is capable of reproducing and what features it has not captured. The discrepancies can be due to noise or model errors, and we will see the combined effects of these sources.
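The confidence-interval check can be sketched for a least-squares ARX fit. The assumptions here are a hypothetical first-order system with made-up coefficients, a second-order model fitted on purpose, and the usual least-squares covariance expression used as an approximation of the parameter covariance.

```python
import numpy as np

# Hypothetical first-order system; a second-order ARX model is fitted on purpose.
rng = np.random.default_rng(0)
N = 500
u = rng.standard_normal(N)
e = 0.1 * rng.standard_normal(N)
y = np.zeros(N)
for t in range(1, N):
    y[t] = 0.8 * y[t - 1] + u[t - 1] + e[t]

# Regressors for y(t) = -a1 y(t-1) - a2 y(t-2) + b1 u(t-1) + b2 u(t-2) + e(t)
Phi = np.column_stack((-y[1:-1], -y[:-2], u[1:-1], u[:-2]))
target = y[2:]
theta, *_ = np.linalg.lstsq(Phi, target, rcond=None)

residuals = target - Phi @ theta
dof = len(target) - len(theta)
noise_var = residuals @ residuals / dof               # estimate of the noise variance
cov_theta = noise_var * np.linalg.inv(Phi.T @ Phi)    # least-squares covariance
std = np.sqrt(np.diag(cov_theta))

# Approximate 95 % intervals: estimate +/- 1.96 standard deviations.
contains_zero = np.abs(theta) < 1.96 * std
for name, th, sd, flag in zip(("a1", "a2", "b1", "b2"), theta, std, contains_zero):
    print(f"{name} = {th:+.3f} +/- {1.96 * sd:.3f}  interval contains zero: {flag}")
```

The parameters a1 and b1 of the true system should come out clearly significant. The extra parameters a2 and b2 have true value zero, and their standard deviations are comparatively large because the overparameterized model is close to a pole-zero cancellation.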

Residual Analysis Pragmatic viewpoints
The parts of the data that the model could not reproduce are the residuals. We want to know the quality of the model, which in a sense is a statement about how well it will be able to reproduce new data sets. A simple and pragmatic starting point is to compute basic statistics for the residuals: $S_1 = \max_t |\varepsilon(t)|$ and $S_2^2 = \frac{1}{N}\sum_{t=1}^{N}\varepsilon^2(t)$. The model has never produced a larger residual than S1 for all data we have seen. It is likely that such a bound will hold also for future data. Now, this use of the statistics has an implicit invariance assumption: the residuals do not depend on something that is likely to change. Of special importance is, of course, that they do not depend on the particular input used in $Z^N$. If they did, the value of S1 would be limited, since the model should work for a range of possible inputs.

Residual Analysis To check that the residuals do not depend on the particular input used, it is reasonable to study the covariance between residuals and past inputs, $\hat R_{\varepsilon u}^N(\tau) = \frac{1}{N}\sum_{t=1}^{N}\varepsilon(t)u(t-\tau)$. If these numbers are not small, there are traces of past inputs in the residuals: there is a part of y(t) that originates from the past input and that has not been properly picked up by the model m. Hence the model could be improved. Similarly, if we find correlation among the residuals themselves, i.e., if the numbers $\hat R_{\varepsilon}^N(\tau) = \frac{1}{N}\sum_{t=1}^{N}\varepsilon(t)\varepsilon(t-\tau)$ are not small for τ ≠ 0, then part of ε(t) could have been predicted from past data. This means that y(t) could have been better predicted.

Residual Analysis Tests for dynamical systems
From a more formal point of view, we have motivated the estimation criterion as a maximum likelihood method, where the ε(t) are independent of each other and of past data. The model validation question related to data, then, is: Is it likely that the residual sequence {ε(t)} is a sequence of independent random variables with PDF fe(x,t;θ)? Two tests address this question: the whiteness test, and the test of independence between residuals and past inputs.

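… Residual Analysis Whiteness test
The numbers $\hat R_{\varepsilon}^N(\tau) = \frac{1}{N}\sum_{t=1}^{N}\varepsilon(t)\varepsilon(t-\tau)$ carry information about whether the residuals can be regarded as white. The whiteness test is to define $\zeta_{N,M} = \frac{N}{(\hat R_{\varepsilon}^N(0))^2}\sum_{\tau=1}^{M}(\hat R_{\varepsilon}^N(\tau))^2$, which should be asymptotically $\chi^2(M)$-distributed if the residuals are white. There are some more methods to check: the number of sign changes of ε(t), and a histogram test for the distribution of ε. (See Draper and Smith, 1981.)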
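A minimal sketch of the whiteness test, assuming zero-mean residuals and using the statistic N times the sum of the squared autocovariances at lags 1 to M, divided by the squared autocovariance at lag 0. The two residual sequences are synthetic and only illustrate the mechanics. The constant 18.307 is the 95 % point of the chi-square distribution with 10 degrees of freedom.

```python
import random

def autocovariance(eps, tau):
    """R_hat(tau) = (1/N) * sum_{t=tau}^{N-1} eps(t) * eps(t - tau)."""
    n = len(eps)
    return sum(eps[t] * eps[t - tau] for t in range(tau, n)) / n

def whiteness_statistic(eps, max_lag):
    """zeta = N / R_hat(0)^2 * sum_{tau=1}^{M} R_hat(tau)^2."""
    r0 = autocovariance(eps, 0)
    return len(eps) * sum(autocovariance(eps, tau) ** 2
                          for tau in range(1, max_lag + 1)) / r0 ** 2

rng = random.Random(0)
N, M = 500, 10
white = [rng.gauss(0.0, 1.0) for _ in range(N)]

# Colored residuals: first-order filtered white noise (hypothetical example).
colored = [white[0]]
for t in range(1, N):
    colored.append(0.8 * colored[t - 1] + white[t])

CHI2_95_DOF10 = 18.307   # 95 % point of the chi-square distribution, 10 d.o.f.
for name, eps in (("white", white), ("colored", colored)):
    zeta = whiteness_statistic(eps, M)
    verdict = "passes" if zeta < CHI2_95_DOF10 else "fails"
    print(f"{name}: zeta = {zeta:.1f} -> {verdict} the whiteness test")
```

The filtered sequence gives a statistic far above the threshold. A white sequence stays below it in about 95 % of the realizations, so an occasional failure for truly white residuals is expected.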

… Residual Analysis Independence between Residuals and Past Inputs
The numbers $\hat R_{\varepsilon u}^N(\tau) = \frac{1}{N}\sum_{t=1}^{N}\varepsilon(t)u(t-\tau)$ carry information about independence between residuals and past inputs. The test is to form a normalized version of these numbers, which should be asymptotically normally distributed with zero mean if the residuals are independent of past inputs. There are some more methods to check: plots of the pairs ( ε(t),u(t-τ) ), and correlating nonlinear transformations of ε and u. (See Draper and Smith, 1981.)
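A sketch of the independence test under the usual asymptotic result: if ε and u are independent, $\sqrt{N}\,\hat R_{\varepsilon u}^N(\tau)$ is approximately normal with zero mean and variance $P = \sum_k R_\varepsilon(k) R_u(k)$. The sum is truncated to $|k| \le M$ here, which is an assumption of this sketch. The residuals are synthetic, with a deliberate trace of u(t-3).

```python
import math
import random

def cross_cov(a, b, tau):
    """(1/N) * sum_t a(t) * b(t - tau), with t restricted to valid indices."""
    n = len(a)
    return sum(a[t] * b[t - tau] for t in range(max(tau, 0), min(n, n + tau))) / n

rng = random.Random(1)
N, M = 500, 10
u = [rng.gauss(0.0, 1.0) for _ in range(N)]
e = [rng.gauss(0.0, 1.0) for _ in range(N)]
# Hypothetical residuals with a trace of u(t-3) that the model failed to capture.
eps = [e[t] + (0.5 * u[t - 3] if t >= 3 else 0.0) for t in range(N)]

# P = sum_k R_eps(k) * R_u(k), truncated to |k| <= M.
P = sum(cross_cov(eps, eps, k) * cross_cov(u, u, k) for k in range(-M, M + 1))
bound = 2.58 * math.sqrt(P / N)      # 99 % level of the normal distribution

for tau in range(1, M + 1):
    r = cross_cov(eps, u, tau)
    flag = "  <-- significant" if abs(r) > bound else ""
    print(f"tau = {tau:2d}: R_eps_u = {r:+.3f} (bound {bound:.3f}){flag}")
```

The lag at which the bound is exceeded points to the part of the dynamics that the model has missed, here the response to the input three samples back.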

Residual Analysis Tests for dynamical systems
Testing the correlation between past inputs and the residuals is a natural way to evaluate whether the model has picked up the essential part of the dynamics from u to y. For a dynamical model, the results of the tests can be visualized more effectively if we view them as estimates of the residual dynamics, that is, of a model from u to ε, called the model-error model. Even more effective from a control point of view is to display the frequency function of this estimate, along with estimated confidence regions. This gives a picture of what frequency ranges the model has not captured in the input-output behavior. Depending on the intended use, the model could then be accepted even if “independence between residuals and past inputs” is violated, provided the errors occur in “harmless” frequency ranges.

Residual Analysis Example: Residual analysis for dynamic systems
Consider a system that was simulated over 500 samples with an input consisting of sinusoids between 0.3 and 0.6 rad/sec, and white Gaussian noise e with variance 1. A second-order ARX model, ARX(2), was identified from the simulation data. A validation data set was generated using a random binary input with a resonance peak around 3 rad/sec.

Residual Analysis The result of conventional residual analysis:
[Figure: correlation among the residuals themselves; correlation between residuals and past inputs]

Residual Analysis Model-Error Model
Example: Residual analysis for dynamic systems. A second-order ARX model, ARX(2), was considered for the system. For the model-error model, a 10th-order ARX model, ARX(10), from the input to the residuals was considered.
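The idea of a model-error model can be sketched with a simpler structure than the ARX(10) model of the example. The code below fits a 10-tap FIR model from u to ε by least squares. FIR is a substitution made here for brevity, and the residuals are synthetic with made-up unmodeled dynamics.

```python
import numpy as np

rng = np.random.default_rng(0)
N, n_taps = 500, 10
u = rng.standard_normal(N)
# Hypothetical residuals: unmodeled dynamics 0.5*u(t-3) plus white noise.
eps = 0.2 * rng.standard_normal(N)
eps[3:] += 0.5 * u[:-3]

# FIR model-error model: eps(t) ~ g_1 u(t-1) + ... + g_10 u(t-10)
rows = range(n_taps, N)
Phi = np.array([u[t - n_taps:t][::-1] for t in rows])
g, *_ = np.linalg.lstsq(Phi, eps[n_taps:], rcond=None)

res = eps[n_taps:] - Phi @ g
noise_var = res @ res / (len(res) - n_taps)
std = np.sqrt(noise_var * np.diag(np.linalg.inv(Phi.T @ Phi)))
for k, (gk, sk) in enumerate(zip(g, std), start=1):
    mark = "  <-- outside +/- 3 std" if abs(gk) > 3 * sk else ""
    print(f"g_{k:<2d} = {gk:+.3f} (std {sk:.3f}){mark}")
```

The estimated impulse response with its uncertainty band plays the role of the impulse-response plot of the model-error model: coefficients well outside the band indicate dynamics from u to y that the nominal model has not captured.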

Residual Analysis Impulse response and frequency response of the model-error model, estimated as a 10th-order ARX model.

Residual Analysis Comparison between the amplitude Bode plots of the model and the true system. Solid line: model m; dashed line: true system. We see that the model error information from validation data is quite reliable.