Model Validation

Introduction Model validation is a crucial step for evaluating the quality of estimated models. If validation is unsuccessful, previous steps in the system identification workflow must be revisited; for example, the model structure can be changed.

Motivation So far, we have validated and selected models by examining plots of predicted vs. measured outputs. In this section we will learn about additional, statistically based validation tests, namely correlation tests on the residual error.

(1) Whiteness of the residual error The residual error is the difference between the measured output and the output estimated by the model: ε(k) = y(k) − ŷ(k). The residual error thus represents the information in the measured output which cannot be explained by the model. For a good model, the residuals should be a sequence of random numbers caused by experimental errors; any trend, such as a gradual rise or a slight curve in the plot of the residuals, invalidates the results. Therefore, the prediction errors ε(k) should be zero-mean white noise.
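As a minimal sketch of this first check, assuming hypothetical vectors y (measured output) and yhat (model-predicted output), the residuals can be computed and inspected for trends:

% y and yhat are hypothetical placeholders for the measured output and the
% output predicted by the model.
res = y - yhat;            % residual error: res(k) = y(k) - yhat(k)
plot(res), grid on         % look for trends, drift or curvature
title('Residuals')
mean(res)                  % should be close to zero for a good model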

Autocorrelation function of ε(k) The autocorrelation function, r_ε(τ) = E{ ε(k+τ) ε(k) }, can be estimated from data as:

r̂_ε(τ) = (1/N) Σ_{k=1}^{N−τ} ε(k+τ) ε(k)

If the residual ε(k) is purely random (zero-mean white noise), then r_ε(τ) = 0 for τ ≠ 0, while r_ε(0) is the variance of the white noise. Normalization by the (estimated) variance gives:

x(τ) = r̂_ε(τ) / r̂_ε(0)
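A plain MATLAB sketch of these estimates, assuming res holds the residual vector from the previous sketch and choosing an arbitrary maximum lag M:

N = length(res);
M = 25;                                              % number of lags to examine (arbitrary)
r = zeros(M+1, 1);
for tau = 0:M
    r(tau+1) = sum(res(1+tau:N) .* res(1:N-tau)) / N;   % r_hat_eps(tau)
end
x = r / r(1);                                        % normalized autocorrelation, x(0) = 1
stem(0:M, x)                                         % visual check of the autocorrelation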

Testing the Whiteness of ε(k) The condition x(τ) = 0 for τ ≠ 0 almost never holds exactly for data taken from an actual experiment; instead, the autocorrelation function usually exhibits small nonzero values for τ ≠ 0. However, it can be shown that, for a large number of data points N, x(τ) is distributed according to a Gaussian distribution N(0, 1/N). Therefore, if

|x(τ)| ≤ 2.58 / √N   for τ ≠ 0,

then we can assume, with 99% confidence, that x(τ), and hence r_ε(τ), is zero.
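Continuing that sketch (reusing N, M and x from above), the 99% whiteness test reduces to a comparison against 2.58/√N:

bound = 2.58 / sqrt(N);                    % 99% confidence bound for N(0, 1/N)
nviol = sum(abs(x(2:end)) > bound);        % lags tau >= 1 outside the bound
if nviol == 0
    disp('Residuals pass the whiteness test at the 99% level.')
else
    fprintf('Whiteness test fails at %d of %d lags.\n', nviol, M)
end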

(2) Independence of the residuals and past inputs If the model is accurate, it should entirely explain the influence of the inputs u(k) on future outputs y(k+τ). Therefore, the residual errors ε(k+τ) are influenced only by the disturbances and are independent of past inputs u(k).

Cross-correlation function of ε(k+τ) and u(k) If the residuals ε(k+τ) are uncorrelated with the past input u(k), the cross-correlation function

r_εu(τ) = E{ ε(k+τ) u(k) }

should be zero for τ > 0. Estimation from data and normalization:

r̂_εu(τ) = (1/N) Σ_{k=1}^{N−τ} ε(k+τ) u(k),   x(τ) = r̂_εu(τ) / √( r̂_ε(0) r̂_u(0) )
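A corresponding plain MATLAB sketch, assuming res and u are column vectors of the residuals and the input, both of length N:

N   = length(res);
M   = 25;                                             % number of lags to examine
re0 = sum(res.^2) / N;                                % r_hat_eps(0)
ru0 = sum(u.^2)   / N;                                % r_hat_u(0)
xeu = zeros(M, 1);
for tau = 1:M
    reu      = sum(res(1+tau:N) .* u(1:N-tau)) / N;   % r_hat_eps_u(tau)
    xeu(tau) = reu / sqrt(re0 * ru0);                 % normalized cross-correlation
end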

Independence test As before, for large N, x(τ) is distributed according to a Gaussian distribution N(0, 1/N). Therefore, if

|x(τ)| ≤ 2.58 / √N   for τ > 0,

then we can assume, with 99% confidence, that x(τ), and hence r_εu(τ), is zero.
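The test itself has the same form as the whiteness test; continuing the sketch above (reusing xeu, M and N):

bound = 2.58 / sqrt(N);                    % 99% confidence bound, as in the whiteness test
if all(abs(xeu) <= bound)
    disp('Residuals are independent of past inputs at the 99% level.')
else
    fprintf('Independence test fails at %d of %d lags.\n', sum(abs(xeu) > bound), M)
end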

Example Assume that the real system to be identified is of the output-error (OE) form with order 3: y(k) = [B(q)/F(q)] u(k) + e(k), where B(q) and F(q) are polynomials of order 3 in q⁻¹ and e(k) is white measurement noise. The identification (id) and validation (val) data sets are plotted with: plot(id); plot(val);
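One possible way to reproduce such an experiment with the System Identification Toolbox; the particular coefficients, noise level and data lengths below are illustrative assumptions, since the slide only specifies the model class and order:

% Hypothetical 3rd-order OE system (coefficients chosen for illustration only).
B  = [0 0.5 0.25 0.1];                 % B(q): delay nk = 1, order nb = 3
F  = [1 -1.5 0.74 -0.12];              % F(q): stable, order nf = 3
m0 = idpoly(1, B, 1, 1, F, 0.01, 1);   % true system y = (B/F) u + e, Ts = 1

u  = idinput(1000, 'prbs');            % PRBS excitation
y  = sim(m0, u) + 0.1*randn(1000, 1);  % noise-free simulation plus white output noise
z  = iddata(y, u, 1);

id  = z(1:500);                        % identification data
val = z(501:1000);                     % validation data
plot(id); figure; plot(val);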

Example, continued (ARX model) First, we try an ARX model: mARX = arx(id, [3, 3, 1]); resid(mARX, id); The model fails the whiteness test. This is because the system is not within the ARX model class, leading to an inaccurate model. Therefore, the model is rejected.
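In context, the ARX attempt might look as follows; id is the identification data set, and present (a toolbox function not shown in the slides) is added here to inspect the estimated polynomials:

mARX = arx(id, [3 3 1]);     % ARX model with na = nb = 3 and one-sample delay
present(mARX);               % inspect estimated polynomials and their uncertainties
resid(mARX, id);             % residual analysis: autocorrelation of the residuals and
                             % cross-correlation with u, plotted with confidence bounds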

Example, continued (ARX model) Simulating the ARX model on the validation data confirms that the model is poor.
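One way to make this comparison is the toolbox function compare (not shown in the slides), which plots the measured output against the simulated model output and reports the fit; a sketch, assuming the val data set from above:

compare(val, mARX);    % measured vs. simulated output on validation data;
                       % a low fit value confirms the poor ARX model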

Example, continued (OE model) Now, let us try an OE model: mOE = oe(id, [3, 3, 1]); resid(mOE, id); As expected, the OE model passes both tests, because the OE model class contains the real system. The model is therefore validated.
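The OE attempt in the same style; the comment about the true system refers to the illustrative m0 assumed in the data-generation sketch:

mOE = oe(id, [3 3 1]);     % OE model with nb = nf = 3 and one-sample delay
resid(mOE, id);            % both correlation tests should now stay inside the bounds
present(mOE);              % estimated B(q), F(q); close to the illustrative m0.B and m0.F
                           % when the data are informative enough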

Example, continued (OE model) Simulating the OE model on the validation data confirms the good quality of the model.
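And the corresponding comparison on the validation data, again using compare as an assumed stand-in for the plot on the slide:

compare(val, mOE);     % the simulated OE output should track the measured
                       % validation output closely (high fit value)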