Generic Approaches to Model Validation Presented at Growth Model User’s Group August 10, 2005 David K. Walters.

Presentation transcript:

2 Phases in a Modeling Project
- Topology
- Data Collection
- Algebraic Specification
- Arithmetic Specification
- Software Implementation
- Model Identification
- Component Model: Equation Forms
- Model Fitting with Data

3 Where does validation fit in?
- Ideal case: as an integrated component of the model development process, acting as a feedback mechanism.
- Reality:
  - Best case: done once by the modeler using a subset of the modeling data (or other, related techniques), then up to the user. Feedback is up to the persistence of the user and the receptiveness of the modeler. Not integrated...
  - Probable case: done once by the modeler using a subset of the modeling data (or other, related techniques), then up to the user. Modeler takes a new job, moves on...

4 Of what benefit is validation?
- Increase comfort
  - The user better understands the situations in which the model can be reliably applied and those in which it cannot.
- Model improvements: facilitates calibration
- To make a model applicable to a new situation
  - different treatments/regions/situations
  - different "scale"
  - over/under runs (utilization issues)
- To weight model output with other data for the purpose of decision making (weighting usually requires some estimate of variability)

5 Validating the overall appropriateness
- Is the model flexible enough to reproduce desired management alternatives?
- Does it provide sufficient detail for decision-making?
- How efficient is the model in meeting these goals?

"Everything should be made as simple as possible, but not simpler" -- Albert Einstein

Model Type & Resolution
- Whole Stand Models
- Individual Tree Models
- Process Models
- Distance Dependent vs. Distance Independent

6 Validating a Model: Check the data
- Evaluation Data
- Application Data
- Modeling Data

Differences in Data Populations
- Spatially
- Temporally
- Culturally

Some research data may be collected with such a high degree of caution that the resultant models will tend to overestimate growth and yield.

7 Validating the Component Models
- Model component specification
  - Equation "forms": reasonable, consistent with established theory and/or the user's expectations.
  - Statistical, or other, "fitting" of the component equations

8 Validating the Implementation: the computer software
- Software implementation
  - Bugs
  - Adequacy of outputs / interface
  - Efficiency

9 A couple of random thoughts: what else might make a model invalid?
- Homogeneity: very few models operationally project plots. Most project stands. Stands are assumed to be homogeneous with respect to exogenous or predictor variables. But... they are really full of holes.

10 ...misuse
- Matching data inputs with model specifications
  - Site index
  - DBH thresholds: all versus "merchantable" or other subsets of trees
  - Others

11 Statistics
So, we get to the point of wishing to conduct a data-based validation of some kind. What do we compare?
- Real data vs. predicted data
  - Tree variables: DBH, Height, Crown, Volume
  - Stand variables: QMD, TPA, BA, Volume
- If using Volume when comparing multiple models... make sure the volume equations are identical.
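As a minimal sketch of how the stand variables above can be derived from a tree list before any comparison, the following assumes English units (DBH in inches, basal area in square feet per acre) and a hypothetical tree-list layout of (DBH, trees-per-acre) pairs. The `min_dbh` parameter is an illustrative name for a DBH threshold, showing how "all trees" and a "merchantable" subset give different stand values for the same plot.

```python
import math

# Basal area (sq ft) of one tree from DBH in inches: (pi / 4) * (DBH / 12)^2
BA_CONSTANT = math.pi / (4 * 144)  # about 0.005454


def stand_summary(trees, min_dbh=0.0):
    """Summarize a tree list into per-acre stand variables.

    trees: iterable of (dbh_inches, trees_per_acre) pairs (hypothetical layout).
    min_dbh: DBH threshold; trees below it are excluded (assumed convention).
    """
    kept = [(dbh, tpa) for dbh, tpa in trees if dbh >= min_dbh]
    tpa = sum(t for _, t in kept)
    ba = sum(BA_CONSTANT * dbh ** 2 * t for dbh, t in kept)
    # QMD is the diameter of the tree of average basal area.
    qmd = math.sqrt(ba / (BA_CONSTANT * tpa)) if tpa > 0 else 0.0
    return {"tpa": tpa, "ba": ba, "qmd": qmd}


# Illustrative tree list, not real data.
trees = [(4.0, 100.0), (8.0, 50.0), (12.0, 25.0)]
all_trees = stand_summary(trees)             # every tree
merch = stand_summary(trees, min_dbh=7.0)    # hypothetical merchantable threshold
print(all_trees)
print(merch)
```

Observed and predicted values should be summarized with the same threshold; otherwise the comparison mixes a data mismatch with model error.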

12 What statistics to use?
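The summary slide later speaks of comparing predictions with data in terms of bias and accuracy. One common way to quantify these, offered here only as an illustrative sketch and not necessarily the statistics shown on this slide, is mean residual (bias), mean absolute error, and root mean squared error. The function name and the sign convention (residual = observed minus predicted) are assumptions.

```python
import math


def validation_stats(observed, predicted):
    """Return bias, percent bias, MAE and RMSE for paired values.

    Assumed convention: residual = observed - predicted, so a negative
    bias means the model over-predicts on average.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal length")
    residuals = [o - p for o, p in zip(observed, predicted)]
    n = len(residuals)
    bias = sum(residuals) / n
    mae = sum(abs(r) for r in residuals) / n
    rmse = math.sqrt(sum(r * r for r in residuals) / n)
    mean_obs = sum(observed) / n
    return {
        "bias": bias,
        "bias_pct": 100.0 * bias / mean_obs,
        "mae": mae,
        "rmse": rmse,
    }


# Illustrative numbers only, e.g. QMD in inches at the second measurement.
observed = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 18.0]
stats = validation_stats(observed, predicted)
print(stats)
```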

13 The overall project: two approaches
- Case 1: We have repeat measurements (growth data)
  - Using the observed inputs, run the real data through the model.
  - Look at time 2 (or 3, etc.) predicted versus real.
  - Calculate statistics.
- Case 2: No repeat measurements (simulation study)
  - Identify a matrix of input variables (site, density, stocking, etc.) that covers the range of interest.
  - Run the model for each row of the input matrix.
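The simulation-study case can be sketched as follows. The input levels are made-up values, and `toy_growth_model` is a placeholder standing in for whatever growth model is being validated; its formula is hypothetical and has no biological meaning.

```python
from itertools import product


def toy_growth_model(site_index, tpa, years):
    """Placeholder for a real growth model (hypothetical formula).

    Returns a made-up 'volume' so the loop below has something to run.
    """
    return site_index * years * 0.5 * (tpa / (tpa + 200.0))


# Levels chosen to cover the range of interest (illustrative values only).
site_indices = [90, 110, 130]
densities = [200, 400, 600]   # trees per acre
horizons = [20, 40]           # projection length in years

# Each row of the input matrix is one combination of the levels.
input_matrix = list(product(site_indices, densities, horizons))

# Run the model once per row and keep inputs alongside the output.
results = [
    {"site_index": si, "tpa": tpa, "years": yrs,
     "volume": toy_growth_model(si, tpa, yrs)}
    for si, tpa, yrs in input_matrix
]

for row in results[:3]:
    print(row)
```

A full factorial grid grows quickly with the number of variables, so in practice the levels may need to be limited to combinations that actually occur in the population of interest.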

14 Validation: Patterns and Trends
- In either case, you will want to look for trends: how the predictions (or residuals, if you have real data) change. Examples:
  - MAI over time
  - TPA over time
  - Results vs. predictor variables (site, treatment, density)
- How do long-term predictions compare to "laws"?
  - Self-thinning, etc.
- How does the model prediction compare to other models?
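One simple way to look for a trend in residuals against a predictor variable is to fit a straight line and inspect the slope. The sketch below uses ordinary least squares written out by hand; the function name and the data are illustrative assumptions, and a slope near zero is only a rough indication that the residuals do not drift with that predictor.

```python
def ols_slope(x, y):
    """Least-squares slope of y on x (simple linear regression)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has no variation")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


# Illustrative residuals (observed - predicted) by site index, not real data.
site_index = [80, 100, 120, 140]
residuals = [-2.0, -1.0, 1.0, 2.0]

slope = ols_slope(site_index, residuals)
print(f"residual trend per unit of site index: {slope:.3f}")
```

In this made-up example the positive slope would suggest over-prediction on poor sites and under-prediction on good sites, assuming residuals are defined as observed minus predicted.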

15 In Summary
- Identify the alternative "models"; establish a frame of reference.
- Examine the big picture.
- Look at the sample used in the model calibration, the presumed population, and a sample of "your" population.
- Identify the key component models. Compare predictions with data (bias and accuracy). Examine these for trends against appropriate factors.
- Look at the overall model output... the computer code. Are there errors?
- Evaluate output with data (volume per acre, aggregated variables).

16 Final Points Remember, there is always an alternative model. When evaluating a model, give careful thought to the alternative. How well a model performs in relation to the alternative is generally the most relevant question. Validity is relative, as are other things.
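Performance relative to an alternative can be expressed as a single number. The sketch below uses one possible measure, 1 minus the ratio of mean squared errors, which is an assumption of this example and not something the presentation prescribes; the "alternative" here is a deliberately naive constant prediction.

```python
def mse(observed, predicted):
    """Mean squared error of paired values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def relative_skill(observed, candidate, alternative):
    """1 - MSE(candidate) / MSE(alternative).

    Positive: candidate beats the alternative; zero: no better;
    negative: the alternative is the better choice.
    """
    alt_mse = mse(observed, alternative)
    if alt_mse == 0:
        raise ValueError("alternative model has zero error; ratio undefined")
    return 1.0 - mse(observed, candidate) / alt_mse


# Illustrative values only.
observed = [10.0, 12.0, 14.0, 16.0]
candidate = [11.0, 12.0, 13.0, 16.0]      # model under evaluation
alternative = [12.0, 12.0, 12.0, 12.0]    # naive alternative: a constant

skill = relative_skill(observed, candidate, alternative)
print(f"skill relative to alternative: {skill:.3f}")
```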

17 All you need in this life is ignorance and confidence -- and then success is sure. --Mark Twain