1 Model Validation Outlined by Forrester and Senge
George P. Richardson
Rockefeller College of Public Affairs and Policy, University at Albany - State University of New York

2 What do we mean by 'validation'?
No model has ever been or ever will be thoroughly validated. …'Useful,' 'illuminating,' or 'inspiring confidence' are more apt descriptors applying to models than 'valid' (Greenberger et al. 1976).
Validation is a process of establishing confidence in the soundness and usefulness of a model (Forrester 1973; Forrester and Senge 1980).

3 The classic questions
Not 'Is the model valid?' but:
Is the model suitable for its purposes and the problem it addresses?
Is the model consistent with the slice of reality it tries to capture?
(Richardson & Pugh 1981)

4 The system dynamics modeling process
[Figure: the system dynamics modeling process, adapted from Saeed 1992]

5 Processes focusing on system structure

6 Processes focusing on system behavior

7 Two kinds of validating processes

8 The classic tests
Testing SUITABILITY for PURPOSES
  Focusing on STRUCTURE: Dimensional consistency; Extreme conditions; Boundary adequacy
  Focusing on BEHAVIOR: Parameter insensitivity; Structure insensitivity
Testing CONSISTENCY with REALITY
  Focusing on STRUCTURE: Face validity; Parameter values
  Focusing on BEHAVIOR: Replication of behavior; Surprise behavior; Statistical tests
Contributing to UTILITY & EFFECTIVENESS
  Focusing on STRUCTURE: Appropriateness for audience
  Focusing on BEHAVIOR: Counterintuitive behavior; Generation of insights
(Forrester 1973; Forrester & Senge 1980; Richardson and Pugh 1981)

9 Tests for Building Confidence in System Dynamics Models
J. W. Forrester & P. M. Senge, TIMS Studies in the Management Sciences 14 (1980)

10 Structure: Structure-Verification Test
Verifying structure means comparing the structure of a model directly with the structure of the real system that the model represents.
To pass the structure-verification test, the model structure must not contradict knowledge about the structure of the real system.
Structure verification may include review of model assumptions by persons highly knowledgeable about corresponding parts of the real system.
Verifying that model structure exists in the real system is easier and takes less skill than other tests. Many structures pass the structure-verification test; it is easier to verify that a model structure is found in the real system than to establish that the most relevant structure for the purpose of the model has been chosen from the real system.
Criticisms which ask for more of the real-life structure in the model belong to the boundary-adequacy test.

11 Structure: Parameter-Verification Test
Parameter verification means comparing model parameters [constants] to knowledge of the real system to determine if parameters correspond conceptually and numerically to real life.
Both tests [structure-verification and parameter-verification] spring from the same objective: that system dynamics models should strive to describe real decision-making processes.
In a model addressed to short-term issues, certain concepts can be considered constants (parameters) that for a longer-term view must be treated as variables. Therefore, structure verification, in the broadest sense, can be thought of as including parameter verification.

12 Structure: Extreme Conditions Test
Much knowledge about real systems relates to consequences of extreme conditions. If knowledge about extreme conditions is incorporated, the result is almost always an improved model in the normal operating region.
Structure in a system dynamics model should permit extreme combinations of levels (state variables) in the system being represented.
A model should be questioned if the extreme-conditions test is not met. It is not an acceptable counterargument to assert that particular extreme conditions do not occur in real life and should not occur in the model; the nonlinearities introduced by approaches to extreme conditions can have important effects in normal operating ranges.
To make the extreme-conditions test, one must examine each rate equation (policy) in a model, trace it back through any auxiliary equations to the levels (state variables) on which the rate depends, and consider the implications of imaginary maximum and minimum (minus infinity, zero, plus infinity) values of each state variable and combinations of state variables to determine the plausibility of the resulting rate equation.
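
As an illustration (not part of the lecture), the probing can be mechanized. The Python sketch below defines a hypothetical shipment-rate equation for a simple inventory model and evaluates it at extreme values of the levels it depends on, checking that the implied rate stays plausible; the equation and all numbers are invented.

```python
# Minimal sketch of an extreme-conditions probe; the rate equation is hypothetical.

def shipment_rate(inventory, desired_shipments, min_fulfillment_time=1.0):
    """Shipments are limited by what current inventory can physically support."""
    max_possible = max(inventory, 0.0) / min_fulfillment_time   # units per week
    return min(desired_shipments, max_possible)

# Extreme combinations of the levels the rate depends on: (inventory, desired shipments).
extremes = [(0.0, 100.0), (1e9, 100.0), (50.0, 0.0), (50.0, 1e9)]
for inv, want in extremes:
    rate = shipment_rate(inv, want)
    # Plausibility checks: never ship a negative amount, never ship from an empty warehouse.
    assert rate >= 0.0, f"negative shipments at inventory={inv}"
    assert not (inv == 0.0 and rate > 0.0), "shipping from an empty warehouse"
    print(f"inventory={inv:>12}, desired={want:>12} -> shipments={rate}")
```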

13 Structure: Boundary-Adequacy Test
The boundary-adequacy (structure) test considers structural relationships necessary to satisfy a model's purpose.
The boundary-adequacy (structure) test involves developing a convincing hypothesis relating proposed model structure to a particular issue addressed by the model. [Explanatory example: ineffectiveness of job-training programs in reversing urban decay]
The boundary-adequacy test requires that an evaluator be able to unify criticisms of model boundary with criticisms of model purpose. [Explanatory example: criticisms of World Dynamics for failing to distinguish developed from underdeveloped countries]

14 Structure: Dimensional-Consistency Test
The dimensional-consistency test checks each equation in the model for dimensional consistency of its variables and constants.
The test is more powerful when applied in conjunction with the parameter-verification test. Failure to pass the dimensional-consistency check, or satisfying dimensional consistency by including parameters with little or no meaning as independent structural components, often reveals faulty model structure.
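
To illustrate (this is not from the lecture), a units-aware library can mechanize the check. The sketch below assumes the third-party Python package pint and a made-up inventory-adjustment equation whose result should reduce to widgets per week.

```python
# Minimal sketch of a dimensional-consistency check using the `pint` unit library.
# The equation and its numbers are hypothetical.
import pint

ureg = pint.UnitRegistry()
ureg.define("widget = [widget]")      # declare a custom base dimension for the example

inventory         = 500 * ureg.widget
desired_inventory = 800 * ureg.widget
adjustment_time   = 4 * ureg.week

# Rate equation under test: close the inventory gap over the adjustment time.
inventory_correction = (desired_inventory - inventory) / adjustment_time

# .to() raises a DimensionalityError if the equation does not reduce to widget/week,
# which is exactly the failure the dimensional-consistency test is meant to catch.
print(inventory_correction.to("widget / week"))   # expected: 75.0 widget / week
```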

15 Rockefeller College of Public Affairs and Policy University at Albany State University of New York Behavior: Behavior-Reproduction Tests The symptom-generation test examines whether or not a model recreates the symptoms of the difficulty that motivated the construction of the model. …Unless one can show how internal policies and structure cause the symptoms, one is in a poor position to alter those causes. The frequency-generation and relative phasing tests focus on periodicities of fuctuation and phase relationships between variables. The multiple-mode test considers whether or not a model is able to generate more than one mode of observed behavior. [Explanatory example: Mass (1975) model of the economy generates 3-7 year and roughly 18 year cycles; shift in Urban Dynamics from low unemployment and tight housing to high unemployment and excess housing] It is important that a model pass the behavior-reproduction tests without the aid of exogenous time-series inputs driving the model in a predetermined way. Unless the model shows how internal policies generate observed behavior, the model fails to provide a persuasive basis for improving behavior.
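
As a hedged sketch of the frequency-generation idea (illustrative Python; the two-stock inventory structure, its parameters, and the "observed" period are all invented), one can simulate the model, estimate its oscillation period from zero crossings, and compare it with the period seen in the data.

```python
# Minimal sketch: does the model's endogenously generated period match the reference mode?

def simulate(adjustment_time=2.0, acquisition_delay=4.0, dt=0.05, horizon=200.0):
    inventory, supply_line = -100.0, 0.0      # deviations from equilibrium; start with a shortfall
    times, values = [], []
    t = 0.0
    while t < horizon:
        order_rate   = -inventory / adjustment_time       # policy: close the inventory gap
        arrival_rate = supply_line / acquisition_delay    # deliveries lag behind orders
        supply_line += (order_rate - arrival_rate) * dt
        inventory   += arrival_rate * dt
        times.append(t); values.append(inventory)
        t += dt
    return times, values

def estimated_period(times, values):
    # Average spacing between successive upward zero crossings of the inventory deviation.
    crossings = [times[i] for i in range(1, len(values)) if values[i - 1] < 0.0 <= values[i]]
    gaps = [b - a for a, b in zip(crossings, crossings[1:])]
    return sum(gaps) / len(gaps)

observed_period_weeks = 19.0                  # hypothetical reference mode taken from data
model_period = estimated_period(*simulate())
print(f"model period ~ {model_period:.1f} weeks vs observed ~ {observed_period_weeks} weeks")
```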

16 Behavior: Behavior-Prediction Tests
The pattern-prediction test examines whether or not a model generates qualitatively correct patterns of future behavior.
The event-prediction test focuses on a particular change in circumstances, such as a sharp drop in market share or a rapid upsurge in a commodity price, which is found likely on the basis of analysis of model behavior. [Explanatory example: Naill's natural gas model showed price rising precipitously even after a long period of steady or falling prices.]

17 Behavior: Behavior-Anomaly Test
Frequently, the model-builder discovers anomalous features of model behavior which sharply conflict with behavior of the real system. Once the behavioral anomaly is traced to the elements of model structure responsible for the behavior, one often finds obvious flaws in model assumptions.

18 Behavior: Family-Member Test
When possible, a model should be a general model of the class of systems to which the particular member of interest belongs. One should usually be interested in why a particular member of the class differs from the various other members.
An important step in validation is to show that the model takes on the characteristics of different members of the class when policies are altered in accordance with the known decision-making differences between the members. [Explanatory example: Urban Dynamics parameterized to fit New York, Dallas, West Berlin, and Calcutta]

19 Behavior: Surprise-Behavior Test
The better and more comprehensive a system dynamics model, the more likely it is to exhibit behavior that is present in the real system but which has gone unrecognized.
When unexpected behavior appears, the model builder must first understand the causes of the unexpected behavior within the model, then compare the behavior and its causes to those of the real system. When this procedure leads to identification of previously unrecognized behavior in the real system, the surprise-behavior test contributes to confidence in a model's usefulness.

20 Behavior: Extreme-Policy Test
The extreme-policy test involves altering a policy statement (rate equation) in an extreme way and running the model to determine the dynamic consequences. Does the model behave as we might expect for the real system under the same extreme policy circumstances?
The test shows the resilience of a model to major policy changes. The better a model passes a multiplicity of extreme-policy tests, the greater can be confidence over the range of normal policy analysis and design.
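
For example (an illustrative Python sketch; the workforce model, its parameters, and the "hiring freeze" policy are all hypothetical), one extreme policy can simply be switched in and the resulting run checked for plausibility:

```python
# Minimal sketch of an extreme-policy run on a hypothetical workforce model.

def simulate_workforce(hiring_policy, attrition_fraction=0.05, dt=0.25, horizon=100.0):
    workforce, t, history = 1000.0, 0.0, []
    while t < horizon:
        hiring = hiring_policy(workforce)                 # policy statement under test
        attrition = attrition_fraction * workforce
        workforce += (hiring - attrition) * dt
        history.append(workforce)
        t += dt
    return history

normal_policy  = lambda w: 0.05 * w + 10.0    # roughly replace leavers, plus modest growth
extreme_policy = lambda w: 0.0                # extreme test: a total hiring freeze

for name, policy in [("normal", normal_policy), ("hiring freeze", extreme_policy)]:
    trajectory = simulate_workforce(policy)
    # Plausibility check: headcount can shrink, but it can never go negative.
    assert all(w >= 0.0 for w in trajectory), f"workforce went negative under {name}"
    print(f"{name:>13}: final workforce = {trajectory[-1]:.0f}")
```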

21 Behavior: Boundary-Adequacy Test
The boundary-adequacy (behavior) test considers whether or not a model includes the structure necessary to address the issues for which it is designed. The test involves conceptualizing additional structure that might influence the behavior of the model. When conducted as a behavior test, the boundary-adequacy test includes analysis of behavior with and without the additional structure.
Conducting the boundary-adequacy test requires modeling skill, both in conceptualizing model structure and in analyzing the behavior generated by alternative structures.

22 Behavior: Behavior-Sensitivity Test
The behavior-sensitivity test ascertains whether or not plausible shifts in model parameters can cause a model to fail behavior tests previously passed. To the extent that such alternative parameter values are not found, confidence in the model is enhanced.
For example, does there exist another equally plausible set of parameter values that can lead the model to fail to generate observed patterns of behavior, or to behave implausibly under conditions where plausible behavior was previously exhibited?
Finding a sensitive parameter does not necessarily invalidate the model. …The sensitive parameter may be an important input for policy analysis.
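
As a sketch of how the test might be mechanized (Python; the damping-ratio formula applies to the linearized two-stock inventory structure used in the earlier sketch, and the "plausible" ranges are invented), one can sweep parameter combinations and flag any that destroy the oscillatory mode the model previously reproduced:

```python
# Minimal sketch of a behavior-sensitivity sweep over assumed-plausible parameter ranges.
import itertools

def damping_ratio(adjustment_time, acquisition_delay):
    # For the linearized two-stock inventory structure, zeta < 1 means damped oscillation.
    return 0.5 * (adjustment_time / acquisition_delay) ** 0.5

plausible_adjustment_times = [1.0, 2.0, 4.0, 8.0]      # weeks (assumed plausible range)
plausible_delays           = [2.0, 4.0, 6.0]           # weeks (assumed plausible range)

mode_changes = [(at, d)
                for at, d in itertools.product(plausible_adjustment_times, plausible_delays)
                if damping_ratio(at, d) >= 1.0]        # overdamped: the oscillation disappears

print("parameter combinations that destroy the oscillatory mode:", mode_changes or "none")
```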

23 Policy: System-Improvement Test
The system-improvement test considers whether or not policies found beneficial after working with a model, when implemented, also improve real-system behavior.
Although it is the ultimate real-life test, the system-improvement test presents many difficulties. In time, the system-improvement test becomes the decisive test, but only as repeated real-life applications of a model lead overwhelmingly to the conclusion that models pointed the way to improved policies. In the meantime, confidence in the policy implications of models must be achieved through other tests.

24 Policy: Changed-Behavior Test
The changed-behavior test asks if a model correctly predicts how the behavior of the system will change if a governing policy is changed.
Initially, the test can be made by changing policies in a model and verifying the plausibility of the resulting behavioral changes. Alternatively, one can examine the response of a model to policies which have been pursued in the real system, to see if the model responds to a policy change as the real system responded. [Explanatory example: Urban Dynamics]
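
A toy sketch of the comparison (Python; the response function, the subsidy policy, and the "observed" change are all invented numbers): apply a policy change that was actually tried and compare the direction and rough size of the model's response with the response that was observed.

```python
# Minimal sketch of a changed-behavior comparison; everything here is hypothetical.

def occupancy_after_policy(construction_subsidy):
    # Stand-in for "running the model to equilibrium" under a given subsidy level.
    return 0.95 - 0.10 * construction_subsidy

baseline    = occupancy_after_policy(0.0)
with_policy = occupancy_after_policy(0.5)
model_change = with_policy - baseline

observed_change = -0.04      # assumed: occupancy fell ~4 points after the real policy change

same_direction = (model_change < 0) == (observed_change < 0)
print(f"model change {model_change:+.2f} vs observed {observed_change:+.2f}; "
      f"direction {'matches' if same_direction else 'conflicts'}")
```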

25 Policy: Boundary-Adequacy Test
The boundary-adequacy test, when viewed as a test of the policy implications of a model, examines how modifying the model boundary would alter policy recommendations. The boundary-adequacy test requires conceptualization of additional structure and analysis of the effects of the additional structure on model behavior.

26 Policy: Policy-Sensitivity Test
Parameter sensitivity testing can, in addition to revealing the degree of robustness of model behavior, indicate the degree to which policy recommendations might be influenced by uncertainty in parameter values.
If the same policies would be recommended regardless of parameter values within a plausible range, the risk in using the model will be less than if two plausible sets of parameters lead to opposite policy recommendations.
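
A minimal sketch of that check (Python; the cost function, the policy set, and the parameter sets are invented): score candidate policies under several plausible parameter sets and see whether the recommendation flips.

```python
# Minimal sketch of a policy-sensitivity check across assumed-plausible parameter sets.

def performance(policy_strength, adjustment_time, delay):
    # Stand-in cost: stronger corrections shrink the steady-state gap but, with long
    # delays, amplify instability; both terms are purely illustrative. Lower is better.
    gap_cost         = 1.0 / (policy_strength * adjustment_time)
    instability_cost = policy_strength * delay / 10.0
    return gap_cost + instability_cost

policies = {"gentle correction": 0.5, "aggressive correction": 2.0}
plausible_worlds = [(2.0, 2.0), (2.0, 6.0), (4.0, 4.0)]   # (adjustment_time, delay) sets

recommendations = set()
for at, delay in plausible_worlds:
    best = min(policies, key=lambda name: performance(policies[name], at, delay))
    recommendations.add(best)
    print(f"adjustment_time={at}, delay={delay} -> recommend: {best}")

print("recommendation is",
      "robust to parameter uncertainty" if len(recommendations) == 1
      else "sensitive to parameter uncertainty")
```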

27 The Core Tests
Tests of Model Structure: Structure Verification; Parameter Verification; Extreme Conditions; Boundary Adequacy; Dimensional Consistency
Tests of Model Behavior: Behavior Reproduction; Behavior Anomaly; Behavior Sensitivity
Tests of Policy Implications: Changed-Behavior Prediction; Policy Sensitivity

28 References
Forrester, J. W. (1973). Confidence in Models of Social Behavior: With Emphasis on System Dynamics Models. Cambridge, MA: M.I.T. System Dynamics Group.
Forrester, J. W., and P. M. Senge (1980). Tests for Building Confidence in System Dynamics Models. In A. A. Legasto, Jr., et al. (eds.), System Dynamics. TIMS Studies in the Management Sciences 14. New York: North-Holland.
Richardson, G. P., and A. L. Pugh, III (1981). Introduction to System Dynamics Modeling with DYNAMO. Cambridge, MA: Productivity Press. Reprinted by Pegasus Communications.