Ideas on Updating Model Evaluation Methods by John S. Irwin EPA Eight Conference on Air Quality Modeling September 22-23, 2005 US.

Slides:



Advertisements
Similar presentations
1 Proposed Rule: Amendments to the Protocol Gas Verification Program and Minimum Competency Requirements for Air Emission Testing Presented at May 12,
Advertisements

Sta220 - Statistics Mr. Smith Room 310 Class #14.
Brian A. Harris-Kojetin, Ph.D. Statistical and Science Policy
Air Emissions Testing Accreditation and Certification Programs and regulations Peter Westlin OAQPS, SPPD, MPG December 2010.
AP STATISTICS Simulation “Statistics means never having to say you're certain.”
ANALYSIS OF TRACER DATA FROM URBAN DISPERSION EXPERIMENTS Akula Venkatram and Vlad Isakov  Motivation for Field Experiments  Field Studies Conducted.
AP Statistics Section 6.2 A Probability Models
Evaluate Earth Science Models for What They are - Cartoons of Reality John S. Irwin, NOAA Meteorologist EPA OAQPS Air Quality Modeling Group RTP, NC
Kirkpatrick.
CHAPTER 6 Statistical Analysis of Experimental Data
Total Quality Management BUS 3 – 142 Statistics for Variables Week of Mar 14, 2011.
CHAPTER 6 Statistical Analysis of Experimental Data
Sharif University of Technology Session # 7.  Contents  Systems Analysis and Design  Planning the approach  Asking questions and collecting data 
Working With Databases. Questions to Answer about a Database System What functions the marketing database is expected to perform? What is the initial.
1 CHAPTER M4 Cost Behavior © 2007 Pearson Custom Publishing.
Sampling error Error that occurs in data due to the errors inherent in sampling from a population –Population: the group of interest (e.g., all students.
Chapter 8 Experimental Research
Linear Regression.
Development and Testing of Comprehensive Transport and Dispersion Model Evaluation Methodologies Steven Hanna (GMU and HSPH) OFCM Hanna 19 June 2003.
© 2008 McGraw-Hill Higher Education The Statistical Imagination Chapter 9. Hypothesis Testing I: The Six Steps of Statistical Inference.
1 Framework Programme 7 Guide for Applicants
LESSON 8 Booklet Sections: 12 & 13 Systems Analysis.
Near East University Department of English Language Teaching Advanced Research Techniques Correlational Studies Abdalmonam H. Elkorbow.
Copyright © 2014 by The University of Kansas Using the Evaluation System to Answer Key Questions About Your Initiative.
Bell Work. Have you ever watched someone win a game again and again? Do you think that person just has good luck? In many cases, winners have strategies.
Considerations in Public Reporting of the AHRQ QIs Shoshanna Sofaer, Dr.P.H. School of Public Affairs Baruch College.
An Overview of Statistics
EPA’s Strength in Both GEOSS & ESIP; Environmental Health Decisionmaking January 10, 2008 George Gray, Ph.D. Assistant Administrator EPA Office of Research.
EPA’s DRAFT SIP and MODELING GUIDANCE Ian Cohen EPA Region 1 December 8, 2011.
Confidence intervals and hypothesis testing Petter Mostad
CHAPTER 12 Descriptive, Program Evaluation, and Advanced Methods.
Copyright © Allyn & Bacon 2007 Chapter 2 Research Methods This multimedia product and its contents are protected under copyright law. The following are.
James S. Ellis National Atmospheric Release Advisory Center Lawrence Livermore National Laboratory Department of Energy (DOE) OFCM Special Session Atmospheric.
Fitting a Line to Data Lesson 3.3.
Hypothesis Testing An understanding of the method of hypothesis testing is essential for understanding how both the natural and social sciences advance.
Science Process Skills Vocabulary 8/17/15. Predicting Forming an idea of an expected result. Based on inferences.
1 Tracer Experiments Barrio Logan Working Draft Do Not Cite or Quote Tony Servin, P.E. Shuming Du, Ph.D. Vlad Isakov, Ph.D. September 12, 2002 Air Resources.
POSC 202A: Lecture 4 Probability. We begin with the basics of probability and then move on to expected value. Understanding probability is important because.
Please turn off cell phones, pagers, etc. The lecture will begin shortly. There will be a very easy quiz at the end of today’s lecture.
Copyright © 2014 by The University of Kansas Using the Evaluation System to Answer Key Questions About Your Initiative.
1 Modeling Under PSD Air quality models (screening and refined) are used in various ways under the PSD program. Step 1: Significant Impact Analysis –Use.
INSTRUCTIONAL OBEJECTIVES PURPOSE OF IO IO DOMAINS HOW TO WRITE SMART OBJECTIVE 1.
Statistics What is statistics? Where are statistics used?
Inference: Probabilities and Distributions Feb , 2012.
The Impact of Short-term Climate Variations on Predicted Surface Ozone Concentrations in the Eastern US 2020 and beyond Shao-Hang Chu and W.M. Cox US Environmental.
Research Methods Chapter 2.
Some Alternative Approaches Two Samples. Outline Scales of measurement may narrow down our options, but the choice of final analysis is up to the researcher.
HF Modeling Task Mike Williams November 19, 2013.
Acceptance Sampling Systems For Continuous Production and Variables
Ideas on Selection and Approval of Models for EPA Regulatory Use by John S. Irwin EPA Eighth Conference on Air Quality Modeling September.
WHAT IS IT? ANSWER THIS ON A MINI- WHITEBOARD WITHOUT LOOKING IT UP. Psychokenesis (PK)
Regression Analysis: A statistical procedure used to find relations among a set of variables B. Klinkenberg G
EPA’s 8 th Conference on Air Quality Modeling Comments on Model Evaluation By Bob Paine, ENSR (Peer reviewed by the A&WMA AB-3 Committee)
Summary of the Report, “Federal Research and Development Needs and Priorities for Atmospheric Transport and Diffusion Modeling” 22 September 2004 Walter.
Week 2 Normal Distributions, Scatter Plots, Regression and Random.
Linear Regression Essentials Line Basics y = mx + b vs. Definitions
Steven Hanna (GMU and HSPH) OFCM Hanna 19 June 2003
Types of Models Marti Blad PhD PE
WESTAR Recommendations Exceptional Events EPA response
Chapter 5 STATISTICS (PART 4).
SIMPLE LINEAR REGRESSION MODEL
Chapter 1: Introduction to Scientific Thinking
What Is Science? Read the lesson title aloud to students.
What Is Science? Read the lesson title aloud to students.
Reviewing your final digital product
1 FUNCTIONS AND MODELS.
Sampling Distributions
What Is Science? Read the lesson title aloud to students.
Copyright © Allyn & Bacon 2007
What do all of the people above have in common?
Presentation transcript:

Ideas on Updating Model Evaluation Methods by John S. Irwin EPA Eight Conference on Air Quality Modeling September 22-23, 2005 US Environmental Protection Agency, RTP, NC 27711

Where Are We Now EPA has defined the Cox/Tikvart methodology. The BOOT program (developed by Chang and Hanna) implements this procedure. EPA once started to develop software for this, but it never got finalized. Most recently, AERMIC has employed various test data sets. The intensive tracer experiment included: Project Prairie Grass, EPRI Kinciad, EPRI Indianapolis. There are of order ten non-intensive data sets that have also been used. I think EPA would be willing to provide all these data if called upon to do so.. EPA defined a model evaluation procedure, and this does provide a means for ranking skill, but I do not think it provides a means for defining whether differences in skill are significant. The current evaluation methods in use do not account for the fact that air quality transport and diffusion involves stochastic processes that preclude simulation of exactly what is seen. You may want to predict exactly where the transport will be, but it is only possible “on average.” More on this later.

What Might Be Changed? Can we move the process to include more experts to help in devising, testing, drafting and promulgating model performance test methods, test data sets, testing software? The short answer is yes. This is the purpose of voluntary standards development organizations (like ASTM). The more complete answer is that EPA must actively participate to insure EPA’s interests are protected. Test methods could be drafted within ASTM and passed as ASTM standards. This provides a new level of review and scrutiny. It also provides a basis for continual review and upgrading in the future, as ASTM requires all standards to be reapproved at least every five years. Having a standard committee provides a place to develop expertise for succession (retiring experts) and future continuity (“How we got here” history). Working within ASTM, other test methods can be devised, tested and converted to ASTM standards. This could in the future provide tests as models develop new capabilities (e.g., characterization of stochastic effects). Working in a collective of Federal and Private experts, we may find ways to provide test data sets more completely (e.g., co-funding by several agencies, entrepreneurial interest – software providers who provide test data as a perk to attract customers).

Common Worries EPA will be forced to “accept” test methods. –How? EPA will be stating what test methods will be “acceptable” to EPA in the Model Guideline. EPA will lose control. –If EPA does not participate in the voluntary standards development process, then yes, in a sense, EPA loses control. On the flip side, the reason I suggest moving the process out to a communal activity is to enlist help from others and to cultivate a standing committee of experts that will promote better test methods in the future.

What Can We Do Now? At best, models predict the statistical properties of what is to be seen “on average”, whereas observations are individual realizations from imperfectly known ensembles. –We need to get this message out so that it is commonplace knowledge. –We need to devise evaluation methods that account for variability in the observations that models cannot ever explicitly simulate. You can fully explain the distribution of outcomes to occur when a pair of dice are rolled, but you will never predict the exact sequence of outcomes. ASTM D 6589 was recently updated and submitted for reapproval. It outlines the problem as stated above, and provides one example test procedure. The appendix of D could be converted into a Standard Practice. Who wants to participate in this? The Cox/Tikvart method could be drafted as a Standard Practice, but should inform users of what to expect with 1 st -order dispersion models (e.g., ISC, AERMOD). Who wants to participate in this? There are more ideas that people have on candidate test methods, many of which could be converted to ASTM standards, but people have to volunteer their time and talents to create the envisioned “standing committee”.

Composite Analysis for Project Prairie Grass Experiments All the “scatter” about the blue line (Gaussian fit) is what a Gaussian plume model does not characterize. Each experiment is an event out of a population and models describe the behavior of the ensemble mean

▲1.69(0.20) ▲2.17 (0.07) ■2.00 (0.20) ▼2.00 (0.23) ●1.78 (0.35) ●1.53 (0.24)

.Analysis of 1-hr concentration values seen for April 25, 1980 from 1200 to 1300 LST. Results are shown for four arcs. Solid lines with symbols show measured SF6 values. A Gaussian fit is shown for each arc. The resulting plume centerline position, PHIC, and lateral dispersion, Sy, is shown for each arc. The two vertical solid lines illustrates the transport wind direction indicated by the 100-m wind and the average of the PHIC determined individually for each arc. The dotted line (second arc) shows the effect of differences in transport between what is estimated by a wind direction at the release and what actually occurs. The Kincaid tracer experiments involved injecting SF6 into the gas exiting up a power-plant smoke stack. The smoke stack was 183 m tall, and the gases were hotter than the air, rose and leveled off at about 300 m above the ground.