Ensemble forecasts and seasonal precipitation tercile probabilities


Ensemble forecasts and seasonal precipitation tercile probabilities
Michael Tippett, Tony Barnston, Andy Robertson (IRI)

Motivation
PDF fitting: Wilks (2002), NWP; Kharin & Zwiers (2003). Reduce sampling error in tercile probabilities.
2-tier seasonal forecasts: force GCMs with predicted SST, compute tercile probabilities from frequencies, post-process.
Characterize predictability: are changes in probabilities related to the ensemble mean, the ensemble variance, or both?
The general topic of this work is computing tercile probabilities from fitted distributions. There are two motivations. First, seasonal forecasts use model-based tercile probabilities as a key input. Second, shifts in tercile probabilities away from equally likely indicate predictability. Parametric descriptions allow us to associate those shifts with changes in the PDF.

Outline
Quantify sampling error
Analytical estimates: counting, Gaussian fit.
Sub-sampling from a large ensemble: perfect model; ECHAM 4.5 T42, 79 members; DJF precipitation over North America.
Fit parametric forecast PDFs
Gaussian: constant variance vs. variable variance.
Generalized linear model: ensemble mean vs. ensemble mean and variance.
Two main results. First, quantify sampling error analytically and by sub-sampling from a large ensemble. Second, use two fitting methods and ask: is sampling error reduced, and which parameters are useful for characterizing changes in tercile probabilities?

Sampling error
How to measure sampling error?
Compare the sample tercile probability with the true tercile probability. Problem: we don't know the true probability.
Compare two independent samples. The variance of the difference between the two samples is twice the true error variance.
One way of measuring sampling error is to compare the sample probability with the true probability. Another is to compare two independent samples: the variance of their difference is twice the true error variance. Do this in a Monte Carlo fashion, averaging over many samples.
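The two-sample argument can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes a Gaussian forecast PDF with unit variance, tercile boundaries taken from a standard normal climatology, and hypothetical names (`count_probs`, `two_sample_check`).

```python
import math
import numpy as np

# Tercile boundaries of a standard normal (assumed climatology).
LO, HI = -0.4307273, 0.4307273

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def count_probs(ens):
    """Below/normal/above probabilities by counting members (last axis)."""
    below = np.mean(ens < LO, axis=-1)
    above = np.mean(ens > HI, axis=-1)
    return np.stack([below, 1.0 - below - above, above], axis=-1)

def two_sample_check(n_members=24, shift=0.5, n_trials=20000, seed=0):
    """Return (mean squared error vs. truth, mean squared difference of two samples)."""
    rng = np.random.default_rng(seed)
    p_below = phi(LO - shift)
    p_above = 1.0 - phi(HI - shift)
    p_true = np.array([p_below, 1.0 - p_below - p_above, p_above])
    # Two independent ensembles drawn from the same forecast PDF.
    pa = count_probs(rng.normal(shift, 1.0, size=(n_trials, n_members)))
    pb = count_probs(rng.normal(shift, 1.0, size=(n_trials, n_members)))
    err_true = np.mean((pa - p_true) ** 2)
    err_pair = np.mean((pa - pb) ** 2)
    return float(err_true), float(err_pair)

err_true, err_pair = two_sample_check()
print(err_pair / err_true)  # expected to be close to 2
```

The ratio is close to 2 because the two counting estimates are unbiased and independent, so their errors add in variance.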

Sampling error when tercile probabilities are calculated by counting
S = signal-to-noise ratio, N = ensemble size. Converges like (expression not preserved in the transcript).
DJF North America precipitation. Sub-sampling from an ensemble of size 79.
The error depends on sample size and signal-to-noise ratio. The signal-to-noise ratio is model dependent.
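For reference, the textbook binomial result for a counting estimate is given below. It is not necessarily the expression shown on the slide. It assumes N independent members, of which k fall in a category whose true probability is p; the signal-to-noise ratio enters only through p.

```latex
\hat{p} = \frac{k}{N}, \qquad
\operatorname{Var}(\hat{p}) = \frac{p\,(1-p)}{N}, \qquad
\text{rms error} = \sqrt{\frac{p\,(1-p)}{N}}
```

With no signal, p = 1/3, so for N = 24 the rms error is sqrt((2/9)/24), about 0.096. A shift away from equal odds changes p and therefore the error.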

Fitting with a Gaussian
Two types of error: the PDF is not really Gaussian; sampling error.
Fit only the mean, or fit the mean and variance.
Expression for no signal; additional terms when there is a signal.
Error(Gaussian fit N=24) = Error(Counting N=40)
Compare the error of counting with the error of fitting a Gaussian. There is also sampling error when fitting a Gaussian, from two sources. First, the real PDF is not Gaussian; this is problem dependent. Second, there is sampling error in estimating the mean and variance, which can be treated analytically. Conclusion: if the PDF is really Gaussian, fit for better tercile probabilities!
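The comparison between counting and Gaussian fitting can be illustrated with synthetic data. This sketch is an illustration under stated assumptions, not the analytical treatment from the talk: the true PDF is exactly Gaussian with unit variance, the tercile boundary comes from a standard normal climatology, and the function names are hypothetical.

```python
import math
import numpy as np

LO = -0.4307273  # lower tercile boundary of a standard normal (assumed climatology)

def norm_cdf(z):
    """Standard normal CDF, applied elementwise."""
    return 0.5 * (1.0 + np.vectorize(math.erf)(np.asarray(z) / math.sqrt(2.0)))

def rms_errors(n_members=24, shift=0.5, n_trials=20000, seed=0):
    """RMS error of the below-normal probability for three estimators."""
    rng = np.random.default_rng(seed)
    ens = rng.normal(shift, 1.0, size=(n_trials, n_members))
    p_true = float(norm_cdf(LO - shift))
    mu = ens.mean(axis=1)
    sd = ens.std(axis=1, ddof=1)
    p_count = np.mean(ens < LO, axis=1)      # counting
    p_mean = norm_cdf(LO - mu)               # fit mean only, variance assumed known (= 1)
    p_meanvar = norm_cdf((LO - mu) / sd)     # fit mean and variance

    def rms(p):
        return float(np.sqrt(np.mean((p - p_true) ** 2)))

    return rms(p_count), rms(p_mean), rms(p_meanvar)

count_err, mean_err, meanvar_err = rms_errors()
print(count_err, mean_err, meanvar_err)
```

Because the data here are Gaussian by construction, both fitted estimates should have smaller error than counting. With real precipitation the first error source from the slide, a non-Gaussian PDF, would also contribute.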

Generalized Linear Model
Logistic regression.
Regression between tercile probabilities and: the ensemble mean; the ensemble mean and variance.
Why a GLM? The relation is nonlinear because probabilities are bounded, and the errors are not normal.
An empirical approach.
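A logistic regression of a tercile probability on the ensemble mean can be sketched as follows. This is a minimal illustration, assuming a binomial model for the number of members in the above-normal category and a plain Newton (IRLS) solver; the synthetic data and the names `fit_logistic`, `predict` and `demo` are hypothetical, not taken from the talk.

```python
import numpy as np

def fit_logistic(X, k, n, n_iter=25):
    """Binomial logistic regression by Newton iterations (IRLS).

    X: (years, features) predictors without intercept
    k: (years,) number of members in the category
    n: ensemble size
    """
    A = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(A.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(A @ beta)))
        w = n * p * (1.0 - p)                 # binomial variance weights
        grad = A.T @ (k - n * p)              # gradient of the log-likelihood
        hess = A.T @ (A * w[:, None])         # negative Hessian
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def predict(beta, X):
    """Fitted probabilities; always between 0 and 1."""
    A = np.column_stack([np.ones(len(X)), X])
    return 1.0 / (1.0 + np.exp(-(A @ beta)))

def demo(n_years=53, n_members=24, seed=0):
    rng = np.random.default_rng(seed)
    signal = rng.normal(0.0, 0.5, size=n_years)                # year-to-year shift
    ens = signal[:, None] + rng.normal(size=(n_years, n_members))
    hi = np.quantile(ens, 2.0 / 3.0)                           # pooled upper tercile
    k = np.sum(ens > hi, axis=1)                               # members above normal
    X = ens.mean(axis=1)[:, None]                              # predictor: ensemble mean
    beta = fit_logistic(X, k, n_members)
    return beta, predict(beta, X)

beta, p_above = demo()
print(beta)
```

To regress on the ensemble mean and variance, pass a two-column `X`, for example `np.column_stack([ens.mean(axis=1), ens.var(axis=1, ddof=1)])`.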

Results
Randomly select 24 members.
Compute DJF (1951-2003) precipitation tercile probabilities by:
Counting (theory predicts average rms error = 10.9)
Fitting a Gaussian: constant variance; interannually varying variance
GLM: ensemble mean; ensemble mean and variance
Compare with the frequency probability from the independent 55-member ensemble.
Adding more parameters better fits the 24 members but not the 55.
Show results comparing counting, fitting a Gaussian and the GLM. Precipitation is not Gaussian, so use the square root of precipitation. This approach allows us to determine which parameters are useful.
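The verification design (fit on 24 randomly chosen members, verify against the remaining 55) can be sketched as below. The precipitation here is synthetic gamma-distributed data standing in for the model output. The pooled tercile boundary, the restriction to the below-normal category and the function names are assumptions made for this illustration only.

```python
import math
import numpy as np

def norm_cdf(z):
    """Standard normal CDF, applied elementwise."""
    return 0.5 * (1.0 + np.vectorize(math.erf)(np.asarray(z) / math.sqrt(2.0)))

def subsample_scores(precip, n_fit=24, n_draws=100, seed=0):
    """RMS difference from held-out frequencies for counting and Gaussian fitting.

    precip: (years, members) nonnegative seasonal totals.
    """
    rng = np.random.default_rng(seed)
    x = np.sqrt(precip)                       # square-root transform before fitting
    lo = np.quantile(x, 1.0 / 3.0)            # lower tercile, pooled over years and members
    err_count, err_fit = [], []
    for _ in range(n_draws):                  # Monte Carlo over random member selections
        idx = rng.permutation(x.shape[1])
        fit, ver = x[:, idx[:n_fit]], x[:, idx[n_fit:]]
        ref = np.mean(ver < lo, axis=1)       # frequency in the independent members
        p_count = np.mean(fit < lo, axis=1)
        mu = fit.mean(axis=1)
        sd = fit.std(axis=1, ddof=1)          # interannually varying variance
        p_fit = norm_cdf((lo - mu) / sd)
        err_count.append(np.mean((p_count - ref) ** 2))
        err_fit.append(np.mean((p_fit - ref) ** 2))
    return float(np.sqrt(np.mean(err_count))), float(np.sqrt(np.mean(err_fit)))

# Synthetic stand-in: 53 seasons, 79 members, year-dependent gamma scale.
rng = np.random.default_rng(1)
scale = rng.uniform(0.5, 2.0, size=53)
precip = rng.gamma(shape=2.0, scale=scale[:, None], size=(53, 79))
count_err, fit_err = subsample_scores(precip)
print(count_err, fit_err)
```

Counting is unaffected by the square-root transform because it is monotone; the transform matters only for the Gaussian fit. Both scores include the sampling error of the 55-member reference frequencies.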

Counting vs. Gaussian (square-root), N=24. Panels: Below, Above.
9.25 = rms error of sampling with a 40-member ensemble. Gaussian fitting gives a reduction in error that is equivalent to going from 24 to 40 members. Some problems arise because the PDF is not really Gaussian.
Time-varying Gaussian

Counting vs. regression with mean (square-root), N=24. Panels: Below, Above.
GLM results are similar, on average. There is some indication that the GLM is slightly better in regions where the model is more confident and worse elsewhere. Many Monte Carlo draws were used, so the results are robust.
Regression with mean and variance

1996, N=24. Panels: Below, Above; Sample, Regression.
These maps illustrate the character of the tercile probabilities based on fitted distributions. Both estimates use the same 24 members; the fitted one is spatially smoother.

1998, N=24. Panels: Below, Above; Sample, Regression.
With strong SST forcing, there are still strong shifts.

Summary
Used a large ensemble to look at sampling error in perfect-model tercile probabilities.
Error is well described analytically.
Error depends on sample size and S/N ratio.
Parametric fitting reduces tercile probability sampling error.
For Gaussian fitting and the GLM, most of the useful information is associated with the ensemble mean.
Future: include model error in the fitting process. Inflate the variance of the Gaussian.