A Regression Model for Ensemble Forecasts David Unger Climate Prediction Center.

Presentation transcript:

A Regression Model for Ensemble Forecasts David Unger Climate Prediction Center

Summary A linear regression model can be designed specifically for ensemble prediction systems. It is best applied to direct model forecasts of the element in question. Ensemble regression is easy to implement and calibrate. This talk summarizes how it works.

Ensemble Forecasting The ensemble forecasting approach is based on the following beliefs: 1) Individual solutions represent possible outcomes. 2) Each ensemble member is equally likely to best represent the observation. 3) The ensemble set behaves as a randomly selected sample from the expected distribution of observations.

6-10 day mean 500-hPa heights.

THEORY

Conventions

The Ensemble Regression Model Assumptions

A Schematic Drawing of an Ensemble Regression Line (axes: Forecasts, Observations).

An individual case (axes: Forecasts, Potential Observations; labels: Actual obs, 20% chance): 5 potential solutions identified. One actual observation (ovals). Four others that “could” happen. Red indicates the best (closest) member.

Ensemble Regression Principal Assumptions: Statistics are gathered from the one actual observation. The math is applied with the assumption that each ensemble member could also be a solution.

How is it possible to derive?

“Ensemble” Regression (Best Member Regression): The equation is the same as for the ensemble mean. Residual errors are much smaller (usually).

What does it mean in English? Derive a regression equation relating the ensemble mean and the observation. Apply this equation to each individual member. Apply an error estimate to each individual regression-corrected forecast. This looks a lot like the “Gaussian Kernel” approach (kernel dressing).
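The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the toy data, and the kernel width `sigma=1.0` are hypothetical, and the talk's own formula for the error estimate is not reproduced here.

```python
import math

def mean(values):
    return sum(values) / len(values)

def fit_mean_regression(ens_cases, obs):
    """Step 1: least-squares fit of obs ~ a0 + a1 * (ensemble mean) over past cases."""
    means = [mean(members) for members in ens_cases]
    mx, my = mean(means), mean(obs)
    sxx = sum((x - mx) ** 2 for x in means)
    sxy = sum((x - mx) * (y - my) for x, y in zip(means, obs))
    a1 = sxy / sxx
    return my - a1 * mx, a1

def correct_members(members, a0, a1):
    """Step 2: apply the ensemble-mean equation to each individual member."""
    return [a0 + a1 * f for f in members]

def dressed_pdf(y, centers, sigma):
    """Step 3: equal-weight Gaussian kernels of width sigma around each corrected member."""
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return mean([norm * math.exp(-0.5 * ((y - c) / sigma) ** 2) for c in centers])

# Toy training data (made up): three past cases of a 2-member ensemble.
ens_cases = [[1.0, 3.0], [2.0, 6.0], [5.0, 7.0]]
obs = [3.0, 4.0, 5.0]
a0, a1 = fit_mean_regression(ens_cases, obs)

# A new forecast: correct each member, then evaluate the dressed density.
centers = correct_members([0.0, 4.0], a0, a1)
density_at_3 = dressed_pdf(3.0, centers, sigma=1.0)  # sigma is an assumed value
```

The regression is fitted once on ensemble means, while the density is built from the individual corrected members, which is what distinguishes this from a standard regression on the ensemble mean alone.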

Regression with error estimates applied

Derivation The regression is computed from the same kind of “statistics” needed for standard linear regression, with only two additional array elements related to the ensemble size and spread.

Multiple linear regression The theory (applying the ensemble mean equation to individual members) also applies to multiple linear regression PROVIDED all predictors are linear. (Inclusion of binary predictors, interactive predictors, etc. will not be theoretically correct.) Ensemble regression may be easier to apply to the MOS forecasts in a second step. (Derive equations, apply them to get a series of forecasts, and then process those forecasts in a second step.)

CPC PRODUCTS BASED ON ENSEMBLE REGRESSION

NAEFS Combines GEFS and Canadian ensembles. Bias corrected by EMC (6-hourly). 2-meter temperatures processed by CPC into probabilities of above-, near-, and below-normal categories (5-day means).

NAEFS Kernel Density Example (axes: Standardized Temperature (Z), Probability Density).
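One way a kernel density in standardized temperature could be turned into above-, near-, and below-normal probabilities is to integrate the Gaussian mixture between tercile boundaries. The sketch below assumes the climatological distribution of Z is standard normal, so the tercile boundaries sit at about ±0.4307; the function names and the sample member values are hypothetical and are not taken from the CPC product.

```python
import math

def normal_cdf(x, mu, sigma):
    """Cumulative distribution of a Gaussian with mean mu and std sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def tercile_probs(centers, sigma, lower=-0.4307, upper=0.4307):
    """Return (below, near, above) probabilities from an equal-weight kernel mixture.

    centers: regression-corrected members in standardized units (Z).
    lower/upper: assumed tercile boundaries of a standard normal climatology.
    """
    n = len(centers)
    p_below = sum(normal_cdf(lower, c, sigma) for c in centers) / n
    p_up_to_upper = sum(normal_cdf(upper, c, sigma) for c in centers) / n
    return p_below, p_up_to_upper - p_below, 1.0 - p_up_to_upper

# Made-up warm-leaning ensemble in Z units with an assumed kernel width.
below, near, above = tercile_probs([0.5, 1.0, 1.5], sigma=0.5)
```

Because the mixture CDF is just the average of the member CDFs, no numerical integration is needed.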

Long Lead Consolidation: Nino 3.4 SST forecasts; Seasonal Forecast Consolidation.

NAEFS PERFORMANCE 6-10 Day Forecast Reliability; 8-14 Day Forecast Reliability

NAEFS Performance Official Forecast NAEFS Guidance

CALIBRATION

Climate Forecast System Version 2 (CFSv2) 4 runs per day, 1 every 6 hrs. Lagged ensemble: an ensemble formed from model forecasts from different initial times, all valid for the same target period. Hindcast data available only every 5th day from 1982 to present. Example forecast from Jan 26, 2010.

Forecast Situation El Nino conditions were observed in early [...]. The CFS was the first to warn of a La Nina.

Calibration Most models have too little spread (overconfident). This is compensated for by wide kernels. If the mean ensemble spread is too large, adjustments must be made.

Spread Calibration
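The transcript does not preserve the formulas on this slide, so the following is only a sketch of one plausible reading of the K factor. It assumes that K rescales each member's departure from the ensemble mean, and that the kernel variance is whatever remains of the ensemble-mean regression's residual variance after the variance of the corrected, rescaled members is subtracted. Under that assumption there is a largest K beyond which no kernel variance is left. All names and numbers are hypothetical.

```python
import math

def scale_spread(members, k):
    """Rescale member departures from the ensemble mean by k (k=1 leaves them unchanged)."""
    m = sum(members) / len(members)
    return [m + k * (f - m) for f in members]

def kernel_sigma(resid_var, a1, mean_ens_var, k):
    """Kernel width under an assumed variance budget.

    resid_var: residual variance of the regression on the ensemble mean.
    a1: regression slope.
    mean_ens_var: average within-ensemble variance over the training cases.
    """
    var = resid_var - (a1 * k) ** 2 * mean_ens_var
    if var <= 0.0:
        raise ValueError("k is at or beyond the maximum allowed by the variance budget")
    return math.sqrt(var)

def max_k(resid_var, a1, mean_ens_var):
    """Largest k for which the kernel variance stays non-negative."""
    return math.sqrt(resid_var / (a1 ** 2 * mean_ens_var))

# Made-up numbers: an under-dispersive ensemble leaves room for k > 1.
sigma_unaltered = kernel_sigma(resid_var=1.0, a1=1.0, mean_ens_var=0.25, k=1.0)
k_limit = max_k(resid_var=1.0, a1=1.0, mean_ens_var=0.25)
```

In this reading, small K gives tightly clustered members under wide kernels, while K near the limit gives widely spaced members under very narrow kernels.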

CFSv2 Nino 3.4, K=.2 (axes: SST (C), Density). Red: regression on the ensemble mean (standard regression). Green line: individual members. Blue: combined envelope.

K=.4

K=.6

K=.8

Unaltered Ensemble Regression, K=1.0 (axes: SST (C), Probability Density). Red: ensemble mean. Blue: kernel envelope. Green: individual members.

K=1.2

K=1.4

K=1.6 (near max). Labels: Original Fcst., Regression, Modified Fcst.

Spread vs. Skill

Adjustments

An information tidbit Generate N values (N < 20) taken randomly from a Gaussian-distributed variable and label them as the ensemble forecasts. Take another value randomly from that same distribution and label it the observation. Do an ensemble regression over many such cases (but not so many that R=0). Question: What happens?

Answer Maintains a fixed ratio (on the average)

Inflation

Unaltered Ensemble Regression, K=1.0: very close to the maximum K for a 4-member ensemble (axes: SST (C), Probability Density). Red: ensemble mean. Blue: kernel envelope. Green: individual members.

WEIGHTING OF ENSEMBLES

Weighting

Weighting (illustration) Two forecasts: Red = GFS hi-res (ensemble mean standard regression error distribution); Blue = GFS ensembles (GEFS). The “best” forecast in a given case is the one with the highest PDF. GEFS is more likely to have the best member if Obs < 26.8 C; otherwise GFS hi-res is better.

Weighting (Continued) Group ensembles into sets of equal skill (GEFS, Canadian ensembles, ECMWF ensembles, hi-res GFS, hi-res ECMWF, etc.). Pass 1) Calculate PDFs separately. Pass 2) Choose the highest PDF as best. Keep track of percentages. Pass 3) Enter WEIGHTED ensembles into an ensemble regression. Weights = P(Best)/N. An adaptive regression can do this in real time.
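The passes above can be sketched as follows. This is an illustrative reading, not the operational code: it assumes each group's PDF is an equal-weight Gaussian kernel mixture, that "highest PDF" means the highest density evaluated at the verifying observation, and that N is the number of members in the group. The group names, data, and function names are made up.

```python
import math

def group_pdf(y, centers, sigma):
    """Pass 1: density of one ensemble group, as an equal-weight Gaussian mixture."""
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return sum(norm * math.exp(-0.5 * ((y - c) / sigma) ** 2) for c in centers) / len(centers)

def best_group_fractions(cases):
    """Pass 2: fraction of past cases in which each group had the highest density at the obs.

    cases: list of (obs, {group_name: (centers, sigma)}).
    """
    names = list(cases[0][1])
    counts = {name: 0 for name in names}
    for obs, groups in cases:
        best = max(names, key=lambda g: group_pdf(obs, *groups[g]))
        counts[best] += 1
    return {name: counts[name] / len(cases) for name in names}

def member_weights(p_best, group_sizes):
    """Pass 3 input: per-member weight = P(group is best) / number of members in the group."""
    return {name: p_best[name] / group_sizes[name] for name in p_best}

# Toy history: group "A" has 3 members, group "B" has 1 (for example, a hi-res run).
groups = {"A": ([0.0, 0.2, -0.2], 0.5), "B": ([3.0], 0.5)}
cases = [(0.1, groups), (-0.1, groups), (0.3, groups), (2.9, groups)]
p_best = best_group_fractions(cases)
weights = member_weights(p_best, {"A": 3, "B": 1})
```

Summing each group's per-member weight over its members recovers P(Best) for that group, so the weights across all members add up to one.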

Weighted Ensemble CFSv2 Nino 3.4 SSTs, 6-month lead. Ensemble Group 1 – Jan For August 2010, Wgt: .36. Ensemble Group 2 – Jan For August 2010, Wgt: .36. Ensemble Group 4 – Jan For August 2010, Wgt: .28.

Conclusion It is theoretically sound to derive an equation from the ensemble mean and apply it to individual members. An ensemble regression forecast together with its error estimates resembles Gaussian kernel smoothing, except that members are first processed by the ensemble-mean-based regression equation. Additional control can be achieved by adjusting the spread (K-factor). This capability is required for the case where the ensemble spread is too high. Ensemble regression does not require equally weighted members, only an estimate of the probability that each member will be closest. Weighting coefficients can be derived from the PDFs of the component models in relation to the observations. The system delivers reliable probabilistic forecasts that are competitive in skill with manual forecasts (and better in reliability).