Five-year Progress in the Performance of Air Quality Forecast Models: Analysis on Categorical Statistics for the National Air Quality Forecast Capacity.

Five-year Progress in the Performance of Air Quality Forecast Models: Analysis on Categorical Statistics for the National Air Quality Forecast Capacity (NAQFC)
Daiwen Kang (1), Rohit Mathur (2), Brian Eder (2), Kenneth Schere (2), and S. Trivikrama Rao (2)
(1) Computer Science Corporation
(2) Atmospheric Modeling and Analysis Division, NERL/U.S. EPA
8th Annual CMAS Conference, Chapel Hill, NC, October 19–21, 2009

Motivations
Assess the progress in performance improvements for categorical metrics of the NAQFC system for O3 forecasts over the past 5 years.
Identify categorical metrics that characterize AQF performance well for categorical forecasts.
Assess AQI-based categorical performance.
Propose guidelines for AQF categorical evaluations based on the analysis of KF bias-adjusted forecasts and human forecasts.

Traditional Categorical Metrics
Observed Exceedances & Non-Exceedances versus Forecast Exceedances & Non-Exceedances
[Figure: observation-versus-forecast diagram with four cells labeled a, b, c, and d, classified by forecast exceedance (No/Yes) and observed exceedance.]
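The traditional metrics named on the following slides (accuracy, bias, false alarm ratio, hit rate, critical success index) can be sketched from the four cells of the contingency table. Because the mapping of a, b, c, and d to specific cells is not recoverable from this transcript, the sketch below uses descriptive names instead. The function names, the sample values, and the 75 ppb threshold are illustrative assumptions, and the formulas are the standard forecast-verification definitions rather than anything quoted from the slides.

```python
import math


def build_contingency(observed, forecast, threshold):
    """Count hits, false alarms, misses, and correct negatives.

    An "exceedance" here means a value strictly above `threshold`
    (an assumption; the slides do not state the comparison used).
    """
    counts = {"hits": 0, "false_alarms": 0, "misses": 0, "correct_negatives": 0}
    for obs, fc in zip(observed, forecast):
        obs_exc, fc_exc = obs > threshold, fc > threshold
        if fc_exc and obs_exc:
            counts["hits"] += 1
        elif fc_exc:
            counts["false_alarms"] += 1
        elif obs_exc:
            counts["misses"] += 1
        else:
            counts["correct_negatives"] += 1
    return counts


def _ratio(num, den):
    """Return num/den, or NaN when the denominator is zero."""
    return num / den if den else math.nan


def categorical_metrics(c):
    """Standard categorical metrics, returned as fractions (not percent)."""
    h, fa, m, cn = c["hits"], c["false_alarms"], c["misses"], c["correct_negatives"]
    return {
        "accuracy": _ratio(h + cn, h + fa + m + cn),
        "bias": _ratio(h + fa, h + m),   # forecast / observed exceedances
        "far": _ratio(fa, h + fa),       # false alarm ratio
        "hit_rate": _ratio(h, h + m),
        "csi": _ratio(h, h + fa + m),    # critical success index
    }


# Hypothetical daily maximum 8-hour O3 values in ppb.
observed = [60, 80, 90, 70, 50, 85]
forecast = [65, 78, 70, 88, 55, 90]
table = build_contingency(observed, forecast, threshold=75)
metrics = categorical_metrics(table)
```

A bias above 1 means exceedances are forecast more often than they are observed, which is the situation the later slides describe for the NAQFC.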

AQI Definition and Categories

Air Quality Index (AQI) Values    Levels of Health Concern          Colors
(When the AQI is in this range:)  (...air quality conditions are:)  (...as symbolized by this color:)
0 to 50                           Good                              Green (1)
51 to 100                         Moderate                          Yellow (2)
101 to 150                        Unhealthy for Sensitive Groups    Orange (3)
151 to 200                        Unhealthy                         Red (4)
201 to 300                        Very Unhealthy                    Purple (5)
301 to 500                        Hazardous                         Maroon (6)

The index is obtained by linear interpolation between breakpoints:

$$I_p = \frac{I_{Hi} - I_{Lo}}{BP_{Hi} - BP_{Lo}} \left( C_p - BP_{Lo} \right) + I_{Lo}$$

Where:
I_p = the index for pollutant p (O3 in this case)
C_p = the rounded concentration of pollutant p
BP_Hi = the breakpoint that is ≥ C_p
BP_Lo = the breakpoint that is ≤ C_p
I_Hi = the AQI value corresponding to BP_Hi
I_Lo = the AQI value corresponding to BP_Lo
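The interpolation between breakpoints described above can be sketched as follows. The breakpoint table in the code is an illustrative assumption (values resembling an 8-hour O3 table, in ppb); the slides do not list the breakpoints, and the applicable regulatory values should be taken from the EPA AQI technical documentation for the year in question. The function name is likewise hypothetical.

```python
# Each row: (BP_Lo, BP_Hi, I_Lo, I_Hi, category number).
# ILLUSTRATIVE breakpoints only, in ppb; not taken from the slides.
ILLUSTRATIVE_O3_BREAKPOINTS_PPB = [
    (0, 59, 0, 50, 1),        # Good / Green
    (60, 75, 51, 100, 2),     # Moderate / Yellow
    (76, 95, 101, 150, 3),    # Unhealthy for Sensitive Groups / Orange
    (96, 115, 151, 200, 4),   # Unhealthy / Red
    (116, 374, 201, 300, 5),  # Very Unhealthy / Purple
]


def aqi_from_concentration(c_p, breakpoints=ILLUSTRATIVE_O3_BREAKPOINTS_PPB):
    """Return (AQI value, category number) for a concentration.

    Implements I_p = (I_Hi - I_Lo) / (BP_Hi - BP_Lo) * (C_p - BP_Lo) + I_Lo
    using the rounded concentration, then rounds the index to an integer.
    """
    c = round(c_p)
    for bp_lo, bp_hi, i_lo, i_hi, category in breakpoints:
        if bp_lo <= c <= bp_hi:
            index = (i_hi - i_lo) / (bp_hi - bp_lo) * (c - bp_lo) + i_lo
            return round(index), category
    raise ValueError(f"concentration {c_p} is outside the breakpoint table")


aqi_value, aqi_category = aqi_from_concentration(67)
```

The category number returned here corresponds to the numbers in parentheses in the color column (1 for green through 5 for purple); the table in the sketch stops before the Hazardous range.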

AQI-based Metrics Definition
[Equations not preserved in this transcript.]
Here i is the AQI index category (1, 2, 3, 4, 5) or the corresponding color (green, yellow, orange, red, purple). The remaining quantities in the definitions are the number of observed instances and the number of forecast instances in the ith category, the number of correctly forecast instances in the ith category, and the total number of records.

Categorical Stats over 3x domain (1)
[Figure panels: Accuracy (A); Bias (B).]
The accuracy is always high (>90%) because the correctly forecast non-exceedance points dominate. The bias indicates that the model overestimated exceedances in every year.

Categorical Stats over 3x domain (2)
[Figure panels: eFAR; eH.]
False alarm ratios are high in all years, ranging from 70 to 90% on average. Mean hit rates are generally greater than 40% except in 2006, when the meteorological model underwent a major transition from Eta to WRF.

Categorical Stats over 3x domain (3)
[Figure panel: Critical Success Index (CSI).]
The critical success index reflects the combination of false alarm ratio and hit rate. A forecast system can have both high FAR and high H, or both low FAR and low H; either case results in a low CSI. High CSI values indicate moderate FAR and reasonable H.
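The way CSI combines FAR and H can be made explicit with a short worked identity. The notation (h for hits, f for false alarms, m for misses) is introduced here for illustration and follows the standard definitions of the three metrics; the numerical cases are hypothetical, not values from the slides.

```latex
\[
\mathrm{FAR} = \frac{f}{h + f}, \qquad
H = \frac{h}{h + m}, \qquad
\mathrm{CSI} = \frac{h}{h + f + m}
\]
\[
\frac{1}{\mathrm{CSI}} = \frac{h + f + m}{h}
  = \frac{h + f}{h} + \frac{h + m}{h} - 1
  = \frac{1}{1 - \mathrm{FAR}} + \frac{1}{H} - 1
\]
% Hypothetical cases:
%   FAR = 0.8, H = 0.8:  1/CSI = 5 + 1.25 - 1 = 5.25,       CSI ~ 0.19
%   FAR = 0.2, H = 0.2:  1/CSI = 1.25 + 5 - 1 = 5.25,       CSI ~ 0.19
%   FAR = 0.4, H = 0.6:  1/CSI = 1.667 + 1.667 - 1 = 2.333, CSI ~ 0.43
```

The first two cases give the same low CSI even though one system raises many false alarms and the other misses most exceedances, while the third, with a moderate FAR and a reasonable H, scores more than twice as high.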

Metropolitan Statistical Area (MSA)
Local forecasters generally forecast the maximum AQI value that they expect to occur anywhere within an MSA, and then verify this forecast against the maximum monitored value within that area. For example, the Charlotte MSA comprises 8 counties (7 in NC, 1 in SC). There are 8 AQS monitors in those counties (7 in NC, 1 in SC), and the NAQFC represents the MSA with 103 12-km grid cells.
[Figure panels: O3; O3 AQI.]
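The MSA-level pairing described above, where one value per MSA per day is the maximum over all sites, can be sketched as a simple grouping step. The record layout, the function name, and the sample numbers are assumptions made for illustration; the slides do not specify a data format.

```python
def msa_daily_max(records):
    """Reduce site-level pairs to one (max observed, max forecast) per day and MSA.

    `records` is an iterable of (date, msa, site_id, observed, forecast)
    tuples, a hypothetical layout. The observed and forecast maxima are
    taken independently, so they may come from different sites.
    """
    result = {}
    for date, msa, _site_id, observed, forecast in records:
        key = (date, msa)
        if key in result:
            max_obs, max_fc = result[key]
            result[key] = (max(max_obs, observed), max(max_fc, forecast))
        else:
            result[key] = (observed, forecast)
    return result


# Hypothetical site-level daily maximum 8-hour O3 values (ppb).
records = [
    ("2008-07-01", "Charlotte", "site_a", 62, 70),
    ("2008-07-01", "Charlotte", "site_b", 78, 66),
    ("2008-07-01", "Atlanta", "site_c", 81, 85),
    ("2008-07-02", "Charlotte", "site_a", 55, 59),
]
msa_values = msa_daily_max(records)
```

Because the two maxima are taken independently, a forecast exceedance at one site can match an observed exceedance at another site in the same MSA. This is one plausible reason why hit rates rise and false alarm ratios fall in the MSA-level results reported on later slides.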

MSAs used in this research
Atlanta, Charlotte, Dallas, Houston, Washington DC

Kalman Filter Bias-adjustment
A Kalman Filter (KF) was used to bias-adjust the raw model forecasts for the continental U.S. domain during the 2005-2008 summer seasons at all locations where AIRNow monitoring data were available. The categorical performance of both the raw model and the KF forecasts was assessed over:
1. all sites (paired observation-model grid cell) within the domain,
2. sites within all MSAs, and
3. the MSA value (the maximum value out of all the sites within the MSA for each day).
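The slides do not give the filter configuration, so the following is only a minimal sketch of the general idea: a scalar Kalman filter that tracks the forecast bias at one site as a random walk and subtracts the estimate from the next forecast. The function name, the variances `q` and `r`, and the initial state are all hypothetical choices, not the authors' settings.

```python
def kalman_bias_adjust(forecasts, observations, q=0.01, r=1.0):
    """Bias-adjust a forecast series with a scalar random-walk Kalman filter.

    q: assumed variance of the day-to-day change in the true bias.
    r: assumed variance of the noise in each observed forecast error.
    Each day's forecast is corrected using only errors from earlier days.
    """
    bias = 0.0  # current bias estimate (forecast minus observation)
    p = 1.0     # variance of the bias estimate (hypothetical start value)
    adjusted = []
    for fc, obs in zip(forecasts, observations):
        # Issue the adjusted forecast before today's observation is known.
        adjusted.append(fc - bias)
        # Predict step: the bias may drift, so its uncertainty grows.
        p += q
        # Update step: blend in today's observed error.
        gain = p / (p + r)
        bias += gain * ((fc - obs) - bias)
        p *= 1.0 - gain
    return adjusted


# Hypothetical series with a constant +10 ppb forecast bias.
observations = [60.0 + (day % 5) for day in range(40)]
forecasts = [obs + 10.0 for obs in observations]
adjusted = kalman_bias_adjust(forecasts, observations)
```

With a constant bias, the adjusted forecast converges toward the observations over the first few weeks. The ratio of `q` to `r` controls how quickly the filter reacts to a change in bias.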

NAQFC Categorical Performance vs. Human Forecast
[Figure panels: Exceedance Hit Rate; Exceedance False Alarm Rate; series: Human, NAQFC.]
Because the NAQFC is positively biased, it tends to achieve higher exceedance hit rates, but this also results in higher false alarm ratios. The critical success index results were mixed across MSAs, but on average the NAQFC performed better than the human forecasts.

cH for the raw model and KF forecasts at all sites and MSAs
Domain All Sites: all AIRNow sites within the domain are included in the calculation.
MSA All Sites: all AIRNow sites located in one of the MSAs listed earlier.
MSA: the maximum values from both the AIRNow sites and the model forecasts within each MSA are used to generate the statistics.

cCSI for the raw model and KF forecasts at all sites and MSAs

eH for the raw model and KF forecasts at all sites and MSAs
Hit rates are significantly higher when evaluated over MSAs than over individual sites. KF bias adjustment improved the hit rate, especially when the raw model had large systematic biases, as in 2006.

eFAR for the raw model and KF forecasts at all sites and MSAs
False alarm ratios are significantly lower when evaluated over MSAs than over individual sites. The KF bias-adjusted forecasts significantly reduced FAR in every evaluation mode and in every year.

eCSI for the raw model and KF forecasts at all sites and MSAs
eCSI values almost doubled when evaluated over MSAs compared with those evaluated over individual sites. The KF bias-adjusted forecasts had larger eCSI values than the raw model forecasts, especially when evaluated over individual sites.

oH for the raw model and KF forecasts at all sites and MSAs
The overall hit rates were consistent and stable, improving slowly over the years for both the KF and raw model forecasts. KF forecasts always had larger oH values than the raw model. oH values were lower when evaluated over MSAs than over individual sites (though still > 50%) because of overestimation at low AQIs.

oCSI for the raw model and KF forecasts at all sites and MSAs
The overall critical success index (oCSI) is quite consistent and increases over the years. The oCSI values are lower when evaluated over MSAs than over individual sites: because the MSA value is the maximum of all the sites within the MSA, low AQI values are overestimated, which lowers the hit rate in the low AQI categories.

Minimum values of H and CSI during the years 2005-2008 over the continental US domain and MSAs

Stats Type   eH (%)            oH (%)            eCSI (%)          oCSI (%)
             Raw Model   KF    Raw Model   KF    Raw Model   KF    Raw Model   KF
All Sites    14.5        39.3  47.2        61.2  12.2        29.4  52.3        63.2
MSA          47.2        61.2  52.2        59.5  32.0        43.3  35.4        42.3

(1) MSA-based analysis provides a more objective assessment of the practical use of the guidance, consistent with the way local forecasts are typically developed.
(2) Bias adjustment further improves the predictive skill of the system, thereby improving the utility of the forecast products.

Guidelines for AQF models

Stats Type   eH (%)   oH (%)   eCSI (%)   oCSI (%)
All Sites    30       50       20         50
MSA          50       50       30         30

These guideline values lie between the (rounded) minimum values of the raw model and of the KF-adjusted forecasts. They serve (1) as targets for what raw models can realistically achieve through model improvements in the short term, and (2) as a reference level that any AQF model should reach when combined with KF adjustment.

Conclusions
Comparisons indicate that the NAQFC performed at least as well as, if not better than, the human forecasts over MSAs.
The categorical performance of the NAQFC was consistent and stable from 2005 to 2008, with the exception of 2006, when the model underwent significant changes that degraded categorical performance.
Kalman filter bias adjustment improved almost all categorical statistics, especially in 2006, when the raw model was systematically biased.

Conclusions
Hit Rate (H), False Alarm Ratio (FAR), and Critical Success Index (CSI) are the three most appropriate metrics for gauging the categorical performance of an AQF; CSI is the most informative of the three because it reflects the combination of H and FAR.
The AQI-based H and CSI over all sites and MSAs are good indicators of overall performance for categorical forecasts.
Based on the analysis in this study, the following guidelines are proposed: eH >= 30%, eCSI >= 20%, oH and oCSI >= 50% for all sites; eH and oH >= 50%, eCSI and oCSI >= 30% for MSAs.
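As a small illustration of how the proposed guidelines might be applied, the sketch below compares a set of metric values with the thresholds stated above. The thresholds and the sample values (the 2005-2008 minimum values for all sites reported earlier in the presentation) come from the slides; the dictionary layout and function name are hypothetical.

```python
# Guideline thresholds in percent, as proposed in the conclusions.
GUIDELINES = {
    "all_sites": {"eH": 30, "oH": 50, "eCSI": 20, "oCSI": 50},
    "msa": {"eH": 50, "oH": 50, "eCSI": 30, "oCSI": 30},
}


def check_guidelines(metrics, scope):
    """Return {metric name: True/False} for whether each guideline is met.

    `metrics` maps metric names to values in percent; `scope` is
    "all_sites" or "msa".
    """
    thresholds = GUIDELINES[scope]
    return {name: metrics[name] >= limit for name, limit in thresholds.items()}


# Minimum values over 2005-2008 for all sites, from the presentation.
raw_all_sites = {"eH": 14.5, "oH": 47.2, "eCSI": 12.2, "oCSI": 52.3}
kf_all_sites = {"eH": 39.3, "oH": 61.2, "eCSI": 29.4, "oCSI": 63.2}

raw_result = check_guidelines(raw_all_sites, "all_sites")
kf_result = check_guidelines(kf_all_sites, "all_sites")
```

In this check the raw model's minimum values meet only the oCSI guideline for all sites, while the KF-adjusted minimum values meet all four.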

Acknowledgements
The authors would like to thank the NOAA/EPA air quality forecast program and the EPA's AIRNow program for providing forecast and observed O3 data. Thanks also go to Scott Jackson for providing the human forecast data.
Disclaimer
The United States Environmental Protection Agency, through its Office of Research and Development, funded and managed the research described here. It has been subjected to the Agency's administrative review and approved for presentation.
