1 A Probabilistic-Spatial Approach to the Quality Control of Climate Observations. Christopher Daly, Wayne Gibson, Matthew Doggett, Joseph Smith, and George Taylor. Spatial Climate Analysis Service, Oregon State University, Corvallis, Oregon, USA

2 Traditional QC Systems are Categorical and Deterministic
Data subjected to categorical quality checks – designed to uncover mistakes
Validity determined from test results – mistake = flag / toss; no mistake = no flag / keep
Designed to work with human observing systems

3 Alien Electronic Devices are Invading the Climate Observing World! They’re Everywhere!

4 Electronic Sensors and Modern Applications Create Challenges for Traditional QC Systems
Situation: Errors tend to be continuous drift, rather than categorical mistakes → Need: Continuous estimates, rather than categorical tests, of observation validity
Situation: Increasing usage of computer applications that rely on climate observations → Need: Quantitative estimates of observational uncertainty, not just flags

5 More Challenges…
Situation: Range of applications is increasing rapidly, and each has a different tolerance for outliers → Need: Probabilistic information from which a decision to use an obs can be made, not an up-front decision
Situation: Data are often more voluminous and disseminated in a more timely manner → Need: Automated QC methods

6 An Opportunity Advances in climate mapping technology now make it possible to estimate a reasonably accurate “expected value” for an observation based on surrounding stations. Assumption: Spatial consistency is related to observation validity

7 Useful Characteristics for a Next-Generation Climate QC System: continuous, quantitative, probabilistic, automated, spatial

8 PRISM Probabilistic-Spatial QC (PSQC) System for SNOTEL Data Uses climate mapping technology and climate statistics to provide a continuous, quantitative confidence probability for each observation, estimate a replacement value, and provide a confidence interval for that replacement. Start with daily max/min temperature for all SNOTEL sites, period of record Move to precipitation, SWE, soil temperature and moisture Develop automated system for near-real time operation at NRCS

9 Climatological Grid Development
–PRISM must produce a high-quality estimate of temperature at each SNOTEL station each day
–Highest interpolation skill obtained by using a high-quality predictive grid that represents the long-term climatological temperature for that day, rather than a digital elevation grid
–Climatological grid: 0.8 km resolution, 1971-2000
(Figure: 4 km vs. 0.8 km resolution grids)

10 Oregon Annual Precipitation Leveraging Information Content of High-Quality Climatologies to Create New Maps with Fewer Data and Less Effort Climatology used in place of DEM as PRISM predictor grid

11 PRISM Regression of “Weather vs Climate” 20 July 2000 Tmax vs 1971-2000 Mean July Tmax

12 PRISM: Parameter-elevation Regressions on Independent Slopes Model
–Generates gridded estimates of climatic parameters
–Moving-window regression of climate vs. elevation for each grid cell
–Uses nearby station observations
–Spatial climate knowledge base (KBS) weights stations in the regression function by their climatological similarity to the target grid cell
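The moving-window regression above can be sketched as a weighted least-squares fit of daily weather against a predictor grid. This is an illustrative helper, not PRISM code; the station values and KBS-style similarity weights below are hypothetical.

```python
import numpy as np

def weighted_regression(predictor, obs, weights):
    """Weighted least-squares fit of obs = a + b * predictor.
    Solves the normal equations (X^T W X) beta = X^T W y."""
    X = np.column_stack([np.ones_like(predictor), predictor])
    XtW = X.T * weights          # scales each station's column by its weight
    a, b = np.linalg.solve(XtW @ X, XtW @ obs)
    return a, b

# Hypothetical stations: daily Tmax vs. 1971-2000 climatological Tmax (deg C)
clim = np.array([10.0, 12.0, 15.0, 18.0, 20.0])  # climatology (predictor)
tmax = np.array([13.2, 15.1, 18.4, 21.0, 23.1])  # observed daily Tmax
w    = np.array([1.0, 0.8, 1.0, 0.6, 0.9])       # KBS similarity weights
a, b = weighted_regression(clim, tmax, w)
prediction = a + b * 16.0  # estimate at a target cell whose climatology is 16.0
```

Using the long-term climatology as the predictor (slide 10) rather than a DEM means the slope b stays near 1 and the fit mostly captures the day's anomaly pattern.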

13 PRISM: Parameter-elevation Regressions on Independent Slopes Model
PRISM KBS accounts for spatial variations in climate due to:
–Elevation
–Terrain orientation
–Terrain steepness
–Moisture regime
–Coastal proximity
–Inversion layer
–Long-term climate patterns

14 PRISM Moving-Window Regression Function 1961-90 Mean April Precipitation, Qin Ling Mountains, China Weighted linear regression

15 Rain Shadows: 1961-90 Mean Annual Precipitation, Oregon Cascades
Dominant PRISM KBS components: elevation, terrain orientation, terrain steepness, moisture regime
(Map: annual precipitation spans roughly 350 mm/yr to 2500 mm/yr; labeled locations include Portland, Eugene, Sisters, Redmond, Bend, Mt. Hood, Mt. Jefferson, and Three Sisters)

16 1961-90 Mean Annual Precipitation, Cascade Mtns, OR, USA

17

18 Coastal Effects: 1971-00 July Maximum Temperature, Central California Coast
Dominant PRISM KBS components: elevation, coastal proximity, inversion layer
(Map: July Tmax spans roughly 20° to 34°, with preferred marine-air trajectories; labeled locations include Monterey, Santa Cruz, Salinas, Hollister, San Jose, Fremont, Oakland, San Francisco, Stockton, and Sacramento)

19 Inversions: 1971-00 July Minimum Temperature, Northwestern California
Dominant PRISM KBS components: elevation, inversion layer, topographic index, coastal proximity
(Map: July Tmin spans roughly 9° to 17°; labeled locations include Ukiah, Cloverdale, Lakeport, Willits, Clear Lake, and Lake Pillsbury)

20 PRISM PSQC System: Confidence Probability (CP)
Definition of CP: Given the difference between an observation and an expected value (residual), CP is the probability that another observation and expected value from the same time of year would differ by at least as much
Residual distribution: +/- 15 day, +/- 2 year window = 5 yrs, 31 days each (N~155)

21 Confidence Probability Takes into Account Uncertainty in the System
P-value is higher for a given deviation from the mean when Sx is large (low skill)
X = Residual (P-O)
(Figure: residual distributions for low overall skill vs. high overall skill)
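The skill effect can be sketched as a two-sided z-test p-value, assuming the residuals are approximately normal with standard deviation Sx; the residual and sigma values below are hypothetical.

```python
import math

def confidence_probability(residual, sigma):
    """Two-sided p-value of a z-test on the residual (P - O): the chance a
    residual drawn from N(0, sigma^2) lies at least this far from zero."""
    return math.erfc(abs(residual) / (sigma * math.sqrt(2.0)))

# The same 3-degree residual under two levels of overall skill
cp_high_skill = confidence_probability(3.0, sigma=1.5)  # low CP (~0.046)
cp_low_skill  = confidence_probability(3.0, sigma=4.0)  # higher CP (~0.45)
```

A 3-degree miss is damning where PRISM normally predicts within 1.5 degrees, but unremarkable where the regression spread is 4 degrees, which is exactly the behavior the slide describes.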

22 Interpreting Confidence Probability
Continuous values from 0 – 100%
0% = highly spatially inconsistent observation, reflected in a PRISM prediction that is unusually different from the observation
100% = highly consistent observation, reflected in a PRISM prediction that is relatively close to the observation
Guidelines to date:
CP > 30: Use observation as-is
10 < CP < 30: Blend prediction and observation
CP < 10: Use prediction instead of observation
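The guidelines might be applied as follows. The linear ramp across the 10-30% band is an assumed interpolation — the slides only say "blend" without specifying the function — and CP is expressed in percent.

```python
def apply_cp_guidelines(obs, pred, cp):
    """Choose between observation and PRISM prediction using the CP
    guidelines (CP in percent). The linear blend is an assumption;
    the slides do not specify the blending function."""
    if cp > 30.0:
        return obs                    # use observation as-is
    if cp < 10.0:
        return pred                   # use prediction instead of observation
    w = (cp - 10.0) / 20.0            # 0 at CP = 10, 1 at CP = 30
    return w * obs + (1.0 - w) * pred

value = apply_cp_guidelines(obs=12.0, pred=9.0, cp=20.0)  # midpoint blend
```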

23 PRISM PSQC Process – 1. Create Database Records
Goal: Enter daily tmax/tmin observations for all networks into database and prepare data
Current Actions:
1. Ingest daily tmin/tmax observations from SNOTEL, COOP, RAWS, Agrimet, ASOS, and first-order networks.
2. Shift AM COOP observations of tmax to previous day (assumes standard diurnal curve, which does not always apply).
3. Convert units to degrees Celsius.

24 PRISM PSQC Process – 2. Single-Station Checks
Goal: Take all QC actions possible at the single-station level, before entering the spatial QC process.
Current Checks:
1. Temperature observation is well above the all-time record maximum or well below the all-time record minimum for the state – flag set and CP set to 0
2. Maximum temperature is less than the minimum temperature – flag set and CP set to 0
3. First daily tmax/tmin observation after a period of missing data – flag set and CP set to 0 (COOP only?)
4. More than 10 consecutive observations with the same value (<+/-1F COOP, <+/-0.1C others), or more than 5 consecutive zero values, is a definite flatliner – flag set and CP set to 0
5. 5-10 consecutive observations with the same value is a potential flatliner, to be assessed by the spatial QC system – flag set and CP unchanged
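The checks can be sketched as a single function. The 5-degree margin for "well above/below" the state record is an assumed placeholder (the slides give no number), and `run_length` stands in for the count of consecutive identical values.

```python
def single_station_checks(tmax, tmin, record_max, record_min,
                          run_length=0, after_gap=False):
    """Sketch of the single-station checks. Returns (flag, cp):
    cp = 0.0 rejects the observation outright; cp = None leaves the
    decision to the spatial QC system."""
    MARGIN = 5.0  # assumed margin for "well above/below" the state record
    if tmax > record_max + MARGIN or tmin < record_min - MARGIN:
        return "record_exceeded", 0.0
    if tmax < tmin:
        return "tmax_lt_tmin", 0.0
    if after_gap:
        return "first_after_missing", 0.0  # COOP only?
    if run_length > 10:
        return "flatliner", 0.0            # definite flatliner
    if 5 <= run_length <= 10:
        return "potential_flatliner", None # spatial QC will assess
    return None, None
```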

25 PRISM PSQC Process – 3. Spatial QC System
Goal: Through a series of iterations, gradually and systematically “weed out” spatially inconsistent observations from consistent ones
Overview:
1. PRISM is run for each station location for each day, and summary statistics are accumulated
2. Once all days have been run, frequency distributions are developed and confidence probabilities (CP) for each daily station observation are estimated
3. These CP values are used to weight the daily observations in a second iteration of PRISM daily runs
4. Obs with lower CP values are given lower weight, and thus have less influence, in the second set of PRISM predictions, and are also given lower weight in the calculation of the second set of summary statistics
5. CP values are again calculated and passed back to the daily PRISM runs
6. This iterative process continues for about 5 iterations, at which time the CP values have reached equilibrium

26 QC Iteration
For each station-day:
–Run PRISM for each station location in its absence, estimating its obs for each day
–PRISM omits nearby stations, singly, and in pairs, to try to better match observation
–Prediction closest to obs is accepted
–Raw PRISM variables: Observation (O), Prediction (P), Residual (R=P-O), PRISM Regression Standard Deviation (S)
Once all station-days are run:
–Calculate summary statistics for each station for each day – mean and std dev of O (Os), P (Ps), R (Rs), and S (Ss)
–+/- 15 day, +/- 2 year window = 5 yrs, 31 days each (N~155)
–5-day running Standard Deviation (RunSD) as a measure of day-to-day variability (time shifting)
–Potential flatliners: calculate V, the ratio of station’s RunSD (set to 0.3) to that of surrounding stations
Determine “effective” standard deviation for frequency distribution:
–Sigma = Max(Rs, S, Ss, RunSD, 2)
Calculate probability statistics for O, P, R, S, and V for each day:
–Probability statistics are p-values from z-tests
–Residual Probability (RP) used as an estimate of overall Confidence Probability (CP) for an observation
–Except in the case of potential flatliners, where CP = min(RP, VP)
CP used to weight stations in next iteration
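The iterative weighting can be illustrated with a toy loop in which the "PRISM prediction" is simply a CP-weighted mean of the other stations — a hypothetical stand-in for the real moving-window regression — and sigma is fixed at the floor value of 2 from the Sigma formula on slide 26.

```python
import math

def confidence_prob(residual, sigma):
    # two-sided p-value of a z-test on the residual
    return math.erfc(abs(residual) / (sigma * math.sqrt(2.0)))

def spatial_qc(obs, sigma=2.0, n_iter=5):
    """Toy iterative spatial QC: each station is predicted from the
    others, weighted by the previous iteration's CP values. The real
    PSQC system replaces the weighted mean with PRISM predictions and
    the fixed sigma with the effective Sigma = max(Rs, S, Ss, RunSD, 2)."""
    cp = [1.0] * len(obs)                     # start with full confidence
    for _ in range(n_iter):
        preds = []
        for i in range(len(obs)):
            others = [(cp[j], obs[j]) for j in range(len(obs)) if j != i]
            wsum = sum(w for w, _ in others)
            preds.append(sum(w * x for w, x in others) / wsum)
        cp = [confidence_prob(o - p, sigma) for o, p in zip(obs, preds)]
    return cp

# The outlier (25.0) among consistent neighbors is driven toward CP ~ 0
cp = spatial_qc([10.1, 9.8, 10.3, 25.0, 10.0])
```

After a few iterations the outlier's CP collapses while the consistent stations' CP values recover, the equilibrium behavior described on slide 25.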

27 Observations and CP values, Date: 1996-02-08 Drifting sensor : MCKENZIE PASS (21E07S)

28 Climatology vs Observation and Prediction, Date: 1996-02-08 Drifting sensor : MCKENZIE PASS (21E07S)

29 Warm Bias: SALT CREEK FALLS (22F04S) Observations, Date: 2000-07-14

30 Anomalies and CP values, 7-21 July 2000 Warm Bias: SALT CREEK FALLS (22F04S) 14 July

31 Scatter Plot: Climatology vs Observation, 14 July 2000 Warm Bias: SALT CREEK FALLS (22F04S) 22F04S Odell Lake COOP

32 Tmax Observations, Date: 2000-07-14

33 Computing Obstacles
Computing – currently takes about 60 hours to run the PRISM PSQC system for SNOTEL sites in the western US on a 14-processor cluster
Disk space – we now have > 1 TB, but will probably need more
Funds are insufficient to “do it right”

34 Issues to Consider
How far can the assumption be taken that spatial consistency equates with validity?
Are continuous and probabilistic QC systems useful for manual observing systems?
Can a high-quality QC system ever be completely automated?

