Statistical analysis of global temperature and precipitation data Imre Bartos, Imre Jánosi Department of Physics of Complex Systems Eötvös University.

Presentation transcript:

Statistical analysis of global temperature and precipitation data Imre Bartos, Imre Jánosi Department of Physics of Complex Systems Eötvös University

Outline: the GDCN database; correlation properties of temperature data (short-term, long-term, nonlinear); cumulants; extreme value statistics; recent results (degrees of freedom estimation).

Global Daily Climatology Network (GDCN). Maps: temperature stations; precipitation stations. stations…

Correlation properties of the temperature series $T_i$: short-term correlation and long-term correlation.

Correlation properties: short-term correlation. The temperature anomaly is $a_{i+1} = T_{i+1} - \langle T_{i+1} \rangle = F(a_i) + \xi_i$, where $\xi_i$ is a noise term. Short-term memory: exponential decay. Autoregressive process, linear case (AR1): $a_{i+1} = A \cdot a_i + \xi_i$, with $C_1(\tau) = \langle a_i \cdot a_{i+\tau} \rangle \sim A^{\tau}$.

Short-term correlation. Starting from $a_{i+1} = A \cdot a_i + \xi_i$, in terms of temperature change: $\Delta a_{i+1} = a_{i+1} - a_i \sim T_{i+1} - T_i = (A-1) \cdot a_i + \xi_i$. Thus the response function one measures is $\langle \Delta a_{i+1} \rangle = (A-1) \cdot a_i + 0$. The fitted curve: $\langle \Delta a_{i+1} \rangle = c_1 \cdot a_i + c_0$. Király, Jánosi, PRE (2002).
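The fit on this slide can be illustrated with a minimal sketch. It assumes a synthetic AR1 series with Gaussian noise and an ordinary least-squares fit over all consecutive pairs, which is not necessarily how the authors obtained the response function from station data. The function names and the value A = 0.8 are hypothetical.

```python
import random

def simulate_ar1(A, n, seed=0):
    """Synthetic anomaly series a[i+1] = A * a[i] + noise (unit Gaussian noise, an assumption)."""
    rng = random.Random(seed)
    a = [0.0]
    for _ in range(n - 1):
        a.append(A * a[-1] + rng.gauss(0.0, 1.0))
    return a

def fit_response(a):
    """Least-squares fit of the step a[i+1] - a[i] against a[i]; returns (c1, c0)."""
    x = a[:-1]
    y = [a[i + 1] - a[i] for i in range(len(a) - 1)]
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    c1 = sxy / sxx
    c0 = y_mean - c1 * x_mean
    return c1, c0

anomalies = simulate_ar1(A=0.8, n=20000)
c1, c0 = fit_response(anomalies)
print(f"c1 = {c1:.3f} (AR1 expectation A - 1 = -0.2), c0 = {c0:.3f}")
```

For a symmetric linear AR1 process the fitted slope should be close to A - 1 and the intercept close to zero, so a significantly nonzero $c_0$ in real data points to something beyond this simple model.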

Short-term correlation. Fitted curve: $\langle \Delta a_{i+1} \rangle = c_1 \cdot a_i + c_0$. Maps of $c_0$ and $c_1$. Bartos, Jánosi, Geophys. Res. Lett. (2005). $|c_1|$ increases toward the South-East. $c_0 \neq 0$ significantly. $a_i$ has an asymmetric distribution.

Short-term correlation: asymmetric distribution. There are more warming steps ($N_m$) than cooling steps ($N_h$), but the average cooling steps ($S_h$) are bigger than the average warming steps ($S_m$). Do these two effects compensate each other? Warming index: $W = (N_m \cdot S_m) / (N_h \cdot S_h)$. Global warming (?) Bartos, Jánosi, Geophys. Res. Lett. (2005).
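A minimal sketch of the warming index follows. It assumes that $S_m$ and $S_h$ are the mean absolute sizes of the warming and cooling steps, and that days with zero change are ignored. The function name and the toy series are hypothetical.

```python
def warming_index(temps):
    """W = (N_m * S_m) / (N_h * S_h) from a daily temperature series."""
    steps = [b - a for a, b in zip(temps, temps[1:])]
    warming = [s for s in steps if s > 0]    # warming steps
    cooling = [-s for s in steps if s < 0]   # cooling steps, stored as positive sizes
    if not warming or not cooling:
        raise ValueError("need at least one warming and one cooling step")
    n_m, n_h = len(warming), len(cooling)
    s_m = sum(warming) / n_m                 # average warming step
    s_h = sum(cooling) / n_h                 # average cooling step
    return (n_m * s_m) / (n_h * s_h)

# Three small warming steps followed by one large cooling step.
toy_series = [10.0, 11.0, 12.0, 13.0, 10.0]
W = warming_index(toy_series)
print(W)
```

In this toy series the two effects compensate exactly, so W equals 1. With these definitions, W is the ratio of total warming to total cooling, and W > 1 means a net temperature rise over the record.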

Correlation properties: long-term correlation. $C(\tau) = \langle a_i \cdot a_{i+\tau} \rangle \sim \tau^{-\gamma}$. Long-term memory: power-law decay.

Long-term correlation. Measurement: Detrended Fluctuation Analysis (DFA). $\langle F(n) \rangle \sim n^{\delta}$, with $\gamma = 2 \cdot (1 - \delta)$ and $C(\tau) = \langle a_i \cdot a_{i+\tau} \rangle \sim \tau^{-\gamma}$. DFA curve: initial gradient ($\delta_0$) and asymptotic gradient ($\delta$). $\delta$ ~ long-term memory, $\delta_0$ ~ short-term memory.
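A minimal DFA sketch follows. It assumes linear detrending in non-overlapping windows (often called DFA-1); the slides do not state the detrending order, and the window sizes and function names here are hypothetical. It is tested on white noise, for which the exponent should be close to 0.5.

```python
import math
import random

def dfa_fluctuation(series, n):
    """RMS fluctuation F(n) of the integrated series after linear detrending in windows of length n."""
    mean = sum(series) / len(series)
    profile, total = [], 0.0
    for value in series:
        total += value - mean            # integrate the mean-subtracted series
        profile.append(total)
    xs = list(range(n))
    x_mean = (n - 1) / 2
    sxx = sum((x - x_mean) ** 2 for x in xs)
    n_windows = len(profile) // n
    squared = 0.0
    for w in range(n_windows):
        seg = profile[w * n:(w + 1) * n]
        y_mean = sum(seg) / n
        slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, seg)) / sxx
        # Residual of the local straight-line fit in this window.
        squared += sum((y - y_mean - slope * (x - x_mean)) ** 2 for x, y in zip(xs, seg))
    return math.sqrt(squared / (n_windows * n))

def fit_slope(xs, ys):
    """Least-squares slope of ys against xs."""
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    return sxy / sxx

def dfa_exponent(series, window_sizes):
    """Gradient of log F(n) against log n over the given window sizes."""
    log_n = [math.log(n) for n in window_sizes]
    log_f = [math.log(dfa_fluctuation(series, n)) for n in window_sizes]
    return fit_slope(log_n, log_f)

rng = random.Random(1)
white_noise = [rng.gauss(0.0, 1.0) for _ in range(4096)]
delta = dfa_exponent(white_noise, [8, 16, 32, 64, 128])
gamma = 2 * (1 - delta)   # relation from the slide; meaningful only for long-term correlated data
print(f"delta = {delta:.2f}")
```

Fitting the gradient over small windows only would give an estimate of the initial gradient $\delta_0$, and over large windows the asymptotic gradient $\delta$.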

Detrended Fluctuation Analysis (DFA). Király, Bartos, Jánosi, Tellus A (2006). Result: all time series are long-term correlated.

Nonlinear correlation. Two-point correlation: $C_2 = \langle a_i \cdot a_j \rangle$; q-point correlation: $C_q = F(a_i \cdot a_j \cdot a_k \ldots)$. Linear (Gaussian) process: $C_{q>2} = f(C_2)$ (3rd and higher cumulants are 0), so $C_2$ completely describes the process. Nonlinear (multifractal) process: 3rd or higher cumulants are NOT 0, so the 2-point correlation doesn't give the full picture. One needs to measure the nonlinear correlations for the full description.

Nonlinear correlation. The 2-point correlation of the volatility time series reflects the nonlinear correlation properties of the anomaly time series $a_i$. "Volatility" time series: $|a_{i+1} - a_i|$. Map: volatility DFA exponent.
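As a small illustration, the volatility series and its 2-point correlation at a given lag could be computed as sketched here. The autocorrelation estimator and the toy anomaly values are assumptions for illustration; the slides measure the volatility correlations with DFA rather than with a direct autocorrelation.

```python
def volatility(anomalies):
    """Volatility series |a[i+1] - a[i]| of an anomaly series."""
    return [abs(b - a) for a, b in zip(anomalies, anomalies[1:])]

def autocorrelation(x, lag):
    """Normalized 2-point correlation of x at the given lag (simple estimator)."""
    n = len(x)
    mean = sum(x) / n
    variance = sum((v - mean) ** 2 for v in x) / n
    covariance = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag)) / (n - lag)
    return covariance / variance

toy_anomalies = [0.5, -1.0, 2.0, 1.5, -0.5, 0.0]   # made-up values
vol = volatility(toy_anomalies)
print(vol)
```

The same volatility series can be passed to any DFA routine in place of the anomaly series to estimate its initial and asymptotic exponents.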

Nonlinear correlation. There is also short- and long-term memory in the volatility time series. Map: volatility, initial DFA exponent.

In short… Daily temperature values are correlated on both short and long time scales, both linearly and nonlinearly. We constructed the geographic distributions of these properties, and described or explained some of them in detail.

Cumulants: skewness and kurtosis. Both are nonuniform, which can affect the extreme value statistics (EVS).

Extreme value statistics. What we want to use: temperature time series, then temperature anomaly, then normalized anomaly.
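The chain from temperature to normalized anomaly might look like the sketch below. It assumes that the anomaly is the deviation from the multi-year mean of the same calendar day, and that normalization divides by the standard deviation of that calendar day. The slides do not spell out these definitions, and the function name and records are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev

def normalized_anomalies(records):
    """records: (day_of_year, temperature) pairs in time order.

    Returns (anomalies, normalized anomalies), both in the input order.
    A calendar day with zero spread would cause a division by zero.
    """
    by_day = defaultdict(list)
    for day, temp in records:
        by_day[day].append(temp)
    clim_mean = {day: mean(values) for day, values in by_day.items()}
    clim_std = {day: pstdev(values) for day, values in by_day.items()}
    anomalies = [temp - clim_mean[day] for day, temp in records]
    normalized = [(temp - clim_mean[day]) / clim_std[day] for day, temp in records]
    return anomalies, normalized

# Two calendar days observed in three years (made-up values).
records = [(1, 0.0), (2, 10.0), (1, 2.0), (2, 14.0), (1, 4.0), (2, 18.0)]
anomalies, normalized = normalized_anomalies(records)
print(anomalies)
print(normalized)
```

After normalization the two calendar days have the same spread, which is what allows maxima from different seasons and stations to be compared.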

Extreme value statistics. We try to get rid of the spatial correlation: let's use one station in every 4x4 grid cell.

Dangers in filtering for extreme value statistics. After filtering out the flagged (bad) data: cutoff at $3.5\sigma$. The daily normalized distribution looks exactly like a Weibull distribution. Explanation: preliminary filtering of "outliers".

Extreme value statistics. Then how can we filter out bad data? There are certainly bad data in the series. The usual way to filter them out is to flag the suspicious ones, but it seems we cannot use the flags. One attempt to find real outliers: the temperature difference distribution. Impossible to validate.

Extreme value statistics. Another possible way: try to isolate unreliable stations. Now we use all the data, without filtering spatial correlations. Also notice the two peaks.

Extreme value statistics. New problem: the two peaks. What makes the average maximum values differ for some stations? Why two peaks? Panels: skewness, kurtosis, correlation. Panel labels: depends, doesn't depend.

Extreme value statistics. New problem: the two peaks. Map: average yearly maximum. One can spatially separate the different peaks.

Extreme value statistics. Separate one peak by using US stations only. Finally we get to the Gumbel distribution.
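To make the Gumbel comparison concrete, here is a minimal sketch. It takes block (for example yearly) maxima and fits a Gumbel distribution by the method of moments, using the known mean $\mu + \gamma_E \beta$ and variance $\pi^2 \beta^2 / 6$ of the Gumbel distribution. The fitting method, function names and synthetic data are assumptions; the slides do not say how the comparison was made.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329   # Euler-Mascheroni constant

def block_maxima(values, block_length):
    """Maximum of each consecutive block (e.g. one block per year); a trailing partial block is dropped."""
    n_blocks = len(values) // block_length
    return [max(values[b * block_length:(b + 1) * block_length]) for b in range(n_blocks)]

def fit_gumbel_moments(maxima):
    """Method-of-moments estimate of the Gumbel (location, scale) parameters."""
    n = len(maxima)
    mean = sum(maxima) / n
    variance = sum((x - mean) ** 2 for x in maxima) / (n - 1)
    scale = math.sqrt(6.0 * variance) / math.pi
    location = mean - EULER_GAMMA * scale
    return location, scale

def sample_gumbel(location, scale, rng):
    """Draw one Gumbel variate by inverting the CDF exp(-exp(-(x - location) / scale))."""
    u = rng.random()
    while u == 0.0:                 # avoid log(0)
        u = rng.random()
    return location - scale * math.log(-math.log(u))

rng = random.Random(42)
synthetic_maxima = [sample_gumbel(2.0, 0.5, rng) for _ in range(20000)]
location, scale = fit_gumbel_moments(synthetic_maxima)
print(f"location = {location:.2f}, scale = {scale:.2f}")
```

On real data, `block_maxima` would be applied to a station's normalized anomaly series with one block per year before fitting.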

Degrees of Freedom. Why does the average maximum value not depend on the correlation exponent? One can calculate the degrees of freedom (DOF) of N variables with long-term correlation characterized by the correlation exponent $\gamma$: $DOF = N^2 / \sum_i \lambda_i^2$, where $\lambda_i$ is the i-th eigenvalue of the covariance matrix, which contains the covariance of each pair of days of the year. Long-term correlation: $C(|x-y|) = c \cdot |x-y|^{-\gamma}$. Short-term correlation: $T_{i+1} = A \cdot T_i + \text{noise}$. Variables determining the DOF: $c$, $\gamma$, $A$.
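A minimal sketch of this DOF estimate follows. It assumes unit-variance variables, so the matrix has ones on the diagonal and its eigenvalues sum to N, which is what the $N^2$ numerator implies. For a symmetric matrix the sum of squared eigenvalues equals the sum of squared matrix entries, so no eigen-decomposition is needed. The function names and parameter values are hypothetical.

```python
def long_term_matrix(n, c, gamma):
    """Correlation matrix with C(|x - y|) = c * |x - y| ** (-gamma) off the diagonal and 1 on it."""
    return [[1.0 if i == j else c * abs(i - j) ** (-gamma) for j in range(n)]
            for i in range(n)]

def short_term_matrix(n, A):
    """Correlation matrix of a stationary AR1 process: C(|x - y|) = A ** |x - y|."""
    return [[A ** abs(i - j) for j in range(n)] for i in range(n)]

def degrees_of_freedom(matrix):
    """DOF = N^2 / sum_i lambda_i^2, using sum_i lambda_i^2 = sum of squared entries (symmetric matrix)."""
    n = len(matrix)
    sum_squared_eigenvalues = sum(value * value for row in matrix for value in row)
    return n * n / sum_squared_eigenvalues

dof_strong = degrees_of_freedom(long_term_matrix(365, c=1.0, gamma=0.5))
dof_weak = degrees_of_freedom(long_term_matrix(365, c=0.25, gamma=0.5))
dof_ar1 = degrees_of_freedom(short_term_matrix(365, A=0.8))
print(dof_strong, dof_weak, dof_ar1)
```

Uncorrelated days give DOF = N and fully correlated days give DOF = 1, so a larger prefactor $c$ lowers the effective number of independent days in a year.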

Degrees of Freedom – Dependence on correlation C = 1 C = 0.25 C = Short-term

Degrees of Freedom – measurement and calculation. Estimation with c = 1. Measurement: chi-square method (underestimation).

Degrees of Freedom – difficulties. The c = 1 estimation: this causes the difference. It is hard to measure anything due to the bad signal-to-noise ratio. To say something about c: correlation between consecutive years.
