# Environmental Data Analysis with MatLab Lecture 21: Interpolation.

SYLLABUS

- Lecture 01: Using MatLab
- Lecture 02: Looking At Data
- Lecture 03: Probability and Measurement Error
- Lecture 04: Multivariate Distributions
- Lecture 05: Linear Models
- Lecture 06: The Principle of Least Squares
- Lecture 07: Prior Information
- Lecture 08: Solving Generalized Least Squares Problems
- Lecture 09: Fourier Series
- Lecture 10: Complex Fourier Series
- Lecture 11: Lessons Learned from the Fourier Transform
- Lecture 12: Power Spectral Density
- Lecture 13: Filter Theory
- Lecture 14: Applications of Filters
- Lecture 15: Factor Analysis
- Lecture 16: Orthogonal functions
- Lecture 17: Covariance and Autocorrelation
- Lecture 18: Cross-correlation
- Lecture 19: Smoothing, Correlation and Spectra
- Lecture 20: Coherence; Tapering and Spectral Analysis
- Lecture 21: Interpolation
- Lecture 22: Hypothesis Testing
- Lecture 23: Hypothesis Testing continued; F-Tests
- Lecture 24: Confidence Limits of Spectra, Bootstraps

purpose of the lecture: to introduce interpolation, the process of filling in missing data points

Scenario 1: data are collected at irregular time intervals, but you want to compute power spectral density, which requires evenly sampled data.

[Figure: irregularly sampled time series A(t) versus time; its power spectral density (psd) versus frequency is marked "?"]

Scenario 2: two datasets are collected with different sampling intervals, but you want to combine them into a scatter plot.

[Figure: time series A(t) and B(t) versus time, sampled at different intervals; the scatter plot of B against A is marked "?"]

in both scenarios the times that the data are collected at are inconvenient

we encountered a problem similar to this one back in Lecture 8, where we used prior information to fill in data gaps

[Figure: observed data $d^{\mathrm{obs}}(t)$ with missing points, and estimated data $d^{\mathrm{est}}(t)$ with the missing points filled in, both versus time]

find $d_i^{\mathrm{est}}$ so that $d_i^{\mathrm{est}} \approx d_i^{\mathrm{obs}}$ at the observation points and the roughness of $d_i^{\mathrm{est}} \approx 0$ everywhere

the solution is inexact: $d_i^{\mathrm{est}} \neq d_i^{\mathrm{obs}}$ everywhere and the roughness of $d_i^{\mathrm{est}} \neq 0$ everywhere

but the inexactness isn’t a problem because both observations and prior information have error

now we examine an alternative approach, traditional interpolation, which is similar but subtly different

find an "interpolant" $d(t)$ so that $d(t_i) = d_i^{\mathrm{obs}}$ exactly at the observation points and the roughness of $d(t) = 0$ in between the observation points

advantage: the interpolant d(t) is an analytic function that is known everywhere; you can evaluate d(t) at any time, t, and can differentiate d(t), integrate it, etc.

disadvantage: the observation points are singled out as special; d(t) behaves differently at the observation points than between them

the interpolation problem: find an interpolant d(t) that goes through all the data points and “does something sensible” or “satisfies some prior information” between them

some obvious ideas don’t work at all: an (N-1) order polynomial can easily be constructed so that it passes through N points, so use a polynomial for d(t)

[Figure: example of polynomial interpolation, d(t) versus time, t, with two parts of the curve annotated "what happened here?" and "and here?"]
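The behavior in the example can be reproduced with a short experiment. The sketch below is not the lecture's data: it assumes, purely for illustration, N = 11 evenly spaced samples of the function 1/(1+25t²) and fits the (N-1) order polynomial with MatLab's `polyfit`. The polynomial passes through every sample but swings far away from the underlying curve between them.

```matlab
% Illustrative test case (an assumption, not the lecture's data):
% N evenly spaced samples of a smooth, bell-shaped function
N = 11;
tobs = linspace(-1, 1, N)';
dobs = 1 ./ (1 + 25*tobs.^2);

% (N-1) order polynomial through the N points
p = polyfit(tobs, dobs, N-1);

% evaluate the polynomial on a fine grid and compare with the true curve
test = linspace(-1, 1, 401)';
dest = polyval(p, test);
dtrue = 1 ./ (1 + 25*test.^2);

misfit_at_obs = max(abs(polyval(p, tobs) - dobs));   % essentially zero
misfit_between = max(abs(dest - dtrue));             % large, near the ends
fprintf('misfit at observations: %g\n', misfit_at_obs);
fprintf('misfit between observations: %g\n', misfit_between);
```

The largest excursions occur near the ends of the interval, which is one way a plot of the interpolant can show unexpected swings.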

solution: a low-order polynomial has less potential for wild swings, so use many low-order polynomials, each valid in a small time interval; such a function is called a “spline”

simplest case: a set of linear polynomials, each valid between two data points; “connect the data points with straight lines”

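[Figure: d versus t, showing the straight-line interpolant d(t) between neighboring observations at times $t_i$ and $t_{i+1}$]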
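Connecting two neighboring data points with a straight line corresponds to the standard linear interpolation formula. Writing the observations bracketing $t$ as $d_i$ and $d_{i+1}$ (notation assumed here), the interpolant on that interval is

$$
d(t) = d_i + \left(d_{i+1} - d_i\right)\frac{t - t_i}{t_{i+1} - t_i}, \qquad t_i \le t \le t_{i+1}
$$

Its second derivative is zero inside the interval, and its slope generally changes from one interval to the next.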

advantages: conceptually very simple; you always get what you expect

disadvantage: d(t) has kinks at the observation points (with zero roughness between observations)

[Figure: example of linear interpolation, d(t) versus time, t, with a kink marked at an observation point]

in MatLab: the inputs are the observations and the times of interpolation; the output is the interpolated observations
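The code on this slide did not survive in the transcript. A minimal sketch of linear interpolation, assuming MatLab's built-in `interp1` function and hypothetical variable names (`tobs`, `dobs` for the observations, `test` for the times of interpolation, `dest` for the interpolated values), with made-up data values:

```matlab
% observations at irregular times (illustrative values, not from the lecture)
tobs = [0; 0.3; 1.1; 1.5; 2.0];
dobs = [1.0; 0.2; 0.8; -0.4; 0.5];

% evenly spaced times of interpolation
Dt = 0.05;
test = (0:Dt:2)';

% interpolated observations: straight lines between neighboring data points
dest = interp1(tobs, dobs, test, 'linear');

% a single value halfway between the 2nd and 3rd observations
dmid = interp1(tobs, dobs, 0.7, 'linear');
```

Because `test` is evenly spaced, `dest` can be passed on to analyses that require evenly sampled data, such as the power spectral density of Scenario 1.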

getting rid of the kinks: use cubic polynomials $S_i(t) = c_0 + c_1 t + c_2 t^2 + c_3 t^3$, each valid between two data points

a cubic polynomial has 4 coefficients: two are constrained by the need to pass through two data, and two implement the prior information of no kinks in d(t) or its first derivative

the trick: the second derivative of a cubic is linear, so use the linear interpolation formula for the second derivative

[Figure: second derivative versus t, varying linearly between the values $y_{i-1}$, $y_i$ and $y_{i+1}$ at times $t_{i-1}$, $t_i$ and $t_{i+1}$]

the second derivatives at the observation points, denoted $y_i$, become unknowns in the problem

the second derivative is now integrated twice to give the spline function; here $a_i$ and $b_i$ are two more unknowns that arise from the integration constants
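The slide's equation is not preserved in the transcript. One common way to write this step, assuming the interval length $h_i = t_{i+1} - t_i$ (a symbol introduced here) and a particular grouping of the integration constants, is to start from the linearly interpolated second derivative

$$
S_i''(t) = y_i \frac{t_{i+1} - t}{h_i} + y_{i+1} \frac{t - t_i}{h_i}
$$

and integrate twice:

$$
S_i(t) = y_i \frac{(t_{i+1} - t)^3}{6 h_i} + y_{i+1} \frac{(t - t_i)^3}{6 h_i} + a_i \,(t_{i+1} - t) + b_i \,(t - t_i)
$$

Differentiating the second expression twice recovers the first, since the terms in $a_i$ and $b_i$ are linear in $t$.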

finally one finds the $y$’s, $a$’s and $b$’s so that the spline 1. goes through the observations and 2. has a first derivative that is continuous across the observation points

the solution involves solving a matrix equation for the unknowns (see text for details)

in MatLab: the inputs are the observations and the times of interpolation; the output is the interpolated observations
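The code on this slide is also missing from the transcript. A minimal sketch of cubic spline interpolation, assuming MatLab's built-in `spline` function, hypothetical variable names and made-up data values; note that `spline` chooses its own end conditions, which may differ from those used in the text:

```matlab
% observations at irregular times (illustrative values, not from the lecture)
tobs = [0; 0.3; 1.1; 1.5; 2.0];
dobs = [1.0; 0.2; 0.8; -0.4; 0.5];

% evenly spaced times of interpolation
Dt = 0.05;
test = (0:Dt:2)';

% interpolated observations from a cubic spline through the data
dest = spline(tobs, dobs, test);

% check that the spline goes through the observations
misfit = max(abs(spline(tobs, dobs, tobs) - dobs));
```

The call has the same shape as a linear interpolation call, so switching between the two methods changes one line.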

[Figure: example of cubic spline interpolation, d(t) versus time, t, with no kinks]

interpolation involves prior information of smoothness

in generalized least-squares, the prior information of smoothness is quantified by a roughness matrix, $\mathbf{H}$, acting on the model, $\mathbf{Hm}$; then we minimize the overall roughness, which is to say the overall error in the prior information, $(\mathbf{Hm})^T(\mathbf{Hm})$

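note that $(\mathbf{Hm})^T(\mathbf{Hm}) = \mathbf{m}^T(\mathbf{H}^T\mathbf{H})\mathbf{m}$, but in generalized least squares the error also has the form $\mathbf{m}^T\mathbf{C}_m^{-1}\mathbf{m}$, where $\mathbf{C}_m$ is a covariance matrix, so in this case $\mathbf{C}_m = (\mathbf{H}^T\mathbf{H})^{-1}$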
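The identity can be checked numerically. The sketch below assumes a second-difference form for the roughness matrix H (rows of 1, -2, 1), which is one common choice and not necessarily the one used in the text, and a small made-up model vector:

```matlab
% number of model parameters (illustrative)
M = 6;

% assumed roughness matrix: second differences, rows are [1 -2 1]
H = zeros(M-2, M);
for i = 1:M-2
    H(i, i:i+2) = [1, -2, 1];
end

% made-up model vector: samples of t^2 at t = 0,1,...,5
m = [0; 1; 4; 9; 16; 25];

% overall roughness computed two ways
r1 = (H*m)' * (H*m);      % (Hm)'(Hm)
r2 = m' * (H'*H) * m;     % m'(H'H)m
fprintf('roughness: %g and %g\n', r1, r2);
```

Each second difference of the samples of t² equals 2, so the four rows of H contribute 4 each and both expressions evaluate to 16.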

so the prior information that the data are smooth is equivalent to the requirement that they have a specific covariance matrix which for stationary time series is equivalent to saying that they have a specific autocorrelation function

so an alternative, more flexible way of interpolating data is by specifying the autocorrelation function that we want the results to have; this is called Kriging (after Danie G. Krige, its inventor)

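Kriging: estimate the data at an arbitrary time $t_0$ as a weighted average of the observations, $d_0^{\mathrm{est}} = \sum_i w_i \, d_i^{\mathrm{obs}}$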

determine the weights $w_i$ by minimizing the variance of $(d_0^{\mathrm{est}} - d_0^{\mathrm{true}})$ with respect to $w_i$; we’ll find that we don’t need to know $d_0^{\mathrm{true}}$, only its autocorrelation

[Equations: the variance is simplified in four steps]

1. assume that the means approximately cancel
2. expand the square
3. insert the weighted average formula
4. identify terms proportional to the autocorrelation

now differentiate with respect to the weight, $w_k$, and set the result to zero, which yields the matrix equation $\mathbf{Mw} = \mathbf{v}$; note that the autocorrelation appears on both sides of the equation, so that its overall normalization cancels out
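The slide equations for this derivation are not preserved in the transcript. The following is a reconstruction consistent with the steps named above, not a copy of the slides; it assumes the symbol $a(\cdot)$ for the autocorrelation function, $E[\cdot]$ for the expected value, and observations $d_i^{\mathrm{obs}}$ made at times $t_i$. With the means approximately cancelling, the variance is

$$
\begin{aligned}
\sigma^2 &\approx E\!\left[\left(d_0^{\mathrm{est}} - d_0^{\mathrm{true}}\right)^2\right] \\
&= E\!\left[(d_0^{\mathrm{est}})^2\right] - 2\,E\!\left[d_0^{\mathrm{est}}\, d_0^{\mathrm{true}}\right] + E\!\left[(d_0^{\mathrm{true}})^2\right] \\
&= \sum_i \sum_j w_i w_j\, E\!\left[d_i^{\mathrm{obs}} d_j^{\mathrm{obs}}\right] - 2 \sum_i w_i\, E\!\left[d_i^{\mathrm{obs}} d_0^{\mathrm{true}}\right] + E\!\left[(d_0^{\mathrm{true}})^2\right] \\
&\propto \sum_i \sum_j w_i w_j\, a(t_i - t_j) - 2 \sum_i w_i\, a(t_i - t_0) + a(0)
\end{aligned}
$$

Setting the derivative with respect to $w_k$ to zero gives

$$
2\sum_i w_i\, a(t_i - t_k) - 2\, a(t_k - t_0) = 0
\quad\Longrightarrow\quad
\mathbf{Mw} = \mathbf{v}, \qquad M_{ki} = a(t_k - t_i), \quad v_k = a(t_k - t_0)
$$

Multiplying $a$ by a constant scales $\mathbf{M}$ and $\mathbf{v}$ equally, which is why the normalization drops out of the weights.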

all we need now do is specify an autocorrelation function; for example, we could use the Normal function; the variance, $L^2$, controls the width of the autocorrelation and hence the smoothness of the interpolation
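The formula on the slide is missing from the transcript. Assuming the symbol $a(\cdot)$ for the autocorrelation and dropping the normalization (which cancels out of the weights), a Normal autocorrelation function with variance $L^2$ would read

$$
a(t_i - t_j) = \exp\!\left\{ -\frac{(t_i - t_j)^2}{2 L^2} \right\}
$$

A larger $L^2$ makes widely separated times more strongly correlated, which gives a smoother interpolant.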

In MatLab: observations tobs, dobs; interpolated values test, dest; Normal autocorrelation function with variance L2
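The code on this slide did not survive in the transcript. The sketch below shows one way the matrix equation Mw = v could be set up and solved for every interpolation time at once; the data values, the value of `L2` and the names `A`, `V`, `W` and `Nest` are assumptions made for illustration:

```matlab
% observations (illustrative values, not from the lecture)
tobs = [0; 0.3; 1.1; 1.5; 2.0];
dobs = [1.0; 0.2; 0.8; -0.4; 0.5];
N = length(tobs);

% times of interpolation
test = (0:0.05:2)';
Nest = length(test);

% assumed variance of the Normal autocorrelation function
L2 = 0.1;

% A(i,j): autocorrelation between observation times i and j (the matrix M)
A = exp(-(tobs*ones(1,N) - ones(N,1)*tobs').^2 / (2*L2));

% V(i,k): autocorrelation between observation time i and interpolation
% time k; each column is the vector v for one interpolation time
V = exp(-(tobs*ones(1,Nest) - ones(N,1)*test').^2 / (2*L2));

% solve M w = v for all interpolation times; column k of W holds the weights
W = A \ V;

% interpolated values are weighted averages of the observations
dest = (dobs' * W)';
```

When an interpolation time coincides with an observation time, v equals a column of M, the weight vector picks out that single observation, and the estimate reproduces it.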

[Figure: example comparing interpolation by A) Kriging and B) Generalized Least Squares; d(t) versus time, t]

Interpolation in two dimensions: construct an interpolant d(x,y) that goes through the observations and does something sensible in between

the notion of bracketing observations is more complicated in two dimensions

[Figure: in 1 dimension, the time $t_0$ is bracketed by observations at $t_i$ and $t_{i+1}$, on a segment of the t-axis; in 2 dimensions, the point $(x_0, y_0)$ is enclosed by a triangular tile]

Delaunay triangles: the set of most nearly equilateral triangles connecting the data points

[Figure: A) observations and B) Delaunay triangles, both in the (x, y) plane, with the triangle enclosing a point of interest marked]
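Finding the triangle that encloses a point of interest can be sketched with MatLab's `delaunayTriangulation` class and its `pointLocation` method. The observation locations and the point of interest below are made up for illustration: the four corners of a unit square plus its center.

```matlab
% observation locations (illustrative): corners of a square plus its center
xobs = [0; 1; 1; 0; 0.5];
yobs = [0; 0; 1; 1; 0.5];

% Delaunay triangles connecting the observation locations
DT = delaunayTriangulation(xobs, yobs);
Ntri = size(DT.ConnectivityList, 1);

% point of interest (x0, y0)
x0 = 0.5;
y0 = 0.2;

% index of the triangle enclosing the point of interest
ti = pointLocation(DT, [x0, y0]);

% indices of the three observations at the vertices of that triangle
vertices = sort(DT.ConnectivityList(ti, :));
```

The three observations at the vertices play the role that the two bracketing observations play in one dimension.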

[Figure: C) Linear Splines and D) Cubic Splines, both in the (x, y) plane]

In MatLab: linear splines; cubic splines
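The two code fragments on this slide are not preserved in the transcript. A minimal sketch, assuming MatLab's built-in `griddata` function with its `'linear'` and `'cubic'` methods; the observation locations and values are made up, with the data chosen to lie on the plane d = 2x + 3y + 1 so that the linear result is easy to check:

```matlab
% scattered observation locations (illustrative, not from the lecture)
xobs = [0; 1; 1; 0; 0.5; 0.2; 0.8];
yobs = [0; 0; 1; 1; 0.5; 0.7; 0.3];

% made-up observations lying on a plane
dobs = 2*xobs + 3*yobs + 1;

% regular grid of interpolation points
[XX, YY] = meshgrid(0:0.25:1, 0:0.25:1);

% linear splines: linear interpolation within each Delaunay triangle
dlin = griddata(xobs, yobs, dobs, XX, YY, 'linear');

% cubic splines: a smoother triangle-based interpolant
dcub = griddata(xobs, yobs, dobs, XX, YY, 'cubic');

% value at the grid point x = 0.5, y = 0.25
d1 = dlin(2, 3);
```

With planar data the linear method reproduces the plane, so the value at (0.5, 0.25) is 2(0.5) + 3(0.25) + 1 = 2.75.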