Environmental Data Analysis with MatLab Lecture 16: Orthogonal Functions.

SYLLABUS
Lecture 01  Using MatLab
Lecture 02  Looking At Data
Lecture 03  Probability and Measurement Error
Lecture 04  Multivariate Distributions
Lecture 05  Linear Models
Lecture 06  The Principle of Least Squares
Lecture 07  Prior Information
Lecture 08  Solving Generalized Least Squares Problems
Lecture 09  Fourier Series
Lecture 10  Complex Fourier Series
Lecture 11  Lessons Learned from the Fourier Transform
Lecture 12  Power Spectral Density
Lecture 13  Filter Theory
Lecture 14  Applications of Filters
Lecture 15  Factor Analysis
Lecture 16  Orthogonal Functions
Lecture 17  Covariance and Autocorrelation
Lecture 18  Cross-correlation
Lecture 19  Smoothing, Correlation and Spectra
Lecture 20  Coherence; Tapering and Spectral Analysis
Lecture 21  Interpolation
Lecture 22  Hypothesis Testing
Lecture 23  Hypothesis Testing continued; F-Tests
Lecture 24  Confidence Limits of Spectra, Bootstraps

purpose of the lecture: to further develop Factor Analysis and to introduce Empirical Orthogonal Functions

review of the last lecture

example: the Atlantic Rock Dataset, the chemical composition of several thousand rocks

Rocks are a mix of minerals, and minerals have a well-defined composition. (figure: rocks 1 through 7 shown as mixtures of minerals 1, 2 and 3)

two ways to describe the data: rocks contain elements, or rocks contain minerals and minerals contain elements; the second description is the simpler one

the same idea in the language of factor analysis: the rocks are the samples and the minerals are the factors

the sample matrix, S: N samples (e.g. sediment samples or rock samples) by M elements; the word "element" is used in the abstract sense and may not refer to actual chemical elements

the factor matrix, F: P factors (e.g. sediment sources or minerals) by M elements; note that there are only P factors, a simplification if P < M

the loading matrix, C: N samples by P factors; it specifies the mix of factors for each sample

the key question: how many factors are needed to represent the samples, and what are these factors?

singular value decomposition: the methodology for answering these questions

singular value decomposition of the sample matrix: S = U Σ Vᵀ, where Vᵀ is a matrix of P mutually perpendicular vectors, each of length M; Σ is a diagonal matrix of P singular values; and U is a matrix of P mutually perpendicular vectors, each of length N

the matrix of loadings is C = UΣ and the matrix of factors is F = Vᵀ; since C depends on Σ, the samples contain more of the factors with large singular values than of the factors with small singular values

in MatLab
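(The code listing on this slide is an image that did not survive in the transcript; what follows is a minimal sketch of the computation the surrounding slides describe, assuming the sample matrix S is N by M. Variable names other than S, C and F are illustrative.)

[U, SIGMA, V] = svd(S, 'econ');  % economy-size singular value decomposition of S
sigma = diag(SIGMA);             % the singular values, sorted largest first
F = V';                          % factor matrix: rows are factors of length M
C = U*SIGMA;                     % loading matrix: N samples by the retained factors
% columns of C and rows of F belonging to near-zero singular values can be discarded to fix P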

(plot: the M singular values Σii, sorted by size, against index i)

(same plot, annotated: singular values close to zero are identified; the plot is used to discard near-zero singular values)

(same plot: the cutoff is also used to determine the number P of factors; here P = 5)

graphical representation of factors f2 through f5 of the Atlantic Rock Dataset (bar charts over SiO2, TiO2, Al2O3, total FeO, MgO, CaO, Na2O and K2O)

factor loadings C2 through C4 plotted in 3D; factors 2 through 4 capture most of the variability of the rocks

end of review

Part 1: Creating Spiky Factors

can we find "better" factors than those returned by svd()?

mathematically, S = CF = C′F′ with F′ = MF and C′ = CM⁻¹, where M is any P×P matrix that has an inverse; one must rely on prior information to choose M
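(A quick numerical check of this non-uniqueness; a minimal sketch in which R plays the role of the invertible P-by-P matrix that the slide calls M, renamed here to avoid a clash with M, the number of elements.)

N = 6; M = 4; P = 3;                % small illustrative sizes
C = randn(N, P);  F = randn(P, M);  % any loadings and factors
R = randn(P, P);                    % any invertible P x P matrix
Fprime = R*F;                       % F' = R F
Cprime = C/R;                       % C' = C inv(R)
disp(norm(C*F - Cprime*Fprime))     % essentially zero: C'F' reproduces the same samples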

one possible type of prior information: factors should contain mainly just a few elements

example of minerals
Mineral       Composition
Quartz        SiO2
Rutile        TiO2
Anorthite     CaAl2Si2O8
Forsterite    Mg2SiO4

spiky factors: factors containing mostly just a few elements

How to quantify spikiness?

variance as a measure of spikiness

modification for factor analysis

the measure depends on the square of the elements, so positive and negative values are treated the same
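(The formulas on these last three slides are images that are not in the transcript; one common way to write the measure, stated here as an assumption about the exact variant used in the lecture, is the variance of the squared elements of a factor f of length M, e.g. in MatLab:)

spikiness = @(f) var(f.^2, 1);  % variance of the squared elements (1/M normalization);
                                % squaring makes +x and -x contribute equally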

f(1) = [1, 0, 1, 0, 1, 0]T is much spikier than f(2) = [1, 1, 1, 1, 1, 1]T

f(2) = [1, 1, 1, 1, 1, 1]T is just as spiky as f(3) = [1, -1, 1, -1, -1, 1]T
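(Checking the three example factors with the spikiness measure sketched above; the point is the ordering, not the particular numbers.)

spikiness = @(f) var(f.^2, 1);                        % same assumed measure as above
f1 = [1 0 1 0 1 0]';  f2 = [1 1 1 1 1 1]';  f3 = [1 -1 1 -1 -1 1]';
disp([spikiness(f1), spikiness(f2), spikiness(f3)])   % gives 0.25, 0, 0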

the "varimax" procedure finds spiky factors without changing P: start with the P svd() factors, then rotate pairs of them in their plane by an angle θ chosen to maximize the overall spikiness

(figure: the pair of factors fA and fB is rotated by the angle θ into f′A and f′B)

determine θ by maximizing the combined spikiness of the rotated pair f′A and f′B

after some tedious trigonometry, the solution can be shown to be:
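(The expression itself is an image on the original slide and is not in the transcript. A standard form of the varimax rotation angle, quoted here as an assumption about the exact variant used in the lecture, defines u_i = f_Ai^2 - f_Bi^2 and v_i = 2 f_Ai f_Bi and gives)

\[
\tan 4\theta \;=\;
\frac{2\Bigl(M\sum_i u_i v_i - \sum_i u_i \sum_i v_i\Bigr)}
     {M\sum_i \bigl(u_i^2 - v_i^2\bigr) - \Bigl[\bigl(\sum_i u_i\bigr)^2 - \bigl(\sum_i v_i\bigr)^2\Bigr]}
\]

where the sums run over the M elements of the two factors.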

and the new factors are obtained by rotating the pair; in this example, A = 3 and B = 5
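(The rotation itself is likewise an image on the original slide; the standard rotation of the pair by the angle θ, assumed here, is)

\[
\mathbf{f}'_A = \mathbf{f}_A \cos\theta + \mathbf{f}_B \sin\theta,
\qquad
\mathbf{f}'_B = -\mathbf{f}_A \sin\theta + \mathbf{f}_B \cos\theta,
\]

with the corresponding pair of loadings counter-rotated so that the product CF is unchanged.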

now one repeats for every pair of factors and then iterates the whole process several times until the whole set of factors is as spiky as possible
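(A minimal sketch of this pairwise sweep in MatLab, assuming the factor matrix F, P by M with factors as rows, and the loading matrix C, N by P, from the svd() sketch above; the angle formula is the one quoted earlier, and the number of sweeps is an arbitrary choice.)

[P, M] = size(F);
nsweeps = 3;                          % a few passes over all pairs of factors
for sweep = 1:nsweeps
    for A = 1:P-1
        for B = A+1:P
            fA = F(A,:)';  fB = F(B,:)';
            u = fA.^2 - fB.^2;  v = 2*fA.*fB;
            top = 2*( M*(u'*v) - sum(u)*sum(v) );
            bot = M*( u'*u - v'*v ) - ( sum(u)^2 - sum(v)^2 );
            theta = 0.25*atan2(top, bot);   % varimax angle for this pair
            c = cos(theta);  s = sin(theta);
            F(A,:) =  c*fA' + s*fB';        % rotate the two factors ...
            F(B,:) = -s*fA' + c*fB';
            cA = C(:,A);  cB = C(:,B);
            C(:,A) =  c*cA + s*cB;          % ... and counter-rotate the loadings
            C(:,B) = -s*cA + c*cB;          % so that C*F is unchanged
        end
    end
end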

example: Atlantic Rock Dataset. (figure: A) the original factors f2 through f5 and B) the varimax-rotated factors f′2 through f′5, shown as bar charts over SiO2, TiO2, Al2O3, total FeO, MgO, CaO, Na2O and K2O)

Part 2: Empirical Orthogonal Functions

the row number in the sample matrix could be meaningful; example: samples collected at a succession of times

the column number in the sample matrix could be meaningful; example: concentration of the same chemical element at a sequence of positions (distance)

S = CF becomes a decomposition into time dependence (the loadings) and distance dependence (the factors)

S = CF becomes: each loading is a temporal pattern of variability of the corresponding factor, and each factor is a spatial pattern of variability of the element

S = CF becomes: there are P patterns, and they are sorted into order of importance

S = CF becomes: the factors are now called EOFs (empirical orthogonal functions)

example: sea surface temperature in the Pacific Ocean

(figure: CAC sea surface temperature in the equatorial Pacific Ocean, 29°S to 29°N latitude and 124°E to 290°E longitude; black = warm)

the image is 30 by 84 pixels in size, or 2520 pixels total; to use svd(), the image must be unwrapped into a vector of length 2520

the sample matrix has 399 rows (times) and 2520 columns (positions in the equatorial Pacific Ocean); here "element" means temperature at a position
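(A minimal sketch of building and decomposing this sample matrix; the variable name SST, standing for a 30 by 84 by 399 array of temperature maps, is hypothetical.)

[Nlat, Nlon, Ntimes] = size(SST);       % 30 x 84 x 399
M = Nlat*Nlon;                          % 2520 pixels play the role of "elements"
S = zeros(Ntimes, M);
for k = 1:Ntimes
    S(k,:) = reshape(SST(:,:,k), 1, M); % unwrap each map into one row of S
end
[U, SIGMA, V] = svd(S, 'econ');
F = V';                                 % EOFs: spatial patterns
C = U*SIGMA;                            % loadings: time dependence of each EOF
eof1 = reshape(F(1,:), Nlat, Nlon);     % re-wrap the first EOF for plotting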

(plot: the singular values Σii against index i)

(same plot: there is no clear cutoff for P, but the first 10 singular values are considerably larger than the rest)

using SVD to approximate data

S = C_M F_M : with all M EOFs, the data is fit exactly.
S = C_P F_P : with P chosen to exclude only zero singular values, the data is still fit exactly.
S ≈ C_P′ F_P′ : with P′ < P, small non-zero singular values are excluded too, and the data is fit only approximately.
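(A minimal sketch of the truncated reconstruction, assuming the C and F of the svd() sketch above; the cutoff Pp stands for the P′ of this slide.)

Pp = 5;                             % keep only the 5 largest singular values
Sapprox = C(:, 1:Pp) * F(1:Pp, :);  % approximate reconstruction of the data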

(figure: A) the original sea surface temperature data and B) a reconstruction based on the first 5 EOFs)