Multivariate Statistics with Grouped Units Hal Whitehead BIOL4062/5062.

Multivariate Statistics with Grouped Units: Summary –Assumption –Multivariate t-test –Discriminant function analysis –Multivariate Analysis of Variance (MANOVA) –Canonical Variate Analysis

Multivariate Statistics with Grouped Units Data matrix is divided into groups of units: –Habitat types (community ecology) –Gender (animal behaviour) –Species (morphometrics) [Figure: data matrix of units by variables]

Multivariate Statistics with Grouped Units Assume: Homogeneity of Covariance Matrices (each group considered separately has the same covariance matrix)

Multivariate t-test Is there a significant difference between the multivariate means of two populations? Tested using Hotelling’s $T^2$. [Figure: two groups of units with mean vectors $X_1$ and $X_2$]

Multivariate t-test Hotelling’s $T^2$: $T^2 = \frac{n_1 n_2}{n_1 + n_2}\,(X_1 - X_2)'\,S^{-1}\,(X_1 - X_2)$ –$S$ is the (pooled within-group) covariance matrix –$X_1$ is vector of means for first group –$X_2$ is vector of means for second group –$n_1$ is number of units in first group –$n_2$ is number of units in second group
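The formula can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the course's software: the two small groups are made-up data, the helper names (`mean_vector`, `pooled_covariance`, `solve`, `hotelling_t2`) are hypothetical, and the conversion of $T^2$ to an F statistic with $k$ and $n_1 + n_2 - k - 1$ degrees of freedom is the usual textbook relation rather than something stated on the slide.

```python
# Stdlib-only sketch of Hotelling's T^2 for two groups of units.
# GROUP_1 and GROUP_2 are made-up data (rows = units, columns = variables).

GROUP_1 = [(1.0, 2.0), (2.0, 1.0), (3.0, 3.0)]
GROUP_2 = [(4.0, 5.0), (5.0, 4.0), (6.0, 6.0)]


def mean_vector(rows):
    """Column means of a list of equal-length rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def pooled_covariance(g1, g2):
    """Pooled within-group covariance matrix S (divisor n1 + n2 - 2)."""
    k = len(g1[0])
    ss = [[0.0] * k for _ in range(k)]
    for group in (g1, g2):
        centre = mean_vector(group)
        for row in group:
            d = [x - c for x, c in zip(row, centre)]
            for i in range(k):
                for j in range(k):
                    ss[i][j] += d[i] * d[j]
    dof = len(g1) + len(g2) - 2
    return [[value / dof for value in row] for row in ss]


def solve(a, b):
    """Solve a.x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [bi] for row, bi in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for j in range(c, n + 1):
                m[r][j] -= f * m[c][j]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][j] * x[j] for j in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def hotelling_t2(g1, g2):
    """Return (T^2, F) where F = T^2 (n1+n2-k-1) / (k (n1+n2-2))."""
    n1, n2, k = len(g1), len(g2), len(g1[0])
    diff = [a - b for a, b in zip(mean_vector(g1), mean_vector(g2))]
    s_inv_diff = solve(pooled_covariance(g1, g2), diff)  # S^-1 (X1 - X2)
    quad = sum(d * w for d, w in zip(diff, s_inv_diff))
    t2 = quad * n1 * n2 / (n1 + n2)
    f_stat = t2 * (n1 + n2 - k - 1) / (k * (n1 + n2 - 2))
    return t2, f_stat


if __name__ == "__main__":
    print(hotelling_t2(GROUP_1, GROUP_2))
```

Solving $S w = X_1 - X_2$ rather than inverting $S$ explicitly is a design choice of the sketch; both give the same $T^2$.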

Why do a Multivariate Test rather than a Series of Univariate Tests? –Significant differences may only be apparent in multivariate space –Reduce Type I errors (one test rather than many) [Figure: group means $X_1$ and $X_2$]
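The Type I error point can be made concrete with a short calculation. As a sketch, assume the $k$ univariate tests are independent and each is run at level $\alpha$ (an assumption made here for illustration; tests on correlated variables will not satisfy it exactly). The chance of at least one false positive is then:

```latex
P(\text{at least one Type I error}) = 1 - (1 - \alpha)^k,
\qquad \text{e.g. } \alpha = 0.05,\ k = 3:\quad 1 - 0.95^3 \approx 0.143
```

A single multivariate test at $\alpha = 0.05$ keeps the overall rate at 0.05.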

Discriminant Function Analysis Quantifies difference between two groups of units Purposes: –How do we express the difference between two groups of units? –Which variables are important in quantifying this difference? –How much overlap is there between the two groups of units? –Classification of new unit into one or the other of the two groups.

Discriminant Function Analysis Discriminant function best expresses difference between two groups: $D = S^{-1}(X_1 - X_2)$ –$S$ is covariance matrix –$X_1$ is vector of means for first group –$X_2$ is vector of means for second group The elements of this vector are the coefficients $a_1, \dots, a_k$ of the function $D = a_1 x_1 + a_2 x_2 + \dots + a_k x_k$

Discriminant Function Analysis $D = a_1 x_1 + a_2 x_2 + \dots + a_k x_k$

Discriminant Function Analysis $D = a_1 x_1 + a_2 x_2 + \dots + a_k x_k$ Stepwise removal of variables possible: $D = a_2 x_2 + \dots + a_{k-4} x_{k-4}$
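A small numerical sketch of the discriminant function for two variables follows. The covariance matrix `S` and the mean vectors are made-up values, the function names are hypothetical, and the explicit 2×2 inverse is used only to keep the example free of external libraries.

```python
# Stdlib-only sketch: discriminant coefficients a = S^-1 (X1 - X2)
# for two variables. S, MEAN_1 and MEAN_2 are made-up illustrative values.

S = [[1.0, 0.5],
     [0.5, 1.0]]          # within-group covariance matrix (assumed)
MEAN_1 = [5.0, 5.0]       # mean vector of group 1 (assumed)
MEAN_2 = [2.0, 2.0]       # mean vector of group 2 (assumed)


def inverse_2x2(matrix):
    """Explicit inverse of a 2x2 matrix."""
    (a, b), (c, d) = matrix
    det = a * d - b * c
    return [[d / det, -b / det],
            [-c / det, a / det]]


def discriminant_coefficients(s, mean1, mean2):
    """Coefficient vector (a_1, a_2) = S^-1 (X1 - X2)."""
    s_inv = inverse_2x2(s)
    diff = [m1 - m2 for m1, m2 in zip(mean1, mean2)]
    return [sum(s_inv[i][j] * diff[j] for j in range(2)) for i in range(2)]


def score(coeffs, unit):
    """Discriminant score D = a_1 x_1 + a_2 x_2 for one unit."""
    return sum(a * x for a, x in zip(coeffs, unit))


COEFFS = discriminant_coefficients(S, MEAN_1, MEAN_2)

if __name__ == "__main__":
    print("coefficients:", COEFFS)
    print("score at group 1 mean:", score(COEFFS, MEAN_1))
    print("score at group 2 mean:", score(COEFFS, MEAN_2))
```

The size of each coefficient depends on the units of its variable, so comparing the importance of variables is usually done on standardized variables; that step is left out of this sketch.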

Multivariate T-test and Discriminant Function Nutrients in foliage of maple trees (1) Units: 11 sites (6 poor; 5 good) Variables: Nitrogen, Phosphorus, Potassium Mean vectors: X(p) = 0.10 X(g) = Within group covariance matrix: S = 1/

Multivariate T-test and Discriminant Function Nutrients in foliage of maple trees (2) T² = (P<0.01) Discriminant Function: D = -1.99N P K

Multivariate T-test and Discriminant Function Analysis of forest health using aerial photography (1) Units: 22 trees (11 healthy; 11 diseased) Variables: red, green, blue image densities Mean vectors: X(d) = 1.16 X(h) = Within group covariance matrix: S = 1/

Multivariate T-test and Discriminant Function Analysis of forest health using aerial photography (2) T² = (P<0.01) Discriminant Function: D = 4.24R G B

Classification of new individual Use discriminant function ($D$): –allocate $i$ to group 1 if $D(i) > k$ –allocate $i$ to group 2 if $D(i) < k$ (with coefficients $S^{-1}(X_1 - X_2)$, group 1 has the higher mean score) Use Mahalanobis distances ($D_M$): –allocate $i$ to group 1 if $D_M(X_1, i) < D_M(X_2, i)$ –allocate $i$ to group 2 if $D_M(X_1, i) > D_M(X_2, i)$ where $D_M(X_1, i)$ is Mahalanobis distance between $i$ and the mean vector of group 1 {equivalent to discriminant function approach with $k=0$ when $D$ is measured from the midpoint of the two group means} Other approaches if data not normal, covariance matrices not homogeneous, ...
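The equivalence of the two rules can be checked with a short derivation. This is a sketch under the slide's assumption of a common covariance matrix $S$; $x_i$ denotes the data vector of unit $i$ (a symbol introduced here), and squared distances are compared, which gives the same ordering as the distances themselves.

```latex
\begin{aligned}
D_M^2(X_1, i) - D_M^2(X_2, i)
  &= (x_i - X_1)' S^{-1} (x_i - X_1) - (x_i - X_2)' S^{-1} (x_i - X_2) \\
  &= -2\,(X_1 - X_2)' S^{-1} x_i + (X_1 - X_2)' S^{-1} (X_1 + X_2) \\
  &= -2\,(X_1 - X_2)' S^{-1} \left[ x_i - \tfrac{1}{2}(X_1 + X_2) \right]
\end{aligned}
```

So unit $i$ is nearer to the mean of group 1 exactly when the discriminant score with coefficients $S^{-1}(X_1 - X_2)$, measured from the midpoint of the two means, is positive. Reversing the sign of the coefficient vector reverses the inequality.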

More than one Group: Multivariate Analysis of Variance (MANOVA) Are there significant differences between the means of several groups of points in multivariate space? Wilks’ $\Lambda = \dfrac{|\text{Within Gps Covariance Matrix}|}{|\text{Total Covariance Matrix}|}$ where $|W|$ is determinant of matrix $W$; $0$ {maximum difference} $< \Lambda < 1$ {no difference}

More than one Group: Multivariate Analysis of Variance (MANOVA) Are there significant differences between the means of several groups of points in multivariate space? If no difference between groups, then: $-[n - 1 - \tfrac{1}{2}(k + m)] \cdot \ln(\Lambda)$ is approximately $\chi^2$ with $k(m-1)$ degrees of freedom –$n$ no. of units –$k$ no. of variables –$m$ no. of groups Other possible MANOVA statistics
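As a rough illustration, Wilks' Λ and the chi-square approximation can be computed as below. Several things here are assumptions of the sketch rather than content of the slides: the three groups are made-up data, the function names are hypothetical, the within-group and total matrices are taken as sums-of-squares-and-cross-products (SSCP) matrices (the usual way Λ is computed), and the approximation is used in Bartlett's standard form, $-[n - 1 - \tfrac{1}{2}(k + m)] \ln \Lambda$.

```python
# Stdlib-only sketch of Wilks' Lambda and Bartlett's chi-square approximation.
# GROUPS is made-up data: 3 groups, 3 units each, 2 variables.
import math

GROUPS = [
    [(1.0, 2.0), (2.0, 1.0), (3.0, 3.0)],
    [(4.0, 5.0), (5.0, 4.0), (6.0, 6.0)],
    [(7.0, 8.0), (8.0, 7.0), (9.0, 9.0)],
]


def mean_vector(rows):
    """Column means of a list of equal-length rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def sscp(rows, centre):
    """Sums of squares and cross-products of rows about a centre."""
    k = len(centre)
    out = [[0.0] * k for _ in range(k)]
    for row in rows:
        d = [x - c for x, c in zip(row, centre)]
        for i in range(k):
            for j in range(k):
                out[i][j] += d[i] * d[j]
    return out


def det(a):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [list(row) for row in a]
    n = len(m)
    result = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        if m[p][c] == 0.0:
            return 0.0
        if p != c:
            m[c], m[p] = m[p], m[c]
            result = -result
        result *= m[c][c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for j in range(c, n):
                m[r][j] -= f * m[c][j]
    return result


def wilks_lambda(groups):
    """Lambda = |W| / |T| with W, T the within-group and total SSCP matrices."""
    all_rows = [row for g in groups for row in g]
    k = len(all_rows[0])
    total = sscp(all_rows, mean_vector(all_rows))
    within = [[0.0] * k for _ in range(k)]
    for g in groups:
        part = sscp(g, mean_vector(g))
        for i in range(k):
            for j in range(k):
                within[i][j] += part[i][j]
    return det(within) / det(total)


def bartlett_chi2(lam, n, k, m):
    """Return (statistic, degrees of freedom) for n units, k variables, m groups."""
    return -(n - 1 - 0.5 * (k + m)) * math.log(lam), k * (m - 1)


LAMBDA = wilks_lambda(GROUPS)
CHI2, DOF = bartlett_chi2(LAMBDA, n=9, k=2, m=3)

if __name__ == "__main__":
    print(LAMBDA, CHI2, DOF)
```

The statistic would then be compared with a chi-square distribution on `DOF` degrees of freedom; the lookup is omitted to keep the sketch free of external libraries.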

Canonical Variate Analysis Generalization of discriminant function analysis for more than two groups m groups, each with homogeneous covariance matrix

Canonical Variate Analysis –1st canonical axis inclined in direction of greatest variability between means of m groups of samples –2nd canonical axis in direction of next greatest variability –etc. (Axes not necessarily orthogonal) [Figure: groups of samples with 1st and 2nd canonical axes]

Canonical Variate Analysis Used to: –Disclose relationships between groups –How well, and by what functions, can groups be discriminated? –How different variables contribute to the discrimination of groups?

Canonical Variate Analysis Canonical variates are of form: $y_1 = a_{11} x_1 + a_{12} x_2 + \dots + a_{1k} x_k$; $y_2 = a_{21} x_1 + a_{22} x_2 + \dots + a_{2k} x_k$; ...; $y_{m-1} = a_{m-1,1} x_1 + a_{m-1,2} x_2 + \dots + a_{m-1,k} x_k$ Number of canonical variates: number of groups - 1 ($m - 1$), or the number of variables $k$ if that is smaller Tests of significance for each canonical variate

Canonical Variate Analysis –$T$: total covariance matrix –$W$: within-group covariance matrix –$B$: between-group covariance matrix, $B = T - W$ Eigenvectors of $W^{-1}B$ are canonical variate coefficients: $a_{11} x_1 + a_{12} x_2 + \dots + a_{1k} x_k$ ... Corresponding eigenvalues of $W^{-1}B$ are: (Between Groups Sum of Squares) / (Within Groups Sum of Squares)
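The first canonical axis can be approximated without a linear algebra library by power iteration on $W^{-1}B$. This is a sketch only: the matrices `W` and `B` are made-up SSCP matrices (taken as given rather than computed from data), the function names are hypothetical, and power iteration is a technique swapped in here for illustration. It recovers only the dominant eigenvalue and eigenvector, and assumes that eigenvalue is strictly the largest.

```python
# Stdlib-only sketch: dominant eigenvalue and eigenvector of W^-1 B by
# power iteration. W and B are made-up within- and between-group matrices.
import math

W = [[6.0, 3.0],
     [3.0, 6.0]]       # within-group SSCP matrix (assumed)
B = [[54.0, 54.0],
     [54.0, 54.0]]     # between-group SSCP matrix (assumed)


def solve(a, b):
    """Solve a.x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [bi] for row, bi in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for j in range(c, n + 1):
                m[r][j] -= f * m[c][j]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][j] * x[j] for j in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def mat_vec(a, v):
    """Matrix-vector product."""
    return [sum(x * y for x, y in zip(row, v)) for row in a]


def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))


def first_canonical_axis(w, b, iterations=200):
    """Return (eigenvalue, unit eigenvector) for the dominant root of W^-1 B."""
    k = len(w)
    start = [1.0 / (i + 1) for i in range(k)]   # arbitrary starting direction
    v = [x / norm(start) for x in start]
    eigenvalue = 0.0
    for _ in range(iterations):
        u = solve(w, mat_vec(b, v))             # u = W^-1 B v
        length = norm(u)
        if length == 0.0:                       # v lies in the null space of B
            break
        eigenvalue = length                     # |lambda| once v has converged
        v = [x / length for x in u]
    return eigenvalue, v


EIGENVALUE, AXIS = first_canonical_axis(W, B)

if __name__ == "__main__":
    print(EIGENVALUE, AXIS)
```

For these matrices the coefficients of the two variables come out equal, and the eigenvalue equals the ratio of between-group to within-group sums of squares of the scores $y = a_1 x_1 + a_2 x_2$, as the slide states.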

Example: Sperm Whale Movements Variables: –movements in 3hr, 12hr, 24hr (MOVE3, MOVE12, MOVE24) Units: –65 days following sperm whales Groups: –4 clans [Figure: timeline from 00:00 to 24:00 marking MOVE3, MOVE12 and MOVE24] MANOVA: Wilks’ Λ = (P=0.016)

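Example: Sperm whale movements Canonical discriminant functions [Table: columns are functions 1, 2, 3; rows are Constant, MOVE3, MOVE12, MOVE24, Eigenvalues, Significance; numeric entries not recoverable apart from Significance: P 0.25 P>0.9]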

Example: Sperm whale movements

Mahalanobis Classification functions [Table: columns are clans +1, 4+, Reg., Short; rows are CONSTANT, MOVE3, MOVE12, MOVE24; numeric entries not recoverable]

Example: Sperm whale movements Classification matrix (cases in row categories classified into columns) [Table: columns are +1, 4+, Reg., Short, %correct; rows are +1, 4+, Reg., Short, Total; numeric entries not recoverable]

Example: Sperm whale movements Jackknifed Classification matrix [Table: columns are +1, 4+, Reg., Short, %correct; rows are +1, 4+, Reg., Short, Total; numeric entries not recoverable]

Sperm Whale Movements More Complex MANOVAs [MOVE3,MOVE12,MOVE24]=CLAN –CLAN: Λ = (P=0.016) [MOVE3,MOVE12,MOVE24]=AREA+CLAN –AREA: Λ = (P=0.063) –CLAN: Λ = (P=0.042) [MOVE3,MOVE12,MOVE24]=AREA+CLAN(AREA) –AREA: Λ = (P=0.025) –CLAN nested within AREA: Λ = (P=0.057)

Discriminant Functions, Canonical Variates, etc. Are groups different in multivariate space? How are they different? Which variables most contribute to the differences? Classification of new individuals