
The Multiple Correlation Coefficient

Suppose $\begin{pmatrix} y \\ \mathbf{x} \end{pmatrix}$ has a $(p+1)$-variate normal distribution with mean vector $\boldsymbol{\mu} = \begin{pmatrix} \mu_y \\ \boldsymbol{\mu}_x \end{pmatrix}$ and covariance matrix $\boldsymbol{\Sigma} = \begin{pmatrix} \sigma_{yy} & \boldsymbol{\sigma}_{xy}' \\ \boldsymbol{\sigma}_{xy} & \boldsymbol{\Sigma}_{xx} \end{pmatrix}$. We are interested in whether the variable $y$ is independent of the vector $\mathbf{x}$. Definition: the multiple correlation coefficient is the maximum correlation between $y$ and a linear combination $\mathbf{a}'\mathbf{x}$ of the components of $\mathbf{x}$.

Derivation. For a fixed vector $\mathbf{a}$, the pair $\begin{pmatrix} y \\ \mathbf{a}'\mathbf{x} \end{pmatrix}$ has a bivariate normal distribution with mean vector $\begin{pmatrix} \mu_y \\ \mathbf{a}'\boldsymbol{\mu}_x \end{pmatrix}$ and covariance matrix $\begin{pmatrix} \sigma_{yy} & \mathbf{a}'\boldsymbol{\sigma}_{xy} \\ \mathbf{a}'\boldsymbol{\sigma}_{xy} & \mathbf{a}'\boldsymbol{\Sigma}_{xx}\mathbf{a} \end{pmatrix}$. The multiple correlation coefficient is the maximum over $\mathbf{a}$ of the correlation between $y$ and $\mathbf{a}'\mathbf{x}$.

The correlation between $y$ and $\mathbf{a}'\mathbf{x}$ is
$$\rho(\mathbf{a}) = \frac{\mathbf{a}'\boldsymbol{\sigma}_{xy}}{\sqrt{\sigma_{yy}\,\mathbf{a}'\boldsymbol{\Sigma}_{xx}\mathbf{a}}}.$$
Thus we want to choose $\mathbf{a}$ to maximize $\rho(\mathbf{a})$, or equivalently to maximize
$$\frac{(\mathbf{a}'\boldsymbol{\sigma}_{xy})^2}{\mathbf{a}'\boldsymbol{\Sigma}_{xx}\mathbf{a}}.$$

Note: the maximum is attained at $\mathbf{a} = k\,\boldsymbol{\Sigma}_{xx}^{-1}\boldsymbol{\sigma}_{xy}$ for any constant $k \neq 0$, and the resulting maximum correlation is
$$\rho_{y\cdot x} = \sqrt{\frac{\boldsymbol{\sigma}_{xy}'\boldsymbol{\Sigma}_{xx}^{-1}\boldsymbol{\sigma}_{xy}}{\sigma_{yy}}}.$$
The multiple correlation coefficient is independent of the value of $k$.
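As a quick numerical check of this claim, here is a minimal Python sketch, assuming NumPy; all names in it are illustrative. For a random covariance matrix, no random coefficient vector produces a larger correlation than $\mathbf{a}^* = \boldsymbol{\Sigma}_{xx}^{-1}\boldsymbol{\sigma}_{xy}$:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4))
Sigma = A @ A.T                        # random positive-definite covariance of (y, x1, x2, x3)
s_yy, s_xy, S_xx = Sigma[0, 0], Sigma[1:, 0], Sigma[1:, 1:]

def corr_with_y(a):
    """Correlation between y and a'x under the covariance matrix Sigma."""
    return (a @ s_xy) / np.sqrt(s_yy * (a @ S_xx @ a))

a_star = np.linalg.solve(S_xx, s_xy)   # the claimed maximizer (k = 1)
best_random = max(corr_with_y(rng.normal(size=3)) for _ in range(10_000))
print(corr_with_y(a_star) >= best_random)   # True: a_star attains the maximum
```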

The sample multiple correlation coefficient. To decide from data whether the variable $y$ is independent of the vector $\mathbf{x}$, partition the sample covariance matrix in the same way, $S = \begin{pmatrix} s_{yy} & \mathbf{s}_{xy}' \\ \mathbf{s}_{xy} & S_{xx} \end{pmatrix}$. Then the sample multiple correlation coefficient is
$$R = \sqrt{\frac{\mathbf{s}_{xy}' S_{xx}^{-1}\mathbf{s}_{xy}}{s_{yy}}}.$$
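As a computational illustration, here is a minimal Python sketch of this formula, assuming NumPy; the function name `sample_multiple_correlation` is ours, not from any particular library.

```python
import numpy as np

def sample_multiple_correlation(y, X):
    """Sample multiple correlation R between y (length n) and the p columns
    of X (n x p), via R^2 = s_xy' S_xx^{-1} s_xy / s_yy."""
    S = np.cov(np.column_stack([y, X]), rowvar=False)  # (p+1) x (p+1) matrix
    s_yy = S[0, 0]          # sample variance of y
    s_xy = S[1:, 0]         # sample covariances of the x's with y
    S_xx = S[1:, 1:]        # sample covariance matrix of the x's
    return np.sqrt(s_xy @ np.linalg.solve(S_xx, s_xy) / s_yy)
```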

Testing for independence between $y$ and $\mathbf{x}$. The test statistic is
$$F = \frac{R^2/p}{(1-R^2)/(n-p-1)}.$$
If independence is true, the test statistic $F$ has an $F$-distribution with $\nu_1 = p$ degrees of freedom in the numerator and $\nu_2 = n-p-1$ degrees of freedom in the denominator. The test is to reject independence if $F > F_{\alpha}(p,\,n-p-1)$.
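A matching sketch of this test in Python, assuming SciPy's stats module is available; `multiple_correlation_F_test` is a hypothetical name.

```python
from scipy import stats

def multiple_correlation_F_test(R, n, p):
    """F statistic and p-value for H0: y independent of x, based on the sample
    multiple correlation R with n observations and p x-variables."""
    F = (R**2 / p) / ((1 - R**2) / (n - p - 1))
    return F, stats.f.sf(F, p, n - p - 1)   # sf = upper-tail probability
```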

Other tests for independence. Test for zero correlation (independence between two variables): the test statistic is
$$t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}.$$
If independence is true, the test statistic $t$ has a $t$-distribution with $\nu = n-2$ degrees of freedom. The test is to reject independence if $|t| > t_{\alpha/2}(n-2)$.

Test for non-zero correlation ($H_0\!:\ \rho = \rho_0$): the test statistic is
$$z = \sqrt{n-3}\left(\tanh^{-1} r - \tanh^{-1}\rho_0\right).$$
If $H_0$ is true, the test statistic $z$ has approximately a standard normal distribution. We then reject $H_0$ if $|z| > z_{\alpha/2}$.
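Both correlation tests above fit in a few lines of Python; this is a sketch assuming NumPy and SciPy, with hypothetical function names.

```python
import numpy as np
from scipy import stats

def corr_t_test(r, n):
    """t test of H0: rho = 0 from a sample correlation r on n observations."""
    t = r * np.sqrt(n - 2) / np.sqrt(1 - r**2)
    return t, 2 * stats.t.sf(abs(t), n - 2)   # two-sided p-value

def corr_fisher_z_test(r, rho0, n):
    """Fisher-z test of H0: rho = rho0; z is approximately N(0, 1)."""
    z = np.sqrt(n - 3) * (np.arctanh(r) - np.arctanh(rho0))
    return z, 2 * stats.norm.sf(abs(z))       # two-sided p-value
```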

Test for zero partial correlation (conditional independence between two variables given a set of $p$ other variables). Let $r_{ij\cdot x}$ denote the sample partial correlation between $y_i$ and $y_j$ given $x_1, \dots, x_p$. The test statistic is
$$t = \frac{r_{ij\cdot x}\sqrt{n-p-2}}{\sqrt{1-r_{ij\cdot x}^2}}.$$
If conditional independence is true, the test statistic $t$ has a $t$-distribution with $\nu = n-p-2$ degrees of freedom. The test is to reject independence if $|t| > t_{\alpha/2}(n-p-2)$.

Test for non-zero partial correlation ($H_0\!:\ \rho_{ij\cdot x} = \rho_0$): the test statistic is
$$z = \sqrt{n-p-3}\left(\tanh^{-1} r_{ij\cdot x} - \tanh^{-1}\rho_0\right).$$
If $H_0$ is true, the test statistic $z$ has approximately a standard normal distribution. We then reject $H_0$ if $|z| > z_{\alpha/2}$.
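The partial-correlation versions only change the degrees of freedom; a sketch under the same assumptions, taking the partial correlation itself as input:

```python
import numpy as np
from scipy import stats

def partial_corr_t_test(r_part, n, p):
    """t test of H0: rho_ij.x = 0 for a partial correlation given p variables."""
    t = r_part * np.sqrt(n - p - 2) / np.sqrt(1 - r_part**2)
    return t, 2 * stats.t.sf(abs(t), n - p - 2)

def partial_corr_z_test(r_part, rho0, n, p):
    """Fisher-z test of H0: rho_ij.x = rho0; z is approximately N(0, 1)."""
    z = np.sqrt(n - p - 3) * (np.arctanh(r_part) - np.arctanh(rho0))
    return z, 2 * stats.norm.sf(abs(z))
```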

Canonical Correlation Analysis

The problem. Quite often one has collected data on several variables. The variables are grouped into two (or more) sets, and the researcher is interested in whether one set of variables is independent of the other set. In addition, if the two sets of variates are found to be dependent, it is important to describe and understand the nature of this dependence. The appropriate statistical procedure in this case is called Canonical Correlation Analysis.

Canonical correlation: an example. In the following study the researcher was interested in whether specific instructions on how to relax when taking tests and how to increase motivation would affect performance on standardized achievement tests in Reading, Language and Mathematics.

A group of 65 third- and fourth-grade students were rated, after the instruction and immediately prior to taking the Scholastic Achievement tests, on how relaxed they were ($X_1$) and how motivated they were ($X_2$). In addition, data were collected on the three achievement tests: Reading ($Y_1$), Language ($Y_2$) and Mathematics ($Y_3$). The data were tabulated on the next slide (the table is not reproduced in this transcript).

Definition (canonical variates and canonical correlations). Suppose $\begin{pmatrix} \mathbf{x} \\ \mathbf{y} \end{pmatrix}$ has a $(p+q)$-variate normal distribution with mean vector $\boldsymbol{\mu} = \begin{pmatrix} \boldsymbol{\mu}_x \\ \boldsymbol{\mu}_y \end{pmatrix}$ and covariance matrix $\boldsymbol{\Sigma} = \begin{pmatrix} \boldsymbol{\Sigma}_{xx} & \boldsymbol{\Sigma}_{xy} \\ \boldsymbol{\Sigma}_{yx} & \boldsymbol{\Sigma}_{yy} \end{pmatrix}$. Let $U_1 = \mathbf{a}_1'\mathbf{x}$ and $V_1 = \mathbf{b}_1'\mathbf{y}$ be such that $U_1$ and $V_1$ achieve the maximum correlation $\phi_1$. Then $U_1$ and $V_1$ are called the first pair of canonical variates and $\phi_1$ is called the first canonical correlation coefficient.

Derivation (first pair of canonical variates and canonical correlation). For fixed $\mathbf{a}$ and $\mathbf{b}$, the pair $\begin{pmatrix} U \\ V \end{pmatrix} = \begin{pmatrix} \mathbf{a}'\mathbf{x} \\ \mathbf{b}'\mathbf{y} \end{pmatrix}$ has covariance matrix
$$\begin{pmatrix} \mathbf{a}'\boldsymbol{\Sigma}_{xx}\mathbf{a} & \mathbf{a}'\boldsymbol{\Sigma}_{xy}\mathbf{b} \\ \mathbf{b}'\boldsymbol{\Sigma}_{yx}\mathbf{a} & \mathbf{b}'\boldsymbol{\Sigma}_{yy}\mathbf{b} \end{pmatrix},$$
hence
$$\rho(U, V) = \frac{\mathbf{a}'\boldsymbol{\Sigma}_{xy}\mathbf{b}}{\sqrt{\mathbf{a}'\boldsymbol{\Sigma}_{xx}\mathbf{a}\;\mathbf{b}'\boldsymbol{\Sigma}_{yy}\mathbf{b}}}.$$

Thus we want to choose $\mathbf{a}$ and $\mathbf{b}$ so that $\rho(U, V)$ is at a maximum, or equivalently so that
$$\frac{(\mathbf{a}'\boldsymbol{\Sigma}_{xy}\mathbf{b})^2}{\mathbf{a}'\boldsymbol{\Sigma}_{xx}\mathbf{a}\;\mathbf{b}'\boldsymbol{\Sigma}_{yy}\mathbf{b}}$$
is at a maximum.

Computing the derivatives with respect to $\mathbf{a}$ and $\mathbf{b}$ and setting them equal to zero leads to the equations
$$\boldsymbol{\Sigma}_{xy}\mathbf{b} = \lambda\,\boldsymbol{\Sigma}_{xx}\mathbf{a} \quad\text{and}\quad \boldsymbol{\Sigma}_{yx}\mathbf{a} = \lambda\,\boldsymbol{\Sigma}_{yy}\mathbf{b}.$$

Eliminating $\mathbf{b}$ gives
$$\boldsymbol{\Sigma}_{xx}^{-1}\boldsymbol{\Sigma}_{xy}\boldsymbol{\Sigma}_{yy}^{-1}\boldsymbol{\Sigma}_{yx}\,\mathbf{a} = k\,\mathbf{a}.$$
This shows that $\mathbf{a}$ is an eigenvector of $\boldsymbol{\Sigma}_{xx}^{-1}\boldsymbol{\Sigma}_{xy}\boldsymbol{\Sigma}_{yy}^{-1}\boldsymbol{\Sigma}_{yx}$; at the maximum, $k$ is the largest eigenvalue of this matrix and $\mathbf{a}$ is the eigenvector associated with the largest eigenvalue.

Also, eliminating $\mathbf{a}$ instead gives
$$\boldsymbol{\Sigma}_{yy}^{-1}\boldsymbol{\Sigma}_{yx}\boldsymbol{\Sigma}_{xx}^{-1}\boldsymbol{\Sigma}_{xy}\,\mathbf{b} = k\,\mathbf{b},$$
so $\mathbf{b}$ is the corresponding eigenvector of $\boldsymbol{\Sigma}_{yy}^{-1}\boldsymbol{\Sigma}_{yx}\boldsymbol{\Sigma}_{xx}^{-1}\boldsymbol{\Sigma}_{xy}$.

Summary: the first pair of canonical variates, $U_1 = \mathbf{a}_1'\mathbf{x}$ and $V_1 = \mathbf{b}_1'\mathbf{y}$, are found by taking $\mathbf{a}_1$ and $\mathbf{b}_1$ to be eigenvectors of the matrices $\boldsymbol{\Sigma}_{xx}^{-1}\boldsymbol{\Sigma}_{xy}\boldsymbol{\Sigma}_{yy}^{-1}\boldsymbol{\Sigma}_{yx}$ and $\boldsymbol{\Sigma}_{yy}^{-1}\boldsymbol{\Sigma}_{yx}\boldsymbol{\Sigma}_{xx}^{-1}\boldsymbol{\Sigma}_{xy}$, respectively, associated with the largest eigenvalue (the same for both matrices). The largest eigenvalue of the two matrices is the square of the first canonical correlation coefficient $\phi_1$.
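The eigenvalue construction just summarized can be sketched directly in Python with NumPy; `first_canonical_pair` is an illustrative name, and the covariance blocks are taken as given (with data, the sample blocks $S_{xx}$, $S_{xy}$, $S_{yy}$ would be used instead).

```python
import numpy as np

def first_canonical_pair(S_xx, S_xy, S_yy):
    """First canonical correlation phi_1 and coefficient vectors a_1, b_1
    from a partitioned covariance matrix."""
    # M = S_xx^{-1} S_xy S_yy^{-1} S_yx; its largest eigenvalue is phi_1^2
    M = np.linalg.solve(S_xx, S_xy) @ np.linalg.solve(S_yy, S_xy.T)
    eigvals, eigvecs = np.linalg.eig(M)
    top = np.argmax(eigvals.real)
    phi1 = np.sqrt(eigvals.real[top])
    a1 = eigvecs.real[:, top]                 # coefficients for U_1 = a_1' x
    b1 = np.linalg.solve(S_yy, S_xy.T @ a1)   # b_1 proportional to S_yy^{-1} S_yx a_1
    return phi1, a1, b1
```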

The remaining canonical variates and canonical correlation coefficients are found successively. The second pair of canonical variates, $U_2 = \mathbf{a}_2'\mathbf{x}$ and $V_2 = \mathbf{b}_2'\mathbf{y}$, are found by choosing $\mathbf{a}_2$ and $\mathbf{b}_2$ so that: 1. $(U_2, V_2)$ are independent of $(U_1, V_1)$; 2. the correlation between $U_2$ and $V_2$ is maximized. The correlation, $\phi_2$, between $U_2$ and $V_2$ is called the second canonical correlation coefficient.

In general, the $i$th pair of canonical variates, $U_i = \mathbf{a}_i'\mathbf{x}$ and $V_i = \mathbf{b}_i'\mathbf{y}$, are found by choosing $\mathbf{a}_i$ and $\mathbf{b}_i$ so that: 1. $(U_i, V_i)$ are independent of $(U_1, V_1), \dots, (U_{i-1}, V_{i-1})$; 2. the correlation between $U_i$ and $V_i$ is maximized. The correlation, $\phi_i$, between $U_i$ and $V_i$ is called the $i$th canonical correlation coefficient.

Derivation (second pair of canonical variates and canonical correlation). The vector $\begin{pmatrix} U_1 \\ V_1 \\ U_2 \\ V_2 \end{pmatrix}$ has a covariance matrix whose entries are the quantities $\mathbf{a}_i'\boldsymbol{\Sigma}_{xx}\mathbf{a}_j$, $\mathbf{a}_i'\boldsymbol{\Sigma}_{xy}\mathbf{b}_j$ and $\mathbf{b}_i'\boldsymbol{\Sigma}_{yy}\mathbf{b}_j$.

Maximizing the correlation between $U_2$ and $V_2$ is equivalent to maximizing $\mathbf{a}_2'\boldsymbol{\Sigma}_{xy}\mathbf{b}_2$ subject to $\mathbf{a}_2'\boldsymbol{\Sigma}_{xx}\mathbf{a}_2 = \mathbf{b}_2'\boldsymbol{\Sigma}_{yy}\mathbf{b}_2 = 1$, using the Lagrange multiplier technique.

The independence of $(U_2, V_2)$ and $(U_1, V_1)$ also gives the restrictions $\mathbf{a}_2'\boldsymbol{\Sigma}_{xx}\mathbf{a}_1 = 0$, $\mathbf{b}_2'\boldsymbol{\Sigma}_{yy}\mathbf{b}_1 = 0$, $\mathbf{a}_2'\boldsymbol{\Sigma}_{xy}\mathbf{b}_1 = 0$ and $\mathbf{b}_2'\boldsymbol{\Sigma}_{yx}\mathbf{a}_1 = 0$.

These equations can be used to show that $\mathbf{a}_2$ and $\mathbf{b}_2$ are eigenvectors of the matrices $\boldsymbol{\Sigma}_{xx}^{-1}\boldsymbol{\Sigma}_{xy}\boldsymbol{\Sigma}_{yy}^{-1}\boldsymbol{\Sigma}_{yx}$ and $\boldsymbol{\Sigma}_{yy}^{-1}\boldsymbol{\Sigma}_{yx}\boldsymbol{\Sigma}_{xx}^{-1}\boldsymbol{\Sigma}_{xy}$ associated with the second largest eigenvalue (the same for both matrices). The second largest eigenvalue of the two matrices is the square of the second canonical correlation coefficient $\phi_2$.

Continuing, the coefficients $\mathbf{a}_i$ and $\mathbf{b}_i$ for the $i$th pair of canonical variates are eigenvectors of the same two matrices, associated with the $i$th largest eigenvalue (again the same for both matrices). The $i$th largest eigenvalue of the two matrices is the square of the $i$th canonical correlation coefficient $\phi_i$, as sketched below.
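Extending the earlier sketch to all pairs: since the two matrices share their nonzero eigenvalues, either one yields every squared canonical correlation (NumPy assumed; the function name is hypothetical).

```python
import numpy as np

def canonical_correlations(S_xx, S_xy, S_yy):
    """All canonical correlations phi_1 >= phi_2 >= ..., as square roots of
    the shared eigenvalues of the two matrices in the text."""
    M_x = np.linalg.solve(S_xx, S_xy) @ np.linalg.solve(S_yy, S_xy.T)
    lam = np.sort(np.linalg.eigvals(M_x).real)[::-1]   # eigenvalues, descending
    k = min(S_xx.shape[0], S_yy.shape[0])              # number of canonical pairs
    return np.sqrt(np.clip(lam[:k], 0.0, None))        # clip guards tiny negatives
```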

Example. Variables: relaxation score ($X_1$) and motivation score ($X_2$); Reading ($Y_1$), Language ($Y_2$) and Mathematics ($Y_3$).

Summary Statistics (table not reproduced in this transcript)

Canonical Correlation Statistics (tables not reproduced in this transcript)

Summary. $U_1$ is a linear combination of Relax and Mot, and $V_1$ is a linear combination of Read, Lang and Math, with $\phi_1 = .592$. $U_2$ is a second linear combination of Relax and Mot, and $V_2$ is a second linear combination of Math, Read and Lang, with $\phi_2 = .159$. (The numerical coefficients of the canonical variates appeared in the original slide and are not reproduced here.)