Chapter-1 Multivariate Normal Distributions


Chapter-1 Multivariate Normal Distributions Dr. A. PHILIP AROKIADOSS Assistant Professor Department of Statistics St. Joseph’s College (Autonomous) Tiruchirappalli-620 002.

1. The Normal distribution – parameters μ and σ (or σ²). Comment: If μ = 0 and σ = 1 the distribution is called the standard normal distribution. (Illustrated: a Normal distribution with μ = 50 and σ = 15, and one with μ = 70 and σ = 20.)

The probability density of the normal distribution:

f(x) = (1/(σ√(2π))) exp(−(x − μ)²/(2σ²)),  −∞ < x < ∞

If a random variable, X, has a normal distribution with mean μ and variance σ², then we will write X ~ N(μ, σ²).
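As a quick numerical check of the density formula, here is a minimal sketch in plain Python (the function name normal_pdf is ours, not from the slides):

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) at x: exp(-(x-mu)^2 / (2 sigma^2)) / (sigma sqrt(2 pi))."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# At x = mu the density is 1/(sigma * sqrt(2*pi)); for the standard
# normal that is about 0.3989.
print(normal_pdf(0.0))                        # standard normal at its mode
print(normal_pdf(50.0, mu=50.0, sigma=15.0))  # the N(50, 15^2) curve at its mode
```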

The multivariate Normal distribution

Let x = (x1, x2, …, xp)′ be a random vector, and let μ = (μ1, μ2, …, μp)′ be a vector of constants (the mean vector).

Let Σ = (σij) be a p × p positive definite matrix (the covariance matrix).

Definition: The matrix A is positive semi-definite if x′Ax ≥ 0 for all x. Further, the matrix A is positive definite if x′Ax > 0 for all x ≠ 0.

Suppose that the joint density of the random vector x = (x1, x2, …, xp)′ is

f(x) = (2π)^(−p/2) |Σ|^(−1/2) exp(−½ (x − μ)′ Σ⁻¹ (x − μ)).

Then the random vector [x1, x2, …, xp] is said to have a p-variate normal distribution with mean vector μ and covariance matrix Σ. We will write x ~ Np(μ, Σ).
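The density above can be evaluated directly in numpy as a sanity check; a minimal sketch (the function name and the example matrix are illustrative, not from the slides):

```python
import numpy as np

def mvn_pdf(x, mu, Sigma):
    """Density of N_p(mu, Sigma) at x, computed from the formula
    (2 pi)^(-p/2) |Sigma|^(-1/2) exp(-0.5 (x-mu)' Sigma^{-1} (x-mu))."""
    x, mu = np.asarray(x, float), np.asarray(mu, float)
    Sigma = np.asarray(Sigma, float)
    p = len(mu)
    diff = x - mu
    quad = diff @ np.linalg.solve(Sigma, diff)    # (x-mu)' Sigma^{-1} (x-mu)
    norm_const = (2 * np.pi) ** (-p / 2) * np.linalg.det(Sigma) ** (-0.5)
    return norm_const * np.exp(-0.5 * quad)

# Bivariate example with unit variances and correlation 0.5:
Sigma = np.array([[1.0, 0.5],
                  [0.5, 1.0]])
# At x = mu the density is (2 pi)^{-1} |Sigma|^{-1/2}:
print(mvn_pdf([0, 0], [0, 0], Sigma))
```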

Example: the Bivariate Normal distribution, with μ = (μ1, μ2)′ and

Σ = [ σ1²  ρσ1σ2 ; ρσ1σ2  σ2² ].

Now |Σ| = σ1²σ2²(1 − ρ²) and

Σ⁻¹ = 1/(σ1²σ2²(1 − ρ²)) [ σ2²  −ρσ1σ2 ; −ρσ1σ2  σ1² ].

Hence f(x1, x2) = 1/(2πσ1σ2√(1 − ρ²)) exp(−Q/2), where

Q = 1/(1 − ρ²) [ ((x1 − μ1)/σ1)² − 2ρ ((x1 − μ1)/σ1)((x2 − μ2)/σ2) + ((x2 − μ2)/σ2)² ].

Note: f(x1, x2) is constant when Q is constant. This is true when x1, x2 lie on an ellipse centered at (μ1, μ2).

Surface Plots of the bivariate Normal distribution

Contour Plots of the bivariate Normal distribution

Scatter Plots of data from the bivariate Normal distribution

Trivariate Normal distribution – contour map in (x1, x2, x3), centered at the mean vector (figures omitted).

Example: In the following study, data were collected for a sample of n = 183 females on the variables Age, Height (Ht), Weight (Wt), Birth control pill use (Bpl – 1 = no pill, 2 = pill) and the following blood chemistry measurements: Cholesterol (Chl), Albumin (Alb), Calcium (Ca) and Uric Acid (UA). The data are tabulated on the next page:

The data: (table omitted)

Alb, Chl, Bpl (plots omitted)

Marginal and Conditional distributions

Theorem: (Marginal distributions for the Multivariate Normal distribution) Suppose x = (x(1)′, x(2)′)′ has a p-variate Normal distribution with mean vector μ = (μ(1)′, μ(2)′)′ and covariance matrix

Σ = [ Σ11  Σ12 ; Σ21  Σ22 ].

Then the marginal distribution of x(i) is a qi-variate Normal distribution (q1 = q, q2 = p − q) with mean vector μ(i) and covariance matrix Σii.

Theorem: (Conditional distributions for the Multivariate Normal distribution) Suppose x = (x(1)′, x(2)′)′ has a p-variate Normal distribution with mean vector μ and covariance matrix Σ, partitioned as above. Then the conditional distribution of x(i) given x(j) is a qi-variate Normal distribution with mean vector

μ(i) + Σij Σjj⁻¹ (x(j) − μ(j))

and covariance matrix

Σii − Σij Σjj⁻¹ Σji.

Proof (of the previous two theorems): Partition x, μ and Σ as above. The joint density of x(1) and x(2) has exponent −½ (x − μ)′ Σ⁻¹ (x − μ). Writing Σ22·1 = Σ22 − Σ21 Σ11⁻¹ Σ12, this quadratic form splits into a term in x(1) alone,

(x(1) − μ(1))′ Σ11⁻¹ (x(1) − μ(1)),

plus a term in the residual x(2) − μ(2) − Σ21 Σ11⁻¹ (x(1) − μ(1)) weighted by Σ22·1⁻¹; the determinant factors correspondingly as |Σ| = |Σ11| |Σ22·1|.

Integrating out x(2) shows that the marginal distribution of x(1) is Nq(μ(1), Σ11).

Dividing the joint density by this marginal shows that the conditional distribution of x(2) given x(1) is Np−q(μ(2) + Σ21 Σ11⁻¹ (x(1) − μ(1)), Σ22·1).

Σ22·1 = Σ22 − Σ21 Σ11⁻¹ Σ12 is called the matrix of partial variances and covariances. Its (i, j) element, σij·1,…,q, is called the partial covariance (variance if i = j) between xi and xj given x1, …, xq. The corresponding correlation, ρij·1,…,q = σij·1,…,q / √(σii·1,…,q σjj·1,…,q), is called the partial correlation between xi and xj given x1, …, xq.

Σ21 Σ11⁻¹ is called the matrix of regression coefficients for predicting xq+1, xq+2, …, xp from x1, …, xq. The mean vector of xq+1, xq+2, …, xp given x1, …, xq is

μ(2) + Σ21 Σ11⁻¹ (x(1) − μ(1)).

Example: Suppose that x = (x1, x2, x3, x4)′ is 4-variate normal with mean vector μ and covariance matrix Σ (the numeric values appeared on the original slides).

The marginal distribution of (x1, x2)′ is bivariate normal with mean vector μ(1) and covariance matrix Σ11. The marginal distribution of (x2, x3, x4)′ is trivariate normal with the corresponding subvector of μ and submatrix of Σ.

To find the conditional distribution of (x3, x4)′ given (x1, x2)′, first compute Σ11⁻¹ and then the matrix of regression coefficients Σ21 Σ11⁻¹ for predicting x3, x4 from x1, x2.

Thus the conditional distribution of (x3, x4)′ given (x1, x2)′ is bivariate Normal with mean vector μ(2) + Σ21 Σ11⁻¹ (x(1) − μ(1)) and partial covariance matrix Σ22 − Σ21 Σ11⁻¹ Σ12.
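The computation in this example can be sketched in code. The function below (our naming; the example matrices are illustrative, not the ones from the original slides) applies the conditional-distribution formulas μ(2) + Σ21 Σ11⁻¹ (x(1) − μ(1)) and Σ22 − Σ21 Σ11⁻¹ Σ12:

```python
import numpy as np

def mvn_conditional(mu, Sigma, q, x1):
    """Conditional distribution of x(2) = (x_{q+1}, ..., x_p) given x(1) = x1
    for x ~ N_p(mu, Sigma): returns the conditional mean and covariance
        mu2 + S21 S11^{-1} (x1 - mu1),   S22 - S21 S11^{-1} S12."""
    mu, Sigma, x1 = np.asarray(mu), np.asarray(Sigma), np.asarray(x1)
    mu1, mu2 = mu[:q], mu[q:]
    S11, S12 = Sigma[:q, :q], Sigma[:q, q:]
    S21, S22 = Sigma[q:, :q], Sigma[q:, q:]
    B = S21 @ np.linalg.inv(S11)          # matrix of regression coefficients
    cond_mean = mu2 + B @ (x1 - mu1)
    cond_cov = S22 - B @ S12              # partial covariance matrix
    return cond_mean, cond_cov

# Illustrative 4-variate example (not the matrices from the slides):
mu = np.array([1.0, 2.0, 3.0, 4.0])
Sigma = np.array([[4.0, 1.0, 0.5, 0.0],
                  [1.0, 3.0, 0.0, 0.5],
                  [0.5, 0.0, 2.0, 1.0],
                  [0.0, 0.5, 1.0, 2.0]])
m, C = mvn_conditional(mu, Sigma, q=2, x1=np.array([1.0, 2.0]))
print(m)   # equals (mu3, mu4) here, since x(1) equals its mean
print(C)   # partial covariance: diagonal entries smaller than in Sigma
```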

Using SPSS Note: The use of another statistical package such as Minitab is similar to using SPSS

The first step is to input the data. The data are usually contained in some type of file: text files, Excel files, or other file types.

After starting the SPSS program the following dialogue box appears:

If you select Opening an existing file and press OK, the following dialogue box appears.

Once you have selected the file and its type,

the following dialogue box appears:

If the variable names are in the file, ask SPSS to read the names. If you do not specify the Range, the program will identify it. Once you click OK, two windows will appear.

A window containing the output

The other containing the data:

To perform any statistical analysis select the Analyze menu:

To compute correlations select Correlate then Bivariate To compute partial correlations select Correlate then Partial

for Bivariate correlation the following dialogue appears

the output for Bivariate correlation:

for partial correlation the following dialogue appears

the output for partial correlation:

- - - P A R T I A L   C O R R E L A T I O N   C O E F F I C I E N T S - - -
Controlling for.. AGE HT WT

         CHL            ALB            CA             UA
CHL   1.0000 (  0)    .1299 (178)    .2957 (178)    .2338 (178)
      P= .            P= .082        P= .000        P= .002
ALB    .1299 (178)   1.0000 (  0)    .4778 (178)    .1226 (178)
      P= .082        P= .            P= .000        P= .101
CA     .2957 (178)    .4778 (178)   1.0000 (  0)    .1737 (178)
      P= .000        P= .000        P= .            P= .020
UA     .2338 (178)    .1226 (178)    .1737 (178)   1.0000 (  0)
      P= .002        P= .101        P= .020        P= .

(Coefficient / (D.F.) / 2-tailed Significance)
" . " is printed if a coefficient cannot be computed

Compare these with the bivariate correlation:

Partial Correlations
        CHL     ALB     CA      UA
CHL   1.0000   .1299   .2957   .2338
ALB    .1299  1.0000   .4778   .1226
CA     .2957   .4778  1.0000   .1737
UA     .2338   .1226   .1737  1.0000

(The corresponding table of bivariate correlations appeared alongside on the slide.)
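Partial correlations like those above can be computed from a covariance or correlation matrix using the partial covariance formula Σ22·1 = Σ22 − Σ21 Σ11⁻¹ Σ12 and rescaling to correlations. A sketch (illustrative numbers, not the blood-chemistry data):

```python
import numpy as np

def partial_corr(Sigma, keep, control):
    """Partial correlations among variables `keep`, controlling for variables
    `control`: form S22.1 = S22 - S21 S11^{-1} S12, then scale to correlations."""
    Sigma = np.asarray(Sigma, float)
    S11 = Sigma[np.ix_(control, control)]
    S12 = Sigma[np.ix_(control, keep)]
    S21 = Sigma[np.ix_(keep, control)]
    S22 = Sigma[np.ix_(keep, keep)]
    S22_1 = S22 - S21 @ np.linalg.solve(S11, S12)   # partial covariances
    d = np.sqrt(np.diag(S22_1))
    return S22_1 / np.outer(d, d)                   # partial correlations

# Illustrative 3-variable correlation matrix:
R = np.array([[1.0, 0.6, 0.5],
              [0.6, 1.0, 0.4],
              [0.5, 0.4, 1.0]])
P = partial_corr(R, keep=[0, 1], control=[2])
print(P)   # off-diagonal entry is the partial correlation r12.3 ≈ 0.504
```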

In the last example the bivariate and partial correlations were roughly in agreement. This is not necessarily the case in all situations. An example: the following data were collected on three variables: Age, Calcium Intake in diet (CAI), and Bone Mass density (BMI).

The data

Bivariate correlations

Partial correlations

Scatter plot of CAI vs BMI (r = −0.447) (figure omitted)

3D plot of Age, CAI and BMI (figure omitted)

Independence

Note: two vectors, x(1) and x(2), are independent if f(x(1), x(2)) = f1(x(1)) f2(x(2)). Then the conditional distribution of x(1) given x(2) is equal to the marginal distribution of x(1). If x = (x(1)′, x(2)′)′ is multivariate Normal with mean vector μ and covariance matrix Σ (partitioned as before), then the two vectors x(1), x(2) are independent if Σ12 = 0.

The components of the vector x are independent if σij = 0 for all i and j (i ≠ j), i.e. Σ is a diagonal matrix.

Transformations

Transformations Theorem: Let x1, x2, …, xn denote random variables with joint probability density function f(x1, x2, …, xn). Let
u1 = h1(x1, x2, …, xn)
u2 = h2(x1, x2, …, xn)
⁞
un = hn(x1, x2, …, xn)
define an invertible transformation from the x's to the u's.

Then the joint probability density function of u1, u2, …, un is given by

g(u1, …, un) = f(x1(u), …, xn(u)) |J|,

where J = det(∂xi/∂uj) is the Jacobian of the transformation.

Example: Suppose that u1, u2 are independent with a uniform distribution from 0 to 1. Find the distribution of

z1 = √(−2 ln u1) cos(2π u2),  z2 = √(−2 ln u1) sin(2π u2).

Solving for u1 and u2 we get the inverse transformation

u1 = exp(−(z1² + z2²)/2),  u2 = (1/(2π)) arctan(z2/z1).

The Jacobian of the transformation is

J = ∂(u1, u2)/∂(z1, z2) = −(1/(2π)) exp(−(z1² + z2²)/2).

The joint density of u1, u2 is f(u1, u2) = f1(u1) f2(u2) = 1 on the unit square. Hence the joint density of z1 and z2 is

g(z1, z2) = |J| = (1/√(2π)) e^(−z1²/2) · (1/√(2π)) e^(−z2²/2).

Thus z1 and z2 are independent standard normal. The transformation is useful for converting uniform RVs into independent standard normal RVs.
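This is the Box-Muller method. A minimal sketch in plain Python (function name ours), with a simulation check that the first two moments come out near 0 and 1:

```python
import math
import random

def box_muller(u1, u2):
    """Box-Muller transform: maps two independent Uniform(0,1) draws
    to two independent standard normal draws."""
    r = math.sqrt(-2.0 * math.log(u1))
    return r * math.cos(2.0 * math.pi * u2), r * math.sin(2.0 * math.pi * u2)

# Generate a sample and check its mean and variance:
random.seed(0)
zs = [z for _ in range(50_000) for z in box_muller(random.random(), random.random())]
mean = sum(zs) / len(zs)
var = sum(z * z for z in zs) / len(zs) - mean * mean
print(round(mean, 2), round(var, 2))   # close to 0 and 1
```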

Example: Suppose that x1, x2 are independent with density functions f1(x1) and f2(x2). Find the distribution of u1 = x1 + x2, u2 = x1 − x2. Solving for x1 and x2 we get the inverse transformation

x1 = (u1 + u2)/2,  x2 = (u1 − u2)/2.

The Jacobian of the transformation is

J = det [ ∂x1/∂u1  ∂x1/∂u2 ; ∂x2/∂u1  ∂x2/∂u2 ] = det [ 1/2  1/2 ; 1/2  −1/2 ] = −1/2.

The joint density of x1, x2 is f(x1, x2) = f1(x1) f2(x2). Hence the joint density of u1 and u2 is

g(u1, u2) = (1/2) f1((u1 + u2)/2) f2((u1 − u2)/2).

Theorem: Let x1, x2, …, xn denote random variables with joint probability density function f(x1, x2, …, xn). Let
u1 = a11x1 + a12x2 + … + a1nxn + c1
u2 = a21x1 + a22x2 + … + a2nxn + c2
⁞
un = an1x1 + an2x2 + … + annxn + cn
define an invertible linear transformation u = Ax + c from the x's to the u's.

Then the joint probability density function of u1, u2, …, un is given by

g(u1, …, un) = (1/|det A|) f(x1, …, xn),  where x = A⁻¹(u − c).

Theorem: Suppose that the random vector x = (x1, x2, …, xp)′ has a p-variate normal distribution with mean vector μ and covariance matrix Σ, and let A be a nonsingular p × p matrix. Then u = Ax + c has a p-variate normal distribution with mean vector Aμ + c and covariance matrix AΣA′.

Proof: Substituting x = A⁻¹(u − c) into the density gives, up to the factor 1/|det A|, the exponent

−½ (A⁻¹(u − c) − μ)′ Σ⁻¹ (A⁻¹(u − c) − μ) = −½ (u − (Aμ + c))′ (AΣA′)⁻¹ (u − (Aμ + c)),

since (AΣA′)⁻¹ = (A′)⁻¹ Σ⁻¹ A⁻¹. Also |det(AΣA′)| = (det A)² |det Σ|, so the normalizing constant matches that of Np(Aμ + c, AΣA′). QED
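The theorem can be checked by simulation: sample x ~ Np(μ, Σ), apply u = Ax + c, and compare the sample moments with Aμ + c and AΣA′ (illustrative parameters, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters for x ~ N_2(mu, Sigma) and u = A x + c:
mu = np.array([1.0, -1.0])
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])
A = np.array([[1.0, 1.0],
              [1.0, -1.0]])
c = np.array([0.5, 0.0])

x = rng.multivariate_normal(mu, Sigma, size=200_000)
u = x @ A.T + c                 # u = A x + c, applied row by row

# Sample moments should approach A mu + c and A Sigma A':
print(u.mean(axis=0))           # close to A @ mu + c
print(np.cov(u.T))              # close to A @ Sigma @ A.T
```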

Theorem (Linear transformations of Normal RVs): Suppose that the random vector x has a p-variate normal distribution with mean vector μ and covariance matrix Σ, and let A be a q × p matrix of rank q ≤ p. Then u = Ax has a q-variate normal distribution with mean vector Aμ and covariance matrix AΣA′.

Proof: Let B be a (p − q) × p matrix so that C = [ A ; B ] is invertible. Then Cx is p-variate normal with mean vector Cμ and covariance matrix CΣC′.

Thus the marginal distribution of u = Ax, the first q components of Cx, is q-variate normal with mean vector Aμ and covariance matrix AΣA′.

Summary – Distribution Theory for the Multivariate Normal:
Marginal distribution: x(i) ~ Nqi(μ(i), Σii).
Conditional distribution: x(i) | x(j) ~ Nqi(μ(i) + Σij Σjj⁻¹ (x(j) − μ(j)), Σii − Σij Σjj⁻¹ Σji).

Linear transformations of Normal RVs: if the random vector x has a p-variate normal distribution with mean vector μ and covariance matrix Σ, and A is a q × p matrix of rank q ≤ p, then u = Ax has a q-variate normal distribution with mean vector Aμ and covariance matrix AΣA′.

Recall the definition of eigenvector and eigenvalue: Let A be an n × n matrix. If Ax = λx for some scalar λ and some nonzero vector x, then λ is called an eigenvalue of A and x is called an eigenvector of A.

Theorem: If the matrix A is symmetric with distinct eigenvalues λ1, …, λn and corresponding eigenvectors x1, …, xn (assume each is scaled to unit length, xi′xi = 1), then the eigenvectors are mutually orthogonal and A = Σi λi xi xi′ (the spectral decomposition of A).
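numpy's eigh routine computes this decomposition for symmetric matrices; a short check of the spectral decomposition A = Σi λi xi xi′ on an illustrative matrix:

```python
import numpy as np

# Spectral decomposition of a symmetric matrix: A = sum_i lam_i v_i v_i'
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
lam, V = np.linalg.eigh(A)      # eigh is for symmetric matrices;
                                # columns of V are unit eigenvectors

# Reassemble A from its eigenvalues and eigenvectors:
A_rebuilt = sum(lam[i] * np.outer(V[:, i], V[:, i]) for i in range(len(lam)))
print(np.allclose(A, A_rebuilt))        # True: the decomposition recovers A
print(np.allclose(V.T @ V, np.eye(2)))  # True: eigenvectors are orthonormal
```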

Applications of these results to Statistics: Suppose that the random vector x = (x1, x2, …, xp)′ has a p-variate normal distribution with mean vector μ and covariance matrix Σ. The covariance matrix Σ is positive definite. Let λ1, …, λp be the eigenvalues of Σ with corresponding eigenvectors v1, …, vp of unit length. Note λ1 > 0, …, λp > 0.

Let Σ^(−1/2) = Σi λi^(−1/2) vi vi′, so that Σ^(−1/2) Σ Σ^(−1/2) = I.

Then z = Σ^(−1/2)(x − μ) has a p-variate normal distribution with mean vector 0 and covariance matrix Σ^(−1/2) Σ Σ^(−1/2) = I.

Thus the components of z are independent normal with mean 0 and variance 1, and

z′z = (x − μ)′ Σ⁻¹ (x − μ)

has a χ² distribution with p degrees of freedom.
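This last fact can be verified by simulation: the quadratic form (x − μ)′ Σ⁻¹ (x − μ) should behave like a χ² variable with p degrees of freedom, whose mean is p and variance is 2p (illustrative parameters, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative parameters (any positive definite Sigma works):
mu = np.array([1.0, 2.0, 3.0])
Sigma = np.array([[2.0, 0.5, 0.3],
                  [0.5, 1.5, 0.2],
                  [0.3, 0.2, 1.0]])
p = len(mu)

x = rng.multivariate_normal(mu, Sigma, size=200_000)
diff = x - mu
# Quadratic form (x - mu)' Sigma^{-1} (x - mu) for each draw:
q = np.einsum('ij,jk,ik->i', diff, np.linalg.inv(Sigma), diff)

# A chi-square with p degrees of freedom has mean p and variance 2p:
print(q.mean(), q.var())   # close to p = 3 and 2p = 6
```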