Published by Leah Petit. Modified about 1 year ago.

1
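V. Multivariate Linear Regression

A. The Basic Principle

We consider the multivariate extension of multiple linear regression: modeling the relationship between $m$ responses $Y_1, \ldots, Y_m$ and a single set of $r$ predictor variables $z_1, \ldots, z_r$. Each of the $m$ responses is assumed to follow its own regression model, i.e.,

$$
\begin{aligned}
Y_1 &= \beta_{01} + \beta_{11} z_1 + \beta_{21} z_2 + \cdots + \beta_{r1} z_r + \varepsilon_1 \\
Y_2 &= \beta_{02} + \beta_{12} z_1 + \beta_{22} z_2 + \cdots + \beta_{r2} z_r + \varepsilon_2 \\
&\;\;\vdots \\
Y_m &= \beta_{0m} + \beta_{1m} z_1 + \beta_{2m} z_2 + \cdots + \beta_{rm} z_r + \varepsilon_m
\end{aligned}
$$

where the error vector $\boldsymbol{\varepsilon} = [\varepsilon_1, \ldots, \varepsilon_m]'$ has $E(\boldsymbol{\varepsilon}) = \mathbf{0}$ and $\mathrm{Var}(\boldsymbol{\varepsilon}) = \boldsymbol{\Sigma}$.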

2
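Conceptually, we can let $[z_{j0}, z_{j1}, \ldots, z_{jr}]$ denote the values of the predictor variables for the $j$th trial (with $z_{j0} = 1$ for the intercept) and let $\mathbf{Y}_j = [Y_{j1}, \ldots, Y_{jm}]'$ and $\boldsymbol{\varepsilon}_j = [\varepsilon_{j1}, \ldots, \varepsilon_{jm}]'$ be the responses and errors for the $j$th trial. Thus we have an $n \times (r + 1)$ design matrix

$$
\mathbf{Z} = \begin{bmatrix}
z_{10} & z_{11} & \cdots & z_{1r} \\
z_{20} & z_{21} & \cdots & z_{2r} \\
\vdots & \vdots & \ddots & \vdots \\
z_{n0} & z_{n1} & \cdots & z_{nr}
\end{bmatrix}
$$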

3
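If we now set

$$
\underset{n \times m}{\mathbf{Y}} = [\mathbf{Y}_{(1)} \mid \mathbf{Y}_{(2)} \mid \cdots \mid \mathbf{Y}_{(m)}], \qquad
\underset{(r+1) \times m}{\boldsymbol{\beta}} = [\boldsymbol{\beta}_{(1)} \mid \boldsymbol{\beta}_{(2)} \mid \cdots \mid \boldsymbol{\beta}_{(m)}], \qquad
\underset{n \times m}{\boldsymbol{\varepsilon}} = [\boldsymbol{\varepsilon}_{(1)} \mid \boldsymbol{\varepsilon}_{(2)} \mid \cdots \mid \boldsymbol{\varepsilon}_{(m)}]
$$

where the $i$th column holds the $n$ observations, coefficients, or errors for the $i$th response,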

4
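and the multivariate linear regression model is

$$
\mathbf{Y} = \mathbf{Z}\boldsymbol{\beta} + \boldsymbol{\varepsilon}
\quad \text{with} \quad
E(\boldsymbol{\varepsilon}_{(i)}) = \mathbf{0}, \qquad
\mathrm{Cov}(\boldsymbol{\varepsilon}_{(i)}, \boldsymbol{\varepsilon}_{(k)}) = \sigma_{ik}\mathbf{I}, \quad i, k = 1, \ldots, m.
$$

Note also that the $m$ observed responses on the $j$th trial have covariance matrix $\boldsymbol{\Sigma} = \{\sigma_{ik}\}$.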

5
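The ordinary least squares estimates are found in a manner analogous to the univariate case. We begin by taking the univariate estimate for each response,

$$
\hat{\boldsymbol{\beta}}_{(i)} = (\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'\mathbf{Y}_{(i)}, \quad i = 1, \ldots, m.
$$

Collecting the univariate least squares estimates yields

$$
\hat{\boldsymbol{\beta}} = [\hat{\boldsymbol{\beta}}_{(1)} \mid \hat{\boldsymbol{\beta}}_{(2)} \mid \cdots \mid \hat{\boldsymbol{\beta}}_{(m)}]
= (\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'[\mathbf{Y}_{(1)} \mid \mathbf{Y}_{(2)} \mid \cdots \mid \mathbf{Y}_{(m)}]
= (\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'\mathbf{Y}.
$$

Now for any choice of parameters $\mathbf{B} = [\mathbf{b}_{(1)} \mid \mathbf{b}_{(2)} \mid \cdots \mid \mathbf{b}_{(m)}]$ the resulting matrix of errors is $\mathbf{Y} - \mathbf{Z}\mathbf{B}$.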
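The estimator above can be sketched in a few lines of plain Python. This is a minimal illustration, not the deck's own computation: the data are a hypothetical toy set (n = 5, one predictor, two responses), not the palatability/texture observations, and the helper names `transpose`, `matmul`, `inverse`, and `least_squares` are made up for this sketch. Exact fractions are used so the result can be compared without rounding.

```python
from fractions import Fraction

def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def inverse(a):
    """Invert a nonsingular square matrix by Gauss-Jordan elimination."""
    n = len(a)
    # Augment with the identity; Fractions keep the arithmetic exact.
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def least_squares(Z, Y):
    """Return (Z'Z)^{-1} Z'Y; each column is one response's coefficients."""
    Zt = transpose(Z)
    return matmul(inverse(matmul(Zt, Z)), matmul(Zt, Y))

# Hypothetical toy data: first column of Z is the intercept column of ones.
Z = [[1, 0], [1, 1], [1, 2], [1, 3], [1, 4]]
Y = [[1, -1], [4, -1], [3, 2], [8, 3], [9, 2]]

beta_hat = least_squares(Z, Y)                        # joint fit, all responses
beta_hat_1 = least_squares(Z, [[row[0]] for row in Y])  # first response alone
```

Fitting the first response alone reproduces the first column of the joint fit, which is the sense in which the multivariate estimate is just the univariate estimates collected side by side.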

6
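The resulting Error Sums of Squares and Crossproducts matrix is

$$
(\mathbf{Y} - \mathbf{Z}\mathbf{B})'(\mathbf{Y} - \mathbf{Z}\mathbf{B})
= \left[ (\mathbf{Y}_{(i)} - \mathbf{Z}\mathbf{b}_{(i)})'(\mathbf{Y}_{(k)} - \mathbf{Z}\mathbf{b}_{(k)}) \right]_{i,k = 1, \ldots, m}.
$$

We can show that the selection $\mathbf{b}_{(i)} = \hat{\boldsymbol{\beta}}_{(i)}$ minimizes the $i$th diagonal sum of squares $(\mathbf{Y}_{(i)} - \mathbf{Z}\mathbf{b}_{(i)})'(\mathbf{Y}_{(i)} - \mathbf{Z}\mathbf{b}_{(i)})$, i.e., the trace $\mathrm{tr}[(\mathbf{Y} - \mathbf{Z}\mathbf{B})'(\mathbf{Y} - \mathbf{Z}\mathbf{B})]$ and the generalized variance $|(\mathbf{Y} - \mathbf{Z}\mathbf{B})'(\mathbf{Y} - \mathbf{Z}\mathbf{B})|$ are both minimized.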

7
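so we have a matrix of predicted values

$$
\hat{\mathbf{Y}} = \mathbf{Z}\hat{\boldsymbol{\beta}} = \mathbf{Z}(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'\mathbf{Y}
$$

and a resulting matrix of residuals

$$
\hat{\boldsymbol{\varepsilon}} = \mathbf{Y} - \hat{\mathbf{Y}} = [\mathbf{I} - \mathbf{Z}(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}']\mathbf{Y}.
$$

Note that the orthogonality conditions among residuals, predicted values, and columns of the design matrix which hold in the univariate case are also true in the multivariate case because

$$
\mathbf{Z}'[\mathbf{I} - \mathbf{Z}(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'] = \mathbf{Z}' - \mathbf{Z}' = \mathbf{0}
$$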

8
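which means the residuals are perpendicular to the columns of the design matrix,

$$
\mathbf{Z}'\hat{\boldsymbol{\varepsilon}} = \mathbf{Z}'[\mathbf{I} - \mathbf{Z}(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}']\mathbf{Y} = \mathbf{0},
$$

and to the predicted values,

$$
\hat{\mathbf{Y}}'\hat{\boldsymbol{\varepsilon}} = \hat{\boldsymbol{\beta}}'\mathbf{Z}'[\mathbf{I} - \mathbf{Z}(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}']\mathbf{Y} = \mathbf{0}.
$$

Furthermore, because $\mathbf{Y} = \hat{\mathbf{Y}} + \hat{\boldsymbol{\varepsilon}}$, we have

$$
\underbrace{\mathbf{Y}'\mathbf{Y}}_{\text{total SSCP}}
= \underbrace{\hat{\mathbf{Y}}'\hat{\mathbf{Y}}}_{\text{predicted SSCP}}
+ \underbrace{\hat{\boldsymbol{\varepsilon}}'\hat{\boldsymbol{\varepsilon}}}_{\text{residual (error) SSCP}}
$$

where SSCP abbreviates sums of squares and crossproducts.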
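The orthogonality conditions and the sums of squares and crossproducts decomposition can be checked numerically. The sketch below uses hypothetical toy data (n = 5, one predictor, two responses), not the deck's example data, and takes the least squares coefficients for those toy data as given; all names are illustrative.

```python
def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

# Hypothetical toy data; the first column of Z is the intercept.
Z = [[1, 0], [1, 1], [1, 2], [1, 3], [1, 4]]
Y = [[1, -1], [4, -1], [3, 2], [8, 3], [9, 2]]
# Least squares coefficients (Z'Z)^{-1} Z'Y for these toy data, taken as given.
beta_hat = [[1, -1], [2, 1]]

Y_hat = matmul(Z, beta_hat)                                  # predicted values
resid = [[y - f for y, f in zip(ry, rf)] for ry, rf in zip(Y, Y_hat)]

Zt_resid = matmul(transpose(Z), resid)         # should be the zero matrix
Yhat_t_resid = matmul(transpose(Y_hat), resid) # should be the zero matrix

total_sscp = matmul(transpose(Y), Y)
predicted_sscp = matmul(transpose(Y_hat), Y_hat)
residual_sscp = matmul(transpose(resid), resid)

# Column sums of the residual matrix (first row of Z'resid, since the
# first column of Z is all ones).
column_sums = [sum(col) for col in transpose(resid)]
```

Because the design matrix contains a column of ones, the first row of Z'ε̂ = 0 is exactly the statement that every column of residuals sums to zero.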

9
Example: suppose we had six sample observations on two independent variables (palatability and texture) and two dependent variables (purchase intent and overall quality) [data table not preserved in this transcript]. Use these data to estimate the multivariate linear regression model for which palatability and texture are independent variables while purchase intent and overall quality are the dependent variables.

10
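We wish to estimate

$$
Y_1 = \beta_{01} + \beta_{11} z_1 + \beta_{21} z_2
\quad \text{and} \quad
Y_2 = \beta_{02} + \beta_{12} z_1 + \beta_{22} z_2
$$

jointly. The design matrix is the $6 \times 3$ matrix $\mathbf{Z} = [\mathbf{1} \mid \mathbf{z}_1 \mid \mathbf{z}_2]$, whose columns are a column of ones, the palatability ratings, and the texture ratings [numerical entries not preserved]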

11
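so we compute $\mathbf{Z}'\mathbf{Z}$ and its inverse $(\mathbf{Z}'\mathbf{Z})^{-1}$ [numerical matrices not preserved]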

12
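and $\mathbf{Z}'\mathbf{y}_{(1)}$, so $\hat{\boldsymbol{\beta}}_{(1)} = (\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'\mathbf{y}_{(1)}$ [numerical values not preserved]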

13
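and $\mathbf{Z}'\mathbf{y}_{(2)}$, so $\hat{\boldsymbol{\beta}}_{(2)} = (\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'\mathbf{y}_{(2)}$ [numerical values not preserved]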

14
This gives us the matrix of estimated values $\hat{\mathbf{Y}} = \mathbf{Z}\hat{\boldsymbol{\beta}}$ [numerical values not preserved]

15
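and the matrix of residuals $\hat{\boldsymbol{\varepsilon}} = \mathbf{Y} - \hat{\mathbf{Y}}$ [numerical values not preserved]. Note that each column sums to zero! This follows from $\mathbf{Z}'\hat{\boldsymbol{\varepsilon}} = \mathbf{0}$, because the first column of $\mathbf{Z}$ is a column of ones.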

16
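B. Inference in Multivariate Regression

The least squares estimators $\hat{\boldsymbol{\beta}} = [\hat{\boldsymbol{\beta}}_{(1)} \mid \hat{\boldsymbol{\beta}}_{(2)} \mid \cdots \mid \hat{\boldsymbol{\beta}}_{(m)}]$ of the multivariate regression model have the following properties:

$$
E(\hat{\boldsymbol{\beta}}_{(i)}) = \boldsymbol{\beta}_{(i)}, \ \text{i.e.,}\ E(\hat{\boldsymbol{\beta}}) = \boldsymbol{\beta}
$$

$$
\mathrm{Cov}(\hat{\boldsymbol{\beta}}_{(i)}, \hat{\boldsymbol{\beta}}_{(k)}) = \sigma_{ik}(\mathbf{Z}'\mathbf{Z})^{-1}, \quad i, k = 1, \ldots, m
$$

$$
E(\hat{\boldsymbol{\varepsilon}}) = \mathbf{0}
\quad \text{and} \quad
E\!\left( \frac{1}{n - r - 1}\, \hat{\boldsymbol{\varepsilon}}'\hat{\boldsymbol{\varepsilon}} \right) = \boldsymbol{\Sigma}
$$

if the model is of full rank, i.e., $\mathrm{rank}(\mathbf{Z}) = r + 1 < n$. Note that $\hat{\boldsymbol{\varepsilon}}$ and $\hat{\boldsymbol{\beta}}$ are also uncorrelated.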

17
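This means that, for any observation $\mathbf{z}_0$, $\hat{\boldsymbol{\beta}}'\mathbf{z}_0$ is an unbiased estimator of $\boldsymbol{\beta}'\mathbf{z}_0$, i.e.,

$$
E(\hat{\boldsymbol{\beta}}'\mathbf{z}_0) = \boldsymbol{\beta}'\mathbf{z}_0.
$$

We can also determine from these properties that the estimation errors $\mathbf{z}_0'\hat{\boldsymbol{\beta}}_{(i)} - \mathbf{z}_0'\boldsymbol{\beta}_{(i)}$ have covariances

$$
E\!\left[ \mathbf{z}_0'(\hat{\boldsymbol{\beta}}_{(i)} - \boldsymbol{\beta}_{(i)})(\hat{\boldsymbol{\beta}}_{(k)} - \boldsymbol{\beta}_{(k)})'\mathbf{z}_0 \right]
= \sigma_{ik}\, \mathbf{z}_0'(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{z}_0.
$$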

18
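Furthermore, we can easily ascertain that

$$
E(\mathbf{Y}_0 - \hat{\mathbf{Y}}_0) = \mathbf{0},
$$

i.e., the forecasted vector $\hat{\mathbf{Y}}_0 = \hat{\boldsymbol{\beta}}'\mathbf{z}_0$ associated with the values of the predictor variables $\mathbf{z}_0$ is an unbiased predictor of $\mathbf{Y}_0$. The forecast errors $Y_{0i} - \mathbf{z}_0'\hat{\boldsymbol{\beta}}_{(i)}$ have covariance

$$
E\!\left[ (Y_{0i} - \mathbf{z}_0'\hat{\boldsymbol{\beta}}_{(i)})(Y_{0k} - \mathbf{z}_0'\hat{\boldsymbol{\beta}}_{(k)}) \right]
= \sigma_{ik}\left( 1 + \mathbf{z}_0'(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{z}_0 \right).
$$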
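The forecast-error covariance can be derived in two steps. This is a sketch under the usual assumption that the new error $\boldsymbol{\varepsilon}_0$ has mean zero and covariance $\boldsymbol{\Sigma}$ and is independent of the errors used to fit $\hat{\boldsymbol{\beta}}$:

```latex
\begin{aligned}
Y_{0i} - \mathbf{z}_0'\hat{\boldsymbol{\beta}}_{(i)}
  &= \varepsilon_{0i} - \mathbf{z}_0'(\hat{\boldsymbol{\beta}}_{(i)} - \boldsymbol{\beta}_{(i)}) \\[4pt]
E\!\left[ (Y_{0i} - \mathbf{z}_0'\hat{\boldsymbol{\beta}}_{(i)})(Y_{0k} - \mathbf{z}_0'\hat{\boldsymbol{\beta}}_{(k)}) \right]
  &= E(\varepsilon_{0i}\varepsilon_{0k})
   + \mathbf{z}_0' E\!\left[ (\hat{\boldsymbol{\beta}}_{(i)} - \boldsymbol{\beta}_{(i)})(\hat{\boldsymbol{\beta}}_{(k)} - \boldsymbol{\beta}_{(k)})' \right] \mathbf{z}_0 \\
  &= \sigma_{ik} + \sigma_{ik}\, \mathbf{z}_0'(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{z}_0
   = \sigma_{ik}\left( 1 + \mathbf{z}_0'(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{z}_0 \right)
\end{aligned}
```

The two cross terms drop out because $\boldsymbol{\varepsilon}_0$ is independent of $\hat{\boldsymbol{\beta}}$ and has mean zero. The extra "1" relative to the estimation-error covariance is the contribution of the new observation's own error.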

19
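Thus, for the multivariate regression model with full $\mathrm{rank}(\mathbf{Z}) = r + 1$, $n \ge (r + 1) + m$, and normally distributed errors $\boldsymbol{\varepsilon}$,

$$
\hat{\boldsymbol{\beta}} = (\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'\mathbf{Y}
$$

is the maximum likelihood estimator of $\boldsymbol{\beta}$, and $\hat{\boldsymbol{\beta}}$ is normally distributed with $E(\hat{\boldsymbol{\beta}}) = \boldsymbol{\beta}$, where the elements of its covariance are

$$
\mathrm{Cov}(\hat{\boldsymbol{\beta}}_{(i)}, \hat{\boldsymbol{\beta}}_{(k)}) = \sigma_{ik}(\mathbf{Z}'\mathbf{Z})^{-1}, \quad i, k = 1, \ldots, m.
$$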

20
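Also, the maximum likelihood estimator of $\boldsymbol{\beta}$ is independent of the maximum likelihood estimator of the positive definite matrix $\boldsymbol{\Sigma}$ given by

$$
\hat{\boldsymbol{\Sigma}} = \frac{1}{n}\hat{\boldsymbol{\varepsilon}}'\hat{\boldsymbol{\varepsilon}}
= \frac{1}{n}(\mathbf{Y} - \mathbf{Z}\hat{\boldsymbol{\beta}})'(\mathbf{Y} - \mathbf{Z}\hat{\boldsymbol{\beta}})
$$

and

$$
n\hat{\boldsymbol{\Sigma}} \sim W_{m,\, n - r - 1}(\boldsymbol{\Sigma}),
$$

a Wishart distribution with $n - r - 1$ degrees of freedom. All of this provides additional support for using the least squares estimate: when the errors are normally distributed, $\hat{\boldsymbol{\beta}}$ and $n^{-1}\hat{\boldsymbol{\varepsilon}}'\hat{\boldsymbol{\varepsilon}}$ are the maximum likelihood estimators of $\boldsymbol{\beta}$ and $\boldsymbol{\Sigma}$.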

21
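These results can be used to develop likelihood ratio tests for the multivariate regression parameters. The hypothesis that the responses do not depend on predictor variables $z_{q+1}, z_{q+2}, \ldots, z_r$ is

$$
H_0: \boldsymbol{\beta}_{(2)} = \mathbf{0}
\quad \text{where} \quad
\boldsymbol{\beta} = \begin{bmatrix} \underset{(q+1) \times m}{\boldsymbol{\beta}_{(1)}} \\[6pt] \underset{(r-q) \times m}{\boldsymbol{\beta}_{(2)}} \end{bmatrix}.
$$

Here $\boldsymbol{\beta}_{(1)}$ and $\boldsymbol{\beta}_{(2)}$ denote blocks of rows of $\boldsymbol{\beta}$ rather than its columns. If we partition $\mathbf{Z}$ in a similar manner,

$$
\mathbf{Z} = \left[ \underset{n \times (q+1)}{\mathbf{Z}_1} \;\middle|\; \underset{n \times (r-q)}{\mathbf{Z}_2} \right]
$$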

22
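we can write the general model as

$$
E(\mathbf{Y}) = \mathbf{Z}\boldsymbol{\beta}
= [\mathbf{Z}_1 \mid \mathbf{Z}_2] \begin{bmatrix} \boldsymbol{\beta}_{(1)} \\ \boldsymbol{\beta}_{(2)} \end{bmatrix}
= \mathbf{Z}_1\boldsymbol{\beta}_{(1)} + \mathbf{Z}_2\boldsymbol{\beta}_{(2)}.
$$

The extra sums of squares and crossproducts associated with $\boldsymbol{\beta}_{(2)}$ are

$$
(\mathbf{Y} - \mathbf{Z}_1\hat{\boldsymbol{\beta}}_{(1)})'(\mathbf{Y} - \mathbf{Z}_1\hat{\boldsymbol{\beta}}_{(1)})
- (\mathbf{Y} - \mathbf{Z}\hat{\boldsymbol{\beta}})'(\mathbf{Y} - \mathbf{Z}\hat{\boldsymbol{\beta}})
= n(\hat{\boldsymbol{\Sigma}}_1 - \hat{\boldsymbol{\Sigma}})
$$

where

$$
\hat{\boldsymbol{\beta}}_{(1)} = (\mathbf{Z}_1'\mathbf{Z}_1)^{-1}\mathbf{Z}_1'\mathbf{Y}
\quad \text{and} \quad
\hat{\boldsymbol{\Sigma}}_1 = \frac{1}{n}(\mathbf{Y} - \mathbf{Z}_1\hat{\boldsymbol{\beta}}_{(1)})'(\mathbf{Y} - \mathbf{Z}_1\hat{\boldsymbol{\beta}}_{(1)}).
$$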

23
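The likelihood ratio for the test of the hypothesis $H_0: \boldsymbol{\beta}_{(2)} = \mathbf{0}$ is given by the ratio of generalized variances

$$
\Lambda = \frac{\max_{\boldsymbol{\beta}_{(1)}, \boldsymbol{\Sigma}} L(\boldsymbol{\beta}_{(1)}, \boldsymbol{\Sigma})}{\max_{\boldsymbol{\beta}, \boldsymbol{\Sigma}} L(\boldsymbol{\beta}, \boldsymbol{\Sigma})}
= \left( \frac{|\hat{\boldsymbol{\Sigma}}|}{|\hat{\boldsymbol{\Sigma}}_1|} \right)^{n/2}
$$

which is often converted to Wilks' Lambda statistic

$$
\Lambda^{2/n} = \frac{|\hat{\boldsymbol{\Sigma}}|}{|\hat{\boldsymbol{\Sigma}}_1|}.
$$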

24
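Finally, for the multivariate regression model with full $\mathrm{rank}(\mathbf{Z}) = r + 1$, $n \ge (r + 1) + m$, and normally distributed errors $\boldsymbol{\varepsilon}$, if the null hypothesis is true (so that $n(\hat{\boldsymbol{\Sigma}}_1 - \hat{\boldsymbol{\Sigma}}) \sim W_{m,\, r - q}(\boldsymbol{\Sigma})$), then

$$
-\left[ n - r - 1 - \tfrac{1}{2}(m - r + q + 1) \right] \ln\!\left( \frac{|\hat{\boldsymbol{\Sigma}}|}{|\hat{\boldsymbol{\Sigma}}_1|} \right) \;\dot\sim\; \chi^2_{m(r - q)}
$$

when $n - r$ and $n - m$ are both large.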
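The large-sample statistic is simple to evaluate once Wilks' lambda is known. The sketch below is illustrative only: the function names are made up, the value of Wilks' lambda is hypothetical, and the chi-square tail probability uses a closed form that holds only for even degrees of freedom (which covers the case m = 2, r - q = 1).

```python
import math

def bartlett_statistic(n, r, q, m, wilks):
    """-[n - r - 1 - (m - r + q + 1)/2] * ln(Wilks' lambda)."""
    factor = n - r - 1 - 0.5 * (m - r + q + 1)
    return -factor * math.log(wilks)

def chi2_upper_tail_even_df(x, df):
    """P(chi-square_df > x) for even df, via the finite Poisson-type sum
    exp(-x/2) * sum_{k < df/2} (x/2)^k / k!."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

# Dimensions as in a six-observation, two-predictor, two-response problem
# with one predictor under test; the Wilks value itself is hypothetical.
n, r, q, m = 6, 2, 1, 2
wilks = 0.5
statistic = bartlett_statistic(n, r, q, m, wilks)
df = m * (r - q)                      # degrees of freedom of the chi-square
p_value = chi2_upper_tail_even_df(statistic, df)
```

With these dimensions the multiplier is 2 and the degrees of freedom are 2, so the approximate p-value happens to equal Wilks' lambda itself; with n this small the approximation should not be trusted.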

25
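If we again refer to the Error Sum of Squares and Crossproducts as $\mathbf{E} = n\hat{\boldsymbol{\Sigma}}$ and the Hypothesis Sum of Squares and Crossproducts as $\mathbf{H} = n(\hat{\boldsymbol{\Sigma}}_1 - \hat{\boldsymbol{\Sigma}})$, then we can define Wilks' lambda as

$$
\Lambda^{2/n} = \frac{|\mathbf{E}|}{|\mathbf{E} + \mathbf{H}|} = \prod_{i=1}^{s} \frac{1}{1 + \eta_i}
$$

where $\eta_1 \ge \eta_2 \ge \cdots \ge \eta_s$ are the ordered eigenvalues of $\mathbf{H}\mathbf{E}^{-1}$ and $s = \min(m, r - q)$.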
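The two forms of Wilks' lambda (determinant ratio and eigenvalue product) can be compared numerically for two responses. This is a minimal sketch with hypothetical 2 x 2 matrices E and H (not the example's matrices); the helper names are illustrative, and the eigenvalue routine handles only the 2 x 2 case with real roots.

```python
import math

def det2(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

def inv2(a):
    d = det2(a)
    return [[a[1][1] / d, -a[0][1] / d], [-a[1][0] / d, a[0][0] / d]]

def matmul2(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def eigenvalues2(a):
    """Real eigenvalues of a 2 x 2 matrix, largest first."""
    trace = a[0][0] + a[1][1]
    disc = max(trace * trace - 4.0 * det2(a), 0.0)  # guard against round-off
    root = math.sqrt(disc)
    return [(trace + root) / 2.0, (trace - root) / 2.0]

# Hypothetical error and hypothesis SSCP matrices for m = 2 responses.
E = [[6.0, -2.0], [-2.0, 4.0]]
H = [[40.0, 20.0], [20.0, 10.0]]   # rank 1, as when r - q = 1

E_plus_H = [[E[i][j] + H[i][j] for j in range(2)] for i in range(2)]
wilks_from_dets = det2(E) / det2(E_plus_H)

# Eigenvalues of H E^{-1}; with s = min(m, r - q) = 1 only one is nonzero,
# and the zero eigenvalue contributes a factor of 1 to the product.
etas = eigenvalues2(matmul2(H, inv2(E)))
wilks_from_eigs = 1.0
for eta in etas:
    wilks_from_eigs /= (1.0 + eta)
```

Both routes give 1/16 for these matrices; in practice the determinant form is the cheaper one, while the eigenvalues are what the alternative test criteria are built from.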

26
There are other similar tests (as we have seen in our discussion of MANOVA): Pillai's Trace, the Hotelling-Lawley Trace, and Roy's Greatest Root. Each of these statistics is an alternative to Wilks' lambda and performs in a very similar manner (particularly for large sample sizes).
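For reference, the usual definitions of these criteria in terms of the eigenvalues $\eta_1 \ge \cdots \ge \eta_s$ of $\mathbf{H}\mathbf{E}^{-1}$ are sketched below. The formulas were not preserved in this transcript, so treat them as the standard textbook forms rather than the deck's own; in particular, conventions for Roy's criterion differ, and some software reports $\eta_1$ itself rather than $\eta_1 / (1 + \eta_1)$.

```latex
\begin{aligned}
\text{Wilks' lambda} &= \prod_{i=1}^{s} \frac{1}{1 + \eta_i} = \frac{|\mathbf{E}|}{|\mathbf{E} + \mathbf{H}|} \\[4pt]
\text{Pillai's trace} &= \sum_{i=1}^{s} \frac{\eta_i}{1 + \eta_i} = \mathrm{tr}\!\left[ \mathbf{H}(\mathbf{H} + \mathbf{E})^{-1} \right] \\[4pt]
\text{Hotelling-Lawley trace} &= \sum_{i=1}^{s} \eta_i = \mathrm{tr}\!\left[ \mathbf{H}\mathbf{E}^{-1} \right] \\[4pt]
\text{Roy's greatest root} &= \frac{\eta_1}{1 + \eta_1} \quad \text{(or } \eta_1 \text{, depending on convention)}
\end{aligned}
```

When $s = 1$ there is a single nonzero eigenvalue, so all four criteria are functions of the same number and lead to the same test.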

27
Example: for our previous data (six sample observations on two independent variables, palatability and texture, and two dependent variables, purchase intent and overall quality), test the hypotheses that i) palatability has no joint relationship with purchase intent and overall quality and ii) texture has no joint relationship with purchase intent and overall quality.

28
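We first test the hypothesis that palatability has no joint relationship with purchase intent and overall quality, i.e.,

$$
H_0: \boldsymbol{\beta}_{(1)} = \mathbf{0}
$$

where $\boldsymbol{\beta}_{(1)}$ here denotes the row of coefficients on $z_1$ (palatability) across both responses. The likelihood ratio for the test of this hypothesis is given by the ratio of generalized variances

$$
\Lambda = \left( \frac{|\hat{\boldsymbol{\Sigma}}|}{|\hat{\boldsymbol{\Sigma}}_1|} \right)^{n/2}.
$$

For ease of computation, we'll use the Wilks' lambda statistic

$$
\Lambda^{2/n} = \frac{|\mathbf{E}|}{|\mathbf{E} + \mathbf{H}|}.
$$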

29
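The error sum of squares and crossproducts matrix is $\mathbf{E} = n\hat{\boldsymbol{\Sigma}}$ and the hypothesis sum of squares and crossproducts matrix for this null hypothesis is $\mathbf{H} = n(\hat{\boldsymbol{\Sigma}}_1 - \hat{\boldsymbol{\Sigma}})$ [numerical matrices not preserved]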

30
so the calculated value of the Wilks' lambda statistic is $\Lambda^{2/n} = |\mathbf{E}| / |\mathbf{E} + \mathbf{H}|$ [numerical value not preserved]

31
The transformation to a chi-square distributed statistic (which is actually valid only when $n - r$ and $n - m$ are both large) is

$$
-\left[ n - r - 1 - \tfrac{1}{2}(m - r + q + 1) \right] \ln \Lambda^{2/n}.
$$

At $\alpha = 0.01$ and $m(r - q) = 2(2 - 1) = 2$ degrees of freedom, the critical value is $\chi^2_2(0.01) = 9.21$, and we have a strong non-rejection. Note that the approximate p-value of this chi-square test is an extremely gross approximation (since $n - r = 4$ and $n - m = 4$).

32
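We next test the hypothesis that texture has no joint relationship with purchase intent and overall quality, i.e.,

$$
H_0: \boldsymbol{\beta}_{(2)} = \mathbf{0}
$$

where $\boldsymbol{\beta}_{(2)}$ here denotes the row of coefficients on $z_2$ (texture) across both responses. The likelihood ratio for the test of this hypothesis is given by the ratio of generalized variances

$$
\Lambda = \left( \frac{|\hat{\boldsymbol{\Sigma}}|}{|\hat{\boldsymbol{\Sigma}}_1|} \right)^{n/2}.
$$

For ease of computation, we'll use the Wilks' lambda statistic

$$
\Lambda^{2/n} = \frac{|\mathbf{E}|}{|\mathbf{E} + \mathbf{H}|}.
$$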

33
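The error sum of squares and crossproducts matrix is $\mathbf{E} = n\hat{\boldsymbol{\Sigma}}$ and the hypothesis sum of squares and crossproducts matrix for this null hypothesis is $\mathbf{H} = n(\hat{\boldsymbol{\Sigma}}_1 - \hat{\boldsymbol{\Sigma}})$ [numerical matrices not preserved]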

34
so the calculated value of the Wilks' lambda statistic is $\Lambda^{2/n} = |\mathbf{E}| / |\mathbf{E} + \mathbf{H}|$ [numerical value not preserved]

35
The transformation to a chi-square distributed statistic (which is actually valid only when $n - r$ and $n - m$ are both large) is

$$
-\left[ n - r - 1 - \tfrac{1}{2}(m - r + q + 1) \right] \ln \Lambda^{2/n}.
$$

At $\alpha = 0.01$ and $m(r - q) = 2(2 - 1) = 2$ degrees of freedom, the critical value is $\chi^2_2(0.01) = 9.21$, and we have a strong non-rejection. Note that the approximate p-value of this chi-square test is an extremely gross approximation (since $n - r = 4$ and $n - m = 4$).

36
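SAS code for a Multivariate Linear Regression Analysis:

OPTIONS LINESIZE = 72 NODATE PAGENO = 1;

/* Read the predictors (z1, z2) and the responses (y1, y2). */
DATA stuff;
  INPUT z1 z2 y1 y2;
  LABEL z1='Palatability Rating'
        z2='Texture Rating'
        y1='Overall Quality Rating'
        y2='Purchase Intent';
  /* The six data rows (z1 z2 y1 y2) belong after CARDS; */
  /* they are not preserved in this transcript.          */
  CARDS;
;

/* Fit both responses on the same predictors; MANOVA requests the */
/* multivariate tests for z1 and z2, printing the E and H matrices. */
PROC GLM DATA=stuff;
  MODEL y1 y2 = z1 z2;
  MANOVA H=z1 z2 / PRINTE PRINTH;
  TITLE4 'Using PROC GLM for Multivariate Linear Regression';
RUN;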

37
SAS output for a Multivariate Linear Regression Analysis [numeric values not preserved]:

Dependent Variable: y1 Overall Quality Rating

Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
Model
Error
Corrected Total

R-Square   Coeff Var   Root MSE   y1 Mean

Source   DF   Type I SS     Mean Square   F Value   Pr > F
z1
z2

Source   DF   Type III SS   Mean Square   F Value   Pr > F
z1
z2

Parameter   Estimate   Standard Error   t Value   Pr > |t|
Intercept
z1
z2

38
SAS output for a Multivariate Linear Regression Analysis [numeric values not preserved]:

Dependent Variable: y2 Purchase Intent

Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
Model
Error
Corrected Total

R-Square   Coeff Var   Root MSE   y2 Mean

Source   DF   Type I SS     Mean Square   F Value   Pr > F
z1
z2

Source   DF   Type III SS   Mean Square   F Value   Pr > F
z1
z2

Parameter   Estimate   Standard Error   t Value   Pr > |t|
Intercept
z1
z2

39
SAS output for a Multivariate Linear Regression Analysis [numeric values not preserved]:

The GLM Procedure
Multivariate Analysis of Variance

E = Error SSCP Matrix
       y1   y2
y1
y2

Partial Correlation Coefficients from the Error SSCP Matrix / Prob > |r|
DF = 3
       y1   y2
y1
y2

40
SAS output for a Multivariate Linear Regression Analysis [numeric values not preserved]:

The GLM Procedure
Multivariate Analysis of Variance

H = Type III SSCP Matrix for z1
       y1   y2
y1
y2

Characteristic Roots and Vectors of: E Inverse * H, where
H = Type III SSCP Matrix for z1
E = Error SSCP Matrix

Characteristic Root   Percent   Characteristic Vector V'EV=1
                                y1   y2

MANOVA Test Criteria and Exact F Statistics for the Hypothesis of No Overall z1 Effect
H = Type III SSCP Matrix for z1
E = Error SSCP Matrix
S=1 M=0 N=0

Statistic                Value   F Value   Num DF   Den DF   Pr > F
Wilks' Lambda
Pillai's Trace
Hotelling-Lawley Trace
Roy's Greatest Root

41
SAS output for a Multivariate Linear Regression Analysis [numeric values not preserved]:

The GLM Procedure
Multivariate Analysis of Variance

H = Type III SSCP Matrix for z2
       y1   y2
y1
y2

Characteristic Roots and Vectors of: E Inverse * H, where
H = Type III SSCP Matrix for z2
E = Error SSCP Matrix

Characteristic Root   Percent   Characteristic Vector V'EV=1
                                y1   y2

MANOVA Test Criteria and Exact F Statistics for the Hypothesis of No Overall z2 Effect
H = Type III SSCP Matrix for z2
E = Error SSCP Matrix
S=1 M=0 N=0

Statistic                Value   F Value   Num DF   Den DF   Pr > F
Wilks' Lambda
Pillai's Trace
Hotelling-Lawley Trace
Roy's Greatest Root

42
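We can also build confidence intervals for the predicted mean value of $\mathbf{Y}_0$ associated with $\mathbf{z}_0$. If the model $\mathbf{Y} = \mathbf{Z}\boldsymbol{\beta} + \boldsymbol{\varepsilon}$ has normal errors, then

$$
\hat{\boldsymbol{\beta}}'\mathbf{z}_0 \sim N_m\!\left( \boldsymbol{\beta}'\mathbf{z}_0,\; \mathbf{z}_0'(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{z}_0\, \boldsymbol{\Sigma} \right)
\quad \text{and} \quad
n\hat{\boldsymbol{\Sigma}} \sim W_{m,\, n - r - 1}(\boldsymbol{\Sigma})
$$

are independent, so

$$
T^2 = \left( \frac{\hat{\boldsymbol{\beta}}'\mathbf{z}_0 - \boldsymbol{\beta}'\mathbf{z}_0}{\sqrt{\mathbf{z}_0'(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{z}_0}} \right)'
\left( \frac{n}{n - r - 1}\hat{\boldsymbol{\Sigma}} \right)^{-1}
\left( \frac{\hat{\boldsymbol{\beta}}'\mathbf{z}_0 - \boldsymbol{\beta}'\mathbf{z}_0}{\sqrt{\mathbf{z}_0'(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{z}_0}} \right)
\sim \frac{m(n - r - 1)}{n - r - m} F_{m,\, n - r - m}.
$$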

43
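Thus the $100(1 - \alpha)\%$ confidence region (an ellipsoid) for the predicted mean value of $\mathbf{Y}_0$ associated with $\mathbf{z}_0$, namely $\boldsymbol{\beta}'\mathbf{z}_0$, is given by

$$
(\boldsymbol{\beta}'\mathbf{z}_0 - \hat{\boldsymbol{\beta}}'\mathbf{z}_0)'
\left( \frac{n}{n - r - 1}\hat{\boldsymbol{\Sigma}} \right)^{-1}
(\boldsymbol{\beta}'\mathbf{z}_0 - \hat{\boldsymbol{\beta}}'\mathbf{z}_0)
\le \mathbf{z}_0'(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{z}_0 \left[ \frac{m(n - r - 1)}{n - r - m} F_{m,\, n - r - m}(\alpha) \right]
$$

and the $100(1 - \alpha)\%$ simultaneous confidence intervals for the mean value of $Y_i$ associated with $\mathbf{z}_0$, namely $\mathbf{z}_0'\boldsymbol{\beta}_{(i)}$, are

$$
\mathbf{z}_0'\hat{\boldsymbol{\beta}}_{(i)} \pm
\sqrt{ \frac{m(n - r - 1)}{n - r - m} F_{m,\, n - r - m}(\alpha) }\;
\sqrt{ \mathbf{z}_0'(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{z}_0 \left( \frac{n}{n - r - 1}\hat{\sigma}_{ii} \right) },
\quad i = 1, \ldots, m.
$$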

44
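Finally, we can build prediction intervals for the predicted value of $\mathbf{Y}_0$ associated with $\mathbf{z}_0$. Here the prediction error is

$$
\mathbf{Y}_0 - \hat{\boldsymbol{\beta}}'\mathbf{z}_0 = (\boldsymbol{\beta} - \hat{\boldsymbol{\beta}})'\mathbf{z}_0 + \boldsymbol{\varepsilon}_0.
$$

If the model $\mathbf{Y} = \mathbf{Z}\boldsymbol{\beta} + \boldsymbol{\varepsilon}$ has normal errors, then

$$
\mathbf{Y}_0 - \hat{\boldsymbol{\beta}}'\mathbf{z}_0 \sim N_m\!\left( \mathbf{0},\; \left( 1 + \mathbf{z}_0'(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{z}_0 \right)\boldsymbol{\Sigma} \right)
\quad \text{and} \quad
n\hat{\boldsymbol{\Sigma}} \sim W_{m,\, n - r - 1}(\boldsymbol{\Sigma})
$$

are independent, so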

45
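the $100(1 - \alpha)\%$ prediction region (an ellipsoid) for $\mathbf{Y}_0$ associated with $\mathbf{z}_0$ is given by

$$
(\mathbf{Y}_0 - \hat{\boldsymbol{\beta}}'\mathbf{z}_0)'
\left( \frac{n}{n - r - 1}\hat{\boldsymbol{\Sigma}} \right)^{-1}
(\mathbf{Y}_0 - \hat{\boldsymbol{\beta}}'\mathbf{z}_0)
\le \left( 1 + \mathbf{z}_0'(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{z}_0 \right) \left[ \frac{m(n - r - 1)}{n - r - m} F_{m,\, n - r - m}(\alpha) \right]
$$

and the $100(1 - \alpha)\%$ simultaneous prediction intervals for the individual responses $Y_{0i}$ associated with $\mathbf{z}_0$ are

$$
\mathbf{z}_0'\hat{\boldsymbol{\beta}}_{(i)} \pm
\sqrt{ \frac{m(n - r - 1)}{n - r - m} F_{m,\, n - r - m}(\alpha) }\;
\sqrt{ \left( 1 + \mathbf{z}_0'(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{z}_0 \right) \left( \frac{n}{n - r - 1}\hat{\sigma}_{ii} \right) },
\quad i = 1, \ldots, m.
$$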
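The half-widths of the simultaneous confidence and prediction intervals differ only in whether 1 is added to the quadratic form in z_0. The sketch below assumes the standard forms of those intervals; the function name, its arguments, and the toy inputs (n = 5, r = 1, m = 2) are all hypothetical, and the F critical value must be supplied by the caller (for example from a table).

```python
import math

def quad_form(v, a):
    """Compute v' A v for a vector v and a square matrix A."""
    size = len(v)
    return sum(v[i] * a[i][j] * v[j] for i in range(size) for j in range(size))

def simultaneous_half_width(z0, ztz_inv, sigma_hat_ii, n, r, m, f_crit,
                            prediction=False):
    """Half-width of the i-th simultaneous interval at z0.

    sigma_hat_ii is the i-th diagonal element of the ML estimate of Sigma
    (residual sum of squares for response i divided by n); f_crit is the
    upper-alpha critical value of F with (m, n - r - m) degrees of freedom.
    """
    leverage = quad_form(z0, ztz_inv)
    if prediction:
        leverage += 1.0          # extra variance of the new observation
    scale = m * (n - r - 1) / (n - r - m) * f_crit
    variance = leverage * n / (n - r - 1) * sigma_hat_ii
    return math.sqrt(scale) * math.sqrt(variance)

# Hypothetical inputs: (Z'Z)^{-1} for z = 0, 1, 2, 3, 4 with an intercept.
ztz_inv = [[0.6, -0.2], [-0.2, 0.1]]
z0 = [1.0, 2.0]                  # intercept term and predictor value 2
sigma_hat_11 = 6.0 / 5.0         # hypothetical residual SS 6 over n = 5
f_crit = 19.0                    # upper 5% point of F with (2, 2) df

ci_half = simultaneous_half_width(z0, ztz_inv, sigma_hat_11, 5, 1, 2, f_crit)
pi_half = simultaneous_half_width(z0, ztz_inv, sigma_hat_11, 5, 1, 2, f_crit,
                                  prediction=True)
```

The prediction half-width is always the larger of the two, since it accounts for the new observation's error in addition to the uncertainty in the estimated coefficients.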
