Download presentation

Presentation is loading. Please wait.

Published byHeriberto Warters Modified about 1 year ago

1
Topic 12: Multiple Linear Regression

2
Outline Multiple Regression –Data and notation –Model –Inference Recall notes from Topic 3 for simple linear regression

3
Data for Multiple Regression Y i is the response variable X i1, X i2, …, X i,p-1 are p-1 explanatory (or predictor) variables Cases denoted by i = 1 to n

4
Multiple Regression Model Y i is the value of the response variable for the i th case β 0 is the intercept β 1, β 2, …, β p-1 are the regression coefficients for the explanatory variables

5
Multiple Regression Model X i,k is the value of the k th explanatory variable for the i th case e i are independent Normally distributed random errors with mean 0 and variance σ 2

6
Multiple Regression Parameters β 0 is the intercept β 1, β 2, …, β p-1 are the regression coefficients for the explanatory variables σ 2 the variance of the error term

7
Interesting special cases Y i = β 0 + β 1 X i + β 2 X i 2 +…+ β p-1 X i p-1 + e i (polynomial of order p-1) X’s can be indicator or dummy variables taking the values 0 and 1 (or any other two distinct numbers) Interactions between explanatory variables (represented as the product of explanatory variables)

8
Interesting special cases Consider the model Y i = β 0 + β 1 X i1 + β 2 X i2 +β 3 X i1 X i2 + e i If X 2 a dummy variable –Y i = β 0 + β 1 X i + e i (when X 2 =0) –Y i = β 0 + β 1 X i1 +β 2 +β 3 X i1 + e i (when X 2 =1) = (β 0 +β 2 ) + (β 1 +β 3 )X i1 + e i –Modeling two different regression lines at same time

9
Model in Matrix Form

10
Least Squares

11
Least Squares Solution Fitted (predicted) values

12
Residuals

13
Covariance Matrix of residuals Cov(e)=σ 2 (I-H)(I-H)΄= σ 2 (I-H) Var(e i )= σ 2 (1-h ii ) h ii = X΄ i (X΄X) -1 X i X΄ i =(1,X i1,…,X i,p-1 ) Residuals are usually correlated Cov(e i,e j )= -σ 2 h ij

14
Estimation of σ

15
Distribution of b b = (X΄X) -1 X΄Y Since Y~N(Xβ, σ 2 I ) E(b)=((X΄X) -1 X΄)Xβ=β Cov(b)=σ 2 ((X΄X) -1 X΄)((X΄X) -1 X΄)΄ =σ 2 (X΄X) -1 σ 2 (X΄X) -1 is estimated by s 2 (X΄X) -1

16
ANOVA Table Sources of variation are –Model (SAS) or Regression (KNNL) –Error (SAS, KNNL) or Residual –Total SS and df add as before –SSM + SSE =SSTO –df M + df E = df Total

17
Sums of Squares

18
Degrees of Freedom

19
Mean Squares

20

21
ANOVA Table Source SS df MS F Model SSM df M MSM MSM/MSE Error SSE df E MSE Total SSTO df Total MST

22
ANOVA F test H 0 : β 1 = β 2 = … = β p-1 = 0 H a : β k ≠ 0, for at least one k=1,., p-1 Under H 0, F ~ F(p-1,n-p) Reject H 0 if F is large, use P-value

23
P-value of F test The P-value for the F significance test tells us one of the following: –there is no evidence to conclude that any of our explanatory variables can help us to model the response variable using this kind of model (P ≥.05) –one or more of the explanatory variables in our model is potentially useful for predicting the response variable in a linear model (P ≤.05)

24
R2R2 The squared multiple regression correlation (R 2 ) gives the proportion of variation in the response variable explained by all the explanatory variables It is usually expressed as a percent It is sometimes called the coefficient of multiple determination (KNNL p 226)

25
R2R2 R 2 = SSM/SST – the proportion of variation explained R 2 = 1 – (SSE/SST) – 1 – the proportion not explained Can express F test is terms of R 2 F = [ (R 2 )/(p-1) ] / [ (1- R 2 )/(n-p) ]

26
Background Reading We went over KNNL

Similar presentations

© 2017 SlidePlayer.com Inc.

All rights reserved.

Ads by Google