Presentation is loading. Please wait.

Presentation is loading. Please wait.

Topic 12: Multiple Linear Regression

Similar presentations

Presentation on theme: "Topic 12: Multiple Linear Regression"— Presentation transcript:

1 Topic 12: Multiple Linear Regression

2 Outline Multiple Regression
Data and notation Model Inference Recall notes from Topic 3 for simple linear regression

3 Data for Multiple Regression
Yi is the response variable Xi1, Xi2, … , Xi,p-1 are p-1 explanatory (or predictor) variables Cases denoted by i = 1 to n

4 Multiple Regression Model
Yi is the value of the response variable for the ith case β0 is the intercept β1, β2, … , βp-1 are the regression coefficients for the explanatory variables

5 Multiple Regression Model
Xi,k is the value of the kth explanatory variable for the ith case ei are independent Normally distributed random errors with mean 0 and variance σ2

6 Multiple Regression Parameters
β0 is the intercept β1, β2, … , βp-1 are the regression coefficients for the explanatory variables σ2 the variance of the error term

7 Interesting special cases
Yi = β0 + β1Xi + β2Xi2 +…+ βp-1Xip-1+ ei (polynomial of order p-1) X’s can be indicator or dummy variables taking the values 0 and 1 (or any other two distinct numbers) Interactions between explanatory variables (represented as the product of explanatory variables)

8 Interesting special cases
Consider the model Yi= β0 + β1Xi1+ β2Xi2+β3X i1Xi2+ ei If X2 a dummy variable Yi = β0 + β1Xi + ei (when X2=0) Yi = β0 + β1Xi1+β2+β3Xi1+ ei (when X2=1) = (β0+β2) + (β1+β3)Xi1+ ei Modeling two different regression lines at same time

9 Model in Matrix Form

10 Least Squares

11 Least Squares Solution
Fitted (predicted) values

12 Residuals

13 Covariance Matrix of residuals
Cov(e)=σ2(I-H)(I-H)΄= σ2(I-H) Var(ei)= σ2(1-hii) hii= X΄i(X΄X)-1Xi X΄i =(1,Xi1,…,Xi,p-1) Residuals are usually correlated Cov(ei,ej)= -σ2hij

14 Estimation of σ

15 Distribution of b b = (X΄X)-1X΄Y Since Y~N(Xβ, σ2I)
E(b)=((X΄X)-1X΄)Xβ=β Cov(b)=σ2 ((X΄X)-1X΄)((X΄X)-1X΄)΄ =σ2(X΄X)-1 σ2 (X΄X)-1 is estimated by s2 (X΄X)-1

16 ANOVA Table Sources of variation are Model (SAS) or Regression (KNNL)
Error (SAS, KNNL) or Residual Total SS and df add as before SSM + SSE =SSTO dfM + dfE = dfTotal

17 Sums of Squares

18 Degrees of Freedom

19 Mean Squares

20 Mean Squares

21 ANOVA Table Source SS df MS F Model SSM dfM MSM MSM/MSE
Error SSE dfE MSE Total SSTO dfTotal MST

22 ANOVA F test H0: β1 = β2 = … = βp-1 = 0
Ha: βk ≠ 0, for at least one k=1,., p-1 Under H0, F ~ F(p-1,n-p) Reject H0 if F is large, use P-value

23 P-value of F test The P-value for the F significance test tells us one of the following: there is no evidence to conclude that any of our explanatory variables can help us to model the response variable using this kind of model (P ≥ .05) one or more of the explanatory variables in our model is potentially useful for predicting the response variable in a linear model (P ≤ .05)

24 R2 The squared multiple regression correlation (R2) gives the proportion of variation in the response variable explained by all the explanatory variables It is usually expressed as a percent It is sometimes called the coefficient of multiple determination (KNNL p 226)

25 R2 R2 = SSM/SST the proportion of variation explained
R2 = 1 – (SSE/SST) 1 – the proportion not explained Can express F test is terms of R2 F = [ (R2)/(p-1) ] / [ (1- R2)/(n-p) ]

26 Background Reading We went over KNNL

Download ppt "Topic 12: Multiple Linear Regression"

Similar presentations

Ads by Google