Multiple Linear Regression Introduction

1 Multiple Linear Regression Introduction
We now present the principal results for the general linear regression model in matrix terms. The matrix notation appears exactly as it does for the simple linear regression model; only the degrees of freedom and other constants related to the number of X variables, and the dimensions of some matrices, are different. Hence, we are able to present the results very concisely.

2 Overview of Multiple Linear Regression (MLR)

Data for Multiple Regression. Yi is the response variable (as usual). Xi,1, Xi,2, …, Xi,p−1 are the p − 1 explanatory variables for cases i = 1 to n. Example – In Homework #1 you modeled GPA as a function of entrance exam score. We could also consider an aptitude test and high school GPA as potential predictors. With the entrance exam score, this would be 3 explanatory variables, so p = 4. Potential problem to remember!!! In multiple linear regression there are p − 1 explanatory variables (p parameters if you count the intercept), and these predictor variables are probably correlated with each other.

3 The Multiple Regression Model in Scalar Form

MLR has a linear model, shown here in scalar form; it is more compactly represented in matrix form. There is one response Y, one intercept β0, and p − 1 slopes β1 to βp−1, so a total of p unknown parameters to estimate (intercept + slopes).

Yi = β0 + β1Xi,1 + β2Xi,2 + ⋯ + βp−1Xi,p−1 + εi,  for i = 1, 2, …, n

where Yi is the value of the response variable for the ith case; εi ~ iid N(0, σ²) (exactly as before!); β0 is the intercept (think multidimensionally and look at the equation); β1, β2, …, βp−1 are the regression coefficients for the explanatory variables; and Xi,k is the value of the kth explanatory variable for the ith case.

4 Special Cases of the Linear Model

The term linear model refers to the fact that the model is linear in the parameters, i.e., the βs enter linearly. In the polynomial model and the interaction model below, the shape of the response function is therefore not restricted to a flat (planar) response surface. An example of a nonlinear regression model is shown at the end of the slide; it is nonlinear because it cannot be expressed as linear in the parameters.

Polynomial model: Yi = β0 + β1Xi + β2Xi² + ⋯ + βp−1Xi^(p−1) + εi

Interactions between explanatory variables are expressed as a product of the X's: Yi = β0 + β1Xi,1 + β2Xi,2 + β3Xi,1Xi,2 + εi

ANOVA: models with discrete predictors can be encoded by defining the X's as indicator or dummy variables, where Xi,k = 1 if case i belongs to the kth group, and Xi,k = 0 otherwise.

Linear model vs. nonlinear model (not covered in the course): e.g., Yi = β0 exp(β1Xi) + εi

5 Multiple Regression Model in Matrix Form

In matrix terms, the general linear regression model is shown here, where Y is a vector of responses, β is a vector of parameters, X is a matrix of constants, and ε is a vector of independent normal random variables with expectation 0 and variance-covariance matrix σ² times an identity matrix. Consequently, the random vector Y has expectation Xβ, and its variance-covariance matrix is the same as that of ε.

Yn×1 = Xn×p βp×1 + εn×1,  ε ~ N(0, σ²In×n),  Y ~ N(Xβ, σ²I)

6 Design Matrix X

Design Matrix X:

X = [ 1  X1,1  X1,2  ⋯  X1,p−1
      1  X2,1  X2,2  ⋯  X2,p−1
      ⋮   ⋮     ⋮    ⋱    ⋮
      1  Xn,1  Xn,2  ⋯  Xn,p−1 ]

7 Parameter Estimation

Least Squares. Find b to minimize SSE = (Y − Xb)ᵗ(Y − Xb). Obtain the normal equations as before: XᵗXb = XᵗY. Least squares solution: b = (XᵗX)⁻¹XᵗY. Fitted (predicted) values for the mean of Y, given a particular combination of predictor values, are Ŷ = Xb = X(XᵗX)⁻¹XᵗY = HY, where H = X(XᵗX)⁻¹Xᵗ is the hat matrix.
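As a concrete illustration, here is a minimal numpy sketch of the normal-equations solution, using the first five cases of the Dwaine data (introduced later in the deck) purely as toy input; the variable names are our own, and in practice np.linalg.lstsq is preferred for numerical stability:

```python
import numpy as np

# Toy input: first five Dwaine cases; leading column of 1s for the intercept.
X = np.array([[1.0, 68.5, 16.7],
              [1.0, 45.2, 16.8],
              [1.0, 91.3, 18.2],
              [1.0, 47.8, 16.3],
              [1.0, 46.9, 17.3]])          # design matrix, n x p
Y = np.array([174.4, 164.4, 244.2, 154.6, 181.6])

XtX_inv = np.linalg.inv(X.T @ X)           # (X'X)^{-1}
b = XtX_inv @ X.T @ Y                      # b = (X'X)^{-1} X'Y
H = X @ XtX_inv @ X.T                      # hat matrix H = X(X'X)^{-1}X'
Y_hat = H @ Y                              # fitted values Y_hat = HY
```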

8 Residuals

Remember, residuals (ei) and errors (εi) are not identical values! Residuals are estimates of the errors:

e = Y − Ŷ = (I − H)Y

The matrices H and (I − H) have two special properties. They are

Symmetric: Hᵗ = H and (I − H)ᵗ = (I − H).
Idempotent: H² = H and (I − H)(I − H) = (I − H).

Idempotent means that if you take H to any power, it remains the same, which can be proved by induction: HH⋯H = Hᵐ = H and (I − H)(I − H)⋯(I − H) = (I − H)ᵐ = (I − H). Symmetric means the matrix remains the same after transposing, and I − H also remains the same after transposing.
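Continuing the numpy sketch from above, both properties are easy to check numerically:

```python
e = Y - Y_hat                                     # residuals e = (I - H)Y
I_n = np.eye(len(Y))
print(np.allclose(H, H.T))                        # symmetric: H' = H
print(np.allclose(H @ H, H))                      # idempotent: H^2 = H
print(np.allclose((I_n - H) @ (I_n - H), I_n - H))
```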

9 Covariance Matrix of Residuals

Covariance Matrix of Residuals:

Cov(e) = σ²(I − H)(I − H)ᵗ = σ²(I − H),  so  Var(ei) = σ²(1 − hi,i),

where hi,i is the ith diagonal element of H. Note: hi,i = Xiᵗ(XᵗX)⁻¹Xi, where Xiᵗ = [1, Xi,1, ⋯, Xi,p−1]. Residuals ei are usually somewhat correlated: Cov(ei, ej) = −σ²hi,j. Technically residuals are not independent, but with a reasonable n, we can ignore this.

10 Estimation of σ²

Estimation of σ². Since we have estimated p parameters, SSE = eᵗe has dfE = n − p. The estimate for σ² is the usual estimate:

s² = eᵗe/(n − p) = (Y − Xb)ᵗ(Y − Xb)/(n − p) = SSE/DFE = MSE,  and  s = √MSE
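Still in the same sketch, the leverages hi,i, the estimated residual variances, and MSE fall out directly (recall n = 5 and p = 3 in the toy data, so dfE = 2):

```python
n, p = X.shape
SSE = e @ e                        # SSE = e'e
MSE = SSE / (n - p)                # s^2 = SSE / df_E
s = np.sqrt(MSE)                   # s = sqrt(MSE)
h = np.diag(H)                     # leverages h_ii
var_e = MSE * (1 - h)              # estimated Var(e_i) = s^2 (1 - h_ii)
```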

11 Distribution of b

We know that b = (XᵗX)⁻¹XᵗY. The only random quantity is Y, so the distribution of b is based on the distribution of Y. By the theorem covered previously, since Y ~ N(Xβ, σ²I),

E(b) = (XᵗX)⁻¹XᵗXβ = β  and  Σ{b} = Cov(b) = σ²(XᵗX)⁻¹

σ² is estimated by the MSE, s²; therefore Σ{b} is estimated by s²(XᵗX)⁻¹.

12 ANOVA Table

Sources of variation are the Model, Error (Residual), and Total. SS and df sum as before: SSM + SSE = SST and dfM + dfE = dfTotal. BUT the degrees of freedom will differ from SLR because there are more predictors.

13 Sums of Squares in Matrix Form

Sum of Squares:

SSM (or SSR) = Σ(Ŷi − Ȳ)² = bᵗXᵗY − (1/n)YᵗJY = Yᵗ(H − (1/n)J)Y

SSE = Σ(Yi − Ŷi)² = eᵗe = (Y − Xb)ᵗ(Y − Xb) = YᵗY − bᵗXᵗY = Yᵗ(I − H)Y

SST = Σ(Yi − Ȳ)² = YᵗY − (1/n)YᵗJY = Yᵗ(I − (1/n)J)Y

where J is an n × n matrix of 1s and H is the hat matrix X(XᵗX)⁻¹Xᵗ.
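These quadratic forms translate line for line into the running numpy sketch, and the decomposition SST = SSR + SSE can be verified directly:

```python
J = np.ones((n, n))                      # n x n matrix of 1s
SSR = Y @ (H - J / n) @ Y                # Y'(H - J/n)Y
SSE = Y @ (np.eye(n) - H) @ Y            # Y'(I - H)Y, same value as e'e above
SST = Y @ (np.eye(n) - J / n) @ Y        # Y'(I - J/n)Y
print(np.allclose(SST, SSR + SSE))       # decomposition check
```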

14 Mean Squares. The mean squares are calculated exactly as they were before: MS = SS/df for each source.

15 ANOVA Table. So is the ANOVA table: Model (df = p − 1, MS = MSM), Error (df = n − p, MS = MSE), Total (df = n − 1), with F* = MSM/MSE.

16 ANOVA F-test

In MLR, the F-test looks for a significant linear relationship between Y and at least one of the X variables.

H0: β1 = β2 = ⋯ = βp−1 = 0 (β0 is not in this list!)
HA: βk ≠ 0 for at least one k = 1, …, p − 1

F* = MSM/MSE. Under H0, F* ~ F(p−1, n−p). We test it the usual way (reject H0 if p ≤ α).
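In the running sketch, the F statistic and its p-value come from scipy's F distribution (scipy.stats is a standard choice here, not something the slides specify):

```python
from scipy import stats

df_M, df_E = p - 1, n - p
MSM = SSR / df_M
MSE = SSE / df_E
F_star = MSM / MSE
p_value = stats.f.sf(F_star, df_M, df_E)   # upper-tail area of F(p-1, n-p)
```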

17 Interpreting the p-value of the F-test

If the p-value for the F-test is > α, we lack evidence to conclude that any of our explanatory variables can help predict or explain the response variable using a linear regression model. If it is ≤ α, one or more of the explanatory variables in our model is potentially useful for predicting the response in a linear model (but F does not say which ones). REMEMBER THIS: If F has a significant p-value, but none of the t-tests for the individual β parameters are significant, then the model includes redundant predictors.

18 Coefficient of Multiple Determination, R²

The coefficient of multiple determination, denoted R², measures the proportionate reduction of total variation in Y associated with the use of the set of X variables X1, X2, …, Xp−1. R² reduces to the coefficient of simple determination for simple linear regression when only one X variable is in the model. R² takes the value 0 when all bk = 0 (k = 1, …, p − 1) and the value 1 when all Y observations fall directly on the fitted regression surface, i.e., when Yi = Ŷi for all i. Adding more X variables to the regression model can only increase R², never reduce it, because SSE can never become larger with more X variables and SSTO is always the same for a given set of responses. It is therefore sometimes suggested that a modified measure be used that adjusts for the number of X variables in the model; the adjusted coefficient of multiple determination may actually become smaller when another X variable is introduced, because any decrease in SSE may be more than offset by the loss of a degree of freedom in the denominator n − p. Finally, a large value of R² does not necessarily imply that the fitted model is a useful one. For instance, observations may have been taken at only a few levels of the predictor variables; despite a high R², the fitted model may not be useful if most predictions require extrapolation outside the region of the observations. Extrapolation means the model is used to predict Y from X values outside those included in the sample when we built the model; the relationship may differ at X values not represented in the data.

As in SLR, R² is the proportion of variation in the response explained by the model:

R² = SSR/SST = 1 − SSE/SST

R² and F are related:

F = [R²/(p − 1)] / [(1 − R²)/(n − p)]

Note that "explained by the model" means "explained by these predictors using this specific regression equation," NOT just "explained by these predictor variables." Finally, a large value of R² does not necessarily imply that the fitted model is a useful one.

19 Adjusted Coefficient of Determination, R²adj

It is sometimes suggested that a modified measure be used that adjusts for the number of X variables in the model:

R²adj = 1 − MSE/MST = 1 − [(n − 1)/(n − p)]·SSE/SST

1/dfE = 1/(n − p) increases as a hyperbolic function of p. Increasing p by 1 always makes SSE smaller, but not always by the same amount. If the decline in SSE is large enough to cancel the increase in 1/dfE, then MSE will get smaller, so R²adj will get bigger. If the fit does not improve sufficiently to overcome the increase in 1/dfE, then R²adj will remain the same, or might even get smaller!
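Both measures are one-liners in the running sketch:

```python
R2 = SSR / SST                                  # R^2 = SSR/SST = 1 - SSE/SST
R2_adj = 1 - (n - 1) / (n - p) * SSE / SST      # penalizes extra predictors
```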

20 Inference about Regression Coefficients (Parameters)

As usual, the CI for βk is bk ± tc·s{bk}, where tc = t(n−p)(1 − α/2). We know that b ~ N(β, σ²(XᵗX)⁻¹). We estimate the variance-covariance matrix for the parameter vector as

Σ̂{b}p×p = s²(XᵗX)⁻¹ = MSE·(XᵗX)⁻¹ = [1/(n − p)]·Yᵗ(I − H)Y·(XᵗX)⁻¹

For an individual coefficient βk, where k = 0, …, p − 1, s²{bk} is the (k+1)th diagonal element of the variance-covariance matrix Σ̂{b}.
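In the sketch, the standard errors are the square roots of the diagonal of this matrix, and the CIs follow (scipy's t.ppf supplies the critical value):

```python
cov_b = MSE * XtX_inv                    # estimated Cov(b) = s^2 (X'X)^{-1}
se_b = np.sqrt(np.diag(cov_b))           # s{b_k}, k = 0, ..., p-1
t_c = stats.t.ppf(1 - 0.05 / 2, n - p)   # t_{n-p}(1 - alpha/2), alpha = 0.05
ci = np.column_stack([b - t_c * se_b, b + t_c * se_b])  # one CI per row
```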

21 Significance Test for βk

H0: βk = βk* (in software, βk* = 0). Test statistic: t* = (bk − βk*)/s{bk}. Remember that dfE = n − p! The p-value is computed from the t(n−p) distribution.
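The matching lines in the sketch, for the default software test βk* = 0:

```python
t_star = b / se_b                               # (b_k - 0) / s{b_k}
p_vals = 2 * stats.t.sf(np.abs(t_star), n - p)  # two-sided p from t_{n-p}
```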

22 The t-test for an individual βk tests the significance of that variable given that the other
variables are already in the model – that is, its unique contribution after accounting for all other effects in the model. Unlike in SLR, the t-tests for the individual βk are different from the F-test. The t-tests and CIs for individual βs will be misleading if the model includes redundant predictors.

23 The Dwaine Studios example: Dwaine Studios operates studios that specialize in portraits of children. The company is considering whether sales (Y) in a community can be predicted from the number of persons aged 16 or younger in the community (X1) and the per capita disposable personal income in the community (X2). Data is Dwaine.csv. Data on these variables for the most recent year for the 21 cities in which Dwaine Studios is now operating are partially shown here:

X1    X2    Y
68.5  16.7  174.4
45.2  16.8  164.4
91.3  18.2  244.2
47.8  16.3  154.6
46.9  17.3  181.6
66.1  —     207.5
49.5  15.9  152.8
52    17.2  163.2
48.9  16.6  145.4
38.4  16    137.2
87.9  18.3  241.9
72.8  17.1  191.1
88.4  17.4  232
42.9  15.8  145.3
52.5  17.8  161.1
85.7  18.4  209.7
41.3  16.5  146.4
51.7  —     144
89.6  18.1  232.6
82.7  19.1  224.1
52.3  —     166.5

p is 3, counting the intercept.

24 The Dwaine Studios example (continued), n = 21.
1) Predict the mean Y when X1 = 65.4 and X2 = 17.6.
2) Find SSE, dfE, SSR, dfR, MSE, MSR, R², R²adj.
SSE = ____ , dfE = n − p = ____ , MSE = ____
SSR = ____ , dfR = ____ , MSR = ____
SST = SSR + SSE = ____
R² = SSR/SST = ____
R²adj = 1 − [(n − 1)/(n − p)]·SSE/SST = ____

25 The Dwaine Studios example (continued), n = 21.
1) Predict the mean Y when X1 = 65.4 and X2 = 17.6.
2) Find SSE, dfE, SSR, dfR, MSE, MSR, R², R²adj.
Ŷ = −68.9 + 1.45·X1 + 9.37·X2 = −68.9 + 1.45(65.4) + 9.37(17.6) = 190.9
SSE = 2180.9, dfE = n − p = 18, MSE = 121.2
SSR = 24015.3, dfR = 2, MSR = 12007.6
SST = SSR + SSE = 26196.2
R² = SSR/SST = 0.9167
R²adj = 1 − [(n − 1)/(n − p)]·SSE/SST = 1 − (20/18)(2180.9/26196.2) = 0.9075

26 Effect of the order of predictors entering the model.
When we switch the order of the predictors entering the model, we will see that the F test and t test behave differently. Here we compare two models, where X1 enters before X2 (on the left), or X2 enters before X1 (on the right). In the t test, each predictor is always assessed as if it were the last one entered; therefore, the t results are the same on the left and on the right. X1 has a significant t value, meaning that when we introduce X1 into a model that already contains X2, the model explains a significant additional amount of variation in Y. Similarly, X2 also has a significant t value. But the sequential F tests are different in the two models: the F test assesses each predictor's marginal contribution in the exact order it enters the model. For example, in the first model the F value for X1 is significant, so X1 by itself explains a significant amount of variation in Y; then the F value for X2 is 5.3 and also significant, so X2 explains a significant additional amount of variation in Y and is also useful. When we switch the order, first X2 explains a significant amount of variation in Y, then X1 explains significantly more variation in Y. The model containing both X1 and X2 is significant, with an overall F value of 99.1. In this case, both the F tests and the t tests are significant because X1 and X2 each make a unique contribution to the model. Imagine if X1 and X2 were redundant: the model containing both would be about as useful as a model with just one, the marginal contribution of one predictor given the other is in the model would be insignificant, and the t test and sequential F test would no longer be equivalent.
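The slides show no code, but a short sketch of this comparison, assuming the Dwaine.csv layout described earlier (columns X1, X2, Y) and using statsmodels' sequential (Type I) ANOVA, might look like:

```python
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

dwaine = pd.read_csv("Dwaine.csv")                  # columns X1, X2, Y assumed
fit_12 = smf.ols("Y ~ X1 + X2", data=dwaine).fit()  # X1 enters first
fit_21 = smf.ols("Y ~ X2 + X1", data=dwaine).fit()  # X2 enters first

print(anova_lm(fit_12))   # sequential SS: F rows depend on entry order
print(anova_lm(fit_21))   # same fit, different decomposition of SSR
print(fit_12.summary())   # t-tests: identical regardless of order
```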

27 The Dwaine Studios example (continued), n = 21.
3) Find the estimated variance-covariance matrix for the parameters: s²{b} = MSE·(XᵗX)⁻¹.
s²{b1} = ____    s²{b2} = ____
4) Test whether sales are related to the target population and per capita disposable income.
H0: β1 = 0 and β2 = 0;  Ha: not both β1 and β2 equal zero.
F* = MSR/MSE = ____
For α = 0.05, we require F(0.95; 2, 18) = 3.55. The sales ________ (are/aren't) related to target population and per capita disposable income.

28 The Dwaine Studios example (continued), n = 21.
3) Find the estimated variance-covariance matrix for the parameters: s²{b} = MSE·(XᵗX)⁻¹.
s²{b1} = 0.045    Cov(b1, b2) = −0.67    s²{b2} = 16.52
4) Test whether sales are related to the target population and per capita disposable income.
H0: β1 = 0 and β2 = 0;  Ha: not both β1 and β2 equal zero.
F* = MSR/MSE = 12007.6/121.2 = 99.1
For α = 0.05, we require F(0.95; 2, 18) = 3.55. Since 99.1 > 3.55, the sales are related to target population and per capita disposable income.

29 The Dwaine Studios example (continued), n = 21.
5) Find the 95% individual CI for each parameter.
6) Find the 95% simultaneous Bonferroni CIs for the parameters (β1 and β2), g = 2.
The simultaneous Bonferroni CIs take the t value to be t(1 − α/(2g), dfE). For α = 0.05 and g = ____, t(1 − α/(2g), dfE) = ____
95% simultaneous Bonferroni CI for β1: b1 ± t(1 − α/(2g), dfE)·s{b1} = 1.45 ± ____ = ____
95% simultaneous Bonferroni CI for β2: b2 ± t(1 − α/(2g), dfE)·s{b2} = 9.37 ± ____ = ____

30 5) Find the 95% individual CI for each parameter.
The Dwaine Studios example (continued), n = 21.
6) Find the 95% simultaneous Bonferroni CIs for the parameters (β1 and β2), g = 2.
95% CI for β1: b1 ± t(0.975, 18)·s{b1} = 1.45 ± 2.101(0.212) = (1.01, 1.90)
The simultaneous Bonferroni CIs take the t value to be t(1 − α/(2g), dfE). For α = 0.05 and g = ____, t(1 − α/(2g), dfE) = ____
95% simultaneous Bonferroni CI for β1: b1 ± t(1 − α/(2g), dfE)·s{b1} = 1.45 ± ____ = ____
95% simultaneous Bonferroni CI for β2: b2 ± t(1 − α/(2g), dfE)·s{b2} = 9.37 ± ____ = ____

31 5) Find the 95% individual CI for each parameter.
The Dwaine Studios example (continued), n = 21.
6) Find the 95% simultaneous Bonferroni CIs for the parameters (β1 and β2), g = 2.
95% CI for β1: b1 ± t(0.975, 18)·s{b1} = 1.45 ± 2.101(0.212) = (1.01, 1.90)
The simultaneous Bonferroni CIs take the t value to be t(1 − α/(2g), dfE). For α = 0.05 and g = 2, t(1 − α/(2g), dfE) = t(1 − 0.05/4, 18) = t(0.9875, 18) = 2.445
95% simultaneous Bonferroni CI for β1: b1 ± t(0.9875, 18)·s{b1} = 1.45 ± 2.445(0.212) = (0.94, 1.97)
95% simultaneous Bonferroni CI for β2: b2 ± t(0.9875, 18)·s{b2} = 9.37 ± 2.445(4.06) = (−0.57, 19.3)
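A quick scipy check of the Bonferroni critical value and the β1 interval, using the rounded estimates from these slides:

```python
from scipy import stats

g, alpha, df_E = 2, 0.05, 18
t_bonf = stats.t.ppf(1 - alpha / (2 * g), df_E)   # t(0.9875, 18), about 2.445
b1, se_b1 = 1.45, 0.212                           # rounded values from the slides
print(b1 - t_bonf * se_b1, b1 + t_bonf * se_b1)   # roughly (0.93, 1.97)
```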

