QMS 6351 Statistics and Research Methods Regression Analysis: Testing for Significance Chapter 14 (14.5-14.6) Chapter 15 (15.5) Prof. Vera Adamchik.

Multiple Regression Model: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p + \varepsilon$
Multiple Regression Equation: $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p$
Unknown parameters are $\beta_0, \beta_1, \beta_2, \dots, \beta_p$.
Sample Data: $x_1, x_2, \dots, x_p, y$
Estimated Multiple Regression Equation: $\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_p x_p$
Sample statistics are $b_0, b_1, b_2, \dots, b_p$.
$b_0, b_1, b_2, \dots, b_p$ provide estimates of $\beta_0, \beta_1, \beta_2, \dots, \beta_p$.

Hypotheses about $\beta_i$
$H_0: \beta_i =$ specific value; $H_a: \beta_i \ne$ specific value
$H_0: \beta_i \ge$ specific value; $H_a: \beta_i <$ specific value
$H_0: \beta_i \le$ specific value; $H_a: \beta_i >$ specific value
The most common hypothesis is whether $\beta_i$ equals zero (that is, no relationship between $y$ and $x_i$).

To learn how to test for a significant regression relationship, we will use the “Programmer Salary Survey” example from the “Ch. 14-15 Part 1” PowerPoint file.

Testing for significance
Two tests are commonly used: the t test and the F test. In simple linear regression, the F and t tests provide the same conclusion. In multiple regression, the F and t tests have different purposes.

The F test
The F test is used to determine whether a significant relationship exists between the dependent variable and the set of all the independent variables. The F test is referred to as the test for overall significance.

The t test
If the F test shows overall significance, the t test is used to determine whether each of the individual independent variables is significant. A separate t test is conducted for each of the independent variables in the model. We refer to each of these t tests as a test for individual significance.

Different samples from the same population will produce different values for $b_i$ (that is, $b_0$, $b_1$, $b_2$, $b_3$, etc.). Hence, the estimated regression coefficients are random variables. To test the hypotheses, we need to know the sampling distribution of $b_i$, that is, the sampling distribution of $b_1$, the sampling distribution of $b_2$, etc.

Sampling distribution of $b_i$
Because of the assumption of normally distributed random errors, the sampling distribution of $b_i$ is normal. Its mean is $E(b_i) = \beta_i$, and its standard deviation (a.k.a. standard error), $\sigma_{b_i}$, is proportional to $\sigma$, where $\sigma$ is the standard deviation of the error term $\varepsilon$ in the regression model.
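The idea that a coefficient estimate has a sampling distribution can be illustrated with a small simulation. The sketch below is not part of the course example: it assumes a simple regression with a single predictor, and the true parameters (`beta0=2.0`, `beta1=1.5`, `sigma=3.0`) and the x values are hypothetical. In this one-predictor case the standard deviation of the slope estimate is known to be $\sigma / \sqrt{\sum (x - \bar{x})^2}$, which the simulated spread should roughly match.

```python
import math
import random
import statistics

def fit_slope(xs, ys):
    """Least-squares slope b1 for a one-predictor regression."""
    x_bar = statistics.fmean(xs)
    y_bar = statistics.fmean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

def simulate_slopes(beta0, beta1, sigma, xs, n_samples, seed=0):
    """Draw repeated samples from the same population and refit each time."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(n_samples):
        # y = beta0 + beta1 * x + normally distributed error
        ys = [beta0 + beta1 * x + rng.gauss(0, sigma) for x in xs]
        slopes.append(fit_slope(xs, ys))
    return slopes

# Hypothetical population parameters and design, chosen only for illustration.
xs = list(range(1, 21))
slopes = simulate_slopes(beta0=2.0, beta1=1.5, sigma=3.0, xs=xs, n_samples=2000)

x_bar = statistics.fmean(xs)
theoretical_sd = 3.0 / math.sqrt(sum((x - x_bar) ** 2 for x in xs))

print("mean of simulated slopes:", statistics.fmean(slopes))
print("std. dev. of simulated slopes:", statistics.stdev(slopes))
print("theoretical std. dev.:", theoretical_sd)
```

Each simulated sample gives a different slope, the slopes center near the true value, and their spread is close to the theoretical standard error.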

Sampling distributions of $b_i$
[Figure: sampling distributions of $b_1$, $b_2$, etc.]

Because we do not know the value of $\sigma$, we use an estimate of $\sigma$ (see the next slide).

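An estimate of $\sigma$, denoted $s$, is referred to as the standard error of the estimate:
$s = \sqrt{\text{MSE}} = \sqrt{\text{SSE}/(n - p - 1)}$
where $n$ is the number of observations and $p$ is the number of independent variables in the regression model; MSE stands for “the mean square error” and provides the estimate of $\sigma^2$.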

Excel’s Regression Statistics
Standard error of the estimate: s = sqrt[91.88949/(20-3-1)] = 2.396475
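The arithmetic on this slide can be checked with a few lines of Python. This is a minimal sketch using only the numbers quoted above (SSE = 91.88949, n = 20, p = 3); the function name `standard_error_of_estimate` is made up for illustration.

```python
import math

def standard_error_of_estimate(sse, n, p):
    """Return (MSE, s), where MSE = SSE / (n - p - 1) and s = sqrt(MSE)."""
    mse = sse / (n - p - 1)
    return mse, math.sqrt(mse)

# Values quoted on the slide: SSE = 91.88949, n = 20 observations, p = 3 variables.
mse, s = standard_error_of_estimate(sse=91.88949, n=20, p=3)
print(f"MSE = {mse:.4f}")
print(f"s   = {s:.6f}")
```

The resulting MSE (about 5.7431) is the same quantity that appears in the denominator of the F statistic later in the example.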

Estimated standard deviation (standard error) of $b_i$: $s_{b_i}$, obtained by replacing the unknown $\sigma$ with its estimate $s$.

Consequently, we use the t-distribution to test the hypotheses. The t test for a significant relationship is based on the fact that the test statistic $t = (b_i - \beta_i)/s_{b_i}$ follows a t-distribution with $n - p - 1$ degrees of freedom.

Tests for individual significance

OUR EXAMPLE: Testing for significance: t Test
1. Determine the hypotheses. $H_0: \beta_1 = 0$; $H_a: \beta_1 \ne 0$.
2. Specify the sampling distribution of $b_1$ assuming that the null hypothesis is true.
3. Specify the level of significance. $\alpha = 0.05$.

OUR EXAMPLE: Testing for significance: t Test
4. Select the test statistic and state the rejection rule.
Standardized (t-value) approach: The test statistic is $t = b_1/s_{b_1}$. For $\alpha = 0.05$ and d.f. = 16, the critical value is $t_{0.025} = 2.120$. Reject $H_0$ if $t < -2.120$ or $t > 2.120$.
p-value approach: Reject $H_0$ if p-value < 0.05.

OUR EXAMPLE: Testing for significance: t Test
5. Compute the value of the test statistic: t = 3.8561.
6. Determine whether to reject $H_0$.
p-value approach: The p-value = 0.0014 < alpha = 0.05. Reject $H_0$.
t-value approach: t = 3.8561 > t critical = 2.120. Reject $H_0$.
We conclude that $\beta_1$ is not equal to zero. The evidence is sufficient to conclude that a statistically significant relationship exists between the annual salary and the years of experience.

OUR EXAMPLE: Testing for significance: t Test
[Figure: Excel’s Regression Equation Output, showing the t statistic and p-value used to test for the individual significance of “Experience”. Note: Columns F-I are not shown.]

OUR EXAMPLE: Testing for significance: t Test
4. Select the test statistic and state the rejection rule.
Standardized (t-value) approach: The test statistic is $t = b_2/s_{b_2}$. For $\alpha = 0.05$ and d.f. = 16, the critical value is $t_{0.025} = 2.120$. Reject $H_0$ if $t < -2.120$ or $t > 2.120$.
p-value approach: Reject $H_0$ if p-value < 0.05.

OUR EXAMPLE: Testing for significance: t Test
5. Compute the value of the test statistic: t = 2.1905.
6. Determine whether to reject $H_0$.
p-value approach: The p-value = 0.04364 < alpha = 0.05. Reject $H_0$.
t-value approach: t = 2.1905 > t critical = 2.120. Reject $H_0$.
We conclude that $\beta_2$ is not equal to zero. The evidence is sufficient to conclude that a statistically significant relationship exists between the annual salary and the score on the programmer aptitude test.

OUR EXAMPLE: Testing for significance: t Test
[Figure: Excel’s Regression Equation Output, showing the t statistic and p-value used to test for the individual significance of “Test Score”. Note: Columns F-I are not shown.]

OUR EXAMPLE: Testing for significance: t Test
4. Select the test statistic and state the rejection rule.
Standardized (t-value) approach: The test statistic is $t = b_3/s_{b_3}$. For $\alpha = 0.05$ and d.f. = 16, the critical value is $t_{0.025} = 2.120$. Reject $H_0$ if $t < -2.120$ or $t > 2.120$.
p-value approach: Reject $H_0$ if p-value < 0.05.

OUR EXAMPLE: Testing for significance: t Test
5. Compute the value of the test statistic: t = 1.1479.
6. Determine whether to reject $H_0$.
p-value approach: The p-value = 0.26789 > alpha = 0.05. Do not reject $H_0$.
t-value approach: t = 1.1479 < t critical = 2.120. Do not reject $H_0$.
The evidence is insufficient to reject $H_0$. We cannot conclude that $\beta_3$ differs from zero; there is no statistically significant relationship between the annual salary and whether the individual has a graduate degree in computer science or information systems.

OUR EXAMPLE: Testing for significance: t Test
[Figure: Excel’s Regression Equation Output, showing the t statistic and p-value used to test for the individual significance of “Grad. Degr.”. Note: Columns F-I are not shown.]
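The three individual tests follow the same rule, so they can be replayed in one loop. The sketch below uses only the t statistics, p-values, critical value, and significance level reported in the slides; the helper names `reject_by_t` and `reject_by_p` are made up for illustration.

```python
ALPHA = 0.05        # level of significance used in the example
T_CRITICAL = 2.120  # t_{0.025} with 16 degrees of freedom, from the t table

# (t statistic, p-value) pairs as reported in the Excel output.
REPORTED = {
    "Experience": (3.8561, 0.0014),
    "Test Score": (2.1905, 0.04364),
    "Grad. Degr.": (1.1479, 0.26789),
}

def reject_by_t(t_stat, t_critical=T_CRITICAL):
    """Two-tailed t-value rule: reject H0 when |t| exceeds the critical value."""
    return abs(t_stat) > t_critical

def reject_by_p(p_value, alpha=ALPHA):
    """p-value rule: reject H0 when the p-value is below alpha."""
    return p_value < alpha

decisions = {}
for name, (t_stat, p_value) in REPORTED.items():
    by_t = reject_by_t(t_stat)
    by_p = reject_by_p(p_value)
    decisions[name] = (by_t, by_p)
    verdict = "reject H0" if by_t else "do not reject H0"
    print(f"{name}: t = {t_stat}, p = {p_value} -> {verdict}")
```

For each variable the two approaches agree, as they must: the p-value falls below alpha exactly when |t| exceeds the critical value.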

Confidence interval for $\beta_i$
We can use a $100(1 - \alpha)\%$ confidence interval for $\beta_i$ to test the hypotheses just used in the t test. $H_0$ is rejected if the hypothesized value of $\beta_i$ is not included in the confidence interval for $\beta_i$.

Confidence interval for $\beta_i$
The form of a confidence interval for $\beta_i$ is:
$b_i \pm t_{\alpha/2}\, s_{b_i}$
where $b_i$ is the point estimator, $t_{\alpha/2}$ is the t value providing an area of $\alpha/2$ in the upper tail of a t distribution with $n - p - 1$ degrees of freedom, and $t_{\alpha/2}\, s_{b_i}$ is the margin of error.
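The interval formula and the associated decision rule can be sketched as follows. The coefficient estimates and standard errors used here are hypothetical placeholders, not the values from the Programmer Salary Survey output; only the critical value 2.120 (for $\alpha = 0.05$ and 16 d.f.) comes from the example.

```python
def confidence_interval(b, se, t_crit):
    """Return (lower, upper) for b +/- t_crit * se."""
    margin = t_crit * se  # margin of error
    return b - margin, b + margin

def rejects_null(interval, hypothesized=0.0):
    """Reject H0 when the hypothesized value lies outside the interval."""
    lower, upper = interval
    return not (lower <= hypothesized <= upper)

T_CRITICAL = 2.120  # t_{0.025} with 16 d.f., as in the example

# Hypothetical estimate and standard error (illustration only).
ci_a = confidence_interval(b=1.5, se=0.5, t_crit=T_CRITICAL)
# A second hypothetical case with a smaller estimate.
ci_b = confidence_interval(b=0.5, se=0.5, t_crit=T_CRITICAL)

print(ci_a, "reject H0:", rejects_null(ci_a))
print(ci_b, "reject H0:", rejects_null(ci_b))
```

In the first case the interval excludes 0 (here t = 1.5/0.5 = 3.0 exceeds 2.120), so $H_0$ is rejected; in the second it contains 0 (t = 1.0), so $H_0$ is not rejected. The interval check and the two-tailed t test always give the same decision.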

t-values in EXCEL =TINV(probability,degrees_freedom) Probability is the probability associated with the two-tailed Student's t-distribution. Degrees_freedom is the number of degrees of freedom with which to characterize the distribution. =TINV(0.05,16) = 2.119905285. The t table in the textbook shows 2.120.
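For readers working outside Excel, the same two-tailed critical value can be approximated with the Python standard library alone. The sketch below is one possible approach, not Excel's algorithm: it integrates the t density with Simpson's rule and inverts the result by bisection. The function names are made up for illustration, and the search range assumes the answer lies between 0 and 100. In practice a statistics library (for example SciPy's `scipy.stats.t.ppf`) would normally be used instead.

```python
import math

def t_cdf(x, df, steps=2000):
    """Student's t CDF via Simpson's rule on [0, |x|], using symmetry about 0."""
    const = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(v):
        return const * (1 + v * v / df) ** (-(df + 1) / 2)

    a = abs(x)
    h = a / steps  # steps is even, as Simpson's rule requires
    total = pdf(0.0) + pdf(a)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * pdf(k * h)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_inv_two_tailed(probability, df):
    """Value t with P(|T| > t) = probability, mirroring =TINV(probability, df)."""
    target = 1 - probability / 2  # upper-tail area is probability / 2
    lo, hi = 0.0, 100.0           # assumed search range
    for _ in range(60):           # bisection on the monotone CDF
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

t_critical = t_inv_two_tailed(0.05, 16)
print(f"t critical = {t_critical:.4f}")
```

The result agrees with the =TINV(0.05,16) value quoted above to several decimal places.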

OUR EXAMPLE: 95% Confidence interval for $\beta_1$
Conclusion: 0 is not included in the confidence interval. Therefore, reject $H_0$.

OUR EXAMPLE: 95% Confidence interval for $\beta_2$
Conclusion: 0 is not included in the confidence interval. Therefore, reject $H_0$.

OUR EXAMPLE: 95% Confidence interval for $\beta_3$
Conclusion: 0 is included in the confidence interval. Therefore, do not reject $H_0$.

[Figure: Excel’s Regression Equation Output, showing the confidence intervals for $\beta_1$, $\beta_2$, $\beta_3$. Note: Columns C-E are hidden.]

The test for overall significance

GENERAL STEPS: Testing for significance: F Test
1. Determine the hypotheses. $H_0: \beta_1 = \beta_2 = \dots = \beta_p = 0$; $H_a$: One or more of the parameters is not equal to zero.
2. Select the test statistic and specify its distribution. $F = \text{MSR}/\text{MSE}$ (see the next slide), which follows an F distribution with $p$ d.f. in the numerator and $n - p - 1$ d.f. in the denominator.

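F-statistic: $F = \text{MSR}/\text{MSE}$, where $\text{MSR} = \text{SSR}/p$ and $\text{MSE} = \text{SSE}/(n - p - 1)$.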

GENERAL STEPS: Testing for significance: F Test
3. Specify the level of significance.
4. State the rejection rule. p-value approach: Reject $H_0$ if p-value < $\alpha$. F-value approach: Reject $H_0$ if $F > F_\alpha$ (critical).
5. Compute the value of the test statistic.
6. Determine whether to reject $H_0$.

OUR EXAMPLE: Testing for significance: F Test
1. Determine the hypotheses. $H_0: \beta_1 = \beta_2 = \beta_3 = 0$; $H_a$: One or more of the parameters is not equal to zero.
2. Select the test statistic and specify its distribution. $F = \text{MSR}/\text{MSE}$, which follows an F distribution with 3 d.f. in the numerator and 16 d.f. in the denominator.

OUR EXAMPLE: Testing for significance: F Test
3. Specify the level of significance. $\alpha = 0.05$.
4. State the rejection rule. p-value approach: Reject $H_0$ if p-value < 0.05. F-value approach: For $\alpha = 0.05$ and d.f. = 3, 16, $F_{0.05} = 3.24$. Reject $H_0$ if F > 3.24.

OUR EXAMPLE: Testing for significance: F Test
5. Compute the value of the test statistic. F = MSR/MSE = 169.2987/5.7431 = 29.4787; p-value = 0.0000009417 (Excel printout).
6. Determine whether to reject $H_0$.
p-value approach: The p-value = 0.0000009417 < alpha = 0.05. Reject $H_0$.
F-value approach: F = 29.4787 > F critical = 3.24. Reject $H_0$.
We conclude that a statistically significant relationship is present between the annual salary and the three independent variables: the years of experience, the score on the programmer aptitude test, and whether the individual has a graduate degree in computer science or information systems.
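The F computation and the F-value decision rule can be reproduced directly from the quoted mean squares. This is a minimal sketch: the function name `f_statistic` is made up for illustration, and because the inputs are the rounded values printed on the slide, the ratio can differ from Excel's 29.4787 in the fourth decimal place.

```python
def f_statistic(msr, mse):
    """Overall-significance test statistic: F = MSR / MSE."""
    return msr / mse

MSR = 169.2987     # mean square due to regression, from the ANOVA output
MSE = 5.7431       # mean square error, from the ANOVA output
F_CRITICAL = 3.24  # F_{0.05} with 3 and 16 d.f., from the F table

f_value = f_statistic(MSR, MSE)
reject_h0 = f_value > F_CRITICAL

print(f"F = {f_value:.2f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

The computed F of roughly 29.48 is far above the critical value of 3.24, matching the conclusion of overall significance.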

OUR EXAMPLE: Testing for significance: F Test
[Figure: Excel’s ANOVA Output, showing the F statistic, MSR, and MSE.]

OUR EXAMPLE: Testing for significance: F Test
[Figure: Excel’s ANOVA Output, showing the p-value used to test for overall significance.]

Some cautions about the interpretation of significance tests
Just because we are able to reject $H_0: \beta_i = 0$ and demonstrate statistical significance does not enable us to conclude that there is a linear relationship between $x_i$ and $y$. (See pp. 593-594 in the textbook.)
Rejecting $H_0: \beta_i = 0$ and concluding that the relationship between $x_i$ and $y$ is significant does not enable us to conclude that a cause-and-effect relationship is present between $x_i$ and $y$.
