QMS 6351 Statistics and Research Methods Regression Analysis: Testing for Significance Chapter 14 (14.5-14.6) Chapter 15 (15.5) Prof. Vera Adamchik.


1 QMS 6351 Statistics and Research Methods Regression Analysis: Testing for Significance Chapter 14 (14.5-14.6) Chapter 15 (15.5) Prof. Vera Adamchik

2 Multiple Regression Model: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p + \varepsilon$
Multiple Regression Equation: $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p$, where the unknown parameters are $\beta_0, \beta_1, \beta_2, \dots, \beta_p$.
Sample data: observations of $x_1, x_2, \dots, x_p$ and $y$.
Estimated Multiple Regression Equation: $\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_p x_p$, where the sample statistics $b_0, b_1, b_2, \dots, b_p$ provide estimates of $\beta_0, \beta_1, \beta_2, \dots, \beta_p$.

3 Hypotheses about $\beta_i$
Two-tailed: $H_0: \beta_i =$ specific value, $H_a: \beta_i \neq$ specific value
Lower-tailed: $H_0: \beta_i \geq$ specific value, $H_a: \beta_i <$ specific value
Upper-tailed: $H_0: \beta_i \leq$ specific value, $H_a: \beta_i >$ specific value
The most common hypothesis is whether $\beta_i$ equals zero (that is, no relationship between $y$ and $x_i$).

4 To learn how to test for a significant regression relationship, we will use the “Programmer Salary Survey” example from the “Ch. 14-15 Part 1” Power Point file.

5 Testing for significance Two tests are commonly used: the t test and the F test. In simple linear regression, the F and t tests provide the same conclusion. In multiple regression, the F and t tests have different purposes.

6 The F test The F test is used to determine whether a significant relationship exists between the dependent variable and the set of all the independent variables. The F test is referred to as the test for overall significance.

7 The t test If the F test shows overall significance, the t test is used to determine whether each of the individual independent variables is significant. A separate t test is conducted for each of the independent variables in the model. We refer to each of these t tests as a test for individual significance.

8 Different samples from the same population will produce different values for b i (that is, b 0, b 1, b 2, b 3, etc.). Hence, the estimated regression coefficients are random variables. To test the hypotheses, we need to know the sampling distribution of b i, that is, the sampling distribution of b 1, the sampling distribution of b 2, etc.

9 Sampling distribution of $b_i$ Because of the assumption of normally distributed random errors, the sampling distribution of $b_i$ is normal. Its mean is $E(b_i) = \beta_i$, and its standard deviation (a.k.a. standard error) $\sigma_{b_i}$ is proportional to $\sigma$, where $\sigma$ is the standard deviation of the error term $\varepsilon$ in the regression model.

10 Sampling distributions of $b_i$ [figure: sampling distributions of $b_1$, $b_2$, etc.]
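The idea that an estimated coefficient varies from sample to sample can be illustrated with a small simulation. The sketch below is not part of the slides: it uses the simple linear regression special case (one independent variable), where the standard error of $b_1$ is $\sigma / \sqrt{\sum (x - \bar{x})^2}$, and the population values `BETA0`, `BETA1`, `SIGMA` and the `XS` grid are hypothetical choices made only for illustration.

```python
import math
import random
import statistics


def fit_slope(xs, ys):
    """Least-squares slope b1 for a simple linear regression of ys on xs."""
    x_bar = statistics.fmean(xs)
    y_bar = statistics.fmean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


def simulate_slopes(beta0, beta1, sigma, xs, n_samples, seed=0):
    """Draw n_samples data sets from y = beta0 + beta1*x + eps and fit each one."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(n_samples):
        ys = [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]
        slopes.append(fit_slope(xs, ys))
    return slopes


# Hypothetical population (assumed for illustration, not from the slides).
BETA0, BETA1, SIGMA = 2.0, 0.5, 1.0
XS = [float(x) for x in range(20)]

slopes = simulate_slopes(BETA0, BETA1, SIGMA, XS, n_samples=2000)
mean_b1 = statistics.fmean(slopes)   # should be close to BETA1
sd_b1 = statistics.stdev(slopes)     # empirical standard error of b1

# Theoretical standard error of b1 in simple linear regression.
x_bar = statistics.fmean(XS)
theoretical_sd = SIGMA / math.sqrt(sum((x - x_bar) ** 2 for x in XS))

print(f"mean of b1 across samples: {mean_b1:.4f} (true beta1 = {BETA1})")
print(f"std. dev. of b1 across samples: {sd_b1:.4f} (theory: {theoretical_sd:.4f})")
```

The simulated slopes center on the true slope and their spread tracks the theoretical standard error, which is the behavior the t test relies on.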

11 Because we do not know the value of $\sigma$, we use an estimate of $\sigma$ (see the next slide).

12 An estimate of $\sigma$, denoted $s$, is referred to as the standard error of the estimate: $s = \sqrt{\mathrm{MSE}} = \sqrt{\mathrm{SSE} / (n - p - 1)}$, where $p$ is the number of independent variables in the regression model; MSE stands for “the mean square error” and provides the estimate of $\sigma^2$.

13 Excel’s Regression Statistics output. Standard error of the estimate: $s = \sqrt{91.88949 / (20 - 3 - 1)} = 2.396475$
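The arithmetic on this slide can be checked with a short sketch. The function name `standard_error_of_estimate` is hypothetical; the inputs (SSE = 91.88949, n = 20, p = 3) are the values shown on the slide.

```python
import math


def standard_error_of_estimate(sse, n, p):
    """Return s = sqrt(SSE / (n - p - 1)), the standard error of the estimate."""
    df_error = n - p - 1
    if df_error <= 0:
        raise ValueError("need n > p + 1 observations")
    mse = sse / df_error  # mean square error
    return math.sqrt(mse)


# Values taken from the slide: SSE, sample size, number of independent variables.
SSE, N, P = 91.88949, 20, 3
s = standard_error_of_estimate(SSE, N, P)
print(f"s = {s:.6f}")
```

With these inputs the result agrees with the value Excel reports in the Regression Statistics block.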

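14 Estimated standard deviation (standard error) of $b_i$: $s_{b_i}$, obtained by substituting the estimate $s$ for $\sigma$ in the formula for $\sigma_{b_i}$.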

15 Consequently, we use the t-distribution to test the hypotheses. The t test for a significant relationship is based on the fact that the test statistic $t = (b_i - \beta_i) / s_{b_i}$ follows a t-distribution with $n - p - 1$ degrees of freedom.

16 Tests for individual significance

17 OUR EXAMPLE: Testing for significance: t Test (Experience, $\beta_1$)
1. Determine the hypotheses: $H_0: \beta_1 = 0$, $H_a: \beta_1 \neq 0$.
2. Specify the sampling distribution of $b_1$ assuming that the null hypothesis is true.
3. Specify the level of significance: $\alpha = 0.05$.

18 OUR EXAMPLE: Testing for significance: t Test (Experience, $\beta_1$)
4. Select the test statistic and state the rejection rule.
Standardized (t-value) approach: The test statistic is $t = b_1 / s_{b_1}$. For $\alpha = 0.05$ and d.f. = 16, the critical value is $t_{0.025} = 2.120$. Reject $H_0$ if $|t| > 2.120$.
p-value approach: Reject $H_0$ if p-value < 0.05.

19 OUR EXAMPLE: Testing for significance: t Test (Experience, $\beta_1$)
5. Compute the value of the test statistic.
6. Determine whether to reject $H_0$.
Standardized approach: $t = 3.8561 > t_{\text{critical}} = 2.120$. Reject $H_0$.
p-value approach: The p-value = 0.0014 < $\alpha$ = 0.05. Reject $H_0$.
We conclude that $\beta_1$ is not equal to zero. The evidence is sufficient to conclude that a statistically significant relationship exists between the annual salary and the years of experience.

20 OUR EXAMPLE: Testing for significance: t Test. Excel’s Regression Equation Output: t statistic and p-value used to test for the individual significance of “Experience”. Note: Columns F-I are not shown.

21 OUR EXAMPLE: Testing for significance: t Test (Test Score, $\beta_2$)
1. Determine the hypotheses: $H_0: \beta_2 = 0$, $H_a: \beta_2 \neq 0$.
2. Specify the sampling distribution of $b_2$ assuming that the null hypothesis is true.
3. Specify the level of significance: $\alpha = 0.05$.

22 OUR EXAMPLE: Testing for significance: t Test (Test Score, $\beta_2$)
4. Select the test statistic and state the rejection rule.
Standardized (t-value) approach: The test statistic is $t = b_2 / s_{b_2}$. For $\alpha = 0.05$ and d.f. = 16, the critical value is $t_{0.025} = 2.120$. Reject $H_0$ if $|t| > 2.120$.
p-value approach: Reject $H_0$ if p-value < 0.05.

23 OUR EXAMPLE: Testing for significance: t Test (Test Score, $\beta_2$)
5. Compute the value of the test statistic.
6. Determine whether to reject $H_0$.
Standardized approach: $t = 2.1905 > t_{\text{critical}} = 2.120$. Reject $H_0$.
p-value approach: The p-value = 0.04364 < $\alpha$ = 0.05. Reject $H_0$.
We conclude that $\beta_2$ is not equal to zero. The evidence is sufficient to conclude that a statistically significant relationship exists between the annual salary and the score on the programmer aptitude test.

24 OUR EXAMPLE: Testing for significance: t Test. Excel’s Regression Equation Output: t statistic and p-value used to test for the individual significance of “Test Score”. Note: Columns F-I are not shown.

25 OUR EXAMPLE: Testing for significance: t Test (Grad. Degr., $\beta_3$)
1. Determine the hypotheses: $H_0: \beta_3 = 0$, $H_a: \beta_3 \neq 0$.
2. Specify the sampling distribution of $b_3$ assuming that the null hypothesis is true.
3. Specify the level of significance: $\alpha = 0.05$.

26 OUR EXAMPLE: Testing for significance: t Test (Grad. Degr., $\beta_3$)
4. Select the test statistic and state the rejection rule.
Standardized (t-value) approach: The test statistic is $t = b_3 / s_{b_3}$. For $\alpha = 0.05$ and d.f. = 16, the critical value is $t_{0.025} = 2.120$. Reject $H_0$ if $|t| > 2.120$.
p-value approach: Reject $H_0$ if p-value < 0.05.

27 OUR EXAMPLE: Testing for significance: t Test (Grad. Degr., $\beta_3$)
5. Compute the value of the test statistic.
6. Determine whether to reject $H_0$.
Standardized approach: $t = 1.1479 < t_{\text{critical}} = 2.120$. Do not reject $H_0$.
p-value approach: The p-value = 0.26789 > $\alpha$ = 0.05. Do not reject $H_0$.
The evidence is insufficient to reject $H_0$. We cannot conclude that $\beta_3$ differs from zero: the data do not show a statistically significant relationship between the annual salary and whether the individual has a graduate degree in computer science or information systems.

28 OUR EXAMPLE: Testing for significance: t Test. Excel’s Regression Equation Output: t statistic and p-value used to test for the individual significance of “Grad. Degr.”. Note: Columns F-I are not shown.
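The three individual t tests follow the same two rejection rules, so they can be sketched in a few lines. The t statistics, p-values, critical value, and significance level below are the ones quoted on the slides; the function and variable names are hypothetical.

```python
def reject_by_t(t_stat, t_critical):
    """Two-tailed rule: reject H0: beta_i = 0 when |t| exceeds the critical value."""
    return abs(t_stat) > t_critical


def reject_by_p(p_value, alpha):
    """p-value rule: reject H0 when the p-value is below the significance level."""
    return p_value < alpha


ALPHA = 0.05
T_CRITICAL = 2.120  # t_{0.025} with 16 d.f., as given on the slides

# (t statistic, p-value) for each independent variable, from the slides.
RESULTS = {
    "Experience": (3.8561, 0.0014),
    "Test Score": (2.1905, 0.04364),
    "Grad. Degr.": (1.1479, 0.26789),
}

decisions = {}
for name, (t_stat, p_value) in RESULTS.items():
    by_t = reject_by_t(t_stat, T_CRITICAL)
    by_p = reject_by_p(p_value, ALPHA)
    decisions[name] = by_t
    verdict = "reject H0" if by_t else "do not reject H0"
    print(f"{name}: t = {t_stat}, p = {p_value} -> {verdict} (rules agree: {by_t == by_p})")
```

Both rules give the same decision for every variable, as they should, since they are two views of the same test.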

29 Confidence interval for $\beta_i$ We can use a $100(1 - \alpha)\%$ confidence interval for $\beta_i$ to test the hypotheses just used in the t test. $H_0$ is rejected if the hypothesized value of $\beta_i$ is not included in the confidence interval for $\beta_i$.

30 Confidence interval for $\beta_i$ The form of a confidence interval for $\beta_i$ is $b_i \pm t_{\alpha/2}\, s_{b_i}$, where $b_i$ is the point estimator, $t_{\alpha/2}\, s_{b_i}$ is the margin of error, and $t_{\alpha/2}$ is the t value providing an area of $\alpha/2$ in the upper tail of a t distribution with $n - p - 1$ degrees of freedom.
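A minimal sketch of the interval and of the decision rule based on it is given below. The coefficient and standard-error values are hypothetical placeholders (the Excel output with the actual estimates is not reproduced in this transcript); only the critical value 2.120 for d.f. = 16 comes from the slides.

```python
def confidence_interval(b, se_b, t_critical):
    """Return (lower, upper) for b +/- t_critical * se_b."""
    margin = t_critical * se_b  # margin of error
    return (b - margin, b + margin)


def rejects(interval, hypothesized=0.0):
    """Reject H0 when the hypothesized value lies outside the interval."""
    lower, upper = interval
    return not (lower <= hypothesized <= upper)


T_CRITICAL = 2.120  # t_{0.025} with 16 d.f., as given on the slides

# Hypothetical estimates, chosen only to show both outcomes.
ci_a = confidence_interval(b=1.0, se_b=0.25, t_critical=T_CRITICAL)
ci_b = confidence_interval(b=0.5, se_b=0.5, t_critical=T_CRITICAL)

for label, ci in (("A", ci_a), ("B", ci_b)):
    verdict = "reject H0" if rejects(ci) else "do not reject H0"
    print(f"case {label}: ({ci[0]:.3f}, {ci[1]:.3f}) -> {verdict}")
```

Case A excludes zero and case B includes it, which mirrors the two kinds of conclusion drawn in the worked example.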

31 t-values in EXCEL =TINV(probability,degrees_freedom) Probability is the probability associated with the two-tailed Student's t-distribution. Degrees_freedom is the number of degrees of freedom with which to characterize the distribution. =TINV(0.05,16) = 2.119905285. The t table in the textbook shows 2.120.

32 OUR EXAMPLE: 95% Confidence interval for $\beta_1$ Conclusion: 0 is not included in the confidence interval. Therefore, reject $H_0$.

33 OUR EXAMPLE: 95% Confidence interval for $\beta_2$ Conclusion: 0 is not included in the confidence interval. Therefore, reject $H_0$.

34 OUR EXAMPLE: 95% Confidence interval for $\beta_3$ Conclusion: 0 is included in the confidence interval. Therefore, do not reject $H_0$.

35 Excel’s Regression Equation Output: confidence intervals for $\beta_1$, $\beta_2$, $\beta_3$. Note: Columns C-E are hidden.

36 The test for overall significance

37 GENERAL STEPS: Testing for significance: F Test
1. Determine the hypotheses. $H_0: \beta_1 = \beta_2 = \dots = \beta_p = 0$; $H_a$: One or more of the parameters is not equal to zero.
2. Select the test statistic and specify its distribution. $F = \mathrm{MSR}/\mathrm{MSE}$ (see the next slide) follows an F distribution with $p$ d.f. in the numerator and $n - p - 1$ d.f. in the denominator.

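38 F-statistic: $F = \mathrm{MSR}/\mathrm{MSE}$, where $\mathrm{MSR} = \mathrm{SSR}/p$ and $\mathrm{MSE} = \mathrm{SSE}/(n - p - 1)$.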

39 GENERAL STEPS: Testing for significance: F Test
3. Specify the level of significance $\alpha$.
4. State the rejection rule. p-value approach: Reject $H_0$ if p-value < $\alpha$. F-value approach: Reject $H_0$ if $F > F_\alpha$ (critical).
5. Compute the value of the test statistic.
6. Determine whether to reject $H_0$.

40 OUR EXAMPLE: Testing for significance: F Test
1. Determine the hypotheses. $H_0: \beta_1 = \beta_2 = \beta_3 = 0$; $H_a$: One or more of the parameters is not equal to zero.
2. Select the test statistic and specify its distribution. $F = \mathrm{MSR}/\mathrm{MSE}$ follows an F distribution with 3 d.f. in the numerator and 16 d.f. in the denominator.

41 OUR EXAMPLE: Testing for significance: F Test
3. Specify the level of significance: $\alpha = 0.05$.
4. State the rejection rule. p-value approach: Reject $H_0$ if p-value < 0.05. F-value approach: For $\alpha = 0.05$ and d.f. = 3, 16, $F_{0.05} = 3.24$. Reject $H_0$ if $F > 3.24$.

42 OUR EXAMPLE: Testing for significance: F Test
5. Compute the value of the test statistic. $F = \mathrm{MSR}/\mathrm{MSE} = 169.2987 / 5.7431 = 29.4787$; p-value = 0.0000009417 (Excel printout).
6. Determine whether to reject $H_0$.
F-value approach: $F = 29.4787 > F_{\text{critical}} = 3.24$. Reject $H_0$.
p-value approach: The p-value = 0.0000009417 < $\alpha$ = 0.05. Reject $H_0$.
We conclude that a statistically significant relationship is present between the annual salary and the three independent variables: the years of experience, the score on the programmer aptitude test, and whether the individual has a graduate degree in computer science or information systems.
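The F computation can be reproduced with a short sketch. MSR, SSE, n, p, and the critical value are taken from the slides; the helper names are hypothetical. MSE is recomputed here from SSE and its degrees of freedom, so the last digit of F may differ slightly from a calculation that starts from the rounded MSE of 5.7431.

```python
def mean_square(sum_of_squares, degrees_of_freedom):
    """Divide a sum of squares by its degrees of freedom."""
    if degrees_of_freedom <= 0:
        raise ValueError("degrees of freedom must be positive")
    return sum_of_squares / degrees_of_freedom


def overall_f_test(msr, mse, f_critical):
    """Return (F, reject) where reject is True when F exceeds the critical value."""
    f_stat = msr / mse
    return f_stat, f_stat > f_critical


# Values quoted on the slides.
N, P = 20, 3          # observations, independent variables
SSE = 91.88949        # sum of squares due to error
MSR = 169.2987        # mean square due to regression
F_CRITICAL = 3.24     # F_{0.05} with (3, 16) d.f.

mse = mean_square(SSE, N - P - 1)
f_stat, reject_h0 = overall_f_test(MSR, mse, F_CRITICAL)
print(f"MSE = {mse:.4f}, F = {f_stat:.2f}, reject H0: {reject_h0}")
```

The statistic lands far above the critical value, so the conclusion does not depend on rounding in the inputs.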

43 OUR EXAMPLE: Testing for significance: F Test. Excel’s ANOVA Output: MSR, MSE, and the F statistic.

44 OUR EXAMPLE: Testing for significance: F Test. Excel’s ANOVA Output: p-value used to test for overall significance.

45 Some cautions about the interpretation of significance tests
Just because we are able to reject $H_0: \beta_i = 0$ and demonstrate statistical significance does not enable us to conclude that there is a linear relationship between $x_i$ and $y$. (See pp. 593-594 in the textbook.)
Rejecting $H_0: \beta_i = 0$ and concluding that the relationship between $x_i$ and $y$ is significant does not enable us to conclude that a cause-and-effect relationship is present between $x_i$ and $y$.

