1 1 Slide Simple Linear Regression Coefficient of Determination Chapter 14 BA 303 – Spring 2011
2 2 Slide COEFFICIENT OF DETERMINATION
3 3 Slide Assessing the Regression Model n n The Coefficient of Determination provides a measure of the goodness of fit for the estimated regression equation. n n Sum of Squares Coefficient of Determination Correlation Coefficient
4 4 Slide Coefficient of Determination Relationship Among SST, SSR, SSE where: SST = total sum of squares SSR = sum of squares due to regression SSE = sum of squares due to error SST = SSR + SSE
5 5 Slide Sum of Squares Reed Auto Sales What is the relationship between TV ads and auto sales?
6 6 Slide Sum of Squares Due to Error a b c d e
7 7 Slide Sum of Squares Due to Regression a b c d e
8 8 Slide Sum of Squares Total a b c d e
9 9 Slide SST = SSR + SSE? SST = SSR + SSE 114 =
10 Slide The coefficient of determination is: Coefficient of Determination where: SSR = sum of squares due to regression SST = total sum of squares r 2 = SSR/SST
11 Slide Coefficient of Determination r 2 = SSR/SST = 100/114 = The regression relationship is very strong; 87.72% of the variability in the number of cars sold can be explained by the linear relationship between the number of TV ads and the number of cars sold.
12 Slide The sign of b 1 in the equation is “+”. Sample Correlation Coefficient r xy = The correlation coefficient of indicates a strong positive relationship between the independent variable and the dependent variable.
13 Slide SUM OF SQUARES PRACTICE
14 Slide Practice = *x
15 Slide Sum of Squares Due to Error = *x
16 Slide Sum of Squares Due to Regression
17 Slide Sum of Squares Total
18 Slide SST = SSR + SSE? SST = SSR + SSE
19 Slide Coefficient of Determination r 2 = SSR/SST
20 Slide Correlation Coefficient = *x
21 Slide TESTS FOR SIGNIFICANCE
22 Slide Testing for Significance To test for a significant regression relationship, we must conduct a hypothesis test to determine whether the value of 1 is zero. Two tests are commonly used: t Test and F Test Both the t test and F test require an estimate of 2, the variance of in the regression model.
23 Slide An Estimate of 2 Testing for Significance where: s 2 = MSE = SSE/( n 2) The mean square error (MSE) provides the estimate of 2, and the notation s 2 is also used.
24 Slide Testing for Significance An Estimate of To estimate we take the square root of 2. The resulting s is called the standard error of the estimate.
25 Slide t TEST
26 Slide Hypotheses Test Statistic Testing for Significance: t Test where
27 Slide Rejection Rule Testing for Significance: t Test where: is the desired level of significance t is based on a t distribution with n - 2 degrees of freedom Reject H 0 if p -vzalue t
28 Slide t Distribution Table
29 Slide 1. Determine the hypotheses. 2. Specify the level of significance. 3. Select the test statistic. = State the rejection rule. Reject H 0 if p -value (with 3 degrees of freedom) Testing for Significance: t Test
30 Slide Testing for Significance: t Test 5. Compute the value of the test statistic. 6. Determine whether to reject H 0. t = 4.63 > We can reject H 0.
31 Slide A Little Bit Slower! t Test Already known: b 1 =5 n=5 Today: SSE = 14 So s = = 2.16
32 Slide A Little Bit Slower! t Test = 2.16/2 = 1.08 = 5/1.08 = = 4.63 Since our t=4.63 is greater than the test t /2 =3.182, we reject the null hypothesis and conclude b 1 is not equal to zero.
33 Slide PRACTICE t TEST
34 Slide t Test b 1 = 2.6 n = 5What We Know SSE = 12.4 Find t for = 0.10
35 Slide t Test b 1 = 2.6 n = 5 SSE = 12.4 t for = 0.10
36 Slide t Test b 1 = 2.6 n = 5 SSE = 12.4
37 Slide t Test t for = 0.10 Conclusion?
38 Slide t CONFIDENCE INTERVAL
39 Slide Confidence Interval for 1 H 0 is rejected if the hypothesized value of 1 is not included in the confidence interval for 1. We can use a 95% confidence interval for 1 to test the hypotheses just used in the t test.
40 Slide The form of a confidence interval for 1 is: Confidence Interval for 1 where is the t value providing an area of /2 in the upper tail of a t distribution with n - 2 degrees of freedom b 1 is the point estimator is the margin of error
41 Slide Confidence Interval for 1 Reject H 0 if 0 is not included in the confidence interval for 1. 0 is not included in the confidence interval. Reject H 0 = 5 +/ (1.08) = 5 +/ or 1.56 to 8.44 n Rejection Rule 95% Confidence Interval for 1 95% Confidence Interval for 1 n Conclusion
42 Slide F TEST
43 Slide n Hypotheses n Test Statistic Testing for Significance: F Test F = MSR/MSE Table 4 – F Distribution MSR d.f. = 1 (numerator) MSE d.f. = n – 2 (denominator)
44 Slide n Rejection Rule Testing for Significance: F Test where: F is based on an F distribution with 1 degree of freedom in the numerator and n - 2 degrees of freedom in the denominator Reject H 0 if p -value F
45 Slide 1. Determine the hypotheses. 2. Specify the level of significance. 3. Select the test statistic. = 0.05 F = State the rejection rule. Reject H 0 if p -value (with 1 d.f. in numerator and 3 d.f. in denominator) Testing for Significance: F Test F = MSR/MSE
46 Slide Testing for Significance: F Test 5. Compute the value of the test statistic. F = provides an area of 0.05 in the tail. Thus, the p -value corresponding to F = is less than Hence, we reject H 0. F = MSR/MSE = 100/4.667 = The statistical evidence is sufficient to conclude that we have a significant relationship between the number of TV ads aired and the number of cars sold. 6. Determine whether to reject H 0.
47 Slide PRACTICE F TEST
48 Slide F Test FF n = 5 SSE = 12.4 SSR = 67.6 Find = 0.05 MSR = SSR/d.f. Regression F MSE = SSE/(n – 2)
49 Slide F Test FFFF = 0.05 MSR MSE
50 Slide F Test F Conclusion?
51 Slide CAUTIONS
52 Slide Some Cautions about the Interpretation of Significance Tests Just because we are able to reject H 0 : 1 = 0 and demonstrate statistical significance does not enable us to conclude that there is a linear relationship between x and y. Rejecting H 0 : 1 = 0 and concluding that the relationship between x and y is significant does not enable us to conclude that a cause-and-effect relationship is present between x and y.
53 Slide Assumptions About the Error Term 1. The error is a random variable with mean of zero. 2. The variance of , denoted by 2, is the same for all values of the independent variable. 3. The values of are independent. 4. The error is a normally distributed random variable.
54 Slide