Presentation is loading. Please wait.

Presentation is loading. Please wait.

Slide 1 Copyright © 2004 Pearson Education, Inc. Chapter 10 Correlation and Regression 10-1 Overview Overview 10-2 Correlation 10-3 Regression-3 Regression.

Similar presentations


Presentation on theme: "Slide 1 Copyright © 2004 Pearson Education, Inc. Chapter 10 Correlation and Regression 10-1 Overview Overview 10-2 Correlation 10-3 Regression-3 Regression."— Presentation transcript:

1 Slide 1 Copyright © 2004 Pearson Education, Inc. Chapter 10 Correlation and Regression 10-1 Overview Overview 10-2 Correlation 10-3 Regression-3 Regression 10-4 Variation and Prediction Intervals-4 Variation and Prediction Intervals 10-5 Rank Correlation-5 Rank Correlation

2 Slide 2 Copyright © 2004 Pearson Education, Inc. Created by Erin Hodgess, Houston, Texas Section 10-1 & 10-2 Overview and Correlation and Regression

3 Slide 3 Copyright © 2004 Pearson Education, Inc. Overview Paired Data  Is there a relationship?  If so, what is the equation?  Use that equation for prediction.

4 Slide 4 Copyright © 2004 Pearson Education, Inc. Definition  A correlation exists between two variables when one of them is related to the other in some way.

5 Slide 5 Copyright © 2004 Pearson Education, Inc.  A Scatterplot (or scatter diagram) is a graph in which the paired ( x, y ) sample data are plotted with a horizontal x- axis and a vertical y- axis. Each individual ( x, y ) pair is plotted as a single point. Definition

6 Slide 6 Copyright © 2004 Pearson Education, Inc. Scatter Diagram of Paired Data

7 Slide 7 Copyright © 2004 Pearson Education, Inc. Positive Linear Correlation Figure 10-2 Scatter Plots

8 Slide 8 Copyright © 2004 Pearson Education, Inc. Negative Linear Correlation Figure 10-2 Scatter Plots

9 Slide 9 Copyright © 2004 Pearson Education, Inc. No Linear Correlation Figure 10-2 Scatter Plots

10 Slide 10 Copyright © 2004 Pearson Education, Inc. Definition The linear correlation coefficient r measures strength of the linear relationship between paired x and y values in a sample.

11 Slide 11 Copyright © 2004 Pearson Education, Inc. Assumptions 1. The sample of paired data ( x, y ) is a random sample. 2. The pairs of ( x, y ) data have a bivariate normal distribution.

12 Slide 12 Copyright © 2004 Pearson Education, Inc. Notation for the Linear Correlation Coefficient n = number of pairs of data presented  denotes the addition of the items indicated.  x denotes the sum of all x- values.  x 2 indicates that each x - value should be squared and then those squares added. (  x ) 2 indicates that the x- values should be added and the total then squared.  xy indicates that each x -value should be first multiplied by its corresponding y- value. After obtaining all such products, find their sum. r represents linear correlation coefficient for a sample  represents linear correlation coefficient for a population

13 Slide 13 Copyright © 2004 Pearson Education, Inc. n  xy – (  x)(  y) n(  x 2 ) – (  x) 2 n(  y 2 ) – (  y) 2 r =r = Formula 10-1 The linear correlation coefficient r measures the strength of a linear relationship between the paired values in a sample. Calculators can compute r  (rho) is the linear correlation coefficient for all paired data in the population. Definition

14 Slide 14 Copyright © 2004 Pearson Education, Inc.  Round to three decimal places so that it can be compared to critical values in Table A-6.  Use calculator or computer if possible. Rounding the Linear Correlation Coefficient r

15 Slide 15 Copyright © 2004 Pearson Education, Inc. 1212 1818 3636 5454 Data x y Calculating r

16 Slide 16 Copyright © 2004 Pearson Education, Inc. Calculating r

17 Slide 17 Copyright © 2004 Pearson Education, Inc. 1212 1818 3636 5454 Data x y Calculating r n  xy – (  x)(  y) n(  x 2 ) – (  x) 2 n(  y 2 ) – (  y) 2 r =r =  48  – (10)(20) 4(36) – (10) 2 4(120) – (20) 2 r =r = –8 59.329 r =r = = – 0.135

18 Slide 18 Copyright © 2004 Pearson Education, Inc. Interpreting the Linear Correlation Coefficient  If the absolute value of r exceeds the value in Table A - 6, conclude that there is a significant linear correlation.  Otherwise, there is not sufficient evidence to support the conclusion of significant linear correlation.

19 Slide 19 Copyright © 2004 Pearson Education, Inc. Example: Boats and Manatees Given the sample data in Table 9-1, find the value of the linear correlation coefficient r, then refer to Table A-6 to determine whether there is a significant linear correlation between the number of registered boats and the number of manatees killed by boats. Using the same procedure previously illustrated, we find that r = 0.922. Referring to Table A-6, we locate the row for which n =10. Using the critical value for  =5, we have 0.632. Because r = 0.922, its absolute value exceeds 0.632, so we conclude that there is a significant linear correlation between number of registered boats and number of manatee deaths from boats.

20 Slide 20 Copyright © 2004 Pearson Education, Inc. Properties of the Linear Correlation Coefficient r 1. –1  r  1 2. Value of r does not change if all values of either variable are converted to a different scale. 3. The r is not affected by the choice of x and y. interchange x and y and the value of r will not change. 4. r measures strength of a linear relationship.

21 Slide 21 Copyright © 2004 Pearson Education, Inc. Interpreting r: Explained Variation The value of r 2 is the proportion of the variation in y that is explained by the linear relationship between x and y.

22 Slide 22 Copyright © 2004 Pearson Education, Inc. Using the boat/manatee data in Table 9-1, we have found that the value of the linear correlation coefficient r = 0.922. What proportion of the variation of the manatee deaths can be explained by the variation in the number of boat registrations? With r = 0.922, we get r 2 = 0.850. We conclude that 0.850 (or about 85%) of the variation in manatee deaths can be explained by the linear relationship between the number of boat registrations and the number of manatee deaths from boats. This implies that 15% of the variation of manatee deaths cannot be explained by the number of boat registrations. Example: Boats and Manatees

23 Slide 23 Copyright © 2004 Pearson Education, Inc. Common Errors Involving Correlation 1. Causation: It is wrong to conclude that correlation implies causality. 2. Averages: Averages suppress individual variation and may inflate the correlation coefficient. 3. Linearity: There may be some relationship between x and y even when there is no significant linear correlation.

24 Slide 24 Copyright © 2004 Pearson Education, Inc. FIGURE 10-3 Scatterplot of Distance above Ground and Time for Object Thrown Upward Common Errors Involving Correlation

25 Slide 25 Copyright © 2004 Pearson Education, Inc. Formal Hypothesis Test  We wish to determine whether there is a significant linear correlation between two variables.  We present two methods.  Both methods let H 0 :  =  (no significant linear correlation) H 1 :    (significant linear correlation)

26 Slide 26 Copyright © 2004 Pearson Education, Inc. FIGURE 10-4 Testing for a Linear Correlation

27 Slide 27 Copyright © 2004 Pearson Education, Inc. Test statistic: Critical values: Use Table A-3 with degrees of freedom = n – 2 1 – r 2 n – 2 r t = Method 1: Test Statistic is t (follows format of earlier chapters)

28 Slide 28 Copyright © 2004 Pearson Education, Inc.  Test statistic: r  Critical values: Refer to Table A-6 (no degrees of freedom) Method 2: Test Statistic is r (uses fewer calculations)

29 Slide 29 Copyright © 2004 Pearson Education, Inc. Using the boat/manatee data in Table 9-1, test the claim that there is a linear correlation between the number of registered boats and the number of manatee deaths from boats. Use Method 1. 1 – r 2 n – 2 r t = 1 – 0.922 2 10 – 2 0.922 t = = 6.735 Example: Boats and Manatees

30 Slide 30 Copyright © 2004 Pearson Education, Inc. Method 1: Test Statistic is t (follows format of earlier chapters) Figure 9-5

31 Slide 31 Copyright © 2004 Pearson Education, Inc. Using the boat/manatee data in Table 9-1, test the claim that there is a linear correlation between the number of registered boats and the number of manatee deaths from boats. Use Method 2. The test statistic is r = 0.922. The critical values of r =  0.632 are found in Table A-6 with n = 10 and  = 0.05. Example: Boats and Manatees

32 Slide 32 Copyright © 2004 Pearson Education, Inc.  Test statistic: r  Critical values: Refer to Table A-6 (10 degrees of freedom) Method 2: Test Statistic is r (uses fewer calculations) Figure 9-6

33 Slide 33 Copyright © 2004 Pearson Education, Inc. Using the boat/manatee data in Table 9-1, test the claim that there is a linear correlation between the number of registered boats and the number of manatee deaths from boats. Use both (a) Method 1 and (b) Method 2. Using either of the two methods, we find that the absolute value of the test statistic does exceed the critical value (Method 1: 6.735 > 2.306. Method 2: 0.922 > 0.632); that is, the test statistic falls in the critical region. We therefore reject the null hypothesis. There is sufficient evidence to support the claim of a linear correlation between the number of registered boats and the number of manatee deaths from boats. Example: Boats and Manatees

34 Slide 34 Copyright © 2004 Pearson Education, Inc. Justification for r Formula Figure 10-7 r =r = (n -1 ) S x S y  (x -x) (y -y) Formula 10-1 is developed from (x, y) centroid of sample points

35 Slide 35 Copyright © 2004 Pearson Education, Inc. Created by Erin Hodgess, Houston, Texas Section 10-3 Regression

36 Slide 36 Copyright © 2004 Pearson Education, Inc. Regression Definition  Regression Equation The regression equation expresses a relationship between x (called the independent variable, predictor variable or explanatory variable, and y (called the dependent variable or response variable. The typical equation of a straight line y = mx + b is expressed in the form y = b 0 + b 1 x, where b 0 is the y - intercept and b 1 is the slope. ^

37 Slide 37 Copyright © 2004 Pearson Education, Inc. Assumptions 1. We are investigating only linear relationships. 2. For each x- value, y is a random variable having a normal (bell-shaped) distribution. All of these y distributions have the same variance. Also, for a given value of x, the distribution of y- values has a mean that lies on the regression line. (Results are not seriously affected if departures from normal distributions and equal variances are not too extreme.)

38 Slide 38 Copyright © 2004 Pearson Education, Inc. Regression Definition  Regression Equation Given a collection of paired data, the regression equation  Regression Line The graph of the regression equation is called the regression line (or line of best fit, or least squares line). y = b 0 + b 1 x ^ algebraically describes the relationship between the two variables

39 Slide 39 Copyright © 2004 Pearson Education, Inc. Notation for Regression Equation y -intercept of regression equation  0 b 0 Slope of regression equation  1 b 1 Equation of the regression line y =  0 +  1 x y = b 0 + b 1 x Population Parameter Sample Statistic ^

40 Slide 40 Copyright © 2004 Pearson Education, Inc. Formula for b 0 and b 1 Formula 10-2 n(  xy) – (  x) (  y) b 1 = (slope) n(  x 2 ) – (  x) 2 b 0 = y – b 1 x ( y -intercept) Formula 10-3 calculators or computers can compute these values

41 Slide 41 Copyright © 2004 Pearson Education, Inc. If you find b 1 first, then Formula 10-4 b 0 = y - b 1 x Can be used for Formula 9-2, where y is the mean of the y -values and x is the mean of the x values

42 Slide 42 Copyright © 2004 Pearson Education, Inc. The regression line fits the sample points best.

43 Slide 43 Copyright © 2004 Pearson Education, Inc. Rounding the y -intercept b 0 and the slope b 1  Round to three significant digits.  If you use the formulas 10-2 and 10-3, try not to round intermediate values.

44 Slide 44 Copyright © 2004 Pearson Education, Inc. 1212 1818 3636 5454 Data x y Calculating the Regression Equation In Section 9-2, we used these values to find that the linear correlation coefficient of r = –0.135. Use this sample to find the regression equation.

45 Slide 45 Copyright © 2004 Pearson Education, Inc. 1212 1818 3636 5454 Data x y Calculating the Regression Equation n = 4  x = 10  y = 20  x 2 = 36  y 2 = 120  xy = 48 n(  xy) – (  x) (  y) n(  x 2 ) –(  x) 2 b 1 = 4(48) – (10) (20) 4(36) – (10) 2 b 1 = –8 44 b 1 = = –0.181818

46 Slide 46 Copyright © 2004 Pearson Education, Inc. 1212 1818 3636 5454 Data x y Calculating the Regression Equation n = 4  x = 10  y = 20  x 2 = 36  y 2 = 120  xy = 48 b 0 = y – b 1 x 5 – (–0.181818)(2.5) = 5.45

47 Slide 47 Copyright © 2004 Pearson Education, Inc. 1212 1818 3636 5454 Data x y Calculating the Regression Equation n = 4  x = 10  y = 20  x 2 = 36  y 2 = 120  xy = 48 The estimated equation of the regression line is: y = 5.45 – 0.182 x ^

48 Slide 48 Copyright © 2004 Pearson Education, Inc. Given the sample data in Table 9-1, find the regression equation. Using the same procedure as in the previous example, we find that b 1 = 2.27 and b 0 = –113. Hence, the estimated regression equation is: y = –113 + 2.27 x ^ Example: Boats and Manatees

49 Slide 49 Copyright © 2004 Pearson Education, Inc. Given the sample data in Table 9-1, find the regression equation. Example: Boats and Manatees

50 Slide 50 Copyright © 2004 Pearson Education, Inc. In predicting a value of y based on some given value of x... 1. If there is not a significant linear correlation, the best predicted y -value is y. Predictions 2.If there is a significant linear correlation, the best predicted y -value is found by substituting the x -value into the regression equation.

51 Slide 51 Copyright © 2004 Pearson Education, Inc. Figure 9-8 Predicting the Value of a Variable

52 Slide 52 Copyright © 2004 Pearson Education, Inc. Given the sample data in Table 9-1, we found that the regression equation is y = –113 + 2.27 x. Assume that in 2001 there were 850,000 registered boats. Because Table 9-1 lists the numbers of registered boats in tens of thousands, this means that for 2001 we have x = 85. Given that x = 85, find the best predicted value of y, the number of manatee deaths from boats. ^ Example: Boats and Manatees

53 Slide 53 Copyright © 2004 Pearson Education, Inc. We must consider whether there is a linear correlation that justifies the use of that equation. We do have a significant linear correlation (with r = 0.922). Given the sample data in Table 9-1, we found that the regression equation is y = –113 + 2.27 x. Given that x = 85, find the best predicted value of y, the number of manatee deaths from boats. ^ Example: Boats and Manatees

54 Slide 54 Copyright © 2004 Pearson Education, Inc. Given the sample data in Table 9-1, we found that the regression equation is y = –113 + 2.27 x. Given that x = 85, find the best predicted value of y, the number of manatee deaths from boats. ^ Example: Boats and Manatees y = –113 + 2.27x –113 + 2.27(85) = 80.0 ^ The predicted number of manatee deaths is 80.0. The actual number of manatee deaths in 2001 was 82, so the predicted value of 80.0 is quite close.

55 Slide 55 Copyright © 2004 Pearson Education, Inc. 1. If there is no significant linear correlation, don’t use the regression equation to make predictions. 2. When using the regression equation for predictions, stay within the scope of the available sample data. 3. A regression equation based on old data is not necessarily valid now. 4. Don’t make predictions about a population that is different from the population from which the sample data was drawn. Guidelines for Using The Regression Equation

56 Slide 56 Copyright © 2004 Pearson Education, Inc. Definitions  Marginal Change: The marginal change is the amount that a variable changes when the other variable changes by exactly one unit.  Outlier: An outlier is a point lying far away from the other data points.  Influential Points: An influential point strongly affects the graph of the regression line.

57 Slide 57 Copyright © 2004 Pearson Education, Inc. Definitions  Residual for a sample of paired ( x, y ) data, the difference ( y - y ) between an observed sample y -value and the value of y, which is the value of y that is predicted by using the regression equation. ^ ^ Residuals and the Least-Squares Property

58 Slide 58 Copyright © 2004 Pearson Education, Inc. Definitions  Residual for a sample of paired ( x, y ) data, the difference ( y - y ) between an observed sample y -value and the value of y, which is the value of y that is predicted by using the regression equation.  Least-Squares Property A straight line satisfies this property if the sum of the squares of the residuals is the smallest sum possible. ^ Residuals and the Least-Squares Property ^

59 Slide 59 Copyright © 2004 Pearson Education, Inc. x 1 2 4 5 y 4 24 8 32 y = 5 + 4 x ^ Figure 9-9 Residuals and the Least-Squares Property

60 Slide 60 Copyright © 2004 Pearson Education, Inc. Created by Erin Hodgess, Houston, Texas Section 10-4 Variation and Prediction Intervals

61 Slide 61 Copyright © 2004 Pearson Education, Inc. Definitions We consider different types of variation that can be used for two major applications:  To determine the proportion of the variation in y that can be explained  by the linear relationship between x and y. 2. To construct interval estimates of predicted y -values. Such intervals are called prediction intervals.

62 Slide 62 Copyright © 2004 Pearson Education, Inc. Definitions Total Deviation The total deviation from the mean of the particular point ( x, y ) is the vertical distance y – y, which is the distance between the point ( x, y ) and the horizontal line passing through the sample mean y.

63 Slide 63 Copyright © 2004 Pearson Education, Inc. Definitions Total Deviation The total deviation from the mean of the particular point ( x, y ) is the vertical distance y – y, which is the distance between the point ( x, y ) and the horizontal line passing through the sample mean y. Explained Deviation is the vertical distance y - y, which is the distance between the predicted y- value and the horizontal line passing through the sample mean y. ^

64 Slide 64 Copyright © 2004 Pearson Education, Inc. Definitions Unexplained Deviation is the vertical distance y - y, which is the vertical distance between the point ( x, y ) and the regression line. (The distance y - y is also called a residual, as defined in Section 9-3.). ^ ^

65 Slide 65 Copyright © 2004 Pearson Education, Inc. Figure 10-10 Unexplained, Explained, and Total Deviation

66 Slide 66 Copyright © 2004 Pearson Education, Inc. (total deviation) = (explained deviation) + (unexplained deviation) ( y - y ) = ( y - y ) + (y - y ) ^ ^ (total variation) = (explained variation) + (unexplained variation)  ( y - y ) 2 =  ( y - y ) 2 +  (y - y) 2 ^^ Formula 10-4

67 Slide 67 Copyright © 2004 Pearson Education, Inc. Definition r 2 = explained variation. total variation or simply square r (determined by Formula 10-1, section 10-2) Coefficient of determination the amount of the variation in y that is explained by the regression line

68 Slide 68 Copyright © 2004 Pearson Education, Inc. The standard error of estimate is a measure of the differences (or distances) between the observed sample y values and the predicted values y that are obtained using the regression equation. Prediction Intervals ^ Definition

69 Slide 69 Copyright © 2004 Pearson Education, Inc. Standard Error of Estimate s e = or s e =  y 2 – b 0  y – b 1  xy n – 2 Formula 10-5  ( y – y ) 2 n – 2 ^

70 Slide 70 Copyright © 2004 Pearson Education, Inc. Given the sample data in Table 9-1, we found that the regression equation is y = –113 + 2.27 x. Find the standard error of estimate s e for the boat/manatee data. ^ n = 10  y 2 = 33456  y = 558  xy = 42214 b 0 = –112.70989 b 1 = 2.27408 s e = n - 2  y 2 - b 0  y - b 1  xy s e = 10 – 2 33456 –(–112.70989)(558) – (2.27408)(42414) Example: Boats and Manatees

71 Slide 71 Copyright © 2004 Pearson Education, Inc. Given the sample data in Table 9-1, we found that the regression equation is y = –113 + 2.27 x. Find the standard error of estimate s e for the boat/manatee data. ^ n = 10  y 2 = 33456  y = 558  xy = 42214 b 0 = –112.70989 b 1 = 2.27408 Example: Boats and Manatees s e = 6.61234 = 6.61

72 Slide 72 Copyright © 2004 Pearson Education, Inc. y - E < y < y + E ^ ^ Prediction Interval for an Individual y where E = t   2 s e n(x2)n(x2) – (  x) 2 n(x 0 – x) 2 1 + + 1 n x 0 represents the given value of x t   2 has n – 2 degrees of freedom

73 Slide 73 Copyright © 2004 Pearson Education, Inc. Given the sample data in Table 9-1, we found that the regression equation is y = –113 + 2.27 x. We have also found that when x = 85, the predicted number of manatee deaths is 80.0. Construct a 95% prediction interval given that x = 85. ^ E = t  2 s e + n(  x 2 ) – (  x) 2 n(x 0 – x) 2 1 + 1 n E = (2.306)(6.6123) 10(55289) – (741) 2 10(85 –74 ) 2 +1 + 1 10 E = 18.1 Example: Boats and Manatees

74 Slide 74 Copyright © 2004 Pearson Education, Inc. Given the sample data in Table 9-1, we found that the regression equation is y = –113 + 2.27 x. We have also found that when x = 85, the predicted number of manatee deaths is 80.0. Construct a 95% prediction interval given that x = 85. ^ Example: Boats and Manatees y – E < y < y + E 80.6 – 18.1 < y < 80.6 + 18.1 62.5 < y < 98.7 ^ ^

75 Slide 75 Copyright © 2004 Pearson Education, Inc. Created by Erin Hodgess, Houston, Texas Section 10-5 Rank Correlation

76 Slide 76 Copyright © 2004 Pearson Education, Inc. Created by Erin Hodgess, Houston, Texas Section 10-5 Rank Correlation

77 Slide 77 Copyright © 2004 Pearson Education, Inc. Rank Correlation Definition  Rank Correlation uses the ranks of sample data consisting of matched pairs.  The rank correlation test is used to test for an association between two variables  H o :  s = 0 (There is no correlation between the two variables.)  H 1 :  s  0 (There is a correlation between the two variables.)

78 Slide 78 Copyright © 2004 Pearson Education, Inc. Advantages 1. The nonparametric method of rank correlation can be used in a wider variety of circumstances than the parametric method of linear correlation. With rank correlation, we can analyze paired data that are ranks or can be converted to ranks. 2. Rank correlation can be used to detect some (not all) relationships that are not linear. 3. The computations for rank correlation are much simpler than the computations for linear correlation, as can be readily seen by comparing the formulas used to compute these statistics.

79 Slide 79 Copyright © 2004 Pearson Education, Inc. Disadvantages A disadvantage of rank correlation is its efficiency rating of 0.91, as described in Section 12-1. This efficiency rating shows that with all other circumstances being equal, the nonparametric approach of rank correlation requires 100 pairs of sample data to achieve the same results as only 91 pairs of sample observations analyzed through parametric methods.

80 Slide 80 Copyright © 2004 Pearson Education, Inc. Assumptions 1. The sample data have been randomly selected. 2. Unlike the parametric methods of Section 10-2, there is no requirement that the sample pairs of data have a bivariate normal distribution. There is no requirement of a normal distribution for any population.

81 Slide 81 Copyright © 2004 Pearson Education, Inc. Notation r s = rank correlation coefficient for sample paired data ( r s is a sample statistic)  s = rank correlation coefficient for all the population data (  s is a population parameter) n = number of pairs of data d = difference between ranks for the two values within a pair r s is often called Spearman’s rank correlation coefficient

82 Slide 82 Copyright © 2004 Pearson Education, Inc. Test Statistic for the Rank Correlation Coefficient where each value of d is a difference between the ranks for a pair of sample data Critical values:  If n  30, refer to Table A-6  If n > 30, use Formula 9-1 r s = 1 – 6  d 2 n(n 2 – 1 )

83 Slide 83 Copyright © 2004 Pearson Education, Inc. Formula 9-6 where the value of z corresponds to the significance level r s = n – 1  z (critical values when n > 30)

84 Slide 84 Copyright © 2004 Pearson Education, Inc. Complete the computation of to get the sample statistic. Figure 12-4 Rank Correlation for Testing H 0 :  s = 0 Start Calculate the difference d for each pair of ranks by subtracting the lower rank from the higher rank. Let n equal the total number of signs. Are the n pairs of data in the form of ranks ? Convert the data of the first sample to ranks from 1 to n and then do the same for the second sample. No Square each difference d and then find the sum of those squares to get r s = 1 – 6d26d2 n(n 2 –1 ) (d2)(d2) Yes

85 Slide 85 Copyright © 2004 Pearson Education, Inc. Complete the computation of to get the sample statistic. r s = 1 – 6d26d2 n(n 2 – 1 ) Is n  30 ? If the sample statistic r s is positive and exceeds the positive critical value, there is a correlation. If the sample statistic r s is negative and is less than the negative critical value, there is a correlation. If the sample statistic r s is between the positive and negative critical values, there is no correlation. Find the critical values of r s in Table A-9 Calculate the critical values where z corresponds to the significance level r s = n – 1  z Yes No Figure 13-4 Rank Correlation for Testing H 0 :  s = 0

86 Slide 86 Copyright © 2004 Pearson Education, Inc. Example: Perceptions of Beauty Use the data in Table 12-6 to determine if there is a correlation between the rankings of men and women in terms of what they find attractive. Use a significance level of  = 0.05.

87 Slide 87 Copyright © 2004 Pearson Education, Inc. Example: Perceptions of Beauty Use the data in Table 12-6 to determine if there is a correlation between the rankings of men and women in terms of what they find attractive. Use a significance level of  = 0.05. H 0 :  s = 0 H 1 :  s  0 n = 10 r s = 1 – 6  d 2 n(n 2 – 1 ) r s = 0.552 r s = 1 – 6(74) 10(10 2 – 1 )

88 Slide 88 Copyright © 2004 Pearson Education, Inc. Example: Perceptions of Beauty Use the data in Table 12-6 to determine if there is a correlation between the rankings of men and women in terms of what they find attractive. Use a significance level of  = 0.05. We refer to Table A-9 to determine that the critical values are  0.648. Because the test statistic of r s = 0.552 does not exceed the critical value of 0.648, we fail to reject the null hypothesis. There is no sufficient evidence to support a claim of a correlation between the rankings of men and women.

89 Slide 89 Copyright © 2004 Pearson Education, Inc. Assume that the preceding example is expanded by including a total of 40 women and that the test statistic r s is found to be 0.291. If the significance level of  = 0.05, what do you conclude about the correlation? Example: Perceptions of Beauty with Large Samples

90 Slide 90 Copyright © 2004 Pearson Education, Inc. Example: Perceptions of Beauty with Large Samples r s = n – 1  z r s = 40 – 1  1.96 =  0.314 These are the critical values.

91 Slide 91 Copyright © 2004 Pearson Education, Inc. The test statistic of r s = 0.291 does not exceed the critical value of 0.314, so we fail to reject the null hypothesis. There is not sufficient evidence to support the claim of a correlation between men and women. Example: Perceptions of Beauty with Large Samples

92 Slide 92 Copyright © 2004 Pearson Education, Inc. The data in Table 12-7 are the numbers of games played and the last scores (in millions) of a Raiders of the Lost Ark pinball game. We expect that there should be an association between the number of games played and the pinball score. Is there sufficient evidence to support the claim that there is such an association? Example: Detecting a Nonlinear Pattern

93 Slide 93 Copyright © 2004 Pearson Education, Inc. Example: Detecting a Nonlinear Pattern

94 Slide 94 Copyright © 2004 Pearson Education, Inc. H 0:  s = 0 H 1:  s  0 n = 9 r s = 1 – 6  d 2 n(n 2 – 1 ) r s = 1 – 6(6) 9(9 2 – 1 ) r s = 0.950 Example: Detecting a Nonlinear Pattern


Download ppt "Slide 1 Copyright © 2004 Pearson Education, Inc. Chapter 10 Correlation and Regression 10-1 Overview Overview 10-2 Correlation 10-3 Regression-3 Regression."

Similar presentations


Ads by Google