Published by Lee Wynter. Modified over 2 years ago.

1
Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.
Chapter 10 Correlation and Regression
10-2 Correlation (Part 1 Only)
10-3 Regression (Part 1 Only)

2
Preview
In this chapter we introduce methods for determining whether a correlation, or association, between two variables exists and whether the correlation is linear. For linear correlations, we can identify an equation that best fits the data and we can use that equation to predict the value of one variable given the value of the other variable.

3
Section 10-2 Correlation

4
Key Concept
Part 1 of this section introduces the linear correlation coefficient r, which is a numerical measure of the strength of the linear relationship between two variables representing quantitative data. Using paired sample data (sometimes called bivariate data), we find the value of r (usually using technology), then use that value to conclude that there is (or is not) a linear correlation between the two variables.

5
Key Concept
In this section we consider only linear relationships, which means that when graphed, the points approximate a straight-line pattern.

6
Part 1: Basic Concepts of Correlation

7
Definition
A correlation exists between two variables when the values of one are associated in some way with the values of the other.

8
Definition
The linear correlation coefficient r measures the strength of the linear relationship between the paired quantitative x- and y-values in a sample.

9
Exploring the Data
We can often see a relationship between two variables by constructing a scatterplot. Figure 10-2, which follows, shows scatterplots with different correlation coefficients.

10
Scatterplots of Paired Data (Figure 10-2)

11
Scatterplots of Paired Data (Figure 10-2)

12
Scatterplots of Paired Data (Figure 10-2)

13
Example: problem 10 on page 509
x   10    8     13     9     11    14    6     4     12    7     5
y   7.46  6.77  12.74  7.11  7.81  8.84  6.08  5.39  8.15  6.42  5.73
(a) Construct a scatterplot.

14
Scatterplot with Graphing Calculator
1. Enter the x and y data values in two lists.
2. Press 2nd STATPLOT and choose #1 PLOT 1.
3. Be sure the plot is ON and the scatterplot icon is highlighted (top row, first icon).
4. Enter the list of the x data values next to Xlist and the list of the y data values next to Ylist.
5. Choose any of the three marks.
6. Press ZOOM and #9 ZoomStat.

15
Example: problem 10 on page 509
(a) Scatterplot
Note the outlier at x = 13.

16
Properties of the Linear Correlation Coefficient r
1. -1 ≤ r ≤ 1
2. If all values of either variable are converted to a different scale, the value of r does not change.
3. The value of r is not affected by the choice of x and y. Interchange all x- and y-values and the value of r will not change.
4. r measures the strength of a linear relationship (the closer r is to positive or negative 1, the "more linear" the relationship).
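Properties 1 to 3 can be checked numerically. The following is a minimal sketch, using the paired data from problem 10 on page 509; the helper name `linear_r` and the choice of a 2.54 scale factor are illustrative assumptions, not part of the slides.

```python
from math import sqrt


def linear_r(xs, ys):
    """Linear correlation coefficient r computed from sums over the paired data."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator


# Paired data from problem 10 on page 509.
xs = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
ys = [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]

r_original = linear_r(xs, ys)

# Property 2: converting x to a different scale (here, multiplying by an
# arbitrary illustrative factor of 2.54) leaves r unchanged.
r_rescaled = linear_r([x * 2.54 for x in xs], ys)

# Property 3: interchanging the x- and y-values leaves r unchanged.
r_swapped = linear_r(ys, xs)

print(round(r_original, 3), round(r_rescaled, 3), round(r_swapped, 3))
```

All three printed values should agree, and each lies between -1 and 1.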

17
Formula for r
Formula 10-1

\[
r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{n\left(\sum x^{2}\right) - \left(\sum x\right)^{2}}\;\sqrt{n\left(\sum y^{2}\right) - \left(\sum y\right)^{2}}}
\]

The linear correlation coefficient r measures the strength of a linear relationship between the paired values in a sample. Computer software or calculators can compute r.

18
Notation for the Linear Correlation Coefficient
n = number of pairs of sample data.
Σ denotes the addition of the items indicated.
Σx denotes the sum of all x-values.
Σx² indicates that each x-value should be squared and then those squares added.
(Σx)² indicates that the x-values should be added and then the total squared.

19
Notation for the Linear Correlation Coefficient
Σxy indicates that each x-value should first be multiplied by its corresponding y-value. After obtaining all such products, find their sum.
r = linear correlation coefficient for sample data.
ρ = linear correlation coefficient for population data.
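To make the notation concrete, the sketch below evaluates each sum separately and then combines them as in Formula 10-1, using the paired data from problem 10 on page 509. The variable names are illustrative choices, not notation from the slides.

```python
from math import sqrt

# Paired data from problem 10 on page 509.
xs = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
ys = [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]

n = len(xs)                                  # number of pairs of sample data
sum_x = sum(xs)                              # Σx
sum_y = sum(ys)                              # Σy
sum_x2 = sum(x ** 2 for x in xs)             # Σx²: square each value, then add
sum_y2 = sum(y ** 2 for y in ys)             # Σy²
sum_xy = sum(x * y for x, y in zip(xs, ys))  # Σxy: multiply each pair, then add

# Note the difference between Σx² (computed above) and (Σx)², the squared total.
square_of_sum_x = sum_x ** 2
square_of_sum_y = sum_y ** 2

# Formula 10-1.
r = (n * sum_xy - sum_x * sum_y) / (
    sqrt(n * sum_x2 - square_of_sum_x) * sqrt(n * sum_y2 - square_of_sum_y)
)

print(f"n = {n}, r = {r:.3f}")
```

Rounded to three decimal places, the result should match the calculator value quoted for this problem, r = 0.816.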

20
Example: problem 10 on page 509
x   10    8     13     9     11    14    6     4     12    7     5
y   7.46  6.77  12.74  7.11  7.81  8.84  6.08  5.39  8.15  6.42  5.73
(b) Find the linear correlation coefficient r and determine if there is sufficient evidence to support the claim of a linear correlation between x and y.

21
Calculate r Using Calculator (see page 508 of your book)
1. Enter the data in two lists.
2. Press STAT and select TESTS.
3. LinRegTTest is option F (scroll arrow up 3 places).
4. Enter the names of the lists from step 1.
5. Arrow down to Calculate and then press Enter.
6. The r value is the last value displayed; round this value to three decimal places.

22
Example: problem 10 on page 509
Calculate r Using Calculator
x   10    8     13     9     11    14    6     4     12    7     5
y   7.46  6.77  12.74  7.11  7.81  8.84  6.08  5.39  8.15  6.42  5.73
(b) Using the calculator, we find that r = 0.816.

23
Interpreting r Using Table A-5
Find the row with the same value of n as the data set. If the absolute value of the computed value of r, denoted |r|, is greater than the number in the first column in Table A-5, conclude that there is a linear correlation. Otherwise, there is not sufficient evidence to support the conclusion of a linear correlation.

24
Table A-5
(b) For n = 11, the critical value of r is 0.602.
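Critical values of r in tables of this kind are commonly derived from the Student t distribution with n − 2 degrees of freedom. As a sketch, assuming the first column of Table A-5 corresponds to a two-tailed test at α = 0.05 (an assumption; the slides do not state the significance level), the n = 11 entry can be checked with t = 2.262 for 9 degrees of freedom:

```latex
\[
r_{\text{crit}}
  = \frac{t_{\alpha/2,\,n-2}}{\sqrt{t_{\alpha/2,\,n-2}^{2} + (n - 2)}}
  = \frac{2.262}{\sqrt{2.262^{2} + 9}}
  \approx 0.602
\]
```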

25
Example: problem 10 on page 509
Interpret r
x   10    8     13     9     11    14    6     4     12    7     5
y   7.46  6.77  12.74  7.11  7.81  8.84  6.08  5.39  8.15  6.42  5.73
(b) Comparing 0.602 with r = 0.816, there is a (positive) linear correlation since r is greater than 0.602.

26
Example: problem 10 on page 509
Interpret r
(c) Identify the feature of the data that would be missed if part (b) were completed without the use of a scatterplot.
ANSWER: The scatterplot shows that the relationship is linear except for one point (an error or an outlier at x = 13).

27
Properties of the Linear Correlation Coefficient r
5. r is very sensitive to outliers, which can dramatically affect its value. If (13, 12.74) is removed from the list in the previous example, the calculator gives r = 0.999996554.
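The effect of the outlier can be reproduced in a few lines. This is a minimal sketch, assuming the same problem 10 data; the helper `linear_r` is an illustrative name and simply applies Formula 10-1.

```python
from math import sqrt


def linear_r(pairs):
    """Linear correlation coefficient r (Formula 10-1) for a list of (x, y) pairs."""
    n = len(pairs)
    sum_x = sum(x for x, _ in pairs)
    sum_y = sum(y for _, y in pairs)
    sum_xy = sum(x * y for x, y in pairs)
    sum_x2 = sum(x * x for x, _ in pairs)
    sum_y2 = sum(y * y for _, y in pairs)
    return (n * sum_xy - sum_x * sum_y) / (
        sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    )


# Paired data from problem 10 on page 509.
pairs = [(10, 7.46), (8, 6.77), (13, 12.74), (9, 7.11), (11, 7.81), (14, 8.84),
         (6, 6.08), (4, 5.39), (12, 8.15), (7, 6.42), (5, 5.73)]

# Drop the single outlier identified on the scatterplot.
pairs_without_outlier = [p for p in pairs if p != (13, 12.74)]

r_with = linear_r(pairs)
r_without = linear_r(pairs_without_outlier)

print(f"with outlier:    r = {r_with:.3f}")
print(f"without outlier: r = {r_without:.6f}")
```

Removing one of eleven points moves r from about 0.816 to nearly 1, which is why the scatterplot should be inspected before r is interpreted.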

28
Interpret r
NOTE: If r is negative, take its absolute value before comparing it with the value in Table A-5.
Example: r = -0.375 and n = 17.
From Table A-5, for n = 17 we get a value of 0.482. Comparing |r| = 0.375 with 0.482, there is not sufficient evidence of a linear correlation since |r| is less than 0.482.
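The comparison rule can be written as a one-line check. The sketch below is illustrative only: the function name is hypothetical, and the critical values are the ones quoted from Table A-5 on the slides rather than values computed by the code.

```python
def supports_linear_correlation(r, critical_value):
    """Return True when |r| exceeds the Table A-5 critical value for the sample size."""
    return abs(r) > critical_value


# (r, critical value) pairs quoted on the slides.
examples = {
    "problem 10, n = 11": (0.816, 0.602),
    "r = -0.375, n = 17": (-0.375, 0.482),
}

results = {
    label: supports_linear_correlation(r, critical)
    for label, (r, critical) in examples.items()
}

for label, decision in results.items():
    print(label, "->", "linear correlation" if decision else "not sufficient evidence")
```

Taking the absolute value inside the function means negative and positive values of r are handled by the same comparison.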

29
Caution
Know that the methods of this section apply to a linear correlation. If you conclude that there does not appear to be linear correlation, know that it is possible that there might be some other association that is not linear.
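A small constructed example shows how a strong nonlinear association can produce no linear correlation at all. The data below are hypothetical (they are not from the textbook): y is exactly x squared, with x-values placed symmetrically around zero.

```python
from math import sqrt

# Hypothetical data: y depends perfectly on x, but not linearly.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

n = len(xs)
sum_x, sum_y = sum(xs), sum(ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))
sum_x2 = sum(x * x for x in xs)
sum_y2 = sum(y * y for y in ys)

# Formula 10-1 applied to the hypothetical data.
r = (n * sum_xy - sum_x * sum_y) / (
    sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
)

print(r)
```

Here the numerator of Formula 10-1 is zero, so r is zero even though y is completely determined by x.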

30
Interpreting r: Explained Variation
The value of r² is the proportion of the variation in y that is explained by the linear relationship between x and y.

31
Example:
Using example problem 10 on page 509, the linear correlation coefficient was r = 0.816. The proportion of the variation in y that can be explained by the variation in x is r² = (0.816)(0.816) = 0.666. We conclude that 0.666 (or about 67%) of the variation in y can be explained by the variation in x. This implies that about 33% of the variation in y cannot be explained by the variation in x.
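The phrase "explained variation" can be checked directly. The sketch below is an illustration under stated assumptions: it fits the least-squares line (the subject of Section 10-3) to the problem 10 data, then compares r² with the ratio of explained variation to total variation. All variable names are illustrative.

```python
from math import sqrt

# Paired data from problem 10 on page 509.
xs = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
ys = [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Sums of squared deviations and cross-deviations from the means.
s_xx = sum((x - x_bar) ** 2 for x in xs)
s_yy = sum((y - y_bar) ** 2 for y in ys)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

r = s_xy / sqrt(s_xx * s_yy)

# Least-squares line, used only to obtain the predicted y-values.
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar
predicted = [b0 + b1 * x for x in xs]

explained_variation = sum((p - y_bar) ** 2 for p in predicted)
total_variation = s_yy

r_squared = r ** 2
proportion_explained = explained_variation / total_variation

print(f"r^2 = {r_squared:.3f}, explained / total = {proportion_explained:.3f}")
```

The two printed values should agree, which is the sense in which r² is a proportion of the variation in y.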

32
Common Errors Involving Correlation
1. Causation: It is wrong to conclude that correlation implies causality.
2. Averages: Averages suppress individual variation and may inflate the correlation coefficient.
3. Linearity: There may be some relationship between x and y even when there is no linear correlation.

33
Recap
In this section, we have discussed:
Correlation.
The linear correlation coefficient r.
Requirements, notation and formula for r.
Interpreting r.

34
Section 10-3 Regression

35
Key Concept
In Part 1 of this section we find the equation of the straight line that best fits the paired sample data. That equation algebraically describes the relationship between two variables. The best-fitting straight line is called a regression line and its equation is called the regression equation.

36
Part 1: Basic Concepts of Regression

37
Regression
The regression equation expresses a relationship between x (called the explanatory variable, predictor variable or independent variable), and ŷ (called the response variable or dependent variable).

38
Definitions
Regression Equation: Given a collection of paired data, the regression equation
ŷ = b₀ + b₁x
algebraically describes the relationship between the two variables.
Regression Line: The graph of the regression equation is called the regression line (or line of best fit, or least-squares line).

39
Special Property
The regression line is the best linear fit of the sample points.

40
Regression
The typical equation of a straight line: y = mx + b
m = slope of the line (coefficient of x)
b = y-intercept of the line (constant)

41
Notation for Regression Equation (Sample)
                                      Sample Statistic
y-intercept of regression equation    b₀
Slope of regression equation          b₁
Equation of the regression line       ŷ = b₀ + b₁x

42
Notation for Regression Equation (Population)
                                      Population Parameter
y-intercept of regression equation    β₀
Slope of regression equation          β₁
Equation of the regression line       y = β₀ + β₁x

43
Requirements
1. The sample of paired (x, y) data is a random sample of quantitative data.
2. Visual examination of the scatterplot shows that the points approximate a straight-line pattern.
3. Any outliers must be removed if they are known to be errors. Consider the effects of any outliers that are not known errors.

44
Formulas for b₀ and b₁
Formula 10-3 (slope)
Formula 10-4 (y-intercept)
Calculators or computers can compute these values (round to three significant digits).
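For reference, the standard least-squares expressions for the slope and y-intercept are sketched below in the notation of Formula 10-1. They are assumed, not confirmed by the slide text, to be equivalent to Formulas 10-3 and 10-4; here s_x and s_y denote the sample standard deviations of the x- and y-values, and x̄ and ȳ their sample means.

```latex
\[
b_1 = r\,\frac{s_y}{s_x}
    = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{n\left(\sum x^{2}\right) - \left(\sum x\right)^{2}}
\qquad\text{(slope)}
\]
\[
b_0 = \bar{y} - b_1\,\bar{x}
\qquad\text{($y$-intercept)}
\]
```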

45
Calculator
1. Enter the data in two lists.
2. Make a scatterplot of the data (use 2nd Y= to get STAT PLOT, choose Plot1 On, first scatterplot icon, then ZOOM 9). (Your plot will look different.)

46
Calculator
3. Plot the regression line.
4. Choose STAT → CALC, #4 LinReg(ax+b). Include the parameters L1, L2, Y1.
NOTE: Y1 comes from VARS → YVARS, Function, Y1.

47
Calculator
5. Choose Y= and the equation for the regression line will be stored in Y1. Then choose GRAPH and the regression line will be plotted.

48
Calculator
6. Choose TRACE and you can see the X and Y values on the scatterplot or the regression line.

49
Example: problem 10 on page 526
Calculate Regression Line
x   10    8     13     9     11    14    6     4     12    7     5
y   7.46  6.77  12.74  7.11  7.81  8.84  6.08  5.39  8.15  6.42  5.73
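As a cross-check on the calculator steps, the slope and y-intercept for this data set can be computed from the standard least-squares formulas. This is a minimal sketch; the variable names are illustrative, and the rounding to about three significant digits follows the guidance given with the formulas for b₀ and b₁.

```python
# Paired data from problem 10 on page 526.
xs = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
ys = [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]

n = len(xs)
sum_x, sum_y = sum(xs), sum(ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))
sum_x2 = sum(x * x for x in xs)

# Slope: standard least-squares expression written with the sums of Formula 10-1.
b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)

# y-intercept: the regression line passes through the point of means.
b0 = sum_y / n - b1 * (sum_x / n)

print(f"y-hat = {b0:.2f} + {b1:.3f}x")
```

The coefficients should come out close to a y-intercept of 3.00 and a slope of 0.500.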

50
Example: problem 10 on page 526
Calculate Regression Line
Scatterplot

51
Example: problem 10 on page 526
Calculate Regression Line
Regression line

52
Calculator
A quick method of determining linear regression coefficients (without graphing the line):
Choose STAT → TESTS, F: LinRegTTest.
Enter the lists and then Calculate.
You can then graph the line by entering the equation by hand in Y=.

53
Using the Regression Equation for Predictions
1. Use the regression equation for predictions only if the graph of the regression line on the scatterplot confirms that the regression line fits the points reasonably well.
2. Use the regression equation for predictions only if the linear correlation coefficient r indicates that there is a linear correlation between the two variables (as described in Section 10-2).

54
Using the Regression Equation for Predictions
3. Use the regression line for predictions only if the data do not go much beyond the scope of the available sample data. (Predicting too far beyond the scope of the available sample data is called extrapolation, and it could result in bad predictions.)
4. If the regression equation does not appear to be useful for making predictions, the best predicted value of a variable is its point estimate, which is its sample mean.

55
Strategy for Predicting Values of Y

56
Calculate y From Regression Line
Use the results from problem 10 on page 526 to predict y when x = 6.25.
The r value for the data was 0.816 (see the example on previous slides from 10-2), which is greater than the critical r value from Table A-5, so there is a linear correlation.
Regression line is:

57
Calculate y From Regression Line
Predicted y value is:
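The prediction strategy can be sketched in code: use the regression equation when |r| exceeds the critical value, and fall back to the sample mean of y otherwise. The function name `best_prediction` is hypothetical, the critical value 0.602 is the Table A-5 entry quoted for n = 11, and the value 0.9 in the second call is an artificial threshold chosen only to exercise the fallback branch.

```python
from math import sqrt


def best_prediction(xs, ys, x_new, critical_r):
    """Predict y at x_new from the regression line if |r| > critical_r, else use the mean of y."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    r = s_xy / sqrt(s_xx * s_yy)
    if abs(r) > critical_r:
        b1 = s_xy / s_xx           # slope
        b0 = y_bar - b1 * x_bar    # y-intercept
        return b0 + b1 * x_new
    return y_bar                   # no linear correlation: best estimate is the mean


# Paired data from problem 10 on page 526.
xs = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
ys = [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]

# x = 6.25 lies within the range of the sample x-values, so this is not extrapolation.
prediction = best_prediction(xs, ys, 6.25, critical_r=0.602)

# Artificial threshold (not from Table A-5) to show the sample-mean fallback.
fallback = best_prediction(xs, ys, 6.25, critical_r=0.9)

print(f"regression prediction: {prediction:.2f}")
print(f"fallback (sample mean of y): {fallback:.2f}")
```

The sketch does not check the scatterplot or the range of the data, so those parts of the strategy still need to be verified separately.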

58
Calculate y From Regression Line
Problem 6 on page 525
Given n = 8 ("eight pairs") and r = 0.693. Using Table A-5 for n = 8 gives the critical r value as 0.707. Since r is less than this value, there is not sufficient evidence of a linear correlation in the data. Use the given mean of the daughters' heights to estimate y.
