Section 10.5-1 Copyright © 2014, 2012, 2010 Pearson Education, Inc. Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series.

1 Section 10.5-1 Copyright © 2014, 2012, 2010 Pearson Education, Inc. Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series by Mario F. Triola

2 Chapter 10 Correlation and Regression
10-1 Review and Preview
10-2 Correlation
10-3 Regression
10-4 Prediction Intervals and Variation
10-5 Multiple Regression
10-6 Nonlinear Regression

3 Key Concept: This section presents a method for analyzing a linear relationship involving more than two variables. We focus on these key elements:
1. Finding the multiple regression equation.
2. Using the values of the adjusted $R^2$ and the P-value as measures of how well the multiple regression equation fits the sample data.

4 Part 1: Basic Concepts of a Multiple Regression Equation

5 Definition: A multiple regression equation expresses a linear relationship between a response variable y and two or more predictor variables. The general form of the multiple regression equation obtained from sample data is $\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_k x_k$.

6 Notation: $\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_k x_k$ (general form of the multiple regression equation), where n = sample size, k = number of predictor variables, $\hat{y}$ = predicted value of y, and $x_1, x_2, \dots, x_k$ are the predictor variables.
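As a rough illustration of what the software packages do when they report an equation of the form $\hat{y} = b_0 + b_1 x_1 + \dots + b_k x_k$, the sketch below finds the coefficients by least squares using only the Python standard library. The function names (`solve_linear_system`, `fit_multiple_regression`, `predict`) and the small data set are made up for this example; they are not from the slides or from any of the listed packages.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so each row operation applies to both sides.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("predictor variables are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_multiple_regression(xs: Sequence[Sequence[float]],
                            y: Sequence[float]) -> List[float]:
    """Return [b0, b1, ..., bk] that minimize the sum of squared residuals."""
    rows = [[1.0] + list(r) for r in xs]  # the leading 1 carries b0
    p = len(rows[0])
    # Normal equations: (X^T X) b = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict(coefs: Sequence[float], x: Sequence[float]) -> float:
    """Evaluate y-hat = b0 + b1*x1 + ... + bk*xk."""
    return coefs[0] + sum(b * v for b, v in zip(coefs[1:], x))


# Made-up data generated exactly from y = 1 + 2*x1 + 3*x2 (k = 2, n = 5).
xs = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in xs]
coefs = fit_multiple_regression(xs, y)
```

Because the made-up data follow the equation exactly, the fitted coefficients should come back very close to 1, 2, and 3. Real packages use numerically sturdier methods than the normal equations, so this is only a sketch of the idea.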

7 Technology: Use a statistical software package such as STATDISK, Minitab, Excel, TI-83/84, or StatCrunch.

8 Example: Table 10-4 includes a random sample of heights of mothers, fathers, and their daughters (based on data from the National Health and Nutrition Examination). Find the multiple regression equation in which the response (y) variable is the height of a daughter and the predictor (x) variables are the height of the mother and the height of the father.

9 Example: The Minitab results are shown here:

10 Definition:
- The multiple coefficient of determination, $R^2$, is a measure of how well the multiple regression equation fits the sample data. A perfect fit would result in $R^2 = 1$.
- The adjusted coefficient of determination, adjusted $R^2$, is the multiple coefficient of determination modified to account for the number of variables and the sample size.

11 Adjusted Coefficient of Determination: Adjusted $R^2 = 1 - \frac{(n - 1)}{[n - (k + 1)]}(1 - R^2)$, where n = sample size and k = number of predictor (x) variables.
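The adjustment, adjusted $R^2 = 1 - \frac{n-1}{n-(k+1)}(1 - R^2)$, is simple enough to compute by hand or in a few lines of code. The sketch below is one possible way to write it; the function name `adjusted_r_squared` and the input values are hypothetical and do not come from the slides.

```python
def adjusted_r_squared(r_squared: float, n: int, k: int) -> float:
    """Adjusted R^2 for n observations and k predictor (x) variables."""
    if n <= k + 1:
        raise ValueError("need more observations than estimated coefficients")
    return 1.0 - (n - 1) / (n - (k + 1)) * (1.0 - r_squared)


# Hypothetical values: R^2 = 0.80 with n = 25 and k = 3.
base = adjusted_r_squared(0.80, n=25, k=3)

# Adding a fourth predictor that barely raises R^2 (0.80 to 0.805)
# lowers the adjusted value, because the penalty for k grows.
with_extra = adjusted_r_squared(0.805, n=25, k=4)
```

With these made-up numbers, `with_extra` is smaller than `base`, which shows why adjusted $R^2$ is preferred when comparing equations with different numbers of predictors.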

12 Example: The preceding technology display shows the adjusted coefficient of determination as R-Sq(adj) = 63.7%. When we compare this multiple regression equation to others, it is better to use the adjusted $R^2$ of 63.7%.

13 P-Value: The P-value is a measure of the overall significance of the multiple regression equation. The displayed technology P-value of 0.000 is small, indicating that the multiple regression equation has good overall significance and is usable for predictions. That is, it makes sense to predict the heights of daughters based on heights of mothers and fathers. The value of 0.000 results from a test of the null hypothesis that $\beta_1 = \beta_2 = 0$, and rejection of this hypothesis indicates the equation is effective in predicting the heights of daughters.
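The P-value for overall significance is normally read straight from the technology display. For readers curious about where it comes from, one standard route is the F test statistic for the null hypothesis that all slope coefficients are zero, which can be written in terms of $R^2$, n, and k. The sketch below computes only that statistic; the function name and the input values are hypothetical.

```python
def overall_f_statistic(r_squared: float, n: int, k: int) -> float:
    """F statistic for H0: beta_1 = ... = beta_k = 0.

    Numerator degrees of freedom: k.
    Denominator degrees of freedom: n - k - 1.
    """
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must be in [0, 1)")
    if n <= k + 1:
        raise ValueError("need n > k + 1")
    return (r_squared / k) / ((1.0 - r_squared) / (n - k - 1))


# Hypothetical values: R^2 = 0.80, n = 25 observations, k = 3 predictors.
f_stat = overall_f_statistic(0.80, n=25, k=3)
```

Turning the statistic into a P-value requires the upper-tail area of the F distribution with k and n - k - 1 degrees of freedom, which the Python standard library does not provide. Assuming SciPy is installed, `scipy.stats.f.sf(f_stat, k, n - k - 1)` would give that area.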

14 Finding the Best Multiple Regression Equation
1. Use common sense and practical considerations to include or exclude variables.
2. Consider the P-value. Select an equation having overall significance, as determined by the P-value found in the computer display.

15 Finding the Best Multiple Regression Equation
3. Consider equations with high values of adjusted $R^2$, and try to include only a few variables. Prefer an equation for which including an additional predictor variable does not increase adjusted $R^2$ by a substantial amount. For a given number of predictor (x) variables, select the equation with the largest value of adjusted $R^2$. In weeding out predictor (x) variables that don't have much of an effect on the response (y) variable, it might be helpful to find the linear correlation coefficient r for each of the paired variables being considered.

16 Example: Data Set 2 in Appendix B includes the age, foot length, shoe print length, shoe size, and height for each of 40 different subjects. Using those sample data, find the regression equation that is the best for predicting height. The table on the next slide includes key results from the combinations of the five predictor variables.

17 Example - Continued

18 Example - Continued
Using critical thinking and statistical analysis:
1. Delete the variable age.
2. Delete the variable shoe size, because it is really a rounded form of foot length.
3. For the remaining variables of foot length and shoe print length, select foot length because its adjusted $R^2$ of 0.7014 is greater than 0.6520 for shoe print length.
4. Although it appears that foot length alone is best, we note that criminals usually wear shoes, so shoe print lengths are more likely to be found than foot lengths.
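The statistical part of this selection (dropping variables judged unhelpful, then comparing adjusted $R^2$ among equations with the same number of predictors) can be sketched in a few lines. The two adjusted $R^2$ values are the ones quoted in the example; the variable names and the dictionary layout are assumptions made for this illustration.

```python
# Adjusted R^2 for single-predictor equations, as quoted in the example.
single_predictor_adj_r2 = {
    "foot length": 0.7014,
    "shoe print length": 0.6520,
}

# Variables removed on practical grounds before any comparison.
excluded = {"age", "shoe size"}

remaining = {
    name: value
    for name, value in single_predictor_adj_r2.items()
    if name not in excluded
}

# For a fixed number of predictors, take the largest adjusted R^2.
best_by_statistics = max(remaining, key=remaining.get)
```

Here `best_by_statistics` is `"foot length"`. The final step of the example is a judgment call that code cannot make: shoe print length may still be the more useful predictor in practice because shoe prints are what is usually available.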

19 Part 2: Dummy Variables and Logistic Equations

20 Dummy Variable: Many applications involve a dichotomous variable, which has only two possible discrete values (such as male/female or dead/alive). A common procedure is to represent the two possible discrete values by 0 and 1, where 0 represents "failure" and 1 represents "success". A dichotomous variable with the two values 0 and 1 is called a dummy variable.
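Coding a dichotomous variable as 0 and 1 is a one-line transformation. The sketch below shows one way to do it; the helper name `encode_dummy` and the sample labels are hypothetical.

```python
from typing import Hashable, List, Sequence


def encode_dummy(values: Sequence[Hashable], success: Hashable) -> List[int]:
    """Map a two-valued variable to 0/1, with 1 for the 'success' label."""
    labels = set(values)
    if len(labels) > 2:
        raise ValueError("a dummy variable needs at most two distinct values")
    return [1 if v == success else 0 for v in values]


# Made-up labels, coded as 0 = female and 1 = male.
sexes = ["female", "male", "male", "female"]
codes = encode_dummy(sexes, success="male")
```

Which label counts as "success" is arbitrary. Swapping the coding changes the sign of the dummy variable's coefficient and shifts the constant term, but it does not change the predictions.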

21 Example: The data in the table also include the dummy variable of sex (coded as 0 = female and 1 = male). Find the regression equation and use it to predict the height of a daughter and the height of a son, given that the mother is 63 inches tall and the father is 69 inches tall.

22 Example

23 Example - Continued: Using technology, we get the regression equation: We substitute in 0 for the sex, 63 for the mother's height, and 69 for the father's height, and predict the daughter will be 62.8 inches tall. We substitute in 1 for the sex, 63 for the mother's height, and 69 for the father's height, and predict the son will be 67 inches tall.
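The substitution step can be sketched as follows. The fitted equation itself does not survive in this transcript, so the coefficients below are an assumption: illustrative values chosen to be consistent with the two predictions quoted in the example. They should not be read as the actual technology output.

```python
# ASSUMED coefficients (not from the transcript); chosen only so that the
# sketch agrees with the quoted predictions of 62.8 and 67 inches.
COEFS = {"intercept": 25.6, "mother": 0.377, "father": 0.195, "sex": 4.15}


def predict_height(mother_in: float, father_in: float, sex: int) -> float:
    """Predicted child height in inches; sex is the dummy (0 = female, 1 = male)."""
    if sex not in (0, 1):
        raise ValueError("sex must be coded 0 or 1")
    return (COEFS["intercept"]
            + COEFS["mother"] * mother_in
            + COEFS["father"] * father_in
            + COEFS["sex"] * sex)


daughter = predict_height(63, 69, sex=0)  # about 62.8 inches
son = predict_height(63, 69, sex=1)       # about 67 inches
```

The only difference between the two predictions is the coefficient of the dummy variable, so that coefficient can be read as the estimated height difference between a son and a daughter of the same parents.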

24 Logistic Regression: We can use the methods of this section if the dummy variable is a predictor variable. If the dummy variable is the response variable, we need to use a method known as logistic regression. As the name implies, logistic regression involves the use of natural logarithms. This textbook does not include detailed procedures for using logistic regression (there is a brief example on page 546 of the text).
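To show where the natural logarithm enters, the sketch below uses the usual form of a logistic regression model, in which the log of the odds, ln(p / (1 - p)), is a linear function of the predictors. The function names, the coefficients, and the input value are all hypothetical; this is not the textbook's example.

```python
import math
from typing import Sequence


def log_odds(p: float) -> float:
    """Natural log of the odds p / (1 - p), for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def logistic(z: float) -> float:
    """Inverse of log_odds: converts a linear score z into a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(coefs: Sequence[float], x: Sequence[float]) -> float:
    """P(y = 1) when ln(p / (1 - p)) = b0 + b1*x1 + ... + bk*xk."""
    z = coefs[0] + sum(b * v for b, v in zip(coefs[1:], x))
    return logistic(z)


# Hypothetical model with one predictor: ln(p / (1 - p)) = -40 + 0.6 * x1.
p_success = predict_probability([-40.0, 0.6], [68.0])
```

Unlike the linear equation used for a dummy predictor, this form keeps every predicted value between 0 and 1, which is why it suits a response variable coded as 0 or 1.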