Chapter 10: Regression
Understandable Statistics, Ninth Edition, by Brase and Brase
Prepared by Yixun Shi, Bloomsburg University of Pennsylvania

Scatter Diagrams
A graph in which pairs of points, (x, y), are plotted with x on the horizontal axis and y on the vertical axis. The explanatory variable is x. The response variable is y. One goal of plotting paired data is to determine if there is a linear relationship between x and y.

Finding the Best-Fitting Line
One goal of data analysis is to find a mathematical equation that “best” represents the data. For our purposes, “best” means the line that comes closest to each point on the scatter diagram.

Not All Relationships Are Linear

Correlation Strength
Correlation is a measure of the linear relationship between two quantitative variables. If the points lie exactly on a straight line, we say there is perfect correlation. As the strength of the relationship decreases, the correlation moves from perfect to strong to moderate to weak.

Correlation
If the response variable, y, tends to increase as the explanatory variable, x, increases, the variables have positive correlation. If the response variable, y, tends to decrease as the explanatory variable, x, increases, the variables have negative correlation.

Correlation Examples

Correlation Coefficient
The mathematical measurement that describes the correlation is called the sample correlation coefficient, denoted by r.

Correlation Coefficient Features
1) r has no units.
2) -1 ≤ r ≤ 1.
3) Positive values of r indicate a positive relationship between x and y (likewise for negative values).
4) r = 0 indicates no linear relationship.
5) Switching the explanatory variable and response variable does not change r.
6) Changing the units of the variables does not change r.

r = 0

r = 1 or r = -1

0 < r < 1

-1 < r < 0

Developing a Formula for r

Developing a Formula for r (continued)
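The formula graphics on these two slides did not transfer. As a reconstruction (standard notation, not the slides' own layout), the defining formula for the sample correlation coefficient is

\[
r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^{2}}\,\sqrt{\sum (y - \bar{y})^{2}}}
\]

where the sums run over the n data pairs.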

A Computational Formula for r
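The formula graphic did not transfer. The standard computational formula, equivalent to the defining formula above, is

\[
r = \frac{n \sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{n \sum x^{2} - \left(\sum x\right)^{2}}\;\sqrt{n \sum y^{2} - \left(\sum y\right)^{2}}}
\]

where n is the number of data pairs.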

Critical Thinking: Population vs. Sample Correlation
The sample correlation coefficient r is computed from sample data and estimates the population correlation coefficient, ρ. Beware: high correlations do not imply a cause-and-effect relationship between the explanatory and response variables!

Critical Thinking: Lurking Variables
A lurking variable is neither the explanatory variable nor the response variable. Beware: lurking variables may be responsible for the evident changes in x and y.

Correlations Between Averages
If the two variables of interest are averages rather than raw data values, the correlation between the averages will tend to be higher than the correlation between the underlying raw values.

Linear Regression
When we are in the “paired-data” setting:
– Do the data points have a linear relationship?
– Can we find an equation for the best-fitting line?
– Can we predict the value of the response variable for a new value of the predictor variable?
– What fractional part of the variability in y is associated with the variability in x?

Least-Squares Criterion
The line that we fit to the data is the one for which the sum of the squared vertical distances from the data points to the line is minimized. Let d = the vertical distance from any data point to the regression line.

Least-Squares Criterion (continued)
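The criterion graphic did not transfer. In symbols, the least-squares line is the line that minimizes

\[
\sum d^{2} = \sum \left(y - \hat{y}\right)^{2},
\]

the sum of the squared vertical distances from the data points to the line.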

Least-Squares Line
The least-squares line has the equation ŷ = a + bx, where a is the y-intercept and b is the slope.

Finding the Regression Equation

Finding the Regression Equation (continued)
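The formula graphics did not transfer. The standard least-squares formulas for the slope and intercept of ŷ = a + bx are

\[
b = \frac{n \sum xy - \left(\sum x\right)\left(\sum y\right)}{n \sum x^{2} - \left(\sum x\right)^{2}},
\qquad
a = \bar{y} - b\,\bar{x}.
\]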

Using the Regression Equation
The point (x̄, ȳ) is always on the least-squares line. Also, the slope tells us that if we increase the explanatory variable by one unit, the predicted response changes by b units (an increase or a decrease, depending on the sign of the slope).

Critical Thinking: Making Predictions
We can plug x values into the regression equation to calculate predicted y values. Extrapolation, predicting y for x values outside the observed x-range, can result in unreliable predictions.
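As an illustration (not part of the original slides), here is a minimal Python sketch that fits a least-squares line and uses it for prediction; the data values and the use of SciPy's linregress are assumptions made for the example.

# A minimal sketch, assuming SciPy is available; the data values are made up.
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])    # explanatory variable
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 12.2])   # response variable

fit = stats.linregress(x, y)                     # least-squares fit
print(f"y-hat = {fit.intercept:.3f} + {fit.slope:.3f} x,  r = {fit.rvalue:.3f}")

x_new = 4.5                                      # inside the observed x-range
print("predicted y at x = 4.5:", fit.intercept + fit.slope * x_new)

# A value such as x = 50 lies far outside the observed x-range (1 to 6); the
# equation still returns a number, but such extrapolation is unreliable.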

Coefficient of Determination
Another way to gauge the fit of the regression equation is to calculate the coefficient of determination, r². We can view the mean of y as a baseline for all the y values. Then:
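The formula graphic that followed did not transfer. The relationship the slide refers to is the standard decomposition of the total variation about the mean into unexplained and explained parts:

\[
\sum (y - \bar{y})^{2} = \sum (y - \hat{y})^{2} + \sum (\hat{y} - \bar{y})^{2}
\]

(total variation = unexplained variation + explained variation).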

Coefficient of Determination
We algebraically manipulate the above equation to obtain:
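The resulting formula graphic did not transfer; the standard result is

\[
r^{2} = \frac{\sum (\hat{y} - \bar{y})^{2}}{\sum (y - \bar{y})^{2}} = \frac{\text{explained variation}}{\text{total variation}}.
\]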

Coefficient of Determination
Thus, r² is the fraction of the total variability in y that is explained by the regression on x.

Coefficient of Determination (continued)

Inferences on the Regression Equation
We can make inferences about the population correlation coefficient, ρ, and the population regression line slope, β.

Inferences on the Regression Equation
Inference requirements:
– The set of ordered pairs (x, y) constitutes a random sample from all ordered pairs in the population.
– For each fixed value of x, y has a normal distribution.
– The y distributions for all x values have equal variance, and their means lie on the population regression line.

Testing the Correlation Coefficient

Testing the Correlation Coefficient
We convert r to a test statistic that follows a Student's t distribution. The sample size must be at least 3.
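The conversion formula graphic did not transfer. The standard test statistic for H_0: ρ = 0 is

\[
t = \frac{r\sqrt{n-2}}{\sqrt{1 - r^{2}}},
\]

which follows a Student's t distribution with n - 2 degrees of freedom.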

Testing Procedure

Standard Error of Estimate
The standard error of estimate, S_e, measures the spread of the data points about the regression line. As the points get closer to the regression line, S_e decreases. If all the points lie exactly on the line, S_e will be zero.

Computational Formula for S_e
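The formula graphic did not transfer. The defining and computational formulas for the standard error of estimate are

\[
S_e = \sqrt{\frac{\sum (y - \hat{y})^{2}}{n - 2}} = \sqrt{\frac{\sum y^{2} - a \sum y - b \sum xy}{n - 2}},
\]

where a and b are the intercept and slope of the least-squares line.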

Confidence Intervals for y
The population regression line is y = α + βx + ε, where ε is random error. Because of ε, for each x value there is a corresponding distribution for y. Each y distribution has the same standard deviation, estimated by S_e, and each y distribution is centered on the regression line.

Confidence Intervals for Predicted y

Confidence Intervals for Predicted y (continued)
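The interval formula graphics did not transfer. For a specified value x of the explanatory variable, the standard prediction interval for y is

\[
\hat{y} - E < y < \hat{y} + E,
\qquad
E = t_{c}\, S_e \sqrt{1 + \frac{1}{n} + \frac{n\left(x - \bar{x}\right)^{2}}{n \sum x^{2} - \left(\sum x\right)^{2}}},
\]

where t_c is the critical value from a Student's t distribution with n - 2 degrees of freedom.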

Confidence Intervals for Predicted y
The confidence intervals will be narrower near the mean of x and wider near the extreme values of x.

Inferences About Slope β
Recall the population regression equation: y = α + βx + ε. Let S_e be the standard error of estimate from the sample.

Inferences About Slope β
The standardized slope statistic shown below has a Student's t distribution with n - 2 degrees of freedom.
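The statistic graphic did not transfer. The standard form is

\[
t = \frac{b - \beta}{S_e \Big/ \sqrt{\sum x^{2} - \tfrac{1}{n}\left(\sum x\right)^{2}}},
\qquad
d.f. = n - 2,
\]

and under H_0: β = 0 the numerator is simply b.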

Testing β

Finding a Confidence Interval for β
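The interval formula graphic did not transfer. The standard confidence interval for the slope is

\[
b - E < \beta < b + E,
\qquad
E = \frac{t_{c}\, S_e}{\sqrt{\sum x^{2} - \tfrac{1}{n}\left(\sum x\right)^{2}}},
\]

with t_c taken from a Student's t distribution with n - 2 degrees of freedom.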

Multiple Regression
In practice, most predictions will improve if we consider additional data. Statistically, this means adding predictor variables to our regression techniques.

Multiple Regression Terminology
y is the response variable. x_1, …, x_k are explanatory variables. b_0, …, b_k are coefficients.
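The equation graphic did not transfer. The least-squares equation for k predictor variables has the standard form

\[
\hat{y} = b_{0} + b_{1}x_{1} + b_{2}x_{2} + \cdots + b_{k}x_{k}.
\]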

Regression Models – A mathematical package that includes the following:
1) A collection of variables, one of which is the response and the rest of which are predictors.
2) A collection of numerical values associated with the given variables.
3) A least-squares regression equation constructed from those numerical values using software.

Regression Models – A mathematical package that includes the following (continued):
4) The model includes information about the variables, coefficients of the equation, and various “goodness of fit” measures.
5) The model allows the user to make predictions and build confidence intervals for various components of the equation.

Coefficient of Multiple Determination
How well does our equation fit the data?
– Just as in linear regression, computer output will provide r². Here r² is the fraction of the variability in y that is explained by the regression on all of the x variables.

Predictions in the Multiple Regression Case
Similar to the techniques described for linear regression, we can “plug in” values for all of the x variables and compute a predicted y value.
– Predicting y for any x values outside the x-range is extrapolation, and the results will be unreliable.

Testing a Coefficient for Significance
At times, one or more of the predictor variables may not be helpful in determining y. Using software, we can test any or all of the x variables for statistical significance.

Testing a Coefficient for Significance
The hypotheses for the i-th coefficient in the model are:
H_0: β_i = 0
H_1: β_i ≠ 0
If we fail to reject the null hypothesis for a given coefficient, we may consider removing that predictor from the model.

Confidence Intervals for Coefficients
As in the linear case, software can provide confidence intervals for any β_i in the model.
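As an illustration (not part of the original slides), the following minimal Python sketch shows how statistical software reports the coefficient tests and confidence intervals described above; the simulated data and the use of the statsmodels library are assumptions made for the example.

# A minimal sketch, assuming statsmodels is available; x2 is deliberately
# unrelated to y, so its coefficient should test as not significant.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(seed=1)
n = 40
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 3.0 + 2.0 * x1 + rng.normal(scale=1.0, size=n)   # y does not depend on x2

X = sm.add_constant(np.column_stack([x1, x2]))       # adds the b0 (intercept) column
model = sm.OLS(y, X).fit()

print(model.params)      # estimated coefficients b0, b1, b2
print(model.pvalues)     # p-values for H_0: beta_i = 0 vs. H_1: beta_i != 0
print(model.conf_int())  # confidence interval for each beta_i
# A large p-value for the x2 coefficient suggests that predictor could be
# dropped from the model.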