LEAST-SQUARES REGRESSION 3.2 Least Squares Regression Line and Residuals.


Recall from 3.1: Correlation measures the strength and direction of a linear relationship between two variables. How do we summarize the overall pattern of a linear relationship? Draw a line!

Regression Line A regression line summarizes the relationship between two variables, but only in settings where one of the variables helps explain or predict the other.

Regression Line A regression line is a line that describes how a response variable y changes as an explanatory variable x changes. We often use a regression line to predict the value of y for a given value of x.

Example p. 165 How much is a truck worth? Everyone knows that cars and trucks lose value the more they are driven. Can we predict the price of a used Ford F-150 SuperCrew 4 x 4 if we know how many miles it has on the odometer? A random sample of 16 used Ford F-150 SuperCrew 4 x 4s was selected from among those listed for sale at autotrader.com. The number of miles driven and the price (in dollars) were recorded for each of the trucks. Here are the data:

Miles driven: 70,583; 129,484; 29,932; 29,953; 24,495; 75,…
Price (in dollars): 21,…; …,875; 41,995; 28,986; 31,891; 37,991

Miles driven: 34,077; 58,023; 44,447; 68,474; 144,162; 140,776; 29,397; 131,385
Price (in dollars): 34,995; 29,988; 22,896; 33,961; 16,883; 20,897; 27,495; 13,997

Interpreting a Regression Line Suppose that y is a response variable (plotted on the vertical axis) and x is an explanatory variable (plotted on the horizontal axis). A regression line relating y to x has an equation of the form ŷ = a + bx

Interpreting a Regression Line ŷ = a + bx In this equation, ŷ (read “y hat”) is the predicted value of the response variable y for a given value of the explanatory variable x. b is the slope, the amount by which y is predicted to change when x increases by one unit. a is the y intercept, the predicted value of y when x = 0.

Example p. 166: Interpreting slope and y intercept The equation of the regression line shown is ŷ = 38,257 - 0.1629x, where x is the number of miles driven and ŷ is the predicted price in dollars. PROBLEM: Identify the slope and y intercept of the regression line. Interpret each value in context. SOLUTION: The slope b = -0.1629 tells us that the price of a used Ford F-150 is predicted to go down by 0.1629 dollars (16.29 cents) for each additional mile that the truck has been driven.

SOLUTION (continued): The y intercept a = 38,257 is the predicted price of a Ford F-150 that has been driven 0 miles.

Prediction – Example, p. 167 We can use a regression line to predict the response ŷ for a specific value of the explanatory variable x. Use the regression line to predict price for a Ford F-150 with 100,000 miles driven.
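As a minimal sketch of this prediction step, the Python snippet below assumes the line from the p. 166 example: a y intercept of 38,257 dollars and a slope of -0.1629 dollars per mile (the 16.29-cent drop per mile quoted there). The function name predict_price is made up for illustration.

```python
# Coefficients assumed from the p. 166 example:
# intercept 38,257 dollars, slope -0.1629 dollars per mile.
INTERCEPT = 38257.0
SLOPE = -0.1629


def predict_price(miles):
    """Return the predicted price y-hat = a + b*x for a mileage x."""
    return INTERCEPT + SLOPE * miles


predicted = predict_price(100_000)
print(f"Predicted price at 100,000 miles: ${predicted:,.0f}")
```

Under these assumed coefficients the predicted price works out to about 21,967 dollars, since 38,257 - 0.1629(100,000) = 38,257 - 16,290.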

Extrapolation – p. 167 Suppose we wanted to predict the price of a vehicle that had 300,000 miles. According to the regression line, the vehicle would have a negative price. A negative price doesn’t make sense.

Extrapolation Extrapolation is the use of a regression line for prediction far outside the interval of values of the explanatory variable x used to obtain the line. Such predictions are often not accurate. Don’t make predictions using values of x that are much larger or much smaller than those that actually appear in your data.
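The 300,000-mile case can be checked numerically. This sketch assumes the same coefficients (intercept 38,257, slope -0.1629 dollars per mile) and compares the requested mileage with 144,162, the largest mileage recoverable from the data listing; the names used are hypothetical.

```python
# Assumed coefficients from the truck example.
INTERCEPT = 38257.0
SLOPE = -0.1629

# Largest mileage recoverable from the sample data listing.
LARGEST_MILES_IN_DATA = 144_162


def predict_price(miles):
    """Return the predicted price y-hat = a + b*x."""
    return INTERCEPT + SLOPE * miles


miles = 300_000
price = predict_price(miles)
is_extrapolation = miles > LARGEST_MILES_IN_DATA

print(f"Predicted price at {miles:,} miles: {price:,.0f} dollars")
print(f"Beyond the largest observed mileage? {is_extrapolation}")
```

The prediction comes out negative (roughly -10,613 dollars), and 300,000 miles is more than twice the largest mileage in the sample, which is exactly the situation the warning describes.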

Residuals A residual is the difference between an observed value of the response variable and the value predicted by the regression line. residual = observed y - predicted y, that is, residual = y - ŷ
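To make the definition concrete, here is a small sketch that computes one residual for the truck data. It assumes the line ŷ = 38,257 - 0.1629x from the earlier example and takes one truck from the data listing (68,474 miles, 33,961 dollars); pairing miles and price by column position is an assumption.

```python
# Assumed regression line from the truck example.
INTERCEPT = 38257.0
SLOPE = -0.1629

# One truck from the data listing (pairing by column is an assumption).
observed_miles = 68_474
observed_price = 33_961

predicted_price = INTERCEPT + SLOPE * observed_miles

# residual = observed y - predicted y
residual = observed_price - predicted_price

print(f"Predicted price: {predicted_price:,.0f} dollars")
print(f"Residual:        {residual:,.0f} dollars")
```

A positive residual means the observed price sits above the line: this truck is listed for more than the line predicts for its mileage.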

Special Property of Residuals The mean of the least-squares residuals is always zero!
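This property can be checked on any data set. The sketch below fits a least-squares line to the eight trucks whose values are fully recoverable from the data listing (pairing by column position is an assumption) and averages the residuals. Because it uses only 8 of the 16 trucks, the fitted line will differ from the one in the example.

```python
# Eight complete (miles, price) pairs from the data listing.
miles = [34077, 58023, 44447, 68474, 144162, 140776, 29397, 131385]
prices = [34995, 29988, 22896, 33961, 16883, 20897, 27495, 13997]

n = len(miles)
x_bar = sum(miles) / n
y_bar = sum(prices) / n

# Least-squares slope and intercept from sums of squares and products.
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(miles, prices))
sxx = sum((x - x_bar) ** 2 for x in miles)
b = sxy / sxx
a = y_bar - b * x_bar

residuals = [y - (a + b * x) for x, y in zip(miles, prices)]
mean_residual = sum(residuals) / n

print(f"Mean residual: {mean_residual:.6f}")
```

Up to floating-point rounding the mean is zero, because the least-squares line always passes through the point (x̄, ȳ).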

Example, p. 169

Least Squares Regression Line The least-squares regression line of y on x is the line that makes the sum of the squared residuals as small as possible.
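Written out, the criterion and its standard solution look as follows. The symbols x̄, ȳ, s_x and s_y (the means and standard deviations of x and y) are standard notation that does not appear on the slide itself.

```latex
\[
  \text{choose } a, b \text{ to minimize}\quad
  \sum_{i=1}^{n} \bigl(y_i - \hat{y}_i\bigr)^2
  = \sum_{i=1}^{n} \bigl(y_i - (a + b x_i)\bigr)^2
\]
\[
  b = r\,\frac{s_y}{s_x}, \qquad a = \bar{y} - b\,\bar{x}
\]
```

Here r is the correlation, so the slope has the same sign as r, and the intercept formula places the line through the point (x̄, ȳ).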

Facts about LSRL:

Getting the LSRL a = y-intercept, b = slope, r² = coefficient of determination, r = correlation. Exercise 68
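The same four quantities can be computed without a calculator's regression menu. This sketch uses only the Python standard library and the standard formulas b = r(s_y / s_x) and a = ȳ - b·x̄. The data are the eight complete truck pairs from the earlier listing, used purely for illustration (the Exercise 68 data are not reproduced here, and pairing by column position is an assumption).

```python
from statistics import mean, stdev

# Illustrative data: eight complete (miles, price) pairs.
miles = [34077, 58023, 44447, 68474, 144162, 140776, 29397, 131385]
prices = [34995, 29988, 22896, 33961, 16883, 20897, 27495, 13997]

n = len(miles)
x_bar, y_bar = mean(miles), mean(prices)
s_x, s_y = stdev(miles), stdev(prices)  # sample standard deviations

# Correlation: sum of products of z-scores divided by n - 1.
r = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
        for x, y in zip(miles, prices)) / (n - 1)

b = r * s_y / s_x        # slope
a = y_bar - b * x_bar    # y intercept
r_squared = r ** 2       # coefficient of determination

print(f"a = {a:.2f}, b = {b:.4f}, r = {r:.3f}, r^2 = {r_squared:.3f}")
```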

To plot the line on the scatterplot by hand:

For Example: Smallest x = 9, Largest x = 42 Use these two x-values to predict y. From data set: Exercise 68

For Example: Smallest x = 9, Largest x = 42 (9, ), (42, )

Residual Plot A scatterplot of the regression residuals against the explanatory variable (x). Helps us assess the fit of a regression line.

Residuals vs. Correlation Never rely on correlation alone to determine if an LSRL is the best model for the data. You must check the residual plot!

Examining Residual Plots A residual plot magnifies the deviations of the points from the line, making it easier to see unusual observations and patterns. The residual plot should show no obvious patterns. The residuals should be relatively small in size. A pattern in the residuals means a linear model is not appropriate.

Residual Plots

HW Due: Tuesday p.193 & 199 # 35, 39, 41, 45, 52, 54, 76