Chapter 8 Linear Regression *The Linear Model *Residuals *Best Fit Line *Correlation and the Line *Predicted Values *Regression.



Burger King: Fat vs. Protein
x variable = protein
y variable = fat
We want to predict how much fat is in a menu item based on its protein. How much fat is in a sandwich that has 25 grams of protein?

Linear Model
To predict values we need an equation of the "best fit" line to go with our scatterplot. Note: this is a MODEL. It will not tell you exactly how much fat is in a sandwich based on its protein content; it will give you a predicted value.
Linear Model: an equation of a straight line through the data.
- It summarizes the general pattern.
- It helps us understand how the variables are associated.
- The line will not hit all the points, and it might not hit any of the points.

Notation
"Putting a hat on it" is standard statistics notation to indicate that something has been predicted by a model. Whenever you see a hat over a variable name or symbol, you can assume it is the predicted version of that variable or symbol.

Residuals
The residual is the difference between the observed value and its associated predicted value; it tells us how far off the model's prediction is at that point.
residual = observed value – predicted value
residual = y – ŷ, where ŷ ("y hat") is the predicted value
A negative residual means the predicted value is too big (an overestimate); a positive residual means the predicted value is too small (an underestimate).
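As a minimal sketch, residuals can be computed directly from observed and model-predicted values (the numbers below are hypothetical, not the actual Burger King data):

```python
# Hypothetical observed and model-predicted fat values (grams) for three menu items.
observed = [25.0, 40.0, 31.0]
predicted = [36.0, 38.5, 31.0]

# residual = observed - predicted; a negative residual means an overestimate.
residuals = [y - y_hat for y, y_hat in zip(observed, predicted)]
print(residuals)  # [-11.0, 1.5, 0.0]
```

The first item reproduces the slide's BK Broiler case: observed 25 g, predicted 36 g, residual –11 g.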

Residual Example
In the figure, the estimated fat of the BK Broiler chicken sandwich is 36 grams, while the true fat value is 25 grams. The residual is 25 – 36 = –11 grams.

Best Fit Line
When we draw a line through a scatterplot, some residuals are positive and some are negative. We can't just add these up, because the positives and negatives would cancel each other out. What can we do to our residuals so that adding them up gives a number that actually tells us something?
Squaring all the residuals makes them all positive; it also emphasizes the larger ones. When we add up all these squared residuals, the sum tells us how well the line we drew fits the data: the smaller the sum, the better the fit; the larger the sum, the worse the fit.

Line of Best Fit: the line for which the sum of the squared residuals is the smallest. (Figure: taking a look at the residuals.)
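To make "smallest sum of squared residuals" concrete, here is a small sketch (with assumed toy data) that computes the least-squares line from the standard formulas and checks that its sum of squared residuals beats another candidate line:

```python
def least_squares(xs, ys):
    """Slope and intercept that minimize the sum of squared residuals."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
          / sum((x - x_bar) ** 2 for x in xs))
    b0 = y_bar - b1 * x_bar
    return b0, b1

def sum_sq_residuals(xs, ys, b0, b1):
    """Add up the squared residuals for the line y-hat = b0 + b1*x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

xs, ys = [1, 2, 3, 4], [2, 3, 5, 6]   # toy data, for illustration only
b0, b1 = least_squares(xs, ys)
print(round(b0, 2), round(b1, 2))     # 0.5 1.4

# The least-squares line has a smaller sum of squared residuals than,
# say, the line y = 1.5x:
print(sum_sq_residuals(xs, ys, b0, b1) < sum_sq_residuals(xs, ys, 0.0, 1.5))  # True
```

Any other line through this scatterplot gives a larger squared-residual sum, which is exactly what "best fit" means here.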

Correlation and the Line The figure shows the scatterplot of z-scores for fat and protein. If a burger has average protein content, it should have about average fat content too. Moving one standard deviation away from the mean in x moves us r standard deviations away from the mean in y.

Looking at Standardized Data
Scatterplot: z_y (standardized fat) vs. z_x (standardized protein).
The line must go through the point (x̄, ȳ); when plotting the z-scores, the line must pass through the origin (0, 0), because that is the point of means in standardized units.
Put generally, moving any number of standard deviations away from the mean in x moves us r times that number of standard deviations away from the mean in y.

How big can a predicted value get? Since r cannot be bigger than 1 in absolute value, each predicted y tends to be closer to its mean (in standard deviations) than its corresponding x was. This property is called regression to the mean, and the line is called the regression line.
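The "r standard deviations" rule can be sketched directly in code; the correlation value below is an assumed, illustrative one:

```python
def predict_z_y(z_x, r):
    """Standardized prediction: starting z_x SDs from the mean in x,
    the predicted y is only r * z_x SDs from the mean in y."""
    return r * z_x

r = 0.83  # assumed correlation between protein and fat
print(predict_z_y(2.0, r))  # 1.66: closer to the mean than the 2 SDs we started with
```

Because |r| ≤ 1, the predicted z-score is never farther from 0 than the one we started with, which is regression to the mean in miniature.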

The Regression Line in Real Units
Remember from Algebra that a straight line can be written as y = mx + b. In Statistics we use a slightly different notation: ŷ = b0 + b1·x. We write ŷ to emphasize that the points that satisfy this equation are just our predicted values, not the actual data values. This model says that our predictions from our model follow a straight line. If the model is a good one, the data values will scatter closely around it.

The Regression Line in Real Units (cont.)
We write b1 and b0 for the slope and intercept of the line. b1 is the slope, which tells us how rapidly ŷ changes with respect to x. b0 is the y-intercept, which tells us where the line crosses (intercepts) the y-axis.

The Regression Line in Real Units (cont.)
In our model, we have a slope (b1). The slope is built from the correlation and the standard deviations: b1 = r · (s_y / s_x). Our slope is always in units of y per unit of x.

The Regression Line in Real Units (cont.)
In our model, we also have an intercept (b0). The intercept is built from the means and the slope: b0 = ȳ – b1 · x̄. Our intercept is always in units of y.
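Putting the two formulas together, the whole line can be computed from five summary numbers. A minimal sketch, using assumed summary statistics roughly in the ballpark of the Burger King data (the means, SDs, and r below are illustrative, not quoted from the slides):

```python
def regression_line(r, x_bar, s_x, y_bar, s_y):
    """b1 = r * (s_y / s_x);  b0 = y_bar - b1 * x_bar."""
    b1 = r * s_y / s_x        # slope, in units of y per unit of x
    b0 = y_bar - b1 * x_bar   # intercept, in units of y
    return b0, b1

# Assumed summary statistics (grams): protein mean/SD, fat mean/SD, correlation.
b0, b1 = regression_line(r=0.83, x_bar=17.2, s_x=14.0, y_bar=23.5, s_y=16.4)
print(round(b1, 2), round(b0, 1))  # 0.97 6.8
```

Note that swapping x and y would change both the slope and the intercept: the regression of y on x is not the same line as the regression of x on y.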

Fat Versus Protein: An Example
The regression line for the Burger King data fits the data well. The equation is ŷ(fat) = 6.8 + 0.97 · protein. The predicted fat content for a BK Broiler chicken sandwich (with 30 g of protein) is 6.8 + 0.97(30) = 35.9 grams of fat.
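A quick sketch of using a fitted line for prediction, assuming the equation fat-hat = 6.8 + 0.97 · protein (coefficients chosen so that 30 g of protein reproduces the 35.9 g prediction quoted above):

```python
def predicted_fat(protein_g):
    """Predicted fat (g) from the assumed fit: fat-hat = 6.8 + 0.97 * protein."""
    return 6.8 + 0.97 * protein_g

print(round(predicted_fat(30), 1))  # 35.9 g for the BK Broiler's 30 g of protein
```

Remember this is a predicted value, not the item's actual fat content; the difference between the two is the residual.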

The Regression Line in Real Units (cont.)
Since regression and correlation are closely related, we need to check the same conditions for regression as we did for correlation:
- Quantitative Variables Condition
- Straight Enough Condition
- Outlier Condition

Checking In
Let's look at the relationship between house prices (in thousands of $) and house size (in thousands of ft²). The regression model is: price = size.
What does the slope mean? What are the units?
How much can a homeowner expect the value of his house to increase if he builds on an additional 2000 square feet?
How much would you expect to pay for a house of 3000 ft²?

Example Fill in the missing information in the table.