Published by Loren Norrie. Modified over 2 years ago.

2
Least Squares Regression: Fitting a Line to Bivariate Data

3
Linear Relationships
Avg. occupants per car
- 1980: 6/car
- 1990: 3/car
- 2000: 1.5/car
- By the year 2010 every fourth car will have nobody in it!
Food for Thought
- What kind of mathematical relationship is there between year and avg. no. of occupants per car?
- Why might the relationship break down by 2010?

4
Basic Terminology
- Scatterplots, correlation: interested in association between 2 variables (assign x and y arbitrarily)
- Least squares regression: does one quantitative variable explain or cause changes in another variable?

5
Basic Terminology (cont.)
- Explanatory variable: explains or causes changes in the other variable; the x-variable (independent variable)
- Response variable: the y-variable; it responds to changes in the x-variable (dependent variable)

6
Examples
- Fertilizer (x) → corn yield (y)
- Advertising $ (x) → store income (y)
- Drug dose (x) → blood pressure (y)
- Daily temperature (x) → natural gas demand (y)
- Change in min wage (x) → unemployment rate (y)

7
Simplest Relationship
- Simplest equation that describes the dependence of variable y on variable x: $y = b_0 + b_1 x$
- This is a linear equation
- Its graph is a line with slope $b_1$ and y-intercept $b_0$

8
Graph
[Figure: the line $y = b_0 + b_1 x$, crossing the y-axis at $b_0$, with slope $b_1$ = rise/run]

9
Notation
- Data: $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$
- Draw the line $y = b_0 + b_1 x$ through the scatterplot; the point on the line corresponding to $x_i$ is $\hat{y}_i = b_0 + b_1 x_i$

10
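Observed y, Predicted y
[Figure: scatterplot with fitted line; the predicted y when x = 2.7 is $\hat{y} = a + bx = a + b(2.7)$, where $a = b_0$ is the intercept and $b = b_1$ is the slope]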

11
Scatterplot: Fuel Consumption vs Car Weight “Best” line?

12
Scatterplot with least squares prediction line

13
How do we draw the line? Residuals

14
Residuals: graphically

15
Criterion for choosing what line to draw: method of least squares
- The method of least squares chooses the line that makes the sum of squares of the residuals as small as possible
- This line has slope $b_1$ and intercept $b_0$ that minimize $\sum_{i=1}^{n} \bigl(y_i - (b_0 + b_1 x_i)\bigr)^2$
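As a sketch of why this criterion pins down a single line (standard calculus, not shown in the transcript): write the sum of squared residuals as a function $S$ of a candidate intercept and slope, then set both partial derivatives to zero. The symbol $S$ is introduced here only for illustration.

```latex
\[
S(b_0, b_1) = \sum_{i=1}^{n} \bigl(y_i - b_0 - b_1 x_i\bigr)^2
\]
\[
\frac{\partial S}{\partial b_0} = -2 \sum_{i=1}^{n} \bigl(y_i - b_0 - b_1 x_i\bigr) = 0,
\qquad
\frac{\partial S}{\partial b_1} = -2 \sum_{i=1}^{n} x_i \bigl(y_i - b_0 - b_1 x_i\bigr) = 0
\]
```

The first condition says the residuals of the least squares line sum to zero, which is the property the deck returns to when it discusses residuals.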

16
Least Squares Line $y = b_0 + b_1 x$: Slope $b_1$ and Intercept $b_0$
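The slope and intercept formulas on this slide did not survive in the transcript. A minimal sketch, assuming the standard textbook form $b_1 = \sum (x_i - \bar{x})(y_i - \bar{y}) / \sum (x_i - \bar{x})^2$ and $b_0 = \bar{y} - b_1 \bar{x}$; the function name `least_squares_line` and the three toy points are illustrative and not from the presentation:

```python
def least_squares_line(xs, ys):
    """Return (b0, b1) for the least squares line y = b0 + b1*x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of squared x-deviations and sum of cross-products of deviations.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Toy points lying exactly on y = 1 + 2x (made up for illustration).
b0, b1 = least_squares_line([0, 1, 2], [1, 3, 5])
print(f"intercept b0 = {b0}, slope b1 = {b1}")
```

Because the toy points sit exactly on a line, the fit recovers that line; with real scatter the same function returns the line that minimizes the sum of squared residuals.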

17
Example: Income vs Consumption Expenditure

18
Questions
- Construct scatterplot; determine if linear model is appropriate. If so …
- … find the least squares prediction line
- Estimate consumption expenditure in a household with an income of (i) $6,000 (ii) $25,000. Comfortable with estimates?
- Compute the residuals

19
Scatterplot

20
Solution

21
Calculations

22
least squares prediction line

23
Least Squares Prediction Line

24
Consumption Expenditure Prediction When x = $6,000
[Figure: least squares line; at x = 6 the predicted value is 7.4]

25
Consumption Expenditure Prediction When x = $25,000
[Figure: least squares line; at x = 25 the predicted value is 11.2]
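The two predictions can be reproduced with a minimal sketch. It assumes the fitted line has intercept 6.2 and slope 0.2 (the coefficients that appear in the deck's SSE calculation) and that income and consumption are both measured in thousands of dollars; the helper name `predict_consumption` is hypothetical.

```python
B0, B1 = 6.2, 0.2  # intercept and slope as used in the deck's SSE calculation


def predict_consumption(income_thousands):
    """Predicted consumption expenditure (units assumed to be $1000s)."""
    return B0 + B1 * income_thousands


for income in (6, 25):
    print(f"x = {income}: predicted y = {predict_consumption(income):.1f}")
```

Whether the $25,000 estimate is trustworthy depends on whether that income lies inside the range of the observed data, which the transcript does not preserve.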

26
The least squares line always goes through the point with coordinates $(\bar{x}, \bar{y})$
$(\bar{x}, \bar{y}) = (9, 8)$

27
C. Compute the Residuals

28
Residuals

29
Income Residual Plot

30
$\sum$ residuals, $\sum (\text{residuals})^2$
- Note that $\sum \text{residuals} = 0$
- $\sum (\text{residuals})^2 = 3.6$ *
- Any other line drawn through the scatterplot will have $\sum (\text{residuals})^2 > 3.6$
* From formula in box on p. 7: $SSE = \sum y_i^2 - b_0 \sum y_i - b_1 \sum x_i y_i = 330 - 6.2(40) - 0.2(392) = 330 - 248 - 78.4 = 3.6$
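The footnote's shortcut is easy to check numerically. A minimal sketch, using only the sums quoted on the slide (sum of y squared = 330, sum of y = 40, sum of xy = 392) and the coefficients 6.2 and 0.2; the function name `sse_shortcut` is illustrative:

```python
def sse_shortcut(sum_y_sq, sum_y, sum_xy, b0, b1):
    """SSE = sum(y^2) - b0*sum(y) - b1*sum(x*y).

    This identity holds only when b0 and b1 are the least squares
    coefficients for the same data that produced the sums.
    """
    return sum_y_sq - b0 * sum_y - b1 * sum_xy


sse = sse_shortcut(sum_y_sq=330, sum_y=40, sum_xy=392, b0=6.2, b1=0.2)
print(f"SSE = {sse:.1f}")
```

The shortcut avoids computing each residual, but it is sensitive to rounding in the coefficients, so summing the squared residuals directly is a useful cross-check.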

31
Car Weight, Fuel Consumption Example, cont.
$(x_i, y_i)$: (3.4, 5.5), (3.8, 5.9), (4.1, 6.5), (2.2, 3.3), (2.6, 3.6), (2.9, 4.6), (2, 2.9), (2.7, 3.6), (1.9, 3.1), (3.4, 4.9)

32
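Wt (x)   Fuel (y)   $x-\bar{x}$   $(x-\bar{x})^2$   $y-\bar{y}$   $(y-\bar{y})^2$   $(x-\bar{x})(y-\bar{y})$
3.4      5.5         0.5          0.25               1.11         1.2321            0.555
3.8      5.9         0.9          0.81               1.51         2.2801            1.359
4.1      6.5         1.2          1.44               2.11         4.4521            2.532
2.2      3.3        -0.7          0.49              -1.09         1.1881            0.763
2.6      3.6        -0.3          0.09              -0.79         0.6241            0.237
2.9      4.6         0            0                  0.21         0.0441            0
2.0      2.9        -0.9          0.81              -1.49         2.2201            1.341
2.7      3.6        -0.2          0.04              -0.79         0.6241            0.158
1.9      3.1        -1            1                 -1.29         1.6641            1.29
3.4      4.9         0.5          0.25               0.51         0.2601            0.255
29       43.9        0            5.18               0            14.589            8.49      (col. sum)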
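The column sums in the table can be checked with a short sketch. The data are the ten (weight, fuel) pairs from the example; the variable names are illustrative, and the slope and intercept use the standard deviation-form formulas, which is an assumption about what the following "Calculations" slide shows.

```python
# Ten (car weight, fuel consumption) pairs from the example.
weights = [3.4, 3.8, 4.1, 2.2, 2.6, 2.9, 2.0, 2.7, 1.9, 3.4]
fuel = [5.5, 5.9, 6.5, 3.3, 3.6, 4.6, 2.9, 3.6, 3.1, 4.9]

n = len(weights)
x_bar = sum(weights) / n
y_bar = sum(fuel) / n

# Sums of squared deviations and cross-products (the table's column sums).
sxx = sum((x - x_bar) ** 2 for x in weights)
syy = sum((y - y_bar) ** 2 for y in fuel)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(weights, fuel))

# Assumed standard formulas: slope = Sxy / Sxx, intercept = y_bar - slope * x_bar.
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

print(f"x_bar = {x_bar:.2f}, y_bar = {y_bar:.2f}")
print(f"Sxx = {sxx:.3f}, Syy = {syy:.3f}, Sxy = {sxy:.3f}")
print(f"b1 = {b1:.4f}, b0 = {b0:.4f}")
```

With these sums the slope works out to roughly 8.49 / 5.18 ≈ 1.639 and the intercept to roughly -0.363, which is consistent with the predicted values listed in the deck's residual table.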

33
Calculations

34
Scatterplot with least squares prediction line

35
The Least Squares Line Always Goes Through $(\bar{x}, \bar{y})$
$(\bar{x}, \bar{y}) = (2.9, 4.39)$

36
Using the least squares line for prediction. Fuel consumption of 3,000 lb car? (x=3)

37
Be Careful!
Fuel consumption of a 500 lb car? (x = 0.5)
x = 0.5 is outside the range of the x-data that we used to determine the least squares line
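One way to make this caution concrete is to have the prediction step report when it is extrapolating. A minimal sketch on the car data, assuming the standard least squares formulas; the helper names `fit_line` and `predict` are hypothetical:

```python
weights = [3.4, 3.8, 4.1, 2.2, 2.6, 2.9, 2.0, 2.7, 1.9, 3.4]
fuel = [5.5, 5.9, 6.5, 3.3, 3.6, 4.6, 2.9, 3.6, 3.1, 4.9]


def fit_line(xs, ys):
    """Return (b0, b1) of the least squares line y = b0 + b1*x."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


def predict(x, b0, b1, x_min, x_max):
    """Return (prediction, extrapolating); the flag is True outside [x_min, x_max]."""
    extrapolating = not (x_min <= x <= x_max)
    return b0 + b1 * x, extrapolating


b0, b1 = fit_line(weights, fuel)
for x in (3.0, 0.5):
    y_hat, extrapolating = predict(x, b0, b1, min(weights), max(weights))
    note = " (outside the observed x range)" if extrapolating else ""
    print(f"x = {x}: predicted fuel consumption = {y_hat:.2f}{note}")
```

The line still returns a number at x = 0.5, but the observed weights run only from 1.9 to 4.1, so nothing in the data supports that prediction.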

38
Avoid GIGO! Evaluating the least squares line
1. Create scatterplot. Approximately linear?
2. Calculate $r^2$, the square of the correlation coefficient
3. Examine residual plot

39
$r^2$: The Variation Accounted For
- The square of the correlation coefficient r gives important information about the usefulness of the least squares line

40
$r^2$: important information for evaluating the usefulness of the least squares line
- The square of the correlation coefficient, $r^2$, is the fraction of the variation in y that is explained by the least squares regression of y on x.
- $-1 \le r \le 1$ implies $0 \le r^2 \le 1$
- The square of the correlation coefficient, $r^2$, is the fraction of the variation in y that is explained by the variation in x.

41
Example: car weight, fuel consumption
- x = car weight, y = fuel consumption
- $r^2 = (0.9766)^2 \approx 0.95$
- About 95% of the variation in fuel consumption (y) is explained by the linear relationship between car weight (x) and fuel consumption (y).
- What else affects fuel consumption? Driver, size of engine, tires, road, etc.
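The value of r for the car data can be recomputed from the deviation sums. A minimal sketch, assuming the usual formula $r = S_{xy} / \sqrt{S_{xx} S_{yy}}$ and, as a cross-check, the equivalent view of $r^2$ as one minus the ratio of residual variation to total variation in y; the variable names are illustrative:

```python
import math

weights = [3.4, 3.8, 4.1, 2.2, 2.6, 2.9, 2.0, 2.7, 1.9, 3.4]
fuel = [5.5, 5.9, 6.5, 3.3, 3.6, 4.6, 2.9, 3.6, 3.1, 4.9]

n = len(weights)
x_bar, y_bar = sum(weights) / n, sum(fuel) / n
sxx = sum((x - x_bar) ** 2 for x in weights)
syy = sum((y - y_bar) ** 2 for y in fuel)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(weights, fuel))

# Correlation coefficient and its square.
r = sxy / math.sqrt(sxx * syy)
r_squared = r ** 2

# Cross-check: fraction of variation in y left unexplained by the fitted line.
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(weights, fuel))
r_squared_alt = 1 - sse / syy

print(f"r = {r:.4f}, r^2 = {r_squared:.4f}, 1 - SSE/Syy = {r_squared_alt:.4f}")
```

Both routes give the same number, which is what "fraction of the variation in y explained by the regression" means in practice.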

42
Example: SAT scores

43
SAT scores: calculations

44
SAT scores: result
$r^2 = (-0.868)^2 = 0.7534$
If 57% of NC seniors take the SAT, the predicted mean score is

45
Avoid GIGO! Evaluating the least squares line
1. Create scatterplot. Approximately linear?
2. Calculate $r^2$, the square of the correlation coefficient
3. Examine residual plot

46
Residuals
- residual = observed y - predicted y = $y - \hat{y}$
- Properties of residuals
  1. The residuals always sum to 0 (therefore the mean of the residuals is 0)
  2. The least squares line always goes through the point $(\bar{x}, \bar{y})$

47
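Graphically
[Figure: at $x_i$, the residual $e_i = y_i - \hat{y}_i$ is the vertical distance between the observed $y_i$ and the predicted $\hat{y}_i$ on the line]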

48
Residual Plot
- Residuals help us determine if fitting a least squares line to the data makes sense
- When a least squares line is appropriate, it should model the underlying relationship; nothing interesting should be left behind
- We make a scatterplot of the residuals in the hope of finding… NOTHING!

49
Car Wt/Fuel Consump: Residuals
CAR WT.   FUEL CONSUMP.   Pred FUEL CONSUMP.   Residuals
3.4       5.5             5.209498069           0.290501931
3.8       5.9             5.865096525           0.034903475
4.1       6.5             6.356795367           0.143204633
2.2       3.3             3.242702703           0.057297297
2.6       3.6             3.898301158          -0.29830115
2.9       4.6             4.39                  0.21
2         2.9             2.914903475          -0.01490347
2.7       3.6             4.062200772          -0.46220077
1.9       3.1             2.751003861           0.348996139
3.4       4.9             5.209498069          -0.309498069
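The residual table, and the two properties of residuals stated earlier in the deck, can be reproduced with a short sketch. It assumes the standard least squares formulas for the fitted line; the variable names are illustrative.

```python
weights = [3.4, 3.8, 4.1, 2.2, 2.6, 2.9, 2.0, 2.7, 1.9, 3.4]
fuel = [5.5, 5.9, 6.5, 3.3, 3.6, 4.6, 2.9, 3.6, 3.1, 4.9]

n = len(weights)
x_bar, y_bar = sum(weights) / n, sum(fuel) / n
sxx = sum((x - x_bar) ** 2 for x in weights)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(weights, fuel))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# residual = observed y - predicted y
predicted = [b0 + b1 * x for x in weights]
residuals = [y - y_hat for y, y_hat in zip(fuel, predicted)]

for x, y, y_hat, e in zip(weights, fuel, predicted, residuals):
    print(f"{x:4.1f}  {y:4.1f}  {y_hat:.6f}  {e:+.6f}")

# Property 1: the residuals sum to zero (up to floating-point error).
print(f"sum of residuals = {sum(residuals):.2e}")
# Property 2: the line passes through (x_bar, y_bar).
print(f"line at x_bar = {b0 + b1 * x_bar:.2f}, y_bar = {y_bar:.2f}")
```

Plotting `residuals` against `weights` gives the residual plot the deck describes; a patternless cloud around zero suggests the line is an adequate summary.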

50
Example: Car wt/fuel consump. residual plot page 13

51
SAT Residuals

52
Linear Relationship?

53
Garbage In Garbage Out

54
Residual Plot – Clue to GIGO
