REGRESSION. Regression is a mathematical means for finding the best fitting line for bivariate data.

1 REGRESSION

2 Regression is a mathematical means for finding the best fitting line for bivariate data.

3 Age of 1st computer use (x)    Age requiring glasses (y)
  1                              2
  2                              4
  2.5                            5
  3                              ?
  y = 2x
  slope = rise / run

4 Two numbers tell us all we need to know about ANY line. The slope tells us how STEEP or angled the line is (rise/run). The intercept tells us the value at which the line CROSSES the y-axis. Slope = 1/2

5 Slope = 3

6 Slope = 1

7 Slope = -1
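The rise/run idea can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the helper names `slope_from_points` and `intercept_from_point` are made up here, and the points are the computer-use/glasses pairs from slide 3, read as (x, y).

```python
def slope_from_points(p1, p2):
    """Slope = rise / run between two (x, y) points."""
    (x1, y1), (x2, y2) = p1, p2
    return (y2 - y1) / (x2 - x1)


def intercept_from_point(slope, point):
    """Intercept = where the line crosses the y-axis (the value of y at x = 0)."""
    x, y = point
    return y - slope * x


# (age of 1st computer use, age requiring glasses) from slide 3
points = [(1, 2), (2, 4), (2.5, 5)]

slope = slope_from_points(points[0], points[1])
intercept = intercept_from_point(slope, points[0])

# Fill in the "?" for someone who first used a computer at age 3.
predicted_at_3 = slope * 3 + intercept
print(slope, intercept, predicted_at_3)
```

With these points the slope is 2 and the intercept is 0, so the missing value at x = 3 comes out as 6, matching y = 2x.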

8 Correlation and Regression. Regression finds the best fitting line to a set of data: y = bx + a

9 Correlation finds the strength and direction of the best fitting line to the data. Mean weight = 135 lbs Mean height = 67 inches

10 Regression finds the best fitting line to the data: y = bx + a, where b is the slope and a is the intercept. Mean weight = 135 lbs. Mean height = 67 inches. This looks like something you probably already know: y = mx + b

11 Calculation Example I. A set of paired heights and weights for 5 people:
  Person   Height   Weight
  1        60       180
  2        55       160
  3        54       161
  4        52       156
  5        57       171
  Let’s find the equation of the regression line to predict weight from height. You must know which variable is which! (unlike correlation)

12 Regression summarizes the data into two statistics: the slope, b, and the intercept, a, of the best fitting line: y = bx + a
  The SLOPE: $b = r_{xy}(s_y/s_x)$
  The INTERCEPT: $a = \bar{y} - b\bar{x}$

13 Height (x): $\bar{x} = 55.6$, $s_x = 3.05$
  Weight (y): $\bar{y} = 165.6$, $s_y = 9.76$
  $r = .98$
  The SLOPE: $b = r_{xy}(s_y/s_x) = .98(9.76/3.05) = 3.14$
  The INTERCEPT: $a = \bar{y} - b\bar{x} = 165.6 - 3.14(55.6) = -8.98$
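The hand calculation for the height and weight data can be checked with a short script. This is a minimal sketch using only the Python standard library. The function names `pearson_r` and `regression_line` are illustrative choices, not something from the slides.

```python
from math import sqrt
from statistics import mean, stdev

heights = [60, 55, 54, 52, 57]       # x, the predictor
weights = [180, 160, 161, 156, 171]  # y, the predicted variable


def pearson_r(xs, ys):
    """Pearson correlation from deviations around the two means."""
    mx, my = mean(xs), mean(ys)
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    return sp / sqrt(ss_x * ss_y)


def regression_line(xs, ys):
    """Return (b, a) with b = r * (s_y / s_x) and a = mean(y) - b * mean(x)."""
    r = pearson_r(xs, ys)
    b = r * stdev(ys) / stdev(xs)  # stdev uses N - 1, as on the slides
    a = mean(ys) - b * mean(xs)
    return b, a


b, a = regression_line(heights, weights)
print(f"b = {b:.2f}, a = {a:.2f}")
```

Run without intermediate rounding, this gives a slope of about 3.12 and an intercept of about -8.08. The slide's 3.14 and -8.98 come from rounding r to .98 (and the standard deviations to two decimals) before multiplying, so small differences like this are expected in hand calculations.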

14 This IS the equation of the regression line, the line that BEST FITS our data: y = 3.14x - 8.98
  What if I would like to predict the weight of someone who is 50 inches tall?
  y = 3.14x - 8.98
  y = 3.14(50) - 8.98
  y = 157 - 8.98
  y = 148.02

15

16 How good a predictor is the independent variable? The coefficient of determination, $r^2$, gives the percent of the variation in y that can be explained by the regression equation.

17 Coefficient of determination: $r^2 = (.98)^2 = .9604$, or 96.04%
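To see why r squared can be read as "variation explained", one can compare it with the share of the variation in y that is left over after prediction. The sketch below does this for the height and weight data. The variable names are illustrative, and the slope is computed as SP / SS_x, which is algebraically the same as r(s_y/s_x).

```python
from statistics import mean

heights = [60, 55, 54, 52, 57]
weights = [180, 160, 161, 156, 171]

mean_x, mean_y = mean(heights), mean(weights)
sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(heights, weights))
ss_x = sum((x - mean_x) ** 2 for x in heights)
ss_y = sum((y - mean_y) ** 2 for y in weights)  # total variation in y

# Best fitting line for predicting weight from height.
b = sp / ss_x
a = mean_y - b * mean_x

# Variation NOT explained by the line: squared errors of prediction.
ss_residual = sum((y - (b * x + a)) ** 2 for x, y in zip(heights, weights))

r_squared_from_r = sp ** 2 / (ss_x * ss_y)   # r, squared
r_squared_from_fit = 1 - ss_residual / ss_y  # share of variation explained
print(f"{r_squared_from_r:.4f} {r_squared_from_fit:.4f}")
```

Both routes give the same number, roughly .95 for this data when r is not rounded first. The slide's 96.04% is what you get after rounding r to .98.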

18 Homoscedasticity: assumptions about the error of prediction. [Figure: two scatterplots, one homoscedastic and one not homoscedastic.]

19 Regression summarizes the data into two statistics: the slope, b, and the intercept, a, of the best fitting line.
  Predicting y from x: $b = r_{xy}(s_y/s_x)$, $a = \bar{y} - b\bar{x}$
  Predicting x from y: $b = r_{xy}(s_x/s_y)$, $a = \bar{x} - b\bar{y}$
  This is the same as simply always using y for the predicted variable and x for the predictor variable.

20 Person   Height   Weight
  1        60       180
  2        55       160
  3        54       161
  4        52       156
  5        57       171

21 Disclaimer about regression lines and their equations. If the best fitting line to predict y from x is y = 2x + 1, it does NOT mean that the best fitting line to predict x from y is x = 2y + 1, NOR is it x = .5y - .5.
  For the height and weight data:
  Predicting weight from height: y = 3.12x - 8.08
  Predicting height from weight: y = .305x + 5.12
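The disclaimer can be checked numerically on the height and weight data. The sketch below fits the line in both directions and compares the second fit with what you would get by simply solving the first equation for x. The helper name `least_squares_line` is an illustrative choice.

```python
from statistics import mean

heights = [60, 55, 54, 52, 57]
weights = [180, 160, 161, 156, 171]


def least_squares_line(predictor, predicted):
    """Return (slope, intercept) of the best fitting line for predicted ~ predictor."""
    mp, mq = mean(predictor), mean(predicted)
    sp = sum((p - mp) * (q - mq) for p, q in zip(predictor, predicted))
    ss = sum((p - mp) ** 2 for p in predictor)
    slope = sp / ss
    return slope, mq - slope * mp


weight_from_height = least_squares_line(heights, weights)
height_from_weight = least_squares_line(weights, heights)

# Solving y = b*x + a for x gives x = (1/b)*y - a/b. This is NOT the regression line.
b, a = weight_from_height
algebraic_inverse = (1 / b, -a / b)

print("weight from height:", weight_from_height)
print("height from weight:", height_from_weight)
print("algebraic inverse: ", algebraic_inverse)
```

The fitted height-from-weight slope (about .305) is flatter than the algebraic inverse (about .320). The two only coincide when the correlation is perfect, because the product of the two regression slopes equals r squared.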

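22 Regression is used for prediction. Predicting weight from height from paired measurements of 18 people. [Figure: scatterplot of weight against height with the best fitting line; heights H1, H2 and H3 are mapped to predicted weights W1, W2 and W3, including a height with no observations.]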

23 Examples from everyday life:
  Waiter/Waitress: Collect data on the size of tips along with the time of your shift. Do some times of day bring larger tips or more tips (and get you more money)?
  Baseball: You desperately need to pick players for your fantasy baseball team, and you need a way to know which players will do best. You may be able to figure out which variable (batting average, etc.) predicts performance best (e.g., with multiple regression).
  Correlation will tell you the direction and STRENGTH of the relationship. Regression calculates the best-fitting line to your data, and allows for predictions to be made.

24 An example of regression. Equation of a line: y = bx + a. You are a waiter/waitress at a local restaurant. You would like to predict how much tip money to expect from customers on a 60-degree day.

25 An example: You are a waiter/waitress at a local restaurant. You would like to predict how much tip money to expect from customers on a 60-degree day.
  Temperature (Fahrenheit)   Hourly Tip (dollars)
  65                         12
  56                         9
  73                         20
  75                         15
  45                         6
  $r_{xy} = .91$
  Find the mean of x: $\bar{x} = \sum x / N = 314/5 = 62.8$
  Find the standard deviation (s) of x:
  $\sum x = 314$, $\sum x^2 = 20340$
  $SS = \sum x^2 - (\sum x)^2/N = 20340 - 98596/5 = 620.8$
  $s = \sqrt{SS/(N-1)} = \sqrt{620.8/4} = 12.46$

26 Find the mean of y: $\bar{y} = \sum y / N = 62/5 = 12.4$
  Find the standard deviation (s) of y:
  $\sum y = 62$, $\sum y^2 = 886$
  $SS = \sum y^2 - (\sum y)^2/N = 886 - 3844/5 = 117.2$
  $s = \sqrt{SS/(N-1)} = \sqrt{117.2/4} = 5.41$
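The sum-of-squares shortcut used for both variables can be written once and reused. This is a minimal sketch. The function names `sum_of_squares` and `sample_sd` are illustrative, and the data are the temperature and tip pairs from the example.

```python
from math import sqrt


def sum_of_squares(values):
    """Computational formula: SS = sum(x^2) - (sum(x))^2 / N."""
    n = len(values)
    return sum(v * v for v in values) - sum(values) ** 2 / n


def sample_sd(values):
    """Sample standard deviation: s = sqrt(SS / (N - 1))."""
    return sqrt(sum_of_squares(values) / (len(values) - 1))


temperatures = [65, 56, 73, 75, 45]  # degrees Fahrenheit
tips = [12, 9, 20, 15, 6]            # dollars per hour

for name, data in (("temperature", temperatures), ("tip", tips)):
    print(name, sum(data) / len(data), sum_of_squares(data), round(sample_sd(data), 2))
```

The computational formula gives the same SS as summing squared deviations from the mean. It is simply easier to do by hand because it only needs the two running totals, the sum of the values and the sum of their squares.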

27 $r_{xy} = .91$, $\bar{x} = 62.8$, $s_x = 12.46$, $\bar{y} = 12.40$, $s_y = 5.41$
  To find the slope: $b = r(s_y/s_x) = .91(5.41/12.46) = .40$
  To find the intercept: $a = \bar{y} - b\bar{x} = 12.40 - .40(62.8) = -12.72$
  Equation of the regression line to predict tip from temperature: y = .40x - 12.72

28 Equation of the regression line to predict tip from temperature: y = .40x - 12.72
  Prediction for a 60-degree day:
  y = .40(60) - 12.72
  y = 24 - 12.72
  y = 11.28

29 An example: You are a waiter/waitress at a local restaurant. You would like to predict the temperature outside based on your hourly tip rate. Predict the weather if you earned 25 dollars hourly on a particular day.
  Temperature is now the predicted variable (y) and hourly tip is the predictor (x):
  $r_{yx} = .91$, $\bar{y} = 62.8$, $s_y = 12.46$, $\bar{x} = 12.40$, $s_x = 5.41$
  To find the slope: $b = r(s_y/s_x) = .91(12.46/5.41) = 2.10$
  To find the intercept: $a = \bar{y} - b\bar{x} = 62.80 - 2.10(12.40) = 36.76$
  Equation of the regression line to predict temperature from tip: y = 2.10x + 36.76

30 Equation of the regression line to predict temperature from tip: y = 2.10x + 36.76
  Prediction for a 25-dollar hourly tip:
  y = 2.10(25) + 36.76
  y = 52.5 + 36.76
  y = 89.26
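Both directions of the tip example use the same recipe with the roles of x and y swapped. The sketch below works from the summary statistics only and rounds the slope to two decimals, as the hand calculation does. The function names and the `decimals` parameter are illustrative assumptions, not something from the slides.

```python
def line_from_summary(r, mean_x, s_x, mean_y, s_y, decimals=2):
    """Slope and intercept for predicting y from x, rounded like a hand calculation."""
    b = round(r * s_y / s_x, decimals)
    a = round(mean_y - b * mean_x, decimals)
    return b, a


def predict(line, x):
    """Plug x into y = b*x + a."""
    b, a = line
    return b * x + a


R = 0.91
TEMP_MEAN, TEMP_SD = 62.8, 12.46
TIP_MEAN, TIP_SD = 12.40, 5.41

# Predict tip (y) from temperature (x).
tip_from_temp = line_from_summary(R, TEMP_MEAN, TEMP_SD, TIP_MEAN, TIP_SD)
# Predict temperature (y) from tip (x): same r, statistics swapped.
temp_from_tip = line_from_summary(R, TIP_MEAN, TIP_SD, TEMP_MEAN, TEMP_SD)

print(tip_from_temp, round(predict(tip_from_temp, 60), 2))
print(temp_from_tip, round(predict(temp_from_tip, 25), 2))
```

Note that a 25-dollar hourly tip is higher than any tip in the data (the largest is 20), so the temperature prediction is an extrapolation beyond the observed range.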

31 Another way to think about regression: the z-score equation $z_y = \beta z_x$, where $\beta$ is the same thing as r.

32 An example: You would like to predict how much tip money to expect from customers on a 60-degree day.
  $r_{xy} = .91$, $\bar{x} = 62.8$, $s_x = 12.46$, $\bar{y} = 12.40$, $s_y = 5.41$
  $z_y = \beta z_x$
  $z_x = (60 - 62.8)/12.46 = -.22$
  $z_y = .91(-.22) = -.20$
  Raw score: $y = \bar{y} + z_y s_y = 12.40 + (-.20)(5.41) = 11.32$
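The z-score route and the slope/intercept route are the same prediction written two ways. The sketch below computes both from the summary statistics, without intermediate rounding, so that they can be compared directly. The function names are illustrative.

```python
def predict_via_z(x, r, mean_x, s_x, mean_y, s_y):
    """Standardize x, multiply by r, then convert the z-score back to raw units."""
    z_x = (x - mean_x) / s_x
    z_y = r * z_x
    return mean_y + z_y * s_y


def predict_via_line(x, r, mean_x, s_x, mean_y, s_y):
    """Same prediction using y = b*x + a with b = r*(s_y/s_x), a = mean_y - b*mean_x."""
    b = r * s_y / s_x
    a = mean_y - b * mean_x
    return b * x + a


stats = dict(r=0.91, mean_x=62.8, s_x=12.46, mean_y=12.40, s_y=5.41)

tip_z = predict_via_z(60, **stats)
tip_line = predict_via_line(60, **stats)
print(round(tip_z, 2), round(tip_line, 2))
```

Without rounding, both methods give about 11.29 dollars for a 60-degree day. The hand-calculated 11.28 and 11.32 differ from this only because the slope and the z-scores were rounded to two decimals along the way.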

