Published by Alex Asay. Modified about 1 year ago.

1
Correlation and Regression Statistics 2126

2
Introduction
Means and the like are of course useful, but we might also wonder, "how do variables go together?" IQ is a great example: it goes together with so much stuff.

3

4
A scatterplot
You tend to put the predictor on the x axis and the predicted variable on the y axis, though this is not a hard and fast rule. A scatterplot is a pretty good EDA tool too. Pick an appropriate scale for your axes, then plot the (x, y) pairs.

5
So what does it mean?
If, as one variable increases, the other variable increases, we have a positive association. If, as one goes up, the other goes down, we have a negative association. There could also be no association at all.

6
Linear relationships
BTW, I am only talking about straight-line relationships, not curvilinear ones. Take the Yerkes-Dodson law: as far as the stuff we will talk about is concerned, there is no relationship, yet we know there is one.
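To see why a straight-line measure can miss a curved relationship, here is a minimal sketch. The data are made up for illustration (an inverted U, loosely in the spirit of the Yerkes-Dodson curve) and NumPy is assumed to be available.

```python
import numpy as np

# Made-up illustration: an inverted-U relationship on a symmetric x range.
x = np.linspace(-3, 3, 61)
y = 9 - x**2  # y is completely determined by x, but not by a straight line

# Pearson correlation between x and y
r = np.corrcoef(x, y)[0, 1]
print(f"r = {r:.6f}")  # essentially zero, despite the perfect curved relationship
```

A scatterplot of these points would show the pattern immediately, which is one more reason to plot before computing anything.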

7
The strength is important too
The more the points cluster around a line, the stronger the relationship is. Compare height and weight with height in cm and height in inches: the second pair falls exactly on a line. We need something that ignores the units though, so if I did IQ and your income in real money, or IQ and your income in that worthless stuff they use across the river, the numbers would be the same.

8
The Pearson Product Moment Correlation Coefficient

9
Properties of r
-1 <= r <= 1. The sign indicates ONLY the direction (think of it as going uphill or downhill). |r| indicates the strength. So r = -.77 is a stronger correlation than r = .40.
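These properties can be checked with a short sketch. It uses the standard definition of Pearson's r (the sum of products of deviations, divided by the square root of the product of the summed squared deviations). The function name `pearson_r` and the height and weight numbers are made up for illustration.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation from its textbook definition (illustrative helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()  # deviations from the mean of x
    dy = y - y.mean()  # deviations from the mean of y
    return float((dx * dy).sum() / np.sqrt((dx**2).sum() * (dy**2).sum()))

# Made-up data: height in inches and weight in pounds
height_in = [60, 62, 65, 68, 70, 72, 74]
weight_lb = [115, 120, 140, 155, 160, 180, 190]

r_inches = pearson_r(height_in, weight_lb)

# Changing the units (inches to cm) leaves r unchanged
height_cm = [h * 2.54 for h in height_in]
r_cm = pearson_r(height_cm, weight_lb)

# Flipping the direction of one variable flips only the sign of r
r_flipped = pearson_r(height_in, [-w for w in weight_lb])

print(r_inches, r_cm, r_flipped)
```

The unit change rescales both the numerator and the denominator by the same factor, so it cancels out, which is exactly the "ignores the units" property asked for earlier.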

10
Some examples

11
EDA is KEY

12
Check these out
All of these have the same correlation: r = .7 in each case. Note the problem of outliers. Note the problem of two subpopulations.
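The outlier problem is easy to reproduce. The sketch below uses made-up points (not the data from the slide's plots): a 3-by-3 grid with no association at all, plus one far-away point.

```python
import numpy as np

# Made-up data: a 3x3 grid of points, which has no linear association
x = np.array([1, 2, 3, 1, 2, 3, 1, 2, 3], dtype=float)
y = np.array([1, 1, 1, 2, 2, 2, 3, 3, 3], dtype=float)
r_without = np.corrcoef(x, y)[0, 1]

# Add a single extreme point far from the rest of the cloud
x_out = np.append(x, 20.0)
y_out = np.append(y, 20.0)
r_with = np.corrcoef(x_out, y_out)[0, 1]

print(f"without outlier: r = {r_without:.3f}")
print(f"with outlier:    r = {r_with:.3f}")
```

One point is enough to turn "no association" into what looks like a very strong one, so the number alone cannot stand in for the plot.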

13
Remember this
Correlation is not causation. I said, correlation is not causation. Let me say it again: correlation is not causation. Birth control and the toaster method.

14
Wouldn’t it be nice
If we could predict y from x? You know, like an equation. Remember that in school you would get an equation, plug in the x, and get the y? Well, surprise surprise, there is a method like this in statistics.

15
If we are going to predict with a line
Well, we will make mistakes. We will want to minimize those mistakes.

16
There is a problem, a common problem
Those prediction errors, or residuals (e), sum to 0. Damn. Though guess what we could do... Why, square them of course. So we get a line that minimizes the sum of the squared residuals.
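Here is a minimal sketch of that idea, using the standard closed-form least-squares formulas for a single predictor. The data and the variable names are made up for illustration.

```python
import numpy as np

# Made-up (x, y) pairs
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Closed-form least-squares estimates for one predictor
dx = x - x.mean()
dy = y - y.mean()
slope = (dx * dy).sum() / (dx**2).sum()
intercept = y.mean() - slope * x.mean()

# Residuals: observed y minus predicted y
residuals = y - (intercept + slope * x)
print("sum of residuals:", residuals.sum())  # zero up to rounding error

# Sum of squared residuals, the quantity the line minimizes
sse = (residuals**2).sum()

# Any other line, here one with a slightly different slope, does worse
sse_other = ((y - (intercept + (slope + 0.1) * x))**2).sum()
print("SSE (least squares):", sse)
print("SSE (other line):   ", sse_other)
```

The raw residuals cancel out, which is why they are useless as a measure of fit on their own; squaring makes every miss count.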

17
The line will look like this

18
In general the equation of the line is
ŷ = (y-intercept) + (slope) · x, where ŷ ("y hat") is the predicted y.

19
This might help

20
So...
With a regression line you can predict y from x. Just because it says that some value equals a linear combination of numbers does not mean that there is necessarily a causal link. Don’t go outside the range of the data. Linear only.
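The points above can be sketched in a few lines. The data are made up, and `predict` is a hypothetical helper written for this illustration; it simply refuses to extrapolate beyond the observed x values. The fit uses NumPy's `polyfit` with degree 1.

```python
import numpy as np

# Made-up data: hours studied (x) and exam score (y)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([52.0, 58.0, 61.0, 70.0, 74.0, 79.0])

# Degree-1 polyfit returns the coefficients highest power first: slope, then intercept
slope, intercept = np.polyfit(x, y, 1)

def predict(x_new):
    """Predict y from the fitted line, but only inside the observed x range."""
    if not (x.min() <= x_new <= x.max()):
        raise ValueError(
            f"x = {x_new} is outside the observed range [{x.min()}, {x.max()}]"
        )
    return intercept + slope * x_new

print(predict(4.5))  # fine: 4.5 lies between 1 and 6

try:
    predict(20.0)  # far outside the data, so the line tells us nothing reliable
except ValueError as err:
    print("refused:", err)
```

The guard is a design choice for the sketch, not a rule of regression software; most libraries will happily extrapolate if asked.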
