Published by Alex Asay. Modified over 9 years ago.
1
Correlation and Regression Statistics 2126
2
Introduction: Means etc. are of course useful. We might also wonder, “how do variables go together?” IQ is a great example. It goes together with so much stuff.
4
A scatterplot: You tend to put the predictor on the x axis and the predicted on the y, though this is not a hard and fast rule. A scatterplot is a pretty good EDA tool too, eh. Pick an appropriate scale for your axes. Plot the (x,y) pairs.
5
So what does it mean? If, as one variable increases, the other variable increases, we have a positive association. If, as one goes up, the other goes down, we have a negative association. There could be no association at all.
6
Linear relationships: BTW, I am only talking about straight-line relationships, not curvilinear ones. Take the Yerkes-Dodson law: as far as the stuff we will talk about is concerned, there is no relationship, yet we know there is one.
7
The strength is important too. The more the points cluster around a line, the stronger the relationship is. Think of height and weight, with height in cm vs height in inches. We need something that ignores the units though, so if I did IQ and your income in real money, or IQ and your income in that worthless stuff they use across the river, the numbers would be the same.
8
The Pearson Product Moment Correlation Coefficient
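The formula on this slide did not survive in the transcript, so here is a minimal sketch using the standard textbook definition of r (sum of cross-products of deviations divided by the square root of the product of the sums of squared deviations). The function name `pearson_r` and the height and weight numbers are made up for illustration. It also shows the point about units: converting height from cm to inches leaves r unchanged.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences (textbook definition)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of the deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data, invented for this example only
height_cm = [150, 160, 165, 172, 180, 191]
weight_kg = [52, 58, 63, 70, 77, 90]
height_in = [h / 2.54 for h in height_cm]  # same heights, different units

r_cm = pearson_r(height_cm, weight_kg)
r_in = pearson_r(height_in, weight_kg)
print(round(r_cm, 4), round(r_in, 4))  # the two values agree
```

Rescaling x multiplies the numerator and the denominator by the same factor, which is why the units cancel out.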
9
Properties of r: -1.00 <= r <= +1.00. The sign indicates ONLY the direction (think of it as going uphill or downhill). |r| indicates the strength. So, r = -.77 is a stronger correlation than r = .40.
10
Some examples
11
EDA is KEY
12
Check these out. All of these have the same correlation: r = .7 in each case. Note the problem of outliers. Note the problem of two subpopulations.
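The plots from this slide are not in the transcript, so the following is only a sketch of the outlier problem with invented numbers, not the slide's data. Six points with essentially no linear relationship get one extreme point added, and r jumps from near zero to a large positive value. The helper `pearson_r` is a hypothetical name that uses the standard textbook formula.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences (textbook definition)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Invented data: a cloud with no real linear trend
xs = [1, 2, 3, 4, 5, 6]
ys = [3, 1, 4, 2, 4, 1]
r_without = pearson_r(xs, ys)

# Same cloud plus a single far-away point
r_with = pearson_r(xs + [15], ys + [12])

print(round(r_without, 2), round(r_with, 2))
```

One point is doing nearly all the work in the second correlation, which is exactly the kind of thing a scatterplot shows at a glance and a bare r hides.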
13
Remember this: correlation is not causation. I said, correlation is not causation. Let me say it again, correlation is not causation. Birth control and the toaster method.
14
Wouldn’t it be nice if we could predict y from x? You know, like an equation. Remember that in school, you would get an equation, plug in the x and get the y? Well, surprise surprise, there is a method like this in statistics.
15
If we are going to predict with a line, well, we will make mistakes. We will want to minimize those mistakes.
16
There is a problem, a common problem: those prediction errors or residuals (e) sum to 0. Damn. Though guess what we could do… Why, square them of course. So we get a line that minimizes the sum of squared residuals.
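To make the squared-residuals idea concrete, here is a small sketch under stated assumptions: the data are invented, the names `fit_line` and `sse` are hypothetical, and the slope and intercept come from the standard closed-form least-squares solution (which the slides do not spell out). It checks that the residuals sum to about 0 and that nudging the line in either direction makes the sum of squared residuals bigger.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope via the standard closed-form solution."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def sse(xs, ys, intercept, slope):
    """Sum of squared residuals for the line y = intercept + slope * x."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Invented data for illustration
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_line(xs, ys)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

best = sse(xs, ys, intercept, slope)
shifted = sse(xs, ys, intercept + 0.1, slope)   # line moved up a bit
tilted = sse(xs, ys, intercept, slope + 0.05)   # line made a bit steeper

print(sum(residuals))         # essentially 0, up to floating-point error
print(best, shifted, tilted)  # best is the smallest of the three
```

The raw residuals cancel out for the fitted line, so their plain sum cannot rank candidate lines; the squared version can.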
17
The line will look like this
18
In general the equation of the line is ŷ = a + bx, where ŷ (Y hat) is the predicted y, a is the Y intercept, and b is the slope.
19
This might help
20
So… With a regression line you can predict y from x. Just because it says that some value = a linear combination of numbers, it does not mean that there is necessarily a causal link. Don’t go outside the range of the data. Linear only.
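The "don't go outside the range" advice can be sketched in code. Everything here is hypothetical: the coefficients, the observed x range, and the `make_predictor` helper are made up for illustration. The predictor simply refuses to extrapolate.

```python
def make_predictor(intercept, slope, x_min, x_max):
    """Return a function that predicts y only inside the observed x range."""
    def predict(x):
        if not (x_min <= x <= x_max):
            raise ValueError(
                f"x={x} is outside the observed range [{x_min}, {x_max}]"
            )
        return intercept + slope * x
    return predict

# Hypothetical fitted line and hypothetical range of the x values it was fit on
predict = make_predictor(intercept=0.05, slope=1.99, x_min=1, x_max=5)

print(predict(3))  # inside the range, so we get a prediction (about 6.02)

try:
    predict(10)    # outside the range, so the predictor refuses
except ValueError as err:
    print("refused:", err)
```

Raising an error is one design choice among several; a warning would also do. The point is that the line says nothing trustworthy about x values it never saw.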