Download presentation
Presentation is loading. Please wait.
1
Correlation and Regression
2
Relationships between variables Example: Suppose that you notice that the more you study for an exam, the better your score typically is. This suggests that there is a relationship between study time and test performance. We call this relationship a correlation.
3
Relationships between variables Properties of a correlation Form (linear or non-linear) Direction (positive or negative) Strength (none, weak, strong, perfect) To examine this relationship you should: Make a scatterplot Compute the Correlation Coefficient
4
Scatterplot Plots one variable against the other Useful for “ seeing ” the relationship Form, Direction, and Strength Each point corresponds to a different individual Imagine a line through the data points
5
Scatterplot Hours study X Exam perf. Y 66 12 56 34 32 Y X 1 2 3 4 5 6 123456
6
Correlation Coefficient A numerical description of the relationship between two variables For relationship between two continuous variables we use Pearson ’ s r It basically tells us how much our two variables vary together As X goes up, what does Y typically do X , Y X , Y X , Y
7
Form Non-linear Linear
8
Direction NegativePositive As X goes up, Y goes up X & Y vary in the same direction Positive Pearson’s r As X goes up, Y goes down X & Y vary in opposite directions Negative Pearson’s r Y X Y X
9
Strength Zero means “ no relationship ”. The farther the r is from zero, the stronger the relationship The strength of the relationship Spread around the line (note the axis scales)
10
Strength r = 1.0 “perfect positive corr.” r = -1.0 “perfect negative corr.” r = 0.0 “no relationship” 0.0+1.0 The farther from zero, the stronger the relationship
11
Strength 0.0+1.0 -.8.5 Which relationship is stronger? Rel A, -0.8 is stronger than +0.5 r = -0.8 Rel A r = 0.5 Rel B
12
Y X 1 2 3 4 5 6 123456 Regression Compute the equation for the line that best fits the data points Y = (X)(slope) + (intercept) 2.0 Change in Y Change in X = slope 0.5
13
Y X 1 2 3 4 5 6 123456 Regression Can make specific predictions about Y based on X Y = (X)(.5) + (2.0) X = 5 Y = ? Y = (5)(.5) + (2.0) Y = 2.5 + 2 = 4.5 4.5
14
Regression Also need a measure of error Y = X(.5) + (2.0) + error Y X 1 2 3 4 5 6 123456 Y X 1 2 3 4 5 6 123456 Same line, but different relationships (strength difference)
15
Cautions with correlation & regression Don ’ t make causal claims Don ’ t extrapolate Extreme scores (outliers) can strongly influence the calculated relationship
Similar presentations
© 2024 SlidePlayer.com Inc.
All rights reserved.