Published by Alban Wilkerson. Modified over 9 years ago.
1
Regression and Correlation
2
Bivariate Analysis
Can we say whether there is a relationship between the number of hours you spend on Facebook and the number of friends you have?
In this question, we have two seemingly unrelated variables:
– Number of hours spent on Facebook
– Number of Facebook friends
3
Bivariate Analysis
Univariate | Bivariate
Single variable | Two variables
Does not deal with causes or relationships | Deals with causes or relationships
Main purpose is to describe | Main purpose is to explain
Central tendency, dispersion, frequency distributions, graphs | Analysis of two variables simultaneously: correlations, comparisons, relationships, causes, explanations, independent and dependent variables
4
Bivariate Analysis
The question is:
– How can we find a relationship between the two (assuming that such a relationship exists)?
5
Relative Variation/Dispersion
Unitless; used to compare one data set to another
Coefficient of Variation (CV)
– Ratio of the SD to the mean, expressed as a percentage
– The higher the value, the more the data varies (is dispersed)
Z Score
– Measures how many SDs an observation lies above or below the mean
Refer to your handout for the formulae.
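The handout formulae are not reproduced here. As a minimal sketch, assuming the usual definitions (CV = SD / mean × 100 and z = (x - mean) / SD) and the sample standard deviation, both measures could be computed as follows. The function names and the data are illustrative, not taken from the slides.

```python
import statistics

def coefficient_of_variation(data):
    """CV in percent: sample SD divided by the mean (assumes a nonzero mean)."""
    return statistics.stdev(data) / statistics.mean(data) * 100

def z_score(x, data):
    """Number of sample SDs that x lies above (+) or below (-) the mean."""
    return (x - statistics.mean(data)) / statistics.stdev(data)

# Illustrative data only (hypothetical hours per day).
hours = [2, 4, 4, 4, 5, 5, 7, 9]
cv = coefficient_of_variation(hours)
z = z_score(9, hours)
print(f"CV = {cv:.1f}%, z-score of 9 = {z:.2f}")
```

Because both quantities divide by a value in the data's own units, the units cancel, which is what makes them comparable across data sets.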
6
Covariance
7
Prepared by: Rose Ann V. Sale
8
Covariance
Measure of the relationship between two variables
Computed as below:
9
Example 1
Covariance: 278.243
10
Example 2
GPA | TV in hours per week
3.1 | 14
2.4 | 10
2.0 | 20
3.8 | 7
2.2 | 25
3.4 | 9
2.9 | 15
3.2 | 13
3.7 | 4
3.5 | 21
Covariance: -2.64
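The slide's covariance formula is not reproduced here. As a sketch, assuming the sample covariance (the sum of the products of deviations from the means, divided by n - 1), the Example 2 value can be recomputed from the GPA and TV data. The function name is illustrative.

```python
def sample_covariance(xs, ys):
    """Sum of (x - mean_x) * (y - mean_y), divided by n - 1 (assumed convention)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

# Example 2 data: GPA and TV hours per week.
gpa = [3.1, 2.4, 2.0, 3.8, 2.2, 3.4, 2.9, 3.2, 3.7, 3.5]
tv_hours = [14, 10, 20, 7, 25, 9, 15, 13, 4, 21]

cov = sample_covariance(gpa, tv_hours)
print(round(cov, 2))  # -2.64
```

Dividing by n instead of n - 1 would give about -2.38, so the slide's -2.64 is consistent with the n - 1 convention assumed above.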
11
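Interpreting Covariance
cov(X,Y) > 0: X and Y are positively correlated
cov(X,Y) < 0: X and Y are inversely correlated
cov(X,Y) = 0: X and Y are uncorrelated (no linear relationship). Independent variables have zero covariance, but zero covariance alone does not imply independence.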
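A note of caution on the zero case: zero covariance only rules out a linear relationship. The sketch below uses made-up data in which Y is completely determined by X (Y = X squared), yet the sample covariance, assumed here to divide by n - 1, is exactly zero.

```python
def sample_covariance(xs, ys):
    """Sample covariance with the n - 1 divisor (assumed convention)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

# Hypothetical data: Y depends entirely on X, but not linearly.
xs = [-2, -1, 0, 1, 2]
ys = [x * x for x in xs]

cov = sample_covariance(xs, ys)
print(cov)  # 0.0
```

The positive and negative halves of the parabola cancel each other out, so covariance (and correlation) cannot see the relationship.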
12
Correlation
14
Correlation
Measures the relative strength of the linear relationship between two variables
– Unitless
– Ranges between -1 and 1
– The closer to -1, the stronger the negative linear relationship
– The closer to 1, the stronger the positive linear relationship
– The closer to 0, the weaker the linear relationship
15
Correlation
Pearson r
– Used for quantitative data (remember interval & ratio levels?)
– Computed as below:
Spearman
– Pearson r computed on ranks; used for ordinal (ranked qualitative) data, and applies to quantitative data as well
– Computed as below (if your data has no ties):
In case of ties, apply the Pearson r formula to the ranked data (a tied value's rank = the average of the tied positions in ascending order).
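The tie rule can be sketched in code. This is a minimal illustration, assuming 1-based positions in ascending order; the function name and the sample values are hypothetical.

```python
def average_ranks(values):
    """Rank values in ascending order (1-based); ties share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Average of the 1-based positions start + 1 .. end + 1.
        shared_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks

ranks = average_ranks([10, 20, 20, 30])
print(ranks)  # [1.0, 2.5, 2.5, 4.0]
```

The two tied values occupy positions 2 and 3, so each receives rank 2.5. Pearson r applied to such ranks gives the Spearman coefficient even when ties are present.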
16
Scatter Plots of Data with Various Correlation Coefficients
[Figure: three scatter plots of Y against X, with r = -1, r = -.6, and r = 0]
17
Scatter Plots of Data with Various Correlation Coefficients
[Figure: three scatter plots of Y against X, with r = +.3, r = +1, and r = 0]
18
Linear Correlation
[Figure: scatter plots of Y against X contrasting linear relationships with curvilinear relationships]
19
Linear Correlation
[Figure: scatter plots of Y against X contrasting strong relationships with weak relationships]
20
Linear Correlation
[Figure: scatter plots of Y against X showing no relationship]
21
Example 1
r_xy = 0.934
= 0.786
22
Example 2
GPA | TV in hours per week
3.1 | 14
2.4 | 10
2.0 | 20
3.8 | 7
2.2 | 25
3.4 | 9
2.9 | 15
3.2 | 13
3.7 | 4
3.5 | 21
Pearson r_xy = -0.6284
Spearman rank correlation = -0.636
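Both Example 2 values can be reproduced from the data. This is a minimal sketch, assuming the first value is Pearson r and the second is Spearman's rank correlation from the no-ties formula 1 - 6 * sum(d^2) / (n * (n^2 - 1)), where d is the difference between paired ranks. Function names are illustrative.

```python
import math

def pearson_r(xs, ys):
    """Sum of cross-deviations divided by the root of the product of squared deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def spearman_no_ties(xs, ys):
    """Rank-difference formula; valid only when neither variable has tied values."""
    n = len(xs)
    rank_x = {v: i + 1 for i, v in enumerate(sorted(xs))}
    rank_y = {v: i + 1 for i, v in enumerate(sorted(ys))}
    d_squared = sum((rank_x[x] - rank_y[y]) ** 2 for x, y in zip(xs, ys))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Example 2 data: GPA and TV hours per week (no ties in either column).
gpa = [3.1, 2.4, 2.0, 3.8, 2.2, 3.4, 2.9, 3.2, 3.7, 3.5]
tv_hours = [14, 10, 20, 7, 25, 9, 15, 13, 4, 21]

r = pearson_r(gpa, tv_hours)
rho = spearman_no_ties(gpa, tv_hours)
print(round(r, 4), round(rho, 3))  # -0.6284 -0.636
```

Both coefficients are moderately negative: in this sample, students who watch more TV tend to have lower GPAs, although the relationship is far from perfect.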
23
Linear Regression
24
In correlation, the two variables are treated as equals. In regression, one variable is considered the independent (predictor) variable X and the other the dependent (outcome) variable Y.
25
Linear Regression
Independent variable: cause
Dependent variable: effect
Linear regression is a method of predicting the value of the dependent variable Y from the value of the independent variable X.
26
What is “Linear”?
Remember this: y = mx + b
[Figure: a straight line with slope m and y-intercept b]
27
Line of Regression
Prediction line or line of “best fit”
– This is where you find the expected value of one variable given the other.
Data points tend to cluster about this line (-1 < r < 1)
General form given below:
Can you give the y = mx + b equivalent of the above?
28
Standard Error of Estimate
Typical distance between your actual measurements and the values predicted by your line of regression
Calculated as follows:
Interpreted as “we can expect that 68% of the time the true value of Y will lie within one standard error of the line of regression” (assuming roughly normally distributed residuals).
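The slide's formula is not reproduced here. As a sketch, assuming the common convention of dividing the sum of squared residuals by n - 2 before taking the square root, the standard error of estimate could be computed as follows. The data are the five ordered pairs from the regression Example 1 in this deck; the function names are illustrative.

```python
import math

def fit_line(xs, ys):
    """Least-squares slope and intercept of the line predicting y from x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def standard_error_of_estimate(xs, ys):
    """Square root of (sum of squared residuals / (n - 2)); the n - 2 divisor is assumed."""
    slope, intercept = fit_line(xs, ys)
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(sse / (len(xs) - 2))

xs = [2, 5, 9, 12, 13]
ys = [3, 5, 13, 7, 14]
se = standard_error_of_estimate(xs, ys)
print(round(se, 2))  # 3.62
```

Under this convention, roughly two thirds of observed Y values would be expected to fall within about 3.62 units of the fitted line, if the residuals are approximately normal.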
29
Example 1
Given the following ordered pairs: (2,3) (5,5) (9,13) (12,7) (13,14)
– Draw the scatter graph
– Find the formula for the regression line
– Draw an approximation of the regression line
– Compute the expected value (Ŷ) given X = 9
30
Example 1 Answer
Ŷ = 0.8018x + 1.8249
(Ŷ | x=9) = 9.0411
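The answer can be checked with an ordinary least-squares fit. This is a minimal sketch, assuming the slope is the sum of cross-deviations divided by the sum of squared x-deviations, and the intercept is mean(y) - slope * mean(x). The function name is illustrative.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept of the line predicting y from x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Ordered pairs from Example 1.
xs = [2, 5, 9, 12, 13]
ys = [3, 5, 13, 7, 14]

slope, intercept = fit_line(xs, ys)
predicted = intercept + slope * 9
print(round(slope, 4), round(intercept, 4))  # 0.8018 1.8249
print(round(predicted, 2))                   # 9.04
```

With unrounded coefficients the prediction at x = 9 is about 9.0415; the slide's 9.0411 comes from plugging in the coefficients already rounded to four decimals.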
31
Example 2
– Compute the regression line
– Predict the number of hours spent on Facebook by a person who has 400 Facebook friends
32
Example 2 Answer
y = 0.0065x - 0.13
(y | x=400) = 2.47