Correlation and Regression 1. Bivariate data When measurements on two characteristics are to be studied simultaneously because of their interdependence,


1 Correlation and Regression

2 Bivariate data: When measurements on two characteristics are to be studied simultaneously because of their interdependence, we get observations in pairs. Such a set of paired data is called bivariate data.

3 COVARIANCE: While variance measures the variation among the observations in a data set, covariance measures the joint variation among the pairs of observations in a bivariate data set, i.e. its sign indicates the direction of the linear relationship between the two variables. But its magnitude depends on the units of measurement, so it cannot be used to compare the strength of linear relationships between different pairs of variables. Hence, there is a need to study the concept of correlation.
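To illustrate why covariance alone cannot be used for comparison, here is a minimal Python sketch. It assumes the population form of covariance (dividing by n; the slide does not say whether n or n - 1 is intended), and the height and weight figures are made-up illustrative data, not taken from the presentation.

```python
def covariance(xs, ys):
    """Population covariance: mean product of deviations from the means.

    Assumption: divides by n. The sample version divides by n - 1.
    """
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("xs and ys must be non-empty and of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


# Hypothetical data: heights in metres, weights in kilograms.
heights_m = [1.50, 1.60, 1.70, 1.80]
weights_kg = [50.0, 58.0, 66.0, 74.0]

cov_metres = covariance(heights_m, weights_kg)

# The same heights expressed in centimetres.
heights_cm = [h * 100 for h in heights_m]
cov_centimetres = covariance(heights_cm, weights_kg)

print(cov_metres, cov_centimetres)
```

The relationship between height and weight is identical in both cases, yet the covariance grows by a factor of 100 when the unit changes. A unit-free measure is needed to compare relationships.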

4 CORRELATION: Correlation analysis: When changes in one variable are accompanied by changes in the other variable, the two variables are said to be correlated.

5 Types of correlation: Positive (perfect or imperfect), Zero, Negative (perfect or imperfect). An imperfect correlation may be strong or weak.

6 Methods of assessing correlation: SCATTER DIAGRAM. A scatter diagram is a graphical method of assessing the correlation between two variables.

12 Correlation is measured with the help of the correlation coefficient r. Its value always lies between -1 and +1, i.e. -1 ≤ r ≤ 1.

13 Types of correlation and the corresponding values of r:
Positive correlation (0 < r ≤ 1): perfect positive when r = 1; imperfect positive when 0 < r < 1, weak as r tends to 0 and strong as r tends to 1.
No correlation: r = 0.
Negative correlation (-1 ≤ r < 0): perfect negative when r = -1; imperfect negative when -1 < r < 0, weak as r tends to 0 and strong as r tends to -1.

14 Karl Pearson’s Coefficient of Correlation: Karl Pearson defined the coefficient of correlation as a measure of the intensity or degree of linear relationship between two variables. Let X and Y be the two variables with n pairs of observations, represented as (x_i, y_i), i = 1, 2, …, n.
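The slide's own formula did not survive in this transcript. As a sketch, the standard product-moment form is r = Σ(x_i - x̄)(y_i - ȳ) / sqrt(Σ(x_i - x̄)^2 · Σ(y_i - ȳ)^2), which can be computed in Python as below. The function name `pearson_r` and the sample data are illustrative assumptions, not part of the presentation.

```python
import math


def pearson_r(xs, ys):
    """Pearson's product-moment correlation coefficient for paired data."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two pairs of equal-length data")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical paired observations (x_i, y_i), i = 1, ..., 5.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
print(round(r, 4))
```

Because the numerator and denominator carry the same units, r is unit-free, so rescaling either variable by a positive factor leaves it unchanged.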

15 Spurious Correlation: When the correlation coefficient indicates a strong relationship but no logical relationship exists between the two variables, such a correlation is called spurious correlation. Ex. The number of students receiving graduate degrees every year and the number of auto accidents in the city.

16 Coefficient of Determination: The square of the correlation coefficient r, written r^2, is known as the coefficient of determination. It indicates the extent to which the variation in one variable is explained by the variation in the other. Ex: If the correlation coefficient between x and y is 0.9, the coefficient of determination is 0.81. This means that 81% of the variation in y is explained by the variation in x, and the remaining 19% is due to other factors. The quantity 1 - r^2 is referred to as the coefficient of nondetermination. The square root of the coefficient of nondetermination is known as the coefficient of alienation.
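The arithmetic of the example with r = 0.9 can be checked with a short Python sketch. The function name `determination_summary` is a hypothetical helper introduced only for this illustration.

```python
import math


def determination_summary(r):
    """Return (r^2, 1 - r^2, sqrt(1 - r^2)) for a correlation coefficient r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    determination = r ** 2                     # share of variation explained
    nondetermination = 1.0 - determination     # share left unexplained
    alienation = math.sqrt(nondetermination)   # coefficient of alienation
    return determination, nondetermination, alienation


det, nondet, alien = determination_summary(0.9)
print(round(det, 2), round(nondet, 2), round(alien, 4))
```

For r = 0.9 this gives a coefficient of determination of 0.81, a coefficient of nondetermination of 0.19, and a coefficient of alienation of about 0.436.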

17 Rank Correlation: Sometimes the data on two variables cannot be measured quantitatively. In such situations the observations can be ranked. Karl Pearson’s correlation coefficient is not an appropriate measure for qualitative data. Hence Spearman defined a coefficient of correlation for qualitative data, called Spearman’s Rank Correlation Coefficient. E.g. ranks given by judges in a beauty contest.

18 Spearman’s Rank Correlation Coefficient (R): R = 1 - 6 Σd_i^2 / [n(n^2 - 1)], where d_i = X_i - Y_i, X_i: rank assigned by Judge 1, Y_i: rank assigned by Judge 2, n: number of pairs of observations.
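A minimal Python sketch of the rank correlation, assuming the standard formula R = 1 - 6 Σd_i^2 / [n(n^2 - 1)] and untied ranks 1..n. The function name `spearman_rank_r` and the judges' rankings are made-up for illustration.

```python
def spearman_rank_r(ranks_x, ranks_y):
    """Spearman's rank correlation for two lists of untied ranks 1..n."""
    n = len(ranks_x)
    if n < 2 or n != len(ranks_y):
        raise ValueError("need at least two pairs of equal-length ranks")
    # d_i = X_i - Y_i, the difference between the two ranks of item i.
    sum_d_squared = sum((x - y) ** 2 for x, y in zip(ranks_x, ranks_y))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))


# Hypothetical ranks given to five contestants by two judges.
judge_1 = [1, 2, 3, 4, 5]
judge_2 = [2, 1, 4, 3, 5]

R = spearman_rank_r(judge_1, judge_2)
print(round(R, 2))
```

Here the squared differences sum to 4, so R = 1 - 24/120 = 0.8, indicating that the two judges largely agree. Identical rankings give R = 1 and exactly reversed rankings give R = -1.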

19 Case of Tied Ranks: A correction factor m(m^2 - 1)/12 has to be added to Σd_i^2 for each tie, where m is the number of individuals sharing the tied rank.
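The tie correction can be sketched in Python as follows. This assumes the commonly used convention that tied individuals each receive the average of the ranks they occupy, and that the correction m(m^2 - 1)/12 is added to Σd_i^2 once per group of ties in either variable. The helper names and the scores are hypothetical.

```python
from collections import Counter


def average_ranks(values):
    """Rank values from largest (rank 1) down; ties share the mean rank."""
    counts = Counter(values)
    first_position = {}
    for position, value in enumerate(sorted(values, reverse=True), start=1):
        first_position.setdefault(value, position)
    # A group of m tied values occupies positions p .. p + m - 1,
    # whose mean is p + (m - 1) / 2.
    return [first_position[v] + (counts[v] - 1) / 2 for v in values]


def tie_correction(values):
    """Sum of m(m^2 - 1)/12 over every group of m tied values."""
    return sum(m * (m ** 2 - 1) / 12 for m in Counter(values).values() if m > 1)


def spearman_with_ties(xs, ys):
    """Rank correlation with the tie correction added to the sum of d^2."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two pairs of equal-length data")
    ranks_x, ranks_y = average_ranks(xs), average_ranks(ys)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(ranks_x, ranks_y))
    corrected = sum_d_squared + tie_correction(xs) + tie_correction(ys)
    return 1 - 6 * corrected / (n * (n ** 2 - 1))


# Hypothetical scores from two judges; judge X gives two contestants 80.
scores_x = [90, 80, 80, 70]
scores_y = [85, 75, 80, 60]

R_tied = spearman_with_ties(scores_x, scores_y)
print(average_ranks(scores_x), round(R_tied, 2))
```

In this example the two tied contestants each receive rank 2.5, Σd_i^2 is 0.5, and the single tie of m = 2 adds a correction of 0.5, giving R = 1 - 6(1.0)/60 = 0.9.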

