Presentation on theme: "Correlation Correlation is the relationship between two quantitative variables. Correlation coefficient (r) measures the strength of the linear relationship."— Presentation transcript:
Correlation Correlation is the relationship between two quantitative variables. The correlation coefficient (r) measures the strength of the linear relationship between two variables. If, for two variables X and Y, SS(X) and SS(Y) stand for their sums of squares and SP(X,Y) for their sum of products, then r is defined as $r = \frac{SP(X,Y)}{\sqrt{SS(X)\,SS(Y)}}$, where $SP(X,Y) = \sum xy - \frac{(\sum x)(\sum y)}{n}$, $SS(X) = \sum x^2 - \frac{(\sum x)^2}{n}$ and $SS(Y) = \sum y^2 - \frac{(\sum y)^2}{n}$.
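As a rough illustration of this definition, the Python sketch below computes r from the three sums, assuming the usual computational forms SS = Σx² − (Σx)²/n and SP = Σxy − (Σx)(Σy)/n. The function names and the sample data are assumptions made for this example and do not come from the slides.

```python
import math

def sum_of_squares(values):
    """SS = sum(v^2) - (sum(v))^2 / n."""
    n = len(values)
    return sum(v * v for v in values) - sum(values) ** 2 / n

def sum_of_products(xs, ys):
    """SP = sum(x*y) - sum(x) * sum(y) / n."""
    n = len(xs)
    return sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n

def correlation(xs, ys):
    """Pearson r = SP(X,Y) / sqrt(SS(X) * SS(Y))."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    return sum_of_products(xs, ys) / math.sqrt(sum_of_squares(xs) * sum_of_squares(ys))

# Hypothetical data: y is exactly 2x, so the linear relationship is perfect.
xs = [1, 2, 3]
ys = [2, 4, 6]
print(round(correlation(xs, ys), 3))  # 1.0
```

Note that r is undefined when either variable is constant, because its sum of squares is then zero; this sketch does not guard against that case.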
Correlation The statistical significance of r is tested using a t-test. The null hypothesis is that there is no linear relationship between y and x in the whole population, that is, the population correlation ρ is zero. The hypotheses for this test are: H0: ρ = 0, Ha: ρ ≠ 0. The test statistic is $t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}$ with df = n − 2. We refer this value to the t distribution table with df = n − 2 to find the p-value. A low p-value (less than 0.05, for example) means that there is evidence to reject the null hypothesis in favor of the alternative hypothesis, that is, there is a statistically significant relationship between the two variables.
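The test can be sketched in Python as follows, assuming the standard statistic t = r·√(n − 2) / √(1 − r²) on n − 2 degrees of freedom. Rather than computing a p-value, this sketch compares |t| with a critical value read from a t table, as the slide describes. The function names and the values r = 0.6, n = 27 are illustrative assumptions, not taken from the slides.

```python
import math

def t_statistic(r, n):
    """t = r * sqrt(n - 2) / sqrt(1 - r^2), with n - 2 degrees of freedom."""
    if n < 3:
        raise ValueError("need at least 3 observations")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

def is_significant(r, n, t_critical):
    """Two-sided test: reject H0 when |t| exceeds the tabulated critical value."""
    return abs(t_statistic(r, n)) > t_critical

# Illustrative values (not from the slides): r = 0.6 from n = 27 pairs, df = 25.
# 2.060 is the two-sided 5% critical value for df = 25 from a standard t table.
print(round(t_statistic(0.6, 27), 2))                # 3.75
print(is_significant(0.6, 27, t_critical=2.060))     # True
```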
Correlation The height and weight of 7 students are given below. Calculate the coefficient of correlation (r) between height and weight.

Height (in inches): 65, 66, 67, 68, 69, 70, 71
Weight (in pounds): 67, 68, 66, 69, 72, 72, 69

x     y     x²      y²      xy
65    67    4225    4489    4355
66    68    4356    4624    4488
67    66    4489    4356    4422
68    69    4624    4761    4692
69    72    4761    5184    4968
70    72    4900    5184    5040
71    69    5041    4761    4899
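One way to work through this example is sketched below, assuming the computational formulas SP(X,Y) = Σxy − (Σx)(Σy)/n and SS(X) = Σx² − (Σx)²/n. The column totals are computed from the table; they are not printed on the original slide.

```latex
\begin{align*}
n &= 7, \quad \sum x = 476, \quad \sum y = 483, \quad
  \sum x^2 = 32396, \quad \sum y^2 = 33359, \quad \sum xy = 32864 \\
SP(X,Y) &= 32864 - \frac{476 \times 483}{7} = 32864 - 32844 = 20 \\
SS(X) &= 32396 - \frac{476^2}{7} = 32396 - 32368 = 28 \\
SS(Y) &= 33359 - \frac{483^2}{7} = 33359 - 33327 = 32 \\
r &= \frac{20}{\sqrt{28 \times 32}} = \frac{20}{\sqrt{896}} \approx 0.668
\end{align*}
```

With n = 7 this gives t ≈ 2.01 on 5 degrees of freedom, which is below the two-sided 5% critical value of 2.571, so this correlation would not be judged statistically significant at the 0.05 level.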
Linear regression Linear regression is used to develop an equation (a linear regression line) for predicting a value of the dependent variable given a value of the independent variable. The regression line is the line described by the regression equation: Y = a + bX, where X is the independent variable, Y is the dependent variable, a is the intercept and b is the slope of the line.
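The slide does not show how a and b are obtained. Under the usual least-squares criterion, and reusing the SS and SP notation from the correlation slides, the estimates can be written as follows. This is the standard result, stated here as an assumption about the method the presentation intends.

```latex
\begin{align*}
b &= \frac{SP(X,Y)}{SS(X)}
   = \frac{\sum xy - \frac{(\sum x)(\sum y)}{n}}{\sum x^2 - \frac{(\sum x)^2}{n}} \\
a &= \bar{y} - b\,\bar{x}
\end{align*}
```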
Linear regression - Exercise

x     y
2     4
4     2
6     5
8     9
10    3
12    11
14    8

x     x²     y     y²     xy
2     4      4     16     8
4     16     2     4      8
6     36     5     25     30
8     64     9     81     72
10    100    3     9      30
12    144    11    121    132
14    196    8     64     112
56    560    42    320    392    (totals)

y = a + bx; y = 2.0 + 0.50x. Now we can draw the best-fitting line with this equation.
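The exercise can be checked with a short Python sketch, assuming the least-squares estimates b = SP(X,Y) / SS(X) and a = ȳ − b·x̄. The function name `fit_line` and the prediction point x = 9 are assumptions made for this example; the data are the seven (x, y) pairs from the exercise.

```python
def fit_line(xs, ys):
    """Least-squares intercept a and slope b for y = a + b*x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    sum_x, sum_y = sum(xs), sum(ys)
    sp_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    ss_x = sum(x * x for x in xs) - sum_x ** 2 / n
    if ss_x == 0:
        raise ValueError("x values must not all be equal")
    b = sp_xy / ss_x                  # slope
    a = sum_y / n - b * sum_x / n     # intercept: mean(y) - b * mean(x)
    return a, b

# Data from the exercise.
xs = [2, 4, 6, 8, 10, 12, 14]
ys = [4, 2, 5, 9, 3, 11, 8]

a, b = fit_line(xs, ys)
print(a, b)        # 2.0 0.5
print(a + b * 9)   # predicted y at the hypothetical point x = 9: 6.5
```

The intermediate values match the totals row of the table: SP(X,Y) = 392 − 56·42/7 = 56 and SS(X) = 560 − 56²/7 = 112, so b = 0.5 and a = 6 − 0.5·8 = 2.0.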