Linear regression and correlation Learning outcomes This work will help you 1.To draw a scatter diagrams. 2.Draw linear regression lines y on x and x on.


1 Linear regression and correlation
Learning outcomes
This work will help you to:
1. Draw scatter diagrams.
2. Draw the linear regression lines y on x and x on y and work out their equations.
3. Identify the types of correlation and calculate the product-moment correlation coefficient.
4. Use technology to find all of the above.

2 Linear Regression When looking for a linear relationship between two sets of data, we can plot what is known as a scatter diagram. Looking at the graph, we can see that there is some positive correlation.

3 It is possible to draw a line called a regression line. There are two types: y on x and x on y. First, let's consider the y on x regression line. The y on x line is drawn so that the sum of the squares of the vertical distances from the points to the line is a minimum. Note: the equation of this line is called the equation of the least squares regression line.

4 Now consider the x on y regression line. The x on y line is drawn so that the sum of the squares of the horizontal distances from the points to the line is a minimum.
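The two criteria can be compared with a small Python sketch. The four points and the grid of candidate lines are made up for illustration and are not from the presentation; the sketch simply tries many lines y = a + bx and keeps the one with the smallest sum of squared vertical distances, then the one with the smallest sum of squared horizontal distances.

```python
# Made-up points for illustration only (not from the presentation).
points = [(1, 2), (2, 3), (3, 5), (4, 4)]

def vertical_ss(a, b):
    """Sum of squared vertical distances from the points to y = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in points)

def horizontal_ss(a, b):
    """Sum of squared horizontal distances from the points to y = a + b*x."""
    return sum((x - (y - a) / b) ** 2 for x, y in points)

# Coarse grid of candidate lines: intercepts -2 to 4 in steps of 1/8,
# slopes 0.05 to 3 in steps of 1/20 (an arbitrary choice for this sketch).
candidates = [(j / 8, k / 20) for j in range(-16, 33) for k in range(1, 61)]

y_on_x = min(candidates, key=lambda line: vertical_ss(*line))
x_on_y = min(candidates, key=lambda line: horizontal_ss(*line))

print("y on x (vertical criterion):   y = %.3f + %.3fx" % y_on_x)
print("x on y (horizontal criterion): y = %.3f + %.3fx" % x_on_y)
```

The two criteria pick different lines for the same points, which is why the y on x and x on y regression lines are treated separately.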

5 Drawing both lines on the same graph, we should note that both lines pass through the point given by the means of the two sets of data, $(\bar{x}, \bar{y})$.

6 It is possible to calculate the equations of the y on x and x on y regression lines. Important formulae: the y on x regression line is of the form $y = a + bx$ and can be calculated by using the formula $y - \bar{y} = \frac{s_{xy}}{s_x^2}(x - \bar{x})$, where $s_{xy}$ is called the covariance and links the x and y data, and $s_x^2$ is the variance of the x data.

7 The x on y regression line is of the form $x = c + dy$ and can be calculated by using the formula $x - \bar{x} = \frac{s_{xy}}{s_y^2}(y - \bar{y})$, where $s_{xy}$ is called the covariance and links the x and y data, and $s_y^2$ is the variance of the y data.
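As a rough sketch of these formulae in Python: the slope of the y on x line is the covariance divided by the variance of x, the slope of the x on y line is the covariance divided by the variance of y, and each line is then fixed by making it pass through the means. The function names, the divisor n and the four data points are assumptions made for illustration.

```python
def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Mean product of deviations from the means (divisor n is an assumption)."""
    x_bar, y_bar = mean(xs), mean(ys)
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / len(xs)

def variance(values):
    return covariance(values, values)

def y_on_x(xs, ys):
    """Return (a, b) for the line y = a + b*x."""
    b = covariance(xs, ys) / variance(xs)
    return mean(ys) - b * mean(xs), b

def x_on_y(xs, ys):
    """Return (c, d) for the line x = c + d*y."""
    d = covariance(xs, ys) / variance(ys)
    return mean(xs) - d * mean(ys), d

# Made-up data for illustration only.
xs = [1, 2, 3, 4]
ys = [2, 3, 5, 4]

a, b = y_on_x(xs, ys)
c, d = x_on_y(xs, ys)
print(f"y on x: y = {a:.2f} + {b:.2f}x")
print(f"x on y: x = {c:.2f} + {d:.2f}y")

# Both lines pass through the point of means.
print(abs(a + b * mean(xs) - mean(ys)) < 1e-12)
print(abs(c + d * mean(ys) - mean(xs)) < 1e-12)
```

Because only the ratio of covariance to variance is used, dividing by n or by n - 1 gives the same two lines.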

8 Example In the table below are the results of ten students in both their Mathematics and Physics examinations. The teacher thinks there might be a relationship between the two. His hypothesis is “a student who has Mathematical ability also has ability in Physics.”

Mathematics mark /100 (x)    Physics mark /100 (y)
61                           56
34                           45
24                           15
89                           92
47                           61
67                           57
82                           75
6                            8
53                           47
89                           76

9 Drawing a scatter graph

x    61  34  24  89  47  67  82   6  53  89
y    56  45  15  92  61  57  75   8  47  76

10 [Scatter graph of the marks; axis label: Maths /100]

11 Finding y on x using technology. Product-Moment Correlation Coefficient

12 Finding x on y using technology. Remember that you have to interchange the x and y when writing down the x on y regression line. Product-Moment Correlation Coefficient

13 Example In the table below are the results of ten students in both their Mathematics and Physics examinations. The teacher thinks there might be a relationship between the two. His hypothesis is “a student who has Mathematical ability also has ability in Physics.”

Mathematics mark /100 (x)    Physics mark /100 (y)
61                           56
34                           45
24                           15
89                           92
47                           61
67                           57
82                           75
6                            8
53                           47
89                           76

14 Now calculating the regression lines

x     y     $x-\bar{x}$   $y-\bar{y}$   $(x-\bar{x})^2$   $(y-\bar{y})^2$   $(x-\bar{x})(y-\bar{y})$
61    56      5.8      2.8      33.64      7.84     16.24
34    45    -21.2     -8.2     449.44     67.24    173.84
24    15    -31.2    -38.2     973.44   1459.24   1191.84
89    92     33.8     38.8    1142.44   1505.44   1311.44
47    61     -8.2      7.8      67.24     60.84    -63.96
67    57     11.8      3.8     139.24     14.44     44.84
82    75     26.8     21.8     718.24    475.24    584.24
6     8     -49.2    -45.2    2420.64   2043.04   2223.84
53    47     -2.2     -6.2       4.84     38.44     13.64
89    76     33.8     22.8    1142.44    519.84    770.64
Totals:
552   532                     7091.60   6191.60   6266.60
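The columns of this calculation can be reproduced with a short Python sketch. The variable names are chosen here for illustration; the marks are the ten pairs from the example.

```python
maths = [61, 34, 24, 89, 47, 67, 82, 6, 53, 89]     # x
physics = [56, 45, 15, 92, 61, 57, 75, 8, 47, 76]   # y

n = len(maths)
x_bar = sum(maths) / n      # 552 / 10
y_bar = sum(physics) / n    # 532 / 10

dx = [x - x_bar for x in maths]     # deviations of x from its mean
dy = [y - y_bar for y in physics]   # deviations of y from its mean

sum_dx2 = sum(d * d for d in dx)                # total of the squared x deviations
sum_dy2 = sum(d * d for d in dy)                # total of the squared y deviations
sum_dxdy = sum(p * q for p, q in zip(dx, dy))   # total of the products

for x, y, p, q in zip(maths, physics, dx, dy):
    print(f"{x:3d} {y:3d} {p:7.1f} {q:7.1f} {p*p:9.2f} {q*q:9.2f} {p*q:9.2f}")
print(f"totals: {sum(maths)} {sum(physics)} {sum_dx2:.2f} {sum_dy2:.2f} {sum_dxdy:.2f}")
```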

15 Using alternate formulae and the TI-nspire Calculator Variance of x Variance of y

16 Covariance. Having done the two-variable statistics calculation, the value of the variance (which is the standard deviation squared) can be found using the “Var” menu on the calculator.
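A hedged Python sketch of the same quantities, using the standard `statistics` module instead of the calculator. Whether the slides divide by n or by n - 1 is not recoverable from the transcript, so the population versions (divisor n) are an assumption here and the n - 1 version is printed alongside for comparison.

```python
from statistics import pstdev, pvariance, variance

maths = [61, 34, 24, 89, 47, 67, 82, 6, 53, 89]     # x
physics = [56, 45, 15, 92, 61, 57, 75, 8, 47, 76]   # y

n = len(maths)
x_bar, y_bar = sum(maths) / n, sum(physics) / n

var_x = pvariance(maths)      # divisor n (assumed)
var_y = pvariance(physics)
cov_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(maths, physics)) / n

print(f"variance of x = {var_x:.2f}, variance of y = {var_y:.2f}")
print(f"covariance    = {cov_xy:.2f}")

# The variance is the standard deviation squared.
print(abs(pstdev(maths) ** 2 - var_x) < 1e-9)

# For comparison: the n - 1 ("sample") variance of x.
print(f"sample variance of x = {variance(maths):.2f}")
```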

17 For regression line y on x which has form

18 For regression line x on y which has form
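The equations on these two slides did not survive in the transcript. As a sketch only, the two lines can be recomputed in Python from the totals of the calculation on slide 14 (552, 532, 7091.60, 6191.60 and 6266.60). The coefficient names a, b, c and d are chosen here and may differ from the notation on the original slides.

```python
n = 10
x_bar, y_bar = 552 / n, 532 / n   # means of the maths and physics marks
sum_dx2 = 7091.60                 # total of (x - x_bar)^2
sum_dy2 = 6191.60                 # total of (y - y_bar)^2
sum_dxdy = 6266.60                # total of (x - x_bar)(y - y_bar)

# y on x: slope = covariance / variance of x (the divisor n cancels).
b = sum_dxdy / sum_dx2
a = y_bar - b * x_bar
print(f"y on x: y = {a:.3f} + {b:.3f}x")

# x on y: slope = covariance / variance of y.
d = sum_dxdy / sum_dy2
c = x_bar - d * y_bar
print(f"x on y: x = {c:.3f} + {d:.3f}y")
```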

19 Plotting both lines on the scatter diagram. Note: for the x on y line, remember to rearrange it to make y the subject before trying to plot it on the same axes as the y on x line.
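The rearrangement can be sketched in Python: if the x on y line is written as x = c + dy, then making y the subject gives y = (x - c)/d. The coefficients below are recomputed from the slide 14 totals, and their names are assumptions made for this sketch.

```python
x_bar, y_bar = 55.2, 53.2
sum_dx2, sum_dy2, sum_dxdy = 7091.60, 6191.60, 6266.60

b = sum_dxdy / sum_dx2    # y on x: y = a + b*x
a = y_bar - b * x_bar
d = sum_dxdy / sum_dy2    # x on y: x = c + d*y
c = x_bar - d * y_bar

# Rearranged x on y line: y = (x - c) / d = intercept_xy + slope_xy * x
slope_xy = 1 / d
intercept_xy = -c / d

# A few points on each line, ready to plot against the Maths axis.
for x in (0, 50, 100):
    print(f"x = {x:3d}: y on x gives {a + b * x:6.2f}, "
          f"x on y gives {intercept_xy + slope_xy * x:6.2f}")
```

Once rearranged, the x on y line is the steeper of the two here, and the two lines cross at the point of means (55.2, 53.2).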

20 Correlation We need a way to determine whether there is linear correlation or not, so we calculate what is known as the Product-Moment Correlation Coefficient (r): $r = \frac{s_{xy}}{s_x s_y}$, where $s_{xy}$ is the covariance, $s_x$ is the standard deviation of x and $s_y$ is the standard deviation of y. We can see from the following five sets of data that the quantity r tells us something about the degree of scatter of the two sets of data, if we are looking for a linear relationship.
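A minimal Python sketch of r as the covariance divided by the product of the two standard deviations. The function name `pmcc` and the test points are made up for illustration; points lying exactly on a rising or falling straight line should give values at (or within rounding error of) 1 and -1.

```python
from math import sqrt

def pmcc(xs, ys):
    """Product-moment correlation coefficient: covariance / (sd of x * sd of y)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / n
    s_x = sqrt(sum((x - x_bar) ** 2 for x in xs) / n)
    s_y = sqrt(sum((y - y_bar) ** 2 for y in ys) / n)
    return s_xy / (s_x * s_y)

# Made-up points for illustration only.
print(pmcc([1, 2, 3, 4], [3, 5, 7, 9]))   # exactly on a rising line
print(pmcc([1, 2, 3, 4], [9, 7, 5, 3]))   # exactly on a falling line
print(pmcc([1, 2, 3, 4], [2, 3, 5, 4]))   # scattered about a rising line
```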

21 Table 1

x     0   5  10  15  20  25  30  35
y    38  28  26  19  17   8   5   1

In table 1 we notice that the two regression lines (y on x and x on y) nearly coincide and that as the x-data increases the y-data decreases. The value of r, the product-moment correlation coefficient, is -0.990, which is close to -1. Here we have what is called strong negative linear correlation.

22 Table 2

x     0   5  10  15  20  25  30  35
y    23  30  20  23  15  32  20   2

In table 2, the two regression lines (y on x and x on y) are further apart, although there is weak negative linear correlation. The value of r, the product-moment correlation coefficient, is -0.529 and it is getting closer to 0.

23 Table 3

x     0   5  10  15  20  25  30  35
y     5  31  19  23  30  32  20   6

In table 3, the two regression lines (y on x and x on y) are virtually perpendicular and there is no linear correlation. The value of r, the product-moment correlation coefficient, is -0.00548 and it is very close to 0.

24 Table 4

x     0   5  10  15  20  25  30  35
y    12  17  23   9  12  38  18  40

In table 4, the two regression lines (y on x and x on y) are further apart, but we notice that as the x-data increases the y-data increases. We say there is weak positive linear correlation. The value of r, the product-moment correlation coefficient, is 0.612 and it is moving away from 0 and getting closer to 1.

25 Table 5

x     0   5  10  15  20  25  30  35
y     2   4  12  16  18  26  27  32

In table 5, we notice that the two regression lines (y on x and x on y) nearly coincide and that as the x-data increases the y-data increases. The value of r, the product-moment correlation coefficient, is 0.990, which is very close to 1. Here we have what is called strong positive linear correlation.
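The five quoted values of r can be checked with a short Python sketch. The helper name `pmcc` is chosen here; the data are the five tables above, which share the same x values.

```python
from math import sqrt

def pmcc(xs, ys):
    """Product-moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)   # the divisor n cancels

x = [0, 5, 10, 15, 20, 25, 30, 35]
tables = {
    "Table 1": [38, 28, 26, 19, 17, 8, 5, 1],
    "Table 2": [23, 30, 20, 23, 15, 32, 20, 2],
    "Table 3": [5, 31, 19, 23, 30, 32, 20, 6],
    "Table 4": [12, 17, 23, 9, 12, 38, 18, 40],
    "Table 5": [2, 4, 12, 16, 18, 26, 27, 32],
}

results = {name: pmcc(x, y) for name, y in tables.items()}
for name, r in results.items():
    print(f"{name}: r = {r:.5f}")
```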

26 The value of r measures the degree of linear scatter of the two sets of data and lies between -1 and 1:
- r = -1 indicates that the data have perfect negative linear correlation,
- r = 0 indicates that the data have no linear correlation,
- r = 1 indicates that the data have perfect positive linear correlation.
r is called the Product-Moment Correlation Coefficient.

27 Returning to our example: as r is close to 1, we can conclude that the results show that his hypothesis that “a student who has Mathematical ability also has ability in Physics” might be true.
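The value of r for the example is not preserved in the transcript. As a sketch, it can be recomputed in Python from the totals of the calculation on slide 14; the variable names are chosen here for illustration.

```python
from math import sqrt

sum_dx2 = 7091.60    # total of (x - x_bar)^2
sum_dy2 = 6191.60    # total of (y - y_bar)^2
sum_dxdy = 6266.60   # total of (x - x_bar)(y - y_bar)

# r = covariance / (sd of x * sd of y); the divisor n cancels top and bottom.
r = sum_dxdy / sqrt(sum_dx2 * sum_dy2)
print(f"r = {r:.3f}")
```

The result is positive and close to 1, which is consistent with the conclusion drawn on the slide.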

28 [Scatter graph of the marks; axis label: Maths /100]

