1
**Linear regression and correlation**

International Baccalaureate Mathematical Studies

Linear regression and correlation

Learning outcomes

This work will help you to draw scatter diagrams; to draw the linear regression lines y on x and x on y and work out their equations; to identify the types of correlation and calculate the product-moment correlation coefficient; and to use technology to find all of the above.

2
Linear Regression

When looking for a linear relationship between two sets of data, we can plot what is known as a scatter diagram.

[Scatter diagram of y against x]

Looking at the graph, we can see that there is some positive correlation.

3
**First, let's consider the y on x regression line.**

It is possible to draw a line called a regression line. There are two types: y on x and x on y.

[Scatter diagram with the y on x regression line]

The y on x regression line is drawn so that the sum of the squares of the vertical distances from the points to the line is a minimum. Note: the equation of this line is called the equation of the least squares regression line.

4
**Now consider the x on y regression line.**

The x on y regression line is drawn so that the sum of the squares of the horizontal distances from the points to the line is a minimum.

5
**Drawing both lines on the same graph, we have**

[Scatter diagram with both the y on x and x on y regression lines]

We should note that both lines pass through the point $(\bar{x}, \bar{y})$, the means of both sets of data.

6
**It is possible to calculate the equations of the y on x and x on y regression lines.**

Important formulae

The y on x regression line is of the form $y = ax + b$ and can be calculated by using the formula

$$y - \bar{y} = \frac{s_{xy}}{s_x^2}\,(x - \bar{x}),$$

where $s_{xy}$ is called the covariance and links the x and y data, and $s_x^2$ is the variance of the x data.

7
**The x on y regression line is of the form $x = cy + d$ and can be calculated by using the formula**

$$x - \bar{x} = \frac{s_{xy}}{s_y^2}\,(y - \bar{y}),$$

where $s_{xy}$ is called the covariance and links the x and y data, and $s_y^2$ is the variance of the y data.
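As a minimal sketch of how the two regression formulae turn into a calculation, the Python function below computes both lines from raw data. The function name `regression_lines`, the coefficient names `a`, `b`, `c`, `d` (for y = ax + b and x = cy + d) and the small data set are illustrative assumptions. The variance and covariance use the divide-by-n definitions, although the divisor cancels in each gradient.

```python
from statistics import fmean


def regression_lines(xs, ys):
    """Return ((a, b), (c, d)) for y = a*x + b (y on x) and x = c*y + d (x on y)."""
    n = len(xs)
    x_bar, y_bar = fmean(xs), fmean(ys)
    # Covariance and variances, dividing by n (population definitions).
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / n
    var_x = sum((x - x_bar) ** 2 for x in xs) / n
    var_y = sum((y - y_bar) ** 2 for y in ys) / n
    a = s_xy / var_x        # gradient of y on x
    b = y_bar - a * x_bar   # forces the line through (x_bar, y_bar)
    c = s_xy / var_y        # gradient of x on y
    d = x_bar - c * y_bar   # forces the line through (x_bar, y_bar)
    return (a, b), (c, d)


# Hypothetical data, chosen only to exercise the function.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
(a, b), (c, d) = regression_lines(xs, ys)
print(f"y on x: y = {a:.3f}x + {b:.3f}")
print(f"x on y: x = {c:.3f}y + {d:.3f}")
```

Because each intercept is computed from the means, both lines pass through the point of means by construction.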

8
Example

In the table below are the results of ten students in both their Mathematics and Physics examinations. The teacher thinks there might be a relationship between the two. His hypothesis is “a student who has Mathematical ability also has ability in Physics.”

| Mathematics mark /100 (x) | Physics mark /100 (y) |
|---|---|
| 61 | 56 |
| 34 | 45 |
| 24 | 15 |
| 89 | 92 |
| 47 | 61 |
| 67 | 57 |
| 82 | 75 |
| 6 | 8 |
| 53 | 47 |
| 89 | 76 |

9
**Drawing a scatter graph**

[Scatter graph of the Physics marks (y) against the Mathematics marks (x) for the ten students]

10
[Scatter graph; horizontal axis: Maths /100]

11
**Finding y on x using technology**

[Calculator screen showing the y on x regression output, including the product-moment correlation coefficient]

12
**Finding x on y using technology**

Remember that you have to interchange x and y when writing down the x on y regression line.

[Calculator screen showing the x on y regression output, including the product-moment correlation coefficient]

14
**Now calculating the regression lines**

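| x | y | $x - \bar{x}$ | $y - \bar{y}$ | $(x - \bar{x})^2$ | $(y - \bar{y})^2$ | $(x - \bar{x})(y - \bar{y})$ |
|---|---|---|---|---|---|---|
| 61 | 56 | 5.8 | 2.8 | 33.64 | 7.84 | 16.24 |
| 34 | 45 | -21.2 | -8.2 | 449.44 | 67.24 | 173.84 |
| 24 | 15 | -31.2 | -38.2 | 973.44 | 1459.24 | 1191.84 |
| 89 | 92 | 33.8 | 38.8 | 1142.44 | 1505.44 | 1311.44 |
| 47 | 61 | -8.2 | 7.8 | 67.24 | 60.84 | -63.96 |
| 67 | 57 | 11.8 | 3.8 | 139.24 | 14.44 | 44.84 |
| 82 | 75 | 26.8 | 21.8 | 718.24 | 475.24 | 584.24 |
| 6 | 8 | -49.2 | -45.2 | 2420.64 | 2043.04 | 2223.84 |
| 53 | 47 | -2.2 | -6.2 | 4.84 | 38.44 | 13.64 |
| 89 | 76 | 33.8 | 22.8 | 1142.44 | 519.84 | 770.64 |
| 552 | 532 | | | 7091.6 | 6191.6 | 6266.6 |

The last row gives the column totals, so $\bar{x} = 55.2$ and $\bar{y} = 53.2$.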
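The arithmetic in this table can be checked with a short script. This is a sketch only: the variable names are illustrative, and the marks are taken to be the ten (x, y) pairs implied by the deviations from the means of 55.2 and 53.2.

```python
maths = [61, 34, 24, 89, 47, 67, 82, 6, 53, 89]    # x
physics = [56, 45, 15, 92, 61, 57, 75, 8, 47, 76]  # y

n = len(maths)
x_bar = sum(maths) / n
y_bar = sum(physics) / n

# Deviations from the means, one pair per student.
dx = [x - x_bar for x in maths]
dy = [y - y_bar for y in physics]

sum_dx2 = sum(u * u for u in dx)               # sum of (x - x_bar)^2
sum_dy2 = sum(v * v for v in dy)               # sum of (y - y_bar)^2
sum_dxdy = sum(u * v for u, v in zip(dx, dy))  # sum of (x - x_bar)(y - y_bar)

# One row per student, in the same column order as the table.
for x, y, u, v in zip(maths, physics, dx, dy):
    print(f"{x:3d} {y:3d} {u:7.1f} {v:7.1f} {u*u:9.2f} {v*v:9.2f} {u*v:9.2f}")
print(f"means:  {x_bar:.1f} {y_bar:.1f}")
print(f"totals: {sum_dx2:.2f} {sum_dy2:.2f} {sum_dxdy:.2f}")
```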

15
**Using alternate formulae and the TI-nspire Calculator**

Variance of x Variance of y

16
Covariance

Having done the two-variable statistics calculation, the value of each variance (which is the standard deviation squared) can be found using the “Var” menu on the calculator.
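For reference, the usual computational forms of these three quantities are sketched below. They assume the divide-by-n (population) definitions, which is an assumption about the convention in use here rather than something stated on the slides.

```latex
\begin{align*}
s_x^2  &= \frac{\sum (x - \bar{x})^2}{n} = \frac{\sum x^2}{n} - \bar{x}^2, \\
s_y^2  &= \frac{\sum (y - \bar{y})^2}{n} = \frac{\sum y^2}{n} - \bar{y}^2, \\
s_{xy} &= \frac{\sum (x - \bar{x})(y - \bar{y})}{n} = \frac{\sum xy}{n} - \bar{x}\,\bar{y}.
\end{align*}
```

Under that assumption, the marks example with n = 10 and deviation totals 7091.6, 6191.6 and 6266.6 would give $s_x^2 = 709.16$, $s_y^2 = 619.16$ and $s_{xy} = 626.66$.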

17
**For regression line y on x which has form**

18
**For regression line x on y which has form**

19
**Plotting both lines on the scatter diagram**

[Scatter diagram with the y on x and x on y regression lines plotted]

Note: for the x on y line, remember to rearrange its equation to make y the subject before trying to plot it.
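The rearrangement is a single algebraic step. Writing the x on y line as $x = cy + d$ (the letters c and d are illustrative, not taken from the slides) and assuming $c \neq 0$:

```latex
\begin{align*}
x     &= cy + d \\
x - d &= cy \\
y     &= \frac{1}{c}\,x - \frac{d}{c}
\end{align*}
```

So on axes with y vertical, the x on y line is plotted with gradient $1/c$ and y-intercept $-d/c$.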

20
**Correlation**

We need a way to determine whether or not there is linear correlation, so we calculate what is known as the product-moment correlation coefficient (r):

$$r = \frac{s_{xy}}{s_x s_y},$$

where $s_{xy}$ is the covariance, $s_x$ is the standard deviation of x and $s_y$ is the standard deviation of y.

We can see from the following five sets of data that the quantity r tells us something about the degree of scatter of the two sets of data, if we are looking for a linear relationship.

21
Table 1

| x | 5 | 10 | 15 | 20 | 25 | 30 | 35 |
|---|---|---|---|---|---|---|---|
| y | 38 | 28 | 26 | 19 | 17 | 8 | 1 |

[Scatter diagram with the y on x and x on y regression lines]

In Table 1 we notice that the two regression lines (y on x and x on y) nearly coincide and that as the x data increase the y data decrease. The value of the product-moment correlation coefficient r is close to -1. Here we have what is called strong negative linear correlation.
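As a check on this reading of Table 1, the coefficient can be computed directly from the seven data pairs. The sketch below uses only the definition of r as covariance divided by the product of the standard deviations; the variable names are illustrative.

```python
from math import sqrt

x = [5, 10, 15, 20, 25, 30, 35]
y = [38, 28, 26, 19, 17, 8, 1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squared deviations and of products of deviations.
s_xx = sum((u - x_bar) ** 2 for u in x)
s_yy = sum((v - y_bar) ** 2 for v in y)
s_xy = sum((u - x_bar) * (v - y_bar) for u, v in zip(x, y))

# The 1/n factors in the covariance and standard deviations cancel.
r = s_xy / sqrt(s_xx * s_yy)
print(f"r = {r:.3f}")
```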

22
Table 2

x: 5 10 15 20 25 30 35

y: 23 32 2

[Scatter diagram with the y on x and x on y regression lines]

In Table 2, the two regression lines are further apart, although there is still weak negative linear correlation. The value of r is getting closer to 0.

23
Table 3

x: 5 10 15 20 25 30 35

y: 31 19 23 32 6

[Scatter diagram with the y on x and x on y regression lines]

In Table 3, the two regression lines are virtually perpendicular and there is no linear correlation. The value of r is very close to 0.

24
Table 4

| x | 5 | 10 | 15 | 20 | 25 | 30 | 35 |
|---|---|---|---|---|---|---|---|
| y | 12 | 17 | 23 | 9 | 38 | 18 | 40 |

[Scatter diagram with the y on x and x on y regression lines]

In Table 4, the two regression lines are further apart, but we notice that as the x data increase the y data increase. We say there is weak positive linear correlation. The value of r is moving away from 0 and getting closer to 1.
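One way to quantify "further apart" is to compare the two gradients. From the regression formulae, the y on x gradient is $s_{xy}/s_x^2$ and the x on y gradient is $s_{xy}/s_y^2$, so their product equals $r^2$; the lines can coincide only when that product is 1. The sketch below checks this for Table 4, with illustrative names `a` and `c` for the two gradients.

```python
from math import sqrt

x = [5, 10, 15, 20, 25, 30, 35]
y = [12, 17, 23, 9, 38, 18, 40]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

s_xx = sum((u - x_bar) ** 2 for u in x)
s_yy = sum((v - y_bar) ** 2 for v in y)
s_xy = sum((u - x_bar) * (v - y_bar) for u, v in zip(x, y))

a = s_xy / s_xx               # gradient of y on x, as y = a*x + b
c = s_xy / s_yy               # gradient of x on y, as x = c*y + d
r = s_xy / sqrt(s_xx * s_yy)  # product-moment correlation coefficient

# Plotted with y vertical, the x on y line has gradient 1/c.
print(f"y on x gradient: {a:.3f}")
print(f"x on y gradient when plotted: {1 / c:.3f}")
print(f"a * c = {a * c:.3f}, r squared = {r * r:.3f}")
```

The further the product a·c falls below 1, the larger the gap between the plotted gradients a and 1/c.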

25
Table 5

x: 5 10 15 20 25 30 35

y: 2 4 12 16 18 26 27 32

[Scatter diagram with the y on x and x on y regression lines]

In Table 5, we notice that the two regression lines (y on x and x on y) nearly coincide and that as the x data increase the y data increase. The value of r is 0.990, which is very close to 1. Here we have what is called strong positive linear correlation.

26
**r is called the product-moment correlation coefficient.**

The value of r determines the degree of linear scatter of the two sets of data, and $-1 \le r \le 1$:

- $r = -1$ indicates that the data have perfect negative linear correlation,
- $r = 0$ indicates that the data have no linear correlation,
- $r = 1$ indicates that the data have perfect positive linear correlation.

27
**Returning to our example**

[Scatter graph with the y on x and x on y regression lines]

As r is close to 1, we can conclude that the results suggest his hypothesis that “a student who has Mathematical ability also has ability in Physics” might be true.

28
[Scatter graph; horizontal axis: Maths /100]
