Published by Sabrina McBride. Modified over 9 years ago.
2
10/03/2003 Gerard.Golding@ul.ie
Correlation
Scatter Plots
Correlation Coefficients
Significance Test
3
Introduction
We are often asked to describe the relationship between two or more variables.
Is there a relationship between points in the Leaving Cert and QCA?
Is there a relationship between parents' IQ and children's IQ?
4
What Are Scatter Plots?
A two-dimensional plot showing the (X, Y) value for each observation.
Used to determine whether there is any pronounced relationship and, if so, whether the relationship may be treated as approximately linear.
Y is usually the response (dependent) variable.
X is usually the explanatory (independent) variable.
The response variable is the variable whose variation we wish to explain.
An explanatory variable is a variable used to explain variation in the response variable.
5
Positive Linear Relationship
6
Negative Linear Relationship
7
No Linear Relationship
8
No Relationship
9
Example 1
Two sets of exam results for 11 students: Maths and Physics.
Are they related?
Does a good performance in Maths go with a good performance in Physics?
Let the Maths mark be X.
Let the Physics mark be Y.
10
Table of Results
X: 41 37 38 39 49 47 42 34 36 48 29
Y: 36 20 31 24 37 35 42 26 27 29 23
X total is 440; X mean is 40.
Y total is 330; Y mean is 30.
11
Maths vs Physics
12
What Does the Graph Tell Us?
The means divide the graph into four quadrants.
Most of the data lie in the bottom-left or top-right quadrants.
Only two points fall outside these quadrants.
This indicates a probable relationship between X and Y for a particular student.
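The quadrant reading can be checked with a short sketch. The Python below is illustrative only: the function name `quadrant_counts` is an assumption of this example and does not come from the slides. It counts how many of the 11 (Maths, Physics) pairs fall in each quadrant formed by the two means.

```python
def quadrant_counts(xs, ys):
    """Count points in each quadrant formed by the means of xs and ys."""
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    counts = {"top_right": 0, "bottom_left": 0, "top_left": 0, "bottom_right": 0}
    for x, y in zip(xs, ys):
        vertical = "top" if y > y_mean else "bottom"
        horizontal = "right" if x > x_mean else "left"
        counts[f"{vertical}_{horizontal}"] += 1
    return counts


# Example 1 data from the Table of Results slide (X = Maths, Y = Physics).
maths = [41, 37, 38, 39, 49, 47, 42, 34, 36, 48, 29]
physics = [36, 20, 31, 24, 37, 35, 42, 26, 27, 29, 23]

counts = quadrant_counts(maths, physics)
print(counts)
```

With this data, nine points land in the top-right or bottom-left quadrants and two land elsewhere, which agrees with the slide.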
13
Correlation Coefficient
From a diagram we get a general idea of the relationship. For precision we need a numerical measure.
We need to measure the strength of the relationship.
The most common measure is the Pearson Product Moment Correlation Coefficient, usually known as the correlation coefficient.
We will usually be dealing with samples from a population.
The sample correlation coefficient is called r.
14
Properties of r
r can take values from -1 to +1.
r = +1 or r = -1 represents a perfect linear correlation, i.e. a perfect linear relationship between the variables.
r = 0 indicates little or no linear relationship, i.e. as X increases there is no definite tendency for the values of Y to increase or decrease in a straight line.
r close to +1 indicates a large positive correlation, i.e. Y tends to increase as X increases.
r close to -1 indicates a large negative correlation, i.e. Y tends to decrease as X increases.
The further r is from 0, the stronger the linear relationship.
The sign of r indicates the direction of the relationship.
15
Examples of Various r Values
r = +1, r = -1, r = -0.54, r = 0.70, r = 0
16
The Formula for Calculating r
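The formula image on this slide did not survive in the transcript. A standard computational form of the Pearson coefficient, consistent with the S xx, S yy and S xy quantities named on the following slides, is sketched below. The exact notation used on the original slide is an assumption.

```latex
\[
r = \frac{S_{xy}}{\sqrt{S_{xx}\,S_{yy}}},
\qquad
S_{xx} = \sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n},
\quad
S_{yy} = \sum y_i^2 - \frac{\left(\sum y_i\right)^2}{n},
\quad
S_{xy} = \sum x_i y_i - \frac{\left(\sum x_i\right)\left(\sum y_i\right)}{n}
\]
```

Each sum runs over the n observations, which is why the worked examples build a table of X, Y, XY, X squared and Y squared with column totals.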
17
Example 2
Find the correlation coefficient r between Y and X.
Subject: A B C D E F G
X: 1 3 5 7 9 11 13
Y: 7 4 13 16 10 22 19
18
Create a table
Subject   Xi   Yi   XiYi   Xi squared   Yi squared
A          1    7      7        1           49
B          3    4     12        9           16
C          5   13     65       25          169
D          7   16    112       49          256
E          9   10     90       81          100
F         11   22    242      121          484
G         13   19    247      169          361
Total     49   91    775      455         1435
19
Calculating S xx
20
Calculating S yy
21
Calculating S xy
22
Calculating r
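The worked arithmetic on these four slides was in images that are missing from the transcript. As a hedged reconstruction, the sketch below computes S xx, S yy, S xy and r for the Example 2 data using the standard computational formulas. The function names `correlation_sums` and `pearson_r` are illustrative choices, not taken from the slides.

```python
import math


def correlation_sums(xs, ys):
    """Return (S_xx, S_yy, S_xy) using the computational (sum-based) formulas."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    s_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    s_yy = sum(y * y for y in ys) - sum_y ** 2 / n
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    return s_xx, s_yy, s_xy


def pearson_r(xs, ys):
    """Sample correlation coefficient r = S_xy / sqrt(S_xx * S_yy)."""
    s_xx, s_yy, s_xy = correlation_sums(xs, ys)
    return s_xy / math.sqrt(s_xx * s_yy)


# Example 2 data (subjects A to G).
x = [1, 3, 5, 7, 9, 11, 13]
y = [7, 4, 13, 16, 10, 22, 19]

s_xx, s_yy, s_xy = correlation_sums(x, y)
r = pearson_r(x, y)
print(s_xx, s_yy, s_xy, round(r, 4))  # 112.0 252.0 138.0 0.8214
```

The result rounds to r = 0.82, the value used in the significance test that follows.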
23
Significance Test
H0: No linear relationship exists (the population correlation is 0).
HA: There is a linear relationship (the population correlation is not 0).
Choose a confidence level, say 90%, 95% or 99%; this means alpha = 0.1, 0.05 or 0.01.
Use Table 10: Percentage Points of the Correlation Coefficient.
In the left-hand column choose v = n - 2 (n = sample size).
Find the critical value.
If |r| > critical value then reject H0.
24
Conclusion
r = 0.82; let alpha = 0.05.
v = n - 2, giving v = 5.
From the tables the critical point is 0.7545.
0.82 > 0.7545
We reject H0 and conclude that we are 95% confident that there is a linear relationship between X and Y.
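The decision rule can be written as a small helper. This is a minimal sketch, assuming the critical values are read from the printed table and supplied by hand. The function name `reject_no_correlation` and the dictionary layout are illustrative assumptions; only the two critical values quoted in these slides (0.7545 for v = 5 and 0.6021 for v = 9, both at alpha = 0.05) are included.

```python
def reject_no_correlation(r, n, critical_values):
    """Return True if H0 (no linear relationship) is rejected.

    critical_values maps degrees of freedom v = n - 2 to the table's
    critical value for the chosen alpha; it must contain an entry for v.
    """
    v = n - 2
    # Two-sided test: compare the magnitude of r with the critical value.
    return abs(r) > critical_values[v]


# Critical values at alpha = 0.05 as quoted in the slides.
critical_05 = {5: 0.7545, 9: 0.6021}

decision = reject_no_correlation(0.82, 7, critical_05)
print(decision)  # True: 0.82 exceeds 0.7545, so H0 is rejected
```

Comparing the absolute value of r means a strong negative correlation is also detected, which matches the two-sided alternative hypothesis.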
25
Example 3
X: 3 4 5 6 7 8
Y: 5 6 7 8 9 10
Is there an obvious relationship between X and Y? Y = X + 2.
This is a perfect relationship.
What will r be? r will be equal to 1.
26
Set up the data table
Subject    Y    X    XY   X squared   Y squared
A          5    3    15       9          25
B          6    4    24      16          36
C          7    5    35      25          49
D          8    6    48      36          64
E          9    7    63      49          81
F         10    8    80      64         100
Total     45   33   265     199         355
27
Calculate S xx
28
Calculate S yy
29
Calculate S xy
30
Calculate r
Perfect Positive Linear Relationship
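The arithmetic for these slides is not preserved in the transcript. The following is a reconstruction from the column totals in the Example 3 data table (n = 6), assuming the standard computational formulas for S xx, S yy and S xy.

```latex
\begin{align*}
S_{xx} &= 199 - \frac{33^2}{6} = 199 - 181.5 = 17.5 \\
S_{yy} &= 355 - \frac{45^2}{6} = 355 - 337.5 = 17.5 \\
S_{xy} &= 265 - \frac{33 \times 45}{6} = 265 - 247.5 = 17.5 \\
r &= \frac{S_{xy}}{\sqrt{S_{xx}\,S_{yy}}} = \frac{17.5}{\sqrt{17.5 \times 17.5}} = 1
\end{align*}
```

All three quantities are equal because Y is X shifted by a constant, so the deviations of Y from its mean match those of X exactly.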
31
Back to Example 1
In our original example with the student results we drew a scatter plot.
From the diagram it looked as if there was a probable positive linear relationship.
To be sure we need to calculate r.
Using a significance level of alpha = 0.05 we will test the claim that there is no linear correlation between Maths results and Physics results.
32
Create a data table
Student    X     Y      XY   X squared   Y squared
A         41    36    1476     1681        1296
B         37    20     740     1369         400
C         38    31    1178     1444         961
D         39    24     936     1521         576
E         49    37    1813     2401        1369
F         47    35    1645     2209        1225
G         42    42    1764     1764        1764
H         34    26     884     1156         676
I         36    27     972     1296         729
J         48    29    1392     2304         841
K         29    23     667      841         529
Total    440   330   13467    17986       10366
33
Apply the formulae
34
Correlation Coefficient is r = 0.63
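The intermediate values on these two slides were images and are missing here. As a sketch, r can be recovered directly from the column totals of the Example 1 data table. The helper name `r_from_totals` is an assumption of this example, and the formulas are the standard computational ones rather than a confirmed copy of the slide.

```python
import math


def r_from_totals(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2):
    """Compute (S_xx, S_yy, S_xy, r) from the totals of a data table."""
    s_xx = sum_x2 - sum_x ** 2 / n
    s_yy = sum_y2 - sum_y ** 2 / n
    s_xy = sum_xy - sum_x * sum_y / n
    r = s_xy / math.sqrt(s_xx * s_yy)
    return s_xx, s_yy, s_xy, r


# Totals row of the Example 1 table: n = 11 students.
s_xx, s_yy, s_xy, r = r_from_totals(
    n=11, sum_x=440, sum_y=330, sum_xy=13467, sum_x2=17986, sum_y2=10366
)
print(s_xx, s_yy, s_xy, round(r, 2))  # 386.0 466.0 267.0 0.63
```

Working from totals mirrors the hand calculation: only the bottom row of the table is needed once the columns have been summed.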
35
Conclusion
From the tables the critical point is 0.6021 (v = n - 2 = 9, alpha = 0.05).
r = 0.63
0.63 > 0.6021
We reject the claim and conclude that there is a positive linear relationship between results in Maths and results in Physics.
36
Regression
Least Squares
Predicting Y using X
37
What is Regression?
Regression analysis is used for prediction.
It allows us to predict the value of one variable given the value of another variable.
It gives us an equation that uses one variable to help explain variation in another.
In this course we deal with Simple Linear Regression.
38
Simple Linear Regression
The first step in determining a relationship was drawing a scatter plot.
If a possible relationship was shown, we found the strength of the relationship by calculating the correlation coefficient r.
The next stage is to calculate the equation of the line which best describes the relationship between the two variables.
This line is called the Regression Line.
39
What Is the ‘Best Fit’ Line?
Example 1
40
‘Least Squares’ Best Fit Line
We can have several lines of the form [equation not preserved in transcript].
We want the ‘best’ line: the one with the least residuals.
41
Least Squares Estimates
[symbols not preserved in transcript] are the least squares estimates of [symbols not preserved in transcript].
They are closely related to r.
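The symbols on these slides were lost in extraction. For reference, the standard least squares estimates for simple linear regression are given below. The notation b_0 (intercept) and b_1 (slope) is an assumption and may differ from the original slides.

```latex
\[
\hat{y} = b_0 + b_1 x,
\qquad
b_1 = \frac{S_{xy}}{S_{xx}},
\qquad
b_0 = \bar{y} - b_1 \bar{x}
\]
\[
\text{chosen to minimise } \sum_{i=1}^{n} \left( y_i - b_0 - b_1 x_i \right)^2,
\qquad
b_1 = r \sqrt{\frac{S_{yy}}{S_{xx}}}
\]
```

The last identity is one way to read "closely related to r": the slope has the same sign as r and equals r rescaled by the ratio of the spreads of Y and X.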
42
Example 2
43
Combining we get
44
Regression line is
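The fitted equation itself is missing from the transcript. A minimal sketch of the calculation, using the column totals from the Example 2 table and the standard least squares formulas, is shown below. The name `line_from_totals` and the symbols `b0` and `b1` are illustrative assumptions.

```python
def line_from_totals(n, sum_x, sum_y, sum_xy, sum_x2):
    """Return (b0, b1) for the least squares line y = b0 + b1 * x."""
    s_xx = sum_x2 - sum_x ** 2 / n
    s_xy = sum_xy - sum_x * sum_y / n
    b1 = s_xy / s_xx                   # slope
    b0 = sum_y / n - b1 * (sum_x / n)  # intercept: y mean minus slope times x mean
    return b0, b1


# Totals row of the Example 2 table: n = 7 subjects.
b0, b1 = line_from_totals(n=7, sum_x=49, sum_y=91, sum_xy=775, sum_x2=455)
print(round(b0, 3), round(b1, 3))  # 4.375 1.232
```

Under these assumptions the regression line for Example 2 is approximately Y = 4.375 + 1.232 X.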
45
Example 3
X: 3 4 5 6 7 8
Y: 5 6 7 8 9 10
We know Y = X + 2.
46
Verifying the equation is correct
47
Giving
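The verification on these slides is not preserved. A reconstruction from the Example 3 table totals (n = 6, sum of X = 33, sum of Y = 45, sum of XY = 265, sum of X squared = 199) is sketched below; the symbols b_0 and b_1 are assumed notation for the intercept and slope.

```latex
\begin{align*}
S_{xx} &= 199 - \frac{33^2}{6} = 17.5, &
S_{xy} &= 265 - \frac{33 \times 45}{6} = 17.5 \\
b_1 &= \frac{S_{xy}}{S_{xx}} = \frac{17.5}{17.5} = 1, &
b_0 &= \bar{y} - b_1 \bar{x} = 7.5 - 1 \times 5.5 = 2 \\
\hat{y} &= 2 + x
\end{align*}
```

The least squares line reproduces Y = X + 2 exactly, as expected when every point already lies on a straight line.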
48
Example 1
49
Regression line
50
Example 1 continued
If a student received a grade of 53 in Maths, what would the expected grade be in Physics?
We use the regression line to predict the Physics result.
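The fitted line and the predicted value are not preserved in the transcript. The sketch below fits the least squares line to the Example 1 data and evaluates it at a Maths mark of 53. The function names `fit_line` and `predict` are assumptions of this example, and the deviation-from-mean formulas used here are algebraically equivalent to the sum-based ones.

```python
def fit_line(xs, ys):
    """Return (b0, b1) for the least squares line y = b0 + b1 * x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b1 = s_xy / s_xx
    b0 = y_mean - b1 * x_mean
    return b0, b1


def predict(b0, b1, x):
    """Predicted response for explanatory value x."""
    return b0 + b1 * x


# Example 1 data: X = Maths mark, Y = Physics mark.
maths = [41, 37, 38, 39, 49, 47, 42, 34, 36, 48, 29]
physics = [36, 20, 31, 24, 37, 35, 42, 26, 27, 29, 23]

b0, b1 = fit_line(maths, physics)
predicted_physics = predict(b0, b1, 53)
print(round(b0, 3), round(b1, 3), round(predicted_physics, 1))
```

Under these assumptions the line is roughly Y = 2.33 + 0.69 X, giving an expected Physics mark of about 39. Note that 53 lies above the highest observed Maths mark (49), so this prediction is an extrapolation and should be treated with some caution.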
51
Graphing the Regression Line