Spearman’s Rank: for relationship data. Non-parametric, i.e. no assumptions are made about the data fitting a normal distribution. You must have more than 5 pairs of data.


1 Spearman’s Rank For relationship data

2 Using Spearman’s Rank
- Non-parametric, i.e. no assumptions are made about the data fitting a normal distribution
- You must have more than 5 pairs of data (10+ is better)
- Measures the strength and direction of the relationship between two variables

3 The value for r_s (Spearman’s rank) will be between +1 and -1
- +1 indicates a perfect positive correlation
- -1 indicates a perfect negative correlation
- 0 indicates no correlation at all

[Figure: three scatter graphs: bedload particle size (cm) against distance downstream (km); velocity (m/s) against distance downstream (km); number of passing dog walkers against discharge (cumecs). Panel labels: positive correlation, r_s = +1; negative correlation, r_s = -1; no correlation, r_s = 0.]

4 The Equation

$$r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$$

5 The Equation

$$r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$$

Where:
- r_s = Spearman Rank Correlation Coefficient
- Σd² = sum of the squared differences between ranks
- n = number of pairs of observations in the sample
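As a minimal sketch, the equation can be written as a small Python function. The function name `spearman_rs` and the two example inputs are illustrative assumptions, not part of the slides; they only show that identical rankings give +1 and exactly reversed rankings give -1.

```python
def spearman_rs(sum_d_squared, n):
    """Return r_s = 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    if n < 2:
        raise ValueError("need at least 2 pairs of observations")
    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))


# Hypothetical rankings of 5 pairs, for illustration only.
identical = spearman_rs(0, 5)    # every d is 0, so the sum of d^2 is 0
reversed_ = spearman_rs(40, 5)   # d = -4, -2, 0, 2, 4, so the sum of d^2 is 40
print(identical, reversed_)      # 1.0 -1.0
```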

6 Method
1. Establish the null hypothesis H0 (this is always the negative form, i.e. there is no significant correlation between the variables) and the alternative hypothesis H1.
- H0: There is no significant correlation between variable X and variable Y
- H1: There is a significant correlation between variable X and variable Y

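7 2. Copy your data into the table below as variable x and variable y and label the data sets

| Distance from source (km) (variable x) | Rank R1 | PO4 ppm (variable y) | Rank R2 | d (R1 - R2) | d² |
|---|---|---|---|---|---|
| 0 | | 50 | | | |
| 2 | | 50 | | | |
| 4 | | 40 | | | |
| 6 | | 20 | | | |
| 8 | | 10 | | | |
| (not shown) | | 0 | | | |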

8 3. Rank the individual data sets in increasing order as separate sets of data (i.e. give the lowest data value the lowest rank)
- Take each variable in turn
- The lowest value gets a rank of 1
- When you have data values that are the same, they must have the same rank
- The same thing is done for all data values that are the same

| Distance from source (km) (variable x) | Rank R1 | PO4 ppm (variable y) | Rank R2 |
|---|---|---|---|
| 0 | 1 | 50 | |
| 2 | 2 | 50 | |
| 4 | 3 | 40 | 4 |
| 6 | 4 | 20 | 3 |
| 8 | 5 | 10 | 2 |
| (not shown) | 6 | 0 | 1 |

The two PO4 values of 50 are tied. If these two data values were not the same we would be assigning them ranks 5 and 6. 5 + 6 = 11, so we divide this total equally between the data values (there are 2 data values, so we divide 11 by 2). 11 / 2 = 5.5, so both data values are assigned a rank of 5.5.

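9 The assigned ranks should be recorded in the table

| Distance from source (km) (variable x) | Rank R1 | PO4 ppm (variable y) | Rank R2 | d (R1 - R2) | d² |
|---|---|---|---|---|---|
| 0 | 1 | 50 | 5.5 | | |
| 2 | 2 | 50 | 5.5 | | |
| 4 | 3 | 40 | 4 | | |
| 6 | 4 | 20 | 3 | | |
| 8 | 5 | 10 | 2 | | |
| (not shown) | 6 | 0 | 1 | | |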
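The ranking rule from step 3, including the shared rank for tied values, can be sketched in Python as follows. The helper name `average_ranks` is an assumption made for this example; the input is the PO4 column from the table.

```python
def average_ranks(values):
    """Rank values from lowest (rank 1) upward; tied values share the mean of their ranks."""
    # Indices of the values, ordered from the smallest value to the largest.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Extend `end` to cover every value tied with the one at `pos`.
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # Positions pos..end would have taken ranks pos+1..end+1; share their mean.
        shared = ((pos + 1) + (end + 1)) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


po4_ppm = [50, 50, 40, 20, 10, 0]
print(average_ranks(po4_ppm))  # [5.5, 5.5, 4.0, 3.0, 2.0, 1.0]
```

The two values of 50 would have taken ranks 5 and 6, so each receives (5 + 6) / 2 = 5.5.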

10 4. Calculate the difference between each pair of ranks, R1 - R2 (if done correctly the differences should sum to zero)
Take each pair of ranks in turn and record the difference in column d

| Distance from source (km) (variable x) | Rank R1 | PO4 ppm (variable y) | Rank R2 | d (R1 - R2) | d² |
|---|---|---|---|---|---|
| 0 | 1 | 50 | 5.5 | | |
| 2 | 2 | 50 | 5.5 | | |
| 4 | 3 | 40 | 4 | | |
| 6 | 4 | 20 | 3 | | |
| 8 | 5 | 10 | 2 | | |
| (not shown) | 6 | 0 | 1 | | |

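11 5. Square the differences in column d
Record the results in column d²

| Distance from source (km) (variable x) | Rank R1 | PO4 ppm (variable y) | Rank R2 | d (R1 - R2) | d² |
|---|---|---|---|---|---|
| 0 | 1 | 50 | 5.5 | -4.5 | |
| 2 | 2 | 50 | 5.5 | -3.5 | |
| 4 | 3 | 40 | 4 | -1 | |
| 6 | 4 | 20 | 3 | 1 | |
| 8 | 5 | 10 | 2 | 3 | |
| (not shown) | 6 | 0 | 1 | 5 | |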

12 6. Calculate the sum of d²
Add up all the values in the d² column

| Distance from source (km) (variable x) | Rank R1 | PO4 ppm (variable y) | Rank R2 | d (R1 - R2) | d² |
|---|---|---|---|---|---|
| 0 | 1 | 50 | 5.5 | -4.5 | 20.25 |
| 2 | 2 | 50 | 5.5 | -3.5 | 12.25 |
| 4 | 3 | 40 | 4 | -1 | 1 |
| 6 | 4 | 20 | 3 | 1 | 1 |
| 8 | 5 | 10 | 2 | 3 | 9 |
| (not shown) | 6 | 0 | 1 | 5 | 25 |

Σd² = 20.25 + 12.25 + 1 + 1 + 9 + 25 = 68.5

13 7. Calculate the r_s value
Substitute the numbers calculated for the symbols in the equation

$$r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$$

Work out each part in turn, with Σd² = 68.5 and n = 6:
1. Work out 6 x Σd²: 6 x 68.5 = 411
2. Work out n²: 6² = 36
3. Work out n² - 1: 36 - 1 = 35
4. Work out n x the answer to step 3: 6 x 35 = 210
5. Work out the answer to step 1 divided by the answer to step 4: 411 / 210 = 1.957
6. Work out 1 - the answer to step 5: 1 - 1.957 = -0.957
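The arithmetic for steps 4 onward can be checked with a short Python sketch that starts from the two rank columns in the table. Squaring the first difference, -4.5, gives 20.25, so the code arrives at Σd² = 68.5 and r_s = -0.957. The variable names are assumptions chosen for this example.

```python
# Ranks taken from the R1 and R2 columns of the table.
rank_distance = [1, 2, 3, 4, 5, 6]
rank_po4 = [5.5, 5.5, 4, 3, 2, 1]

# Step 4: differences between each pair of ranks (they should sum to zero).
d = [r1 - r2 for r1, r2 in zip(rank_distance, rank_po4)]

# Steps 5 and 6: square each difference and add the squares up.
d_squared = [diff ** 2 for diff in d]
sum_d_squared = sum(d_squared)

# Final step: substitute into r_s = 1 - 6 * sum(d^2) / (n * (n^2 - 1)).
n = len(d)
rs = 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))

print(d)              # [-4.5, -3.5, -1, 1, 3, 5]
print(sum(d))         # 0.0
print(sum_d_squared)  # 68.5
print(round(rs, 3))   # -0.957
```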

14 Is your r_s value positive or negative?
- If it is a negative number then you have a negative correlation
- If it is a positive number then you have a positive correlation
(You do not need to worry about + and - for the next bit!)

15 Compare your r_s value against the table of critical values
If r_s (ignoring its sign) is greater than or equal to the critical value, then there is a significant correlation and the null hypothesis can be rejected

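16 Critical values for Spearman’s Rank Correlation Coefficient

| Number of pairs of measurements (n) | Significance level p = 0.05 (95%) (+ or -) | Significance level p = 0.01 (99%) (+ or -) |
|---|---|---|
| 5 | 1.000 | |
| 6 | 0.886 | 1.000 |
| 7 | 0.786 | 0.929 |
| 8 | 0.738 | 0.881 |
| 9 | 0.683 | 0.833 |
| 10 | 0.648 | 0.818 |
| 11 | 0.623 | 0.794 |
| 12 | 0.591 | 0.780 |
| 13 | 0.566 | 0.745 |
| 14 | 0.545 | 0.716 |
| 15 | 0.525 | 0.689 |
| 16 | 0.507 | 0.666 |
| 17 | 0.490 | 0.645 |
| 18 | 0.476 | 0.625 |
| 19 | 0.462 | 0.608 |
| 20 | 0.450 | 0.591 |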

17 Check the p = 0.05 (95%) confidence level first
This means we are 95% confident our results were not due to chance
If we have a significant correlation at 95% we can go back and check whether we have a significant correlation at 99% as well (so we can be 99% confident our results were not due to chance)

18 Is our r_s value smaller or larger than our critical value from the critical value table?
- If the r_s value is greater than or equal to the critical value then the null hypothesis can be rejected: there is a significant correlation
- If the r_s value is NOT greater than or equal to the critical value then the null hypothesis cannot be rejected: there is no significant correlation
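The comparison against the critical values can be sketched as a table lookup in Python. The dictionary holds the values from the critical values table; the names `CRITICAL_VALUES` and `is_significant`, and the example r_s of -0.95 with n = 6, are assumptions made for illustration only.

```python
# Critical values keyed by number of pairs n, then by significance level p.
# The table gives no p = 0.01 value for n = 5, so that entry is left out.
CRITICAL_VALUES = {
    5: {0.05: 1.000},
    6: {0.05: 0.886, 0.01: 1.000},
    7: {0.05: 0.786, 0.01: 0.929},
    8: {0.05: 0.738, 0.01: 0.881},
    9: {0.05: 0.683, 0.01: 0.833},
    10: {0.05: 0.648, 0.01: 0.818},
    11: {0.05: 0.623, 0.01: 0.794},
    12: {0.05: 0.591, 0.01: 0.780},
    13: {0.05: 0.566, 0.01: 0.745},
    14: {0.05: 0.545, 0.01: 0.716},
    15: {0.05: 0.525, 0.01: 0.689},
    16: {0.05: 0.507, 0.01: 0.666},
    17: {0.05: 0.490, 0.01: 0.645},
    18: {0.05: 0.476, 0.01: 0.625},
    19: {0.05: 0.462, 0.01: 0.608},
    20: {0.05: 0.450, 0.01: 0.591},
}


def is_significant(rs, n, level=0.05):
    """Return True if |r_s| is greater than or equal to the critical value."""
    try:
        critical = CRITICAL_VALUES[n][level]
    except KeyError:
        raise ValueError(f"no critical value for n={n}, p={level}") from None
    # The sign of r_s only gives the direction, so compare its magnitude.
    return abs(rs) >= critical


# Hypothetical result: r_s = -0.95 from 6 pairs of measurements.
print(is_significant(-0.95, 6, 0.05))  # True: 0.95 >= 0.886, reject H0 at 95%
print(is_significant(-0.95, 6, 0.01))  # False: 0.95 < 1.000, cannot reject H0 at 99%
```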

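19 Use the following data to calculate r_s independently

| Light (Lux) (variable x) | Rank R1 | Hedera helix leaf area cm² (variable y) | Rank R2 | d (R1 - R2) | d² |
|---|---|---|---|---|---|
| 1165 | | 22.8 | | | |
| 980 | | 24.7 | | | |
| 980 | | 26.8 | | | |
| 700 | | 37.5 | | | |
| 760 | | 37.5 | | | |
| 500 | | 37.5 | | | |
| 495 | | 41.3 | | | |
| 366 | | 44.6 | | | |
| 348 | | 78.3 | | | |
| 298 | | 58.0 | | | |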

20 Key questions
- Is there a significant correlation?
- Which data value/s would you consider to be anomalous and why?
- Which graph would you use to present this data?

