Mann-Whitney U-test: Testing for a difference
$U_1 = n_1 n_2 + \tfrac{1}{2} n_1 (n_1 + 1) - R_1$
What does it do?
Tests for a difference in averages (medians, the middle values, to be exact).
Compares two cases (e.g. species diversity in polluted and unpolluted water).
The data can be of any kind, provided they are numerical, e.g. lengths, percentages, numbers of people.
The samples do not have to be the same size.
Planning to use it? Make sure that:
You want to test for a difference.
You have just two cases to compare.
You have five or more values from each case.
If your data are likely to be normally distributed, it may be easier to get a significant result using the t-test.
How does it work?
You assume (null hypothesis) that there is no difference between the two cases.
The test involves ranking all the data together, then adding up the ranks for each sample.
If, for example, the values in the first sample were all much bigger, then the first sample would hold the top ranks (those nearest rank 1), so its rank sum would be much smaller than the second sample's.
Doing the test
These are the stages in doing the test:
1. Write down your hypotheses
2. Do the ranking
3. Calculate your U-values
4. Look at the tables
5. Make a decision
Hypotheses
$H_0$: There is no difference between population 1 and population 2.
For $H_1$, you have a choice, depending on what alternative you were looking for.
$H_1$: Population 1 is larger than population 2, e.g. species diversity in unpolluted water is greater than in polluted water.
or $H_1$: Population 1 is different from population 2, e.g. species diversity is different in unpolluted water and polluted water.
Unless you have a good scientific reason for expecting one to be larger, you should choose "different" for $H_1$.
Ranking
We need to put all the data together and rank them, but remember which sample each value is from. One easy way to do this is to write data from different samples in different colours.
Give rank 1 to the highest value, rank 2 to the second highest and so on.
If there are any ties, give them the average of the ranks they would have had.
E.g. suppose three pieces of data tie for second place. They would otherwise have been in 2nd, 3rd and 4th place. So give them all the average of 2nd, 3rd and 4th: that's rank 3.
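The ranking rule above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides; the function name `rank_descending` is made up for the example.

```python
def rank_descending(values):
    """Rank values so the highest gets rank 1; tied values share the average rank."""
    # Indices of the values, ordered from highest value to lowest.
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Extend 'end' over the run of values tied with the one at 'pos'.
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # Average of the 1-based places pos+1 .. end+1 the tied values would occupy.
        average = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = average
        pos = end + 1
    return ranks


# Three values tie for 2nd, 3rd and 4th place, so each gets rank 3.
print(rank_descending([50, 40, 40, 40, 10]))  # [1.0, 3.0, 3.0, 3.0, 5.0]
```

The ranks come back in the same order as the input, so each rank can still be matched to the sample its value came from.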
U-values
First work out:
$R_1$ = sum of ranks for sample 1
$R_2$ = sum of ranks for sample 2
Then work out:
$U_1 = n_1 n_2 + \tfrac{1}{2} n_1 (n_1 + 1) - R_1$
$U_2 = n_1 n_2 + \tfrac{1}{2} n_2 (n_2 + 1) - R_2$
where $n_1$, $n_2$ are the sizes of the two samples.
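As a rough sketch, the two formulae translate directly into Python. The function name `u_values` and the tiny data set in the usage line are invented for illustration only.

```python
def u_values(r1, r2, n1, n2):
    """Return (U1, U2) from the rank sums r1, r2 and the sample sizes n1, n2."""
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
    return u1, u2


# Hypothetical tiny case: sample 1 holds ranks 1 and 2, sample 2 holds ranks 3, 4 and 5.
u1, u2 = u_values(r1=3, r2=12, n1=2, n2=3)
print(u1, u2)  # 6.0 0.0
```

A useful arithmetic check: because the two rank sums always add up to the total of all the ranks, $U_1 + U_2$ should equal $n_1 n_2$ (here 6.0 + 0.0 = 2 × 3).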
Tables
Critical values are read from a Mann-Whitney table.
You usually have to look in a different table for each significance level.
The table is indexed by the sizes of the two samples. The bigger one is "$n_2$".
Make a decision
If you used $H_1$: Population 1 is larger than population 2:
You are doing a 1-tailed test (only one alternative considered).
Choose the U-value from the sample you expected to be smaller. With rank 1 given to the highest value, this should be the smaller of the two U-values; if it is not, the data run against your $H_1$ and you cannot reject $H_0$.
If your U-value is less than or equal to the table value, you reject your null hypothesis.
If you used $H_1$: Population 1 is different from population 2:
You are doing a 2-tailed test (both alternatives considered).
Choose the smaller of the two U-values.
If your U-value is less than or equal to the table value, you reject your null hypothesis.
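The 2-tailed decision can be sketched as a small Python helper. The function name `two_tailed_decision` is hypothetical, the critical value still has to be looked up in the table by hand, and the comparison assumes the common table convention that a U-value equal to the critical value counts as significant.

```python
def two_tailed_decision(u1, u2, critical_value):
    """Return (test statistic, reject H0?) for a 2-tailed Mann-Whitney test."""
    # The test statistic is the smaller of the two U-values.
    u = min(u1, u2)
    # Assumption: the table treats U equal to the critical value as significant.
    return u, u <= critical_value


# Hypothetical values: U1 = 4, U2 = 38, table value 6.
print(two_tailed_decision(4.0, 38.0, 6))  # (4.0, True)
```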
Example: Invertebrates in Long & Short Grass
Data were obtained for the number of invertebrates caught in sweep nets at 8 sites in long and short grass.
Hypotheses:
$H_0$: There is no difference in the number of invertebrates in long and short grass.
$H_1$: There is a difference in the number of invertebrates in long and short grass.
The data
[Table: Site, long grass, short grass]
Ranking
We need to put all the data together and rank them, but remember whether each value is from long or short grass. We'll do this with colours, one for long grass and one for short grass.
We have: 41, 43, 34, 37, 15, 22, 27, 47, 38, 98, 27, 72, 65
In order: 98, 72, 65, 47, 43, 41, 38, 37, 34, 27, 27, 22, 15
Ranks: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10.5, 10.5, 12, 13
U-Values
First find the sum of ranks for long and short grass:
Long: $R_1 = 63.5$
Short: $R_2 = 27.5$
Now work out the two U values, using the formulae:
$U_1 = n_1 n_2 + \tfrac{1}{2} n_1 (n_1 + 1) - R_1$
$U_2 = n_1 n_2 + \tfrac{1}{2} n_2 (n_2 + 1) - R_2$
So
$U_1 = (7)(6) + \tfrac{1}{2}(7)(7 + 1) - 63.5 = 6.5$ (long grass)
$U_2 = (7)(6) + \tfrac{1}{2}(6)(6 + 1) - 27.5 = 35.5$ (short grass)
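The arithmetic can be checked with a short Python sketch. One assumption is needed: the colour coding that marked each value as long or short grass is not preserved here, so the split below (first seven listed values long, last six short) is inferred because it reproduces the stated rank sums of 63.5 and 27.5.

```python
# Assumed split of the 13 listed values; the original colour coding is not available.
long_grass = [41, 43, 34, 37, 15, 22, 27]
short_grass = [47, 38, 98, 27, 72, 65]

# Pool the data and sort from highest to lowest (rank 1 = highest value).
ordered = sorted(long_grass + short_grass, reverse=True)


def avg_rank(value):
    """Rank of a value in the pooled data; tied values share the average rank."""
    first_place = ordered.index(value) + 1   # best place held by this value
    ties = ordered.count(value)              # how many times the value occurs
    return first_place + (ties - 1) / 2


r1 = sum(avg_rank(v) for v in long_grass)    # rank sum for long grass
r2 = sum(avg_rank(v) for v in short_grass)   # rank sum for short grass
n1, n2 = len(long_grass), len(short_grass)

u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2

print(r1, r2, u1, u2)  # 63.5 27.5 6.5 35.5
```

As a further check, the two U-values add up to $n_1 n_2 = 42$.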
The Test
Since our $H_1$ referred to "a difference", we're doing the 2-tailed test.
U = smaller of $U_1$ and $U_2$ = 6.5
Critical value (5%) = 6
Our value is larger, so we do not reject $H_0$: there is no significant difference between the numbers of invertebrates in long grass and in short grass.