The Kruskal-Wallis H Test
The Kruskal-Wallis H Test is a nonparametric procedure that can be used to compare more than two populations in a completely randomized design. All n = n1 + n2 + … + nk measurements are jointly ranked (i.e. treated as one large sample). We use the sums of the ranks of the k samples to compare the distributions.
The Kruskal-Wallis H Test
Rank the measurements in all k samples jointly from 1 to n. Tied observations are assigned the average of the ranks they would have received if untied. Calculate Ti = rank sum of the ith sample, i = 1, 2, …, k, and the test statistic
H = [12 / (n(n+1))] * Σ (Ti² / ni) − 3(n+1)
The Kruskal-Wallis H Test
H0: the k distributions are identical, versus Ha: at least one distribution is different.
Test statistic: Kruskal-Wallis H. When H0 is true, H has an approximate chi-square distribution with df = k − 1. Use a right-tailed rejection region or p-value based on the chi-square distribution.
Example
Four groups of students were randomly assigned to be taught with four different techniques, and their achievement test scores were recorded. Are the distributions of test scores the same, or do they differ in location?

Method 1: 65  87  73  79
Method 2: 75  69  83  81
Method 3: 59  78  67  62
Method 4: 94  89  80  88
Teaching Methods
Rank the 16 measurements from 1 to 16 and calculate the four rank sums (ranks in parentheses):

Method 1: 65 (3)   87 (13)  73 (6)   79 (9)    T1 = 31
Method 2: 75 (7)   69 (5)   83 (12)  81 (11)   T2 = 35
Method 3: 59 (1)   78 (8)   67 (4)   62 (2)    T3 = 15
Method 4: 94 (16)  89 (15)  80 (10)  88 (14)   T4 = 55

H0: the distributions of scores are the same
Ha: the distributions differ in location
Teaching Methods
H0: the distributions of scores are the same; Ha: the distributions differ in location.
Test statistic: H = [12/(16·17)] · (31² + 35² + 15² + 55²)/4 − 3·17 = 8.96
Rejection region: for a right-tailed chi-square test with α = .05 and df = 4 − 1 = 3, reject H0 if H > 7.81. Since H = 8.96 > 7.81, reject H0. There is sufficient evidence to indicate a difference in test scores for the four teaching techniques.
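The example can be checked outside SPSS; a minimal sketch, assuming Python with scipy is available (scipy is not part of the original slides):

```python
# Kruskal-Wallis H test for the four teaching methods.
from scipy import stats

method1 = [65, 87, 73, 79]
method2 = [75, 69, 83, 81]
method3 = [59, 78, 67, 62]
method4 = [94, 89, 80, 88]

# scipy jointly ranks all 16 scores and computes H with df = k - 1 = 3.
H, p = stats.kruskal(method1, method2, method3, method4)
print(round(H, 2))  # 8.96 > 7.81, so p < .05 -> reject H0
```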
Key Concepts
I. Nonparametric Methods
These methods can be used when the data cannot be measured on a quantitative scale, or when the numerical scale of measurement is arbitrarily set by the researcher, or when the parametric assumptions such as normality or constant variance are seriously violated.
Key Concepts
Kruskal-Wallis H Test: Completely Randomized Design
1. Jointly rank all the observations in the k samples (treated as one large sample of size n). Calculate the rank sums Ti = rank sum of sample i, and the test statistic
H = [12 / (n(n+1))] * Σ (Ti² / ni) − 3(n+1)
2. If the null hypothesis of equality of distributions is false, H will be unusually large, resulting in a one-tailed test.
3. For sample sizes of five or greater, the rejection region for H is based on the chi-square distribution with (k − 1) degrees of freedom.
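The three steps above can be sketched directly in code (a sketch, not from the slides; scipy's rankdata supplies the average-rank treatment of ties):

```python
# Hand computation of the Kruskal-Wallis H statistic, following steps 1-3.
from scipy.stats import rankdata

def kruskal_wallis_H(samples):
    """H = 12/(n(n+1)) * sum(Ti^2 / ni) - 3(n+1), Ti = rank sum of sample i."""
    pooled = [x for s in samples for x in s]
    n = len(pooled)
    ranks = rankdata(pooled)              # joint ranking; ties get average ranks
    total, start = 0.0, 0
    for s in samples:
        Ti = ranks[start:start + len(s)].sum()   # rank sum of sample i
        total += Ti**2 / len(s)
        start += len(s)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)

# Teaching-methods data from the example (rank sums 31, 35, 15, 55):
H = kruskal_wallis_H([[65, 87, 73, 79], [75, 69, 83, 81],
                      [59, 78, 67, 62], [94, 89, 80, 88]])
print(round(H, 2))  # 8.96
```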
Testing for trends: the Jonckheere-Terpstra test
This test looks at the differences between the medians of the groups, just as the Kruskal-Wallis test does. Additionally, it includes information about whether the medians are ordered. In our example, we predict an order for the sperm counts in the 4 groups: no meal > 1 meal > 4 meals > 7 meals. In the coding variable, we have already encoded the order which we expect (1 > 2 > 3 > 4).
Output of the J-T test
If you have the J-T test in your version of SPSS, the output looks like this:
Z-score = ( ) / 116.33 = −2.476
The J-T test should always be one-tailed (since we have a directional hypothesis!). We compare against 1.65, which is the z-value for an α-level of 5% in a one-tailed test. Since 2.47 > 1.65, the result is significant. The negative sign means that the medians are in descending order (a positive sign would have meant ascending order).
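If your SPSS version lacks the J-T test, the statistic and its normal approximation can be computed by hand. A minimal sketch, assuming no ties in the data; the function name and the toy data are illustrative only:

```python
# Minimal sketch of the Jonckheere-Terpstra test (no-ties normal approximation).
import math

def jonckheere_terpstra(groups):
    """groups must be listed in the hypothesised order of increasing medians."""
    # JT counts pairs (x from an earlier group, y from a later group) with x < y;
    # ties count 0.5.
    JT = 0.0
    for i in range(len(groups)):
        for j in range(i + 1, len(groups)):
            for x in groups[i]:
                for y in groups[j]:
                    JT += 1.0 if x < y else (0.5 if x == y else 0.0)
    sizes = [len(g) for g in groups]
    n = sum(sizes)
    mean = (n**2 - sum(ni**2 for ni in sizes)) / 4
    var = (n**2 * (2 * n + 3) - sum(ni**2 * (2 * ni + 3) for ni in sizes)) / 72
    z = (JT - mean) / math.sqrt(var)
    return JT, z

# Toy data in increasing order; a descending trend (as on this slide)
# would give a negative z instead.
JT, z = jonckheere_terpstra([[1, 2], [3, 4]])
print(JT, round(z, 2))  # 4.0, 1.55
```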
Differences between several related groups: Friedman's ANOVA
Friedman's ANOVA is the non-parametric analogue of a repeated-measures ANOVA (see chapter 11), where the same subjects have been subjected to various conditions. Example here: testing the effect of a new diet called 'Andikins diet' on n = 10 women. Their weight (in kg) was measured 3 times: Start, Month 1, Month 2. Would they lose weight in the course of the diet?
Theory of Friedman's ANOVA
Each subject's weight on each of the 3 dates is listed in a separate column. Then ranks for the 3 dates are determined and listed in separate columns: for each subject the 3 scores are compared, and the smallest gets rank 1, the next rank 2, and the biggest rank 3. Then the ranks are summed up for each condition (Ri).
(Table: diet data with ranks.)
The test statistic Fr
From the sum of ranks for each condition, the test statistic Fr is derived:
Fr = [12 / (Nk(k+1))] * Σ(i=1..k) Ri² − 3N(k+1)
   = [12 / (10·3·(3+1))] · (1202) − (3·10)(3+1)
   = (12/120) · (1202) − 120
   = 0.1 · 1202 − 120
   = 0.2
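The formula can be checked with a short hand implementation. A sketch only: the diet data are not reproduced on the slide, so a toy data set with a perfectly consistent ordering is used instead:

```python
# Friedman's Fr statistic: rank within each subject, then
# Fr = 12/(N k (k+1)) * sum(Ri^2) - 3N(k+1)
from scipy.stats import rankdata

def friedman_Fr(scores):
    """scores: one row per subject, one column per condition."""
    N, k = len(scores), len(scores[0])
    rank_sums = [0.0] * k
    for row in scores:
        for cond, r in enumerate(rankdata(row)):  # smallest score gets rank 1
            rank_sums[cond] += r
    return 12 * sum(R**2 for R in rank_sums) / (N * k * (k + 1)) - 3 * N * (k + 1)

# Toy data: 3 subjects, 3 conditions, every subject ranks the conditions
# the same way (rank sums 3, 6, 9), which gives the maximum Fr for N=3, k=3.
print(friedman_Fr([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))  # 6.0
```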
Data input and provisional analysis (using diet.sav)
First, test for normality: Analyze > Descriptive Statistics > Explore, tick 'Normality plots with tests' in the 'Plots' window. (Data sheet.) In the Shapiro-Wilk test (which is more accurate than the K-S test), two groups (Start, 1 month) show non-normal distributions. This violation of a parametric constraint justifies the choice of a non-parametric test.
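The same provisional check can be run outside SPSS; a sketch assuming scipy, with made-up weights standing in for the diet.sav data:

```python
# Shapiro-Wilk normality check (counterpart of SPSS's
# Explore > 'Normality plots with tests'); the weights below are invented.
from scipy import stats

weight_start = [63.8, 70.2, 58.1, 91.4, 65.0, 68.3, 72.9, 60.5, 77.6, 55.2]
W, p = stats.shapiro(weight_start)
print(round(W, 3), round(p, 3))
if p < .05:
    print("non-normal -> prefer a non-parametric test")
```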
Running Friedman's ANOVA
Analyze > Non-parametric Tests > K Related Samples... If you have 'Exact', tick 'Exact' and limit the calculation time to 5 minutes. Under 'Exact...', request everything there is – it is not much...
Other options
Kendall's W: similar to Friedman's ANOVA, but looks specifically at agreement between raters. For example: to what extent (from 0 to 1) do women agree in rating Justin Timberlake, David Beckham, and Tony Blair on their attractiveness? This is like a correlation coefficient.
Cochran's Q: this is an extension of McNemar's test. It is like a Friedman's test for dichotomous data. For example, women judge whether they would like to kiss Justin Timberlake, David Beckham, or Tony Blair, and they can only answer Yes or No.
Output from Friedman's ANOVA
The Fr statistic is called Chi-Square here. It has df = 2 (k − 1, where k is the number of groups). The statistic is n.s.
Post-hoc tests for Friedman's ANOVA
Wilcoxon signed-rank tests, but correcting for the number of tests we do: here αcrit = .05/3 = .0167. Analyze > Nonparametric Tests > 2 Related Samples, tick 'Wilcoxon', specify the 3 pairs of groups. (Output: mean ranks and sums of ranks for all 3 comparisons.) All comparisons are n.s., as expected from the overall n.s. effect – so, actually, we do not have to calculate any further...
Post-hoc tests for Friedman's ANOVA – calculation by hand
We take the difference between the mean ranks of the different groups and compare it to a critical value based on z (corrected for the number of comparisons) and a constant based on the total sample size (N = 10) and the number of conditions (k = 3):
|Ru − Rv| ≥ z(α/k(k−1)) · √(k(k+1)/6N)
The corrected level is α/(k(k−1)) = .05/(3·(3−1)) = .05/6 ≈ .0083. A difference is significant if it exceeds the critical value built from this z. As before, we look in Appendix A.1 under the column 'Smaller Portion'; the z corresponding to .0083 is the critical value: it is between 2.39 and 2.40.
Calculating the critical differences
Critical difference = z(α/k(k−1)) · √(k(k+1)/6N)
crit. diff = 2.4 · √(3(3+1)/(6·10))
crit. diff = 2.4 · √(12/60)
crit. diff = 2.4 · √0.2
crit. diff = 1.07
If the difference between two mean ranks is ≥ the critical difference 1.07, then that difference is significant.
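The same critical difference can be computed without the z-table; a sketch in which scipy's norm.ppf replaces the look-up in Appendix A.1:

```python
# Critical difference for Friedman post-hoc comparisons of mean ranks:
# crit = z_{alpha/(k(k-1))} * sqrt(k(k+1) / (6N))
import math
from scipy.stats import norm

alpha, k, N = .05, 3, 10
z = norm.ppf(1 - alpha / (k * (k - 1)))     # one-tailed z for alpha/6 ~ .0083
crit = z * math.sqrt(k * (k + 1) / (6 * N))
print(round(z, 2), round(crit, 2))  # z between 2.39 and 2.40, crit ~ 1.07
```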
Calculating the differences between mean ranks for the diet data
None of the differences exceeds the critical difference 1.07; hence none of the comparisons is significant.
Calculating the effect size
Again, we will only calculate the effect sizes for the single comparisons, using r = z/√(2n):
r(Start – 1 month) = −.01
r(Start – 2 months) = −.06
r(1 month – 2 months) = −.03
Tiny effects.
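The z-scores behind these effect sizes can be recovered from the Wilcoxon T values reported in the write-up (T = 27, 25, 26). A sketch, assuming the normal approximation to the Wilcoxon signed-rank statistic:

```python
# Effect sizes r = z / sqrt(2n) for the three Wilcoxon follow-up tests.
# T values come from the report; the normal approximation
# z = (T - n(n+1)/4) / sqrt(n(n+1)(2n+1)/24) is assumed.
import math

n = 10                                             # subjects per comparison
mean_T = n * (n + 1) / 4                           # 27.5
sd_T = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)   # ~ 9.81

for label, T in [("Start - 1 month", 27),
                 ("Start - 2 months", 25),
                 ("1 month - 2 months", 26)]:
    z = (T - mean_T) / sd_T
    r = z / math.sqrt(2 * n)
    print(label, round(r, 2))   # -0.01, -0.06, -0.03: tiny effects
```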
Reporting the results of Friedman's ANOVA (Field, 2005, p. 566)
"The weight of participants did not significantly change over the 2 months of the diet (χ²(2) = 0.20, p > .05). Wilcoxon tests were used to follow up on this finding. A Bonferroni correction was applied and so all effects are reported at a .0167 level of significance. It appeared that weight didn't significantly change from the start of the diet to 1 month, T = 27, r = −.01, from the start of the diet to 2 months, T = 25, r = −.06, or from 1 month to 2 months, T = 26, r = −.03. We can conclude that the Andikins diet (...) is a complete failure."