Non-parametric Methods Chung-Yi Li, PhD Dept. of Public Health, College of Med. NCKU 1.


1 Non-parametric Methods Chung-Yi Li, PhD Dept. of Public Health, College of Med. NCKU 1

2 Nonparametric Statistics
Use when the assumptions of parametric tests (e.g., the t-test) are not met or cannot be verified
Use when data are ranks or classifications
Simple in principle, but most require a lot of calculation and are too cumbersome to do by hand for more than a few cases
Most have less power than the parametric alternatives (when the parametric assumptions hold), so they require a larger sample size to achieve the same power

3 Main Nonparametric Statistical Tests
Sign test / Median test (1-sample/paired-sample t-test)
Wilcoxon signed rank test (paired t-test)
Wilcoxon rank sum test (independent t-test)
Mann-Whitney U test (similar to the Wilcoxon rank sum test)
Kruskal-Wallis test (one-way ANOVA)
Kolmogorov-Smirnov test (test for normality)
Spearman and Kendall nonparametric correlations (Pearson’s correlation coefficient)
Fisher’s exact test (chi-square test)

4 Sign test / Median Test (1-sample/paired-sample t-test)

5 Sign / Median Test
Used in situations with limited ability to assess ranking of differences
 –Can only assess if the score for a subject is less than, greater than, or equal to the paired score
 –Test statistic depends only on the sign of the differences (in mean or, frequently, in median)
 –Special case of the one-sample binomial test with p=0.5
Assumptions
 –Ordinal measurement with continuous values

6 Single Sample
For most surgeries, the median time from anesthesia use to recovery is 7 hours. A new anesthesia procedure is claimed to reduce the time required for recovery. The following table shows the data for the study.
Time (hours)   4  4  5  5  5  6  6  6  7  7  8  9
Sign           -  -  -  -  -  -  -  -  0  0  +  +
Ho: The median time for the new anesthesia procedure is 7 hours
α = 0.05

7 Single Sample
Apply the binomial distribution with p=0.5. The two observations equal to 7 hours carry no sign and are dropped, leaving n=10 trials.
If Ho is true, we expect about 5 “+” out of 10 trials. We observed only 2 “+”.
To reject Ho at α=0.05 (one-sided), we should have at most 1 “+” (or at least 9 “-”) out of 10 trials, since P(X≤1)=0.011 but P(X≤2)=0.055.
The test therefore fails to reject Ho, i.e., the new anesthesia procedure is not shown to be significantly better than the traditional one in terms of reducing the time required for recovery.
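The binomial reasoning for the single-sample sign test can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the helper name `binom_cdf` is illustrative and not from the slides, and the one-sided alternative (shorter recovery time) is assumed from the wording of the example.

```python
from math import comb

times = [4, 4, 5, 5, 5, 6, 6, 6, 7, 7, 8, 9]  # recovery times (hours) from the example
median0 = 7                                   # hypothesised median under Ho

# Observations equal to the hypothesised median carry no sign and are dropped.
plus = sum(t > median0 for t in times)
minus = sum(t < median0 for t in times)
n = plus + minus

def binom_cdf(k, n, p=0.5):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

# One-sided p-value: chance of this few "+" signs (or fewer) if Ho is true.
p_value = binom_cdf(plus, n)
print(n, plus, round(p_value, 4))   # 10 2 0.0547
print(round(binom_cdf(1, n), 4))    # 0.0107
```

With 2 “+” signs the one-sided p-value is just above 0.05, while 1 “+” would have been enough to reject Ho.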

8 Paired Data
Differences in REE (Resting Energy Expenditure, kcal/day) between patients and healthy people matched on age, sex, body height, and body weight
Pair   Patients   Healthy people   Difference   Sign
1      1153       996              157          +
2      1132       1080             52           +
3      1165       1182             -17          -
4      1460       1452             8            +
5      1634       1162             472          +
6      1493       1619             -126         -
7      1358       1140             218          +
8      1453       1123             330          +
9      1185       1113             72           +
10     1824       1463             361          +
11     1793       1632             161          +
12     1930       1614             316          +
13     2075       1836             239          +
Ho: Median of difference = 0

9 Applying Binomial Distribution 9
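The calculation for this slide is not preserved in the transcript. As a hedged sketch, the sign test for the paired REE data could be carried out as follows, assuming that zero differences are dropped and that the two-sided p-value is obtained by doubling the smaller tail.

```python
from math import comb

patients = [1153, 1132, 1165, 1460, 1634, 1493, 1358,
            1453, 1185, 1824, 1793, 1930, 2075]
healthy = [996, 1080, 1182, 1452, 1162, 1619, 1140,
           1123, 1113, 1463, 1632, 1614, 1836]

diffs = [p - h for p, h in zip(patients, healthy)]
n_plus = sum(d > 0 for d in diffs)
n_minus = sum(d < 0 for d in diffs)
n = n_plus + n_minus          # zero differences (none here) would be excluded

# Under Ho the number of "+" signs is Binomial(n, 0.5).
k = min(n_plus, n_minus)
one_sided = sum(comb(n, i) for i in range(k + 1)) / 2**n
two_sided = min(1.0, 2 * one_sided)
print(n_plus, n_minus, round(one_sided, 4), round(two_sided, 4))  # 11 2 0.0112 0.0225
```

Under these assumptions, 11 positive signs out of 13 would lead to rejecting Ho at the 0.05 level.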

10 Wilcoxon signed rank test (paired t-test) 10

11 Wilcoxon Signed-rank Test
Widely used alternative to the paired t-test
 –Uses relative rankings, but does not depend on interval assumptions
Uses sums of ranks of differences
 –Tests the sum of the ranks of the positive (or negative) differences, rather than only the number of positive differences in the pairs
Assumptions
 –Ordinal data from a continuous distribution
 –Population of differences distributed symmetrically around its center

12 Wilcoxon Signed-rank Test Used for Dependent Data
FEV reduction (ml)
Pair   Placebo   Diuretic drug   Difference   Rank   Sign
1      224       213             11           1      +
2      80        95              -15          2      -
3      75        33              42           3      +
4      541       440             101          4      +
5      74        -32             106          5      +
6      85        -28             113          6      +
7      293       445             -152         7      -
8      -23       -178            155          8      +
9      525       367             158          9      +
10     -38       140             -178         10     -
11     508       323             185          11     +
12     255       10              245          12     +
13     525       65              460          13     +
14     1023      343             680          14     +

13 13 Wilcoxon Signed-rank Test Used for Two Dependent Samples
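The formulas on this slide did not survive in the transcript. The sketch below computes the signed-rank sums for the FEV data and applies the usual large-sample normal approximation; the choice of the normal approximation (rather than an exact table) and the absence of a continuity correction are assumptions, not something the transcript confirms.

```python
from math import sqrt
from statistics import NormalDist

placebo = [224, 80, 75, 541, 74, 85, 293, -23, 525, -38, 508, 255, 525, 1023]
diuretic = [213, 95, 33, 440, -32, -28, 445, -178, 367, 140, 323, 10, 65, 343]

diffs = [a - b for a, b in zip(placebo, diuretic) if a != b]
n = len(diffs)

# Rank the absolute differences (these data have no ties, so plain ranks suffice).
order = sorted(range(n), key=lambda i: abs(diffs[i]))
ranks = [0] * n
for r, i in enumerate(order, start=1):
    ranks[i] = r

t_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
t_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)

# Normal approximation for the smaller rank sum under Ho.
mean_t = n * (n + 1) / 4
sd_t = sqrt(n * (n + 1) * (2 * n + 1) / 24)
z = (min(t_plus, t_minus) - mean_t) / sd_t
p_two_sided = 2 * NormalDist().cdf(z)
print(t_plus, t_minus, round(z, 2), round(p_two_sided, 3))
```

The rank sums are T+ = 86 and T- = 19, giving z of about -2.10 under this approximation.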

14 Wilcoxon Rank Sum Test Mann-Whitney U Test (independent t test) 14

15 Wilcoxon Rank Sum Test
Widely used alternative to the t-test
 –The t-test is robust against deviations from normality, but some situations are extremely non-normal
Uses differences between ranks
 –More ‘powerful’ than the sign test, which ignores the magnitude of the differences
 –Tests the sum of ranks of one group
 –Essentially a test of medians
Assumptions
 –Interval measurement with continuous values
 –Symmetric population distributed around mean

16 Wilcoxon Rank Sum Test Used for Two Independent Samples
A study was intended to test the difference in IQ score between two samples of PKU children who had different levels of phenylalanine. The sample sizes for the low (<10 mg/dl) and high (>=10 mg/dl) level groups were 21 and 18, respectively.

17
Low exposed (<10.0 mg/dl)     High exposed (>=10 mg/dl)
IQ      Rank                  IQ      Rank
34.5    2.0                   28.0    1.0
37.5    6.0                   35.0    3.0
39.5    7.0                   37.0    4.5
40.0    8.0                   37.0    4.5
45.5    11.5                  43.5    9.0
47.0    14.5                  44.0    10.0
47.0    14.5                  45.5    11.5
47.5    16.0                  46.0    13.0
48.7    19.5                  48.0    17.0
49.0    21.0                  48.3    18.0
51.0    23.0                  48.7    19.5
51.0    23.0                  51.0    23.0
52.0    25.5                  52.0    25.5
53.0    28.0                  53.0    28.0
54.0    31.5                  53.0    28.0
54.0    31.5                  54.0    31.5
55.0    34.5                  54.0    31.5
56.5    36.0                  55.0    34.5
57.0    37.0
58.5    38.5
58.5    38.5
Rank sum: 467.0               Rank sum: 313.0
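The ranks in the table are midranks of the pooled sample: tied IQ scores share the average of the ranks they occupy. A minimal sketch of that ranking step, with an illustrative helper name `midranks` that is not from the slides:

```python
low = [34.5, 37.5, 39.5, 40.0, 45.5, 47.0, 47.0, 47.5, 48.7, 49.0, 51.0,
       51.0, 52.0, 53.0, 54.0, 54.0, 55.0, 56.5, 57.0, 58.5, 58.5]
high = [28.0, 35.0, 37.0, 37.0, 43.5, 44.0, 45.5, 46.0, 48.0, 48.3, 48.7,
        51.0, 52.0, 53.0, 53.0, 54.0, 54.0, 55.0]

def midranks(values):
    """Map each distinct value to its average rank in the sorted pooled sample."""
    ordered = sorted(values)
    rank_of = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        # Positions i+1 .. j hold the same value; each gets their average rank.
        rank_of[ordered[i]] = (i + 1 + j) / 2
        i = j
    return rank_of

rank_of = midranks(low + high)
w_low = sum(rank_of[v] for v in low)
w_high = sum(rank_of[v] for v in high)
print(w_low, w_high)  # 467.0 313.0
```

The two rank sums reproduce the column totals in the table and add up to 39 × 40 / 2 = 780, as they must.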

18 Wilcoxon Rank Sum Test Used for Two Independent Samples 18

19 Wilcoxon Rank Sum Test Used for Two Independent Samples 19

20 Wilcoxon Rank Sum Test Used for Two Independent Samples
The above approach can be applied when the sample size in each group is relatively large, say, somewhere between 6 and 25. If the sample size is small, an alternative approach based on calculating the exact probability must be applied.
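The test statistic itself is not preserved in the transcript. One common large-sample form is the normal approximation to the rank sum of the smaller group, sketched below for the PKU example. It is an assumption that the slides used this form; the sketch also omits the tie correction and the continuity correction, either of which would change the result slightly.

```python
from math import sqrt
from statistics import NormalDist

n_low, n_high = 21, 18   # group sizes from the example
w_high = 313.0           # rank sum of the smaller (high-exposure) group

n_total = n_low + n_high
mean_w = n_high * (n_total + 1) / 2               # expected rank sum under Ho
sd_w = sqrt(n_low * n_high * (n_total + 1) / 12)  # no tie correction applied

z = (w_high - mean_w) / sd_w
p_two_sided = 2 * NormalDist().cdf(-abs(z))
print(mean_w, round(sd_w, 2), round(z, 2), round(p_two_sided, 3))
```

Under these assumptions z is about -1.32, which would not reach significance at the 0.05 level.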

21 Mann-Whitney U Test
Test of ranks between two samples
Ranks the pooled observations in the two samples and then totals the ranks in each sample
If the medians are the same, the ranks will be similar
Assumptions
 –Independent
 –Ordinal scale and continuous values
 –Any difference reflected in the medians

22 Mann-Whitney U Test
The observations are ranked together in order of increasing magnitude.
There are n1 × n2 pairs (xi, yj)
 –Uxy is the number of pairs for which xi < yj
 –Uyx is the number of pairs for which xi > yj
 –Any pair for which xi = yj counts ½ a unit towards both Uxy and Uyx
Either of these statistics may be used for a test, with exactly equivalent results.

23 Mann-Whitney U Test
Using Uyx, for instance, the statistic must lie between 0 and n1 × n2. Under the null hypothesis its expectation is (n1 × n2)/2; high values suggest a difference between the distributions, with x tending to take higher values than y. Conversely, low values of Uyx suggest that x tends to be less than y.
The results of the Mann-Whitney U test are exactly equivalent to those of Wilcoxon’s rank sum test.
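The pair-counting definition of U and its link to the rank sum can be illustrated with the PKU IQ data from the rank sum example, taking x as the low-exposure group and y as the high-exposure group. The function name `u_count` is illustrative only.

```python
x = [34.5, 37.5, 39.5, 40.0, 45.5, 47.0, 47.0, 47.5, 48.7, 49.0, 51.0,
     51.0, 52.0, 53.0, 54.0, 54.0, 55.0, 56.5, 57.0, 58.5, 58.5]
y = [28.0, 35.0, 37.0, 37.0, 43.5, 44.0, 45.5, 46.0, 48.0, 48.3, 48.7,
     51.0, 52.0, 53.0, 53.0, 54.0, 54.0, 55.0]

def u_count(a, b):
    """Number of (a_i, b_j) pairs with a_i < b_j, counting each tie as 1/2."""
    return sum(1.0 if ai < bj else 0.5 if ai == bj else 0.0
               for ai in a for bj in b)

u_xy = u_count(x, y)   # pairs with x_i < y_j
u_yx = u_count(y, x)   # pairs with x_i > y_j
n1, n2 = len(x), len(y)
print(u_xy, u_yx, n1 * n2)   # 142.0 236.0 378

# Equivalence with the rank sum: U_yx equals the rank sum of x minus n1(n1+1)/2.
w_x = 467.0
print(w_x - n1 * (n1 + 1) / 2)   # 236.0
```

Because U differs from the rank sum only by a constant, the two tests always lead to the same conclusion.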

24 Kruskal-Wallis Test (One-Way ANOVA) 24

25 Kruskal-Wallis Test
An alternative to parametric ANOVA using ranks
 –Uses more information than the k-sample median test
Assign ranks to the pooled observations and then sum the ranks in each group
The test statistic is distributed approximately as chi-square for larger samples
 –Distribution for small samples is tabled
Assumptions
 –Independent
 –Ordinal measurements
 –Distributions differ by location parameter

26 K-W 1-way ANOVA for 2 or More Samples
Three instructors each taught a different class, and students were randomly assigned to the classes. The investigator wanted to know whether there is a difference in students’ median performance scores.

27 Performance score
Instructor   A      B      C
             96     80     115
             128    124    149
             83     132    166
             61     135    147
             101    109    -
Performance score ranks
Instructor   A      B      C
             4      2      7
             9      8      13
             3      10     14
             1      11     12
             5      6      -
Rank sum     R1=22  R2=37  R3=46

28 28 K-W 1-way ANOVA for 2 or More Samples
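The computation on this slide is missing from the transcript. A sketch of the standard Kruskal-Wallis H statistic for the instructor data follows. The chi-square reference distribution with 2 degrees of freedom is the large-sample approximation; with groups this small, a table of exact critical values may be more appropriate, so the p-value should be read as approximate.

```python
from math import exp

scores = {
    "A": [96, 128, 83, 61, 101],
    "B": [80, 124, 132, 135, 109],
    "C": [115, 149, 166, 147],
}

# Rank all observations together (no ties here, so ranks are simply 1..N).
pooled = sorted(v for group in scores.values() for v in group)
rank_of = {v: r for r, v in enumerate(pooled, start=1)}
n_total = len(pooled)

rank_sums = {g: sum(rank_of[v] for v in vals) for g, vals in scores.items()}

# H = 12 / (N(N+1)) * sum(R_i^2 / n_i) - 3(N+1)
h = 12 / (n_total * (n_total + 1)) * sum(
    rank_sums[g] ** 2 / len(vals) for g, vals in scores.items()
) - 3 * (n_total + 1)

# For 2 degrees of freedom the chi-square upper tail is exactly exp(-H/2).
p_value = exp(-h / 2)
print(rank_sums, round(h, 2), round(p_value, 3))
```

The rank sums match R1=22, R2=37, R3=46, and H comes out at about 6.41, which exceeds the 0.05 chi-square critical value of 5.99 for 2 degrees of freedom.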

29 K-W 1-way ANOVA for 2 or More Samples (Paired Comparisons)
The nonparametric analog of the Bonferroni t test may be used for paired comparisons as long as the K-W test concludes that there are significant differences between groups.
If there are k populations (treatments), then there are k(k-1)/2 possible pairs of medians that can be compared, which leads to k(k-1)/2 tests of the form:

30 30 K-W 1-way ANOVA for 2 or More Samples (Paired Comparisons) Ho: The medians of the ith and jth populations are the same. H1: The medians of the ith and jth populations are different.
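The test statistic for these pairwise comparisons is not preserved in the transcript. One widely used rank-based procedure with a Bonferroni adjustment is Dunn's test, sketched here for the instructor example. Treating Dunn's test as the procedure intended by the slides is an assumption, and the sketch omits any tie correction.

```python
from itertools import combinations
from math import sqrt
from statistics import NormalDist

rank_sums = {"A": 22, "B": 37, "C": 46}   # rank sums from the instructor example
sizes = {"A": 5, "B": 5, "C": 4}
n_total = sum(sizes.values())
alpha = 0.05

pairs = list(combinations(rank_sums, 2))   # every distinct pair of groups
alpha_per_test = alpha / len(pairs)        # Bonferroni adjustment

results = {}
for i, j in pairs:
    # Difference in mean ranks, scaled by its standard error under Ho.
    mean_diff = rank_sums[i] / sizes[i] - rank_sums[j] / sizes[j]
    se = sqrt(n_total * (n_total + 1) / 12 * (1 / sizes[i] + 1 / sizes[j]))
    z = mean_diff / se
    p = 2 * NormalDist().cdf(-abs(z))
    results[(i, j)] = (z, p, p < alpha_per_test)
    print(i, j, round(z, 2), round(p, 4), p < alpha_per_test)
```

Under these assumptions only the comparison between instructors A and C is significant at the adjusted level.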

31 Kolmogorov-Smirnov (Goodness-of-Fit Test) 31

32 Kolmogorov-Smirnov Goodness-of-fit Test
Alternative to the chi-square goodness-of-fit test for testing the difference between a sample cumulative distribution and a theoretical cumulative distribution
While the chi-square test is specifically designed for use with discrete or categorical data, the Kolmogorov-Smirnov test is for random samples from continuous populations.
It is frequently used to test whether a sample could come from a population with a particular continuous distribution (e.g., the normal distribution)

33 Kolmogorov-Smirnov Goodness-of-fit Test
Test statistic is the greatest difference at any point between the sample distribution and the theoretical distribution
Advantages
 –Does not require grouped observations
 –Can be used with continuous measures
 –Can be used with any size sample

34 Kolmogorov-Smirnov Goodness-of-fit Test
In an observational study of the Edmonds sea star at Polka Point, North Stradbroke Island, the following radii (in cm) were measured for 67 individuals. Do these data support the hypothesis that sea star radii are normally distributed?
2.5   5.0   6.0   6.5   7.0   8.0   9.5
4.0   5.0   6.0   6.5   7.0   8.0   10.0
4.0   5.0   6.0   7.0   7.5   8.0   10.0
4.5   5.5   6.5   7.0   7.5   8.5   10.5
4.5   5.5   6.5   7.0   7.5   8.5   11.0
4.5   5.5   6.5   7.0   7.5   8.5   13.0
4.5   5.5   6.5   7.0   7.5   8.5   13.5
4.5   6.0   6.5   7.0   7.5   8.5
5.0   6.0   6.5   7.0   8.0   9.0
5.0   6.0   6.5   7.0   8.0   9.0

35 Select a significance level for the test, say α=0.05.
The sample mean and standard deviation were 7.0 and 2.0 cm.
Create a table that divides the range of the data set into convenient intervals, usually but not always of the same width. Here 1-cm intervals are used.

36
x       Cum. freq. (X<=x)   S(x)    Zx      F(Zx)   |S(x)-F(Zx)|
2.75    1                   0.015   -2.12   0.017   0.002
3.75    1                   0.015   -1.62   0.053   0.038
4.75    8                   0.119   -1.12   0.132   0.013
5.75    17                  0.254   -0.62   0.269   0.015
6.75    32                  0.478   -0.12   0.454   0.024
7.75    48                  0.716   0.39    0.650   0.066
8.75    58                  0.866   0.89    0.812   0.054
9.75    61                  0.910   1.39    0.917   0.007
10.75   64                  0.955   1.89    0.970   0.015
11.75   65                  0.970   2.39    0.992   0.022
+∞      67                  1.000   +∞      1.000   0.000
For the first row: S(x) = 1/67; Zx = (2.75-7.0)/2.0; F(Zx) = P(Z < -2.12)

37 Kolmogorov-Smirnov Goodness-of-fit Test
K = max |S(x)-F(Zx)|
In this example, K=0.066. At the α=0.05 level and a sample size of 67, the critical value is 0.1632; Ho is rejected only if K is greater than the critical value.
Since 0.066 < 0.1632, we fail to reject Ho.
The data are consistent with the hypothesis that sea star radii are normally distributed with a mean of 6.98 cm and a standard deviation of 2.00 cm.
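The tabulated procedure can be reproduced as follows. This sketch follows the slides in evaluating the gap only at the 1-cm cut points and uses the mean (6.98) and SD (2.00) quoted in the conclusion; a full Kolmogorov-Smirnov statistic would check the gap at every observed value, so this grouped version is an approximation. The critical value 0.1632 is taken from the slide rather than computed.

```python
from statistics import NormalDist

# The 67 sea star radii (cm), written as value * count.
radii = (
    [2.5] + [4.0] * 2 + [4.5] * 5 + [5.0] * 5 + [5.5] * 4 + [6.0] * 6
    + [6.5] * 9 + [7.0] * 10 + [7.5] * 6 + [8.0] * 5 + [8.5] * 5
    + [9.0] * 2 + [9.5] + [10.0] * 2 + [10.5, 11.0, 13.0, 13.5]
)
n = len(radii)
normal = NormalDist(mu=6.98, sigma=2.00)     # hypothesised normal distribution

cut_points = [2.75 + i for i in range(10)]   # 2.75, 3.75, ..., 11.75
gaps = {}
for x in cut_points:
    s_x = sum(r <= x for r in radii) / n     # sample cumulative proportion S(x)
    f_x = normal.cdf(x)                      # theoretical cumulative probability
    gaps[x] = abs(s_x - f_x)

k_stat = max(gaps.values())
critical_value = 0.1632                      # alpha = 0.05, n = 67, from the slide
print(n, round(k_stat, 4), k_stat > critical_value)
```

The largest gap occurs at x = 7.75 and agrees with the tabulated 0.066 up to rounding of the Z values, well below the critical value.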

38 38 Kolmogorov-Smirnov Goodness-of-fit Test

39 Spearman and Kendall Nonparametric Correlations (Pearson’s correlation coefficient) 39

40 Nonparametric Correlations
Spearman’s rank-order correlation
 –Uses ranks of two variables and calculates the Pearson correlation between the ranks
Kendall’s tau correlation
 –Also uses ranks of two variables, but uses the difference between the ranks and the maximum possible difference score to calculate the correlation
Spearman’s and Kendall’s are both correlations and both use the same amount of information, but are not directly comparable
Assumptions
 –Variables measured on an ordinal scale

41 Spearman Rank-order Correlation Coefficient
Two professors assessed 12 students. The following table shows the ranks they assigned. What is the correlation between the two sets of scores assigned by the two professors?

42
Student id   Prof. A (xi)   Prof. B (yi)   di = xi-yi   di^2 = (xi-yi)^2
1            2.5*           5              -2.5         6.25
2            2.5*           2**            0.5          0.25
3            9              8              1.0          1.00
4            5.5*           7              -1.5         2.25
5            12             12             0            0
6            7.5*           11             -3.5         12.25
7            1              2**            -1.0         1.00
8            10             6              4.0          16.00
9            4              2**            2.0          4.00
10           5.5*           4              1.5          2.25
11           7.5*           10             -2.5         6.25
12           11             9              2.0          4.00
Sum of di^2 = 55.50
(* and ** mark tied ranks)

43 Spearman Rank-order Correlation Coefficient 43
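The formula on this slide is not preserved in the transcript. The sketch below applies the usual shortcut formula r_s = 1 - 6 Σd² / (n(n² - 1)) to the ranks in the table. Two assumptions are made: the slides used this shortcut without a tie correction, and student 5's rank from Prof. B is 12, which is implied by the tabulated difference of 0.

```python
# Ranks assigned by the two professors (tied ranks appear as midranks).
rank_a = [2.5, 2.5, 9, 5.5, 12, 7.5, 1, 10, 4, 5.5, 7.5, 11]   # Prof. A
rank_b = [5, 2, 8, 7, 12, 11, 2, 6, 2, 4, 10, 9]               # Prof. B (student 5 assumed 12)

n = len(rank_a)
d_squared = [(x - y) ** 2 for x, y in zip(rank_a, rank_b)]
sum_d2 = sum(d_squared)

# Shortcut formula; exact only without ties, otherwise a close approximation.
r_s = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))
print(sum_d2, round(r_s, 3))   # 55.5 0.806
```

When ties are present, computing the Pearson correlation directly on the ranks gives a slightly different, tie-corrected value.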

44 44 Kendall’s tau Correlation The Kendall correlation coefficient depends on a direct comparison of the n observations (xi, yi) with each other.

45 45 Kendall’s tau Correlation Two observations, for example (190,186) and (182,185), are called concordant if both members of one pair are larger than the corresponding number of the other pair (190>182 and 186>185); A pair of observations, such as (180,188) and (182,185), are called discordant if a number in the first pair is larger than the corresponding number in the second pair (188>185), while the other number in the first pair is smaller than the corresponding number in the second pair (180<182).

46 Kendall’s tau Correlation Pairs with at least one “tie” between respective members are neither concordant nor discordant. Because there are n paired observations in the sample, there are n(n-1)/2 such comparisons that are made. n(n-1)/2=C+D+E, where C=# of concordant pairs of observations; D=# of discordant pairs, and E the # of ties. 46

47 Notion for Kendall’s tau Correlation
If the concordant comparisons greatly outnumber the discordant ones (C>>D), there is strong evidence for a positive correlation.
Conversely, if the discordant comparisons greatly outnumber the concordant ones (D>>C), there is strong evidence for a negative correlation.
If the numbers of concordant and discordant comparisons are roughly equal, there is evidence for no correlation.

48 Notion for Kendall’s tau Correlation
If C=n(n-1)/2 and D=0, then tau=+1
If D=n(n-1)/2 and C=0, then tau=-1
In all other cases, -1<tau<+1
Ties are not counted in tau since they do not constitute evidence for either a positive or a negative correlation
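The counting of concordant, discordant, and tied comparisons can be sketched as below. The formula tau = (C - D) / (n(n-1)/2) is assumed because it matches the boundary cases stated above; the transcript does not show the exact formula used. The three observations are the ones quoted in the concordant/discordant examples, used here only as a toy data set.

```python
from itertools import combinations

def kendall_counts(observations):
    """Return (C, D, E): concordant, discordant, and tied comparisons."""
    c = d = e = 0
    for (x1, y1), (x2, y2) in combinations(observations, 2):
        s = (x1 - x2) * (y1 - y2)
        if s > 0:
            c += 1      # both members move in the same direction
        elif s < 0:
            d += 1      # members move in opposite directions
        else:
            e += 1      # at least one tie: neither concordant nor discordant
    return c, d, e

obs = [(190, 186), (182, 185), (180, 188)]
c, d, e = kendall_counts(obs)
n = len(obs)
tau = (c - d) / (n * (n - 1) / 2)
print(c, d, e, round(tau, 3))   # 1 2 0 -0.333
```

The counts satisfy C + D + E = n(n-1)/2, and tau stays between -1 and +1 by construction.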

49 Fisher’s Exact Test (Chisq-test) 49

50 Fisher’s Exact Test
Used when there are small sample sizes in at least one cell
Tests for independence in a 2x2 table (can be extended to r x c tables)
Gives the exact p-value for the observed result (or one more extreme), whereas the chi-square test is an approximation
Today, it can be used in virtually any situation, not just for small sample sizes
Limitations of the chi-square test: not reliable when (1) one or more cells have an expected value <2 or (2) 20% or more of the cells have expected values <5

51 Fisher’s Exact Test Computationally, Fisher’s Exact Test is: 51

52 Fisher’s Exact Test Gives us the probability for only the observed table. –We need the probability of that table and all tables more extreme to be consistent with the approach to hypothesis testing –Use the hypergeometric distribution to test this 52

53 Fisher’s Exact Test
Nine newborns needed heart transplant surgery, but only 5 of them received it. The following table shows their survival status after 12 months. Is heart surgery associated with survival at the 12th month after birth?

54 Fisher’s Exact Test Null hypothesis: No difference in one-year survival rate between the groups Alternative hypothesis: Heart surgery had better survival at the 12th month after birth Alpha=0.05 54

55 Fisher’s Exact Test
                     Survival at the 12th month
Heart surgery        Yes     No      Total
Yes                  a=4     b=1     5
No                   c=1     d=3     4
Total                5       4       9

56 Fisher’s Exact Test
A more extreme situation would be:
                     Survival at the 12th month
Heart surgery        Yes     No      Total
Yes                  a=5     b=0     5
No                   c=0     d=4     4
Total                5       4       9

57 Fisher’s Exact Test
P = p1 + p2 = 0.159 + 0.008 = 0.167
For a two-tailed test, simply multiply 0.167 by 2, obtaining a p-value of 0.334
Based on the hypothesis test, there is no evidence that heart transplant surgery results in better survival at 12 months of age for the infants who received it.
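The two probabilities p1 and p2 can be reproduced from the hypergeometric distribution with the table margins held fixed. This is a minimal sketch; the function name `table_prob` is illustrative and not from the slides.

```python
from math import comb

def table_prob(a, b, c, d):
    """Hypergeometric probability of a 2x2 table with its margins held fixed."""
    n = a + b + c + d
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)

p1 = table_prob(4, 1, 1, 3)   # observed table
p2 = table_prob(5, 0, 0, 4)   # more extreme table with the same margins
p_one_sided = p1 + p2
print(round(p1, 3), round(p2, 3), round(p_one_sided, 3))   # 0.159 0.008 0.167
```

The one-sided p-value equals 21/126, matching the 0.167 reported above.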

