Nonparametric tests
European Molecular Biology Laboratory Predoc Bioinformatics Course, 17th Nov 2009
Tim Massingham
What is a nonparametric test?
Parametric: assume the data come from some family of distribution functions, for example
    Normal distribution: mean, variance
    Gamma distribution: shape, scale
    etc.
Most traditional tests assume a normal distribution.
Non-parametric: no assumptions are made about the form of the distribution. In practice this generally means looking only at the ranks of the data.
[Figure: gamma distributions with different shape and scale parameters]
Robustness
A single observation can change the outcome of many tests. Robust tests are resistant to outliers but require more data.
Example: 200 observations from normal distributions, x ~ normal(0,1), y ~ normal(1,3).
Pearson's correlation test (parametric):
    Correlation = (p-value = )
    Correlation = (p-value = )
    Correlation = (p-value = 5.81e-06)
    Correlation = (p-value = 6.539e-08)
Spearman's correlation test (non-parametric):
    Correlation = (p-value = )
    Correlation = (p-value = )
    Correlation = (p-value = 0.101)
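The effect described on this slide can be reproduced with simulated data. The sketch below is not the original example: the seed, the position of the outlier, and reading normal(1,3) as mean 1 and standard deviation 3 are all assumptions made for illustration.

```r
# Sketch: one outlier changes Pearson's correlation far more than Spearman's.
set.seed(1)                           # arbitrary seed, for reproducibility
x <- rnorm(200, mean = 0, sd = 1)     # x ~ normal(0, 1)
y <- rnorm(200, mean = 1, sd = 3)     # y ~ normal(1, 3); 3 is taken as the sd (assumption)

# Append a single extreme observation (hypothetical outlier)
x_out <- c(x, 50)
y_out <- c(y, 150)

pearson_clean  <- cor.test(x, y, method = "pearson")
pearson_out    <- cor.test(x_out, y_out, method = "pearson")
spearman_clean <- cor.test(x, y, method = "spearman")
spearman_out   <- cor.test(x_out, y_out, method = "spearman")

# Compare estimates and p-values with and without the outlier
print(rbind(
  pearson  = c(clean = unname(pearson_clean$estimate),
               outlier = unname(pearson_out$estimate)),
  spearman = c(clean = unname(spearman_clean$estimate),
               outlier = unname(spearman_out$estimate))
))
print(c(pearson_p = pearson_out$p.value, spearman_p = spearman_out$p.value))
```

Spearman's test only sees the outlier as "the largest rank in both variables", which is why its estimate barely moves.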
Newcomb's speed of light data
Newcomb's lab (1878) to the Washington monument (~12 μs later)
Standard test of all data: Mean % confidence interval (width=5.3)
Newcomb dropped the outlier: Mean % confidence interval (width=3.1)
Robust test (sign test for median): Median % confidence interval (width=2.5)
Efficiency of robust tests
Few results, mostly for large samples.
Asymptotic Relative Efficiency
    Asymptotic: valid for large samples
    Relative efficiency: ratio of variance
Percentage extra data:
    Using median rather than mean: 50% more data
    Wilcoxon test vs. t-test: 20% more data (no more than)
Potvin and Roff (1993) Ecology 74: percentage extra data for same tests. Requires less data!
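The "ratio of variance" idea can be checked by simulation. The sketch below assumes normally distributed data and uses an arbitrary sample size and number of replicates. For normal data the variance of the sample median approaches pi/2 (about 1.57) times the variance of the sample mean, which is where a figure of roughly 50% more data comes from.

```r
# Sketch: relative efficiency of median vs mean for normal data, by simulation.
set.seed(2)                          # arbitrary seed
n    <- 101                          # observations per sample (assumption)
reps <- 5000                         # number of simulated samples (assumption)

samples <- matrix(rnorm(n * reps), nrow = reps)   # one sample per row

var_mean   <- var(rowMeans(samples))              # sampling variance of the mean
var_median <- var(apply(samples, 1, median))      # sampling variance of the median

rel_eff <- var_median / var_mean                  # large-sample value is pi/2 for normal data
print(c(ratio = rel_eff, asymptotic = pi / 2))
```

The ratio depends on the distribution: for heavy-tailed data the median can have the smaller variance.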
Kolmogorov test
AKA Kolmogorov-Smirnov test
Type of data: continuous
Parametric equivalent: none
Distribution of statistic: exact when no ties in data
Does this data follow a specific distribution?
Are two sets of data from the same distribution?
Test statistic: the maximum difference between the two cumulative distribution functions.
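To make the "maximum difference" concrete, the statistic can be computed by hand from the two empirical cumulative distribution functions and compared with what ks.test reports. The data below are simulated for illustration only.

```r
# Sketch: the two-sample Kolmogorov statistic D computed by hand.
set.seed(3)                     # arbitrary seed
a <- rnorm(50)                  # hypothetical sample 1
b <- rnorm(60, mean = 0.5)      # hypothetical sample 2

Fa <- ecdf(a)                   # empirical CDF of each sample
Fb <- ecdf(b)

# The largest gap between two step functions occurs at an observed data point
grid   <- sort(c(a, b))
D_hand <- max(abs(Fa(grid) - Fb(grid)))

D_r <- unname(ks.test(a, b)$statistic)
print(c(by_hand = D_hand, ks.test = D_r))
```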
Kolmogorov test
Why does it work? The difference in ranks stays constant under any transformation that only stretches and contracts the x axis, i.e. one that preserves the order of the data.
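A quick way to see this invariance is to apply an order-preserving transformation to both samples and check that the statistic does not change. The samples and the transformations below (log and square, both increasing on positive values) are chosen for illustration only.

```r
# Sketch: the Kolmogorov statistic is unchanged by order-preserving transformations.
set.seed(4)                      # arbitrary seed
a <- rexp(40)                    # hypothetical positive-valued samples
b <- rexp(50, rate = 0.7)

D_raw <- unname(ks.test(a, b)$statistic)
D_log <- unname(ks.test(log(a), log(b))$statistic)   # log is increasing
D_sq  <- unname(ks.test(a^2, b^2)$statistic)         # squaring is increasing on positive values

print(c(raw = D_raw, log = D_log, squared = D_sq))
```

A t-test on the same three versions of the data would give three different answers.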
Kolmogorov test
Is Studentized expression data normal?
    ks.test(stud_logexp, pnorm)
    One-sample Kolmogorov-Smirnov test
    data: stud_logexp
    D = , p-value < 2.2e-16
    alternative hypothesis: two-sided
Not valid when the null distribution has been fitted to the data, e.g. testing against a normal distribution whose mean and variance were estimated from the same data.
For testing whether data is normally distributed or not, the Shapiro-Wilk test is preferred. See shapiro.test in R.
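The warning about fitted null distributions can be illustrated by simulation. The sketch below draws data that really are normal, fits the mean and standard deviation to each sample, and counts how often each test rejects at the 5% level. Sample size, replicate count and the true parameters are arbitrary assumptions.

```r
# Sketch: the Kolmogorov test is miscalibrated when the null distribution is fitted.
set.seed(5)                      # arbitrary seed
reps <- 500                      # number of simulated data sets (assumption)
n    <- 50                       # observations per data set (assumption)

# Kolmogorov test against a normal whose parameters are estimated from the same data
p_fitted <- replicate(reps, {
  z <- rnorm(n, mean = 10, sd = 2)
  ks.test(z, "pnorm", mean = mean(z), sd = sd(z))$p.value
})

# Shapiro-Wilk on data generated the same way
p_shapiro <- replicate(reps, shapiro.test(rnorm(n, mean = 10, sd = 2))$p.value)

reject_fitted  <- mean(p_fitted  < 0.05)   # should be near 0.05 if the test were valid
reject_shapiro <- mean(p_shapiro < 0.05)
print(c(ks_fitted = reject_fitted, shapiro = reject_shapiro))
```

Because fitting pulls the null distribution towards the data, the Kolmogorov p-values come out too large and the test almost never rejects.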
Kolmogorov two-sample test
Are two sets of data from the same distribution?
Gene expression data from Arabidopsis thaliana:
    sprayed with 1.6 mM Tween
    sprayed with water
    ks.test(logexp1,logexp2)
    Two-sample Kolmogorov-Smirnov test
    data: logexp1 and logexp2
    D = , p-value =
    alternative hypothesis: two-sided
Biggest deviations for low expression.
Sign test
Type of data: continuous
Parametric equivalent: Student's t-test (one sample)
Distribution of statistic: exact when no ties in data
Is the median of the data zero? Is the median x? (Subtract x from the data and test against zero.)
If the median is zero, each observation has a 50:50 chance of falling on either side of it. Count the observations on each side (< 0 and > 0) and use a binomial test.
[Figure: gene expression differences, split at the median into < 0 and > 0]
Sign test
Is the median of the data zero? Is the median x? (Subtract x from the data and test against zero.)
Gene expression differences: expect the difference in expression to be zero. Discard differences of exactly zero, then count the differences < 0 and > 0.
    binom.test( c(10155,12334) )
    Exact binomial test
    data: c(10155, 12334)
    number of successes = 10155, number of trials = 22489, p-value < 2.2e-16
    alternative hypothesis: true probability of success is not equal to 0.5
    95 percent confidence interval:
    sample estimates: probability of success
The confidence interval is on the proportion, not the expression difference.
SIGN.test in the PASWR package is a more convenient way of doing a sign test and gives confidence intervals.
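The counting steps on this slide can be written out in a few lines. The differences below are simulated, not the expression data from the course, and are rounded only so that some exact zeros appear and have to be discarded.

```r
# Sketch: sign test for "median = 0" via an exact binomial test.
set.seed(6)                                # arbitrary seed
d <- round(rnorm(200, mean = 0.5), 1)      # hypothetical differences; rounding creates exact zeros

d_nz  <- d[d != 0]                         # discard differences of exactly zero
n_neg <- sum(d_nz < 0)                     # count below zero
n_pos <- sum(d_nz > 0)                     # count above zero

# Under H0 (median zero) each remaining difference is negative with probability 0.5
sign_test <- binom.test(c(n_neg, n_pos))
print(sign_test)

# To test "median = m" instead, subtract m first, e.g. d - m, and repeat the steps above
```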
Wilcoxon Signed Rank test
Type of data: ordinal (interval for paired data)
Parametric equivalent: Student's t-test
Distribution of statistic: exact
Is the data symmetric about zero? Is the data symmetric about x? (Subtract x and test against zero.)
This is a much stronger assumption than the sign test makes. The test rejects non-symmetric data, even when the median has been subtracted:
    a <- rweibull(1000,1,1)
    wilcox.test( a-median(a) )
    p-value = 1.087e-05
[Figure: distribution of a, median=0.72]
Wilcoxon Signed Rank test
Special case when we do expect symmetry: paired data
    Same gene under two different conditions
    Measuring response (before and after)
    Paired control, e.g. sibling pairs
Look at a pair X & Y:
    X = Intrinsic + Random X
    Y = Intrinsic + Random Y
    X - Y = Random X - Random Y
The random part is a random property such as measurement error or natural variation. The intrinsic part cancels, so when Random X and Random Y have the same distribution, the distribution of the difference is symmetric about zero.
Wilcoxon Signed Rank test
Have gene expression data in two matched Arabidopsis thaliana plants:
    one sprayed with 1.6 mM Tween and left for one hour
    one sprayed with distilled water and left for one hour
The genes form matched pairs.
[Figure: panels Water, Tween, Difference]
Wilcoxon Signed Rank test
    wilcox.test( lexp1, lexp2, paired=TRUE )
    Wilcoxon signed rank test with continuity correction
    data: lexp1 and lexp2
    V = , p-value < 2.2e-16
    alternative hypothesis: true location shift is not equal to 0

    wilcox.test( lexp1, lexp2, paired=TRUE, conf.int=TRUE)
    Wilcoxon signed rank test with continuity correction
    data: lexp1 and lexp2
    V = , p-value < 2.2e-16
    alternative hypothesis: true location shift is not equal to 0
    95 percent confidence interval:
    sample estimates: (pseudo)median
Wilcoxon Rank Sum Test
Also referred to as the Mann-Whitney or Mann-Whitney-Wilcoxon test
Type of data: ordinal
Parametric equivalent: two-sample Student's t-test
Distribution of statistic: exact
Do two samples have the same median?
Look at the same expression data but ignore the pairing:
    wilcox.test( lexp1, lexp2, conf.int=TRUE)
    Wilcoxon rank sum test with continuity correction
    data: lexp1 and lexp2
    W = , p-value =
    alternative hypothesis: true location shift is not equal to 0
    95 percent confidence interval:
    sample estimates: difference in location
Paired vs two-sample tests
Pairing can make a huge difference to the power of a test.
Look at a case where the variation in the intrinsic part is greater than the effect:
    wilcox.test(sample1,sample2)
    Wilcoxon rank sum test
    data: sample1 and sample2
    W = 4930, p-value =
    alternative hypothesis: true location shift is not equal to 0

    wilcox.test(sample1,sample2,paired=TRUE)
    Wilcoxon signed rank test
    data: sample1 and sample2
    V = 1609, p-value =
    alternative hypothesis: true location shift is not equal to 0
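A simulation in the spirit of this slide, using the Intrinsic + Random model for paired data, shows how large the gap can be. The sample size, standard deviations and effect size below are assumptions chosen so that the intrinsic variation swamps the effect. They are not the values behind the output above.

```r
# Sketch: the same data analysed with and without the pairing.
set.seed(7)                                        # arbitrary seed
n <- 100                                           # number of pairs (assumption)
intrinsic <- rnorm(n, mean = 0, sd = 5)            # large variation shared within each pair

sample1 <- intrinsic + rnorm(n, sd = 0.3)          # Intrinsic + Random X
sample2 <- intrinsic + 0.3 + rnorm(n, sd = 0.3)    # Intrinsic + effect + Random Y

unpaired <- wilcox.test(sample1, sample2)                 # rank sum test, ignores pairing
paired   <- wilcox.test(sample1, sample2, paired = TRUE)  # signed rank test on the differences

print(c(unpaired_p = unpaired$p.value, paired_p = paired$p.value))
```

In the paired test the intrinsic term cancels in each difference, so only the small random terms compete with the effect.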
Kruskal-Wallis
Type of data: ordinal
Parametric equivalent: ANOVA
Distribution of statistic: approximate
What if we have several groups?
The Arabidopsis gene expression data consisted of 6 experiments, giving 6 groups of expression data; do they have different medians?
    kruskal.test(gene_expression)
    Kruskal-Wallis rank sum test
    data: gene_expression
    Kruskal-Wallis chi-squared = , df = 5, p-value = 2.575e-11
For two samples, Kruskal-Wallis is equivalent to the Wilcoxon Rank Sum test.
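The two-sample equivalence can be checked numerically. In the sketch below the groups are simulated and passed to kruskal.test as a list. The Wilcoxon test is run with its normal approximation and without continuity correction, because that is the version whose p-value matches the Kruskal-Wallis chi-squared approximation.

```r
# Sketch: Kruskal-Wallis on several groups, and its two-group special case.
set.seed(8)                                   # arbitrary seed
groups <- list(g1 = rnorm(30),                # hypothetical groups of measurements
               g2 = rnorm(30, mean = 0.2),
               g3 = rnorm(30, mean = 1))

kw_all <- kruskal.test(groups)                # list input: one element per group
print(kw_all)                                 # df is number of groups minus one

# Two groups only: compare with the Wilcoxon rank sum test
kw_two <- kruskal.test(groups[c("g1", "g3")])
wx_two <- wilcox.test(groups$g1, groups$g3, exact = FALSE, correct = FALSE)
print(c(kruskal_p = kw_two$p.value, wilcoxon_p = wx_two$p.value))
```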
Friedman test
Type of data: ordinal
Parametric equivalent: ANOVA with blocks
Distribution of statistic: approximate
Where it fits:
    Paired observations (genes in two groups, G1 and G2): Wilcoxon Signed Rank test
    Many groups: Kruskal-Wallis test
    Many groups in distinct units (genes by groups): Friedman test
Friedman test
Classic example: wine tasting
Ask 4 women to rank 3 different wines; is one wine preferred?
                Merlot  Shiraz  Pinot Noir
    Agnes          1       2        3
    Clara          2       1        3
    Mona           1       3        2
    Pam            1       2        3

    friedman.test(wine)
    Friedman rank sum test
    data: wine
    Friedman chi-squared = 4.5, df = 2, p-value =

Flip the question: are the judges ranking wines in a consistent manner?
                Agnes  Clara  Mona  Pam
    Merlot        1      2     1     1
    Shiraz        2      1     3     2
    Pinot Noir    3      3     2     3

    friedman.test(t(wine))
    Friedman rank sum test
    data: t(wine)
    Friedman chi-squared = , df = 3, p-value =
Expected, since we are forcing the judges to rank.
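The wine example is small enough to rebuild and check by hand. The sketch below enters the rankings from the table as a matrix (rows are judges, i.e. blocks; columns are wines, i.e. treatments) and compares friedman.test with the textbook formula for untied ranks, Q = 12 / (n k (k + 1)) * sum(R_j^2) - 3 n (k + 1), where R_j is the rank sum for wine j. The variable names are illustrative.

```r
# Sketch: Friedman statistic for the wine rankings, via R and by hand.
wine <- matrix(c(1, 2, 3,
                 2, 1, 3,
                 1, 3, 2,
                 1, 2, 3),
               nrow = 4, byrow = TRUE,
               dimnames = list(c("Agnes", "Clara", "Mona", "Pam"),
                               c("Merlot", "Shiraz", "Pinot Noir")))

ft <- friedman.test(wine)        # rows = blocks (judges), columns = treatments (wines)
print(ft)

# Hand calculation (valid here because there are no ties within a judge)
n <- nrow(wine)                  # number of judges
k <- ncol(wine)                  # number of wines
rank_sums <- colSums(wine)       # entries are already ranks within each judge
Q <- 12 / (n * k * (k + 1)) * sum(rank_sums^2) - 3 * n * (k + 1)
print(c(by_hand = Q, friedman.test = unname(ft$statistic)))
```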
Friedman test
Another look at the Arabidopsis data: look at the first 20 genes.
[Figure: expression of the first 20 genes (Genes) in Exp 1 to Exp 6]
Friedman Test p-value =
Kruskal-Wallis Test p-value =
Friedman test
Friedman Test p-value =
Friedman / Kruskal-Wallis: at least one experiment shows a difference. They do not say which experiment.
Follow up with pairwise Wilcoxon Signed Rank tests (multiple comparisons problem).
[Tables: raw p-values and adjusted p-values for each pair of experiments, Exp 1 to Exp 6 by Exp 1 to Exp 6]
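The pairwise follow-up can be sketched as below. The data are simulated (20 genes by 6 experiments, with one experiment shifted), and Holm's method is used for the adjustment only because it is the default of p.adjust in R; the slides do not say which adjustment was applied.

```r
# Sketch: pairwise Wilcoxon signed rank tests with adjustment for multiple comparisons.
set.seed(9)                                       # arbitrary seed
expr <- matrix(rnorm(20 * 6), nrow = 20,          # hypothetical data: 20 genes x 6 experiments
               dimnames = list(NULL, paste("Exp", 1:6)))
expr[, 3] <- expr[, 3] + 1.5                      # shift one experiment so there is something to find

pairs <- combn(ncol(expr), 2)                     # all 15 pairs of experiments

# Genes are the matched units, so each comparison is a paired test
raw_p <- apply(pairs, 2, function(ij) {
  wilcox.test(expr[, ij[1]], expr[, ij[2]], paired = TRUE)$p.value
})
names(raw_p) <- apply(pairs, 2, function(ij) {
  paste(colnames(expr)[ij], collapse = " vs ")
})

adj_p <- p.adjust(raw_p, method = "holm")         # Holm adjustment (assumption, see above)
print(round(cbind(raw = raw_p, adjusted = adj_p), 4))
```

With 15 comparisons, several raw p-values below 0.05 would be expected to lose significance after adjustment.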
Friedman test
[Table: adjusted p-values from Signed Rank test, Exp 1 to Exp 6 by Exp 1 to Exp 6]
Experiment map: actually have three pairs of experiments
    A: Exp 6 & Exp 1: with and without Tween, 1 hour
    B: Exp 2 & Exp 3: with and without Tween, 2.5 hours
    C: Exp 5 & Exp 4: with and without Tween, 1 hour (replicate of A)
The difference detected may not be a useful one.
But note: looked at the first 20 genes only. The full set has 22810.
Aside on blocking
Statistical lingo: experiments are treatments, genes are blocks.
The Friedman test assumes that all treatments are applied to all blocks (a balanced complete design).
Might not be able to do this:
    too expensive
    blocks only available in packs of fixed size
The result is an incomplete experimental design. Which treatments go with which blocks is a critical issue.
Talk to a statistician before you start.
[Figure: grid of Gene by Experiment]