# Nonparametric tests I: Back to basics

Nonparametric tests I Back to basics

Lecture Outline

- What is a nonparametric test?
- Rank tests, distribution free tests and nonparametric tests
- Which type of test to use

    MTB > dotplot 'Male' 'Female';
    SUBC> same.

[Dotplot: MALE and FEMALE on a shared axis running from 0.32 to 1.12]

    MTB > desc 'Male' 'Female'

    Variable    N      Mean    Median    TrMean     StDev    SEMean
    MALE       50    0.5908    0.5600    0.5770    0.1979    0.0280
    FEMALE     50    0.5180    0.4950    0.5102    0.1315    0.0186

    Variable       Min       Max        Q1        Q3
    MALE        0.2900    1.1300    0.4275    0.7150
    FEMALE      0.3200    0.8500    0.4100    0.6125
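The SEMean column can be checked by hand, since the standard error of a mean is the standard deviation divided by the square root of the sample size. A minimal Python sketch of that check follows; the function name `se_mean` is ours and is not part of Minitab.

```python
import math

def se_mean(stdev, n):
    """Standard error of the mean: sample standard deviation over sqrt(n)."""
    return stdev / math.sqrt(n)

# StDev and N as printed in the Minitab output
male_se = se_mean(0.1979, 50)
female_se = se_mean(0.1315, 50)

print(round(male_se, 4), round(female_se, 4))
```

To four decimals these agree with the 0.0280 and 0.0186 that Minitab reports.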

Lecture Outline

- What is a nonparametric test?
  - What is a parameter?
  - What are examples of non-parametric tests?
- Rank tests, distribution free tests and nonparametric tests
- Which type of test to use

Parameters are central to inference in GLM and ANOVA and represent assumptions about the underlying processes

    LET K1=4.7    # Group 1 mean minus grand mean
    LET K2=-2.5   # Group 2 mean minus grand mean
    LET K3=10.4   # The grand mean
    LET K4=1.9    # Standard deviation of the error
    RANDOM 30 'Error'
    LET 'Y'=K3+K1*'DUM1'+K2*'DUM2'+K4*'Error'

Fitted value = $\mu$ + group effect, where the group effect is:

| Group | Effect |
|-------|--------|
| 1 | $\alpha_1$ |
| 2 | $\alpha_2$ |
| 3 | $-\alpha_1 - \alpha_2$ |

Error has Normal distribution with zero mean and standard deviation $\sigma$.
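The Minitab commands above generate data from a three-group model with known parameters. A rough Python equivalent is sketched below, writing `MU`, `ALPHA1`, `ALPHA2` and `SIGMA` for K3, K1, K2 and K4. The dummy-variable coding is inferred from the fitted values on the slide, and the split of the 30 observations into three groups of 10 is an assumption, since the source only gives the total.

```python
import random

MU = 10.4        # grand mean (K3)
ALPHA1 = 4.7     # group 1 mean minus grand mean (K1)
ALPHA2 = -2.5    # group 2 mean minus grand mean (K2)
SIGMA = 1.9      # standard deviation of the error (K4)

def dummies(group):
    """Sum-to-zero coding: group 3 is coded -1 on both dummy variables."""
    return {1: (1, 0), 2: (0, 1), 3: (-1, -1)}[group]

def fitted(group):
    """Fitted value for a group: grand mean plus that group's effect."""
    dum1, dum2 = dummies(group)
    return MU + ALPHA1 * dum1 + ALPHA2 * dum2

def simulate(groups, seed=0):
    """One simulated response per entry in groups: fitted value plus Normal error."""
    rng = random.Random(seed)
    return [fitted(g) + SIGMA * rng.gauss(0.0, 1.0) for g in groups]

# Assumption: 10 observations per group (the source only says 30 in total)
groups = [1] * 10 + [2] * 10 + [3] * 10
y = simulate(groups)
```

With this coding the three group effects sum to zero, so the average of the three fitted values is the grand mean. Every number that goes into `y` is a parameter, and each one is an assumption about how the data arose.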

Parameters

- are central to inference in GLM and ANOVA
- but represent assumptions about the underlying processes
- can be done without in some simple situations

BUT HOW?

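| Rnk | Wt | Sex | Rnk | Wt | Sex | Rnk | Wt | Sex | Rnk | Wt | Sex |
|-----|------|-----|-----|------|-----|-----|------|-----|-----|------|-----|
| 1 | 0.29 | 1 | 26 | 0.41 | 2 | 51 | 0.52 | 1 | 76 | 0.65 | 1 |
| 2 | 0.32 | 2 | 27 | 0.42 | 1 | 52 | 0.52 | 2 | 77 | 0.66 | 1 |
| 3 | 0.34 | 1 | 28 | 0.43 | 1 | 53 | 0.52 | 2 | 78 | 0.67 | 1 |
| 4 |  | 2 | 29 | 0.43 | 2 | 54 | 0.53 | 2 | 79 | 0.67 | 2 |
| 5 |  | 2 | 30 | 0.43 | 2 | 55 | 0.53 | 2 | 80 | 0.67 | 2 |
| 6 | 0.36 | 1 | 31 | 0.45 | 1 | 56 | 0.55 | 2 | 81 | 0.67 | 2 |
| 7 |  | 1 | 32 | 0.45 | 2 | 57 | 0.56 | 1 | 82 | 0.68 | 1 |
| 8 | 0.37 | 1 | 33 | 0.45 | 2 | 58 | 0.56 | 1 | 83 | 0.71 | 1 |
| 9 |  | 1 | 34 | 0.45 | 2 | 59 | 0.56 | 1 | 84 | 0.72 | 2 |
| 10 | 0.37 | 1 | 35 | 0.46 | 2 | 60 | 0.57 | 1 | 85 | 0.73 | 1 |
| 11 | 0.37 | 2 | 36 | 0.47 | 1 | 61 | 0.58 | 2 | 86 | 0.75 | 1 |
| 12 | 0.37 | 2 | 37 | 0.47 | 1 | 62 | 0.58 | 2 | 87 | 0.75 | 1 |
| 13 | 0.38 | 1 | 38 | 0.48 | 1 | 63 | 0.59 | 1 | 88 | 0.77 | 1 |
| 14 | 0.38 | 1 | 39 | 0.48 | 1 | 64 | 0.59 | 2 | 89 | 0.78 | 1 |
| 15 | 0.38 | 2 | 40 | 0.48 | 2 | 65 | 0.59 | 2 | 90 | 0.78 | 2 |
| 16 | 0.38 | 2 | 41 | 0.48 | 2 | 66 | 0.60 | 1 | 91 | 0.78 | 2 |
| 17 | 0.39 | 2 | 42 | 0.49 | 2 | 67 | 0.61 | 1 | 92 | 0.82 | 2 |
| 18 | 0.40 | 2 | 43 | 0.49 | 2 | 68 | 0.61 | 2 | 93 | 0.83 | 1 |
| 19 | 0.40 | 2 | 44 | 0.50 | 1 | 69 | 0.62 | 1 | 94 | 0.85 | 1 |
| 20 | 0.40 | 2 | 45 | 0.50 | 1 | 70 | 0.62 | 1 | 95 | 0.85 | 2 |
| 21 | 0.41 | 1 | 46 | 0.50 | 1 | 71 | 0.62 | 2 | 96 | 0.88 | 1 |
| 22 | 0.41 | 1 | 47 | 0.50 | 2 | 72 | 0.62 | 2 | 97 | 0.98 | 1 |
| 23 | 0.41 | 2 | 48 | 0.50 | 2 | 73 | 0.62 | 2 | 98 | 0.98 | 1 |
| 24 | 0.41 | 2 | 49 | 0.51 | 1 | 74 | 0.63 | 1 | 99 | 1.05 | 1 |
| 25 | 0.41 | 2 | 50 | 0.51 | 2 | 75 | 0.63 | 2 | 100 | 1.13 | 1 |

Remember ties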
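The Rnk column simply numbers the observations from 1 to 100 in sorted order. "Remember ties" presumably refers to the usual convention that tied observations share the average of the positions they occupy (midranks). A minimal Python sketch of that convention follows; the function name `midranks` is ours.

```python
def midranks(values):
    """Rank values from 1..n, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value equals the current one
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average = (start + 1 + end + 1) / 2  # mean of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = average
        start = end + 1
    return ranks

# Four tied values in positions 3 to 6 each receive rank (3+4+5+6)/4 = 4.5
example = midranks([0.29, 0.32, 0.38, 0.38, 0.38, 0.38, 0.39])
```

In the table, the four weights of 0.38 sit in positions 13 to 16 (two of each sex), so under this convention each would receive rank 14.5 rather than 13, 14, 15 and 16.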

[Figure: Mean Rank]

The ‘Male’ mean rank = 55.26

The ‘Female’ mean rank = 45.74

    MTB > mann-whitney male female

    Mann-Whitney Test and CI: MALE, FEMALE

    MALE      N = 50    Median = 0.5600
    FEMALE    N = 50    Median = 0.4950
    Point estimate for ETA1-ETA2 is 0.0500
    95.0 Percent CI for ETA1-ETA2 is (-0.0100,0.1200)
    W = 2763.0

Sum of ranks of 2763 corresponds to a mean rank of 2763/50 = 55.26
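The two mean ranks are tied together by the fact that the ranks 1 to N always sum to N(N+1)/2. A short worked calculation, writing $W_M$ and $W_F$ for the rank sums of the male and female samples (our notation, not Minitab's), with 50 observations in each group and N = 100:

```latex
\begin{align*}
W_M + W_F &= \frac{N(N+1)}{2} = \frac{100 \times 101}{2} = 5050 \\
\bar{R}_M &= \frac{W_M}{50} = \frac{2763}{50} = 55.26 \\
\bar{R}_F &= \frac{5050 - 2763}{50} = \frac{2287}{50} = 45.74
\end{align*}
```

So once W is known for one sample, the mean rank of the other follows without any further ranking.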

    MTB > mann-whitney male female

    Mann-Whitney Test and CI: MALE, FEMALE

    MALE      N = 50    Median = 0.5600
    FEMALE    N = 50    Median = 0.4950
    Point estimate for ETA1-ETA2 is 0.0500
    95.0 Percent CI for ETA1-ETA2 is (-0.0100,0.1200)
    W = 2763.0
    Test of ETA1 = ETA2 vs ETA1 not = ETA2 is significant at 0.1016
    The test is significant at 0.1014 (adjusted for ties)

    Cannot reject at alpha = 0.05

The null hypothesis is better expressed as “the distributions of male and female weights are the same”.
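The unadjusted significance level can be approximated from W alone. The Python sketch below assumes the usual normal approximation to the rank sum with a continuity correction of 0.5 and no tie adjustment; whether Minitab computes it in exactly this way is an assumption, and the function name `rank_sum_p_value` is ours.

```python
import math

def rank_sum_p_value(w, n1, n2):
    """Two-sided p-value for a rank sum W via the normal approximation.

    Uses a continuity correction of 0.5 and no adjustment for ties.
    """
    n = n1 + n2
    mean_w = n1 * (n + 1) / 2            # expected rank sum under the null
    var_w = n1 * n2 * (n + 1) / 12       # its variance when there are no ties
    z = max(abs(w - mean_w) - 0.5, 0.0) / math.sqrt(var_w)
    return math.erfc(z / math.sqrt(2))   # equals 2 * (1 - Phi(z)) for z >= 0

p = rank_sum_p_value(2763.0, 50, 50)
print(round(p, 4))
```

Under the null hypothesis the expected rank sum is 50 × 101 / 2 = 2525, so W = 2763 is 238 above expectation. The resulting p-value should come out close to the unadjusted 0.1016 in the output. The tie-adjusted 0.1014 needs the full pattern of ties and is not reproduced here.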

| Nonparametric | Parametric |
|---------------|------------|
| Sign Test | One-sample t-test |
| Mann-Whitney Test | Two-sample t-test |
| Spearman Rank Test | Correlation/Regression |
| Kruskal-Wallis Test | One-way ANOVA |
| Friedman Test | One-way blocked ANOVA |

A rose by any other name...

- Non-parametric tests lack parameters
- Rank tests start by ranking the data
- Distribution-free tests don’t assume a Normal distribution (or any other)

These are mainly but not completely overlapping sets of tests (and some are scale-invariant too).
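One reading of "scale-invariant" is that a rank test gives the same answer after any strictly increasing change of scale, such as taking logs, because the ranks themselves do not change. A small Python sketch of that property follows, using a handful of illustrative weights rather than the full data set; the function name `positions` is ours.

```python
import math

def positions(values):
    """Position (1..n) of each value in sorted order; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for pos, i in enumerate(order, start=1):
        ranks[i] = pos
    return ranks

weights = [0.56, 0.29, 1.13, 0.85, 0.32]   # illustrative values only
logged = [math.log(w) for w in weights]

# A strictly increasing transformation leaves the ranks untouched,
# so any statistic computed from the ranks is unchanged too.
same_ranks = positions(weights) == positions(logged)
```

A test based on means and standard deviations has no such guarantee, since those summaries change when the scale changes.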

Fewer assumptions but...

- still some assumptions (including independence)
- limited range of situations
  - no more than 2 x-variables
  - can’t mix continuous and categorical x-variables
- provide p-values but estimation is dodgy
- loss of efficiency if parametric assumptions are upheld
- there is a grand scheme for parametric statistics (GLM) but a lot of separate strange names for nonparametrics

When is there a choice?

- when there is a non-parametric test
  - fewer than two or three variables altogether
- and prediction is not required

How to choose:

- If the assumptions of the parametric test are upheld, use it, on grounds of efficiency
- If not upheld, consider fixing the assumptions (e.g. by transforming the data, as in the practical)
- If the assumptions are not fixable, use a nonparametric test

    MTB > dotplot 'LogM' 'LogF';
    SUBC> same.

[Dotplot: LogM and LogF on a shared axis running from -1.25 to 0.00]

    MTB > desc 'LogM' 'LogF'

    Variable    N       Mean     Median     TrMean     StDev    SEMean
    LogM       50    -0.5786    -0.5798    -0.5850    0.3248    0.0459
    LogF       50    -0.6878    -0.7032    -0.6928    0.2453    0.0347

    Variable        Min        Max         Q1         Q3
    LogM        -1.2379     0.1222    -0.8499    -0.3355
    LogF        -1.1394    -0.1625    -0.8916    -0.4902
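The slides do not say which logarithm was used, but the extremes give it away. A quick Python check, taking the Min and Max of the raw weights from the earlier `desc 'Male' 'Female'` output, suggests natural logs:

```python
import math

# Min and Max of the raw weights, as printed by `desc 'Male' 'Female'`
raw = {"MALE": (0.29, 1.13), "FEMALE": (0.32, 0.85)}

# Natural logs, rounded to the four decimals Minitab prints
logged = {sex: tuple(round(math.log(x), 4) for x in pair) for sex, pair in raw.items()}

print(logged)
```

These match the Min and Max reported for LogM and LogF. Because the log is strictly increasing, the smallest and largest weights remain the smallest and largest after transformation.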

Last remarks

- Nonparametric tests are an opportunity to revise the basic ideas of statistical inference
- They are sometimes useful in biology
- They are often used in biology

NEXT WEEK: more nonparametrics, including confidence intervals and randomisation tests. READ the handout.
