Nonparametric Statistics

In previous testing, we assumed that our samples were drawn from normally distributed populations. This chapter introduces some techniques that do not make that assumption. These methods are called distribution-free or nonparametric tests. In situations where the normal assumption is appropriate, nonparametric tests are less efficient than traditional parametric methods. Nonparametric tests frequently make use only of the order of the observations and not the actual values.

In this section, we will discuss four nonparametric tests: the Wilcoxon Rank Sum Test (or Mann-Whitney U test), the Wilcoxon Signed Ranks Test, the Kruskal-Wallis Test, and the one sample test of runs.

The Wilcoxon Rank Sum Test or Mann-Whitney U Test

This test is used to test whether 2 independent samples have been drawn from populations with the same median. It is a nonparametric substitute for the t-test on the difference between two means.

Wilcoxon Rank Sum Test Example

Based on the following samples from two universities, test at the 10% level whether graduates from the two schools have the same average grade on an aptitude test.

University A: 50, 52, 56, 60, 64, 68, 71, 74, 89, 95
University B: 70, 73, 77, 80, 83, 85, 87, 88, 96, 99

First merge and rank the grades. Sum the ranks for each sample.

rank  grade  university
1     50     A
2     52     A
3     56     A
4     60     A
5     64     A
6     68     A
7     70     B
8     71     A
9     73     B
10    74     A
11    77     B
12    80     B
13    83     B
14    85     B
15    87     B
16    88     B
17    89     A
18    95     A
19    96     B
20    99     B

rank sum for university A: 74
rank sum for university B: 136

Note: If there are ties, each value gets the average rank. For example, if 2 values tie for 3rd and 4th place, both are ranked 3.5. If three values would be ranked 7, 8, and 9, rank them all 8.
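The merge-and-rank step can be sketched in a few lines of Python. This is only an illustrative sketch: the grade lists are read off the ranked table in the example, and the helper names `average_ranks` and `rank_sums` are made up for this illustration rather than taken from the slides.

```python
# Grades read off the ranked list in the example (university A and B).
grades_a = [50, 52, 56, 60, 64, 68, 71, 74, 89, 95]
grades_b = [70, 73, 77, 80, 83, 85, 87, 88, 96, 99]


def average_ranks(values):
    """Return 1-based ranks for values; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + 1 + end + 1) / 2  # average of 1-based positions pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def rank_sums(sample_a, sample_b):
    """Merge both samples, rank them together, and sum the ranks per sample."""
    ranks = average_ranks(sample_a + sample_b)
    return sum(ranks[:len(sample_a)]), sum(ranks[len(sample_a):])


sum_a, sum_b = rank_sums(grades_a, grades_b)
print(sum_a, sum_b)  # 74.0 136.0
```

The two sums always add up to n(n+1)/2 (here 20 × 21 / 2 = 210), which is a quick check on the ranking.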

Since the critical values for a 2-tailed Z test at the 10% level are 1.645 and -1.645, and the Z statistic for these rank sums falls outside that range, we reject $H_0$ that the medians are the same and accept $H_1$ that the medians are different.
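The Z statistic itself is not preserved in the transcript. As a hedged sketch, the usual large-sample approximation (no tie correction) treats the rank sum T1 of a sample of size n1 as roughly normal with mean n1(n+1)/2 and variance n1·n2·(n+1)/12, where n = n1 + n2. The function name `rank_sum_z` is an assumption made for this illustration.

```python
import math


def rank_sum_z(t1, n1, n2):
    """Normal approximation for the Wilcoxon rank sum statistic (no tie correction)."""
    n = n1 + n2
    mean_t1 = n1 * (n + 1) / 2                  # expected rank sum under H0
    sd_t1 = math.sqrt(n1 * n2 * (n + 1) / 12)   # standard deviation under H0
    return (t1 - mean_t1) / sd_t1


# Rank sum for university A from the example: T1 = 74 with n1 = n2 = 10.
z = rank_sum_z(74, 10, 10)
print(round(z, 3))  # -2.343
```

Under these assumptions Z is about -2.34, which lies below -1.645 and agrees with the decision to reject the null hypothesis.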

For small sample sizes, you can use Table E.6 in your textbook, which provides the lower and upper critical values for the Wilcoxon Rank Sum Test. That table shows that for our 10% 2-tailed test, the lower critical value is 82 and the upper critical value is 128. Since the rank sum for university A is 74, which is outside the interval (82, 128) indicated in the table, we reject the null hypothesis that the medians are the same and conclude that they are different. Equivalently, since the rank sum for university B is 136, which is also outside the interval (82, 128), we again reject the null hypothesis that the medians are the same and conclude that they are different.

The Wilcoxon Signed Rank Test

This test is used to test whether 2 dependent samples have been drawn from populations with the same median. It is a nonparametric substitute for the paired t-test on the difference between two means.

Wilcoxon Signed Rank Test Procedure

1. Calculate the differences in the paired values, $D_i = X_{1i} - X_{2i}$.
2. Take the absolute values of the differences and rank them. (Discard all differences that equal 0.)
3. Assign ranks $R_i$, with the smallest rank equal to 1. As in the rank sum test, if two or more of the differences are equal, each difference gets the average rank. (That is, if two differences would be ranked 3 and 4, rank them both 3.5. If three differences would be ranked 7, 8, and 9, rank them all 8.)
4. Assign the symbol + to positive differences and - to negative differences.
5. Calculate the Wilcoxon statistic W as the sum of the positive ranks. So,

$$W = \sum_{i:\, D_i > 0} R_i$$
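The five steps can be sketched as a short Python function. This is a minimal illustration only: the function name `signed_rank_sums` and the two small score lists are hypothetical and are not the exam data from the slides.

```python
def signed_rank_sums(first, second):
    """Return (W_plus, W_minus, n) for paired samples, with D_i = first_i - second_i."""
    diffs = [a - b for a, b in zip(first, second)]   # step 1: paired differences
    nonzero = [d for d in diffs if d != 0]           # step 2: discard zero differences
    ordered = sorted(nonzero, key=abs)               # order by absolute value
    w_plus = 0.0
    w_minus = 0.0
    pos = 0
    while pos < len(ordered):
        end = pos
        # step 3: find the block of tied absolute differences
        while end + 1 < len(ordered) and abs(ordered[end + 1]) == abs(ordered[pos]):
            end += 1
        avg_rank = (pos + 1 + end + 1) / 2           # average rank for the tied block
        for d in ordered[pos:end + 1]:
            if d > 0:                                # step 4: split ranks by sign
                w_plus += avg_rank
            else:
                w_minus += avg_rank
        pos = end + 1
    return w_plus, w_minus, len(nonzero)             # step 5: W is w_plus


# Hypothetical paired scores, used only to exercise the function.
x1 = [85, 70, 92, 60, 78, 88]
x2 = [83, 71, 92, 57, 79, 82]
w_plus, w_minus, n = signed_rank_sums(x1, x2)
print(w_plus, w_minus, n)  # 12.0 3.0 5
```

The positive and negative rank sums always total n(n+1)/2, here 5 × 6 / 2 = 15, which gives a simple consistency check.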

Wilcoxon Signed Rank Test Procedure (cont’d)

Example

Suppose we have a class with 22 students, each of whom has two exam grades. We want to test at the 5% level whether there is a difference in the median grade for the two exams.

[Table columns: exam1, exam2, diff (ex2-ex1), rank (+), rank (-)]

We calculate the difference between the exam grades: diff = exam2 - exam1.

Then we rank the absolute values of the differences from smallest to largest, omitting the two zero differences. The smallest non-zero |differences| are the two |-1|'s. Since they are tied for ranks 1 and 2, we rank them both 1.5. Since the differences were negative, we put the ranks in the negative column.

The next smallest non-zero |differences| are the two |2|'s. Since they are tied for ranks 3 and 4, we rank them both 3.5. Since the differences were positive, we put the ranks in the positive column.

The next smallest non-zero |differences| are the two |-3|'s and the |3|. Since they are tied for ranks 5, 6, and 7, we rank them all 6. Then we put the ranks in the appropriately signed columns.

We continue until we have ranked all the non-zero |differences|.

Then we total the signed ranks. We get 154 for the sum of the positive ranks and 56 for the sum of the negative ranks. The Wilcoxon test statistic is the sum of the positive ranks. So W = 154.

Since we had 22 students and 2 zero differences, the number of non-zero differences is n = 20.

Since the critical values for a 2-tailed Z test at the 5% level are 1.96 and -1.96, and the Z statistic falls between them, we cannot reject the null hypothesis $H_0$. We therefore have no evidence that the medians differ.
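The Z value is not preserved in the transcript. A hedged sketch of the usual large-sample approximation, in which W has mean n(n+1)/4 and variance n(n+1)(2n+1)/24 under the null hypothesis, reproduces the decision. The function name `signed_rank_z` is an assumption for this illustration, and no tie correction is applied.

```python
import math


def signed_rank_z(w, n):
    """Normal approximation for the Wilcoxon signed rank statistic W (no tie correction)."""
    mean_w = n * (n + 1) / 4                            # expected W under H0
    sd_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)    # standard deviation under H0
    return (w - mean_w) / sd_w


# W = 154 from the example, with n = 20 non-zero differences.
z = signed_rank_z(154, 20)
print(round(z, 3))  # 1.829
```

Under these assumptions Z is about 1.83, which lies between -1.96 and 1.96, so the null hypothesis is not rejected.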

For small sample sizes, you can use the table in the online material associated with section 12.8 of your textbook, which provides the lower and upper critical values for the Wilcoxon Signed Rank Test. This table is shown on the next slide.

Lower & Upper Critical Values, W, of Wilcoxon Signed Ranks Test

ONE-TAIL   α = 0.05   α = 0.025   α = 0.01   α = 0.005
TWO-TAIL   α = 0.10   α = 0.05    α = 0.02   α = 0.01
n          (Lower, Upper)
5          0,15       --,--       --,--      --,--
6          2,19       0,21        --,--      --,--
7          3,25       2,26        0,28       --,--
8          5,31       3,33        1,35       0,36
9          8,37       5,40        3,42       1,44
10         10,45      8,47        5,50       3,52
11         13,53      10,56       7,59       5,61
12         17,61      13,65       10,68      7,71
13         21,70      17,74       12,79      10,81
14         25,80      21,84       16,89      13,92
15         30,90      25,95       19,101     16,104
16         35,101     29,107      23,113     19,117
17         41,112     34,119      27,126     23,130
18         47,124     40,131      32,139     27,144
19         53,137     46,144      37,153     32,158
20         60,150     52,158      43,167     37,173

Recall that we have 20 non-zero differences and are performing a 5% 2-tailed test. Here we see that the lower critical value is 52 and the upper critical value is 158. Our statistic W, the sum of the positive ranks, is 154, which is inside the interval (52, 158) indicated in the table. So we cannot reject the null hypothesis, and we have no evidence that the medians differ.

The Kruskal-Wallis Test

This test is used to test whether several populations have the same median. It is a nonparametric substitute for a one-factor ANOVA F-test.

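The test statistic is

$$K = \frac{12}{n(n+1)} \sum_{j} \frac{R_j^2}{n_j} - 3(n+1)$$

where $n_j$ is the number of observations in the jth sample, $n$ is the total number of observations, and $R_j$ is the sum of ranks for the jth sample.

In the case of ties, a corrected statistic should be computed:

$$K_c = \frac{K}{1 - \dfrac{\sum_{j} (t_j^3 - t_j)}{n^3 - n}}$$

where $t_j$ is the number of tied observations in the jth group of ties.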

Kruskal-Wallis Test Example

Test at the 5% level whether average employee performance is the same at 3 firms, using the following standardized test scores for 20 employees.

[Table columns: score and rank for Firm 1, Firm 2, Firm 3]

$n_1 = 7$, $n_2 = 6$, $n_3 = 7$

We rank all the scores. Then we sum the ranks for each firm. Then we calculate the K statistic.

Firm 1: $n_1 = 7$, $R_1 = 106$
Firm 2: $n_2 = 6$, $R_2 = 47$
Firm 3: $n_3 = 7$, $R_3 = 57$
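The K statistic for these rank sums can be checked with a short Python sketch. It assumes the standard uncorrected Kruskal-Wallis form, K = 12 / (n(n+1)) × Σ R_j² / n_j − 3(n+1), and the function name `kruskal_wallis_k` is made up for this illustration.

```python
def kruskal_wallis_k(rank_sums, sizes):
    """Uncorrected Kruskal-Wallis statistic from per-sample rank sums and sample sizes."""
    n = sum(sizes)                                     # total number of observations
    total = sum(r * r / n_j for r, n_j in zip(rank_sums, sizes))
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)


# Rank sums and sample sizes for the three firms in the example.
k = kruskal_wallis_k([106, 47, 57], [7, 6, 7])
print(round(k, 3))  # 6.641
```

The result matches the value K = 6.641 used in the example's conclusion.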

From the χ² table, we see that the 5% critical value for a χ² with 2 dof is 5.991. Since our value for K was 6.641, which exceeds the critical value, we reject $H_0$ that the medians are the same and accept $H_1$ that the medians are different.

One-Sample Test of Runs

A test for randomness of the order of occurrence.

A run is a sequence of identical occurrences that are followed and preceded by different occurrences. Example: The list of X’s & O’s below consists of 7 runs. x x x o o o o x x o o o o x x x x o o x
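Counting runs amounts to counting how often the symbol changes, plus one for the first run. A minimal Python sketch, with the hypothetical helper name `count_runs`, applied to the list of x's and o's from the example:

```python
def count_runs(sequence):
    """Count maximal blocks of identical consecutive items."""
    runs = 0
    previous = None
    for item in sequence:
        if item != previous:   # a new run starts whenever the symbol changes
            runs += 1
            previous = item
    return runs


symbols = "x x x o o o o x x o o o o x x x x o o x".split()
print(count_runs(symbols))  # 7
```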

Suppose r is the number of runs, n 1 is the number of type 1 occurrences and n 2 is the number of type 2 occurrences.

If n 1 and n 2 are each at least 10, then r is approximately normal.
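The mean and standard deviation of r are not preserved in the transcript. The formulas usually given for the large-sample runs test, stated here as an assumption about what the slide showed, are:

```latex
\mu_r = \frac{2 n_1 n_2}{n_1 + n_2} + 1,
\qquad
\sigma_r = \sqrt{\frac{2 n_1 n_2 \,(2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 \,(n_1 + n_2 - 1)}},
\qquad
Z = \frac{r - \mu_r}{\sigma_r}
```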

Example: A stock exhibits the following price increase (+) and decrease (-) behavior over 25 business days. Test at the 1% level whether the pattern is random.

- - + - - - + + - + - + - - + + - + + - + -

r = 16, $n_1$ (+) = 13, $n_2$ (-) = 12

Since the critical values for a 2-tailed 1% test are 2.576 and -2.576, and the Z statistic falls between them, we do not reject $H_0$ that the pattern is random.
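The Z statistic for this example is not preserved in the transcript. A hedged sketch using the usual large-sample mean and variance of the number of runs gives a value consistent with the conclusion. The function name `runs_test_z` is an assumption for this illustration, and the inputs are the counts stated in the example (r = 16, n1 = 13, n2 = 12).

```python
import math


def runs_test_z(r, n1, n2):
    """Normal approximation for the one-sample runs test."""
    n = n1 + n2
    mean_r = 2 * n1 * n2 / n + 1                                   # expected number of runs
    var_r = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))   # variance of the number of runs
    return (r - mean_r) / math.sqrt(var_r)


z = runs_test_z(16, 13, 12)
print(round(z, 3))  # 1.031
```

Under these assumptions Z is about 1.03, well inside the interval from -2.576 to 2.576, so the hypothesis of randomness is not rejected.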