# Chapter 16 Introduction to Nonparametric Statistics

Chapter 16 Introduction to Nonparametric Statistics
Business Statistics: A Decision-Making Approach, 6th Edition. © 2005 Prentice-Hall, Inc.

Chapter Goals

After completing this chapter, you should be able to:

- Recognize when and how to use the Wilcoxon signed rank test for a population median
- Recognize the situations for which the Wilcoxon signed rank test applies and be able to use it for decision-making
- Know when and how to perform a Mann-Whitney U-test
- Perform nonparametric analysis of variance using the Kruskal-Wallis one-way ANOVA

Nonparametric Statistics
Nonparametric methods make fewer restrictive assumptions about data levels and underlying probability distributions:

- Population distributions may be skewed
- The level of data measurement may only be ordinal or nominal

Wilcoxon Signed Rank Test
- Used to test a hypothesis about one population median
- The median is the midpoint of the distribution: 50% below, 50% above
- A hypothesized median is rejected if sample results vary too much from expectations
- No highly restrictive assumptions about the shape of the population distribution are needed

Performing the Wilcoxon Signed Rank Test
The W Test Statistic

Calculate the test statistic W using these steps:

- Step 1: Collect sample data
- Step 2: Compute $d_i$ = difference between each value and the hypothesized median
- Step 3: Convert the $d_i$ values to absolute differences $\lvert d_i \rvert$

Performing the Wilcoxon Signed Rank Test
The W Test Statistic (continued)

- Step 4: Determine the rank of each absolute difference $\lvert d_i \rvert$
  - Eliminate zero $d_i$ values
  - The lowest $\lvert d_i \rvert$ value gets rank 1
  - For ties, assign each the average rank of the tied observations

Performing the Wilcoxon Signed Rank Test
The W Test Statistic (continued)

- Step 5: Create R+ and R- columns
  - For data values greater than the hypothesized median, put the rank in the R+ column
  - For data values less than the hypothesized median, put the rank in the R- column

Performing the Wilcoxon Signed Rank Test
The W Test Statistic (continued)

- Step 6: The test statistic W is the sum of the ranks in the R+ column

Test the hypothesis by comparing the calculated W to the critical value from the table in Appendix P. Note that n = the number of non-zero $d_i$ values.

Example

The median class size is claimed to be 40. Sample data for 8 classes are randomly obtained. Compare each value to the hypothesized median to find the difference:

| Class size $x_i$ | Difference $d_i = x_i - 40$ | $\lvert d_i \rvert$ |
|---|---|---|
| 23 | -17 | 17 |
| 45 | 5 | 5 |
| 34 | -6 | 6 |
| 78 | 38 | 38 |
| 34 | -6 | 6 |
| 66 | 26 | 26 |
| 61 | 21 | 21 |
| 95 | 55 | 55 |

Example (continued)

Rank the absolute differences:

| $\lvert d_i \rvert$ | Rank |
|---|---|
| 5 | 1 |
| 6 | 2.5 (tied) |
| 6 | 2.5 (tied) |
| 17 | 4 |
| 21 | 5 |
| 26 | 6 |
| 38 | 7 |
| 55 | 8 |

Example (continued)

Put ranks in R+ and R- columns and find sums:

| Class size $x_i$ | Difference $d_i = x_i - 40$ | $\lvert d_i \rvert$ | Rank | R+ | R- |
|---|---|---|---|---|---|
| 23 | -17 | 17 | 4 | | 4 |
| 45 | 5 | 5 | 1 | 1 | |
| 34 | -6 | 6 | 2.5 | | 2.5 |
| 78 | 38 | 38 | 7 | 7 | |
| 34 | -6 | 6 | 2.5 | | 2.5 |
| 66 | 26 | 26 | 6 | 6 | |
| 61 | 21 | 21 | 5 | 5 | |
| 95 | 55 | 55 | 8 | 8 | |
| | | | | Σ = 27 | Σ = 9 |

Three values are below the claimed median; the others are above.
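
The six steps can be sketched in Python as follows. This is a minimal illustration, not the textbook's procedure verbatim: the helper names `average_ranks` and `wilcoxon_signed_rank` are hypothetical, and the eight class sizes are assumed to be 23, 45, 34, 78, 34, 66, 61, 95 (with 34 appearing twice), which reproduces the rank sums 27 and 9 on the slide.

```python
def average_ranks(values):
    """Rank values from low to high; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def wilcoxon_signed_rank(sample, hypothesized_median):
    """Return (sum of R+, sum of R-, n) for the one-sample signed rank test."""
    diffs = [x - hypothesized_median for x in sample]
    diffs = [d for d in diffs if d != 0]           # eliminate zero differences
    ranks = average_ranks([abs(d) for d in diffs])  # rank the absolute differences
    r_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    r_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return r_plus, r_minus, len(diffs)


# Assumed sample (see the note above); hypothesized median = 40.
class_sizes = [23, 45, 34, 78, 34, 66, 61, 95]
r_plus, r_minus, n = wilcoxon_signed_rank(class_sizes, 40)
print(r_plus, r_minus, n)  # 27.0 9.0 8
```

As a check, the two rank sums always add up to n(n + 1)/2, here 8 · 9 / 2 = 36.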

Completing the Test

H0: Median = 40
HA: Median ≠ 40

Test at the α = .05 level. This is a two-tailed test and n = 8, so find $W_L$ and $W_U$ in Appendix P: $W_L = 3$ and $W_U = 33$. The calculated test statistic is W = ΣR+ = 27.

Completing the Test (continued)

H0: Median = 40
HA: Median ≠ 40

The rejection regions lie below $W_L = 3$ and above $W_U = 33$. With W = ΣR+ = 27, $W_L < W < W_U$, so do not reject H0: there is not sufficient evidence to conclude that the median class size is different from 40.

If the Sample Size is Large
The W test statistic approaches a normal distribution as n increases. For n > 20, W can be approximated by

$$z = \frac{W - \dfrac{n(n+1)}{4}}{\sqrt{\dfrac{n(n+1)(2n+1)}{24}}}$$

where W = sum of the R+ ranks and n = number of non-zero $d_i$ values.
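
A short sketch of the large-sample approximation, assuming the standard form with mean n(n + 1)/4 and standard deviation sqrt(n(n + 1)(2n + 1)/24). The function name `wilcoxon_z` and the input values W = 150, n = 25 are hypothetical, chosen only to show the arithmetic.

```python
import math


def wilcoxon_z(w, n):
    """Normal approximation for the Wilcoxon signed rank statistic W."""
    mean = n * (n + 1) / 4
    std = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (w - mean) / std


# Hypothetical inputs: W = 150 from n = 25 non-zero differences.
z = wilcoxon_z(w=150, n=25)
print(round(z, 2))  # about -0.34
```

The resulting z value is then compared with a standard normal critical value, such as ±1.96 for a two-tailed test at α = .05.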

Nonparametric Tests for Two Population Centers
- Mann-Whitney U-test
  - Small samples
  - Large samples
- Wilcoxon Matched-Pairs Signed Rank Test
  - Small samples
  - Large samples

Mann-Whitney U-Test

Used to compare two samples from two populations.

Assumptions:

- The two samples are independent and random
- The value measured is a continuous variable
- The measurement scale used is at least ordinal
- If they differ, the distributions of the two populations will differ only with respect to the central location

Mann-Whitney U-Test (continued)

Consider two samples:

- Combine them into a single list, but keep track of which sample each value came from
- Rank the values in the combined list from low to high
  - For ties, assign each the average rank of the tied values
- Separate back into two samples, each value keeping its assigned ranking
- Sum the rankings for each sample

Mann-Whitney U-Test (continued)

If the sum of rankings from one sample differs enough from the sum of rankings from the other sample, we conclude there is a difference in the population medians.

Mann-Whitney U-Test (continued)

Mann-Whitney U-Statistics:

$$U_1 = n_1 n_2 + \frac{n_1(n_1+1)}{2} - R_1$$

$$U_2 = n_1 n_2 + \frac{n_2(n_2+1)}{2} - R_2$$

where:

- $n_1$ and $n_2$ are the two sample sizes
- $R_1$ and $R_2$ = sum of ranks for samples 1 and 2

Mann-Whitney U-Test (continued)

Claim: Median class size for Math is larger than the median class size for English.

- A random sample of 9 Math and 9 English classes is selected (samples do not have to be of equal size)
- Rank the combined values and then split them back into the separate samples

Mann-Whitney U-Test (continued)

Suppose the results are:

| Class size (Math, M) | Class size (English, E) |
|---|---|
| 23 | 30 |
| 45 | 47 |
| 34 | 18 |
| 78 | 34 |
| 34 | 44 |
| 66 | 61 |
| 62 | 54 |
| 95 | 28 |
| 81 | 40 |

Mann-Whitney U-Test (continued)

Ranking for combined samples:

| Size | Rank | Size | Rank |
|---|---|---|---|
| 18 | 1 | 45 | 10 |
| 23 | 2 | 47 | 11 |
| 28 | 3 | 54 | 12 |
| 30 | 4 | 61 | 13 |
| 34 | 6 (tied) | 62 | 14 |
| 34 | 6 (tied) | 66 | 15 |
| 34 | 6 (tied) | 78 | 16 |
| 40 | 8 | 81 | 17 |
| 44 | 9 | 95 | 18 |

Mann-Whitney U-Test (continued)

Split back into the original samples:

| Class size (Math, M) | Rank | Class size (English, E) | Rank |
|---|---|---|---|
| 23 | 2 | 30 | 4 |
| 45 | 10 | 47 | 11 |
| 34 | 6 | 18 | 1 |
| 78 | 16 | 34 | 6 |
| 34 | 6 | 44 | 9 |
| 66 | 15 | 61 | 13 |
| 62 | 14 | 54 | 12 |
| 95 | 18 | 28 | 3 |
| 81 | 17 | 40 | 8 |
| | Σ = 104 | | Σ = 67 |

Mann-Whitney U-Test (continued)

Claim: Median class size for Math is larger than the median class size for English.

H0: Median$_M$ ≤ Median$_E$
HA: Median$_M$ > Median$_E$

Math: $U_1 = (9)(9) + \dfrac{(9)(10)}{2} - 104 = 22$

English: $U_2 = (9)(9) + \dfrac{(9)(10)}{2} - 67 = 59$

Note: $U_1 + U_2 = n_1 n_2$
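
The ranking and U calculations can be reproduced with a short Python sketch. The function names are hypothetical, and the data are an assumption: nine Math and nine English class sizes, with the value 34 appearing twice in Math and once in English, which matches the rank sums 104 and 67 shown on the slides.

```python
def average_ranks(values):
    """Rank values from low to high; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(sample1, sample2):
    """Return (R1, R2, U1, U2) for two independent samples."""
    n1, n2 = len(sample1), len(sample2)
    ranks = average_ranks(list(sample1) + list(sample2))  # rank the combined list
    r1 = sum(ranks[:n1])   # rank sum for sample 1
    r2 = sum(ranks[n1:])   # rank sum for sample 2
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
    return r1, r2, u1, u2


# Assumed data (see the note above).
math_sizes = [23, 45, 34, 78, 34, 66, 62, 95, 81]
english_sizes = [30, 47, 18, 34, 44, 61, 54, 28, 40]
r1, r2, u1, u2 = mann_whitney_u(math_sizes, english_sizes)
print(r1, r2, u1, u2)  # 104.0 67.0 22.0 59.0
```

The identity U1 + U2 = n1 · n2 (here 22 + 59 = 81) is a quick way to catch ranking mistakes.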

Mann-Whitney U-Test (continued)

- The Mann-Whitney U tables in Appendices L and M give the lower tail of the U-distribution
- For one-tailed tests like this one, check the alternative hypothesis to see if $U_1$ or $U_2$ should be used as the test statistic
- Since the alternative hypothesis indicates that population 1 (Math) has a higher median, use $U_1$ as the test statistic

Mann-Whitney U-Test (continued)

Use $U_1$ as the test statistic: U = 22. Compare U = 22 to the critical value $U_\alpha$ from the appropriate table:

- For sample sizes less than 9, use Appendix L
- For sample sizes from 9 to 20, use Appendix M

If $U < U_\alpha$, reject H0.

Mann-Whitney U-Test (continued)

Use $U_1$ as the test statistic: U = 22. $U_\alpha$ from Appendix M for α = .05, $n_1 = 9$ and $n_2 = 9$ is $U_\alpha = 7$. The rejection region lies below $U_\alpha = 7$.

Since $U = 22 \ge U_\alpha = 7$, do not reject H0.

Mann-Whitney U-Test for Large Samples
- The table in Appendix M includes $U_\alpha$ values only for sample sizes between 9 and 20
- The U statistic approaches a normal distribution as sample sizes increase
- If samples are larger than 20, a normal approximation can be used

Mann-Whitney U-Test for Large Samples
(continued) The mean and standard deviation for the Mann-Whitney U test statistic:

$$\mu = \frac{n_1 n_2}{2} \qquad \sigma = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}$$

where $n_1$ and $n_2$ are the sample sizes from populations 1 and 2.

Mann-Whitney U-Test for Large Samples
(continued) Normal approximation for the Mann-Whitney U test statistic:

$$z = \frac{U - \dfrac{n_1 n_2}{2}}{\sqrt{\dfrac{n_1 n_2 (n_1 + n_2 + 1)}{12}}}$$

Large Sample Example

We wish to test

H0: Median$_1$ ≥ Median$_2$
HA: Median$_1$ < Median$_2$

Suppose two samples are obtained: $n_1 = 40$, $n_2 = 50$. When rankings are completed, the sum of ranks for sample 1 is $R_1 = 1475$ and the sum of ranks for sample 2 is $R_2 = 2620$.

Large Sample Example (continued)

Compute the U statistics:

$$U_1 = (40)(50) + \frac{(40)(41)}{2} - 1475 = 1345$$

$$U_2 = (40)(50) + \frac{(50)(51)}{2} - 2620 = 655$$

Since the alternative hypothesis indicates that population 2 has a higher median, use $U_2$ as the test statistic. The U statistic is found to be U = 655.

Large Sample Example (continued)

$$z = \frac{655 - \dfrac{(40)(50)}{2}}{\sqrt{\dfrac{(40)(50)(40 + 50 + 1)}{12}}} = \frac{-345}{123.15} = -2.80$$

At α = .05, the rejection region for this lower-tail test lies below z = -1.645. Since z = -2.80 < -1.645, we reject H0.
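
The arithmetic in this example can be checked with a few lines of Python. The function name `mann_whitney_z` is hypothetical; the inputs are the values given on the slides (n1 = 40, n2 = 50, R2 = 2620), and the formulas are assumed to be the usual large-sample mean n1·n2/2 and standard deviation sqrt(n1·n2·(n1 + n2 + 1)/12).

```python
import math


def mann_whitney_z(u, n1, n2):
    """Normal approximation for the Mann-Whitney U statistic."""
    mean = n1 * n2 / 2
    std = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (u - mean) / std


n1, n2, r2 = 40, 50, 2620
u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2   # U for sample 2
z = mann_whitney_z(u2, n1, n2)
print(u2, round(z, 2))                  # 655.0 -2.8
print(z < -1.645)                       # True: reject H0 at alpha = .05
```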

Wilcoxon Matched-Pairs Signed Rank Test
- The Mann-Whitney U-Test is used when samples from two populations are independent
- If samples are paired, they are not independent
- Use the Wilcoxon Matched-Pairs Signed Rank Test with paired samples

The Wilcoxon T Test Statistic
Performing the Small-Sample Wilcoxon Matched Pairs Test (for n ≤ 25)

Calculate the test statistic T using these steps:

- Step 1: Collect sample data
- Step 2: Compute $d_i$ = difference between the sample 1 value and its paired sample 2 value
- Step 3: Rank the absolute values of the differences, and give each rank the same sign as the sign of the difference value

The Wilcoxon T Test Statistic
(continued) Performing the Small-Sample Wilcoxon Matched Pairs Test (for n ≤ 25)

- Step 4: The test statistic T is the sum of the absolute values of the ranks for the group with the smaller expected sum
  - Look at the alternative hypothesis to determine the group with the smaller expected sum
  - For two-tailed tests, just choose the smaller sum

Small Sample Example

Paired samples, n = 9:

| Value (before) | Value (after) |
|---|---|
| 36 | 30 |
| 45 | 47 |
| 34 | 18 |
| 58 | 54 |
| 30 | 38 |
| 46 | 31 |
| 42 | 24 |
| 55 | 62 |
| 41 | 40 |

Claim: Median value is smaller after than before.

Small Sample Example (continued)

Paired samples, n = 9:

| Value (before) | Value (after) | Difference d | Rank of d | Ranks with smaller expected sum |
|---|---|---|---|---|
| 36 | 30 | 6 | 4 | |
| 45 | 47 | -2 | -2 | 2 |
| 34 | 18 | 16 | 8 | |
| 58 | 54 | 4 | 3 | |
| 30 | 38 | -8 | -6 | 6 |
| 46 | 31 | 15 | 7 | |
| 42 | 24 | 18 | 9 | |
| 55 | 62 | -7 | -5 | 5 |
| 41 | 40 | 1 | 1 | |
| | | | | Σ = T = 13 |
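
The T statistic for this example can be sketched in Python. The helper names are hypothetical, and the nine before/after pairs are an assumption recovered from the differences and signed ranks on the slide (first pair taken as 36 and 30); with those values the negative ranks sum to 13 in absolute value.

```python
def average_ranks(values):
    """Rank values from low to high; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def signed_rank_sums(sample1, sample2):
    """Return (sum of positive ranks, sum of |negative ranks|) for paired data."""
    diffs = [a - b for a, b in zip(sample1, sample2) if a != b]
    ranks = average_ranks([abs(d) for d in diffs])  # rank the absolute differences
    positive = sum(r for d, r in zip(diffs, ranks) if d > 0)
    negative = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return positive, negative


# Assumed pairs (see the note above); d = before - after.
before = [36, 45, 34, 58, 30, 46, 42, 55, 41]
after = [30, 47, 18, 54, 38, 31, 24, 62, 40]
positive, negative = signed_rank_sums(before, after)

# The claim "after is smaller than before" makes the negative ranks
# the group with the smaller expected sum, so T is their sum.
t_stat = negative
print(positive, t_stat)  # 32.0 13.0
```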

Small Sample Example (continued)

The calculated T value is T = 13. Complete the test by comparing the calculated T value to the critical T-value from Appendix N. For n = 9 and α = .025 for a one-tailed test, $T_\alpha = 6$. The rejection region lies below $T_\alpha = 6$.

Since $T = 13 \ge T_\alpha = 6$, do not reject H0.

Wilcoxon Matched Pairs Test for Large Samples
- The table in Appendix N includes $T_\alpha$ values only for sample sizes from 6 to 25
- The T statistic approaches a normal distribution as sample size increases
- If the number of paired values is larger than 25, a normal approximation can be used

Wilcoxon Matched Pairs Test for Large Samples
(continued) The mean and standard deviation for Wilcoxon T:

$$\mu = \frac{n(n+1)}{4} \qquad \sigma = \sqrt{\frac{n(n+1)(2n+1)}{24}}$$

where n is the number of paired values.

Wilcoxon Matched Pairs Test for Large Samples
(continued) Normal approximation for the Wilcoxon T test statistic:

$$z = \frac{T - \dfrac{n(n+1)}{4}}{\sqrt{\dfrac{n(n+1)(2n+1)}{24}}}$$

Kruskal-Wallis One-Way ANOVA
Tests the equality of more than two population medians.

Assumptions:

- Variables have a continuous distribution
- The data are at least ordinal
- Samples are independent
- Samples come from populations whose only possible difference is that at least one may have a different central location than the others

Kruskal-Wallis Test Procedure
- Obtain relative rankings for each value
  - In the event of a tie, each of the tied values gets the average rank
- Sum the rankings for data from each of the k groups
- Compute the H test statistic

Kruskal-Wallis Test Procedure
(continued) The Kruskal-Wallis H test statistic (with k – 1 degrees of freedom):

$$H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N+1)$$

where:

- N = sum of sample sizes in all samples
- k = number of samples
- $R_i$ = sum of ranks in the ith sample
- $n_i$ = size of the ith sample

Kruskal-Wallis Test Procedure
(continued) Complete the test by comparing the calculated H value to a critical $\chi^2$ value from the chi-square distribution with k – 1 degrees of freedom. (The chi-square distribution is Appendix G.)

Decision rule:

- Reject H0 if the test statistic $H > \chi^2_\alpha$
- Otherwise do not reject H0

Kruskal-Wallis Example
Do different departments have different class sizes?

| Class size (Math, M) | Class size (English, E) | Class size (History, H) |
|---|---|---|
| 23 | 55 | 30 |
| 41 | 60 | 40 |
| 54 | 72 | 18 |
| 78 | 45 | 34 |
| 66 | 70 | 44 |

Kruskal-Wallis Example
Do different departments have different class sizes?

| Class size (Math, M) | Ranking | Class size (English, E) | Ranking | Class size (History, H) | Ranking |
|---|---|---|---|---|---|
| 23 | 2 | 55 | 10 | 30 | 3 |
| 41 | 6 | 60 | 11 | 40 | 5 |
| 54 | 9 | 72 | 14 | 18 | 1 |
| 78 | 15 | 45 | 8 | 34 | 4 |
| 66 | 12 | 70 | 13 | 44 | 7 |
| | Σ = 44 | | Σ = 56 | | Σ = 20 |

Kruskal-Wallis Example
(continued) The H statistic is

$$H = \frac{12}{15(15+1)} \left( \frac{44^2}{5} + \frac{56^2}{5} + \frac{20^2}{5} \right) - 3(15+1) = 6.72$$
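
A minimal Python sketch that recomputes H from the raw class sizes. The function names are hypothetical; the data are the three samples from the ranking table (Math 23, 41, 54, 78, 66; English 55, 60, 72, 45, 70; History 30, 40, 18, 34, 44), and the formula is assumed to be the usual H = 12/(N(N + 1)) · Σ Ri²/ni − 3(N + 1).

```python
def average_ranks(values):
    """Rank values from low to high; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (without the correction for ties)."""
    combined = [x for group in groups for x in group]
    ranks = average_ranks(combined)
    n_total = len(combined)
    weighted = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])  # R_i for this group
        weighted += rank_sum ** 2 / len(group)
        start += len(group)
    return 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)


math_sizes = [23, 41, 54, 78, 66]
english_sizes = [55, 60, 72, 45, 70]
history_sizes = [30, 40, 18, 34, 44]
h = kruskal_wallis_h([math_sizes, english_sizes, history_sizes])
print(round(h, 2))  # 6.72
```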

Kruskal-Wallis Example
(continued) Compare H = 6.72 to the critical value from the chi-square distribution for k – 1 = 3 – 1 = 2 degrees of freedom and α = .05: $\chi^2_{.05} = 5.991$.

Since $H = 6.72 > \chi^2_{.05} = 5.991$, reject H0. There is sufficient evidence to conclude that the population medians are not all equal.

Kruskal-Wallis Correction
- If tied rankings occur, give each observation the mean rank for which it is tied
- The H statistic is influenced by ties, and should be corrected

Correction for tied rankings:

$$1 - \frac{\sum_{i=1}^{g} (t_i^3 - t_i)}{N^3 - N}$$

where:

- g = number of different groups of ties
- $t_i$ = number of tied observations in the ith tied group of scores
- N = total number of observations

H Statistic Corrected for Tied Rankings
Corrected H statistic:

$$H_{\text{corrected}} = \frac{\dfrac{12}{N(N+1)} \displaystyle\sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N+1)}{1 - \dfrac{\sum_{i=1}^{g} (t_i^3 - t_i)}{N^3 - N}}$$
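
The tie correction can be sketched as two small functions, assuming the correction factor 1 − Σ(ti³ − ti)/(N³ − N) and a corrected statistic equal to H divided by that factor. The function names and the tie pattern below (one pair and one triple of tied values among N = 15 observations) are hypothetical and only illustrate the arithmetic.

```python
def tie_correction(tie_sizes, n_total):
    """Correction factor: 1 - sum(t^3 - t) / (N^3 - N)."""
    return 1 - sum(t ** 3 - t for t in tie_sizes) / (n_total ** 3 - n_total)


def corrected_h(h, tie_sizes, n_total):
    """Divide the uncorrected H by the tie correction factor."""
    return h / tie_correction(tie_sizes, n_total)


# Hypothetical case: H = 6.72, N = 15, tie groups of size 2 and 3.
factor = tie_correction([2, 3], 15)
print(factor)                         # 1 - 30/3360, just under 1
print(corrected_h(6.72, [2, 3], 15))  # slightly larger than 6.72
```

With no ties the factor is exactly 1 and H is unchanged; ties shrink the factor and therefore increase the corrected H.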

Chapter Summary

- Developed and applied the Wilcoxon signed rank W-test for a population median
  - Small samples
  - Large sample z approximation
- Developed and applied the Mann-Whitney U-test for two population medians
  - Small samples
  - Large sample z approximation
- Used the Wilcoxon Matched-Pairs T-test for paired samples
  - Small samples
  - Large sample z approximation
- Applied the Kruskal-Wallis H-test for multiple population medians