Nonparametric Methods and Chi-Square Tests Session 5.


Nonparametric Methods and Chi-Square Tests Session 5

Using Statistics. The Sign Test. The Runs Test - A Test for Randomness. The Mann-Whitney U Test. The Wilcoxon Signed-Rank Test. Nonparametric Methods and Chi-Square Tests (1)

The Kruskal-Wallis Test - A Nonparametric Alternative to One-Way ANOVA. The Friedman Test for a Randomized Block Design. The Spearman Rank Correlation Coefficient. A Chi-Square Test for Goodness of Fit. Contingency Table Analysis - A Chi-Square Test for Independence. A Chi-Square Test for Equality of Proportions. Using the Computer. Summary and Review of Terms. Nonparametric Methods and Chi-Square Tests (2)

Parametric Methods Inferences based on assumptions about the nature of the population distribution. Usually: population is normal. Types of tests t-test » Comparing two population means or proportions. » Testing value of population mean or proportion. ANOVA » Testing equality of several population means. 5-1 Using Statistics (Parametric Tests)

Nonparametric Tests Distribution-free methods making no assumptions about the population distribution. Types of tests Sign tests » Sign Test: Comparing paired observations. » McNemar Test: Comparing qualitative variables. » Cox and Stuart Test: Detecting trend. Runs tests » Runs Test: Detecting randomness. » Wald-Wolfowitz Test: Comparing two distributions. Nonparametric Tests (1)

Nonparametric Tests – Rank tests: Mann-Whitney U Test: Comparing two populations. Wilcoxon Signed-Rank Test: Paired comparisons. Comparing several populations: ANOVA with ranks (Kruskal-Wallis Test; Friedman Test: repeated measures). Spearman Rank Correlation Coefficient. – Chi-Square Tests: Goodness of Fit. Testing for independence: Contingency Table Analysis. Equality of Proportions. Nonparametric Tests (2)

Deal with enumerative (frequency counts) data. Do not deal with specific population parameters, such as the mean or standard deviation. Do not require assumptions about specific population distributions (in particular, the normality assumption). Nonparametric Tests (3)

Comparing paired observations. Paired observations: X and Y, with p = P(X > Y). Two-tailed test: H0: p = 0.50 vs. H1: p ≠ 0.50. Right-tailed test: H0: p ≤ 0.50 vs. H1: p > 0.50. Left-tailed test: H0: p ≥ 0.50 vs. H1: p < 0.50. Test statistic: T = number of + signs. For large samples, a normal approximation to the binomial is used. 5-2 Sign Test

Small Sample: Binomial Test. For a two-tailed test, find a critical point C1 whose cumulative binomial probability is as close as possible to α/2, and define C2 = n − C1. Reject the null hypothesis if T ≤ C1 or T ≥ C2. For a right-tailed test, reject H0 if T ≥ C, where C is chosen so that, for the binomial distribution with parameters n and p = 0.50, the probability of all values greater than or equal to C is as close as possible to the chosen level of significance α. For a left-tailed test, reject H0 if T ≤ C, where C is chosen so that the probability of all values less than or equal to C is as close as possible to α. Sign Test Decision Rule

Cumulative Binomial Probabilities (n = 15, p = 0.5): table of x and F(x) (values not reproduced), together with the CEO compensation data (Before, After, Sign columns). With n = 15 and T = 12 plus signs, the critical points are C1 = 3 and C2 = 15 − 3 = 12. H0 is rejected, since T ≥ C2. Example 5-1
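The small-sample sign test above can be reproduced with an exact binomial test. The following is a minimal Python sketch (assuming scipy is available) that uses only the summary numbers from Example 5-1; the underlying before/after data are not reproduced here.

from scipy.stats import binomtest

n_pairs = 15   # pairs with a nonzero difference
t_plus = 12    # T = number of "+" signs

# Two-tailed sign test: H0: p = 0.50 vs. H1: p != 0.50
result = binomtest(t_plus, n=n_pairs, p=0.5, alternative='two-sided')
print(f"T = {t_plus}, p-value = {result.pvalue:.4f}")

# The p-value is about 0.035, so H0 is rejected at the 0.05 level,
# consistent with the table-based rule T >= C2 = 12.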

A run is a sequence of like elements that are preceded and followed by different elements or no element at all. Case 1: S|E|S|E|S|E|S|E|S|E|S|E|S|E|S|E|S|E|S|E : R = 20, apparently nonrandom. Case 2: SSSSSSSSSS|EEEEEEEEEE : R = 2, apparently nonrandom. Case 3: S|EE|SS|EEE|S|E|SS|E|S|EE|SSS|E : R = 12, perhaps random. A two-tailed hypothesis test for randomness: H0: Observations are generated randomly. H1: Observations are not generated randomly. Test statistic: R = number of runs. Reject H0 at level α if R ≤ C1 or R ≥ C2, as given in Table 8, with total tail probability P(R ≤ C1) + P(R ≥ C2) = α. 5-3 The Runs Test - A Test for Randomness

Table 8: Number of Runs (r), row (n1, n2) = (10, 10) (cumulative probabilities not reproduced). Case 1: n1 = 10, n2 = 10, R = 20, p-value ≈ 0. Case 2: n1 = 10, n2 = 10, R = 2, p-value ≈ 0. Case 3: n1 = 10, n2 = 10, R = 12, p-value = 2P(R ≥ 12) = 2[1 − F(11)] = (2)(0.414) = 0.828, so H0 is not rejected. Runs Test: Examples

Large-Sample Runs Test: Using the Normal Approximation. For large samples, E(R) = 2n1n2/(n1 + n2) + 1 and V(R) = [2n1n2(2n1n2 − n1 − n2)] / [(n1 + n2)²(n1 + n2 − 1)], and the test statistic z = [R − E(R)] / √V(R) is compared with the standard normal distribution.

Example 5-2: n1 = 27, n2 = 26, R = 16. H0 should be rejected at any common level of significance. Large-Sample Runs Test: Example 5-2
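A minimal Python sketch of the large-sample runs test for Example 5-2, using the standard normal-approximation formulas for E(R) and V(R); only the summary counts n1 = 27, n2 = 26, R = 16 are taken from the slide.

import math
from scipy.stats import norm

n1, n2, R = 27, 26, 16

mean_R = 2 * n1 * n2 / (n1 + n2) + 1
var_R = (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)) / ((n1 + n2) ** 2 * (n1 + n2 - 1))
z = (R - mean_R) / math.sqrt(var_R)
p_value = 2 * norm.sf(abs(z))   # two-tailed p-value

print(f"E(R) = {mean_R:.2f}, z = {z:.2f}, p-value = {p_value:.4f}")

# z is roughly -3.2, so H0 (randomness) is rejected at any common significance level.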

The null and alternative hypotheses for the Wald-Wolfowitz test: H0: The two populations have the same distribution. H1: The two populations have different distributions. The test statistic: R = number of runs in the sequence of sample labels, when the data from both samples have been pooled and sorted. (The sales data for Salesperson A and Salesperson B are analyzed in Example 5-3.) Using the Runs Test to Compare Two Population Distributions (Means): the Wald-Wolfowitz Test

Table of critical numbers of runs (r) for (n1, n2) = (9, 10) (values not reproduced). The sales figures for the two salespersons are pooled and sorted, each sorted value is labeled A or B, and runs of like labels are counted: the sorted sequence forms only R = 4 runs (an opening run of B values, a single A value, a short run of B values, and a long closing run of A values). With n1 = 10, n2 = 9, and R = 4, the p-value P(R ≤ 4) is small, and H0 may be rejected. The Wald-Wolfowitz Test: Example 5-3

Rank tests – Mann-Whitney U Test: Comparing two populations. – Wilcoxon Signed-Rank Test: Paired comparisons. – Comparing several populations: ANOVA with ranks. Kruskal-Wallis Test. Friedman Test: Repeated measures. Rank Tests

The null and alternative hypotheses: H0: The distributions of the two populations are identical. H1: The two population distributions are not identical. The Mann-Whitney U statistic: U = n1n2 + n1(n1 + 1)/2 − R1, where n1 is the sample size from population 1, n2 is the sample size from population 2, and R1 is the sum of the ranks of the observations from population 1 in the pooled, ordered data. 5-4 The Mann-Whitney U Test (Comparing Two Populations)

Cumulative Distribution Function of the Mann-Whitney U Statistic, n1 = 6, n2 = 6 (table of u and F(u) not reproduced). The twelve assembly times for models A and B are pooled and ranked, the model A ranks are summed to give R1, and U is computed; the p-value is the tabled cumulative probability P(U ≤ 5). The Mann-Whitney U Test: Example 5-4

Example 5-5: Large-Sample Mann-Whitney U Test. (Table of scores, programs, ranks, and rank sums not reproduced.) Since the test statistic is z = −3.32, the p-value is smaller than 0.001, and H0 is rejected.
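A minimal Python sketch of the Mann-Whitney U test (assuming scipy). The assembly times below are hypothetical placeholders with the same structure as Example 5-4 (two models, six observations each); they are not the slide's data, which are only partially recoverable.

from scipy.stats import mannwhitneyu

model_a = [35, 38, 40, 41, 42, 36]   # hypothetical sample from population 1
model_b = [27, 29, 30, 33, 39, 28]   # hypothetical sample from population 2

# method='exact' gives the small-sample test of Example 5-4;
# method='asymptotic' gives the large-sample normal approximation of Example 5-5.
u_stat, p_value = mannwhitneyu(model_a, model_b, alternative='two-sided', method='exact')
print(f"U = {u_stat}, p-value = {p_value:.4f}")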

The null and alternative hypotheses: H0: The median difference between populations 1 and 2 is zero. H1: The median difference between populations 1 and 2 is not zero. Find the difference for each pair, D = x1 − x2, and then rank the absolute values of the differences. The Wilcoxon T statistic is the smaller of the sum of the positive ranks and the sum of the negative ranks. For small samples, a left-tailed test is used, with the critical values in Appendix C, Table 10. The large-sample test statistic: z = [T − n(n + 1)/4] / √[n(n + 1)(2n + 1)/24]. 5-5 The Wilcoxon Signed-Ranks Test (Paired Ranks)

(Table not reproduced: columns Sold (1), Sold (2), D = x1 − x2, ABS(D), Rank of ABS(D), and the ranks split by D > 0 and D < 0.) The positive ranks sum to 86 and the negative ranks sum to 34, so T = 34 with n = 15, and H0 is not rejected. (Note the arithmetic error in the text for store 13.) Example 5-6
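A minimal Python sketch of the Wilcoxon signed-ranks test (assuming scipy). The paired values are hypothetical, chosen so that the positive ranks sum to 86 and the negative ranks to 34, matching the T = 34, n = 15 reported in Example 5-6; they are not the actual store data.

from scipy.stats import wilcoxon

before = [56, 48, 100, 85, 22, 44, 35, 28, 52, 77, 89, 12, 65, 90, 70]
after  = [40, 70, 60, 70, 8, 40, 45, 7, 60, 70, 90, 10, 85, 61, 40]

# wilcoxon ranks |before - after| and reports T, the smaller of the
# positive-rank and negative-rank sums, with an exact small-sample p-value.
t_stat, p_value = wilcoxon(before, after)
print(f"T = {t_stat}, p-value = {p_value:.4f}")

# Here T = 34 and the p-value is above 0.05, so H0 is not rejected at the
# 0.05 level, consistent with the conclusion in the example.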

The spreadsheet implements the Wilcoxon signed-rank test. The RANK function was used in column F (Rank Diff); however, the resulting values required adjustment because of ties in the ranking for Scores 10 and 16, which Excel handles differently than the Wilcoxon signed-rank procedure does. Example 5-6 using Excel

(Table not reproduced: columns Hourly Messages, Md0, D = x1 − x2, ABS(D), Rank of ABS(D), the ranks split by D > 0 and D < 0, and their sums.) Example 5-7

The Kruskal-Wallis hypothesis test: H0: All k populations have the same distribution. H1: Not all k populations have the same distribution. The Kruskal-Wallis test statistic: H = [12 / (n(n + 1))] Σ (Rj² / nj) − 3(n + 1), where n is the total number of observations, nj is the size of sample j, and Rj is the sum of the ranks of sample j in the pooled data. If each nj > 5, then H is approximately distributed as a chi-square variable with k − 1 degrees of freedom. 5-6 The Kruskal-Wallis Test - A Nonparametric Alternative to One-Way ANOVA

(Table of software package, time, rank, group, and rank sums not reproduced.) The critical value is χ²(2, 0.005) = 10.60; the computed H statistic exceeds it, so H0 is rejected. Example 5-8: The Kruskal-Wallis Test
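A minimal Python sketch of the Kruskal-Wallis test (assuming scipy), with the same structure as Example 5-8 (k = 3 software packages, completion times). The times are hypothetical placeholders, not the slide's data.

from scipy.stats import kruskal

software_1 = [45, 38, 56, 60, 47, 65]   # hypothetical completion times
software_2 = [30, 40, 28, 44, 25, 42]
software_3 = [22, 19, 15, 31, 27, 17]

h_stat, p_value = kruskal(software_1, software_2, software_3)
print(f"H = {h_stat:.2f}, p-value = {p_value:.4f}")

# With each n_j > 5, H is referred to a chi-square distribution with
# k - 1 = 2 degrees of freedom; a value above 10.60 rejects H0 at the 0.005 level.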

If the null hypothesis in the Kruskal-Wallis test is rejected, then we may wish, in addition, to compare each pair of populations to determine which are different and which are the same. Further Analysis (Pairwise Comparisons of Average Ranks)

Pairwise Comparisons: Example 5-8

A manager wants to explore upgrading the fleet of trucks. There are three new models to choose from. The manager is allowed to drive the trucks for a few days, and randomly picks 15 drivers to do so. Five drivers will test each truck. Conduct a Kruskal-Wallis rank test for differences in the three population medians for the MPGs. Pairwise Comparisons: Example 5-9 Using Excel

The Spearman Rank Correlation Coefficient is the simple correlation coefficient calculated from variables converted to ranks from their original values. 5-7 The Spearman Rank Correlation Coefficient

Table 11, α = 0.005, n = 10: critical value 0.794. rs = 1 − 6Σdi² / (n(n² − 1)) = 1 − (6)(4) / ((10)(10² − 1)) = 1 − 24/990 = 0.9758 > 0.794, so H0 is rejected. (Table not reproduced: columns MMI, S&P100, R-MMI, R-S&P, Diff, and Diffsq; the squared rank differences sum to 4.) Spearman Rank Correlation Coefficient: Example 5-10
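A minimal Python sketch of the Spearman rank correlation (assuming scipy). The ten index values are hypothetical, constructed so that the squared rank differences sum to 4 and therefore reproduce the r_s = 0.9758 shown in Example 5-10; they are not the actual MMI and S&P 100 data.

import numpy as np
from scipy.stats import spearmanr

mmi   = np.array([220, 218, 216, 217, 215, 213, 219, 236, 237, 235])  # hypothetical
sp100 = np.array([151, 150, 148, 149, 147, 146, 152, 165, 167, 166])  # hypothetical

r_s, p_value = spearmanr(mmi, sp100)
print(f"r_s = {r_s:.4f}, p-value = {p_value:.4f}")

# Hand check: r_s = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1)) = 1 - 24/990 = 0.9758.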

Spearman Rank Correlation Coefficient: Example 5-10 Using Excel

Steps in a chi-square analysis: Formulate null and alternative hypotheses. Compute the frequencies of occurrence that would be expected if the null hypothesis were true - the expected cell counts. Note the actual, observed cell counts. Use the differences between expected and actual cell counts to find the chi-square statistic: χ² = Σ (O − E)² / E. Compare the chi-square statistic with critical values from the chi-square distribution (with k − 1 degrees of freedom) to test the null hypothesis. 5-8 A Chi-Square Test for Goodness of Fit

The null and alternative hypotheses: H0: The probabilities of occurrence of events E1, E2, ..., Ek are given by p1, p2, ..., pk. H1: The probabilities of the k events are not as specified in the null hypothesis. Example 14-11: Assuming equal probabilities, p1 = p2 = p3 = p4 = 0.25 and n = 80. (Table not reproduced: observed counts, expected counts np = 80 × 0.25 = 20, and O − E for the Tan, Brown, Maroon, and Black preference categories.) Goodness-of-Fit Test for the Multinomial Distribution
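A minimal Python sketch of the multinomial goodness-of-fit test described above (four color categories, equal probabilities 0.25, n = 80), assuming scipy. The observed counts are hypothetical placeholders, since the slide's counts are not reproduced.

from scipy.stats import chisquare

observed = [12, 40, 8, 20]        # hypothetical counts for Tan, Brown, Maroon, Black
expected = [80 * 0.25] * 4        # np = 20 in each category under H0

chi2_stat, p_value = chisquare(f_obs=observed, f_exp=expected)
print(f"chi-square = {chi2_stat:.2f}, df = {len(observed) - 1}, p-value = {p_value:.4f}")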

(Figure: partitioning the standard normal density f(z) into six regions.) 1. Use the table of the standard normal distribution to determine a partition of the standard normal distribution into ranges with approximately equal probabilities: P(z < −1) = 0.1587, P(−1 < z < −0.44) = 0.1713, P(−0.44 < z < 0) = 0.1700, P(0 < z < 0.44) = 0.1700, P(0.44 < z < 1) = 0.1713, P(z > 1) = 0.1587. 2. Given the z boundaries, the x boundaries can be determined from the inverse standard normal transformation: x = μ + σz. 3. Compare the resulting chi-square statistic with the critical value of the χ² distribution with k − 3 degrees of freedom. Goodness-of-Fit for the Normal Distribution: Example 5-11

(Table not reproduced: columns i, Oi, Ei, Oi − Ei, (Oi − Ei)², and (Oi − Ei)²/Ei, summed to give the χ² statistic.) Since χ²(0.10, k − 3) is greater than the computed statistic, H0 is not rejected at the 0.10 level. Example 5-12: Solution
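A minimal Python sketch of the partition-based goodness-of-fit test for normality outlined above, assuming scipy and numpy. The sample is simulated; the z boundaries (−1, −0.44, 0, 0.44, 1) are taken from the slide, and mu and sigma are estimated from the data, which is why the test uses k − 3 degrees of freedom.

import numpy as np
from scipy.stats import norm, chi2

rng = np.random.default_rng(0)
x = rng.normal(loc=100, scale=15, size=100)     # hypothetical sample

mu, sigma = x.mean(), x.std(ddof=1)             # parameters estimated from the data
z_cuts = np.array([-1.0, -0.44, 0.0, 0.44, 1.0])

# Observed counts in the six cells defined by the x = mu + sigma * z boundaries
cell_index = np.searchsorted(mu + sigma * z_cuts, x)
observed = np.bincount(cell_index, minlength=6)

# Expected counts from the standard normal cell probabilities
probs = np.diff(norm.cdf(z_cuts), prepend=0.0, append=1.0)
expected = probs * len(x)

chi2_stat = ((observed - expected) ** 2 / expected).sum()
df = len(observed) - 3                          # k - 1 - 2 estimated parameters
p_value = chi2.sf(chi2_stat, df)
print(f"chi-square = {chi2_stat:.2f}, df = {df}, p-value = {p_value:.4f}")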

In light of all the recent mergers, companies have looked to employees for help in determining the new company name. When two prominent banks joined forces, 250 employees were chosen at random to evaluate (like or dislike) two names. 140 workers commented on Name A, of whom 85 liked the name. Out of the 110 workers who commented on Name B, 54 liked the name. Conduct a chi-square test. Example 5-13

Example 5-13: Solution
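A minimal Python sketch of the chi-square test for Example 5-13, assuming scipy. The counts (85 of 140 liking Name A, 54 of 110 liking Name B) come directly from the example; correction=False gives the uncorrected (Pearson) chi-square statistic.

import numpy as np
from scipy.stats import chi2_contingency

#                 liked   disliked
table = np.array([[85, 140 - 85],     # Name A (140 responses)
                  [54, 110 - 54]])    # Name B (110 responses)

chi2_stat, p_value, df, expected = chi2_contingency(table, correction=False)
print(f"chi-square = {chi2_stat:.3f}, df = {df}, p-value = {p_value:.4f}")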

5-9 Contingency Table Analysis: A Chi-Square Test for Independence

Null and alternative hypotheses: H0: The two classification variables are independent of each other. H1: The two classification variables are not independent. Chi-square test statistic for independence: χ² = Σi Σj (Oij − Eij)² / Eij. Degrees of freedom: df = (r − 1)(c − 1). A and B are independent if P(A ∩ B) = P(A)P(B); if the first and second classification categories are independent, the expected cell count is Eij = (Ri)(Cj)/n. Contingency Table Analysis: A Chi-Square Test for Independence

(Table not reproduced: columns i, j, O, E, O − E, (O − E)², and (O − E)²/E, summed to give the χ² statistic.) The critical value is χ²(0.01, (2 − 1)(2 − 1)) = 6.63; the computed statistic exceeds it, so H0 is rejected at the 0.01 level and it is concluded that the two variables are not independent. Contingency Table Analysis: Example 5-14
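A minimal Python sketch of a contingency-table test for independence, assuming scipy. The 2x2 counts are hypothetical placeholders; only the structure of Example 5-14 (df = (2 − 1)(2 − 1) = 1, critical value 6.63 at alpha = 0.01) comes from the slide.

import numpy as np
from scipy.stats import chi2_contingency

observed = np.array([[28, 12],
                     [22, 38]])       # hypothetical 2x2 classification counts

chi2_stat, p_value, df, expected = chi2_contingency(observed, correction=False)
print("Expected counts (E_ij = R_i * C_j / n):")
print(expected)
print(f"chi-square = {chi2_stat:.2f}, df = {df}, p-value = {p_value:.4f}")

# H0 (independence) is rejected at the 0.01 level when the statistic exceeds 6.63.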

MTB > Unstack (C3) (C4) (C5) (C6) (C7); SUBC> Subscripts C2. MTB > ChiSquare C4-C7. Chi-Square Test: expected counts are printed below observed counts (the observed and expected counts for columns C4-C7 and the ChiSq value are not reproduced); df = 6, p = 0.002. Given the p-value of 0.002, the null hypothesis of independence can be rejected at any common level of significance. Using the Computer: Example 5-15

MTB > ChiSquare C1 C2 C3. Chi-Square Test: expected counts are printed below observed counts (the observed and expected counts for columns C1-C3, the ChiSq value, and the p-value are not reproduced); df = 2. Chi-Square Test for Equality of Proportions

MTB > median c5 k1. Column Median: median of C5 (value not reproduced). MTB > let c6=c5-k1. MTB > let c7=sign(c6). MTB > Table C4 C7; SUBC> Counts; SUBC> ChiSquare 2. Tabulated Statistics, ROWS: C4, COLUMNS: C7 (cell counts not reproduced). CHI-SQUARE = (value not reproduced) WITH D.F. = 2. Chi-Square Test for the Median: Example 5-16
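A minimal Python sketch mirroring the Minitab session above for the chi-square test of a common median: classify each observation as above or not above the overall median, then test whether this classification is independent of the group label. Groups and values are hypothetical placeholders; with three groups the table has (3 − 1)(2 − 1) = 2 degrees of freedom, as in the output above.

import numpy as np
from scipy.stats import chi2_contingency

values = np.array([12, 15, 9, 22, 30, 8, 17, 25, 11, 28, 19, 7, 24, 16, 21])
groups = np.array(["A", "A", "A", "A", "A", "B", "B", "B", "B", "B",
                   "C", "C", "C", "C", "C"])

above = values > np.median(values)    # sign of (value - overall median)
table = np.array([[np.sum((groups == g) & above),
                   np.sum((groups == g) & ~above)] for g in ["A", "B", "C"]])

chi2_stat, p_value, df, _ = chi2_contingency(table)
print(f"chi-square = {chi2_stat:.2f}, df = {df}, p-value = {p_value:.4f}")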

Figure 5-12: MTB > RUNS ABOVE AND BELOW 30, C1. Runs Test for C1, K = 30: the observed number of runs = 18; the expected number of runs, the counts of observations above and below K, and the significance level are not reproduced. Figure 5-13: MTB > Mann-Whitney (alternative=1) C1 C2. Mann-Whitney Confidence Interval and Test: C1 N = 10, C2 N = 10 (medians, the point estimate for ETA1-ETA2, W, and the significance levels are not reproduced); the confidence interval for ETA1-ETA2 is (7.00, 18.00); the test of ETA1 = ETA2 vs. ETA1 > ETA2 is reported as significant, with and without adjustment for ties. 5-11 Using the Computer: Mann-Whitney Test

MTB > let c3=c1-c2. MTB > wtest c3. Wilcoxon Signed Rank Test: test of median = 0 versus median not equal to 0 (the default); the output columns N, N for Wilcoxon test, Wilcoxon statistic, p-value, and estimated median for C3, and the data columns C1, C2, C3, are not reproduced. Table 5-14 Using the Computer: Wilcoxon Signed-Rank Test

MTB > Kruskal-Wallis C1 C2. Kruskal-Wallis Test: the output columns LEVEL, NOBS, MEDIAN, AVE. RANK, and Z VALUE, plus the OVERALL row, are not reproduced. H = (value not reproduced), d.f. = 2, p = (value not reproduced), reported both unadjusted and adjusted for ties. Table 5-15 Using the Computer: Kruskal-Wallis Test

MTB > print c1 c2 (the rows of C1 and C2 are not reproduced). MTB > rank c1 c3. MTB > rank c2 c4. MTB > correlation c3 c4. Correlations (Pearson): correlation of C3 and C4 (value not reproduced); since C3 and C4 hold the ranks of C1 and C2, this Pearson correlation of the ranks is the Spearman rank correlation. Table 5-16 Using the Computer: Rank Correlation

(Data columns c1, c2, c3 not reproduced.) MTB > Table 'c1' 'c2'; SUBC> Frequencies 'c3'; SUBC> ChiSquare 2. Tabulated Statistics, ROWS: c1, COLUMNS: c2 (cell counts and totals not reproduced). CHI-SQUARE = (value not reproduced) WITH D.F. = 4. Table 5-17 Using the Computer: Chi-Square Test