Elementary Statistics


Elementary Statistics, Chapter 11: Nonparametric Tests (Larson, Farber)

Section 11.1 The Sign Test

Nonparametric Tests
A nonparametric test is a hypothesis test that does not require any specific conditions about the shape of the populations or the value of any population parameters. Nonparametric tests are often called “distribution-free” tests.
To test the value of a mean, you can use a z-test (with a sample size over 30) or a t-test. A t-test, however, requires that the population be normally distributed. If the shape of the population cannot be determined, you can use a sign test and test the value of the population median instead.
The sign test is a nonparametric test that can be used to test a population median against a hypothesized value, k.
Hypotheses
Left-tailed test: H0: median ≥ k and Ha: median < k
Right-tailed test: H0: median ≤ k and Ha: median > k
Two-tailed test: H0: median = k and Ha: median ≠ k

Sign Test
To use the sign test, first compare each entry in the sample to the hypothesized median, k. If the entry is below the median, assign it a – sign. If the entry is above the median, assign it a + sign. If the entry is equal to the median, assign it a 0. Then compare the number of + signs with the number of – signs, ignoring the 0’s. If the two counts are approximately equal, the null hypothesis is not likely to be rejected. If they are not approximately equal, it is likely that the null hypothesis will be rejected.
This is essentially a binomial test in which the hypothesized proportion is 0.5. If n is no more than 25, binomial probabilities are used to compute the values in the critical value table. With n > 25, the normal approximation to the binomial is used.

Sign Test
Test statistic: When n ≤ 25, the test statistic is x, the smaller number of + or – signs. When n > 25, the test statistic is
$$z = \frac{(x + 0.5) - 0.5n}{\sqrt{n}/2}$$
where x is the smaller number of + or – signs and n is the total number of + and – signs. For n > 25, you are testing the binomial probability that p = 0.50.
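The two cases of the test statistic can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `sign_test_statistic` is hypothetical, and the n > 25 branch assumes the usual normal approximation with a continuity correction, z = ((x + 0.5) – 0.5n) / (√n / 2).

```python
import math

def sign_test_statistic(sample, k):
    """Return (n, statistic) for a sign test of the median against k.

    Hypothetical helper for illustration only.
    """
    plus = sum(1 for v in sample if v > k)    # entries above k get a + sign
    minus = sum(1 for v in sample if v < k)   # entries below k get a - sign
    n = plus + minus                          # entries equal to k are ignored
    x = min(plus, minus)                      # smaller number of + or - signs
    if n <= 25:
        return n, x
    # Assumed normal approximation to Binomial(n, 0.5), continuity-corrected
    z = ((x + 0.5) - 0.5 * n) / (math.sqrt(n) / 2)
    return n, z
```

For n ≤ 25 the returned value is compared with a table critical value; for n > 25 it is compared with standard normal critical values.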

Application
A meteorologist claims that the daily median temperature for the month of January in San Diego is 57º Fahrenheit. The temperatures (in degrees Fahrenheit) for 18 randomly selected January days are listed below. At α = 0.01, can you support the meteorologist’s claim?
58 62 55 55 53 52 52 59 55 55 60 56 57 61 58 63 63 55
1. Write the null and alternative hypothesis. H0: median = 57º and Ha: median ≠ 57º
2. State the level of significance. α = 0.01
3. Determine the sampling distribution. Binomial with p = 0.5. Critical values based on the binomial probabilities are given in the table. When n > 25, use the normal approximation and normal distribution critical values.

Temperature   Sign      Temperature   Sign
58            +         55            –
62            +         60            +
55            –         56            –
55            –         57            0
53            –         61            +
52            –         58            +
52            –         63            +
59            +         63            +
55            –         55            –
Ignore the 0’s when determining n. There are 8 + signs and 9 – signs, so n = 8 + 9 = 17. Since Ha contains the ≠ symbol, this is a two-tailed test.

4. Find the critical value. With n = 17, use Table 8. The critical value is 2.
5. Find the rejection region. Reject H0 if the test statistic is less than or equal to 2.
6. Find the test statistic. The test statistic is the smaller number of + or – signs, so the test statistic is 8.

7. Make your decision. The test statistic, 8, does not fall in the rejection region. Fail to reject the null hypothesis.
8. Interpret your decision. There is not enough evidence to reject the meteorologist’s claim that the median daily temperature for January in San Diego is 57º Fahrenheit.
The sign test can also be used with paired data (such as before and after measurements). Find the difference between corresponding values, record the sign of each difference, and then use the same procedure.
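The San Diego example can be checked with a short script. This is a sketch for illustration: the critical value 2 is taken from the example itself, and the exact two-tailed binomial p-value is an extra step not shown on the slides, added here only to illustrate the remark that the sign test is essentially a binomial test with p = 0.5.

```python
import math

temps = [58, 62, 55, 55, 53, 52, 52, 59, 55,
         55, 60, 56, 57, 61, 58, 63, 63, 55]
k = 57                                  # hypothesized median

plus = sum(t > k for t in temps)        # number of + signs
minus = sum(t < k for t in temps)       # number of - signs
n = plus + minus                        # the single 0 (a reading of 57) is ignored
statistic = min(plus, minus)

critical_value = 2                      # from the example (Table 8, n = 17)
reject = statistic <= critical_value

# Exact two-tailed p-value under Binomial(n, 0.5); not part of the slides
p_value = min(1.0, 2 * sum(math.comb(n, i) for i in range(statistic + 1)) / 2 ** n)

print(plus, minus, n, statistic, reject)   # prints: 8 9 17 8 False
```

With 8 + signs and 9 – signs the split is about as even as it can be for n = 17, which is why the test gives no reason to reject H0.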

Section 11.2 The Wilcoxon Test

Wilcoxon Signed-Rank Test
The Wilcoxon signed-rank test is a nonparametric test that can be used to determine whether two dependent samples were selected from populations with the same distribution. Use this test when the conditions for a t-test for paired differences cannot be met.
To find the test statistic, ws:
1. Find the difference for each pair: Sample 1 value – Sample 2 value.
2. Find the absolute value of each difference.
3. Rank order these absolute differences.
4. Affix a + or – sign to each of the rankings.
5. Find the sum of the positive ranks.
6. Find the sum of the negative ranks.
7. Select the smaller of the absolute values of the sums.

Application
The table shows the daily headache hours suffered by 12 patients before and after receiving a new drug for seven weeks. At α = 0.01, is there enough evidence to conclude that the new drug helped to reduce daily headache hours?
1. Write the null and alternative hypothesis. H0: The headache hours after using the new drug are at least as long as before using the drug. Ha: The new drug reduces headache hours. (Claim)
2. State the level of significance. α = 0.01

Patient   Before   After   Diff.   Abs.   Rank   Signed rank
1         2.1      2.2     –0.1    0.1    1.5    –1.5
2         3.9      2.8      1.1    1.1    5.0     5.0
3         3.8      2.5      1.3    1.3    6.0     6.0
4         2.5      2.6     –0.1    0.1    1.5    –1.5
5         2.4      1.9      0.5    0.5    3.0     3.0
6         3.6      1.8      1.8    1.8    8.0     8.0
7         3.4      2.0      1.4    1.4    7.0     7.0
8         2.4      1.6      0.8    0.8    4.0     4.0
If two values are tied, assign the average rank to each. In this example, there are two absolute differences of 0.1. They would occupy ranks 1 and 2, so each is assigned a rank of 1.5.

The sum of the positive ranks is 5 + 6 + 3 + 8 + 7 + 4 = 33. The sum of the negative ranks is –1.5 + (–1.5) = –3. The test statistic is the smaller of the absolute values of these sums, ws = 3. There are 8 + and – signs, so n = 8. The critical value is 2. Because ws = 3 is greater than the critical value, fail to reject the null hypothesis. There is not enough evidence to conclude that the new drug reduces headache hours.
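The ranking steps, including the averaging of tied ranks, can be sketched as follows. This is an illustration under stated assumptions: the helper `average_ranks` is hypothetical, and the list of differences is the one implied by the rank sums in the example (two differences of –0.1 tied at rank 1.5, six positive differences).

```python
def average_ranks(values):
    """Rank values from smallest to largest; tied values share their average rank."""
    ordered = sorted(values)
    # A value first appearing at 0-based position f and occurring c times
    # occupies ranks f+1 .. f+c, whose average is f + 1 + (c - 1) / 2.
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]

# Before - After differences, as assumed from the example's rank sums
diffs = [-0.1, 1.1, 1.3, -0.1, 0.5, 1.8, 1.4, 0.8]
diffs = [d for d in diffs if d != 0]            # zero differences would be dropped

ranks = average_ranks([abs(d) for d in diffs])  # rank the absolute differences
positive_sum = sum(r for d, r in zip(diffs, ranks) if d > 0)
negative_sum = sum(r for d, r in zip(diffs, ranks) if d < 0)  # reported as a magnitude

w_s = min(positive_sum, negative_sum)           # Wilcoxon signed-rank statistic
print(positive_sum, negative_sum, w_s)          # prints: 33.0 3.0 3.0
```

The quadratic-time ranking helper is chosen for readability; for large samples a sort-based pass over tie groups would be the more usual approach.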

Wilcoxon Rank-Sum Test
The Wilcoxon rank-sum test is a nonparametric test that can be used to determine whether two independent samples were selected from populations having the same distribution. Both sample sizes must be at least 10. Here n1 represents the size of the smaller sample and n2 the size of the larger sample. When both samples are the same size, it does not matter which is called n1.

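Wilcoxon Rank-Sum Test
Test statistic: Combine the data from both samples and rank it. R = the sum of the ranks for the smaller sample. Find the z-score for the value of R:
$$z = \frac{R - \mu_R}{\sigma_R}$$
where
$$\mu_R = \frac{n_1(n_1 + n_2 + 1)}{2} \quad \text{and} \quad \sigma_R = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}$$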
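A minimal sketch of the rank-sum computation follows. The function names and the two sample lists are hypothetical (the slides give no data for this test), and the code assumes the standard formulas μ_R = n1(n1 + n2 + 1)/2 and σ_R = √(n1·n2·(n1 + n2 + 1)/12).

```python
import math

def average_ranks(values):
    """Rank values from smallest to largest; tied values share their average rank."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]

def rank_sum_z(sample_a, sample_b):
    """Return (R, z) for the Wilcoxon rank-sum test (hypothetical helper)."""
    # Sample 1 is the smaller sample; with equal sizes the choice does not matter
    s1, s2 = sorted((list(sample_a), list(sample_b)), key=len)
    n1, n2 = len(s1), len(s2)
    if n1 < 10:
        raise ValueError("both samples must have at least 10 entries")
    ranks = average_ranks(s1 + s2)          # rank the combined data
    R = sum(ranks[:n1])                     # rank sum for the smaller sample
    mu_R = n1 * (n1 + n2 + 1) / 2
    sigma_R = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return R, (R - mu_R) / sigma_R

# Made-up data purely to exercise the function
group_a = list(range(1, 11))     # 1..10
group_b = list(range(11, 21))    # 11..20
R, z = rank_sum_z(group_a, group_b)
```

The resulting z is compared with standard normal critical values for the chosen level of significance.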

Section 11.3 The Kruskal-Wallis Test

The Kruskal-Wallis Test The Kruskal-Wallis test is a nonparametric test that can be used to determine whether three or more independent samples were selected from populations having the same distribution. H0: There is no difference in the population distributions. Ha: There is a difference in the population distributions. Combine the data and rank the values. Then separate the data according to sample and find the sum of the ranks for each sample. Use this test when conditions for ANOVA cannot be satisfied. Ri = the sum of the ranks for sample i.

The Kruskal-Wallis Test
Given three or more independent samples, the test statistic H for the Kruskal-Wallis test is
$$H = \frac{12}{N(N + 1)} \left( \frac{R_1^2}{n_1} + \frac{R_2^2}{n_2} + \cdots + \frac{R_k^2}{n_k} \right) - 3(N + 1)$$
where k represents the number of samples, ni is the size of the i th sample, N is the sum of the sample sizes, and Ri is the sum of the ranks of the i th sample. The sampling distribution is a chi-square distribution with k – 1 degrees of freedom. Reject the null hypothesis when H is greater than the critical value. (Always use a right-tailed test.)

Application You want to compare the hourly pay rates of accountants who work in Michigan, New York and Virginia. To do so, you randomly select 10 accountants in each state and record their hourly pay rate as shown below. At the .01 level, can you conclude that the distributions of accountants’ hourly pay rates in these three states are different?

1. Write the null and alternative hypothesis. H0: There is no difference in the hourly pay rates in the 3 states. Ha: There is a difference in the hourly pay rates in the 3 states.
2. State the level of significance. α = 0.01
3. Determine the sampling distribution. The sampling distribution is chi-square with d.f. = 3 – 1 = 2.
4. Find the critical value. From Table 6, the critical value is 9.210.
5. Find the rejection region. Reject H0 if H is greater than 9.210.

Test Statistic
Michigan salaries are in ranks 2, 3, 4, 5, 6, 7, 13, 15, 17.5, 22. The sum is 94.5.
New York salaries are in ranks 8, 14, 19, 21, 23, 24, 27, 28, 29, 30. The sum is 223.
Virginia salaries are in ranks 1, 9, 10, 11, 12, 16, 17.5, 20, 25, 26. The sum is 147.5.

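Find the test statistic.
R1 = 94.5, R2 = 223, R3 = 147.5
n1 = 10, n2 = 10 and n3 = 10, so N = 30
$$H = \frac{12}{30(30 + 1)} \left( \frac{94.5^2}{10} + \frac{223^2}{10} + \frac{147.5^2}{10} \right) - 3(30 + 1) \approx 10.76$$
Make your decision. The test statistic 10.76 is greater than the critical value 9.210, so it falls in the rejection region. Reject the null hypothesis.
Interpret your decision. At the 0.01 level, there is enough evidence to conclude that the distributions of accountants’ hourly pay rates differ among the 3 states.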
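The arithmetic for H can be reproduced from the rank lists given in the example. This is a sketch only: the function name `kruskal_wallis_h` is hypothetical, and it assumes the standard form H = 12 / (N(N + 1)) · Σ(Ri² / ni) – 3(N + 1) with no correction for ties.

```python
def kruskal_wallis_h(rank_sums, sizes):
    """Kruskal-Wallis H from per-sample rank sums and sample sizes (no tie correction)."""
    N = sum(sizes)
    weighted = sum(R ** 2 / n for R, n in zip(rank_sums, sizes))
    return 12 / (N * (N + 1)) * weighted - 3 * (N + 1)

# Ranks of the hourly pay rates, as listed in the example
michigan = [2, 3, 4, 5, 6, 7, 13, 15, 17.5, 22]
new_york = [8, 14, 19, 21, 23, 24, 27, 28, 29, 30]
virginia = [1, 9, 10, 11, 12, 16, 17.5, 20, 25, 26]

samples = [michigan, new_york, virginia]
rank_sums = [sum(s) for s in samples]      # 94.5, 223, 147.5
sizes = [len(s) for s in samples]          # 10, 10, 10

H = kruskal_wallis_h(rank_sums, sizes)
critical_value = 9.210                     # chi-square, d.f. = 2, alpha = 0.01 (from the example)
print(round(H, 2), H > critical_value)     # prints: 10.76 True
```

A quick sanity check on any ranking is that the rank sums add up to N(N + 1)/2, here 30 · 31 / 2 = 465.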

Section 11.4 Rank Correlation

Rank Correlation
The Spearman rank correlation coefficient, rs, is a measure of the strength of the relationship between two variables. It is calculated using the ranks of paired sample data entries. The formula for the Spearman rank correlation coefficient is
$$r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$$
where n is the number of paired data entries and d is the difference between the ranks of a paired data entry.
This is the nonparametric alternative to the Pearson correlation coefficient. It can be used for ordinal data. To calculate it with Minitab, rank all values and calculate the Pearson correlation coefficient on the columns of ranks.
The hypotheses:
H0: ρs = 0 (There is no correlation between the variables.)
Ha: ρs ≠ 0 (There is a significant correlation between the variables.)

Rank Correlation
Seven candidates applied for a nursing position. The seven candidates were placed in rank order first by x and then by y. The results of the rankings are listed below. Using a .05 level of significance, test the claim that there is a significant correlation between the variables.
Candidate   x   y
1           2   1
2           4   4
3           1   3
4           5   2
5           7   6
6           3   1
7           6   7
H0: ρs = 0 (There is no correlation between the variables.)
Ha: ρs ≠ 0 (There is a significant correlation between the variables.)

Application
Candidate   x   y   d = x – y   d²
1           2   1    1          1
2           4   4    0          0
3           1   3   –2          4
4           5   2    3          9
5           7   6    1          1
6           3   1    2          4
7           6   7   –1          1
The sum of the d² values is 20, so
$$r_s = 1 - \frac{6(20)}{7(7^2 - 1)} \approx 0.643$$
The critical value is 0.715. Since the statistic 0.643 is less than the critical value, it does not fall in the rejection region, so fail to reject H0. There is not enough evidence to support the claim that there is a significant correlation.
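The coefficient in this example can be recomputed directly from the two rank columns. This is a sketch for illustration: the rank lists are read from the table above as given, the critical value 0.715 is the one stated in the example, and the formula assumed is rs = 1 – 6Σd² / (n(n² – 1)).

```python
# Ranks of the seven candidates, as read from the example's table
x = [2, 4, 1, 5, 7, 3, 6]
y = [1, 4, 3, 2, 6, 1, 7]

n = len(x)
d_squared = [(a - b) ** 2 for a, b in zip(x, y)]   # squared rank differences
sum_d2 = sum(d_squared)

r_s = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))          # Spearman rank correlation

critical_value = 0.715                              # as stated in the example
reject = abs(r_s) > critical_value

print(sum_d2, round(r_s, 3), reject)                # prints: 20 0.643 False
```

Because only the squared differences enter the formula, the sign of each d does not affect rs.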