Copyright © Cengage Learning. All rights reserved. 14 Elements of Nonparametric Statistics.


1 14 Elements of Nonparametric Statistics

2 14.2 The Sign Test

3 The sign test is a versatile and exceptionally easy-to-apply nonparametric method that uses only plus and minus signs. Three sign test applications are presented here: (1) a confidence interval for the median of one population, (2) a hypothesis test concerning the value of the median for one population, and (3) a hypothesis test concerning the median difference (paired difference) for two dependent samples.

4 The Sign Test

These sign tests are carried out using the same basic confidence interval and hypothesis test procedures as described in earlier chapters. They are the nonparametric alternatives to the t-tests used for one mean and for the difference between two dependent means.

Assumptions for inferences about the population single-sample median using the sign test: The n random observations that form the sample are selected independently, and the population is continuous in the vicinity of the median M.

5 Single-Sample Confidence Interval Procedure

6 The sign test can be applied to obtain a confidence interval for the unknown population median, M. To do this, we arrange the sample data in ascending order (smallest to largest). The data are identified as $x_1$ (smallest), $x_2$, $x_3$, …, $x_n$ (largest). The critical value, k (known as the “maximum allowable number of signs”), is obtained from Table 12 in Appendix B and tells us the number of positions to be dropped from each end of the ordered data.

7 Single-Sample Confidence Interval Procedure

The remaining extreme values become the bounds of the 1 – α confidence interval. That is, the lower boundary for the confidence interval is $x_{k+1}$, the (k + 1)th data value; the upper boundary is $x_{n-k}$, the (n – k)th data value. In general, the two data values that bound the confidence interval occupy positions k + 1 and n – k, where k is the critical value read from Table 12. Thus,

$x_{k+1}$ to $x_{n-k}$, 1 – α confidence interval for M

The next example will clarify this procedure.

8 Example 1 – Constructing a Confidence Interval for a Population Median

Suppose that we have a random sample of 12 daily high-temperature readings in ascending order, [50, 62, 64, 76, 76, 77, 77, 77, 80, 86, 92, 94], and we wish to form a 95% confidence interval for the population median. Table 12 shows a critical value of 2 (k = 2) for n = 12 and α = 0.05 for a two-tailed hypothesis test. This means that we drop the two outermost values on each end (50 and 62 on the left; 92 and 94 on the right). The confidence interval is bounded inclusively by the remaining end values, 64 and 86.

9 Example 1 – Constructing a Confidence Interval for a Population Median (cont’d)

That is, the 95% confidence interval is 64 to 86 and is expressed as:

64° to 86°, the 95% confidence interval for the median daily high temperature
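The procedure in Example 1 can be sketched in a few lines of Python. This is a minimal sketch, not the textbook's method: it assumes the Table 12 entry is the largest k for which the two-tailed binomial probability 2·P(X ≤ k), with p = 0.5, does not exceed α. The function names `sign_test_critical_value` and `median_confidence_interval` are illustrative only.

```python
from math import comb

def sign_test_critical_value(n, alpha):
    """Largest k with 2 * P(X <= k) <= alpha for X ~ Binomial(n, 0.5).

    Assumption: this mirrors how a two-tailed sign-test table is built.
    Returns -1 if no k satisfies the condition.
    """
    cumulative = 0  # running sum of binomial coefficients C(n, 0..i)
    k = -1
    for i in range(n + 1):
        cumulative += comb(n, i)
        if 2 * cumulative / 2**n > alpha:
            break
        k = i
    return k

def median_confidence_interval(sorted_data, alpha):
    """Bounds x_(k+1) and x_(n-k) of the 1 - alpha interval for the median."""
    n = len(sorted_data)
    k = sign_test_critical_value(n, alpha)
    if k < 0:
        raise ValueError("sample too small for this confidence level")
    # Positions k+1 and n-k are 1-based; Python lists are 0-based.
    return sorted_data[k], sorted_data[n - k - 1]

temps = [50, 62, 64, 76, 76, 77, 77, 77, 80, 86, 92, 94]
k = sign_test_critical_value(len(temps), 0.05)
lower, upper = median_confidence_interval(temps, 0.05)
print(f"k = {k}, interval = {lower} to {upper}")
```

For the 12 temperature readings, this reproduces k = 2 and the bounds 64 and 86 stated in the example.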

10 Single-Sample Hypothesis-Testing Procedure

11 Single-Sample Hypothesis-Testing Procedure

The sign test can be used when the null hypothesis to be tested concerns the value of the population median M. The test may be either one- or two-tailed. This test procedure is presented in the following example.

12 Example 2 – Two-tailed Hypothesis Test

A random sample of 75 students was selected, and each student was asked to carefully measure the amount of time it takes to commute from his or her front door to the college parking lot. The data collected were used to test the hypothesis “The median time required for students to commute is 15 minutes,” against the alternative that the median is unequal to 15 minutes. The 75 pieces of data were summarized as follows:

Under 15: 18
15: 12
Over 15: 45

Use the sign test to test the null hypothesis against the alternative hypothesis.

13 Example 2 – Solution

The data are converted to + and – signs according to whether each data value is more or less than 15. A plus sign is assigned to each value larger than 15, a minus sign to each value smaller than 15, and a zero to each value equal to 15. The sign test uses only the plus and minus signs; therefore, the zeros are discarded and the usable sample size becomes 63. That is, n(+) = 45, n(–) = 18, and n = n(+) + n(–) = 45 + 18 = 63.

14 Example 2 – Solution (cont’d)

Step 1
a. Parameter of interest: M, the population median time to commute
b. Statement of hypotheses:
$H_o$: M = 15
$H_a$: M ≠ 15

Step 2
a. Assumptions: The 75 observations were randomly selected, and the variable, commute time, is continuous.

15 Example 2 – Solution (cont’d)

b. Test statistic: The test statistic that will be used is the number of the less frequent sign: the smaller of n(+) and n(–), which is n(–) for our example. We will want to reject the null hypothesis whenever the number of the less frequent sign is extremely small. Table 12 in Appendix B gives the maximum allowable number of the less frequent sign, k, that will allow us to reject the null hypothesis. That is, if the number of the less frequent sign is less than or equal to the critical value in the table, we will reject $H_o$.

16 Example 2 – Solution (cont’d)

If the observed value of the less frequent sign is larger than the table value, we will fail to reject $H_o$. In the table, n is the total number of signs, not including zeros. The test statistic = x = n(least frequent sign).
c. Level of significance: α = 0.05 for a two-tailed test

Step 3
a. Sample information: n = 63 [n(–) = 18, n(+) = 45]
b. Test statistic: The observed value of the test statistic is x = n(–) = 18.

17 Example 2 – Solution (cont’d)

Step 4 Probability Distribution:

p-Value:
a. Since the concern is for values “not equal to,” the p-value is the area of both tails. We will find the left tail and double it: P = 2 × P(x ≤ 18, for n = 63).

[Figure; horizontal axis: number of less frequent sign]

18 Example 2 – Solution (cont’d)

To find the p-value, you have two options:
1. Use Table 12 (Appendix B) to place bounds on the p-value. Table 12 lists only two-tailed values (do not double): P < 0.01.
2. Use a computer or calculator to find the p-value: P = 0.0011.
b. The p-value is smaller than α.

19 Example 2 – Solution (cont’d)

Classical:
a. The critical region is split into two equal parts because $H_a$ expresses concern for values related to “not equal to.” Since the table is for two-tailed tests, the critical value is located at the intersection of the α = 0.05 column and the n = 63 row of Table 12: 23.
b. x is in the critical region, as shown in the figure.

20 Example 2 – Solution (cont’d)

Step 5
a. Decision: Reject $H_o$.
b. Conclusion: The sample shows sufficient evidence at the 0.05 level to conclude that the median commute time is not equal to 15 minutes.
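The "computer or calculator" option in Example 2 can be sketched with the exact binomial distribution. This is an illustrative sketch assuming the p-value is computed as 2·P(X ≤ x) for X ~ Binomial(n, 0.5), capped at 1; the helper name `binomial_lower_tail` is hypothetical, and the software used by the textbook may use a different method.

```python
from math import comb

def binomial_lower_tail(x, n):
    """P(X <= x) for X ~ Binomial(n, 0.5), computed exactly."""
    return sum(comb(n, i) for i in range(x + 1)) / 2**n

# Counts from Example 2; the 12 values equal to 15 (zeros) are discarded.
n_plus, n_minus = 45, 18
n = n_plus + n_minus
x = min(n_plus, n_minus)  # number of the less frequent sign

# Two-tailed test: double the left tail (capped at 1).
p_value = min(1.0, 2 * binomial_lower_tail(x, n))
alpha = 0.05
reject = p_value <= alpha

print(f"n = {n}, x = {x}, p-value = {p_value:.4f}, reject H_o: {reject}")
```

The result falls well below 0.01, in line with the bound read from Table 12, so the decision to reject $H_o$ does not depend on which option is used.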

21 Single-Sample Hypothesis-Testing Procedure

Calculating the p-Value when Using the Sign Test

Method 1: Use Table 12 in Appendix B to place bounds on the p-value. By inspecting the n = 63 row of Table 12, you can determine an interval within which the p-value lies. Locate the value of x along the n = 63 row and read the bounds from the top of the table. Table 12 lists only two-tailed values (therefore, do not double): P < 0.01.

22 Single-Sample Hypothesis-Testing Procedure

Method 2: If you are doing the hypothesis test with the aid of a computer or graphing calculator, most likely it will calculate the p-value for you.

23 Two-Sample Hypothesis-Testing Procedure

24 Two-Sample Hypothesis-Testing Procedure

The sign test may also be applied to a hypothesis test dealing with the median difference between paired data that result from two dependent samples. A familiar application is the use of before-and-after testing to determine the effectiveness of some activity. In a test of this nature, the signs of the differences are used to carry out the test. Again, zeros are disregarded.

Assumptions for inferences about the median of paired differences using the sign test: The paired data are selected independently, and the variables are ordinal or numerical.
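Reducing paired data to signs can be sketched as follows. The function name `paired_sign_counts` and the six before/after weights are hypothetical, made up purely for illustration; they are not the Table 14.2 data. The sketch takes each difference as after minus before, so a minus sign means a decrease.

```python
def paired_sign_counts(before, after):
    """Count plus signs, minus signs, and zeros of the differences after - before."""
    if len(before) != len(after):
        raise ValueError("before and after must have the same length")
    diffs = [a - b for b, a in zip(before, after)]
    n_plus = sum(d > 0 for d in diffs)   # increases
    n_minus = sum(d < 0 for d in diffs)  # decreases
    zeros = sum(d == 0 for d in diffs)   # ties, disregarded by the sign test
    return n_plus, n_minus, zeros

# Hypothetical before-and-after measurements (NOT the textbook's data).
before = [148, 176, 153, 116, 129, 128]
after = [146, 175, 153, 118, 125, 126]

n_plus, n_minus, zeros = paired_sign_counts(before, after)
n = n_plus + n_minus  # usable sample size after discarding zeros
print(f"n(+) = {n_plus}, n(-) = {n_minus}, zeros = {zeros}, n = {n}")
```

Only the counts n(+) and n(–) feed into the test; the sizes of the differences are not used.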

25 Example 3 – One-tailed Hypothesis Test for the Median of Paired Differences

A new no-exercise, no-starve weight-reducing plan has been developed and advertised. To test the claim that “you will lose weight within 2 weeks or...,” a local statistician obtained the before-and-after weights of 18 people who had used this plan.

[Table 14.2: Sample Results on Weight-Reducing Plan]

26 Example 3 – One-tailed Hypothesis Test for the Median of Paired Differences (cont’d)

Table 14.2 lists the people, their weights, and a minus (–) for those who lost weight during the 2 weeks, a 0 for those whose weight remained the same, and a plus (+) for those who actually gained weight. The claim being tested is that people lose weight. The null hypothesis that will be tested is, “There is no weight loss (or the median weight loss is zero),” meaning that only a rejection of the null hypothesis will allow us to conclude in favor of the advertised claim.

27 Example 3 – One-tailed Hypothesis Test for the Median of Paired Differences (cont’d)

Actually, we will be testing to see whether there are significantly more minus signs than plus signs. If the weight-reducing plan is of absolutely no value, we would expect to find an equal number of plus and minus signs. If it works, there should be significantly more minus signs than plus signs. Thus, the test performed here will be a one-tailed test. (We want to reject the null hypothesis in favor of the advertised claim if there are “many” minus signs.)

28 Example 3 – Solution

Step 1
a. Parameter of interest: M, the median weight loss
b. Statement of hypotheses:
$H_o$: M = 0 (no weight loss)
$H_a$: M < 0 (weight loss)

Step 2
a. Assumptions: The 18 observations were randomly selected, and the variables, weight before and weight after, are both continuous.

29 Example 3 – Solution (cont’d)

b. Test statistic: The number of the less frequent sign: the test statistic = x = n(least frequent sign)
c. Level of significance: α = 0.05 for a one-tailed test

Step 3
a. Sample information: n = 16 [n(+) = 5, n(–) = 11]
b. Test statistic: The observed value of the test statistic is x = n(+) = 5.

30 Example 3 – Solution (cont’d)

Step 4 Probability Distribution:

p-Value:
a. Since the concern is for values “less than,” the p-value is the area to the left: P = P(x ≤ 5, for n = 16).

To find the p-value, you have two options:
1. Use Table 12 in Appendix B to estimate the p-value. Table 12 lists only two-tailed α (this is one-tailed, so divide α by two): P ≈ 0.125.

[Figure; horizontal axis: number of less frequent sign]

31 Example 3 – Solution (cont’d)

2. Use a computer or calculator to find the p-value: P = 0.1051.
b. The p-value is not smaller than α.

Classical:
a. The critical region is one-tailed because $H_a$ expresses concern for values related to “less than.” Since the table is for two-tailed tests, the critical value is located at the intersection of the α = 0.10 column (α = 0.05 in each tail) and the n = 16 row of Table 12: 4.

32 Example 3 – Solution (cont’d)

b. x is not in the critical region, as shown in the figure.

Step 5
a. Decision: Fail to reject $H_o$.
b. Conclusion: The evidence observed is not sufficient to allow us to reject the no-weight-loss null hypothesis at the 0.05 level of significance.
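The one-tailed p-value in Example 3 can be checked with exact binomial arithmetic. This is a small sketch assuming the p-value is P(X ≤ 5) for X ~ Binomial(16, 0.5); the variable names are illustrative.

```python
from math import comb

# Counts from Example 3; the two people with no weight change are discarded.
n_plus, n_minus = 5, 11
n = n_plus + n_minus
x = n_plus  # number of the less frequent sign

# One-tailed test: left-tail area only, P(X <= x) with p = 0.5.
p_value = sum(comb(n, i) for i in range(x + 1)) / 2**n
alpha = 0.05
reject = p_value <= alpha

print(f"n = {n}, x = {x}, p-value = {p_value:.4f}, reject H_o: {reject}")
```

The sum of the binomial coefficients is 6885 out of 2^16 = 65536 equally likely sign patterns, which gives the 0.1051 quoted in the solution.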

33 Normal Approximation

34 Normal Approximation

The sign test may be carried out by means of a normal approximation using the standard normal variable z. The normal approximation may be used if Table 12 does not show the particular levels of significance desired or if n is large.

Notes
1. x may be the number of the less frequent sign or the more frequent sign. You will have to determine this in such a way that the direction is consistent with the interpretation of the situation.

35 Normal Approximation

2. x is really a binomial random variable, where p = 0.5. The sign test statistic satisfies the properties of a binomial experiment. Each sign is the result of an independent trial. There are n trials, and each trial has two possible outcomes (+ or –). Since the median is used, the probabilities for each outcome are both 0.5. Therefore, the mean, $\mu_x$, is equal to

$$\mu_x = np = n(0.5) = \frac{n}{2}$$

36 Normal Approximation

and the standard deviation, $\sigma_x$, is equal to

$$\sigma_x = \sqrt{npq} = \sqrt{n(0.5)(0.5)} = \frac{1}{2}\sqrt{n}$$

3. x is a discrete variable. But we know that the normal distribution must be used only with continuous variables. However, although the binomial random variable is discrete, it does become approximately normally distributed for large n.

37 Normal Approximation

Nevertheless, when using the normal distribution for testing, we should make an adjustment in the variable so that the approximation is more accurate. This adjustment is illustrated in Figure 14.1 and is called a continuity correction.

[Figure 14.1: Continuity Correction]

38 Normal Approximation

For this discrete variable, the area that represents the probability is a rectangular bar. The bar is 1 unit wide, from ½ unit below to ½ unit above the value of interest. Therefore, when z is to be used, we will need to make a ½-unit adjustment before calculating the observed value of z. Thus x′ will be the adjusted value for x. If x is larger than n/2, then x′ = x – ½. If x is smaller than n/2, then x′ = x + ½. The test is then completed by the usual procedure, using x′.
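The half-unit adjustment can be written as a small helper. This is a sketch only; the name `continuity_corrected` is illustrative, and the behavior when x equals n/2 exactly (no adjustment) is an assumption, since the slide does not address that case.

```python
def continuity_corrected(x, n):
    """Move x half a unit toward n/2, the mean of Binomial(n, 0.5)."""
    half = n / 2
    if x > half:
        return x - 0.5
    if x < half:
        return x + 0.5
    return float(x)  # assumption: no adjustment when x equals n/2

# 40 plus signs out of 120 lies below n/2 = 60, so half a unit is added.
below = continuity_corrected(40, 120)
# 80 minus signs out of 120 lies above n/2 = 60, so half a unit is subtracted.
above = continuity_corrected(80, 120)
print(below, above)
```

In both directions the adjusted value moves toward n/2, which makes the observed z slightly less extreme than it would be without the correction.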

39 Normal Approximation

Confidence Interval Procedure

If the normal approximation is to be used (including the continuity correction), the position numbers for a 1 – α confidence interval for M are found using the formula:

$$\frac{1}{2}(n) \pm \left(\frac{1}{2} + \frac{1}{2}\,z(\alpha/2)\sqrt{n}\right) \qquad (14.1)$$

The interval is

$x_L$ to $x_U$, 1 – α confidence interval for M (median)

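40 Normal Approximation

where

$$L = \frac{n}{2} - \frac{1}{2} - \frac{1}{2}\,z(\alpha/2)\sqrt{n} \quad \text{and} \quad U = \frac{n}{2} + \frac{1}{2} + \frac{1}{2}\,z(\alpha/2)\sqrt{n}$$

Note: L should be rounded down and U should be rounded up to be sure that the level of confidence is at least 1 – α.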

41 Example 4 – Constructing a Confidence Interval for a Population Median

Estimate the population median daily high temperature with a 95% confidence interval based on the following random sample of 60 daily high temperature readings. (Note: Temperatures have been arranged in ascending order.)

[Table of the 60 temperature readings]

42 Example 4 – Solution

When we use formula (14.1), the position numbers L and U are

$$\frac{1}{2}(60) \pm \left(\frac{1}{2} + \frac{1}{2}(1.96)\sqrt{60}\right) = 30 \pm (0.50 + 7.59) = 30 \pm 8.09$$

Thus,

L = 30 – 8.09 = 21.91, rounded down becomes 21 (21st data value)

43 Example 4 – Solution (cont’d)

U = 30 + 8.09 = 38.09, rounded up becomes 39 (39th data value)

Therefore,

80° to 85°, the 95% confidence interval for the median high daily temperature.
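The position numbers in Example 4 can be reproduced with the standard library. This is a minimal sketch of formula (14.1); the function name `median_ci_positions` is illustrative, and `statistics.NormalDist().inv_cdf` is used here in place of a table lookup for z(α/2).

```python
from math import ceil, floor, sqrt
from statistics import NormalDist

def median_ci_positions(n, confidence):
    """Position numbers L and U from n/2 +/- (1/2 + (1/2) z(alpha/2) sqrt(n))."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z(alpha/2), about 1.96 for 95%
    margin = 0.5 + 0.5 * z * sqrt(n)
    lower = floor(n / 2 - margin)  # round L down
    upper = ceil(n / 2 + margin)   # round U up
    return lower, upper, margin

L, U, margin = median_ci_positions(60, 0.95)
print(f"margin = {margin:.2f}, L = {L}, U = {U}")
```

With n = 60 this gives a margin of about 8.09 and positions 21 and 39, so the interval runs from the 21st to the 39th ordered reading.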

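44 Normal Approximation

Hypothesis-Testing Procedure

When a hypothesis test is to be completed using the standard normal distribution, z will be calculated with the formula:

$$z = \frac{x' - \frac{n}{2}}{\frac{1}{2}\sqrt{n}} \qquad (14.2)$$

where x′ is the continuity-corrected value of x.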

45 Example 5 – One-tailed Hypothesis Test

Use the sign test to test the hypothesis that the median number of hours, M, worked by students at a certain college is at least 15 hours per week. A survey of 120 students was taken; a plus sign was recorded if the number of hours the student worked last week was equal to or greater than 15, and a minus sign was recorded if the number of hours was less than 15. Totals showed 80 minus signs and 40 plus signs.

46 Example 5 – Solution

Step 1
a. Parameter of interest: M, the median number of hours worked by students
b. Statement of hypotheses:
$H_o$: M = 15 (≥) (at least as many plus signs as minus signs)
$H_a$: M < 15 (fewer plus signs than minus signs)

47 Example 5 – Solution (cont’d)

Step 2
a. Assumptions: The random sample of 120 students was independently surveyed, and the variable, hours worked, is continuous.
b. Probability distribution and test statistic: The standard normal z and formula (14.2)
c. Level of significance: α = 0.05

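48 Example 5 – Solution (cont’d)

Step 3
a. Sample information: n(+) = 40 and n(–) = 80; therefore, n = 120 and x is the number of plus signs; x = 40.
b. Test statistic: Using formula (14.2), with x′ = 40 + ½ = 40.5 because x is smaller than n/2 = 60, we have

$$z = \frac{x' - \frac{n}{2}}{\frac{1}{2}\sqrt{n}} = \frac{40.5 - 60}{\frac{1}{2}\sqrt{120}}$$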

49 Example 5 – Solution (cont’d)

= –3.562 = –3.56

Step 4 Probability Distribution:

p-Value:
a. Use the left-hand tail because $H_a$ expresses concern for values related to “fewer than.” P = P(z < –3.56), as shown in the figure.

50 Example 5 – Solution (cont’d)

To find the p-value, you have three options:
1. Use Table 3 (Appendix B) to calculate the p-value: P = 0.0002.
2. Use Table 5 (Appendix B) to place bounds on the p-value: P = 0.0002.
3. Use a computer or calculator to find the p-value: P = 0.0002.
b. The p-value is smaller than α.

51 Example 5 – Solution (cont’d)

Classical:
a. The critical region is the left-hand tail because $H_a$ expresses concern for values related to “fewer than.” The critical value is obtained from Table 4A: –z(0.05) = –1.65.
b. z is in the critical region, as shown in red in the figure.

52 Example 5 – Solution (cont’d)

Step 5
a. Decision: Reject $H_o$.
b. Conclusion: At the 0.05 level, there are significantly more minus signs than plus signs, thereby implying that the median is less than the claimed 15 hours.
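The normal-approximation test in Example 5 can be sketched with the standard library. This is an illustrative sketch of formula (14.2) with the continuity correction; the function name `sign_test_z` is hypothetical, and `statistics.NormalDist().cdf` stands in for the table lookup of the left-tail area.

```python
from math import sqrt
from statistics import NormalDist

def sign_test_z(x, n):
    """z = (x' - n/2) / ((1/2) sqrt(n)), where x' is x moved half a unit toward n/2."""
    half = n / 2
    if x < half:
        x_adj = x + 0.5
    elif x > half:
        x_adj = x - 0.5
    else:
        x_adj = float(x)  # assumption: no adjustment when x equals n/2
    return (x_adj - half) / (0.5 * sqrt(n))

# Example 5: x = 40 plus signs out of n = 120 signs.
z = sign_test_z(40, 120)
p_value = NormalDist().cdf(z)  # left-tail area, P(Z < z)
alpha = 0.05
reject = p_value <= alpha

print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject H_o: {reject}")
```

The computed z is about –3.56 and the left-tail area rounds to 0.0002, consistent with the decision to reject $H_o$.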

