STAT 651 Lecture 8. Copyright (c) Bani Mallick. Topics in Lecture #8: sign test for paired comparisons, Wilcoxon signed rank.


1 STAT 651 Lecture 8
Copyright (c) Bani Mallick

2 Topics in Lecture #8
Sign test for paired comparisons.
Wilcoxon signed rank test for paired comparisons.
Comparing two population means: first pass.

3 Book Sections Covered in Lecture #8
My own material (sign test).
Chapter 6.5 (Wilcoxon signed rank test, although I think my explanation is better).
Chapters 6.1-6.2 (comparing two population means: this lecture will not have any formulae).

4 Lecture 7 Review: Sample Size Calculations
You want to test at level (Type I error) $\alpha$ the null hypothesis that the mean = 0.
You want power $1 - \beta$ to detect a change from the hypothesized mean by the amount $\Delta$ or more, i.e., the mean is greater than $\Delta$ or the mean is less than $-\Delta$.
There is a formula for this, which I showed you in class.
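
The formula itself is not reproduced on this slide. As a rough sketch only, the code below uses the common normal-approximation formula $n = \left((z_{\alpha/2} + z_{\beta})\,\sigma/\Delta\right)^2$ for a two-sided test with known $\sigma$. This may differ from the version shown in class; the function name and the example numbers are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def sample_size_two_sided(alpha, power, delta, sigma):
    """Approximate n for a two-sided level-alpha z-test (sigma assumed known)
    to reach the requested power against a shift of size delta."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    z_beta = NormalDist().inv_cdf(power)           # z_{beta}, where power = 1 - beta
    return math.ceil(((z_alpha + z_beta) * sigma / delta) ** 2)


# Hypothetical inputs: 5% level, 80% power, detect a shift of half a standard deviation.
n = sample_size_two_sided(alpha=0.05, power=0.80, delta=0.5, sigma=1.0)
print(n)
```

Halving $\Delta$ roughly quadruples the required sample size, which is why small effects need large studies.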

5 Lecture 7 Review: Never Accept a Null Hypothesis
Suppose we use a 95% confidence interval, and it includes zero.
Why do I say: with 95% confidence, I cannot reject that the population mean is zero?
I never, ever say: I can therefore conclude that the population mean is zero.

6 Lecture 7 Review: Never Accept a Null Hypothesis
If you pick a tiny sample size, there is no statistical power to reject the null hypothesis.
In particular, p-values are not the probability that the null hypothesis is true.

7 Lecture 7 Review: P-Values
The p-value is NOT the probability that the null hypothesis is true.
p-values are simply a mechanical way to understand what will happen to hypothesis tests when you go out and compute them.
For example, if you take n = 2, you will have no power, hence you will have high p-values.
Does this mean that the null hypothesis has a high probability of being correct? No! It means you have a rotten study.

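8 Lecture 7 Review: Student’s t-Distribution
The $(1-\alpha)100\%$ CI when $\sigma$ was known was $\bar{X} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$.
The $(1-\alpha)100\%$ CI when $\sigma$ is unknown is $\bar{X} \pm t_{\alpha/2}(n-1)\,s/\sqrt{n}$.
You replace $\sigma$ by $s$ and $z_{\alpha/2}$ by $t_{\alpha/2}(n-1)$.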

9 Lecture 7 Review: Student’s t-Distribution
Take 95% confidence, $\alpha = 0.05$, $z_{\alpha/2} = 1.96$.
n = 3, n-1 = 2, $t_{\alpha/2}(n-1) = 4.303$
n = 10, n-1 = 9, $t_{\alpha/2}(n-1) = 2.262$
n = 30, n-1 = 29, $t_{\alpha/2}(n-1) = 2.045$
n = 121, n-1 = 120, $t_{\alpha/2}(n-1) = 1.98$
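
To see how much the larger critical value matters for small n, here is a minimal sketch that builds a 95% interval of the form sample mean ± critical value × s/√n, using the critical values listed on this slide. The sample data and the function name are hypothetical, and the lookup table only covers the four degrees of freedom shown above.

```python
import math
from statistics import mean, stdev

# Critical values copied from the slide: t_{alpha/2}(n-1) for 95% confidence.
T_CRIT_95 = {2: 4.303, 9: 2.262, 29: 2.045, 120: 1.98}
Z_CRIT_95 = 1.96


def interval_95(data, use_t=True):
    """95% CI for the mean: xbar +/- crit * s / sqrt(n).
    With use_t=False the z value is used, which is only appropriate
    when the standard deviation is actually known."""
    n = len(data)
    crit = T_CRIT_95[n - 1] if use_t else Z_CRIT_95  # KeyError if df is not in the table
    half_width = crit * stdev(data) / math.sqrt(n)
    centre = mean(data)
    return centre - half_width, centre + half_width


sample = [4.0, 5.0, 6.0]  # hypothetical data, n = 3
low, high = interval_95(sample)
z_low, z_high = interval_95(sample, use_t=False)
print(low, high)      # t-based interval
print(z_low, z_high)  # narrower interval if 1.96 were (wrongly) used
```

With n = 3 the t-based interval is more than twice as wide as the one using 1.96, reflecting the extra uncertainty from estimating the standard deviation.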

10 Paired Comparisons
We have shown how to test $H_0$: population mean difference = 0 versus $H_a$: population mean difference $\neq$ 0, using t-statistics (confidence intervals and tests).

11 Paired Comparisons
Unfortunately, it often arises (as it does for the hormone assay) that the differences between two variables can have many outliers.
We know that outliers affect the sample mean and especially the sample standard deviation, making the latter larger.
Larger standard deviations mean larger confidence intervals and hence less power.

12 Paired Comparisons
There are two alternative methods that are not so affected by outliers.
These are the Wilcoxon signed rank test and the sign test.
Both are available in SPSS: “Analyze”, “Nonparametric Tests”, “2 Related Samples”; also check the “Sign” test option.

13 Paired Comparisons
The sign test is simple: recode the data.
+1 = positive difference
0 = no difference
-1 = negative difference
Then run a t-test and compute the p-value.
Problem: No confidence intervals.
Serves as a check on t-inferences.

14 HAND EXAMPLE
Data:   -2  -1   3   5   8   9
Signs:  -1  -1   1   1   1   1
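
The hand example can be sketched in a few lines: recode the six differences to signs, then compute the one-sample t-statistic on the recoded values, as the lecture describes. The helper names are hypothetical. No p-value is computed here, since that needs the t distribution (SPSS or a statistics library would supply it).

```python
import math
from statistics import mean, stdev


def sign(d):
    """+1 for a positive difference, 0 for no difference, -1 for a negative one."""
    return (d > 0) - (d < 0)


def one_sample_t(values):
    """t-statistic for testing that the population mean of `values` is 0."""
    n = len(values)
    return mean(values) / (stdev(values) / math.sqrt(n))


differences = [-2, -1, 3, 5, 8, 9]          # data from the hand example
signs = [sign(d) for d in differences]      # recoded data
t_signs = one_sample_t(signs)

print(signs)
print(round(t_signs, 4))
```

The resulting statistic is compared with $t_{\alpha/2}(n-1)$ for n - 1 = 5 degrees of freedom, which is 2.571 at 95% confidence.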

15 Paired Comparisons
The Wilcoxon signed rank test is simple: recode the data!
Take the absolute values of the data.
Order the absolute values from smallest to largest.
To the smallest absolute value, assign the number -1 if the actual difference is negative, 0 if there is no difference, +1 if the difference is positive.

16 Paired Comparisons
The Wilcoxon signed rank test is simple:
To the jth absolute value in order, assign the number -j if the actual difference is negative, 0 if there is no difference, +j if the difference is positive.
Then run a t-test and compute the p-value.
Problem: No confidence intervals.
Serves as a check on t-inferences.

17 HAND EXAMPLE
Data:         -2  -1   3   5   8   9
Absolute:      2   1   3   5   8   9
Rank:          2   1   3   4   5   6
Signed Rank:  -2  -1   3   4   5   6
(Run t-test on these guys)
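
The recoding in this hand example can be sketched as follows. The helper names are hypothetical, and the code assumes no ties among the absolute values (the slides do not say how ties are handled). It follows the lecture's recipe of running a one-sample t-test on the signed ranks, which is not the same computation as the exact signed rank test in statistical packages.

```python
import math
from statistics import mean, stdev


def sign(d):
    """+1 for a positive difference, 0 for no difference, -1 for a negative one."""
    return (d > 0) - (d < 0)


def signed_ranks(differences):
    """Rank |d| from smallest (rank 1) to largest, then attach the sign of d.
    Assumes no tied absolute values; zero differences are recoded to 0."""
    order = sorted(range(len(differences)), key=lambda i: abs(differences[i]))
    result = [0] * len(differences)
    for rank, i in enumerate(order, start=1):
        result[i] = rank * sign(differences[i])
    return result


def one_sample_t(values):
    """t-statistic for testing that the population mean of `values` is 0."""
    n = len(values)
    return mean(values) / (stdev(values) / math.sqrt(n))


differences = [-2, -1, 3, 5, 8, 9]   # data from the hand example
ranks = signed_ranks(differences)
t_ranks = one_sample_t(ranks)

print(ranks)
print(round(t_ranks, 4))
```

Because ranks replace the raw values, an extreme difference such as 900 in place of 9 would leave the signed ranks, and hence the statistic, unchanged.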

18 Armspan Data
Sign test p-value = 0.486
Wilcoxon signed rank test p-value = 0.281
t-test p-value = 0.282
All consistent!

19 Hormone Assay Data
Remember that in the hormone assay data, we seemed to get different inferences based on whether we used the raw data or their logarithms.
The sign test is not affected by transformations.
The Wilcoxon test may be slightly affected by transformations when studying paired comparisons.

20 Hormone Assay Data
t-test on raw data, p = 0.244
t-test on log data, p = 0.000
Sign test, logged or raw data, p = 0.001
Wilcoxon signed rank test, raw data, p = 0.016; logged data, p = 0.000
Remember, I claimed that the log data scale was most nearly bell-shaped, and hence thought there was a difference!

21 Comparing Two Population Means
A great deal of our effort will go into comparing population means.
Bluebonnet Heights on red petals: does environment matter?
Are true building costs different in Bryan and College Station, after accounting for land valuation?

22 Comparing Two Population Means
We’ll use all our methods: histograms, boxplots, q-q plots, confidence intervals, nonparametric tests.

23 Comparing Two Populations
There are two populations.
Take a sample from each population.
The sample sizes need not be the same.
Population 1: $n_1$
Population 2: $n_2$

24 Comparing Two Populations
Each will have a sample standard deviation.
Population 1: $s_1$
Population 2: $s_2$

25 Comparing Two Populations
Each sample will have a sample mean.
Population 1: $\bar{X}_1$
Population 2: $\bar{X}_2$
Those are the statistics. What are the parameters?

26 Comparing Two Populations
Each population will have a population standard deviation.
Population 1: $\sigma_1$
Population 2: $\sigma_2$

27 Comparing Two Populations
Each population will have a population mean.
Population 1: $\mu_1$
Population 2: $\mu_2$

28 Comparing Two Populations
How do we compare the population means $\mu_1$ and $\mu_2$?
The usual way is to take their difference: $\mu_1 - \mu_2$.
If the population means are equal, what is their difference?

29 Comparing Two Populations
The usual way is to take their difference: $\mu_1 - \mu_2$.
If the population means are equal, their difference = 0.
Suppose we form a confidence interval for the difference. What do we learn?
Say a 95% CI is from 1 to 3?

30 Comparing Two Populations
The usual way is to take their difference: $\mu_1 - \mu_2$.
Suppose we form a confidence interval for the difference. What do we learn?
Say a 95% CI is from 1 to 3?
Population 1 has a mean that is between 1 and 3 units larger than population 2, with 95% confidence.

31 Comparing Two Populations
Before learning how this confidence interval is computed, let’s look at an example.

32 NHANES Comparison
“Analyze”, “Compare Means”, “Independent Samples” will get you the analysis in SPSS.
You will get lots and lots of things, so we have to be a little careful.
First do the plots, then the analysis!
You will get means and standard errors.

33 NHANES Comparison

34 NHANES Comparison (Cancer Cases)

35 NHANES Comparison (Healthy Cases)

36 NHANES Comparison
Healthy: Mean = 2.9905, s = 0.6173, se = 0.00769
Cancer: Mean = 2.6969, s = 0.6423, se = 0.00836
Note: The sample standard deviations are nearly numerically equal. This agrees with the box plots, where the IQRs are nearly equal.
Note how small the standard errors are.

37 NHANES Comparison
The next thing is that there will be two rows, one for “Equal Variances Assumed”, the other for “Equal Variances Not Assumed”.
Because we have been careful, we know the variability looks to be in the same ballpark. Thus I would assume equal variances.

38 NHANES Comparison
What happens if the variances do not look equal?
Generally the results are not very different unless the sample sizes are quite small.
Generally, people quote the “Variances assumed equal” p-values and CI.
You have a backup, nonparametric rank tests, that we will discuss later.
It’s pretty hard to make a huge blunder.

39 NHANES Comparison
The “Mean Difference” is 0.2937. Since the healthy cases had a higher mean, this is Mean(Healthy) - Mean(Cancer).
The 95% CI is from 0.065 to 0.5223.
What is this a CI for?

40 NHANES Comparison
The “Mean Difference” is 0.2937. Since the healthy cases had a higher mean, this is Mean(Healthy) - Mean(Cancer).
The 95% CI is from 0.065 to 0.5223.
What is this a CI for?
In the log scale, healthy women eat between 0.065 and 0.5223 more saturated fat than women who developed breast cancer, with 95% confidence.

