Hypothesis Testing One-sample means and proportions Lecture 4.

Background
Research projects often start out with hypotheses such as:
The average IQ for UMDNJ public health students is higher than the national average.
Are boy birth rates higher than the national average when both parents have high PCB levels?
In the conclusions of such studies, it is natural to either reject or accept the initial hypotheses.

IQ for Public Health Students How unlikely would it be that the average IQ score of 10 public health students is 10 or more points above the national average (given that the true average IQ scores are the same)? Sampling distribution is normal. Z-score = Approx. Probability = from Normal table, 3rd column You have calculated your first one-sided p-value! If the probability is too small, i.e., the event is too unlikely given the hypothesized value, then we reject the hypothesized value.

Reading the Normal Table (A5.2, pp. 366-367)
1st column: z-score, the number of standard errors (deviations)
2nd column: probability of being within z standard errors (deviations) of the mean
3rd column: probability of being at least z standard errors (deviations) above the mean
4th column: probability of being at least z standard errors (deviations) below the mean
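The table columns can also be reproduced in software. The following is a minimal sketch using Python's standard library; the function name normal_table_row is a hypothetical helper, and it assumes the 3rd and 4th columns are the upper and lower tail areas beyond z.

```python
from statistics import NormalDist


def normal_table_row(z):
    """Return (z, within, above, below) for a non-negative z-score.

    Assumed column meanings (not taken from the printed table itself):
    within = P(-z < Z < z), above = P(Z > z), below = P(Z < -z).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    above = 1.0 - std_normal.cdf(z)
    below = std_normal.cdf(-z)
    within = 1.0 - above - below
    return z, within, above, below


for z in (1.0, 1.96, 2.58):
    _, within, above, below = normal_table_row(z)
    print(f"z={z:.2f}  within={within:.4f}  above={above:.4f}  below={below:.4f}")
```

Because the normal distribution is symmetric, the last two columns agree for every z.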

Experiment to Test Hypothesis about Public Health Students’ IQ
State the hypothesis: “Public Health students’ IQ is higher than national average.”
Translate the hypothesis into statistical terminology: a null hypothesis, H0, and an alternative hypothesis, HA.
Plan sample selection (how, and how many?).
Collect data.
Conduct the hypothesis test.

Hypothesis Tests – Basic Concepts The alternative hypothesis is generally the hypothesis that the investigator has an interest in proving. The null hypothesis is the hypothesis that is assumed true if we fail to prove the alternative. The five steps to a hypothesis test revolve around assuming the null hypothesis to be true and then examining whether the data are consistent with that null hypothesis.

General Steps to a Hypothesis Test
1. State the necessary assumptions for the test, assessing whether they are true for the collected data.
2. State the null and alternative hypotheses.
3. Calculate the test statistic from the data, assuming the null hypothesis to be true.
4. Calculate a p-value (based on the form of the alternative hypothesis).
5. State the conclusion in lay-person’s terms.

P-Values A P-value is the probability of observing an event at least as extreme as what was observed (in the direction of the alternative hypothesis), assuming the null hypothesis is true.

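Hypothesis Test for the Population Mean
1. Assumptions: independent, random sample; large sample OR population distribution is approximately normal.
2. Hypotheses (for one-sided/one-tailed and two-sided tests): H0: μ = μ0 vs. HA: μ > μ0 or μ < μ0 (one-sided) or μ ≠ μ0 (two-sided).
3. Test Statistic: z = (x̄ - μ0)/(s/sqrt(n)), where x̄ is the sample mean, s is the sample standard deviation, and n is the sample size.
4. P-value based on a normal OR t-distribution with the appropriate degrees of freedom (n-1).
5. Conclusion in lay-person’s terms.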

Distribution of the Test Statistic (z-score) The test statistic is given as z = (x̄ - μ0)/(s/sqrt(n)). When the sample size is small and the population distribution is normal, this follows a t-distribution with n-1 degrees of freedom. When the sample size is large, we can approximate the distribution as normal, no matter what the population distribution. In both cases, if the null hypothesis is true, we would expect the z-score to be close to zero.

Matching p-values with the Direction of the Alternative Hypothesis
If HA: μ ≠ μ0, p-value = probability that in any experiment the test statistic would be at least as far away from zero as in the current study, given that μ0 is the true population mean.
If HA: μ > μ0, p-value = probability that in any experiment the test statistic would be at least as great as in the current study, given that μ0 is the true population mean.
If HA: μ < μ0, p-value = probability that in any experiment the test statistic would be at least as small as in the current study, given that μ0 is the true population mean.
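The three cases above can be sketched in a few lines of Python for the large-sample (normal) case. The function name p_value_from_z and the labels "greater", "less", and "two-sided" are illustrative choices, not notation from the lecture.

```python
from statistics import NormalDist


def p_value_from_z(z, alternative):
    """P-value of a z test statistic under a standard normal null distribution.

    alternative: "greater" (HA: mu > mu0), "less" (HA: mu < mu0),
                 or "two-sided" (HA: mu != mu0).
    """
    std_normal = NormalDist()
    if alternative == "greater":
        return 1.0 - std_normal.cdf(z)  # at least as great as observed
    if alternative == "less":
        return std_normal.cdf(z)  # at least as small as observed
    if alternative == "two-sided":
        return 2.0 * (1.0 - std_normal.cdf(abs(z)))  # at least as far from zero
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")


for alt in ("greater", "less", "two-sided"):
    print(alt, round(p_value_from_z(1.96, alt), 4))
```

The same observed statistic therefore gives three different p-values, which is why the alternative hypothesis has to be fixed before looking at the data.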

More on p-values for Means If HA: μ ≠ μ0, p-value = probability that in any experiment the sample mean is at least as many standard errors away from the population mean (in the null hypothesis) as in the current study, given that μ0 is the true population mean.

Hypothesis concerning Public Health Students’ IQ: Step One 1. Assumptions: Independent, random sample (depends on sampling design). Sample size is small. Population distribution is approximately normal (check by looking at the sample distribution via a histogram or box-plot).

Hypothesis concerning Public Health Students’ IQ: Step Two 2. Hypotheses: We are interested in showing that Public Health students’ IQs are higher than the national average, thus H0: μ = 100 (convention says we use “=”) vs. HA: μ > 100.

Hypothesis concerning Public Health Students’ IQ: Step Three 3. Test Statistic: Suppose that n = 10, sample mean = 110, and standard deviation = 10. The z-score with an estimated standard deviation: t = (110 - 100)/(10/sqrt(10)) = 3.16.

Hypothesis concerning Public Health Students’ IQ: Step Four 4. P-Value: The alternative hypothesis is that the true mean is larger than the national average. Therefore, we calculate, using the t-table with n-1 = 9 degrees of freedom, Prob(t > observed given that μ = 100), which is about .006 since the observed t is between 3.1 and 3.2.
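As a rough software check of the IQ example (n = 10, sample mean 110, standard deviation 10, null mean 100), the sketch below computes the test statistic and a one-sided tail area. It replaces the table lookup with a numerical integration of the t density; the helper names t_pdf and t_upper_tail are hypothetical, and the result may differ slightly from a value read off a printed table.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=20000):
    """P(T > t) for t >= 0: one half minus Simpson's-rule area over [0, t]."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


n, sample_mean, sample_sd, null_mean = 10, 110.0, 10.0, 100.0
t_stat = (sample_mean - null_mean) / (sample_sd / math.sqrt(n))
p_value = t_upper_tail(t_stat, df=n - 1)  # one-sided, HA: mu > 100
print(f"t = {t_stat:.3f}, one-sided p-value = {p_value:.4f}")
```

With statistical software available, a library t-distribution function would normally be used instead of hand-rolled integration.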

Hypothesis concerning Public Health Students’ IQ: Step Five 5. Conclusion: Assuming that the average IQ of Public Health Students is the same as the national average, one would only observe a sample as “extreme” as the one observed less than 1% of the time. Thus, we may conclude that it is unlikely that the national average is the true mean for public health students. There is strong evidence to reject the null hypothesis in favor of the alternative, which says that the average IQ for public health students is greater than the national average.

Exercise Capacity: Large Sample A study was conducted of 90 male patients following a new treatment for congestive heart failure. One of the variables measured was the increase in exercise capacity (measured in minutes) over a 4-week treatment period. The previous treatment regime had produced an average increase of 2 minutes. The researchers wanted to evaluate whether the new treatment had increased the exercise capacity even more than the previous treatment.

Exercise Capacity: Large Sample, cont. The sample mean and standard deviation were calculated to be 2.17 and 1.05, respectively. State the necessary assumptions for the test, assessing whether they are true for the collected data.
1. Assumptions: independent, random sample; large sample (n = 90).
2. Hypotheses: H0: μ = 2 vs. HA: μ > 2.

Exercise Capacity: Large Sample, cont.
3. Test Statistic: z = (2.17-2.00)/(1.05/sqrt(90)) = 1.54.
4. P-Value: Prob(at least 1.54 standard errors above) = .062.
5. Conclusion: If the true increase with this treatment were 2.00 minutes, then just by chance we would expect the mean for a single sample of 90 subjects to be at least 2.17 minutes about 6% of the time.

Exercise Capacity: Large Sample, cont. Typically, since this p-value is greater than 5%, we would say that this increase is not significantly bigger than the increase for the previous treatment.

Exercise Capacity: A Larger Sample Suppose in an identical study, the sample was twice as big (180 men) with the same calculated sample mean and standard deviation. The standard error would now be SEM = 1.05/sqrt(180) = 0.0783, and the z-score would equal (2.17-2.00)/0.0783 = 2.17, with a p-value of about .015. Now, since the p-value is so much smaller, we might reject the null hypothesis and conclude that the increase is significantly bigger than the increase for the previous treatment! The only difference between the experiments is the sample size!
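The effect of sample size can be checked directly. This is a minimal sketch using the exercise-capacity numbers (sample mean 2.17, standard deviation 1.05, null mean 2.00) and the normal approximation; one_sided_z_test is a hypothetical helper name.

```python
import math
from statistics import NormalDist


def one_sided_z_test(sample_mean, null_mean, sample_sd, n):
    """Return (z, p) for HA: mu > null_mean, using the normal approximation."""
    sem = sample_sd / math.sqrt(n)  # standard error of the mean
    z = (sample_mean - null_mean) / sem
    p = 1.0 - NormalDist().cdf(z)  # upper-tail probability
    return z, p


for n in (90, 180):
    z, p = one_sided_z_test(2.17, 2.00, 1.05, n)
    print(f"n={n:3d}  z={z:.2f}  one-sided p={p:.4f}")
```

Doubling n shrinks the standard error by a factor of sqrt(2), so the same observed difference sits further out in the tail.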

Why we “fail to reject the null” If our sample size is not big enough, we may not be able to detect a small difference, even if such a difference truly exists. Alternatively, if the true difference is very small, we may not be able to detect it even with a large sample size. Therefore, if our p-value isn’t small enough to reject the null, we typically say that we don’t have enough evidence to reject the null hypothesis.

Rejecting the null hypothesis What p-value is small enough to reject the null hypothesis? Depends. For pilot studies, one may use a more lenient (larger) cut-off. For huge studies on which major changes in policy may result from rejecting the null hypothesis, one may pick stricter (smaller) cut-offs. Using a popular cut-off, most investigators reject the null hypothesis if the p-value is less than 0.05. ALWAYS report the p-value.

P-values and Significance Levels The threshold with respect to the p-value at which we reject the null hypothesis is called the significance level (α) for the test. Once the significance level is set, it equals the Type I error rate, the probability of rejecting the null hypothesis when it is in fact true. The threshold with respect to the test statistic at which we reject the null hypothesis is called the critical value.

More on p-values and α In general, the smaller the significance level, the larger the critical value will be. If we reject the null hypothesis at a specified significance level, we say that the population mean is significantly different (or larger or smaller) than the null value.
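The link between the significance level and the critical value can be illustrated for a z test. The sketch below assumes a standard normal null distribution; critical_value is a hypothetical helper name.

```python
from statistics import NormalDist


def critical_value(alpha, two_sided=False):
    """Critical z-value: reject H0 when the statistic exceeds it in size."""
    tail_area = alpha / 2.0 if two_sided else alpha
    return NormalDist().inv_cdf(1.0 - tail_area)


for alpha in (0.10, 0.05, 0.01):
    print(
        f"alpha={alpha:.2f}  one-sided={critical_value(alpha):.3f}  "
        f"two-sided={critical_value(alpha, two_sided=True):.3f}"
    )
```

Rejecting when the statistic exceeds the critical value is the same decision as rejecting when the p-value falls below α.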

Hypothesis Tests for Proportions Same idea as for means! Still use the z-score, but with a different standard error. Compare z-score to a standard normal distribution.

Proportions Example
1. Assumptions: independent, random sample; large sample (same criteria as for CI).
2. Hypotheses: H0: π = π0 vs. HA: π > π0 or π < π0 or π ≠ π0 (π0 = .5 used when testing for a majority or minority).
3. Test Statistic: z = (p - π0)/SE(p).
4. P-value using normal tables.
5. Conclusion referring to applied hypothesis.

SE(p) under H0 For confidence intervals, we had no preconceived notion of the specific value of π. The standard error was estimated as the square root of p(1-p)/n. Under H0, we claim that π = π0. Therefore, p is estimating π0. Hence under H0, the true standard error for p is sqrt[π0(1-π0)/n].

Example on Sleep Disorders N = 1,010 adults were polled in the fall of 2001. Statement: “Seventy-four percent of respondents in the study reported experiencing at least one symptom of a sleep disorder a few nights a week or more. That number was up significantly from 62 percent in 1999 and 2000, and from 69 percent last year.” Presumably, when the researchers collected the data they didn’t have any presuppositions about whether the change, if any, would be larger or smaller. Hence we’ll use a “two-sided” hypothesis test: HA: sleep disorders changed significantly from previous year.

Sleep Disorders Hypothesis Test
1. Assumptions: independent, random sample; large sample (n = 1,010).
2. Hypotheses (with π0 = .69): H0: π = .69 vs. HA: π ≠ .69 in this example.
3. Test Statistic: SE(p) = sqrt[(.69)(1-.69)/(1010)] = .0146; z = (p - π0)/SE(p) = (.74-.69)/.0146 = 3.425.
4. P-value using normal tables and column 5: the p-value is between 0.0006 and 0.0007.
5. Conclusion: With a p-value of 0.0006, there is overwhelming evidence that the proportion of the population polled with sleep disorders has changed from the previous year.
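The sleep-disorder calculation can be reproduced as follows. This is a sketch under the slide's assumptions (p = .74, π0 = .69, n = 1,010, two-sided alternative); one_sample_proportion_test is a hypothetical helper name. Because the code keeps the unrounded standard error, its z-score comes out slightly larger than the hand calculation that uses .0146.

```python
import math
from statistics import NormalDist


def one_sample_proportion_test(p_hat, pi0, n):
    """Return (se_null, z, two_sided_p) for H0: pi = pi0."""
    se_null = math.sqrt(pi0 * (1.0 - pi0) / n)  # standard error under H0
    z = (p_hat - pi0) / se_null
    two_sided_p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return se_null, z, two_sided_p


se_null, z, p_value = one_sample_proportion_test(p_hat=0.74, pi0=0.69, n=1010)
print(f"SE = {se_null:.4f}, z = {z:.3f}, two-sided p-value = {p_value:.5f}")
```

Note that the standard error uses π0 rather than the sample proportion, which is what distinguishes the test from the confidence-interval calculation.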

Type 1 and Type 2 errors For a fixed sample size, the larger the difference between the true parameter and the hypothesized parameter, the more likely we are to detect that difference. Recall: Type 1 error (α) is the probability of rejecting the null hypothesis when it is true. Type 2 error is the probability of failing to reject the null hypothesis when the alternative hypothesis is true.

Power The power of a test is one minus the probability of type 2 error or, equivalently, the probability of rejecting the null hypothesis when the alternative hypothesis is true. As we suggested previously, there are a number of things which can lead to a “high” power.

Power, cont. Type 1 and Type 2 error must be balanced. For fixed sample size and fixed parameter values, decreasing Type 1 error results in increasing Type 2 error or, equivalently, results in decreasing power. In planning a study, one plans a sample size using guesses at the true parameter values and by fixing the levels of Type 1 and Type 2 errors. The smaller we want both errors, the larger the required sample size. Sample size calculations follow in the next lecture.
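The trade-off can be made concrete for a one-sided z test of a mean. The sketch below assumes the standard deviation is known and the sampling distribution is normal; the function name power_one_sided_z and the illustrative inputs (a true mean of 2.17 against a null of 2.00 with standard deviation 1.05) are assumptions for demonstration, not values the lecture uses for power.

```python
import math
from statistics import NormalDist


def power_one_sided_z(true_mean, null_mean, sd, n, alpha):
    """P(reject H0: mu = null_mean in favor of mu > null_mean | true_mean)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha)  # rejection threshold for z
    shift = (true_mean - null_mean) / (sd / math.sqrt(n))  # true difference in SEs
    return 1.0 - std_normal.cdf(z_crit - shift)


for n in (90, 180, 360):
    for alpha in (0.05, 0.01):
        power = power_one_sided_z(2.17, 2.00, 1.05, n, alpha)
        print(f"n={n:3d}  alpha={alpha:.2f}  power={power:.3f}")
```

For a fixed n, lowering alpha lowers power; for a fixed alpha, raising n raises power, which is the basis of sample size planning.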

Two-sided Hypothesis Tests and Confidence Intervals These are in some sense “equivalent”. If we fail to reject a two-sided hypothesis test (say that the proportion equals ½ or the mean equals 0) at the 0.05 level, then a 95% confidence interval would include the null value (½ or 0), and vice versa. If we reject a two-sided hypothesis test (say that the proportion equals ½ or the mean equals 0) at the 0.05 level, then a 95% confidence interval would not include the null value (½ or 0), and vice versa.
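For a large-sample mean, this equivalence can be checked numerically. The sketch below reuses the exercise-capacity summary numbers purely as an illustration and treats the test as two-sided, which is an assumption made here (the original study question was one-sided); two_sided_test_and_ci is a hypothetical helper name.

```python
import math
from statistics import NormalDist


def two_sided_test_and_ci(sample_mean, null_mean, sample_sd, n, alpha=0.05):
    """Return (reject_null, ci_low, ci_high) using the normal approximation."""
    sem = sample_sd / math.sqrt(n)
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z = (sample_mean - null_mean) / sem
    reject_null = abs(z) > z_crit
    return reject_null, sample_mean - z_crit * sem, sample_mean + z_crit * sem


for n in (90, 180):
    reject, low, high = two_sided_test_and_ci(2.17, 2.00, 1.05, n)
    inside = low <= 2.00 <= high
    print(f"n={n:3d}  reject={reject}  95% CI=({low:.3f}, {high:.3f})  null inside CI={inside}")
```

In each case the test rejects exactly when the null value falls outside the interval. For proportions the match is only approximate, since the test uses the standard error under H0 while the interval uses the estimated one.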

Small Sample, Non-normal Population If the sample were large, the Central Limit Theorem would be applicable for testing hypotheses about the mean. If the population were normal, the sampling distribution of the mean would be exactly a normal distribution to start with. If the sample is small and the population non-normal, what do we do? Nonparametric statistics is a sub-field of statistics that creates inferences concerning populations that cannot be assumed to follow any particular distribution.

Example Suppose that a nurse has been instructed to perform a procedure in a new way. Researchers recorded the change in the number of minutes it took the nurse to perform the procedure. The data are 0.6, -0.5, 1.1, 2.4, 3.5, 2.0, -0.4, 1.0, 2.1, -0.6, -0.2. We would be hard pressed to say that these data even approximately follow a normal distribution.

Assumption of normality for small sample example There are only 11 observations and we might be uncomfortable claiming that this distribution looks normal. Instead, it looks more uniform.

The Sign Test – 5 Steps
Assumptions: random, independent sample.
Hypotheses: null hypothesis: median equals zero; alternative hypothesis: median does not equal zero.
Test statistic: p = 7/11; we are interested in comparing the proportion of observations that are greater than zero with one-half.

The Sign Test – 5 Steps, cont.
P-value: need an exact calculation since the CLT doesn’t apply with small samples. 95% CI for p with small samples: (0.308, 0.891).
Conclusion: Since 0.5 is included in the 95% confidence interval, we can’t say that the median is significantly different from zero at the 0.05 level. (We fail to reject the null hypothesis.)
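The exact calculation mentioned for the sign test can be sketched with the binomial distribution: under the null hypothesis, each nonzero observation is positive with probability one-half. The helper name sign_test_p_value is hypothetical, and doubling the larger tail is one common convention for the two-sided p-value.

```python
from math import comb

# Changes in procedure time (minutes) from the nurse example.
changes = [0.6, -0.5, 1.1, 2.4, 3.5, 2.0, -0.4, 1.0, 2.1, -0.6, -0.2]


def sign_test_p_value(n_positive, n_nonzero):
    """Exact two-sided p-value when positives ~ Binomial(n_nonzero, 1/2)."""
    k = max(n_positive, n_nonzero - n_positive)  # count on the more extreme side
    upper_tail = sum(comb(n_nonzero, i) for i in range(k, n_nonzero + 1)) / 2 ** n_nonzero
    return min(1.0, 2.0 * upper_tail)


nonzero = [x for x in changes if x != 0]
n_positive = sum(1 for x in nonzero if x > 0)
p_value = sign_test_p_value(n_positive, len(nonzero))
print(f"{n_positive} of {len(nonzero)} positive, exact two-sided p-value = {p_value:.4f}")
```

A p-value this large agrees with the confidence-interval conclusion: there is not enough evidence that the median differs from zero.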

The Signed Rank Test – 5 Steps
Assumptions: the measurement is continuous; independent, random sample from the population; distribution is symmetric.
Hypotheses: H0: median of the distribution is 0; HA: median of the distribution is non-zero.
Test Statistic: minimum of the rank sums.
P-value: from the computer! For this example, p = 0.0439.
Conclusion: as per usual.

Calculation of Signed Rank Test Statistic Order observations from smallest to largest in absolute value: |Y|(1) ≤ |Y|(2) ≤ … ≤ |Y|(n). So from the example, |-0.2| < |-0.4| < |-0.5| < |-0.6| = 0.6 < 1.0 < 1.1 < 2.0 < 2.1 < 2.4 < 3.5. Assign ranks 1, 2, …, n to these absolute values, giving tied values the average of their ranks. In the example, the ranks are 1, 2, 3, 4.5, 4.5, 6, 7, 8, 9, 10, 11.

Signed Rank Test Statistic, cont… Arrange the ranks into two groups: those with actual values that are smaller and those that are larger than zero. Sum the ranks for both the negative and positive valued observations, separately. Here, for negative values, sum of ranks = 1+2+3+4.5 = 10.5. For positive values, sum of ranks = 4.5+6+7+8+9+10+11 = 55.5. Test Statistic = smallest rank sum = 10.5.
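The ranking procedure can be sketched in plain Python. The function name signed_rank_sums is hypothetical; it drops zeros, ranks the absolute values, and averages the ranks of ties, as described for this test. A useful check is that the two rank sums must add up to n(n+1)/2, which is 66 for n = 11.

```python
# Changes in procedure time (minutes) from the nurse example.
changes = [0.6, -0.5, 1.1, 2.4, 3.5, 2.0, -0.4, 1.0, 2.1, -0.6, -0.2]


def signed_rank_sums(values):
    """Return (negative_rank_sum, positive_rank_sum) with average ranks for ties."""
    ordered = sorted((v for v in values if v != 0), key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over the run of observations tied in absolute value.
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        average_rank = (i + 1 + j + 1) / 2  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    negative_sum = sum(r for v, r in zip(ordered, ranks) if v < 0)
    positive_sum = sum(r for v, r in zip(ordered, ranks) if v > 0)
    return negative_sum, positive_sum


negative_sum, positive_sum = signed_rank_sums(changes)
test_statistic = min(negative_sum, positive_sum)
print(f"negative sum = {negative_sum}, positive sum = {positive_sum}, statistic = {test_statistic}")
```

The p-value itself still comes from tables or statistical software, as noted for this test.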

P-values for signed rank test For critical values and p-values, look at tables/computer generated p-values. This procedure is unavailable in the Student version of SPSS. It is available in SAS and the regular version of SPSS.

Comments on Signed Rank Test
More “powerful” than the Sign Test, but requires more assumptions.
One-sided tests are possible.
Robust to outliers.
Some books/programs use the sum of the ranks of the positive values as the test statistic; p-values are always the same.
Nonparametric confidence intervals are also available from some software programs.
For tied observations, use the average rank for each tied observation.

Homework To be posted, not graded Solutions will be posted on Monday Read Chapters 10, 11 and 12
