 # Hypothesis Testing: Type II Error and Power.


Type I and Type II Error Revisited

| DECISION | NULL HYPOTHESIS actually true | NULL HYPOTHESIS actually false |
|---|---|---|
| Fail to reject | Correct decision ($1-\alpha$) | Type II error ($\beta$) |
| Reject | Type I error ($\alpha$) | Correct decision ($1-\beta$) |

Either type of error is undesirable, and we would like both $\alpha$ and $\beta$ to be small. How do we control these?

A Type I error, or $\alpha$-error, is made when a true null hypothesis is rejected.
The letter $\alpha$ (alpha) denotes the probability of a Type I error. It is also the level of significance of the decision rule or test, and you, as the investigator, select this level.

A Type II error, or $\beta$-error, is made when a false null hypothesis is NOT rejected.
The letter $\beta$ (beta) denotes the probability of a Type II error. $1-\beta$ is the POWER of a test: the probability of rejecting a false null hypothesis. The value of $\beta$ depends on a specific alternative hypothesis, and for a fixed $\alpha$ it can be decreased (power increased) by increasing the sample size.
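To make the two error probabilities concrete, here is a minimal simulation sketch. The numbers (null mean 100, true mean 105, standard deviation 10, sample sizes 25 and 100, 2000 trials, seed 0) are illustrative assumptions, and `rejection_rate` is a hypothetical helper name, not something from the slides.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def rejection_rate(true_mu, mu0, sigma, n, alpha=0.05, trials=2000, seed=0):
    """Estimate how often a two-sided z-test rejects H0: mu = mu0."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = .05
    se = sigma / sqrt(n)                           # standard error of the mean
    rejections = 0
    for _ in range(trials):
        # Draw one sample of size n from the true distribution.
        xbar = fmean(rng.gauss(true_mu, sigma) for _ in range(n))
        if abs(xbar - mu0) / se > z_crit:
            rejections += 1
    return rejections / trials

# H0 is true: the rejection rate estimates alpha (Type I error probability).
type1_rate = rejection_rate(true_mu=100, mu0=100, sigma=10, n=25)
# H0 is false: the rejection rate estimates power = 1 - beta.
power_n25 = rejection_rate(true_mu=105, mu0=100, sigma=10, n=25)
power_n100 = rejection_rate(true_mu=105, mu0=100, sigma=10, n=100)

print(f"estimated alpha          : {type1_rate:.3f}")
print(f"estimated power, n = 25  : {power_n25:.3f}")
print(f"estimated power, n = 100 : {power_n100:.3f}")
```

The estimates vary with the seed and the number of trials, but the pattern should hold: the rejection rate under a true null stays near $\alpha$, and power rises with $n$.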

Computing Power of a Test
Example: Suppose we have a test of a mean with $H_0: \mu = 100$ vs. $H_a: \mu \neq 100$, where $\sigma = 10$, $n = 25$, and $\alpha = .05$. If the true mean is in fact $\mu = 105$, what is $\beta$, the probability of failing to reject $H_0$ when we should reject it? What is the power ($1-\beta$) of our test to reject $H_0$ when we should reject it?

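In this example, the standard error is $\sigma/\sqrt{n} = 10/5 = 2$, so the decision points are:
$\mu_0 - 1.96(2) = 100 - 3.92 = 96.08$ and $\mu_0 + 1.96(2) = 100 + 3.92 = 103.92$. We will reject $H_0$ if $\bar{x} < 96.08$ or if $\bar{x} > 103.92$.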
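The decision points can be checked with a few lines of Python. This is a sketch using only the standard library; `critical_values` is a hypothetical helper name introduced here.

```python
from math import sqrt
from statistics import NormalDist

def critical_values(mu0, sigma, n, alpha):
    """Return the (lower, upper) cutoffs for x-bar in a two-sided z-test."""
    se = sigma / sqrt(n)                       # standard error of the mean
    z = NormalDist().inv_cdf(1 - alpha / 2)    # about 1.96 for alpha = .05
    return mu0 - z * se, mu0 + z * se

lower, upper = critical_values(mu0=100, sigma=10, n=25, alpha=0.05)
print(round(lower, 2), round(upper, 2))  # 96.08 103.92
```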

Suppose, in fact, that $\mu_a = 105$.
We will reject $H_0$ if $\bar{x}$ is greater than 103.92 or less than 96.08. Let's look at these decision points relative to our specific alternative, $\mu_a = 105$.

[Figure: distribution of $\bar{x}$ based on $H_a$, centered at $\mu_a = 105$, with the decision points 96.08 and 103.92 marked and the corresponding $z$ scale.]

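Note:
- $\alpha$ is fixed in advance by the investigator.
- $\beta$ depends on the sample size, through $se = \sigma/\sqrt{n}$, and on the specific alternative, $\mu_a$.
- We assume that the variance $\sigma^2$ holds for both the null and alternative distributions.

[Figure: null distribution centered at $\mu_0 = 100$ and alternative distribution centered at $\mu_a = 105$, with the two $\alpha/2$ rejection regions beyond $100 - 1.96(se) = 96.08$ and $100 + 1.96(se) = 103.92$, and $\beta$ marked under the alternative.]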
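The dependence of $\beta$ on $n$ and $\mu_a$ can be written in closed form for the two-sided z-test. The notation is introduced here and is not from the slides: $\Phi$ is the standard normal CDF, $z_{\alpha/2}$ is the upper critical value (1.96 for $\alpha = .05$), and $\delta$ is the distance between the alternative and null means in standard-error units.

```latex
\beta(\mu_a)
  = \Pr\!\left(\mu_0 - z_{\alpha/2}\,se < \bar{x} < \mu_0 + z_{\alpha/2}\,se \;\middle|\; \mu = \mu_a\right)
  = \Phi\!\left(z_{\alpha/2} - \delta\right) - \Phi\!\left(-z_{\alpha/2} - \delta\right),
\qquad
\delta = \frac{\mu_a - \mu_0}{se},
\quad
se = \frac{\sigma}{\sqrt{n}}
```

A larger $n$ shrinks $se$ and so increases $|\delta|$ for the same alternative, which pushes $\beta$ down.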

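Again, looking at our specific alternative, $\mu_a = 105$:
- $\beta$: the area where we fail to reject $H_0$ even though $H_a$ is correct.
- $\alpha/2$ regions: the areas where we reject $H_0$ in favor of $H_a$. Good!

[Figure: null ($\mu_0 = 100$) and alternative ($\mu_a = 105$) distributions with the decision points 96.08 and 103.92.]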

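We define power as $1-\beta$: power = Pr(rejecting $H_0$ | $H_a$ is true). In our example, $\beta = \Pr(96.08 < \bar{x} < 103.92 \mid \mu = 105) = \Pr(-4.46 < Z < -0.54) \approx .2946$, so power $= 1-\beta = 1 - .2946 = .7054$. That is, with $\alpha = .05$, a sample size of $n = 25$, and a true mean of $\mu_a = 105$, the power to reject the null hypothesis ($\mu_0 = 100$) is about 70.5%.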
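The calculation for this example can be sketched with the standard library's `statistics.NormalDist`. The function name `power_two_sided_z` is hypothetical. The sketch evaluates the sampling distribution of $\bar{x}$ under $H_a$ between the two decision points.

```python
from math import sqrt
from statistics import NormalDist

def power_two_sided_z(mu0, mu_a, sigma, n, alpha=0.05):
    """Return (beta, power) for a two-sided z-test at the alternative mu_a."""
    se = sigma / sqrt(n)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    lower, upper = mu0 - z_crit * se, mu0 + z_crit * se   # decision points
    xbar_under_ha = NormalDist(mu=mu_a, sigma=se)         # x-bar when Ha is true
    # beta = probability that x-bar lands in the fail-to-reject region under Ha
    beta = xbar_under_ha.cdf(upper) - xbar_under_ha.cdf(lower)
    return beta, 1 - beta

beta, power = power_two_sided_z(mu0=100, mu_a=105, sigma=10, n=25)
print(f"beta = {beta:.4f}, power = {power:.4f}")   # roughly 0.295 and 0.705
```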

Example 2: Suppose we want to test, at the $\alpha = .05$ level, the following hypothesis: $H_0: \mu = 67$ vs. $H_a: \mu \neq 67$. We have $n = 25$ and we know $\sigma = 3$. To test this hypothesis we establish our critical region.

[Figure: null distribution with two $\alpha/2$ rejection regions whose cutoffs are still to be determined.]

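Here, with standard error $\sigma/\sqrt{n} = 3/5 = 0.6$, we reject $H_0$ at the $\alpha = .05$ level when:
$\bar{x} < 67 - 1.96(0.6) = 65.82$ or $\bar{x} > 67 + 1.96(0.6) = 68.18$. Each tail is a rejection region of area $\alpha/2$.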

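Now, select a specific alternative to compute $\beta$: let $H_{a1}: \mu_a = 67.5$.
$\beta$ is the probability that $\bar{x}$ falls in the "fail-to-reject" region based on $H_0$ when the true mean is 67.5: $\beta = \Pr(65.82 < \bar{x} < 68.18 \mid \mu = 67.5) = \Pr(-2.80 < Z < 1.13) \approx .87$, so Power $= 1-\beta = 13\%$.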

Now look at the same thing for different values of $\mu_a$.

Type II Error ($\beta$) and Power of Test for $\alpha = .05$, $n = 25$, $\mu_0 = 67$, $\sigma = 3$

| $\mu_a$ | $z_{lower}$ | $z_{upper}$ | $\beta$ | Power $= 1-\beta$ |
|---|---|---|---|---|
| 68.5 | -4.47 | -0.53 | .29 | .71 |
| 68 | -3.63 | 0.30 | .62 | .38 |
| 67.5 | -2.80 | 1.13 | .87 | .13 |
| 67 ($= \mu_0$) | -1.96 | 1.96 | .95 | .05 |
| 66.5 | -1.13 | 2.80 | | |
| 66 | -0.30 | 3.63 | | |
| 65.5 | 0.53 | 4.47 | | |
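The table for $n = 25$ can be regenerated with a short script, including the rows for alternatives below $\mu_0$. This is a sketch, and the variable names are my own. It uses unrounded cutoffs, so the last digit of a $z$ value may differ from a table built from cutoffs rounded to 65.82 and 68.18.

```python
from math import sqrt
from statistics import NormalDist

MU0, SIGMA, N, ALPHA = 67, 3, 25, 0.05

std_normal = NormalDist()
se = SIGMA / sqrt(N)                              # 0.6
z_crit = std_normal.inv_cdf(1 - ALPHA / 2)        # about 1.96
lower, upper = MU0 - z_crit * se, MU0 + z_crit * se

rows = []
for mu_a in (68.5, 68, 67.5, 67, 66.5, 66, 65.5):
    # Standardize the decision points relative to the alternative mean.
    z_lower = (lower - mu_a) / se
    z_upper = (upper - mu_a) / se
    beta = std_normal.cdf(z_upper) - std_normal.cdf(z_lower)
    rows.append((mu_a, z_lower, z_upper, beta, 1 - beta))

print(" mu_a  z_lower  z_upper  beta  power")
for mu_a, z_lower, z_upper, beta, power in rows:
    print(f"{mu_a:5.1f}  {z_lower:7.2f}  {z_upper:7.2f}  {beta:4.2f}  {power:5.2f}")
```

Because the test is two-sided, alternatives equally far above and below $\mu_0$ get the same $\beta$ and power.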

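Let us plot Power ($1-\beta$) vs. alternative mean ($\mu_a$).
This plot will be called the power curve. Note that at $\mu_a = \mu_0$, $1-\beta = \alpha$. The farther the alternative is from $\mu_0$, the greater the power.

[Figure: power curve for $n = 25$; vertical axis $1-\beta$ from 0.00 to 1.00, horizontal axis $\mu_a$ from 65 to 69, with $\mu_0 = 67$ marked.]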
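A rough text version of the power curve can be produced without a plotting library. This is a minimal sketch: the grid spacing of 0.5 and the 40-character bar width are arbitrary choices, and `power` is a hypothetical helper built from the standard normal CDF.

```python
from math import sqrt
from statistics import NormalDist

def power(mu_a, mu0=67, sigma=3, n=25, alpha=0.05):
    """Power of the two-sided z-test of H0: mu = mu0 at the alternative mu_a."""
    std = NormalDist()
    z_crit = std.inv_cdf(1 - alpha / 2)
    delta = (mu_a - mu0) / (sigma / sqrt(n))   # shift in standard-error units
    beta = std.cdf(z_crit - delta) - std.cdf(-z_crit - delta)
    return 1 - beta

grid = [65 + 0.5 * k for k in range(9)]        # 65.0, 65.5, ..., 69.0
curve = [(mu_a, power(mu_a)) for mu_a in grid]

for mu_a, p in curve:
    # One bar per alternative; bar length is proportional to power.
    print(f"{mu_a:4.1f}  {p:.2f}  {'#' * round(40 * p)}")
```

The shortest bar sits at $\mu_a = 67$, where the power equals $\alpha$, and the bars lengthen symmetrically on either side.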

Suppose we want to test the same hypothesis, still at the $\alpha = .05$ level with $\sigma = 3$: $H_0: \mu = 67$ vs. $H_a: \mu \neq 67$. But we will now use $n = 100$. We establish our critical region, now with $\sigma_{\bar{x}} = \sigma/\sqrt{n} = 3/10 = .3$.

[Figure: null distribution with two $\alpha/2$ rejection regions whose cutoffs are still to be determined.]

With $n = 100$, we reject $H_0$ at the $\alpha = .05$ level when:
$\bar{x} < 67 - 1.96(0.3) = 66.41$ or $\bar{x} > 67 + 1.96(0.3) = 67.59$. Each tail is a rejection region of area $\alpha/2$.

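Again, select a specific alternative to compute $\beta$: let $H_a: \mu_a = 67.5$.
$\beta$ is the probability that $\bar{x}$ falls in the "fail-to-reject" region based on $H_0$ when the true mean is 67.5: $\beta = \Pr(66.41 < \bar{x} < 67.59 \mid \mu = 67.5) = \Pr(-3.63 < Z < 0.30) \approx .62$, so Power $= 1-\beta = 38\%$.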

Now look at the same thing for different values of $\mu_a$.

Type II Error ($\beta$) and Power of Test for $\alpha = .05$, $n = 100$, $\mu_0 = 67$, $\sigma = 3$

| $\mu_a$ | $z_{lower}$ | $z_{upper}$ | $\beta$ | Power $= 1-\beta$ |
|---|---|---|---|---|
| 68.5 | -6.97 | -3.04 | .00 | 1.00 |
| 68 | -5.30 | -1.37 | .09 | .91 |
| 67.5 | -3.63 | 0.30 | .62 | .38 |
| 67 ($= \mu_0$) | -1.96 | 1.96 | .95 | .05 |
| 66.5 | -0.30 | 3.63 | | |
| 66 | 1.37 | 5.30 | | |
| 65.5 | 3.04 | 6.97 | | |

Power Curves: Power ($1-\beta$) vs. $\mu_a$ for $n = 25, 100$
For the same alternative $\mu_a$, greater $n$ gives greater power.

[Figure: power curves for $n = 100$ and $n = 25$ with $\alpha = .05$, $\mu_0 = 67$; vertical axis $1-\beta$ from 0.00 to 1.00, horizontal axis $\mu_a$ from 65 to 69.]

Clearly, the larger sample size has resulted in a more powerful test. However, the increase in power required an additional 75 observations. In all cases $\alpha = .05$. Greater power means that we have a greater chance of rejecting $H_0$ in favor of $H_a$, even for alternatives that are close to the value of $\mu_0$.
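The gain from the extra 75 observations can be tabulated side by side. This is a sketch with a hypothetical `power` helper. Because it works from unrounded intermediate values, the second decimal may differ slightly from figures derived from a rounded $\beta$.

```python
from math import sqrt
from statistics import NormalDist

def power(mu_a, n, mu0=67, sigma=3, alpha=0.05):
    """Power of the two-sided z-test of H0: mu = mu0 at the alternative mu_a."""
    std = NormalDist()
    z_crit = std.inv_cdf(1 - alpha / 2)
    delta = (mu_a - mu0) / (sigma / sqrt(n))   # shift in standard-error units
    beta = std.cdf(z_crit - delta) - std.cdf(-z_crit - delta)
    return 1 - beta

# Power at three alternatives for the two sample sizes discussed above.
comparison = {mu_a: (power(mu_a, 25), power(mu_a, 100)) for mu_a in (67.5, 68, 68.5)}

print(" mu_a  n=25  n=100")
for mu_a, (p25, p100) in comparison.items():
    print(f"{mu_a:5.1f}  {p25:4.2f}  {p100:5.2f}")
```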

We will revisit our discussion of power when we discuss sample size in the context of hypothesis testing. Minitab allows you to compute the power of a test for a specific alternative. You must supply:
- the difference between the null and a specific alternative mean, $\mu_0 - \mu_a$
- the sample size, $n$
- the standard deviation, $\sigma$

Using Minitab to estimate Sample Size:
Stat > Power and Sample Size > 1-Sample Z
- Sample size (to specify several, separate with a space)
- Difference between $\mu_0$ and $\mu_a$ (to specify several, separate with a space)
- 2-sided test
- $\sigma$

Minitab output: Power and Sample Size, 1-Sample Z Test. Testing mean = null (versus not = null). Calculating power for mean = null + difference. Alpha = [value missing]. Assumed standard deviation = 10. Output columns: Difference, Sample Size, Power.
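For readers without Minitab, the same three inputs (difference, sample size, standard deviation) are enough to compute power directly. The sketch below is not Minitab's implementation. The function names are hypothetical, the inputs (difference 5, $n = 25$, standard deviation 10, $\alpha = .05$) are assumed for illustration, and the 0.80 target power in the sample-size search is likewise an assumption.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(difference, n, sd, alpha=0.05):
    """Power of a two-sided 1-sample z-test for a given mean difference."""
    std = NormalDist()
    z_crit = std.inv_cdf(1 - alpha / 2)
    delta = difference / (sd / sqrt(n))        # difference in standard-error units
    beta = std.cdf(z_crit - delta) - std.cdf(-z_crit - delta)
    return 1 - beta

def smallest_n(difference, sd, target_power, alpha=0.05, max_n=100_000):
    """Smallest sample size whose power reaches target_power (linear search)."""
    for n in range(1, max_n + 1):
        if z_test_power(difference, n, sd, alpha) >= target_power:
            return n
    raise ValueError("target power not reached within max_n")

example_power = z_test_power(difference=5, n=25, sd=10)
needed_n = smallest_n(difference=5, sd=10, target_power=0.80)
print(f"power at n = 25: {example_power:.4f}")
print(f"smallest n for power >= 0.80: {needed_n}")
```

A linear search is slow for very small differences but keeps the logic transparent. Each candidate $n$ is checked with the same power formula.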