Statistical Techniques I EXST7005: Power and Types of Errors
Hypothesis testing concepts
- The logic of a test of hypothesis is based on the selected probability α (the significance level) for the test statistic (Z), which determines the range of results expected due to chance alone, assuming H0 is true.
- Significance level notation, commonly used levels and terminology:
  - "Significant": α = 0.05
  - "Highly significant": α = 0.01
Errors!
- Is it possible, when we do a test of hypothesis, that we are wrong?
  - Yes, unfortunately. It is always possible that we are wrong.
  - Furthermore, there are two types of error that we could make!
Types of error

                             Data indicate: H0 is true          Data indicate: H0 is false
True result: H0 is true     NO ERROR                            Type I error: reject a TRUE H0
True result: H0 is false    Type II error: fail to reject       NO ERROR
                            a FALSE H0
Type I Error
- A Type I error is the REJECTION of a true null hypothesis. This type of error is also called α (alpha) error.
  - α is the value we choose as the "level of significance."
  - So we actually set the probability of making this type of error: the probability of a Type I error = α.
Type II Error
- A Type II error is the FAILURE TO REJECT a null hypothesis that is false. This type of error is also called β (beta) error. We do not set this value, but we call the probability of a Type II error β.
  - Furthermore, in practice we will never know this value. This is another reason we cannot "accept" the null hypothesis: it is possible that we are wrong, and we cannot state the probability of this type of error.
So what can we do?
- The good news is that it is only possible to make one error at a time.
  - If you reject H0, then maybe you made a Type I error, but you have not made a Type II error.
  - If you fail to reject H0, then maybe you made a Type II error, but you have not made a Type I error.
The probability of a Type II Error
- This is a probability that we will not know. This probability is called β.
- However, we can do several things to make this error smaller, so that will be our objective.
- First, let's look at how these errors occur.
Type II Error
- Examine a hypothesized distribution (below) that we believe to have a mean of 18.
[Figure: hypothesized normal distribution; Y scale 10 to 26, Z scale -4 to 4]
Type II Error
- We are going to do a two-tailed test with an α value of 0.05. Our critical limits will be ±1.96.
[Figure: hypothesized distribution with critical limits at Z = ±1.96]
Type II Error
- So we will reject any test statistic over 1.96 (or under -1.96). But let's suppose the null hypothesis is false! Let's suppose that the alternative hypothesis is true. Then the hypothesized distribution is not real; there is another "real" distribution that we are sampling from. What might it look like?
Type II Error
- Here is the real distribution. It actually has a mean of 22, but we don't know that. If we did, we would not have hypothesized a mean of 18!
[Figure: real distribution with mean 22; Y scale 10 to 28]
Type II Error
- So where on the real distribution is our critical limit? This is the key question.
[Figure: hypothesized distribution (mean 18) with its critical value marked on the real distribution (mean 22)]
Type II Error
- Now we draw our sample from the real distribution. If our result says reject H0, we make no error.
[Figure: critical value shown on both the hypothesized and real distributions]
Type II Error
- But if our result causes us to fail to reject, we err.
[Figure: critical value shown on both the hypothesized and real distributions]
Type II Error
- And in this case it appears we have pretty close to a 50-50 chance of going either way.
[Figure: real distribution split nearly in half at the critical value]
Type II Error
- So we take our sample and do our test. Will we err? Maybe, and maybe not. If our sample happens to fall in the red area (below the critical value), we will not reject H0. We would err.
[Figure: real distribution with the region below the critical value shaded red]
Type II Error
- But if our sample happens to fall in the yellow area (above the critical value), we will reject H0. No error; we draw the correct conclusion.
[Figure: real distribution with the region above the critical value shaded yellow]
Probability of error
- For α = 0.05, our critical limits would be Z = ±1.96. These translate to values on the original scale of Yi = μ0 ± Z·σȲ = 18 ± 1.96(2) = 18 ± 3.92. The lower bound is 14.08 and the upper bound is 21.92.
- The lower bound is so far down on the real distribution that the probability of getting a sample that falls there is near zero. The upper bound is the one that falls in the middle of the "real" distribution.
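As a sanity check on the arithmetic above, here is a minimal sketch in Python; the mean, standard error, and critical Z are the values used in the slides:

```python
# Translate the Z critical limits to the original (Y) scale.
# Values are taken from the example above: hypothesized mean 18,
# standard error of the mean 2, two-tailed alpha = 0.05.
mu0 = 18.0      # hypothesized mean
se = 2.0        # standard error of the mean (sigma Y-bar)
z_crit = 1.96   # two-tailed critical Z for alpha = 0.05

lower = mu0 - z_crit * se   # 18 - 3.92 = 14.08
upper = mu0 + z_crit * se   # 18 + 3.92 = 21.92
print(lower, upper)
```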
Probability of error (continued)
- In this fictitious case we know that the true mean is 22 (we wouldn't normally know the true mean).
- Since we know the true mean, we can calculate the probability of drawing a sample above and below the critical limit (21.92 on the Y scale, -0.04 on the Z scale of the real distribution).
Probability of error (continued)
- The probability of falling below this value, and so of making a Type II error, is 0.484, or about 48.4%. This is beta (β).
- The probability of falling above this value, and of NOT making a Type II error, is 0.516, or 51.6%. So in this case we can calculate β, the probability of a Type II error. In practice we cannot usually know these values.
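Because this example assumes we know the true mean (22), β can be reproduced with the standard normal CDF. A sketch using only the Python standard library (`phi` here is an ordinary erf-based CDF written for this example, not a named API):

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

mu_true = 22.0   # true mean of the "real" distribution (known only in this example)
se = 2.0         # standard error of the mean
upper = 21.92    # upper critical limit on the Y scale (18 + 1.96 * 2)

z = (upper - mu_true) / se   # -0.04 on the real distribution's Z scale
beta = phi(z)                # P(falling below the limit) = P(Type II error)
power = 1.0 - beta           # P(correctly rejecting the false H0)
print(beta, power)           # approximately 0.4840 and 0.5160
```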
Type II Error Probability
- Since we don't actually know what the true mean is (or we wouldn't be hypothesizing something else), we cannot know the Type II error rate (β) in practice. However, it is affected by a number of things, and we can know about these.
- We define a new term, POWER: the probability of NOT making a Type II error (power = 1 - β). This was 0.516 in our example.
Power and Type II Error
- First, power is affected by the distance between the hypothesized mean (μ0) and the true mean (μ).
Power and Type II Error
- Second, power is affected by the value chosen for Type I error (α).
[Figure: hypothesized and real distributions with critical limits for two different α values]
Power and Type II Error
- And third, power is affected by the variability or spread of the distribution.
[Figure: hypothesized and real distributions with small and large spreads]
Controlling Power
- Which of the three factors influencing power can you control?
  - For testing means you may be able to control sample size (n). This reduces the variability and increases power.
  - You probably cannot influence the difference between μ and μ0.
  - You can choose any value of α. However, α cannot be too small or a Type II error becomes more likely; too large and a Type I error becomes likely.
POWER OF A TEST
- The capability of the test to reject H0 when it is false is called POWER: power = 1 - β.
- How would we use our knowledge of the factors affecting power to increase the power of our tests of hypothesis?
METHODS OF INCREASING THE POWER OF A TEST
- Increase the significance level (e.g. from α = 0.01 to α = 0.05).
  - If H0 is true, we increase α, the probability of a Type I error.
  - If H0 is false, then we decrease β, the probability of a Type II error, and by decreasing β we increase the POWER of the test.
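Using the earlier example (hypothesized mean 18, true mean 22, standard error 2), the effect of raising α can be checked numerically. This sketch considers only the upper critical limit, since the slides note the lower tail contributes essentially nothing; `phi` and `power_vs_alpha` are helpers written for this illustration:

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def power_vs_alpha(z_crit, mu0=18.0, mu_true=22.0, se=2.0):
    """Power = P(sample mean exceeds the upper critical limit)
    when the true mean is mu_true."""
    upper = mu0 + z_crit * se            # critical value on the Y scale
    return 1.0 - phi((upper - mu_true) / se)

print(power_vs_alpha(2.576))  # alpha = 0.01: power is roughly 0.28
print(power_vs_alpha(1.960))  # alpha = 0.05: power is roughly 0.52
```

Raising α from 0.01 to 0.05 moves the critical limit closer to the hypothesized mean, so more of the real distribution falls in the rejection region.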
METHODS OF INCREASING THE POWER OF A TEST (continued)
- For a given α, the POWER can be increased by increasing n, so that σȲ (= σ/√n) decreases and the amount of overlap between the real and hypothesized distributions decreases.
- For example, let's suppose we are conducting a test of the hypothesis H0: μ = μ0 against the alternative H1: μ ≠ μ0. We believe μ0 = 50 and we set α = 0.05. We also know that σ² = 100 and that n = 25.
POWER OF A TEST (continued)
- From this information we can calculate that σȲ = σ/√n = 10/5 = 2. Then the critical region is P(|Z| ≥ Z0) = 0.05, so Z0 = 1.96, and our critical value on the original scale is Yi = μ0 + Z0·σȲ = 50 + 1.96(2) = 53.92.
POWER OF A TEST (continued)
- Now let's suppose that the REAL population mean is 54. Calculate P(Y ≤ 53.92), given that the TRUE mean is 54.
- Z = (53.92 - 54)/2 = -0.08/2 = -0.04. The probability of a Type II error (β) is the probability of drawing a sample that falls below this value and failing to reject the false null hypothesis. The value of β is P(Z ≤ -0.04) = 0.4840.
POWER OF A TEST (continued)
- So for an experiment with n = 25, the power is 1 - β = 1 - 0.4840 = 0.5160. But suppose we had a larger n, say 100. Now σȲ = σ/√n = 10/10 = 1. The critical region stays at Z0 = 1.96, but on the original scale this is now Yi = μ0 + Z0·σȲ = 50 + 1.96(1) = 51.96.
- For a true mean of 54 we now get Z = (51.96 - 54)/1 = -2.04/1 = -2.04.
POWER OF A TEST (continued)
- The value of β is P(Z ≤ -2.04) = 0.0207, and the power for this test is 1 - β = 0.9793.
- The result:
  - With n = 25, the power is 0.5160.
  - With n = 100, the power is 0.9793.
- This is why statisticians recommend larger sample sizes so strongly. We may never really know what the power is, but we know how to increase it.
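The whole n = 25 versus n = 100 comparison can be reproduced in a few lines. A sketch using an erf-based normal CDF, with the slide's values (μ0 = 50, true mean 54, σ = 10) as parameter defaults; `power_for_n` is a helper written for this illustration:

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def power_for_n(n, mu0=50.0, mu_true=54.0, sigma=10.0, z_crit=1.96):
    """Power against the upper critical limit for the slides' example."""
    se = sigma / sqrt(n)                 # sigma Y-bar = sigma / sqrt(n)
    upper = mu0 + z_crit * se            # critical value on the Y scale
    beta = phi((upper - mu_true) / se)   # P(Type II error)
    return 1.0 - beta

print(power_for_n(25))    # approximately 0.516
print(power_for_n(100))   # approximately 0.9793
```

Quadrupling the sample size halves the standard error, pulling the critical value well below the true mean and raising the power from about 0.52 to about 0.98.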
Summary
- Hypothesis testing is prone to two types of errors: one we control (α) and one we do not (β).
  - A Type I error is the REJECTION of a true null hypothesis.
  - A Type II error is the FAILURE TO REJECT a null hypothesis that is false.
- Not only do we not control Type II error, we probably do not even know its value.
Summary (continued)
- The "power" of a test is 1 - β.
- We can influence power, and hopefully increase it, by:
  - Controlling the distance between μ and μ0 (not really likely).
  - Selecting a value of α that is not too small (0.05 and 0.01 are the usual values).
  - Getting a larger sample size, usually the factor most under the control of the investigator.