
1 Chapter 21: More About Tests
AP Statistics

2 Null Hypothesis Stating it can sometimes be tricky
If the event is random or the result of “guessing,” the null hypothesis is often a specific chance value, such as p = 0.5 or p = 1/6. This comes up especially when testing whether a “coin is fair” (p = 0.5), whether “someone has ESP and can predict which closed hand contains a prize” (p = 0.5), or whether a “die is fair” (p = 1/6). A minimal test of this kind is sketched below.
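As an illustration, here is a sketch of a one-proportion test of H0: p = 0.5 for the "fair coin" case. The counts are hypothetical, the two-sided P-value uses the common double-the-smaller-tail convention, and it assumes scipy is installed:

```python
# Sketch: one-proportion test of H0: p = 0.5 ("the coin is fair").
# The counts below are hypothetical; assumes scipy is installed.
from scipy.stats import binom

n, heads = 100, 61                         # hypothetical data
upper = binom.sf(heads - 1, n, 0.5)        # P(X >= 61) under H0
lower = binom.cdf(heads, n, 0.5)           # P(X <= 61) under H0
p_value = min(1.0, 2 * min(upper, lower))  # double the smaller tail
print(f"two-sided P-value ~ {p_value:.4f}")
```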

3 Alpha Levels (Significance Levels)
A small P-value tells us that our data are rare given the null hypothesis. But how rare is rare? How low must the P-value be before we reject the null hypothesis? We arbitrarily set a threshold for the P-value; if the P-value falls below it, we reject the null hypothesis. This threshold is called the ALPHA LEVEL.

4 Alpha Levels (Significance Levels)
We denote the alpha level α. It is also called the significance level because P-values below α are considered statistically significant. When we reject the null, we say “the test is significant at the α = 0.05 level.” Always select α before you look at the data. Always report both the P-value and the α level in your conclusion.

5 Common Alpha Levels
Critical values z* for the standard Normal model (the sketch after this table reproduces them):

α       1-sided   2-sided
0.05    1.645     1.960
0.01    2.326     2.576
0.001   3.090     3.291
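As a check, these critical values are just quantiles of the standard Normal model. A minimal sketch, assuming scipy is available:

```python
# Sketch: recompute the critical-value table from the standard Normal
# inverse CDF. Assumes scipy is installed.
from scipy.stats import norm

for alpha in (0.05, 0.01, 0.001):
    one_sided = norm.ppf(1 - alpha)      # all of alpha in one tail
    two_sided = norm.ppf(1 - alpha / 2)  # alpha split between two tails
    print(f"alpha={alpha}: 1-sided z*={one_sided:.3f}, "
          f"2-sided z*={two_sided:.3f}")
```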

6 When the alternative is one-sided, the critical value puts all of α on one side.
When the alternative is two-sided, the critical value splits α equally into two tails.

7 Practical vs Statistical Significance
For large sample sizes, small, unimportant deviations from the null can be statistically significant. For small sample sizes, large, seemingly important deviations from the null may not be statistically significant. Always do a reality check: is the difference large enough to matter in practice? The sketch below shows how a trivial difference becomes “significant” with a huge sample.
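The point can be made numerically. In this hypothetical sketch (assumes numpy and scipy), a shift of 0.1 units is trivial in practice, yet a large enough sample makes it statistically significant:

```python
# Sketch: a tiny, practically unimportant shift becomes statistically
# significant once n is huge. Numbers are hypothetical; assumes scipy.
import numpy as np
from scipy.stats import norm

sigma, shift = 10.0, 0.1              # a 0.1-unit shift: trivial in practice
for n in (100, 1_000_000):
    z = shift / (sigma / np.sqrt(n))  # z-statistic for the observed shift
    print(f"n={n:>9}: two-sided P-value ~ {2 * norm.sf(z):.2e}")
```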

8 Confidence Intervals
You can approximate a hypothesis test by examining a confidence interval. Just ask whether the null value is consistent with a confidence interval for the parameter at the corresponding confidence level. A 95% confidence interval corresponds to a two-sided hypothesis test at α = 0.05 (the sum of the two tails). A 95% confidence interval corresponds to a one-sided hypothesis test at α = 0.025. The sketch below checks the two-sided equivalence.
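A minimal sketch of that equivalence (hypothetical data, known σ for simplicity; assumes numpy and scipy): the z-test at α = 0.05 rejects exactly when the null value falls outside the 95% interval.

```python
# Sketch: a 95% CI and a two-sided z-test at alpha = 0.05 agree.
# Data, sigma, and the null value are hypothetical; assumes numpy/scipy.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(seed=2)
data = rng.normal(loc=5.0, scale=2.0, size=40)  # hypothetical sample
sigma, mu0 = 2.0, 4.0                           # known sd, null mean
se = sigma / np.sqrt(len(data))
ci = (data.mean() - 1.96 * se, data.mean() + 1.96 * se)
p_value = 2 * norm.sf(abs((data.mean() - mu0) / se))
print(f"95% CI: ({ci[0]:.2f}, {ci[1]:.2f}), P-value: {p_value:.4f}")
print("reject at alpha=0.05:", p_value < 0.05,
      "| mu0 outside CI:", not (ci[0] <= mu0 <= ci[1]))
```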

9 Errors
Here’s some shocking news for you: nobody’s perfect. Even with lots of evidence, we can still make the wrong decision. When we perform a hypothesis test, we can make mistakes in two ways:
The null hypothesis is true, but we mistakenly reject it. (Type I error)
The null hypothesis is false, but we fail to reject it. (Type II error)

10 Errors
The four possible outcomes of a test (the truth vs. our decision):

                      H0 is true         H0 is false
Reject H0             Type I error       Correct decision
Fail to reject H0     Correct decision   Type II error

11 Errors
Type I: You are healthy, but a test says you have a disease (a false positive).
Type II: You are not healthy, but a test says you do not have a disease (a false negative).
___________________________________
Type I: A jury convicts an innocent person.
Type II: A jury fails to convict a guilty person.

12 Type I Errors
How often does a Type I error occur? It happens when the null hypothesis is true, but you have the misfortune to draw an unusual sample. To reject the null hypothesis, the P-value must fall below α. When the null hypothesis is true, the probability of rejecting it is exactly α. So when you set α, you are setting the probability of a Type I error. Remember, you can only make a Type I error if the null hypothesis is true. The simulation sketch below illustrates this.
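A simulation sketch of this fact (hypothetical setup; assumes numpy and scipy): draw many samples from a population where H0 really is true and count how often the test rejects. The rate should come out near α.

```python
# Sketch: when H0 (mu = 0) is true, a level-0.05 z-test rejects
# about 5% of the time. Setup is hypothetical; assumes numpy/scipy.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(seed=1)
alpha, n, trials = 0.05, 30, 10_000
samples = rng.normal(loc=0.0, scale=1.0, size=(trials, n))  # H0 is true
z = samples.mean(axis=1) / (1.0 / np.sqrt(n))               # z-statistics
p_values = 2 * norm.sf(np.abs(z))                           # two-sided
print("Type I error rate ~", (p_values < alpha).mean())     # about 0.05
```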

13 Type II Errors
What happens if the null hypothesis is not true? If the null hypothesis is false and we reject it, we have done the correct thing. If the null hypothesis is false and we fail to reject it, we have committed a Type II error. The probability of this error is denoted by β. The sketch below estimates β by simulation.
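β can be estimated the same way as the Type I rate above (hypothetical setup; assumes numpy and scipy): draw samples from a population where H0 is false and count how often the test fails to reject.

```python
# Sketch: estimate beta by simulating samples where H0 (mu = 0) is
# false (true mu = 0.3). Setup is hypothetical; assumes numpy/scipy.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(seed=3)
alpha, n, trials, true_mu = 0.05, 30, 10_000, 0.3
samples = rng.normal(loc=true_mu, scale=1.0, size=(trials, n))
z = samples.mean(axis=1) / (1.0 / np.sqrt(n))
fail_to_reject = 2 * norm.sf(np.abs(z)) >= alpha  # Type II errors
print("estimated beta ~", fail_to_reject.mean())
```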

14 Errors in General (Review)
Probability of a Type I error = α
Probability of a Type II error = β

15 Reducing Error
Neither error is good. The difficulty is reducing one without increasing the other. Imagine: to reduce Type I error, I reduce my α; but then β, the probability of a Type II error, increases. The only way to reduce both errors is to collect more evidence, that is, more data. Many studies fail because their sample sizes are too small to detect the change they are looking for. When designing a survey or experiment, it is a good idea to calculate β for a reasonable α.

16 Power (Related to Reducing Error)
If we fail to reject the null hypothesis, it is natural to wonder whether we looked hard enough and made the wrong decision: is the null hypothesis really false, and was our test simply too weak to detect the difference? We want a test strong enough to make the right decision, that is, to reject the null hypothesis when it really is false. The POWER of a test tells us how likely the test is to reject a false null hypothesis.

17 Power
When power is high, we can be confident that we looked hard enough. High power tells us that our test has a very good chance of detecting a false null hypothesis (a very good chance of NOT making a Type II error). Power is calculated as Power = 1 − β, the complement of the probability of making a Type II error. A sketch of the calculation follows.
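For a one-sided z-test with known σ, the calculation can be written out directly. A sketch with hypothetical numbers (assumes numpy and scipy):

```python
# Sketch: Power = 1 - beta for a one-sided z-test.
# alpha, sigma, n, and the effect size are all hypothetical.
import numpy as np
from scipy.stats import norm

alpha, sigma, n = 0.05, 10.0, 25
effect = 4.0                           # true mean minus null mean
se = sigma / np.sqrt(n)                # standard error of the mean
z_crit = norm.ppf(1 - alpha)           # one-sided critical value
power = norm.sf(z_crit - effect / se)  # P(reject H0 | H0 is false)
print(f"beta ~ {1 - power:.3f}, power ~ {power:.3f}")
```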

18 Power
Whenever a study fails to reject the null hypothesis, the power of the test comes into play. When we calculate power, we imagine the null hypothesis is FALSE. The value of power depends on how far the truth lies from the null hypothesis; this distance is called the “effect size.”

19 Power
[Figure: sampling distributions under H0 and under the true parameter value, showing the critical value, α, β, and power.]

20 Notice from the visual: Power = 1 − β. Reducing α to lower the chance of a Type I error moves the critical value to the right, which increases the probability of a Type II error and consequently reduces power. Notice also that the larger the effect size, the smaller the chance of making a Type II error and the greater the power of the test. The sketch below makes the tradeoff concrete.
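Continuing the hypothetical one-sided z-test sketch from slide 17 (assumes numpy and scipy), tightening α visibly inflates β:

```python
# Sketch: for a fixed effect size, a smaller alpha (fewer Type I
# errors) means a larger beta (more Type II errors). One-sided z-test;
# numbers are hypothetical; assumes numpy/scipy.
import numpy as np
from scipy.stats import norm

sigma, n, effect = 10.0, 25, 4.0
se = sigma / np.sqrt(n)
for alpha in (0.10, 0.05, 0.01):
    beta = norm.cdf(norm.ppf(1 - alpha) - effect / se)
    print(f"alpha={alpha:.2f}: beta ~ {beta:.3f}, power ~ {1 - beta:.3f}")
```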

21 Reducing Both Type I and Type II Error
This was discussed earlier: increase the sample size. The effect of a larger sample size can be seen below. An increased sample size reduces the standard deviation of the sampling distribution, making both curves narrower.

22 Reducing Both Type I and Type II Error (cont.)
Original comparison of errors: [figure]
Comparison of errors with a larger sample size: [figure]
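The same hypothetical z-test sketch shows the sample-size effect numerically (assumes numpy and scipy): with α and the effect size fixed, power climbs as n grows because the standard error shrinks.

```python
# Sketch: holding alpha and the effect size fixed, a larger n shrinks
# the standard error and raises power (so beta falls too). One-sided
# z-test; numbers are hypothetical; assumes numpy/scipy.
import numpy as np
from scipy.stats import norm

alpha, sigma, effect = 0.05, 10.0, 4.0
for n in (25, 50, 100):
    se = sigma / np.sqrt(n)
    power = norm.sf(norm.ppf(1 - alpha) - effect / se)
    print(f"n={n:>3}: power ~ {power:.3f}")
```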

