Presentation is loading. Please wait.

Presentation is loading. Please wait.

Chapter 8 Hypothesis Testing

Similar presentations


Presentation on theme: "Chapter 8 Hypothesis Testing"— Presentation transcript:

1 Chapter 8 Hypothesis Testing

2 8-2 Basics of Hypothesis Testing

3 Definitions A hypothesis is a claim or statement about a property of a population. A hypothesis test is a procedure for testing a claim about a property of a population. If, under a given assumption, the probability of a particular observed event is exceptionally small, we conclude that the assumption is probably not correct.

4 Alternative Hypothesis
Null Hypothesis The null hypothesis (denoted by H0) is a statement that the value of a population parameter (such as proportion, mean, or standard deviation) is equal to some claimed value. We test the null hypothesis directly in the sense that we assume it is true and reach a conclusion to either reject H0 or fail to reject H0. Alternative Hypothesis The alternative hypothesis (denoted by H1 or HA) is the statement that the parameter has a value that somehow differs from the null hypothesis. The symbolic form of the alternative hypothesis must use one of these symbols: <, >, ≠.

5 Steps 1, 2, 3 Identifying H0 and H1

6 Step 4 Select the Significance Level α
The significance level (denoted by α) is the probability that the test statistic will fall in the critical region when the null hypothesis is actually true (making the mistake of rejecting the null hypothesis when it is true). This is the same α introduced in Section 7-2. Common choices for α are 0.05, 0.01, and 0.10.

7 Step 5 Identify the Test Statistic and Determine its Sampling Distribution

8 Test Statistic The test statistic is a value used in making a decision about the null hypothesis, and is found by converting the sample statistic to a score with the assumption that the null hypothesis is true.

9 Step 6 Find the Value of the Test Statistic, Then Find Either the P-Value or the Critical Value(s)
First transform the relevant sample statistic to a standardized score called the test statistic. Then find the P-Value or the critical value(s).

10 Types of Hypothesis Tests: Two-tailed, Left-tailed, Right-tailed
The tails in a distribution are the extreme regions bounded by critical values. Determinations of P-values and critical values are affected by whether a critical region is in two tails, the left tail, or the right tail. It, therefore, becomes important to correctly ,characterize a hypothesis test as two-tailed, left-tailed, or right-tailed.

11  α is divided equally between the two tails of the critical region
Two-tailed Test  α is divided equally between the two tails of the critical region

12 Left-tailed Test  All α in the left tail

13 Right-tailed Test  All α in the right tail

14 P-Value The P-value (or probability value) is the probability of getting a value of the test statistic that is at least as extreme as the one representing the sample data, assuming that the null hypothesis is true. P-value = area to the left of the test statistic Critical region in the left tail: Critical region in the right tail: P-value = area to the right of the test statistic Critical region in two tails: P-value = twice the area in the tail beyond the test statistic The null hypothesis is rejected if the P-value is very small, such as 0.05 or less.

15 Procedure for Finding P-Values

16 Critical Region Critical Value
The critical region (or rejection region) is the set of all values of the test statistic that cause us to reject the null hypothesis. For example, see the red-shaded region in the previous figures. Critical Value A critical value is any value that separates the critical region (where we reject the null hypothesis) from the values of the test statistic that do not lead to rejection of the null hypothesis. The critical values depend on the nature of the null hypothesis, the sampling distribution that applies, and the significance level α.

17 Caution Step 7 : Make a Decision: Reject H0 or Fail to Reject H0
Don’t confuse a P-value with a proportion p. Know this distinction: P-value = probability of getting a test statistic at least as extreme as the one representing sample data p = population proportion Step 7 : Make a Decision: Reject H0 or Fail to Reject H0 The methodologies depend on if you are using the P-Value method or the critical value method.

18 Decision Criterion P-value Method: Using the significance level α:
If P-value ≤ α, reject H0. If P-value > α, fail to reject H0. Critical Value Method: If the test statistic falls within the critical region, reject H0. If the test statistic does not fall within the critical region, fail to reject H0.

19 Step 8 : Restate the Decision Using Simple and Nontechnical Terms
State a final conclusion that addresses the original claim with wording that can be understood by those without knowledge of statistical procedures.

20 STEP 1, 2, 3 Assume that 100 babies are born to 100 couples treated with the XSORT method of gender selection that is claimed to make girls more likely. We observe 58 girls in 100 babies. Write the hypotheses to test the claim the “with the XSORT method, the proportion of girls is greater than the 50% that occurs without any treatment”.

21 Step 5 and 6 We know from previous chapters that a z score of 1.60 is not “unusual”. At first glance, 58 girls in 100 births does not seem to support the claim that the XSORT method increases the likelihood a having a girl (more than a 50% chance).

22 Step 6 and 7 The claim that the XSORT method of gender selection increases the likelihood of having a baby girl results in the following null and alternative hypotheses: The test statistic was : The test statistic of z = 1.60 has an area of to its right, so a right-tailed test with test statistic z = 1.60 has a P-value of

23 Step 4 and 7 For the XSORT birth hypothesis test, the critical value and critical region for an α = 0.05 test are shown below:

24 Step 4 and 7 For the XSORT baby gender test, the test had a test statistic of z = 1.60 and a P-Value of We tested: Using the P-Value method, we would fail to reject the null at the α = 0.05 level. Using the critical value method, we would fail to reject the null because the test statistic of z = 1.60 does not fall in the rejection region. (You will come to the same decision using either method.)

25 Step 8 For the XSORT baby gender test, there was not sufficient evidence to support the claim that the XSORT method is effective in increasing the probability that a baby girl will be born.

26 Wording of Final Conclusion

27 Caution Some texts use “accept the null hypothesis.”
Never conclude a hypothesis test with a statement of “reject the null hypothesis” or “fail to reject the null hypothesis.” Always make sense of the conclusion with a statement that uses simple nontechnical wording that addresses the original claim. Some texts use “accept the null hypothesis.” We are not proving the null hypothesis. Fail to reject says more correctly that the available evidence is not strong enough to warrant rejection of the null hypothesis.

28 Type I Error Type II Error
A Type I error is the mistake of rejecting the null hypothesis when it is actually true. The symbol α is used to represent the probability of a type I error. Type II Error A Type II error is the mistake of failing to reject the null hypothesis when it is actually false. The symbol β (beta) is used to represent the probability of a type II error. page 398 of Elementary Statistics, 10th Edition

29 Type I and Type II Errors

30 Example Assume that we are conducting a hypothesis test of the claim that a method of gender selection increases the likelihood of a baby girl, so that the probability of a baby girls is p > 0.5. Here are the null and alternative hypotheses: a) Identify a type I error. b) Identify a type II error.

31 Example - Continued a) A type I error is the mistake of rejecting a true null hypothesis: We conclude the probability of having a girl is greater than 50%, when in reality, it is not. Our data misled us. b) A type II error is the mistake of failing to reject the null hypothesis when it is false: There is no evidence to conclude the probability of having a girl is greater than 50% (our data misled us), but in reality, the probability is greater than 50%.

32 Parallel Example 3: Type I and Type II Errors
For each of the following claims, explain what it would mean to make a Type I error. What would it mean to make a Type II error? In 2008, 62% of American adults regularly volunteered their time for charity work. A researcher believes that this percentage is different today. According to a study published in March, 2006 the mean length of a phone call on a cellular telephone was 3.25 minutes. A researcher believes that the mean length of a call has increased since then. 32

33 Solution In 2008, 62% of American adults regularly volunteered their time for charity work. A researcher believes that this percentage is different today. A Type I error is made if the researcher concludes that p ≠ 0.62 when the true proportion of Americans 18 years or older who participated in some form of charity work is currently 62%. A Type II error is made if the sample evidence leads the researcher to believe that the current percentage of Americans 18 years or older who participated in some form of charity work is still 62% when, in fact, this percentage differs from 62%. 33

34 Solution b) According to a study published in March, 2006 the mean length of a phone call on a cellular telephone was minutes. A researcher believes that the mean length of a call has increased since then. A Type I error occurs if the sample evidence leads the researcher to conclude that μ > 3.25 when, in fact, the actual mean call length on a cellular phone is still minutes. A Type II error occurs if the researcher fails to reject the hypothesis that the mean length of a phone call on a cellular phone is 3.25 minutes when, in fact, it is longer than 3.25 minutes. 34

35 Controlling Type I and Type II Errors
For any fixed α, an increase in the sample size n will cause a decrease in β For any fixed sample size n, a decrease in α will cause an increase in β. Conversely, an increase in α will cause a decrease in β. To decrease both α and β, increase the sample size. The power of a hypothesis test is the probability 1 – β of rejecting a false null hypothesis. The value of the power is computed by using a particular significance level α and a particular value of the population parameter that is an alternative to the value assumed true in the null hypothesis.

36 8-3 Testing a Claim about a Proportion

37 Notation n = sample size or number of trials p = population proportion
q = 1 – p

38 Requirements for Testing Claims About a Population Proportion p
1) The sample observations are a simple random sample. 2) The conditions for a binomial distribution are satisfied. The conditions np ≥ 5 and nq ≥ 5 are both satisfied, so the binomial distribution of sample proportions can be approximated by a normal distribution with μ = np and Note: p is the assumed proportion not the sample proportion.

39 Test Statistic for Testing a Claim About a Proportion
P-values: Critical Values: Use the standard normal distribution (Table A-2) and refer to Figure 8-1. Use the standard normal distribution (Table A-2).

40 Caution Don’t confuse a P-value with a proportion p.
P-value = probability of getting a test statistic at least as extreme as the one representing sample data p = population proportion

41 Caution When testing claims about a population proportion, the traditional method and the P-value method are equivalent and will yield the same result since they use the same standard deviation based on the claimed proportion p. However, the confidence interval uses an estimated standard deviation based upon the sample proportion . Consequently, it is possible that the traditional and P-value methods may yield a different conclusion than the confidence interval method. A good strategy is to use a confidence interval to estimate a population proportion, but use the P-value or traditional method for testing a claim about the proportion.

42 Example Based on information from the National Cyber Security Alliance, 93% of computer owners believe they have antivirus programs installed on their computers. In a random sample of 400 scanned computers, it is found that 380 of them (or 95%) actually have antivirus software programs. Use the sample data from the scanned computers to test the claim that 93% of computers have antivirus software.

43 Example - Continued Requirement check:
The 400 computers are randomly selected. There is a fixed number of independent trials with two categories (computer has an antivirus program or does not). The requirements np ≥ 5 and nq ≥ 5 are both satisfied with n = 400

44 Example - Continued P-Value Method:
The original claim that 93% of computers have antivirus software can be expressed as p = 0.93. The opposite of the original claim is p ≠ 0.93. The hypotheses are written as:

45 Example - Continued P-Value Method:
For the significance level, we select α = 0.05. Because we are testing a claim about a population proportion, the sample statistic relevant to this test is:

46 Example - Continued P-Value Method:
The test statistic is calculated as:

47 Example - Continued P-Value Method:
Because the hypothesis test is two-tailed with a test statistic of z = 1.57, the P-value is twice the area to the right of z = 1.57. The P-value is twice , or

48 Example - Continued P-Value Method:
Because the P-value of is greater than the significance level of α = 0.05, we fail to reject the null hypothesis. We fail to reject the claim that 93% computers have antivirus software. We conclude that there is not sufficient sample evidence to warrant rejection of the claim that 93% of computers have antivirus programs.

49 Example - Continued Critical Value Method: Steps 1 – 5 are the same as for the P-value method. The test statistic is computed to be z = We now find the critical values, with the critical region having an area of α = 0.05, split equally in both tails.

50 Example - Continued Critical Value Method:
Because the test statistic does not fall in the critical region, we fail to reject the null hypothesis. We fail to reject the claim that 93% computers have antivirus software. We conclude that there is not sufficient sample evidence to warrant rejection of the claim that 93% of computers have antivirus programs.

51 Example - Continued Confidence Interval Method:
The claim of p = 0.93 can be tested at the α = 0.05 level of significance with a 95% confidence interval. Using the methods of Section 7-2, we get: 0.929 < p < 0.971 This interval contains p = 0.93, so we do not have sufficient evidence to warrant the rejection of the claim that 93% of computers have antivirus programs.

52 Example When asked about gun ownership, 46% of all Americans said protecting the right to own guns is more important then controlling gun ownership. The Pew Research Center surveyed 1267 randomly selected Americans with at least a bachelor’s degree and found that 559 believed that protecting the right to own guns is more important. Does this result suggest the proportion of Americans with at least a bachelor’s degree feel different than the general American population when it comes to gun control? Use α = level of significance. 52

53 Two-Tailed Hypothesis Testing Using Confidence Intervals
When testing H0: p = p0 versus H1: p ≠ p0 if a (1 – α) • 100% confidence interval contains p0, do not reject the null hypothesis. However, if the confidence interval does not contain p0, conclude that p ≠ p0 at the level of significance, α. 53

54 Example A study found that 34% of teenagers text while driving. Does a recent survey conducted, which found that 353 of 1200 randomly selected teens had texted while driving, suggest that the proportion of teens who text while driving has changed? Use 95% confidence interval to answer the question. 54

55 8-4 Testing a Claim About a Mean

56 Part 1 Notation = sample mean = population mean
When σ is not known, we use a “t test” that incorporates the Student t distribution. Notation = sample mean = population mean

57 Test Statistic Requirements The sample is a simple random sample.
Either or both of these conditions is satisfied: The population is normally distributed or n > 30. Test Statistic

58 Running the Test P-values: Use technology or use the Student t distribution in Table A-3 with degrees of freedom df = n – 1. Critical values: Use the Student t distribution with degrees of freedom df = n – 1.

59 Example Listed below are the measured radiation emissions (in W/kg) corresponding to a sample of cell phones. Use a 0.05 level of significance to test the claim that cell phones have a mean radiation level that is less than 1.00 W/kg. The summary statistics are: 0.38 0.55 1.54 1.55 0.50 0.60 0.92 0.96 1.00 0.86 1.46

60 Example - Continued Requirement Check:
We assume the sample is a simple random sample. The sample size is n = 11, which is not greater than 30, so we must check a normal quantile plot for normality.

61 Example - Continued The points are reasonably close to a straight line and there is no other pattern, so we conclude the data appear to be from a normally distributed population.

62 Example - Continued Step 1: The claim that cell phones have a mean radiation level less than W/kg is expressed as μ < 1.00 W/kg. Step 2: The alternative to the original claim is μ ≥ 1.00 W/kg. Step 3: The hypotheses are written as:

63 Example - Continued Step 4: The stated level of significance is α = 0.05. Step 5: Because the claim is about a population mean μ, the statistic most relevant to this test is the sample mean: . Step 6: Calculate the test statistic and then find the P-value or the critical value from Table A-3:

64 Example - Continued Step 7: Critical Value Method: Because the test statistic of t = – does not fall in the critical region bounded by the critical value of t = –1.812, fail to reject the null hypothesis.

65 Example - Continued Step 7: P-value method: Technology, such as a TI-83/84 Plus calculator can output the P-value of Since the P-value exceeds α = 0.05, we fail to reject the null hypothesis.

66 Example Step 8: Because we fail to reject the null hypothesis, we conclude that there is not sufficient evidence to support the claim that cell phones have a mean radiation level that is less than 1.00 W/kg.

67 Example – Confidence Interval Method
We can use a confidence interval for testing a claim about μ. For a two-tailed test with a 0.05 significance level, we construct a 95% confidence interval. For a one-tailed test with a 0.05 significance level, we construct a 90% confidence interval.

68 Example – Confidence Interval Method
Using the cell phone example, construct a confidence interval that can be used to test the claim that μ < 1.00 W/kg, assuming a 0.05 significance level. Note that a left-tailed hypothesis test with α = 0.05 corresponds to a 90% confidence interval. Using methods described in Section 7.3, we find: 0.707 W/kg < μ < W/kg

69 Example – Confidence Interval Method
Because the value of μ = 1.00 W/kg is contained in the interval, we cannot reject the null hypothesis that μ = 1.00 W/kg . Based on the sample of 11 values, we do not have sufficient evidence to support the claim that the mean radiation level is less than 1.00 W/kg.

70 Parallel Example 1: Testing a Hypothesis about a
Parallel Example 1: Testing a Hypothesis about a Population Mean, Large Sample The mean height of American males is 69.5 inches. The heights of the 43 male U.S. presidents have a mean inches and a standard deviation of 2.77 inches. Treating the 43 presidents as a simple random sample, determine if there is evidence to suggest that U.S. presidents are taller than the average American male. Use the α = 0.05 level of significance. 70

71 8-5 Testing a Claim About a Standard Deviation or Variance

72 Requirements for Testing Claims About σ or σ2
= sample size = sample standard deviation = sample variance  = claimed value of the population standard deviation = claimed value of the population variance

73 Requirements Test Statistic 1. The sample is a simple random sample.
2. The population has a normal distribution. (This is a much stricter requirement than the requirement of a normal distribution when testing claims about means.) Test Statistic

74 P-Values and Critical Values for Chi-Square Distribution
P-values: Use technology or Table A-4. Critical Values: Use Table A-4. In either case, the degrees of freedom = n –1. Caution The χ2 test of this section is not robust against a departure from normality, meaning that the test does not work well if the population has a distribution that is far from normal. The condition of a normally distributed population is therefore a much stricter requirement in this section than it was in Section 8-4.

75 Example Listed below are the heights (inches) for a simple random sample of ten supermodels. Consider the claim that supermodels have heights that have much less variation than the heights of women in the general population. We will use a 0.01 significance level to test the claim that supermodels have heights with a standard deviation that is less than 2.6 inches. Summary Statistics: 70 71 69.25 68.5 69 69.5

76 Example - Continued Requirement Check:
The sample is a simple random sample. We check for normality, which seems reasonable based on the normal quantile plot.

77 Example - Continued Step 1: The claim that “the standard deviation is less than 2.6 inches” is expressed as σ < 2.6 inches. Step 2: If the original claim is false, then σ ≥ 2.6 inches. Step 3: The hypotheses are: Step 4: The significance level is α = 0.01. Step 5: Because the claim is made about σ, we use the chi-square distribution.

78 Example - Continued Step 6: The test statistic is calculated as follows: with 9 degrees of freedom.

79 Example - Continued Step 6: The critical value of χ2 = is found from Table A-4, and it corresponds to 9 degrees of freedom and an “area to the right” of 0.99.

80 Example - Continued Step 7: Because the test statistic is in the critical region, we reject the null hypothesis. There is sufficient evidence to support the claim that supermodels have heights with a standard deviation that is less than 2.6 inches. Heights of supermodels have much less variation than heights of women in the general population.

81 Example - Continued Confidence Interval Method:
Since the hypothesis test is left-tailed using a 0.01 level of significance, we can run the test by constructing an interval with 98% confidence. Using the methods of Section 7-4, and the critical values found in Table A- 4, we can construct the following interval:

82 Example - Continued Based on this interval, we can support the claim that σ < 2.6 inches, reaching the same conclusion as using the P-value method and the critical value method.

83 Example The quality control engineer for M&M-Mars tested whether the mean weight of fun-size Snickers was 20.1 grams. Suppose that the standard deviation of the weight of the candy was 0.75 gram before a machine recalibration. The engineer wants to know if the recalibration resulted in more consistent weights. Conduct the appropriate test at α = 0.05 level of significance. 83

84 Example A machine fills bottles with 64 fluid ounces of liquid. The quality control manager determines that the fill levels are normally distributed with a mean of 64 ounces and a standard deviation of 0.42 ounce. He has an engineer recalibrate the machine to try to lower the standard deviation. After the recalibration, the manager selects 19 bottles from the line and determines that the standard deviation is 0.38 ounce. Is there less variability in the filling machine? Conduct the appropriate test at α = 0.01 level of significance. 84


Download ppt "Chapter 8 Hypothesis Testing"

Similar presentations


Ads by Google