Lecture 3 Miscellaneous details about hypothesis testing Type II error
1 Lecture 3. Miscellaneous details about hypothesis testing: Type II error; practical significance vs. statistical significance. Chapter 12.2: Inference about the mean when the s.d. is unknown.
2 Relation between p-value and rejection region methods. Compare the p-value to $\alpha$. Reject the null hypothesis only if p-value < $\alpha$. Ex. 11.1:
3 Null Hypothesis in One-Sided Test. We start by defining $H_1$ because this is the focus of our test. Example 11.1: $H_1: \mu > 170$. The null hypothesis $H_0: \mu \le 170$ is more logically satisfying than $H_0: \mu = 170$. However, only the parameter value in $H_0$ that is closest to $H_1$ influences the form of the test. We therefore take $H_0: \mu = 170$ for simplicity.
4 Calculating the Probability of a Type II Error. To properly interpret the results of a test of hypothesis, we need to (i) specify an appropriate significance level or judge the p-value of a test; (ii) understand the relationship between Type I and Type II errors. How do we compute the probability of a Type II error?
5 Calculation of the Probability of a Type II Error. A Type II error occurs when a false $H_0$ is not rejected. To calculate the probability of a Type II error we need to (i) express the rejection region directly in terms of the unstandardized sample statistic (here $\bar{x}$), not the standardized test statistic; (ii) specify the alternative value under $H_1$. Let us revisit Example 11.1.
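6 Calculation of the Probability of a Type II Error. Let us revisit Example 11.1. The rejection region was $\bar{x} > 170 + z_{.05}\,\sigma/\sqrt{n}$ with $\alpha = .05$. Let the alternative value be $\mu = 180$ (rather than just $\mu > 170$). Then $\beta = P(\bar{x} \le 170 + z_{.05}\,\sigma/\sqrt{n} \mid \mu = 180)$.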
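The calculation on slides 5 and 6 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: n = 400 is taken from slide 9, while the population standard deviation sigma = 65 does not appear in this transcript and is assumed here. The function name `type_ii_error_upper` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error_upper(mu0, mu1, sigma, n, alpha):
    """Type II error probability for the upper-tailed z-test of H0: mu = mu0
    when the true mean is mu1 (population s.d. sigma assumed known)."""
    se = sigma / sqrt(n)                       # s.d. of the sample mean
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # upper-alpha normal quantile
    x_crit = mu0 + z_alpha * se                # reject H0 when x-bar > x_crit
    # Type II error: x-bar lands at or below x_crit although mu = mu1.
    beta = NormalDist(mu1, se).cdf(x_crit)
    return x_crit, beta


# sigma=65 is an ASSUMPTION (not given in the transcript); n=400 is from slide 9.
x_crit, beta = type_ii_error_upper(mu0=170, mu1=180, sigma=65, n=400, alpha=0.05)
print(f"critical value of x-bar: {x_crit:.2f}")
print(f"P(Type II error | mu = 180): {beta:.4f}")
```

The key step is that the rejection region is stated on the scale of the sample mean, so the same cutoff can be re-evaluated under the alternative mean.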
7 Judging the Test. A hypothesis test is effectively defined by the significance level $\alpha$ and by the sample size $n$. A measure of effectiveness is the probability of a Type II error, $\beta$. Typically we want to keep $\beta$ as small as possible. If $\beta$ is judged to be too large, we can reduce it by increasing $\alpha$ and/or increasing the sample size.
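8 Judging the Test. Increasing the sample size reduces $\beta$. By increasing the sample size, the standard deviation of the sampling distribution of the mean, $\sigma/\sqrt{n}$, decreases. Thus, the critical value of $\bar{x}$ moves closer to the null value and $\beta$ decreases.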
9 Judging the Test. In Example 11.1, suppose $n$ increases from 400 to 1000. $\alpha$ remains 5%, but the probability of a Type II error drops dramatically.
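A rough numerical check of slides 7 to 9: the sketch below recomputes the Type II error probability at the alternative mean 180 for n = 400 and n = 1000, and also for a larger significance level. As before, sigma = 65 is an assumption that is not stated in this transcript, and `beta_upper` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def beta_upper(mu0, mu1, sigma, n, alpha):
    """Type II error probability of the upper-tailed z-test at true mean mu1."""
    se = sigma / sqrt(n)
    x_crit = mu0 + NormalDist().inv_cdf(1 - alpha) * se
    return NormalDist(mu1, se).cdf(x_crit)


SIGMA = 65  # ASSUMPTION: not given in the transcript

beta_n400 = beta_upper(170, 180, SIGMA, n=400, alpha=0.05)
beta_n1000 = beta_upper(170, 180, SIGMA, n=1000, alpha=0.05)
beta_alpha10 = beta_upper(170, 180, SIGMA, n=400, alpha=0.10)

print(f"n = 400,  alpha = 0.05: beta = {beta_n400:.4f}")
print(f"n = 1000, alpha = 0.05: beta = {beta_n1000:.4f}")
print(f"n = 400,  alpha = 0.10: beta = {beta_alpha10:.4f}")
```

Both levers named on slide 7 show up here: a larger sample and a larger significance level each give a smaller Type II error probability, with the sample-size increase having by far the bigger effect in this setting.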
10 Judging the Test. Another way of expressing how well a test performs is to report its power. The power of a test is defined as $1 - \beta$. It represents the probability of rejecting the null hypothesis when it is false.
11 Planning Studies. Power calculations are important in planning studies. Using a hypothesis test with low power makes it unlikely that you will reject $H_0$ even if the truth is far from the null hypothesis. The operating characteristic curve is a plot of $\beta$ versus the alternative value of the parameter for a fixed sample size $n$ and a fixed significance level $\alpha$.
12 Operating Characteristic Curve for Example 11.1
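The points of such a curve can be tabulated rather than plotted. The sketch below lists the Type II error probability and the power over a grid of alternative means for the upper-tailed test of H0: mu = 170 with n = 400 and a 5% significance level. The value sigma = 65 is an assumption (it is not given in this transcript), and both the grid of alternatives and the name `oc_curve` are illustrative choices.

```python
from math import sqrt
from statistics import NormalDist


def oc_curve(mu0, sigma, n, alpha, alternatives):
    """Return (mu1, beta) pairs for an upper-tailed z-test of H0: mu = mu0."""
    se = sigma / sqrt(n)
    x_crit = mu0 + NormalDist().inv_cdf(1 - alpha) * se
    return [(mu1, NormalDist(mu1, se).cdf(x_crit)) for mu1 in alternatives]


# sigma=65 is an ASSUMPTION; the grid of alternatives is illustrative.
curve = oc_curve(mu0=170, sigma=65, n=400, alpha=0.05,
                 alternatives=range(170, 187, 2))

print(f"{'mu':>5} {'beta':>8} {'power':>8}")
for mu1, beta in curve:
    print(f"{mu1:>5} {beta:>8.4f} {1 - beta:>8.4f}")
```

At the null value itself the tabulated probability equals one minus the significance level, and it falls steadily as the alternative moves away from 170.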
13 Problem 11.54. Many Alpine ski centers base their projections of revenues and profits on the assumption that the average Alpine skier skis 4 times per year. To investigate the validity of this assumption, a random sample of 63 skiers is drawn and each is asked to report the number of times they skied the previous year. Assume that the population standard deviation is 2, and the sample mean is $\bar{x}$. Can we infer at the 10% level that the assumption is wrong?
15 Problem follow-up. What is the probability of making a Type II error if the average Alpine skier skis 4.2 times per year?
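One way to work the follow-up, sketched in Python: because the question on slide 13 is whether the assumption is "wrong", the test is treated as two-sided here (H0: mu = 4 against H1: mu not equal to 4), which is an interpretation rather than something the slide states. All numbers (sigma = 2, n = 63, 10% level, alternative 4.2) come from the slides; `beta_two_sided` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def beta_two_sided(mu0, mu1, sigma, n, alpha):
    """Type II error probability of the two-sided z-test of H0: mu = mu0
    when the true mean is mu1."""
    se = sigma / sqrt(n)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    lower, upper = mu0 - z * se, mu0 + z * se  # H0 is NOT rejected in [lower, upper]
    sampling = NormalDist(mu1, se)             # distribution of x-bar if mu = mu1
    return lower, upper, sampling.cdf(upper) - sampling.cdf(lower)


lower, upper, beta = beta_two_sided(mu0=4, mu1=4.2, sigma=2, n=63, alpha=0.10)
print(f"do not reject H0 when {lower:.3f} <= x-bar <= {upper:.3f}")
print(f"P(Type II error | mu = 4.2) = {beta:.3f}")
```

The alternative 4.2 is close to the null value relative to the standard error, so the test has little power against it.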
16 Problem: Effects of SAT Coaching. Suppose that SAT mathematics scores in the absence of coaching have a normal distribution with mean 475 and standard deviation 100. Suppose further that coaching may change the mean but not the standard deviation. Calculate the p-value for the test of $H_0: \mu = 475$ versus $H_1: \mu > 475$ for each of the following three situations: (a) A coaching service coaches 100 students; their SAT-M scores average 478. (b) By the next year, the coaching service has coached 1000 students; their SAT-M scores average 478. (c) An advertising campaign brings the total number of students coached to 10,000; their average score is still 478.
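The three p-values can be sketched as follows. The inputs sigma = 100 and a sample average of 478 are the values implied by the confidence intervals reported on slide 18, and the upper-tailed alternative (mean greater than 475) is an assumption about the intended test. The name `upper_tail_p_value` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def upper_tail_p_value(xbar, mu0, sigma, n):
    """z statistic and one-sided (upper-tail) p-value for H0: mu = mu0."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    return z, 1 - NormalDist().cdf(z)


# sigma=100 and xbar=478 are inferred from the intervals on slide 18.
p_values = {}
for label, n in [("a", 100), ("b", 1000), ("c", 10_000)]:
    z, p = upper_tail_p_value(xbar=478, mu0=475, sigma=100, n=n)
    p_values[label] = p
    print(f"({label}) n = {n:>6}: z = {z:.2f}, p-value = {p:.4f}")
```

The observed 3-point difference is the same in all three cases; only the sample size changes, and with it the p-value.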
18 Practical Significance vs. Statistical Significance. An increase in the average SAT-M score from 475 to 478 is of little importance in seeking admission to college, but a large enough sample size will always declare very small effects statistically significant. A confidence interval provides information about the size of the effect and should always be reported. The two-sided 95% confidence intervals for the SAT coaching problem are $\bar{x} \pm 1.96\,\sigma/\sqrt{n}$. Thus, for (a): (458.4, 497.6); (b): (471.8, 484.2); (c): (476.04, 479.96). For large samples, the CI says "Yes, the mean score is higher after coaching, but only by a small amount."
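The three intervals can be reproduced with a short sketch. It assumes the known-sigma z-interval with sigma = 100 and a sample average of 478; those two inputs are back-solved from the reported intervals themselves (their common midpoint and their half-widths). The name `z_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, level=0.95):
    """Two-sided confidence interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width


# xbar=478 and sigma=100 are inferred from the intervals quoted on the slide.
intervals = {}
for label, n in [("a", 100), ("b", 1000), ("c", 10_000)]:
    intervals[label] = z_interval(xbar=478, sigma=100, n=n)
    lo, hi = intervals[label]
    print(f"({label}) n = {n:>6}: ({lo:.2f}, {hi:.2f})")
```

The interval narrows around 478 as n grows, which is what makes the small size of the effect visible even when the test is highly significant.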
19 Chapter 12. In this chapter we utilize the approach developed before to describe a population: (i) identify the parameter to be estimated or tested; (ii) specify the parameter's estimator and its sampling distribution; (iii) construct a confidence interval estimator or perform a hypothesis test.
20 12.2 Inference About a Population Mean When the Population Standard Deviation Is Unknown. Recall that when $\sigma$ is known we use the statistic $z = (\bar{x} - \mu)/(\sigma/\sqrt{n})$ to estimate and test a population mean. When $\sigma$ is unknown, we use its point estimator $s$, and the z-statistic is then replaced by the t-statistic $t = (\bar{x} - \mu)/(s/\sqrt{n})$.
21 t-Statistic. When the sampled population is normally distributed, the t-statistic is Student t distributed with $n-1$ degrees of freedom. Confidence interval: $\bar{x} \pm t_{\alpha/2,\,n-1}\, s/\sqrt{n}$, where $t_{\alpha/2,\,n-1}$ is the $(1-\alpha/2)$ quantile of the Student t-distribution with $n-1$ degrees of freedom.
22 The t-Statistic: $t = (\bar{x} - \mu)/(s/\sqrt{n})$. The "degrees of freedom" (a function of the sample size) determine how spread out the distribution is (compared to the normal distribution). The t distribution is mound-shaped and symmetrical around zero. [Figure: t densities with d.f. = $\nu_1$ and d.f. = $\nu_2$, where $\nu_1 < \nu_2$.]
24 Testing $\mu$ when $\sigma$ is unknown. Example 12.1: In order to determine the number of workers required to meet demand, the productivity of newly hired trainees is studied. It is believed that trainees can process and distribute more than 450 packages per hour within one week of hiring. Fifty trainees were observed for one hour. In this sample of 50 trainees, the mean number of packages processed is $\bar{x}$ and $s = 38.82$. Can we conclude that the belief is correct, based on the productivity observation of 50 trainees?
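The mechanics of this test can be sketched as below. The sample mean is not preserved in this transcript, so the value 460.38 used here is a hypothetical stand-in chosen only for illustration; n = 50, s = 38.82 and the null value 450 come from the slide. The critical value 1.677 is the approximate table value for an upper-tailed test at the 5% level with 49 degrees of freedom; the 5% level is also an assumption, since the slide does not state one.

```python
from math import sqrt


def t_statistic(xbar, mu0, s, n):
    """One-sample t statistic for H0: mu = mu0 with unknown population s.d."""
    return (xbar - mu0) / (s / sqrt(n))


XBAR_HYPOTHETICAL = 460.38  # HYPOTHETICAL: the slide's sample mean is missing
T_CRIT_05_DF49 = 1.677      # approximate t-table value, alpha = 0.05, df = 49

t = t_statistic(xbar=XBAR_HYPOTHETICAL, mu0=450, s=38.82, n=50)
reject_h0 = t > T_CRIT_05_DF49  # upper-tailed test of H1: mu > 450

print(f"t = {t:.2f} on {50 - 1} degrees of freedom")
print("reject H0" if reject_h0 else "do not reject H0")
```

With a different sample mean the conclusion could change; the structure of the calculation is what carries over.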
26 Checking the required conditions. In deriving the test and confidence interval, we have made two assumptions: (i) the sample is a random sample from the population; (ii) the distribution of the population is normal. The t test is robust: the results are still approximately valid as long as the population is not extremely nonnormal. Also, if the sample size is large, the results are approximately valid. A rough graphical approach to examining normality is to look at the sample histogram.
27 JMP Example. Problem 12.45: Companies that sell groceries over the Internet are called e-grocers. Customers enter their orders, pay by credit card, and receive delivery by truck. A potential e-grocer analyzed the market and determined that to be profitable the average order would have to exceed $85. To determine whether an e-grocer would be profitable in one large city, she offered the service and recorded the size of the order for a random sample of customers. Can we infer from the data that e-grocery will be profitable in this city at significance level 0.05?