Topic 6: Introduction to Hypothesis Testing

Topic 6: Introduction to Hypothesis Testing
CEE 11, Spring 2002, Dr. Amelia Regan. These notes draw liberally from the class text, Probability and Statistics for Engineering and the Sciences by Jay L. Devore, Duxbury 1995 (4th edition). Additional material is taken from

Hypothesis testing
A statistical hypothesis is a claim either about the value of a single population parameter or about the values of several population parameters. A population parameter might be the mean diameter, weight, or time to failure, the proportion of defective units, etc. One example of a hypothesis is that μ = c, where μ is a population mean and c is some constant. We might also hypothesize that μ1 > μ2, or that p = c, where p is a population proportion.

Hypothesis testing
In any hypothesis testing problem there are two contradictory hypotheses under consideration, known as the null hypothesis (H0) and the alternative hypothesis (Ha). A test procedure is specified by the following:
A test statistic: a function of the sample data on which the decision (reject H0 or do not reject H0) is based.
A rejection region: the set of all test statistic values for which H0 will be rejected.

Hypothesis testing
The null hypothesis will be rejected if and only if the observed or computed test statistic value falls into the rejection region. A type I error consists of rejecting the null hypothesis when it is true. A type II error consists of not rejecting the null hypothesis when it is false.
α = the probability of a type I error = P(H0 is rejected when it is in fact true)
β = the probability of a type II error = P(H0 is not rejected when it is in fact false)
α is often called the significance level of a test.
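The two error probabilities can be estimated by simulation. The sketch below is only an illustration under assumed values: a lower-tailed z test of H0: μ = 50 with known σ = 4 and n = 25, none of which come from these notes. When samples are drawn with H0 true, the rejection rate estimates α; when they are drawn with μ = 48, the non-rejection rate estimates β.

```python
import random
from statistics import NormalDist

def rejects_h0(sample, mu0, sigma, alpha):
    """Lower-tailed z test of H0: mu = mu0 against Ha: mu < mu0 (sigma known)."""
    n = len(sample)
    z = (sum(sample) / n - mu0) / (sigma / n ** 0.5)
    return z <= NormalDist().inv_cdf(alpha)  # rejection region: z <= -z_alpha

def rejection_rate(true_mu, mu0=50.0, sigma=4.0, n=25, alpha=0.05,
                   trials=10000, seed=1):
    """Fraction of simulated samples for which H0 is rejected."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mu, sigma) for _ in range(n)]
        hits += rejects_h0(sample, mu0, sigma, alpha)
    return hits / trials

# All numbers below are hypothetical, chosen only for illustration.
type_1_rate = rejection_rate(true_mu=50.0)      # H0 true: estimates alpha
type_2_rate = 1 - rejection_rate(true_mu=48.0)  # H0 false: estimates beta
print(f"estimated alpha = {type_1_rate:.3f}, estimated beta = {type_2_rate:.3f}")
```

The estimate of α should land near the nominal 0.05, while the estimate of β depends on how far the assumed true mean is from 50.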

Hypothesis testing
For a given experiment and sample size, shrinking the rejection region to decrease the probability of a type I error increases the probability of a type II error. Similarly, enlarging the rejection region to decrease the probability of a type II error increases the probability of a type I error.
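The trade-off can be seen numerically. The following is a minimal sketch, assuming a lower-tailed z test with hypothetical values μ0 = 50, true mean 48, σ = 4 and n = 25 (none of these are from the notes). For each α it finds the cutoff on the sample mean and then the probability of not rejecting when the true mean is 48.

```python
from statistics import NormalDist

def beta_lower_tailed(alpha, mu0, mu_true, sigma, n):
    """P(type II error) of the level-alpha lower-tailed z test when mu = mu_true."""
    se = sigma / n ** 0.5
    cutoff = mu0 + NormalDist().inv_cdf(alpha) * se  # reject H0 if xbar <= cutoff
    return 1 - NormalDist(mu_true, se).cdf(cutoff)   # P(xbar > cutoff | mu_true)

alphas = [0.10, 0.05, 0.01, 0.001]
# Hypothetical setting, for illustration only.
betas = [beta_lower_tailed(a, mu0=50.0, mu_true=48.0, sigma=4.0, n=25)
         for a in alphas]
for a, b in zip(alphas, betas):
    print(f"alpha = {a:<6} beta = {b:.3f}")
```

With the sample size held fixed, each smaller α in the list gives a larger β.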

Hypothesis testing
When an independent variable appears to have an effect, it is very important to be able to state with confidence that the effect was really due to the variable and not just due to chance. For instance, consider a hypothetical experiment on a new antidepressant drug. Ten people suffering from depression were sampled and treated with the new drug (the experimental group); an additional 10 people were sampled from the same population and were treated only with a placebo (the control group).

Hypothesis testing
After 12 weeks, the level of depression in all subjects was measured and it was found that the mean level of depression (on a 10-point scale with higher numbers indicating more depression) was 4 for the experimental group and 6 for the control group. The most basic question that can be asked here is: "How can one be sure that the drug treatment, rather than chance occurrences, was responsible for the difference between the groups?" It could be that, by chance, the people who were randomly assigned to the treatment group were initially somewhat less depressed than those randomly assigned to the control group.

Hypothesis testing
The null hypothesis in this case would be H0: μ1 = μ2, meaning that the true mean level of depression is the same for the two groups, where μ1 is the mean for the experimental (drug) group and μ2 is the mean for the control group. The alternative hypothesis would be Ha: μ1 < μ2.
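One way to make the "drug or chance?" question concrete is a permutation test, sketched below. This is not the method these notes go on to develop (z and t tests); it is swapped in because it mirrors the random-assignment argument directly. The individual scores are invented for illustration, chosen only so that the group means are 4 and 6; the experiment as described gives no individual data.

```python
import random

def permutation_p_value(treated, control, n_perm=10000, seed=0):
    """One-sided p-value for Ha: mean(treated) < mean(control).

    Repeatedly reshuffles the group labels and counts how often the
    reshuffled difference in means is at least as negative as the observed one.
    """
    rng = random.Random(seed)
    k = len(treated)
    observed = sum(treated) / k - sum(control) / len(control)
    pooled = list(treated) + list(control)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = sum(pooled[:k]) / k - sum(pooled[k:]) / (len(pooled) - k)
        if diff <= observed + 1e-12:  # small tolerance for float rounding
            count += 1
    return count / n_perm

# Hypothetical scores (not from the source); means are 4 and 6.
drug_scores = [3, 4, 5, 2, 6, 4, 3, 5, 4, 4]
placebo_scores = [6, 7, 5, 6, 8, 5, 6, 7, 4, 6]
p_value = permutation_p_value(drug_scores, placebo_scores)
print(f"one-sided permutation p-value = {p_value:.4f}")
```

A small p-value says that random assignment alone would rarely produce a difference this large, which is evidence against H0: μ1 = μ2 in favor of Ha: μ1 < μ2.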

Hypothesis testing -- example
Problem 12, Page 321. A new design for the braking system on a certain type of car has been proposed. For the current system, the true average braking distance at 40 mph (under specified conditions) is known to be 120 feet. It is proposed that the new design be implemented only if sample data strongly indicate that the new design leads to a reduction in the braking distance.
a) Define the parameter of interest. μ = mean braking distance under the new system.

Hypothesis testing -- example
b) Suppose that braking distance for the new system is normally distributed with σ = 10. Let x̄ represent the mean braking distance for a sample of 36 observations. Which of the following rejection regions is appropriate? To answer this, we need to identify the null and alternative hypotheses. Because the new design is to be implemented only if the data strongly indicate a reduction, these are H0: μ = 120 and Ha: μ < 120.

Hypothesis testing -- example
b) Under H0, the test statistic x̄ is normally distributed with mean 120 and standard deviation 10/√36 ≈ 1.667; therefore, the tests below correspond to α values of
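For reference, the value 1.667 and the link between a rejection region and its α can be written out as follows. This is a sketch in which c denotes a hypothetical cutoff for a lower-tailed region of the form "reject H0 if x̄ ≤ c"; c and Φ (the standard normal cdf) are notation assumed here.

```latex
\[
\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{10}{\sqrt{36}} \approx 1.667,
\qquad
\alpha = P(\bar{X} \le c \mid \mu = 120)
       = \Phi\!\left(\frac{c - 120}{1.667}\right)
\]
```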

Hypothesis testing -- example
c) The significance level = To obtain α = we need to change the rejection region so that it begins with z = -3.08
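As a check on the z = -3.08 cutoff, the sketch below computes the lower-tail area under the standard normal curve and converts the cutoff back to the x̄ scale, using μ0 = 120, σ = 10 and n = 36 from the problem. The variable names are chosen here for illustration.

```python
from statistics import NormalDist

mu0, sigma, n = 120.0, 10.0, 36
se = sigma / n ** 0.5                 # standard deviation of the sample mean

z_cut = -3.08
alpha = NormalDist().cdf(z_cut)       # P(Z <= -3.08) under H0
xbar_cut = mu0 + z_cut * se           # same cutoff on the xbar scale

print(f"alpha = {alpha:.4f}, reject H0 if xbar <= {xbar_cut:.2f}")
```

The lower-tail area comes out at about 0.001, so a region starting at z = -3.08 rejects a true H0 roughly once in a thousand samples.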

Hypothesis testing -- example
d) This asks for the probability that the new design is not implemented when its true average braking distance is actually 115 and the appropriate region from b) is used, that is, the type II error probability β when μ = 115. Draw a picture!
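The calculation in d) is the area to the right of the cutoff under the normal curve centered at 115. The sketch below shows the mechanics. The region from b) is not reproduced here, so the cutoff used is an assumption: it is the x̄ value matching z = -3.08 from part c), and the function name is hypothetical.

```python
from statistics import NormalDist

def beta(cutoff, mu_true, sigma=10.0, n=36):
    """P(do not reject H0) = P(xbar > cutoff) when the true mean is mu_true."""
    se = sigma / n ** 0.5
    return 1 - NormalDist(mu_true, se).cdf(cutoff)

# Assumed cutoff: the xbar value corresponding to z = -3.08 (not the region from b).
cutoff = 120.0 - 3.08 * (10.0 / 36 ** 0.5)
beta_115 = beta(cutoff, mu_true=115.0)
print(f"cutoff = {cutoff:.2f}, beta(115) = {beta_115:.3f}")
```

Because this cutoff sits just below 115, slightly more than half of the sampling distribution centered at 115 lies above it, so the new design would fail to be adopted more than half the time even though it really is better.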

Sampling Distributions
When σ for a normal population is known, the standardized mean of a sample from the population, (x̄ - μ)/(σ/√n), follows the standard normal distribution. However, we rarely know σ. Instead, we must estimate the standard deviation from the sample data. In that case, the standardized mean, (x̄ - μ)/(s/√n), follows a similar but different distribution (known as Student’s t distribution, or the t distribution) with n - 1 degrees of freedom.
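The difference between the two distributions shows up in the tails. The simulation below is a rough sketch with assumed settings (standard normal population, n = 5, cutoff 1.96): it standardizes the sample mean once with the known σ and once with the sample standard deviation s, and counts how often the result exceeds 1.96 in absolute value.

```python
import random

def tail_rate(n, use_sample_sd, trials=20000, seed=2, crit=1.96):
    """Fraction of samples whose standardized mean exceeds crit in absolute value."""
    rng = random.Random(seed)
    exceed = 0
    for _ in range(trials):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]  # population: mu = 0, sigma = 1
        m = sum(x) / n
        if use_sample_sd:
            sd = (sum((xi - m) ** 2 for xi in x) / (n - 1)) ** 0.5  # sample s
        else:
            sd = 1.0                                                # known sigma
        stat = m / (sd / n ** 0.5)
        exceed += abs(stat) > crit
    return exceed / trials

z_rate = tail_rate(n=5, use_sample_sd=False)
t_rate = tail_rate(n=5, use_sample_sd=True)
print(f"known sigma: {z_rate:.3f}   estimated sigma: {t_rate:.3f}")
```

With σ known the rate is close to 0.05, as the normal table predicts. With s in place of σ the rate is noticeably higher, which is why small-sample tests use t critical values rather than z critical values.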

Test Statistics and Rejection Regions
CASE I: A Normal Population with known standard deviation. Null hypothesis; test statistic value; alternative hypothesis; rejection region for level α test.
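For reference, the standard textbook form of this case is sketched below. The symbols are assumptions made here rather than taken from the slide: μ0 is the null value, x̄ the sample mean, n the sample size, and z_α the value with upper-tail area α under the standard normal curve.

```latex
\[
H_0:\ \mu = \mu_0, \qquad z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}
\]
\[
\begin{array}{ll}
\text{Alternative hypothesis} & \text{Rejection region for level } \alpha \text{ test} \\
H_a:\ \mu > \mu_0 & z \ge z_{\alpha} \\
H_a:\ \mu < \mu_0 & z \le -z_{\alpha} \\
H_a:\ \mu \ne \mu_0 & z \ge z_{\alpha/2} \ \text{or} \ z \le -z_{\alpha/2}
\end{array}
\]
```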

Test Statistics and Rejection Regions
CASE II: Large Sample Test (n > 30: the average is approximately normal and s is a good approximation for σ). Null hypothesis; test statistic value; alternative hypothesis; rejection region for level α test.
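A minimal sketch of the large-sample test follows, assuming the usual statistic z = (x̄ - μ0)/(s/√n) compared with standard normal critical values. The function name, the `alternative` labels and the data are all hypothetical, introduced only for illustration.

```python
from statistics import NormalDist, mean, stdev

def large_sample_z_test(sample, mu0, alpha=0.05, alternative="two-sided"):
    """Return (z, reject) for H0: mu = mu0, using s in place of sigma."""
    n = len(sample)
    z = (mean(sample) - mu0) / (stdev(sample) / n ** 0.5)
    std_normal = NormalDist()
    if alternative == "greater":        # Ha: mu > mu0
        reject = z >= std_normal.inv_cdf(1 - alpha)
    elif alternative == "less":         # Ha: mu < mu0
        reject = z <= std_normal.inv_cdf(alpha)
    elif alternative == "two-sided":    # Ha: mu != mu0
        reject = abs(z) >= std_normal.inv_cdf(1 - alpha / 2)
    else:
        raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")
    return z, reject

# Made-up sample of 40 values with mean 120 (illustration only).
data = [118 + (i % 5) for i in range(40)]
z_less, reject_less = large_sample_z_test(data, mu0=121, alternative="less")
z_same, reject_same = large_sample_z_test(data, mu0=120)
print(f"z = {z_less:.2f}, reject H0: {reject_less}")
```

The same function covers all three alternatives; only the comparison against the critical value changes.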

Test Statistics and Rejection Regions
CASE III: A Normal Population with unknown standard deviation. Null hypothesis; test statistic value; alternative hypothesis; rejection region for level α test.
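For reference, the standard textbook form of the one-sample t test is sketched below. The symbols are assumptions made here rather than taken from the slide: μ0 is the null value, s the sample standard deviation, and t_{α, n-1} the value with upper-tail area α under the t curve with n - 1 degrees of freedom.

```latex
\[
H_0:\ \mu = \mu_0, \qquad t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}
\]
\[
\begin{array}{ll}
\text{Alternative hypothesis} & \text{Rejection region for level } \alpha \text{ test} \\
H_a:\ \mu > \mu_0 & t \ge t_{\alpha,\,n-1} \\
H_a:\ \mu < \mu_0 & t \le -t_{\alpha,\,n-1} \\
H_a:\ \mu \ne \mu_0 & t \ge t_{\alpha/2,\,n-1} \ \text{or} \ t \le -t_{\alpha/2,\,n-1}
\end{array}
\]
```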