STAT 111 Introductory Statistics Lecture 10: Confidence Intervals and Hypothesis Tests June 8, 2004.


1 STAT 111 Introductory Statistics Lecture 10: Confidence Intervals and Hypothesis Tests June 8, 2004

2 Today’s Topics
–Confidence intervals revisited
–Margin of error for confidence intervals
–Introduction to hypothesis testing

3 Confidence Intervals Revisited A level C confidence interval for some population parameter θ is an interval [L, U] computed from sample data by a method that has probability C of producing an interval containing the true value of the parameter. In other words, P(L ≤ θ ≤ U) = C, where C can be 90%, 95%, 99%, etc.

4 Confidence Intervals The general form of a confidence interval is given by estimate ± margin of error The estimate is our guess for the value of the unknown population parameter θ. The margin of error shows how accurate we believe our guess is, based on the variability of the estimate.

5 Confidence Interval for a Population Mean Suppose we choose a simple random sample of size n from a population with unknown mean µ and known standard deviation σ. Then a level C confidence interval for µ is x̄ ± z*·σ/√n, where x̄ is the sample mean and z* satisfies
–P(−z* ≤ Z ≤ z*) = C
–P(Z > z*) = (1 − C)/2

6 Confidence Interval for a Parameter [Figure: curve over a horizontal Z axis centered at 0.]

7 Confidence Interval for a Population Mean Recall the Central Limit Theorem. Suppose we have any population whose distribution has mean µ and standard deviation σ. If we draw a large enough SRS of size n from this population, then the sample mean x̄ is approximately normally distributed with mean µ and standard deviation σ/√n. This is true regardless of what the actual population distribution is.

8 Confidence Interval for a Population Mean Hence, if the population follows a normal distribution, or the sample size is sufficiently large, we have P(−z* ≤ (x̄ − µ)/(σ/√n) ≤ z*) = C. This leads to P(x̄ − z*·σ/√n ≤ µ ≤ x̄ + z*·σ/√n) = C.

9 Confidence Intervals for a Population Mean For any confidence interval, there are two possibilities:
–The interval contains the true value of the parameter (in this case, µ).
–Our SRS was one of the few samples for which µ is not contained in the interval.
It is incorrect to say that there is probability C that the unknown population parameter (µ) lies within our particular confidence interval.

10 Confidence Interval for a Population Mean A level C confidence interval means that if we repeatedly sample from the population, then the true population mean µ will be covered by the constructed confidence intervals (100C)% of the time. Remember! It is incorrect to say that the probability that the true population mean µ lies within the confidence interval is C. (The original slide linked to a Java applet for demonstrating confidence intervals.)
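The repeated-sampling interpretation can be checked with a small simulation. The sketch below is not part of the lecture: the function name `coverage` and the values µ = 50, σ = 10, n = 25 are illustrative assumptions, and the population is assumed to be normal. It builds many intervals of the form x̄ ± z*·σ/√n and counts how often they contain µ.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def coverage(mu, sigma, n, confidence, trials, seed=0):
    """Fraction of simulated level-C intervals that contain the true mean mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n and compute its mean.
        xbar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / trials

rate = coverage(mu=50, sigma=10, n=25, confidence=0.95, trials=10_000)
print(f"fraction of intervals covering mu: {rate:.3f}")
```

The printed fraction should be close to 0.95. Any single interval either covers µ or does not; the 95% describes the method over many samples.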

11 [Figure: a confidence interval, labeling the lower confidence limit, the upper confidence limit, and the width on each side (margin of error).]

12 Commonly Used Confidence Levels
Confidence level (C)    1 − C    (1 − C)/2    z* = z_((1−C)/2)
99%                     .01      .005         2.575
98%                     .02      .01          2.33
95%                     .05      .025         1.96
90%                     .10      .05          1.645
80%                     .20      .1           1.28
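The critical values in this table can be reproduced from the standard normal distribution. The following is a minimal sketch using Python's standard library; the helper name `z_star` is an assumption introduced here for illustration.

```python
from statistics import NormalDist

def z_star(confidence: float) -> float:
    """Return z* with P(-z* <= Z <= z*) = confidence for a standard normal Z."""
    upper_tail = (1 - confidence) / 2          # area to the right of z*
    return NormalDist().inv_cdf(1 - upper_tail)

for c in (0.99, 0.98, 0.95, 0.90, 0.80):
    print(f"{c:.0%}  {z_star(c):.3f}")
```

The computed values agree with the table up to rounding; the table's 2.575 and 2.33 are the rounded values commonly read from printed normal tables.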

13 Example 1 The number and types of television programs and commercials targeted at children are affected by the amount of time children watch TV. A survey was conducted among 100 American children, in which they were asked to record the number of hours they watched TV per week. The sample mean is 27.191. The known population standard deviation is 8. Estimate the average watch time at a 95% confidence level.
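One way to work this example is sketched below, assuming the known-σ interval x̄ ± z*·σ/√n. The function name `z_interval` is hypothetical; the numbers come from the example.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence):
    """Level-C confidence interval for a mean when sigma is known."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sigma / sqrt(n)  # margin of error
    return xbar - margin, xbar + margin

# Example 1: n = 100 children, sample mean 27.191 hours, sigma = 8.
lower, upper = z_interval(xbar=27.191, sigma=8, n=100, confidence=0.95)
print(f"95% CI for mean weekly TV hours: ({lower:.3f}, {upper:.3f})")
```

With z* ≈ 1.96 the margin of error is about 1.96 × 8/10 ≈ 1.568 hours, giving roughly (25.623, 28.759).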

14 Example 2 A study of preferred height for an experimental keyboard with large forearm-wrist support was conducted. 31 trained typists were selected, and the preferred keyboard height was determined for each of them. The resulting sample average height was 80 cm. Assume the preferred height is normally distributed with σ = 2 cm. Calculate a 90% confidence interval for µ, the true average preferred height for the population.

15 Example 3 Suppose we desire a confidence interval for the true average stray-load loss µ (in watts) for a certain type of induction motor when the line current is held at 10 amps for a speed of 1500 rpm. Assume that stray-load loss is normally distributed with σ = 3.0. If a sample of size 100 produces a mean stray-load loss of 58.3, compute a 99% confidence interval for µ.

16 Example 4 The yield point of a particular type of mild steel-reinforcing bar is known to be normally distributed with σ = 100. The composition of the bar has been slightly modified without affecting either the normality or the value of σ. If a sample of 25 modified bars results in a sample average yield point of 8439 lb, compute a 92% confidence interval for the true average yield point of the modified bar.
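Examples 2 through 4 follow the same pattern, including the less common 92% level in Example 4, whose z* is not in the table of commonly used levels. The sketch below assumes the known-σ interval x̄ ± z*·σ/√n; `z_interval` and the dictionary layout are illustrative choices, not part of the lecture.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence):
    """Level-C confidence interval for a mean when sigma is known."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sigma / sqrt(n)
    return xbar - margin, xbar + margin

# (sample mean, sigma, sample size, confidence level) taken from each example
examples = {
    "Example 2": (80, 2, 31, 0.90),       # preferred keyboard height, cm
    "Example 3": (58.3, 3.0, 100, 0.99),  # stray-load loss, watts
    "Example 4": (8439, 100, 25, 0.92),   # yield point, lb
}

results = {name: z_interval(*args) for name, args in examples.items()}
for name, (lower, upper) in results.items():
    print(f"{name}: ({lower:.2f}, {upper:.2f})")
```

For the 92% interval, z* is the point with upper-tail area (1 − 0.92)/2 = 0.04, which is approximately 1.75.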

17 Confidence Intervals (cont.) Confidence intervals for other parameters in a population can also be constructed. In particular, confidence intervals can be constructed for the standard deviation/variance of a population whose distribution has known mean µ. They can also be constructed for the proportion p with which some event occurs in a population. (More on this one later on.)

18 Margin of Error of a Confidence Interval The margin of error m is m = z*·σ/√n. The margin of error measures the precision of our estimate, but covers only random sampling errors. The size of the margin of error depends on
–Confidence level
–Sample size
–Population standard deviation

19 Confidence Interval The length (width) of a confidence interval is 2m = 2z*·σ/√n. The length (width) of a confidence interval increases if the margin of error increases. The width of a confidence interval increases if
–Confidence level increases
–Sample size decreases
–Population standard deviation increases
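The three effects on width can be seen numerically. This is a small sketch assuming the known-σ interval, whose width is 2·z*·σ/√n; the function name `ci_width` and the baseline values (σ = 8, n = 100, 95%) are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

def ci_width(sigma, n, confidence):
    """Width (twice the margin of error) of a known-sigma interval for a mean."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return 2 * z_star * sigma / sqrt(n)

base = ci_width(sigma=8, n=100, confidence=0.95)
higher_conf = ci_width(sigma=8, n=100, confidence=0.99)   # wider
larger_n = ci_width(sigma=8, n=400, confidence=0.95)      # narrower
larger_sigma = ci_width(sigma=16, n=100, confidence=0.95) # wider

print(f"baseline      {base:.3f}")
print(f"99% level     {higher_conf:.3f}")
print(f"n = 400       {larger_n:.3f}")
print(f"sigma = 16    {larger_sigma:.3f}")
```

Because the width depends on 1/√n, quadrupling the sample size only halves the width, while doubling σ doubles it.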

20 Choosing the Sample Size Fixing the confidence level, a confidence interval for a population mean will have a specified margin of error m when the sample size is n = (z*·σ/m)². By achieving a specified margin of error, we can estimate the mean to within m units.
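The sample-size rule follows from solving the margin of error for n. The short derivation below assumes the known-σ margin of error m = z*·σ/√n.

```latex
m = z^{*}\,\frac{\sigma}{\sqrt{n}}
\;\Longrightarrow\;
\sqrt{n} = \frac{z^{*}\sigma}{m}
\;\Longrightarrow\;
n = \left(\frac{z^{*}\sigma}{m}\right)^{2}
```

Since n must be a whole number, the result is rounded up so that the margin of error does not exceed m.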

21 Example 1 To estimate the amount of lumber that can be harvested in a tract of land, the mean diameter of trees in the tract must be estimated to within one inch with 99% confidence. What sample size should be taken? (Assume diameters are normally distributed with σ = 6 inches.)

22 Example 2 Suppose that the standard deviation of the salaries of a population of individuals is 30K. How many individuals do we need to sample so that the 90% CI has a margin of error of no more than 5K?

23 Example 3 Monitoring of a computer time-sharing system has suggested that response time to a particular command is normally distributed with σ = 25 ms. A new operating system is installed, and we wish to estimate the true average response time µ for the new environment. Assuming that response times are still normally distributed with σ = 25, what sample size is necessary to ensure that the resulting 95% confidence interval has a width of at most 10?
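The three sample-size examples can be worked with one helper. This is a sketch assuming n = (z*·σ/m)² rounded up to the next whole number; the name `sample_size` is hypothetical. Note that Example 3 specifies a width of at most 10, which corresponds to a margin of error of at most 5.

```python
from math import ceil
from statistics import NormalDist

def sample_size(sigma, margin, confidence):
    """Smallest n whose known-sigma interval has margin of error <= margin."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z_star * sigma / margin) ** 2)

n_trees = sample_size(sigma=6, margin=1, confidence=0.99)      # Example 1
n_salaries = sample_size(sigma=30, margin=5, confidence=0.90)  # Example 2, in K
n_response = sample_size(sigma=25, margin=10 / 2, confidence=0.95)  # Example 3

print(n_trees, n_salaries, n_response)
```

Under these assumptions the required sample sizes are 239 trees, 98 individuals, and 97 response-time measurements.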

24 Cautions on CI for Population Mean The data must be an SRS from the population. Formula is incorrect for more complex probability sampling designs. Formula requires carefully produced data. Confidence interval is not resistant to outliers. When sample size is small, examine data for skewness and other signs of non-normality. Formula requires standard deviation of population to be known, which is not realistic in practice.

25 Introduction: Hypothesis Testing Confidence intervals are one of the two most common types of formal statistical inference. We prefer confidence intervals when our goal is to estimate a population parameter. The second common type of inference, hypothesis testing, is used when we want to assess the evidence provided by the data in favor of some claim (hypothesis) about the population.

26 Hypothesis Testing Examples of claims to which hypothesis testing can be applied:
–Are less than 10% of all circuit boards produced by a particular manufacturer defective?
–Is the true average inside diameter of a certain type of pipe 0.75 cm?
–Does one type of twine have a higher average breaking strength than a second type of twine?
–For a pharmaceutical company, is a new drug effective for a certain disease?

27 Hypothesis Testing The hypothesis is a statement about the parameters in a population or model. The results of a test are expressed in terms of a probability that measures how well the data and the hypothesis agree. In hypothesis testing, we need to set up two hypotheses:
–The null hypothesis H₀
–The alternative hypothesis Hₐ (sometimes denoted H₁)

28 Hypothesis Testing The null hypothesis is the claim which is initially favored or believed to be true. The null hypothesis is also the claim that we will try to find evidence against. Usually the null hypothesis is a statement of “no effect” or “no difference.” The test of significance is designed to assess the strength of the evidence against the null hypothesis.

29 Hypothesis Testing The alternative hypothesis is the claim that we hope or suspect is true instead of H₀. We often begin with the alternative hypothesis Hₐ and then set up H₀ as the statement that the hoped-for effect is not present. Stating Hₐ is often a difficult task. Hypotheses in general refer to some population or model and not to any particular outcome.

30 Hypothesis Testing The alternative hypothesis Hₐ can be either one-sided or two-sided.
One-sided alternative hypotheses:
–μ > 0
–p < 0.5
–σ < 2
Two-sided alternative hypotheses:
–μ ≠ 0
–p ≠ 0.5
–σ ≠ 2

31 Example Experiments on learning in animals sometimes measure how long it takes a mouse to find its way through a maze. The mean time is 18 seconds for one particular maze. A researcher thinks that a loud noise will cause the mice to complete the maze faster. She measures how long each of 10 mice takes with a noise as stimulus. Let μ be the mean time of mice to find their way through a particular maze when noise is presented as a stimulus.
–H₀: μ = 18
–Hₐ: μ < 18 (one-sided Hₐ)

32 Example Does more than half of the American population have faith in the economy? 100,000 Americans are sampled. Let p be the population proportion of people who have faith in the economy.
–H₀: p ≤ 0.5
–Hₐ: p > 0.5 (one-sided Hₐ)

33 Example The Census Bureau reports that households spend an average of 31% of their total spending on housing. A homebuilders association in Cleveland wonders if the national finding applies in their area. They interview a sample of 40 households in the Cleveland metropolitan area to learn what percent of their spending goes toward housing. Let μ be the mean percent of spending of households in Cleveland on housing.
–H₀: μ = 0.31
–Hₐ: μ ≠ 0.31 (two-sided Hₐ)

34 Example Does one type of twine have a higher average breaking strength than a second type of twine? Let μ₁ be the average breaking strength of the first type of twine, and let μ₂ be the average breaking strength of the second type.
–H₀: μ₁ = μ₂
–Hₐ: μ₁ ≠ μ₂ (two-sided Hₐ)

35 Hypothesis Testing The alternative hypothesis in general should express the hopes or suspicions we bring to the data. We should not, however, look first at the data and then frame Hₐ to fit what the data show. Use a two-sided alternative unless you have a specific direction firmly in mind beforehand. In some circles, it is argued that the two-sided alternative should always be used in testing.

