# Estimating Proportions with Confidence


Principal Idea: Suppose we survey 150 randomly selected students and 41% think marijuana should be legalized. If we report that between 33% and 49% of all students at the college think that marijuana should be legalized, how confident can we be that we are correct? Confidence interval: an interval of estimates that is likely to capture the population value. Objective: learn how to calculate and interpret a confidence interval estimate of a population proportion. Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc.

10.1 The Language and Notation of Estimation
Unit: an individual person or object to be measured. Population (or universe): the entire collection of units about which we would like information or the entire collection of measurements we would have if we could measure the whole population. Sample: the collection of units we will actually measure or the collection of measurements we will actually obtain. Sample size: the number of units or measurements in the sample, denoted by n.

More Language and Notation of Estimation
Population proportion: the fraction of the population that has a certain trait/characteristic or the probability of success in a binomial experiment, denoted by $p$. The value of the parameter $p$ is not known. Sample proportion: the fraction of the sample that has a certain trait/characteristic, denoted by $\hat{p}$. The statistic $\hat{p}$ is an estimate of $p$. The Fundamental Rule for Using Data for Inference is that available data can be used to make inferences about a much larger group if the data can be considered to be representative with regard to the question(s) of interest.

10.2 Margin of Error
Media descriptions of margin of error: The difference between the sample proportion and the population proportion is less than the margin of error about 95% of the time, or for about 19 of every 20 sample estimates. The difference between the sample proportion and the population proportion is more than the margin of error about 5% of the time, or for about 1 of every 20 sample estimates.

Example 10.1 Teens and Interracial Dating
1997 USA Today/Gallup Poll of teenagers across the country: 57% of the 497 teens who go out on dates say they’ve been out with someone of another race or ethnic group. The reported margin of error for this estimate was about 4.5%. In surveys of this size, the difference between the sample estimate of 57% and the true percent is likely* to be less than 4.5% one way or the other. There is, however, a small chance that the sample estimate might be off by more than 4.5%. * The value of how ‘likely’ is often 95%.

Example 10.2 If I Won the Lottery …
If you won 10 million dollars in the lottery, would you continue to work or stop working? 1997 Gallup Poll: 59% of the 616 employed respondents said they would continue to work. Reported information about this poll: Results are based on telephone interviews with a randomly selected sample of 1014 adults, conducted Aug 22–25, 1997. Among this group, 616 are employed full-time or part-time. For results based on this sample of “workers,” one can say with 95% confidence that the error attributable to sampling could be plus or minus 4 percentage points.

10.3 Confidence Intervals
Confidence interval: an interval of values computed from sample data that is likely to include the true population value. Interpreting the confidence level: The confidence level is the probability that the procedure used to determine the interval will provide an interval that includes the population parameter. If we consider all possible randomly selected samples of the same size from a population, the confidence level is the fraction or percent of those samples for which the confidence interval includes the population parameter. Note: The confidence level is often expressed as a percent. Common levels are 90%, 95%, 98%, and 99%.

Constructing a 95% Confidence Interval for a Population Proportion
Sample estimate ± Margin of error. In the long run, about 95% of all confidence intervals computed in this way will capture the population value of the proportion, and about 5% of them will miss it. Be careful: The confidence level only expresses how often the procedure works in the long run. Any one specific interval either does or does not include the true unknown population value.
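The long-run interpretation can be checked with a small simulation. The sketch below is illustrative only: the population proportion (0.40), the sample size (500), and the number of repetitions are hypothetical choices, and the margin of error is assumed to be two standard errors of the sample proportion.

```python
import math
import random

def approx_95_ci(successes, n):
    """Approximate 95% CI: sample proportion +/- 2 standard errors."""
    p_hat = successes / n
    margin = 2 * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

def coverage_rate(p, n, repetitions, seed=0):
    """Fraction of simulated random samples whose interval captures p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repetitions):
        # Draw one random sample of size n and count the "successes".
        successes = sum(rng.random() < p for _ in range(n))
        low, high = approx_95_ci(successes, n)
        if low <= p <= high:
            hits += 1
    return hits / repetitions

# Hypothetical population: 40% have the trait; samples of 500 units.
rate = coverage_rate(p=0.40, n=500, repetitions=2000)
print(f"Intervals capturing p: {rate:.1%}")
```

The printed fraction should land near 95%, while any single simulated interval either contains 0.40 or does not.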

Example 10.1 Teens and Interracial Dating (cont)
Poll: 57% of dating teens sampled had gone out with somebody of another race/ethnic group. Margin of error was 4.5%. 95% Confidence Interval: 57% ± 4.5%, or 52.5% to 61.5%. We have 95% confidence that somewhere between 52.5% and 61.5% of all American teens who date have gone out with somebody of another race or ethnic group.

Example 10.2 Winning the Lottery and Work (cont)
Poll: 40% of employed workers sampled would quit working if they won the lottery. Margin of error was 4%. 95% Confidence Interval Estimate: Sample estimate ± Margin of error = 40% ± 4%, or 36% to 44%. With 95% confidence, somewhere between 36% and 44% of working Americans would say they would quit working if they won \$10 million in the lottery. The interval does not cover 50%, so it appears that fewer than half of all working Americans think they would quit if they won the lottery.

10.4 Calculating A Margin of Error for 95% Confidence
For a 95% confidence level, the approximate margin of error for a sample proportion is

$$2\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

Note: The “95% margin of error” is simply two standard errors, or 2 s.e.($\hat{p}$).
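As a quick check of the two-standard-error rule, here is a minimal sketch applied to the numbers from Example 10.1 (57% of 497 teens). The function name `margin_of_error_95` is made up for illustration.

```python
import math

def margin_of_error_95(p_hat, n):
    """Approximate 95% margin of error: two standard errors of p_hat."""
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    return 2 * standard_error

# Example 10.1: 57% of the 497 dating teens sampled.
moe = margin_of_error_95(0.57, 497)
print(f"margin of error: {moe:.3f}")
```

The result is a little over 0.044, in line with the "about 4.5%" reported for that poll.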

Factors that Determine Margin of Error
1. The sample size, $n$. When sample size increases, margin of error decreases.
2. The sample proportion, $\hat{p}$. If the proportion is close to either 1 or 0, most individuals have the same trait or opinion, so there is little natural variability and the margin of error is smaller than if the proportion is near 0.5.
3. The “multiplier” 2. It is connected to the “95%” aspect of the margin of error. Later you’ll learn that the exact value for 95% is 1.96 and how to change the multiplier to change the level.

Example 10.3 Pollen Count Must Be High
Poll: Random sample of 883 American adults. “Are you allergic to anything?” Results: 36% of the sample said “yes”, $\hat{p} = .36$. 95% Confidence Interval: .36 ± .032, or about .33 to .39. We can be 95% confident that somewhere between 33% and 39% of all adult Americans have allergies.

The Conservative Estimate of Margin of Error
Conservative estimate of the margin of error = $\dfrac{1}{\sqrt{n}}$. It usually overestimates the actual size of the margin of error. It works (conservatively) for all survey questions based on the same sample size, even if the sample proportions differ from one question to the next. It is obtained when $\hat{p} = .5$ in the margin of error formula.

Example 10.3 Really Bad Allergies (cont)
Poll: Random sample of 883 American adults. 3% of the sample experience “severe” symptoms. 95% (conservative) Confidence Interval: 3% ± 3.4%, or -0.4% to 6.4%. When $\hat{p}$ is far from .5, the conservative margin of error is too conservative. The 95% margin of error using $\hat{p} = .03$ is just .011 or 1.1%, for an interval from 1.9% to 4.1%.
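The gap between the conservative margin and the two-standard-error margin can be seen by computing both for the allergy poll (n = 883). This is a sketch only; the function names are illustrative.

```python
import math

def conservative_margin(n):
    """Conservative 95% margin of error, 1 / sqrt(n) (the p_hat = .5 case)."""
    return 1 / math.sqrt(n)

def margin_95(p_hat, n):
    """Approximate 95% margin of error: two standard errors of p_hat."""
    return 2 * math.sqrt(p_hat * (1 - p_hat) / n)

n = 883
for p_hat in (0.36, 0.03):
    print(
        f"p_hat = {p_hat:.2f}: "
        f"conservative = {conservative_margin(n):.3f}, "
        f"two standard errors = {margin_95(p_hat, n):.3f}"
    )
```

For $\hat{p} = .36$ the two margins are close (about .034 versus .032), while for $\hat{p} = .03$ the conservative value is roughly three times too large.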

10.5 General Theory of CIs for a Proportion
Developing the 95% confidence level: From the sampling distribution of $\hat{p}$ we have, for 95% of all samples, $-2 \text{ standard deviations} < \hat{p} - p < 2 \text{ standard deviations}$. We don’t know the true standard deviation, so we use the standard error. For approximately 95% of all samples, $-2 \text{ standard errors} < \hat{p} - p < 2 \text{ standard errors}$, which implies that for approximately 95% of all samples, $\hat{p} - 2 \text{ standard errors} < p < \hat{p} + 2 \text{ standard errors}$.

General Description of the Approximate 95% CI for a Proportion
Approximate 95% CI for the population proportion: $\hat{p} \pm 2$ standard errors. The standard error is $\text{s.e.}(\hat{p}) = \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$. Interpretation: For about 95% of all randomly selected samples from the population, the confidence interval computed in this manner captures the population proportion. Necessary Conditions: $n\hat{p}$ and $n(1-\hat{p})$ are both greater than 10, and the sample is randomly selected.

General Format for Confidence Intervals
For any confidence level, a confidence interval for either a population proportion or a population mean can be expressed as Sample estimate ± Multiplier × Standard error. The multiplier is affected by the choice of confidence level.

Note: Increasing the confidence level gives a larger multiplier. The multiplier, denoted as z*, is the standardized score such that the area between -z* and z* under the standard normal curve corresponds to the desired confidence level.
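The multiplier can be looked up in a normal table or computed directly. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `multiplier` is made up for illustration. If the area between -z* and z* is the confidence level, then the area to the left of z* is (1 + level) / 2.

```python
from statistics import NormalDist

def multiplier(confidence_level):
    """z* such that the area between -z* and z* equals confidence_level."""
    area_left_of_z_star = (1 + confidence_level) / 2
    return NormalDist().inv_cdf(area_left_of_z_star)

for level in (0.50, 0.90, 0.95, 0.98, 0.99):
    print(f"{level:.0%} confidence: z* = {multiplier(level):.3f}")
```

The values grow with the confidence level, and the 95% value is close to 1.96, which the slides round to 2.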

Formula for a Confidence Interval for a Population Proportion p
$$\hat{p} \pm z^{*}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

$\hat{p}$ is the sample proportion. $z^{*}$ denotes the multiplier. $\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$ is the standard error of $\hat{p}$.

Example 10.6 Intelligent Life Elsewhere?
Poll: Random sample of 935 Americans. “Do you think there is intelligent life on other planets?” Results: 60% of the sample said “yes”, $\hat{p} = .60$. 90% Confidence Interval: .60 ± 1.65(.016), or .60 ± .026. 98% Confidence Interval: .60 ± 2.33(.016), or .60 ± .037. Note: The entire interval is above 50%, so we can have high confidence that a majority believe there is intelligent life.

Example 10.6 Intelligent Life Elsewhere? (cont)
Poll: Random sample of 935 Americans. “Do you think there is intelligent life on other planets?” Results: 60% of the sample said “yes”, $\hat{p} = .60$. We want a 50% confidence interval. If the area between -z* and z* is .50, then the area to the left of z* is .75. From Table A.1 we have z* ≈ .67. 50% Confidence Interval: .60 ± .67(.016), or .60 ± .011. Note: A lower confidence level results in a narrower interval.
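The intervals in Example 10.6 can be reproduced with a short function. This is a sketch only: `proportion_ci` is an illustrative name, and the multiplier comes from `statistics.NormalDist` rather than a printed table, so the last digit may differ slightly from the slide values.

```python
import math
from statistics import NormalDist

def proportion_ci(p_hat, n, confidence_level):
    """Confidence interval p_hat +/- z* x s.e.(p_hat) for a proportion."""
    z_star = NormalDist().inv_cdf((1 + confidence_level) / 2)
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = z_star * standard_error
    return p_hat - margin, p_hat + margin

# Example 10.6: 60% of 935 sampled Americans said "yes".
for level in (0.50, 0.90, 0.98):
    low, high = proportion_ci(0.60, 935, level)
    print(f"{level:.0%} CI: {low:.3f} to {high:.3f}")
```

The 50% interval is the narrowest and the 98% interval the widest, matching the note that lower confidence gives a narrower interval.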

Conditions for Using the Formula
1. The sample is randomly selected from the population. Note: Available data can be used to make inferences about a much larger group if the data can be considered to be representative with regard to the question(s) of interest.
2. The normal curve approximation to the distribution of possible sample proportions assumes a “large” sample size. Both $n\hat{p}$ and $n(1-\hat{p})$ should be at least 10 (although some say these need only be at least 5).
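The sample-size condition is easy to check in code. In this minimal sketch the function name and the `threshold` parameter are illustrative, and the second call uses a hypothetical sample of 100. Random selection cannot be checked this way and still has to be judged from how the data were collected.

```python
def normal_approximation_ok(p_hat, n, threshold=10):
    """True when both n*p_hat and n*(1 - p_hat) reach the threshold."""
    return n * p_hat >= threshold and n * (1 - p_hat) >= threshold

# Severe-allergy result from Example 10.3: p_hat = .03, n = 883.
print(normal_approximation_ok(0.03, 883))

# Hypothetical smaller survey with the same p_hat: n * p_hat is only 3.
print(normal_approximation_ok(0.03, 100))
```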

10.6 Choosing a Sample Size
A table gives the 95% conservative margin of error for various sample sizes n. Important features:
1. When sample size is increased, margin of error decreases.
2. When a large sample size is made even larger, the improvement in accuracy is relatively small.
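Both features follow from the conservative formula $1/\sqrt{n}$. The sketch below tabulates it for a few sample sizes (hypothetical values, not necessarily those in the slide's table) and inverts it to find the sample size needed for a target margin.

```python
import math

def conservative_margin(n):
    """Conservative 95% margin of error for a sample of size n."""
    return 1 / math.sqrt(n)

def sample_size_for(margin):
    """Smallest n whose conservative margin is at most `margin`.

    Floating-point rounding may matter when 1 / margin**2 is a whole number.
    """
    return math.ceil(1 / margin ** 2)

for n in (100, 400, 1000, 2500, 10000):
    print(f"n = {n:>5}: margin of error = {conservative_margin(n):.3f}")

print("n needed for a 3% margin:", sample_size_for(0.03))
```

Going from 100 to 400 units halves the margin (10% to 5%), but going from 2500 to 10000 only moves it from 2% to 1%.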

The Effect of Population Size
For most surveys, the number of people in the population has almost no influence* on the accuracy of sample estimates. Margin of error for a sample size of 1000 is about 3% whether the number of people in the population is 30,000 or 200 million. * As long as the population is at least ten times as large as the sample.
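One way to see this numerically is the finite population correction, a standard adjustment that the slides do not cover and that is brought in here only as an assumption for illustration: the standard error is multiplied by $\sqrt{(N-n)/(N-1)}$, where $N$ is the population size.

```python
import math

def conservative_margin_finite(n, population_size):
    """Conservative 95% margin with a finite population correction.

    The correction factor sqrt((N - n) / (N - 1)) is an assumption added
    for illustration; it is not part of the slides' formulas.
    """
    correction = math.sqrt((population_size - n) / (population_size - 1))
    return correction / math.sqrt(n)

for population_size in (30_000, 200_000_000):
    margin = conservative_margin_finite(1000, population_size)
    print(f"N = {population_size:>11,}: margin of error = {margin:.4f}")
```

Both margins round to about 3%, consistent with the claim that population size hardly matters once the population is much larger than the sample.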

10.7 Using Confidence Intervals to Guide Decisions
Principle 1. A value not in a confidence interval can be rejected as a possible value of the population proportion. A value in a confidence interval is an “acceptable” possibility for the value of a population proportion. Principle 2. When the confidence intervals for proportions in two different populations do not overlap, it is reasonable to conclude that the two population proportions are different.

Example 10.7 Which Drink Tastes Better?
Taste Test: A sample of 60 people taste both drinks and 55% like the taste of Drink A better than Drink B. Makers of Drink A want to advertise these results. Makers of Drink B make a 95% confidence interval for the population proportion who prefer Drink A. 95% Confidence Interval: $.55 \pm 2\sqrt{\dfrac{.55(.45)}{60}}$, or about .55 ± .13, which is .42 to .68. Note: Since .50 is in the interval, there is not enough evidence to claim that Drink A is preferred by a majority of the population represented by the sample.
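Principle 1 applied to the taste test can be sketched as follows, assuming the approximate 95% interval of two standard errors; the names are illustrative.

```python
import math

def approx_95_ci(p_hat, n):
    """Approximate 95% CI: p_hat +/- 2 standard errors."""
    margin = 2 * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Taste test: 55% of 60 tasters preferred Drink A.
low, high = approx_95_ci(0.55, 60)
even_split_is_plausible = low <= 0.50 <= high

print(f"95% CI: {low:.2f} to {high:.2f}")
print("0.50 inside the interval:", even_split_is_plausible)
```

Because .50 falls inside the interval, an even split cannot be rejected, so the sample does not establish a majority preference for Drink A.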

Case Study 10.1 ESP Works with Movies
ESP Study by Bem and Honorton (1994): Subjects (receivers) described what another person (sender) was seeing on a screen. Receivers were shown 4 pictures and asked to pick which they thought the sender had actually seen. The actual image shown was randomly picked from the 4 choices. The image was either a single, “static” image or a “dynamic” short video clip, played repeatedly (the additional three choices shown were always of the same type as the actual image).

Case Study 10.1 ESP Works (cont)
Bem and Honorton (1994) ESP Study Results: Is there enough evidence to say that the percentage of correct guesses for dynamic pictures is significantly above 25%? The 95% CI for dynamic pictures lies above 25%, so we can claim that the true percentage of correct guesses is significantly better than what would occur from random guessing.

Case Study 10.2 Nicotine Patches vs Zyban
Study (New England Journal of Medicine, 3/4/99): 893 participants were randomly allocated to four treatment groups: placebo, nicotine patch only, Zyban only, and Zyban plus nicotine patch. Participants were blinded: all used a patch (nicotine or placebo) and all took a pill (Zyban or placebo). Treatments were used for nine weeks.

Case Study 10.2 Nicotine (cont)
Conclusions: Zyban is effective (no overlap of Zyban and no-Zyban CIs). The nicotine patch is not particularly effective (overlap of patch and no-patch CIs).

Case Study 10.3 What a Great Personality
Would you date someone with a great personality even though you did not find them attractive? Women: 61.1% of 131 answered “yes.” 95% confidence interval is 52.7% to 69.4%. Men: 42.6% of 61 answered “yes.” 95% confidence interval is 30.2% to 55%. Conclusions: A higher proportion of women would say yes. The CIs slightly overlap. The women’s CI is narrower than the men’s CI due to the larger sample size.
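The overlap check in this case study can be sketched in code. The sketch assumes the multiplier 1.96, which comes close to the reported intervals; the function names are illustrative.

```python
import math

def ci_95(p_hat, n, multiplier=1.96):
    """95% CI for a proportion using the multiplier 1.96."""
    margin = multiplier * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

def intervals_overlap(first, second):
    """True when two (low, high) intervals share at least one value."""
    return first[0] <= second[1] and second[0] <= first[1]

women = ci_95(0.611, 131)
men = ci_95(0.426, 61)

print(f"Women: {women[0]:.3f} to {women[1]:.3f}")
print(f"Men:   {men[0]:.3f} to {men[1]:.3f}")
print("Intervals overlap:", intervals_overlap(women, men))
```

The women's lower limit sits just below the men's upper limit, which is the slight overlap noted in the conclusions, and the women's interval is narrower because 131 is larger than 61.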

In Summary: Confidence Interval for a Population Proportion p
General CI for $p$: $\hat{p} \pm z^{*}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$

Approximate 95% CI for $p$: $\hat{p} \pm 2\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$

Conservative 95% CI for $p$: $\hat{p} \pm \dfrac{1}{\sqrt{n}}$
