# Confidence Intervals for Proportions


QTM1310/ Sharpe 11.1 A Confidence Interval Example: In March 2010, a Gallup Poll found that 1012 out of 2976 respondents thought economic conditions were getting better, a sample proportion of $\hat{p} = 1012/2976 = 34.0\%$. We’d like to use this sample proportion to say something about what proportion, $p$, of the entire population thinks the economic conditions are getting better.

11.1 A Confidence Interval Example (continued): We know that our sampling distribution model is centered at the true proportion, $p$, and we know the standard deviation of the sampling distribution is given by the formula $SD(\hat{p}) = \sqrt{pq/n}$, where $q = 1 - p$. We also know from the Central Limit Theorem that the shape of the sampling distribution is approximately Normal, and we can use $\hat{p}$ to find the standard error: $SE(\hat{p}) = \sqrt{\hat{p}\hat{q}/n}$, where $\hat{q} = 1 - \hat{p}$.

11.1 A Confidence Interval Example (continued): The sampling distribution model for $\hat{p}$ is Normal with mean $p$ and standard deviation estimated to be $SE(\hat{p}) = \sqrt{\hat{p}\hat{q}/n} = \sqrt{(0.34)(0.66)/2976} \approx 0.009$. Because the distribution is Normal, we expect that about 95% of all samples of 2976 U.S. adults would have had sample proportions within two SEs of $p$. That is, we are 95% sure that $\hat{p}$ is within 2 × (0.009) of $p$.
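The arithmetic on this slide can be checked with a minimal Python sketch. The function name `one_proportion_se` is a hypothetical helper chosen for illustration; only the counts 1012 and 2976 and the "two SEs" rule come from the slides.

```python
import math


def one_proportion_se(p_hat: float, n: int) -> float:
    """Standard error of a sample proportion: sqrt(p_hat * (1 - p_hat) / n)."""
    return math.sqrt(p_hat * (1 - p_hat) / n)


successes, n = 1012, 2976            # counts quoted in the poll example
p_hat = successes / n
se = one_proportion_se(p_hat, n)

# "Two SEs" is the rule of thumb the slide uses for roughly 95% coverage.
lower, upper = p_hat - 2 * se, p_hat + 2 * se

print(f"p_hat = {p_hat:.3f}, SE = {se:.4f}")
print(f"approximate 95% interval: ({lower:.3f}, {upper:.3f})")
```

Without intermediate rounding the interval comes out slightly narrower than the 32.2% to 35.8% quoted later, because the slides round the standard error up to 0.009 before doubling it.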

11.1 A Confidence Interval What Can We Say about a Proportion?
Here’s what we would like to be able to say:
• “34.0% of all U.S. adults thought the economy was improving.” There is no way to be sure that the population proportion is the same as the sample proportion.
• “It is probably true that 34.0% of all U.S. adults thought the economy was improving.” We can be pretty certain that whatever the true proportion is, it’s probably not exactly 34.0%.

11.1 A Confidence Interval What Can We Say about a Proportion?
• “We don’t know the exact proportion of U.S. adults who thought the economy was improving, but we know it is between 32.2% and 35.8%.” We can’t know for sure that the true proportion is in this interval.
• “We don’t know the exact proportion of U.S. adults who thought the economy was improving, but the interval from 32.2% to 35.8% probably contains the true proportion.” This is close to correct, but what is meant by “probably”?

11.1 A Confidence Interval What Can We Say about a Proportion?
An appropriate interpretation of our confidence interval would be, “We are 95% confident that between 32.2% and 35.8% of U.S. adults thought the economy was improving.” The confidence interval calculated and interpreted here is an example of a one-proportion z-interval.

11.1 A Confidence Interval What Does “95% Confidence” Really Mean?
What does it mean when we say we have 95% confidence that our interval contains the true proportion? Our uncertainty is about whether the particular sample we have at hand is one of the successful ones or one of the 5% that fail to produce an interval that captures the true value. We know the sample proportion varies from sample to sample. If other pollsters had collected samples, their confidence intervals would have been centered at the proportions they observed.

11.1 A Confidence Interval What Does “95% Confidence” Really Mean?
The slide’s figure shows the confidence intervals produced by simulating 20 samples. The purple dots are the simulated proportions of adults who thought the economy was improving. The orange segments show each sample’s confidence interval. The green line represents the true proportion of the entire population. Note: Not all confidence intervals capture the true proportion.
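A simulation like the one described can be sketched in Python. This is only an illustration under stated assumptions: the true proportion is never known in practice, so `p_true = 0.34` is an assumed value, and `simulate_intervals` and the fixed seed are hypothetical choices made so the run is repeatable.

```python
import math
import random


def simulate_intervals(p_true, n, num_samples, z_star=1.96, seed=0):
    """Draw repeated samples and return (lower, upper, captured) per sample."""
    rng = random.Random(seed)
    results = []
    for _ in range(num_samples):
        # Each respondent is a "success" with probability p_true.
        successes = sum(rng.random() < p_true for _ in range(n))
        p_hat = successes / n
        me = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
        lower, upper = p_hat - me, p_hat + me
        results.append((lower, upper, lower <= p_true <= upper))
    return results


# Assumed true proportion; the sample size matches the poll example.
intervals = simulate_intervals(p_true=0.34, n=2976, num_samples=20)
captured = sum(hit for _, _, hit in intervals)
print(f"{captured} of {len(intervals)} intervals capture the true proportion")
```

Each interval is centered at its own sample proportion, so the intervals move from sample to sample while the true proportion stays fixed. Over many repetitions roughly 95% of them should capture it.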

11.2 Margin of Error: Certainty vs. Precision
Our confidence interval can be expressed as $\hat{p} \pm 2\,SE(\hat{p})$. The extent of that interval on either side of $\hat{p}$ is called the margin of error (ME). The general confidence interval can now be expressed in terms of the ME: estimate $\pm$ ME.

11.2 Margin of Error: Certainty vs. Precision
The more confident we want to be, the larger the margin of error must be. We can be 100% confident that any proportion is between 0% and 100%, but we can’t be very confident that the proportion is between 33.98% and 34.02%. Every confidence interval is a balance between certainty and precision. Fortunately, we can usually be both sufficiently certain and sufficiently precise to make useful statements.

11.2 Margin of Error: Certainty vs. Precision
Critical Values: To change the confidence level, we’ll need to change the number of SEs to correspond to the new level. For any confidence level, the number of SEs we must stretch out on either side of $\hat{p}$ is called the critical value. Because a critical value is based on the Normal model, we denote it $z^*$.

11.2 Margin of Error: Certainty vs. Precision
Critical Values: A 90% confidence interval has a critical value of $z^* = 1.645$. That is, 90% of the values are within 1.645 standard deviations of the mean.
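Critical values for other confidence levels can be looked up from the standard Normal model. The sketch below uses Python’s standard-library `statistics.NormalDist`; the helper name `critical_value` is hypothetical.

```python
from statistics import NormalDist


def critical_value(confidence: float) -> float:
    """z* such that the central `confidence` share of a standard Normal lies within +/- z*."""
    tail = (1 - confidence) / 2          # area left in each tail
    return NormalDist().inv_cdf(1 - tail)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z* = {critical_value(level):.3f}")
```

The 95% value is about 1.96, which is why the earlier slides could use “two SEs” as a convenient approximation.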

11.3 Assumptions and Conditions
Is using a Normal model for the sampling distribution appropriate? Are the assumptions used reasonable? We must check our assumptions and the corresponding conditions before creating a confidence interval about a proportion.

11.3 Assumptions and Conditions
Independence Assumption: Is there any reason to believe that the data values somehow affect each other?
• Randomization Condition: Proper randomization can help ensure independence.
• 10% Condition: If the sample exceeds 10% of the population, the probability of a success changes so much during the sampling that a Normal model may no longer be appropriate.

11.3 Assumptions and Conditions
Sample Size Assumption: The sample size must be large enough for the Normal sampling model to be appropriate.
• Success/Failure Condition: We must expect our sample to contain at least 10 “successes” and at least 10 “failures.” So we check that both $n\hat{p} \ge 10$ and $n\hat{q} \ge 10$.
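The two numeric conditions can be checked mechanically; the Randomization Condition cannot, since it depends on how the data were collected. A minimal sketch follows, where `check_conditions` and its parameter names are hypothetical and the population size is optional because the slides do not always state one.

```python
def check_conditions(n, p_hat, population_size=None, min_count=10):
    """Check the 10% Condition and the Success/Failure Condition for a sample."""
    expected_successes = n * p_hat
    expected_failures = n * (1 - p_hat)
    return {
        "expected_successes": expected_successes,
        "expected_failures": expected_failures,
        # Both counts must reach min_count (10 in the slides).
        "success_failure": expected_successes >= min_count
        and expected_failures >= min_count,
        # None means "not checked" because no population size was supplied.
        "ten_percent": None
        if population_size is None
        else n <= 0.10 * population_size,
    }


# The poll example: 2976 respondents with a sample proportion of about 0.34.
print(check_conditions(n=2976, p_hat=0.34))
```

A result of `True` for the numeric checks does not by itself justify the interval. Whether the sample was drawn randomly still has to be judged from the study design.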

Guided Example: In the spring of 2009, workers at Sony France, protesting layoffs, took the boss hostage for a night and barricaded their factory entrance. He was released only after he agreed to reopen talks on severance packages. Similar incidents occurred at 3M and Caterpillar plants in France. These incidents have been nicknamed “bossnapping.” What did other French adults think of this practice? Were they sympathetic? Understanding? Approving?

Guided Example: A poll taken by Le Parisien in April 2009 found 45% of the French “supportive” of such action. A similar poll taken by Paris Match, April 2–3, 2009, found that 30% were “approving” and 63% were “understanding” or “sympathetic” of the action. Only 7% condemned the practice of “bossnapping.” The Paris Match poll was based on a random representative sample of 1010 adults.

Guided Example (continued):
What can we conclude about the proportion of all French adults who sympathize with the practice of “bossnapping”? First, check conditions.
• Randomization Condition: The sample was selected randomly.
• 10% Condition: The sample is certainly less than 10% of the population.
• Success/Failure Condition: $n\hat{p} = 1010 \times 0.63 \approx 636 \ge 10$ and $n\hat{q} = 1010 \times 0.37 \approx 374 \ge 10$.
The conditions are satisfied, so a one-proportion z-interval using the Normal model is appropriate.

Guided Example (continued):
A poll taken by Paris Match found 63% of 1010 French adults sympathized with the practice of “bossnapping.” What can we conclude about the proportion of all French adults who sympathize with the practice of “bossnapping”? Construct the 95% confidence interval: $SE(\hat{p}) = \sqrt{(0.63)(0.37)/1010} \approx 0.015$, so $ME = 1.96 \times 0.015 \approx 0.029$ and the interval is $0.63 \pm 0.029 = (0.601, 0.659)$.

Guided Example (continued):
A poll taken by Paris Match found 63% of 1010 French adults sympathized with the practice of “bossnapping.” What can we conclude about the proportion of all French adults who sympathize with the practice of “bossnapping”? Report conclusions. The polling agency l’Ifop surveyed 1010 French adults and asked whether they approved, were sympathetic to, or disapproved of recent bossnapping actions. Although we can’t know the true proportion of French adults who were sympathetic (without supporting outright), based on the survey we can be 95% confident that between 60.1% and 65.9% of all French adults were.
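The guided example’s interval can be reproduced with a short Python sketch of a one-proportion z-interval. The function name `one_proportion_z_interval` is hypothetical; the inputs 0.63 and 1010 are the figures reported for the Paris Match poll.

```python
import math
from statistics import NormalDist


def one_proportion_z_interval(p_hat, n, confidence=0.95):
    """Return (lower, upper) for p_hat +/- z* * sqrt(p_hat * (1 - p_hat) / n)."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    me = z_star * se
    return p_hat - me, p_hat + me


lower, upper = one_proportion_z_interval(p_hat=0.63, n=1010)
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```

Carrying full precision gives an interval of roughly 60.0% to 66.0%. The slide’s 60.1% to 65.9% appears to come from rounding the standard error to 0.015 and the margin of error to 0.029 along the way, so the small difference is a rounding effect rather than a different method.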

11.4 Choosing the Sample Size
To get a narrower confidence interval without giving up confidence, we must choose a larger sample. Suppose a company wants to offer a new service and wants to estimate, to within 3%, the proportion of customers who are likely to purchase this new service with 95% confidence. How large a sample do they need? To answer this question, we look at the margin of error: $ME = z^*\sqrt{\hat{p}\hat{q}/n}$, so $0.03 = 1.96\sqrt{\hat{p}\hat{q}/n}$. We see that this question can’t be answered yet because there are two unknown values, $\hat{p}$ and $n$.

11.4 Choosing the Sample Size
We proceed by guessing the worst-case scenario for $\hat{p}$. We guess $\hat{p}$ is 0.50 because this makes the SD (and therefore $n$) the largest. We may now compute $n$: from $0.03 = 1.96\sqrt{(0.5)(0.5)/n}$ we get $n = (1.96)^2(0.5)(0.5)/(0.03)^2 \approx 1067.1$. We can conclude that the company will need at least 1068 respondents to keep the margin of error as small as 3% with confidence level 95%.
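Inverting the margin-of-error formula for $n$ is a one-line calculation. The sketch below assumes the slide’s values ($z^* = 1.96$ and a worst-case guess of 0.5); `required_sample_size` is a hypothetical helper name.

```python
import math


def required_sample_size(margin_of_error, z_star=1.96, p_guess=0.5):
    """Smallest n with z_star * sqrt(p_guess * (1 - p_guess) / n) <= margin_of_error."""
    n = (z_star ** 2) * p_guess * (1 - p_guess) / margin_of_error ** 2
    return math.ceil(n)  # round up: a fractional respondent is not possible


print(required_sample_size(0.03))                 # worst case, p_guess = 0.5
print(required_sample_size(0.03, p_guess=0.2))    # smaller if p is believed near 0.2
```

Rounding up rather than to the nearest integer is deliberate, since rounding down would leave the margin of error slightly above the target.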

11.4 Choosing the Sample Size
Usually a margin of error of 5% or less is acceptable. However, to cut the margin of error in half, you will have to quadruple the sample size. The sample size in a survey is the number of respondents, not the number of questionnaires sent or phone numbers dialed, so increasing the sample size can dramatically increase the cost and time needed to collect the data.
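The quadrupling rule follows from the square root in the margin-of-error formula. A small Python sketch makes this visible; the sample sizes 250, 1000, and 4000 are arbitrary illustrative values, not figures from the slides.

```python
import math


def margin_of_error(p_hat, n, z_star=1.96):
    """ME = z_star * sqrt(p_hat * (1 - p_hat) / n)."""
    return z_star * math.sqrt(p_hat * (1 - p_hat) / n)


# Each step multiplies n by 4, so each margin of error is half the previous one.
for n in (250, 1000, 4000):
    print(f"n = {n:5d}: ME = {margin_of_error(0.5, n):.4f}")
```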

*11.5 A Confidence Interval for Small Samples
When the Success/Failure condition fails, we make a simple adjustment to the calculation that lets us make a confidence interval anyway. We add four synthetic observations, two to the successes and two to the failures, and use the adjusted proportion $\tilde{p} = (y + 2)/\tilde{n}$, where $y$ is the number of successes and $\tilde{n} = n + 4$.

*11.5 A Confidence Interval for Small Samples
Including the synthetic observations leads to a new adjusted interval: $\tilde{p} \pm z^*\sqrt{\tilde{p}(1 - \tilde{p})/\tilde{n}}$. This form gives better performance for proportions near zero or one. It also has the advantage that we do not need to check the Success/Failure condition.

*11.5 A Confidence Interval for Small Samples
A student studying the impact of Super Bowl ads wants to know what proportion of students on campus watched the Super Bowl. A random sample of 25 students reveals that all 25 watched the Super Bowl. This gives a $\hat{p}$ of 100% and a 95% confidence interval of (1.0, 1.0). Can she conclude that every student on her campus watched the Super Bowl?

*11.5 A Confidence Interval for Small Samples
Obviously the Success/Failure condition is violated, but she can use synthetic observations. Adding two successes and two failures, she can calculate $\tilde{p} = 27/29 \approx 0.931$ and the standard error $\sqrt{(0.931)(0.069)/29} \approx 0.047$. She can find the 95% confidence interval: 0.931 ± 1.96(0.047) = (0.839, 1.023). She can conclude with 95% confidence that between 83.9% and 102.3% (or 100%) of all students watched the Super Bowl.
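The adjusted interval for the Super Bowl example can be verified with a minimal Python sketch. The function name `plus_four_interval` is a hypothetical label for the adjustment the slides describe (two synthetic successes and two synthetic failures).

```python
import math


def plus_four_interval(successes, n, z_star=1.96):
    """Adjusted interval: add 2 successes and 2 failures before computing."""
    n_tilde = n + 4
    p_tilde = (successes + 2) / n_tilde
    se = math.sqrt(p_tilde * (1 - p_tilde) / n_tilde)
    me = z_star * se
    return p_tilde, se, p_tilde - me, p_tilde + me


p_tilde, se, lower, upper = plus_four_interval(successes=25, n=25)
print(f"p_tilde = {p_tilde:.3f}, SE = {se:.3f}")
print(f"95% CI: ({lower:.3f}, {upper:.3f})")

# A proportion cannot exceed 1, so the upper limit is reported as 100%.
print(f"reported upper limit: {min(upper, 1.0):.3f}")
```

The raw upper limit exceeds 1 because the Normal-based formula knows nothing about the natural bounds of a proportion, which is why the slide reads the upper end as 100%.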

Be sure to use the right language to describe your confidence intervals. Your uncertainty is about the interval, not the true proportion.
• Don’t suggest that the parameter varies. The population parameter is fixed; it is the interval that varies from sample to sample.
• Don’t claim that other samples will agree with yours. There is nothing special about your sample; it doesn’t set the standard for other samples.
• Don’t be certain about the parameter. Do not assert that the population parameter cannot be outside an interval.

• Don’t forget: It’s about the parameter. We are interested in $p$, not $\hat{p}$.
• Don’t claim to know too much.
• Do take responsibility. You must accept the responsibility and consequences of the fact that not all the intervals you compute will capture the true population value.

Violations of Assumptions
• Watch out for biased sampling. Don’t forget the sources of bias in surveys.
• Think about independence. It is tough to check the assumption that values in a sample are mutually independent, but it pays to think about it.
• Be careful of sample size. The validity of the confidence interval for proportions may be affected by sample size.

What Have We Learned? Construct a confidence interval for a proportion, $p$, as the statistic, $\hat{p}$, plus and minus a margin of error.
• The margin of error consists of a critical value based on the sampling model times a standard error based on the sample.
• The critical value is found from the Normal model.
• The standard error of a sample proportion is calculated as $SE(\hat{p}) = \sqrt{\hat{p}\hat{q}/n}$.

What Have We Learned? Interpret a confidence interval correctly.
• You can claim to have the specified level of confidence that the interval you have computed actually covers the true value.
Understand the importance of the sample size, $n$, in improving both the certainty (confidence level) and precision (margin of error).
• For the same sample size and proportion, more certainty requires less precision and more precision requires less certainty.

What Have We Learned? Know and check the assumptions and conditions for finding and interpreting confidence intervals.
• Independence Assumption or Randomization Condition
• 10% Condition
• Success/Failure Condition
Be able to invert the calculation of the margin of error to find the sample size required, given a proportion, a confidence level, and a desired margin of error.

Exercise 7: A consumer group hoping to assess customer experiences with auto dealers surveys 167 people who recently bought new cars; 3% of them expressed dissatisfaction with the salesperson. Identify the population, the sample, $p$, $\hat{p}$, and check conditions for creating a confidence interval.

Exercise 7: A consumer group hoping to assess customer experiences with auto dealers surveys 167 people who recently bought new cars; 3% of them expressed dissatisfaction with the salesperson. Identify the population, the sample, $p$, $\hat{p}$, and check conditions for creating a confidence interval.
• Population: All customers who recently bought new cars
• Sample: 167 people surveyed about their experience
• $p$: the true proportion of new car buyers who are dissatisfied with the salesperson
• $\hat{p}$: the proportion of new car buyers surveyed who are dissatisfied with the salesperson (3%)

Exercise 7 (continued): Check conditions for creating a confidence interval.
• Randomization Condition: It is unknown if the sample was selected randomly. The auto dealer may have used sampling methods with voluntary response or nonresponse bias.
• 10% Condition: The sample is certainly less than 10% of the population.
• Success/Failure Condition: Cannot use confidence interval methods introduced in Chapter 10 because the Success/Failure condition is not met (167 × 0.03 ≈ 5 expected “successes,” fewer than 10).

Exercise 19: Several factors are involved in the creation of a confidence interval. Among them are the sample size, the level of confidence, and the margin of error. Which of the following statements are true?
a) For a given sample size, higher confidence means a smaller margin of error.
b) For a specified confidence level, larger samples provide smaller margins of error.

Exercise 19: Which of the following statements are true?
a) For a given sample size, higher confidence means a smaller margin of error. This statement is false. If you desire higher confidence, the interval will be wider, providing a wider range of plausible values for the parameter.
b) For a specified confidence level, larger samples provide smaller margins of error. This statement is true.

Exercise 19 (continued): Several factors are involved in the creation of a confidence interval. Among them are the sample size, the level of confidence, and the margin of error. Which of the following statements are true?
c) For a fixed margin of error, larger samples provide greater confidence.
d) For a given confidence level, halving the margin of error requires a sample twice as large.

Exercise 19 (continued): Several factors are involved in the creation of a confidence interval. Among them are the sample size, the level of confidence, and the margin of error. Which of the following statements are true?
c) For a fixed margin of error, larger samples provide greater confidence. This statement is true.
d) For a given confidence level, halving the margin of error requires a sample twice as large. This statement is false. A sample size four times as large would be needed to produce a confidence interval half as wide.