# Inference for a Population Proportion

We are now interested in the unknown proportion p of a population that has some outcome – call the outcome we are looking for a “success.” The statistic that estimates the parameter p is the sample proportion p-hat:

$$\hat{p} = \frac{\text{count of successes}}{\text{count of observations}}$$

Reminders on the sampling distribution of p-hat…
The mean of the sampling distribution is p, so the sample proportion p-hat is an unbiased estimator of the population proportion p. If the sample size is large enough that both np and n(1-p) are at least 10, the distribution of p-hat is approximately Normal. The standard deviation of p-hat is

$$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$$
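The reminders above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names `p_hat_sd` and `normal_approx_ok` and the values of p and n are hypothetical, chosen only to show the arithmetic.

```python
import math


def p_hat_sd(p: float, n: int) -> float:
    """Standard deviation of the sampling distribution of p-hat: sqrt(p(1-p)/n)."""
    return math.sqrt(p * (1 - p) / n)


def normal_approx_ok(p: float, n: int) -> bool:
    """Rule of thumb from the slides: np and n(1-p) are both at least 10."""
    return n * p >= 10 and n * (1 - p) >= 10


# Hypothetical population proportion and sample size, for illustration only.
p, n = 0.3, 100
print(f"standard deviation of p-hat: {p_hat_sd(p, n):.4f}")
print(f"Normal approximation reasonable: {normal_approx_ok(p, n)}")
```

With p = 0.3 and n = 100, both np = 30 and n(1-p) = 70 clear the threshold of 10, so the Normal approximation is reasonable. With p = 0.05 and the same n, np = 5 and the check fails.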

Confidence Intervals for p
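p-hat will be close to p if n is large, so replace the standard deviation by the standard error of p-hat:

$$SE_{\hat{p}} = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

An approximate level C confidence interval for p is then $\hat{p} \pm z^* \, SE_{\hat{p}}$, where $z^*$ is the standard Normal critical value for confidence level C.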

Do smokers realize that smoking is bad for their health? Have most smokers tried to quit? The Harris Poll addressed smoking in a sample survey conducted by telephone in January 2000. Because Harris called residential telephone numbers at random, the sample (ignoring practical problems) was an SRS of smokers living in the US in households with telephone service. The sample size was n = 1010. Here is a finding from this sample survey: “Do you believe that smoking will probably shorten your life, or not?” 848 of 1010 said “Yes.” Construct and interpret a 95% confidence interval for the proportion of all American smokers who think that smoking will probably shorten their lives.
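One way to carry out the calculation for the Harris Poll data is sketched below. It assumes the usual large-sample interval, p-hat plus or minus z* times the standard error of p-hat. The function name `proportion_ci` is hypothetical, and z* is taken from Python's standard-library `statistics.NormalDist` rather than from a table.

```python
import math
from statistics import NormalDist


def proportion_ci(successes: int, n: int, confidence: float = 0.95):
    """Large-sample confidence interval for a population proportion."""
    p_hat = successes / n
    # Critical value z* leaving (1 - confidence)/2 in each tail.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Standard error: the standard deviation with p replaced by p-hat.
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = z_star * se
    return p_hat - margin, p_hat + margin


# Harris Poll data from the slide: 848 of 1010 smokers said "Yes."
low, high = proportion_ci(848, 1010)
print(f"95% CI: {low:.3f} to {high:.3f}")
```

The interval comes out to roughly 0.817 to 0.862. In context: we are 95% confident that between about 82% and 86% of all American smokers believe smoking will probably shorten their lives. The counts of successes and failures (848 and 162) are both well above 10, so the Normal approximation is reasonable here.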

Caution: The estimate may be biased if the people surveyed did not answer honestly. The usual problem is nonresponse: did those who did not answer the survey have different habits than those who did?

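Two ways to get p-star, the guessed value of p-hat used when planning a sample size:

1. Use a p-star based on pilot or past studies.
2. Use p-star = 0.5. This gives the largest margin of error, because 0.5 × 0.5 = 0.25 and no other value of p-star makes p-star × (1 − p-star) as large.

NOTE: 0.5 is a conservative guess. For any other value of p-star the margin of error will be smaller, so a sample size computed with 0.5 is always large enough.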
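A quick numerical check of why 0.5 is the conservative choice: the margin of error depends on the guess only through the product p-star × (1 − p-star), so it is enough to compare that product across a few values. The candidate values below are arbitrary and chosen only for illustration.

```python
# Arbitrary candidate guesses for p-star, for illustration only.
candidates = [0.1, 0.3, 0.5, 0.7, 0.9]

# The margin of error grows with p_star * (1 - p_star).
products = {p_star: p_star * (1 - p_star) for p_star in candidates}

for p_star, product in products.items():
    print(f"p-star = {p_star:.1f}  ->  p-star * (1 - p-star) = {product:.2f}")

# The guess that gives the largest product, and so the largest margin of error.
most_conservative = max(products, key=products.get)
print("largest product at p-star =", most_conservative)
```

The product peaks at 0.25 when p-star = 0.5 and falls off symmetrically on either side.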

Gloria Chavez and Ronald Flynn are the candidates for mayor. You are planning a sample survey to determine what percent of the voters plan to vote for Chavez. You will contact an SRS of voters, and you want to estimate p (the population proportion) with 95% confidence and a margin of error no greater than 3%. How large a sample do you need?
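A sketch of the sample-size calculation follows. It assumes the standard approach of solving z* × sqrt(p-star(1 − p-star)/n) ≤ m for n, which gives n ≥ (z*/m)² × p-star(1 − p-star), and it uses the conservative guess p-star = 0.5 because no pilot study is mentioned. The function name `required_sample_size` is hypothetical.

```python
import math
from statistics import NormalDist


def required_sample_size(margin: float, confidence: float = 0.95,
                         p_star: float = 0.5) -> int:
    """Smallest n whose margin of error is at most `margin` for the guess p_star."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n_exact = (z_star / margin) ** 2 * p_star * (1 - p_star)
    # Always round up: rounding down would leave the margin slightly too large.
    return math.ceil(n_exact)


# Mayoral poll from the slide: 95% confidence, margin of error at most 3%.
n_needed = required_sample_size(0.03)
print("voters to contact:", n_needed)
```

With these inputs the formula gives a little over 1067, so the survey needs at least 1068 voters.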

A New York Times poll on women’s issues interviewed 1025 women randomly selected from the United States, excluding Alaska and Hawaii. The poll found that 47% of the women said they do not get enough time for themselves.

1. The poll announced a margin of error of ±3 percentage points for 95% confidence in its conclusions. Do you agree with this margin of error? If so, explain why. If not, tell why not.
2. What is the 95% confidence interval from this poll result? Interpret the interval in context.
3. What conditions must be met in order for the confidence interval in Question 2 to be valid? Check whether each of those conditions is satisfied in this case.
4. Explain to someone who knows no statistics why we can’t just say that 47% of all adult women do not get enough time for themselves.
5. Explain clearly what “95% confidence” means.
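As a starting point for Question 1, the announced margin of error can be compared with the one implied by the large-sample formula. The sketch below assumes the poll can be treated as an SRS of 1025 women; the variable names are hypothetical.

```python
import math
from statistics import NormalDist

# Figures reported on the slide.
n = 1025
p_hat = 0.47

# 95% critical value and standard error of p-hat.
z_star = NormalDist().inv_cdf(0.975)
se = math.sqrt(p_hat * (1 - p_hat) / n)

# Margin of error implied by the large-sample formula.
margin = z_star * se
print(f"margin of error: {100 * margin:.1f} percentage points")
```

The computed margin is about 3.1 percentage points, which can then be weighed against the announced ±3.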