# Healey Chapter 7 Estimation Procedures

Healey Chapter 7 Estimation Procedures
Using the Sampling Distribution to Construct Confidence Intervals

Outline: The logic of estimation
How to construct and interpret confidence interval estimates for sample means and sample proportions

The Logic Behind Estimation
In estimation procedures, statistics calculated from random samples are used to estimate the value of population parameters. Example: If we know that 42% of a random sample drawn from a city vote Liberal, we can estimate the percentage of all city residents who vote Liberal.

Logic (cont.)
Information from samples is used to estimate information about the population. Statistics are used to estimate parameters: a statistic describes the sample, and a parameter describes the population.

Logic (cont.)
The sampling distribution is the link between the sample and the population. The values of the population parameters are unknown, but the characteristics of the sampling distribution are defined by theorems.
(Diagram: population, sampling distribution, sample.)

Two Estimation Procedures
1. A point estimate is a sample statistic used to estimate a population value: The London Free Press reports that “42% of a sample of randomly selected city residents voted Liberal.”
2. Confidence intervals (for means or proportions) consist of a range of values: “…between 38% and 46% of city residents voted Liberal.”

Bias and Efficiency
Bias: An estimator of a mean (or a proportion) is unbiased if the mean of its sampling distribution is equal to the population mean (or proportion).
Efficiency: The smaller the standard error (the standard deviation of the sampling distribution), the more tightly the sample outcomes are clustered about the mean of the sampling distribution. This is known as efficiency.

Sample Size and Efficiency
Standard error of the sampling distribution: σ_X̄ = σ/√N
Looking at the formula, we can see that as sample size N increases, the standard error (σ_X̄) decreases. The larger N is, the more efficient the estimate will be. With a larger sample, the estimate is likely to be closer to the true population mean.
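To make the relationship between N and the standard error concrete, here is a minimal sketch. The function name `standard_error` and the value σ = 3 are assumptions chosen only for illustration; they do not come from the slides.

```python
import math


def standard_error(sigma: float, n: int) -> float:
    """Standard error of the mean: sigma divided by the square root of N."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return sigma / math.sqrt(n)


# Hypothetical population standard deviation, used only for illustration.
SIGMA = 3.0

# Quadrupling N halves the standard error.
for n in (25, 100, 400, 1600):
    print(f"N = {n:>4}: standard error = {standard_error(SIGMA, n):.3f}")
```

Because N sits under a square root, cutting the standard error in half requires four times as many cases.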

Confidence Levels
Our level of confidence has to be converted into a Z-score that we will then use in our formula to find the confidence interval. The 95% confidence level means that we are willing to accept a 5% probability of being wrong (alpha (α) = .05). This probability (an area under the curve) is divided evenly between the upper and lower tails of the distribution (.025 in each tail).

Confidence Levels (cont.)
When α = .05, .025 of the area lies in each tail. The .95 in the middle section is our confidence level. The cut-off between the middle .95 and the .025 in each tail is represented by a Z-value of ±1.96.

Z-values for Various Alpha Levels
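| Confidence Level | α | α/2 | Z-score |
|---|---|---|---|
| 90% | .10 | .05 | ±1.65 |
| 95% | .05 | .025 | ±1.96 |
| 99% | .01 | .005 | ±2.58 |
| 99.9% | .001 | .0005 | ±3.29 |

(Note: Z-scores are found in Appendix A using the area for α/2.)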
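The Z-scores can also be computed rather than looked up. The sketch below uses `NormalDist` from Python's standard `statistics` module. The helper name `z_for_confidence` is an assumption for illustration. Computed values carry more decimals than a printed table, so the last digit may differ slightly from the table's rounding (for 90%, the computed value is close to 1.645).

```python
from statistics import NormalDist


def z_for_confidence(level: float) -> float:
    """Return the positive Z cut-off that leaves alpha/2 in each tail."""
    if not 0.0 < level < 1.0:
        raise ValueError("confidence level must be between 0 and 1")
    alpha = 1.0 - level
    # inv_cdf gives the Z-value with the requested area to its left.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for level in (0.90, 0.95, 0.99, 0.999):
    alpha = 1.0 - level
    print(f"{level:.1%}: alpha = {alpha:.3f}, Z = +/-{z_for_confidence(level):.3f}")
```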

Confidence Intervals For Means
Procedure:
1. Set alpha (α), the probability that the interval will be wrong. Setting alpha equal to 0.05, a 95% confidence level, means the researcher is willing to be wrong 5% of the time.
2. Find the Z-value associated with alpha. If alpha is equal to 0.05, we place half (0.025) of this probability in the lower tail and half in the upper tail of the distribution.
3. Substitute values into the formula and solve.

Formula: c.i. = X̄ ± Z(s/√(N−1)), where X̄ is the sample mean, s is the sample standard deviation (used when the population standard deviation σ is unknown), and N is the sample size.

Example: Confidence Intervals For Means
Question: For a random sample of 178 Canadian households, average television viewing time was 6 hours/day with s = 3. What would be your estimate of the population mean viewing time at the 95% confidence level (alpha (α) = .05)?

Example: Confidence Intervals For Means
The Z-score for the 95% confidence level (α = .05) is ±1.96. Substitute all information into the formula and solve:
c.i. = X̄ ± Z(s/√(N−1))
= 6.0 ± 1.96(3/√177)
= 6.0 ± 1.96(3/13.30)
= 6.0 ± 1.96(.23)
= 6.0 ± .44

Example (cont.) We can estimate that households in this community average 6.0 ± .44 hours of TV watching each day. Another way to state the interval: 5.56 ≤ μ ≤ 6.44 Interpretation: We estimate, with 95% confidence, that the population mean for TV watching is greater than or equal to 5.56 and less than or equal to 6.44. (This interval has a .05 chance of being wrong.)
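The arithmetic for this example can be checked with a short sketch. The function name `mean_confidence_interval` is an assumption for illustration. The inputs (X̄ = 6.0, s = 3, N = 178, Z = 1.96) come from the example.

```python
import math


def mean_confidence_interval(mean: float, s: float, n: int, z: float = 1.96):
    """Confidence interval for a mean: mean +/- z * s / sqrt(n - 1)."""
    if n < 2:
        raise ValueError("need at least two cases")
    margin = z * s / math.sqrt(n - 1)
    return mean - margin, mean + margin


low, high = mean_confidence_interval(mean=6.0, s=3.0, n=178)
print(f"{low:.2f} <= mu <= {high:.2f}")
```

Rounded to two decimals, the limits agree with the 5.56 and 6.44 reported in the example.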

Example (cont.) In other words:
Even if the statistic is as much as ±1.96 standard deviations from the mean of the sampling distribution the confidence interval will still include the value of μ. Only rarely (5 times out of 100) will the interval not include μ.
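The "5 times out of 100" claim can be illustrated by simulation. The sketch below rests on assumptions that are not in the slides: a normally distributed population with μ = 6 and σ = 3, 2,000 repeated samples of N = 178, and a fixed random seed. It draws the samples, builds a 95% interval from each, and counts how often the interval contains μ.

```python
import math
import random


def coverage(mu=6.0, sigma=3.0, n=178, trials=2000, z=1.96, seed=0):
    """Share of simulated confidence intervals that contain the true mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        # Sample standard deviation computed with N in the denominator,
        # paired with sqrt(N - 1) in the standard error.
        s = math.sqrt(sum((x - x_bar) ** 2 for x in sample) / n)
        margin = z * s / math.sqrt(n - 1)
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / trials


print(f"Share of intervals containing mu: {coverage():.3f}")
```

The share should land near .95. It will not be exactly .95, because the number of simulated samples is finite.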

Confidence Intervals For Proportions
Procedure:
1. Set alpha = .05.
2. Find the associated Z score.
3. Substitute the sample information into the formula: c.i. = Ps ± Z√(Pu(1 − Pu)/N)

Note: Ps = sample proportion; Pu, the population proportion, is set to .50 when it is not known.

Example: Confidence Intervals For Proportions
Question: If 42% of a random sample of 764 people from an Ontario city vote Liberal, what % of the entire city vote Liberal? Hint: Don’t forget to change the % to a proportion.

Example for Proportions (cont.)
c.i. = Ps ± Z√(Pu(1 − Pu)/N)
= .42 ± 1.96√((.5)(.5)/764)
= .42 ± 1.96√(.25/764)
= .42 ± 1.96√.00033
= .42 ± 1.96(.018)
= .42 ± .04

Confidence Intervals For Proportions
Changing back to %, we estimate that 42% ± 4% of the city residents vote Liberal. Another way to state the interval: 38% ≤ Pu ≤ 46% Interpretation: We estimate that the population value is greater than or equal to 38% and less than or equal to 46% for city residents who vote Liberal. (This interval has a .05 chance of being wrong.)
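The same calculation as a minimal sketch. The function name `proportion_confidence_interval` is an assumption for illustration. The inputs (sample proportion .42, N = 764, Z = 1.96, population proportion set to .5) come from the example.

```python
import math


def proportion_confidence_interval(p_s: float, n: int, z: float = 1.96, p_u: float = 0.5):
    """Confidence interval for a proportion: p_s +/- z * sqrt(p_u * (1 - p_u) / n)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    margin = z * math.sqrt(p_u * (1.0 - p_u) / n)
    return p_s - margin, p_s + margin


low, high = proportion_confidence_interval(p_s=0.42, n=764)
margin = (high - low) / 2
print(f"margin = {margin:.4f}")
print(f"{low:.1%} <= Pu <= {high:.1%}")
```

Before rounding, the margin is about .035. The worked example rounds it to .04, which is why the interval is reported as 38% to 46%.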

Calculating Sample Sizes (note: Formula 6.4 and 6.5 in 2nd edition)

Sample sizes (cont.)
These formulae can be used to estimate the minimum required sample size for means or proportions, where:
n = minimum required sample size
Z = determined by your alpha level
σ or Pu = population standard deviation (use s if unknown) or population proportion
ME = margin of error (in +/- actual units of your desired estimate)
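The formulas themselves did not survive in this transcript. The sketch below assumes the standard textbook forms, n = Z²σ²/ME² for means and n = Z²·Pu(1 − Pu)/ME² for proportions, which match the symbols defined above but should be checked against the textbook. The function names and the example inputs are hypothetical.

```python
import math


def sample_size_for_mean(sigma: float, margin_of_error: float, z: float = 1.96) -> int:
    """Assumed form: n = Z^2 * sigma^2 / ME^2, rounded up to a whole case."""
    return math.ceil((z * sigma / margin_of_error) ** 2)


def sample_size_for_proportion(margin_of_error: float, z: float = 1.96, p_u: float = 0.5) -> int:
    """Assumed form: n = Z^2 * Pu * (1 - Pu) / ME^2, rounded up to a whole case."""
    return math.ceil(z ** 2 * p_u * (1.0 - p_u) / margin_of_error ** 2)


# Hypothetical targets: a mean within +/- 0.5 hours when sigma is about 3,
# and a proportion within +/- 4 percentage points, both at the 95% level.
n_mean = sample_size_for_mean(sigma=3.0, margin_of_error=0.5)
n_prop = sample_size_for_proportion(margin_of_error=0.04)
print(n_mean, n_prop)
```

Rounding up rather than to the nearest integer keeps the realized margin of error at or below the target.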

Practice Questions: Healey 1st Cdn #7.5, 7.7, 7.9
Healey 2nd Cdn #6.5, 6.7, 6.9