Chapter 4: Sampling and Statistical Inference Part 2: Estimation
Types of Estimates Point estimate – a single number used to estimate a population parameter Interval estimate – a range of values between which a population parameter is believed to be
Common Point Estimates
Theoretical Issues Unbiased estimator – one for which the expected value equals the population parameter it is intended to estimate The sample variance is an unbiased estimator for the population variance
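The unbiasedness claim above can be checked exactly on a toy case. The sketch below is illustrative only: the four-value population and the sample size of 2 are made-up assumptions, and sampling is assumed to be with replacement. It averages the sample variance over every possible sample and compares the result with the population variance.

```python
from itertools import product
from statistics import mean, pvariance, variance

# Hypothetical population, chosen only for illustration
population = [2, 4, 6, 8]
pop_var = pvariance(population)  # population variance (divisor N)

# Every ordered sample of size 2 drawn with replacement; all are equally likely
samples = list(product(population, repeat=2))

# Expected value of the sample variance (divisor n - 1)
avg_sample_var = mean(variance(s) for s in samples)

# Expected value when the divisor is n instead, for comparison
avg_divisor_n_var = mean(pvariance(s) for s in samples)

print(pop_var, avg_sample_var, avg_divisor_n_var)
```

In this toy case the n - 1 version averages to the population variance, while the divisor-n version averages to half of it.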
Confidence Intervals Confidence interval (CI) – an interval estimate that specifies the likelihood that the interval contains the true population parameter. Level of confidence (1 – α) – the probability that the CI contains the true population parameter, usually expressed as a percentage (90%, 95%, and 99% are most common).
Confidence Intervals for the Mean - Rationale
Confidence Interval for the Mean – σ Known A 100(1 – α)% CI is: $\bar{x} \pm z_{\alpha/2}\,(\sigma/\sqrt{n})$. The value $z_{\alpha/2}$ may be found from Table A.1 or using the Excel function NORMSINV(1 – α/2).
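As a rough illustration of the known-σ interval, the sketch below uses Python's standard library in place of Table A.1 or NORMSINV. The function name and the sample figures (mean 50, σ = 10, n = 25) are hypothetical, chosen only to show the arithmetic.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(x_bar, sigma, n, confidence=0.95):
    """CI for the mean when the population standard deviation is known."""
    alpha = 1 - confidence
    # Plays the same role as NORMSINV(1 - alpha/2)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sigma / sqrt(n)
    return x_bar - half_width, x_bar + half_width

# Hypothetical sample summary
lower, upper = z_confidence_interval(x_bar=50.0, sigma=10.0, n=25)
print(round(lower, 2), round(upper, 2))
```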
Confidence Interval for the Mean, σ Unknown A 100(1 – α)% CI is: $\bar{x} \pm t_{\alpha/2,\,n-1}\,(s/\sqrt{n})$. The value $t_{\alpha/2,\,n-1}$ comes from a t-distribution with n – 1 degrees of freedom, from Table A.3 or the Excel function TINV(α, n – 1).
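A minimal sketch of the unknown-σ case follows. Python's standard library has no t-distribution, so the critical value is passed in as a number looked up from a t table (2.306 is the tabulated value for a 95% interval with 8 degrees of freedom). The data values and function name are made up for illustration.

```python
from math import sqrt
from statistics import mean, stdev

def t_confidence_interval(data, t_crit):
    """CI for the mean using the sample standard deviation s.

    t_crit is the table value t_{alpha/2, n-1}; it must match len(data) - 1.
    """
    n = len(data)
    x_bar = mean(data)
    s = stdev(data)  # sample standard deviation (divisor n - 1)
    half_width = t_crit * s / sqrt(n)
    return x_bar - half_width, x_bar + half_width

# Hypothetical sample of n = 9 measurements
data = [12.1, 11.8, 12.4, 12.0, 11.7, 12.3, 12.2, 11.9, 11.6]
T_CRIT_95_DF8 = 2.306  # from a t table: alpha/2 = 0.025, 8 degrees of freedom

lower, upper = t_confidence_interval(data, T_CRIT_95_DF8)
print(round(lower, 3), round(upper, 3))
```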
Relationship Between Normal Distribution and t-distribution The t-distribution yields wider confidence intervals than the normal distribution for smaller sample sizes; the two converge as n grows.
PHStat Tool: Confidence Intervals for the Mean PHStat menu > Confidence Intervals > Estimate for the mean, sigma known…, or Estimate for the mean, sigma unknown…
PHStat Tool: Confidence Intervals for the Mean - Dialog Enter the confidence level Choose specification of sample statistics Check Finite Population Correction box if appropriate
PHStat Tool: Confidence Intervals for the Mean - Results
Confidence Intervals for Proportions Sample proportion: $\hat{p} = x/n$, where x = number in the sample having the desired characteristic and n = sample size. The sampling distribution of $\hat{p}$ has mean π and variance π(1 – π)/n, where π is the population proportion. When nπ and n(1 – π) are at least 5, the sampling distribution of $\hat{p}$ approaches a normal distribution.
Confidence Intervals for Proportions A 100(1 – α)% CI is: $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$. A PHStat tool is available under the Confidence Intervals option.
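The following sketch computes an interval for a proportion, assuming the usual normal-approximation form (sample proportion plus or minus z times the square root of p̂(1 – p̂)/n). The function name and the counts (60 out of 200) are hypothetical. The size check uses the sample proportion in place of the unknown population proportion, which is itself an assumption.

```python
from math import sqrt
from statistics import NormalDist

def proportion_confidence_interval(x, n, confidence=0.95):
    """Normal-approximation CI for a population proportion."""
    p_hat = x / n
    # Rule of thumb from the slides, applied to the sample proportion
    if n * p_hat < 5 or n * (1 - p_hat) < 5:
        raise ValueError("sample too small for the normal approximation")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

# Hypothetical data: 60 of 200 sampled items have the characteristic
lower, upper = proportion_confidence_interval(x=60, n=200)
print(round(lower, 4), round(upper, 4))
```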
Confidence Intervals and Sample Size CI for the mean, σ known: the sample size needed for a half-width of at most E is $n \ge (z_{\alpha/2})^2 \sigma^2 / E^2$. CI for a proportion: the sample size needed for a half-width of at most E is $n \ge (z_{\alpha/2})^2\,\pi(1-\pi) / E^2$. Use $\hat{p}$ as an estimate of π, or 0.5 for the most conservative estimate.
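The sample-size rules can be sketched as two small functions. The names and the example inputs (σ = 10 with E = 2, and E = 0.05 for a proportion) are assumptions for illustration. The result is rounded up because a sample size must be a whole number that meets the bound.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, E, confidence=0.95):
    """Smallest n giving half-width at most E when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z ** 2 * sigma ** 2 / E ** 2)

def sample_size_for_proportion(E, confidence=0.95, pi_estimate=0.5):
    """Smallest n giving half-width at most E; 0.5 is the conservative default."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z ** 2 * pi_estimate * (1 - pi_estimate) / E ** 2)

n_mean = sample_size_for_mean(sigma=10.0, E=2.0)
n_prop = sample_size_for_proportion(E=0.05)
print(n_mean, n_prop)
```

Passing a smaller `pi_estimate` (for example a pilot-sample proportion) reduces the required n, since π(1 – π) is largest at 0.5.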
PHStat Tool: Sample Size Determination PHStat menu > Sample Size > Determination for the Mean or Determination for the Proportion. Enter σ, E, and the confidence level. Check the Finite Population Correction box if appropriate.
Confidence Intervals for Population Total A 100(1 – α)% CI is: $N\bar{x} \pm t_{n-1,\,\alpha/2}\, N\,(s/\sqrt{n})\sqrt{(N-n)/(N-1)}$, where N is the population size. A PHStat tool is available under the Confidence Intervals option.
Confidence Intervals for Differences Between Means
                      Population 1    Population 2
Mean                  $\mu_1$         $\mu_2$
Standard deviation    $\sigma_1$      $\sigma_2$
Point estimate        $\bar{x}_1$     $\bar{x}_2$
Sample size           $n_1$           $n_2$
Point estimate for the difference in means, $\mu_1 - \mu_2$, is given by $\bar{x}_1 - \bar{x}_2$
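Independent Samples With Unequal Variances A 100(1 – α)% CI is: $\bar{x}_1 - \bar{x}_2 \pm t_{\alpha/2,\,df^*}\sqrt{s_1^2/n_1 + s_2^2/n_2}$, where $df^* = \dfrac{(s_1^2/n_1 + s_2^2/n_2)^2}{\dfrac{(s_1^2/n_1)^2}{n_1-1} + \dfrac{(s_2^2/n_2)^2}{n_2-1}}$, with fractional values rounded down.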
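A sketch of the unequal-variance calculation, assuming the standard Welch-Satterthwaite expression for df* and rounding fractional values down as the slide states. The function names and the summary statistics are hypothetical. The t critical value is supplied from a table (2.179 for 95% confidence with 12 degrees of freedom) because the standard library has no t-distribution.

```python
from math import floor, sqrt

def welch_df(s1, n1, s2, n2):
    """Approximate degrees of freedom df*, rounded down."""
    v1 = s1 ** 2 / n1
    v2 = s2 ** 2 / n2
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return floor(df)

def unequal_variance_ci(x1, s1, n1, x2, s2, n2, t_crit):
    """CI for mu1 - mu2 without assuming equal variances."""
    half_width = t_crit * sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    diff = x1 - x2
    return diff - half_width, diff + half_width

# Hypothetical sample summaries
df_star = welch_df(s1=4.0, n1=16, s2=6.0, n2=9)
T_CRIT_95_DF12 = 2.179  # from a t table: alpha/2 = 0.025, 12 degrees of freedom
lower, upper = unequal_variance_ci(20.0, 4.0, 16, 15.0, 6.0, 9, T_CRIT_95_DF12)
print(df_star, round(lower, 3), round(upper, 3))
```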
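Independent Samples With Equal Variances A 100(1 – α)% CI is: $\bar{x}_1 - \bar{x}_2 \pm t_{\alpha/2,\,n_1+n_2-2}\; s_p\sqrt{1/n_1 + 1/n_2}$, where $s_p = \sqrt{\dfrac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}}$ is a common “pooled” standard deviation. This interval assumes the variances of the two populations are equal.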
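The pooled calculation might look like the sketch below, which assumes the usual definition of the pooled standard deviation as a degrees-of-freedom-weighted average of the two sample variances. Names and sample summaries are made up; the critical value 2.086 is the tabulated t value for 95% confidence with 10 + 12 – 2 = 20 degrees of freedom.

```python
from math import sqrt

def pooled_std(s1, n1, s2, n2):
    """Pooled standard deviation s_p for two samples."""
    return sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))

def equal_variance_ci(x1, s1, n1, x2, s2, n2, t_crit):
    """CI for mu1 - mu2 assuming equal population variances."""
    s_p = pooled_std(s1, n1, s2, n2)
    half_width = t_crit * s_p * sqrt(1 / n1 + 1 / n2)
    diff = x1 - x2
    return diff - half_width, diff + half_width

# Hypothetical sample summaries
s_p = pooled_std(s1=4.0, n1=10, s2=5.0, n2=12)
T_CRIT_95_DF20 = 2.086  # from a t table: alpha/2 = 0.025, 20 degrees of freedom
lower, upper = equal_variance_ci(30.0, 4.0, 10, 25.0, 5.0, 12, T_CRIT_95_DF20)
print(round(s_p, 3), round(lower, 3), round(upper, 3))
```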
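Paired Samples A 100(1 – α)% CI is: $\bar{D} \pm t_{n-1,\,\alpha/2}\,(s_D/\sqrt{n})$, where $D_i$ = difference for each pair of observations, $\bar{D}$ = average of the differences, $s_D$ = standard deviation of the differences, and n = number of pairs. A PHStat tool is available in the Confidence Intervals menu.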
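For paired samples the work reduces to a one-sample t interval on the differences. The sketch below uses made-up before/after values for five pairs and takes the critical value from a t table (2.776 for 95% confidence with 4 degrees of freedom); the function name is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def paired_ci(first, second, t_crit):
    """CI for the mean difference (second - first) of paired observations."""
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    d_bar = mean(diffs)   # average of the differences
    s_d = stdev(diffs)    # sample standard deviation of the differences
    half_width = t_crit * s_d / sqrt(n)
    return d_bar - half_width, d_bar + half_width

# Hypothetical paired measurements
before = [10, 12, 9, 11, 13]
after = [12, 15, 10, 14, 14]
T_CRIT_95_DF4 = 2.776  # from a t table: alpha/2 = 0.025, 4 degrees of freedom

lower, upper = paired_ci(before, after, T_CRIT_95_DF4)
print(round(lower, 3), round(upper, 3))
```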
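Differences Between Proportions A 100(1 – α)% CI is: $\hat{p}_1 - \hat{p}_2 \pm z_{\alpha/2}\sqrt{\hat{p}_1(1-\hat{p}_1)/n_1 + \hat{p}_2(1-\hat{p}_2)/n_2}$. Applies when $n_i\hat{p}_i$ and $n_i(1-\hat{p}_i)$ are greater than 5.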
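A sketch for the difference between two proportions, assuming the usual unpooled normal-approximation standard error. The function name and the counts are hypothetical, and the applicability check follows the "greater than 5" rule stated on the slide.

```python
from math import sqrt
from statistics import NormalDist

def proportion_difference_ci(x1, n1, x2, n2, confidence=0.95):
    """Normal-approximation CI for the difference of two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    for p, n in ((p1, n1), (p2, n2)):
        if not (n * p > 5 and n * (1 - p) > 5):
            raise ValueError("normal approximation may not apply")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - half_width, diff + half_width

# Hypothetical counts: 60 of 200 in sample 1, 40 of 200 in sample 2
lower, upper = proportion_difference_ci(60, 200, 40, 200)
print(round(lower, 4), round(upper, 4))
```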
Sampling Distribution of s The sample standard deviation, s, is a point estimate for the population standard deviation, σ. The sampling distribution of s is based on the chi-square (χ²) distribution with n – 1 df: when sampling from a normal population, $(n-1)s^2/\sigma^2$ follows that distribution. See Table A.4. CHIDIST(x, deg_freedom) returns the probability to the right of x. CHIINV(probability, deg_freedom) returns the value of x for a specified right-tail probability.
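Confidence Intervals for the Variance A 100(1 – α)% CI is: $\left[\dfrac{(n-1)s^2}{\chi^2_{n-1,\,\alpha/2}},\ \dfrac{(n-1)s^2}{\chi^2_{n-1,\,1-\alpha/2}}\right]$, where $\chi^2_{n-1,\,p}$ is the chi-square value with right-tail probability p.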
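The variance interval can be sketched as follows, assuming the standard form in which (n – 1)s² is divided by the two chi-square critical values. Since the standard library has no chi-square distribution, the critical values are passed in from a table (for 19 degrees of freedom, 32.852 has right-tail probability 0.025 and 8.907 has right-tail probability 0.975). The sample figures and function name are hypothetical.

```python
from math import sqrt

def variance_ci(s, n, chi2_right_small_tail, chi2_right_large_tail):
    """CI for the population variance.

    chi2_right_small_tail: chi-square value with right-tail area alpha/2
    chi2_right_large_tail: chi-square value with right-tail area 1 - alpha/2
    """
    ss = (n - 1) * s ** 2
    # Dividing by the larger critical value gives the lower limit
    return ss / chi2_right_small_tail, ss / chi2_right_large_tail

# Hypothetical sample: n = 20, s = 5, 95% confidence
lower, upper = variance_ci(s=5.0, n=20,
                           chi2_right_small_tail=32.852,
                           chi2_right_large_tail=8.907)
print(round(lower, 2), round(upper, 2))
# Limits for the standard deviation are the square roots of the variance limits
print(round(sqrt(lower), 2), round(sqrt(upper), 2))
```

Note that the interval is not symmetric about s², because the chi-square distribution is skewed.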
PHStat Tool: Confidence Intervals for Variance - Dialog PHStat menu > Confidence Intervals > Estimate for the Population Variance Enter sample size, standard deviation, and confidence level
PHStat Tool: Confidence Intervals for Variance - Results
Time Series Data Confidence intervals only make sense for stationary time series data
Probability Intervals A 100(1 – α)% probability interval for a random variable X is an interval [A, B] such that P(A ≤ X ≤ B) = 1 – α. Do not confuse a confidence interval with a probability interval; confidence intervals are probability intervals for sampling distributions, not for the distribution of the random variable.
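To make the distinction concrete, the sketch below assumes a normally distributed X with made-up parameters (μ = 100, σ = 15) and compares a 95% probability interval for a single observation with the corresponding interval for the mean of n = 36 observations. The second interval is much narrower because it describes the sampling distribution of the mean, not X itself.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical, assumed-known population parameters
mu, sigma, n = 100.0, 15.0, 36
alpha = 0.05
z = NormalDist().inv_cdf(1 - alpha / 2)

# Probability interval for one observation X ~ N(mu, sigma)
x_interval = (mu - z * sigma, mu + z * sigma)

# Probability interval for the sample mean of n observations
se = sigma / sqrt(n)
mean_interval = (mu - z * se, mu + z * se)

# Check that the first interval really holds 1 - alpha of X's distribution
X = NormalDist(mu, sigma)
coverage = X.cdf(x_interval[1]) - X.cdf(x_interval[0])

print(x_interval, mean_interval, round(coverage, 4))
```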