Confidence intervals for means and proportions FETP India How sure we really are
Competency to be gained from this lecture Calculate 95% confidence intervals for means and proportions
Key issues Concept of confidence interval Confidence interval for means Confidence interval for proportions
What we learnt so far (1/3) Population parameters are fixed We can take samples from the population Several samples of size ‘n’ are possible Each sample gives estimates (e.g., means) called “statistics” Statistics vary from sample to sample This is called “Sampling fluctuation”
What we learnt so far (2/3) The distribution of a statistic for all possible samples of given size ‘n’ is called “sampling distribution” For large ‘n’, the sampling distribution is approximately ‘normal’ even if the original distribution is not If the original distribution is normal, the result is true even for small ‘n’
What we learnt so far (3/3) The mean of the sampling distribution is the ‘population mean’ The standard deviation of the sampling distribution is known as the standard error SE = Population SD / √n
Concept of confidence interval Easy to estimate the standard deviation, difficult to estimate the mean Samples generate sample means and standard errors The usefulness of these estimates varies: The standard deviation from a single sample is a fair estimate of the population SD for large ‘n’ The mean from a single sample may not be a good estimate of the population mean
How can the population mean be estimated? It is desirable to give a range of values with a specific level of confidence that the true population mean is one of the values in the range We can obtain this using the sampling distribution, which is ‘normal’, and the properties of the ‘normal’ distribution: Mean Standard deviation
From the standard error (SE) to the confidence interval The point estimate x̄ (mean in the sample) is a point in the sampling distribution and there is a 95% chance that it lies in the µ ± 1.96 SE interval But µ is not known Interchanging µ and x̄ we can infer that there is a 95% chance that µ lies in the interval x̄ ± 1.96 SE
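The interchange of µ and the sample mean can be written out as a short derivation. This is a sketch in LaTeX notation; the symbols \bar{x} for the sample mean and SE for the standard error follow the slide, and the probability notation P(...) is an assumption added for illustration.

```latex
\begin{aligned}
P\left(\mu - 1.96\,SE \le \bar{x} \le \mu + 1.96\,SE\right) &= 0.95 \\
\Longleftrightarrow\quad
P\left(\bar{x} - 1.96\,SE \le \mu \le \bar{x} + 1.96\,SE\right) &= 0.95
\end{aligned}
```

Both lines describe the same event: the distance between the sample mean and µ is at most 1.96 SE. Only the quantity placed in the middle changes.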
Inference using various levels of confidence Using the properties of the normal distribution, we can infer what proportion of the values lies between given values Considering the distribution of the means: 68% of sample means will lie within 1 standard error above or below the population mean 95% of sample means will lie within 1.96 standard errors above or below the population mean “1.96” comes from the standard z table for alpha=0.05
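The multipliers read from a z table can also be computed directly. The sketch below uses Python's standard library; the function names z_multiplier and coverage_within are hypothetical, chosen only for this illustration.

```python
from statistics import NormalDist


def z_multiplier(confidence: float) -> float:
    """Two-sided standard normal multiplier for a given confidence level."""
    alpha = 1.0 - confidence
    # The upper alpha/2 tail is cut off, so look up the 1 - alpha/2 quantile.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def coverage_within(z: float) -> float:
    """Share of a normal distribution lying within z standard errors of its centre."""
    std_normal = NormalDist()
    return std_normal.cdf(z) - std_normal.cdf(-z)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {z_multiplier(level):.2f}")

print(f"Share within 1 standard error: {coverage_within(1.0):.1%}")
```

A higher confidence level gives a larger multiplier and therefore a wider interval.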
Confidence interval for a mean The confidence interval of the mean gives the range of plausible values for the true population mean
Example of a calculation of a confidence interval for a mean Sample of 100 observations, Mean height is 68” SD: 10” Standard error of the mean = 10 / √100 = 1 95% confidence limits for population mean are 68 ± 1.96 × (1) Approximately 66” to 70”
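The height example can be reproduced in a few lines of code. This is a minimal sketch assuming the normal multiplier 1.96; the function names standard_error and mean_ci are hypothetical.

```python
import math


def standard_error(sd: float, n: int) -> float:
    """Standard error of the mean: SD divided by the square root of n."""
    return sd / math.sqrt(n)


def mean_ci(mean: float, sd: float, n: int, z: float = 1.96) -> tuple[float, float]:
    """Lower and upper confidence limits for a mean (normal approximation)."""
    se = standard_error(sd, n)
    return mean - z * se, mean + z * se


# Values from the slide: n = 100, mean height 68 inches, SD 10 inches.
se = standard_error(10, 100)
lower, upper = mean_ci(68, 10, 100)
print(f"SE = {se:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

The same function can be reused for other samples by changing the mean, SD and sample size.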
Interpretation of the calculation of the confidence interval for a mean The 95% confidence interval for the mean of 68 is (66, 70) This means that with repeated random sampling, 95% of the intervals will contain the true mean (µ) Since we have one of these intervals, we can be 95% confident that this interval contains the true mean
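The "repeated random sampling" interpretation can be checked with a small simulation. The sketch below assumes a normally distributed population with mean 68 and SD 10 (values borrowed from the height example); the function name, the number of repeats and the random seed are arbitrary choices made for illustration.

```python
import math
import random
import statistics


def coverage_of_95_ci(true_mean: float = 68.0, true_sd: float = 10.0,
                      n: int = 100, repeats: int = 2000, seed: int = 1) -> float:
    """Fraction of simulated 95% intervals that contain the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repeats):
        # Draw one random sample of size n from the assumed population.
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        x_bar = statistics.fmean(sample)
        se = statistics.stdev(sample) / math.sqrt(n)
        # Count the interval as a hit when it covers the true mean.
        if x_bar - 1.96 * se <= true_mean <= x_bar + 1.96 * se:
            hits += 1
    return hits / repeats


print(f"Share of intervals containing the true mean: {coverage_of_95_ci():.3f}")
```

The share should come out close to 0.95. Each individual interval either contains the true mean or does not; the 95% describes the procedure over many samples.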
Calculating a 95% confidence interval for a mean in practice Epi-Info, “Epitable” module Open-Epi calculator (Open source) www.openepi.com Excel
Calculating a 95% confidence interval for a mean in OpenEpi: 1/2 (Methods) 1. Choose “Mean, CI” 2. Click “Enter” 3. Enter data 4. Click “calculate”
Calculating a 95% confidence interval for a mean in OpenEpi: 2/2 (Results)
Exercise to calculate the 95% confidence interval for a mean Study of gestational age at birth in the past month in a sample of health care facilities Results of the study n=350 births Sample mean= 37.5 weeks s=12.2 What is the 95% confidence interval?
Confidence interval for a proportion Applying the same methods to generate confidence intervals for proportions The central limit theorem also applies to distribution of sample proportions when the sample size is large enough The population proportion replaces the population mean The binomial distribution replaces the normal distribution
Using the binomial distribution The binomial distribution is a sampling distribution for p Formula of the standard error: SE = √[p(1-p)/n] Where n = Sample size, p = proportion
Using the central limit theorem As the sample size n increases, the binomial distribution becomes very close to a normal distribution (Central limit theorem) Thus, we can use the normal distribution to calculate confidence intervals and test hypotheses If np and n(1-p) are both equal to 10 or more, then the normal approximation may be used
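The rule of thumb for using the normal approximation is easy to check before computing an interval. A minimal sketch follows; the function name normal_approximation_ok and the second example (n = 20, p = 0.1) are hypothetical and only serve as illustration.

```python
def normal_approximation_ok(n: int, p: float, threshold: float = 10) -> bool:
    """True when both np and n(1-p) reach the threshold (10 on the slide)."""
    return n * p >= threshold and n * (1 - p) >= threshold


# Goiter survey from this lecture: 63 of 363 children, so np = 63 and n(1-p) = 300.
print(normal_approximation_ok(363, 63 / 363))

# Hypothetical small sample: np = 2, which is too small for the approximation.
print(normal_approximation_ok(20, 0.1))
```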
Applying the concept of the confidence interval of the mean to proportions For means, the 95% confidence interval was: x̄ ± 1.96 × s/√n For proportions, we just replace the formula of the standard error of the mean by the standard error of the proportion that comes from the binomial distribution: p ± 1.96 × √[p(1-p)/n]
Calculation of a confidence interval for a proportion: Prevalence of goiter in Solan, Himachal Pradesh, India, 2005 Sample of 363 children: 63 (17%) present with goiter Standard error of the proportion = √[0.17 × 0.83 / 363] ≈ 0.020 95% confidence limits for the proportion are 0.17 ± 1.96 × (0.020) Approximately 13% to 21%
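The goiter calculation can be sketched in code as follows. The function name proportion_ci is hypothetical, and the code uses the unrounded proportion 63/363 rather than 0.17, so intermediate values may differ slightly from hand calculation.

```python
import math


def proportion_ci(events: int, n: int, z: float = 1.96) -> tuple[float, float, float, float]:
    """Return proportion, standard error, lower limit and upper limit."""
    p = events / n
    # Standard error of a proportion from the binomial distribution.
    se = math.sqrt(p * (1 - p) / n)
    return p, se, p - z * se, p + z * se


# Values from the slide: 63 children with goiter in a sample of 363.
p, se, lower, upper = proportion_ci(63, 363)
print(f"p = {p:.3f}, SE = {se:.3f}, 95% CI = ({lower:.1%}, {upper:.1%})")
```

Rounded to whole percentages, the limits agree with the 13% to 21% range given on the slide.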
Interpretation of the calculation of the confidence interval for the proportion The 95% confidence interval for the proportion of 17% is (13%, 21%) This means that with repeated random sampling, 95% of the intervals will contain the true proportion Since we have one of these intervals, we can be 95% confident that this interval contains the true proportion
Calculating a 95% confidence interval for a proportion in practice Epi-Info, “Epitable” module Open-Epi calculator (Open source) www.openepi.com
Calculating a 95% confidence interval for a proportion in OpenEpi: 1/2 (Methods) 1. Choose “Proportion” 2. Click “Enter” 3. Enter data 4. Click “calculate”
Calculating a 95% confidence interval for a proportion in OpenEpi: 2/2 (Results)
Exercise to calculate the 95% confidence interval for a proportion In a sample of 250 HIV infected persons with AIDS, 116 are positive for tuberculosis What is the 95% confidence interval?
From estimation to testing Confidence interval is about estimating The sampling distribution can also be used to test hypotheses Statistical testing
Dealing with non-normal parent population If sample size exceeds 30, we are safe because the sampling distribution will approach the normal distribution If the sample size is smaller than 30, the distribution is different The 1.96 value will be replaced by another value coming from the t-distribution Slightly different from the normal distribution Depends upon the sample size The degrees of freedom will be n-1
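For small samples, the interval keeps the same shape and only the multiplier changes. The formula below is a sketch in LaTeX notation; the subscript notation t_{n-1,\,0.975} for the 97.5th percentile of the t-distribution with n-1 degrees of freedom is an assumption added for illustration.

```latex
\bar{x} \;\pm\; t_{n-1,\,0.975} \times \frac{s}{\sqrt{n}}
```

For example, with n = 10 (9 degrees of freedom) the multiplier is about 2.26 instead of 1.96, so the interval is wider. As n grows, the multiplier approaches 1.96.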
Take home messages Confidence intervals use the central limit theorem to estimate a range of possible values for the population parameter on the basis of the sample estimate, the standard deviation and the sample size The 95% confidence interval lies at +/- 1.96 times the standard error, which is calculated using different methods for means (s/√n) and proportions (√[p(1-p)/n])