2 MICS Sample Design
MICS is a complex survey (multi-stage, stratified).
MICS is a worldwide program, so consistency and comparability are important issues.
We will discuss only a few of the highlights, including:
- Sample size determination
- Stratification and sample allocation
- Number of Primary Sampling Units and cluster sizes
- Use of existing sample or new sample
- A few special topics
3 Sample Size for MICS
Most important feature of MICS with respect to survey costs.
We will discuss:
- DETERMINANTS: factors, constraints
- INDICATORS to use
- FORMULA to calculate sample size
4 Determinants of Sample Size (Factors and Constraints)
Sample size (households) depends on many factors:
- Expected size estimate of indicators
- Expected size estimate of target population(s)
- Average household size
- Margin of error wanted
- Level of confidence wanted
- "Design effect" (increase in sampling error due to use of a cluster survey instead of a simple random sample)
- Expected non-response rate
- Number of clusters or PSUs
- Cluster size (number of households per sample cluster)
- Number of sub-national areas for separate estimates (domains)
- Survey budget and implementing capability
5 MICS Recommendations on Sample Size Determinants
FACTOR                                            RECOMMENDATION
1. Expected size estimate of indicators           (next slide)
2. Expected size estimate of target population    12-23 mos [3%]
3. Average household size                         6 persons
4. Relative margin of error wanted                12% of coverage rate
5. Level of confidence wanted                     95 percent
6. Design effect in cluster surveys               1.5
7. Expected non-response rate                     10 percent
8. Number of clusters or PSUs - minimum           [ ]
9. Cluster size                                   [15-35]
10. Number of estimation "domains" wanted         [5 or fewer]
11. Survey budget                                 (country specific)
For items 2, 3, 6 and 7, use available country data (recent survey or census); if not available, use the value above.
6 Indicators for Sample Size Determination
Sample size is different for each MICS indicator.
A key indicator must be chosen, since only one sample size can be used in MICS.
Recommendations for choosing the key indicator:
- Choose from among the main indicators of interest in your country.
- Choose the one that will yield the largest sample size.
  * Usually for a single-year age group, and
  * Usually DPT, measles, polio or tuberculosis immunization, or birth weight below 2.5 kg
Exceptions:
- Do not choose infant or maternal mortality rates as the key indicators.
- Do not choose a low coverage indicator that is desirably low (such as malnutrition prevalence).
- Do not choose breast-feeding indicators for 4-month age groups.
7 Checklist for Target Group and Indicator
To decide on the appropriate target group and indicator for determining your sample size:
1. Pick children 12 to 23 months old, the target population that comprises the smallest percentage of the total population (probably about 3 percent).
2. For that target group, pick the lowest from among the following coverage rates:
- DPT immunization level
- Measles immunization level
- Polio immunization level
- Tuberculosis immunization level
3. Do not pick a desirably low coverage indicator that is already acceptably low.
8 Formula for Sample Size
Different formula than MICS2000.
The MICS2005 formula emphasizes the relative margin of error* instead of a 5% absolute error (high coverage indicator) or 3% (low coverage indicator):
- Less confusing
- Does not depend on high or low coverage
* The relative margin of error is the tolerable difference between the estimated proportion and its true value, expressed as a percentage of that proportion, at a given confidence level. It determines the relative length of the confidence interval.
9 Formula
n = [4 (r) (1 - r) (deff) (1.1)] / [(0.12r)^2 (p) (ave-size)]
where
- n is the required sample size, expressed as number of households, for the KEY indicator,
- 4 is the factor to achieve the 95 percent level of confidence,
- r is the anticipated prevalence (coverage) rate for the key indicator,
- 1.1 is the factor to raise the sample size by 10 percent for potential non-response,
- deff is the shortened symbol for design effect,
- 0.12r is the margin of error to be tolerated, defined as 12 percent of r (12 percent thus represents the relative sampling error of r),
- p is the proportion of the total population that the smallest group comprises, and
- ave-size is the average household size.
You may use the table on the next page instead of the formula if all conditions for that table are satisfied in your country.
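A short simplification of the formula may help explain why the checklist asks for the lowest coverage rate. Using the symbols defined above, with h standing in for ave-size (a shorthand introduced here only for compactness), one r in the numerator cancels against the squared r in the denominator:

```latex
n = \frac{4\, r\,(1-r)\,\mathit{deff}\,(1.1)}{(0.12\,r)^{2}\; p\; h}
  = \frac{4\,(1.1)\,\mathit{deff}}{0.0144\; p\; h}\cdot\frac{1-r}{r}
```

For fixed deff, p and h, n is therefore proportional to (1 - r)/r, so the indicator with the lowest anticipated coverage yields the largest sample size.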
10 Sample Size (Households) Calculation for Proportion Estimation Using Smallest Target Population
11 Example 1
- Target group: Children 12 to 23 months old
- Percent of population: 3 percent
- Key indicator: DPT immunization coverage
- Prevalence (coverage): 30 percent
- Deff: No information
- Non-response: No information
- Average household size: 6
Checking table => n = 5941
12 Checklist for Use of Sample Size Formula
The formula to determine your sample size:
n = [4 (r) (1 - r) (f) (1.1)] / [(0.12r)^2 (p) (nh)]
Use it if any (one or more) of the following applies in your country:
- p: the proportion of one-year-old children is other than 3%
- nh: the average household size is less than 4.5 persons or greater than 6.5
- r: the coverage rate of your key indicator is under 20 or over 40 percent
- f: the sample design effect for your key indicator is different from 1.5, according to accepted estimates from other surveys in your country
- your anticipated non-response rate is more or less than 10 percent.
13 Example 2
- Target group: Children 12 to 23 months old
- Percent of population: 3.5 percent
- Key indicator: DPT immunization coverage
- Prevalence (coverage): 25 percent
- Deff: 1.6
- Non-response adjustment: 1.05 (response rate 95%)
- Average household size: 6
n = [4 (0.25) (0.75) (1.6) (1.05)] / [(0.12 * 0.25)^2 (0.035) (6)] = 1.26 / 0.000189 = 6667
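The two worked examples can be checked with a few lines of Python. This is a minimal sketch of the sample size formula from the slides; the function and parameter names are illustrative choices, and rounding to the nearest household is an assumption made so that Example 1 matches the table value.

```python
def mics_sample_size(r, p, avg_hh_size, deff=1.5, nonresponse_factor=1.1,
                     rel_margin=0.12, conf_factor=4):
    """Required number of households for the key indicator (unrounded).

    Defaults are the recommended values from the slides: deff 1.5,
    10 percent non-response, 12 percent relative margin of error,
    factor 4 for 95 percent confidence.
    """
    numerator = conf_factor * r * (1 - r) * deff * nonresponse_factor
    denominator = (rel_margin * r) ** 2 * p * avg_hh_size
    return numerator / denominator


# Example 1: no country data for deff or non-response, so the defaults apply.
example_1 = round(mics_sample_size(r=0.30, p=0.03, avg_hh_size=6))

# Example 2: country-specific deff and non-response adjustment.
example_2 = round(mics_sample_size(r=0.25, p=0.035, avg_hh_size=6,
                                   deff=1.6, nonresponse_factor=1.05))

print(example_1, example_2)  # 5941 6667
```

Keeping the recommended values as defaults mirrors the checklist: a country overrides only the parameters for which it has its own data.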
14 Stratification & Sample Allocation
- Stratification is the process of regrouping similar PSUs into sub-groups (strata).
- Effects: better precision, flexible design, coverage of small sub-populations (or oversampling).
- How to stratify? (region) x (residence type)
- Sample allocation: proportional, power allocation, or equal size allocation (if budget is too tight).
- Implicit stratification: sort the sampling frame by characteristics such as region, urban-rural residence, sub-region, district, etc., then select a pps sample.
- There is no unique rule for stratification; it depends on the country situation.
15 Number of PSUs and Cluster Size
- Survey costs depend not only on the number of households but also on their distribution among Primary Sampling Units (PSUs).
- In general, the more PSUs the better for reliability, but the greater the cost (usually travel costs).
- We recommend 300 to 400 PSUs or more.
- Number of PSUs also depends on cluster size.
- Cluster size should be as small as practical for reliability.
- Example: 8000 households selected in 400 PSUs of 20 households each is a much more reliable sample than 200 PSUs of 40 each, but more expensive.
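The 8000-household comparison can be made concrete with a common textbook approximation that the slides do not state: deff = 1 + rho (m - 1), where m is the cluster size and rho is the intra-cluster correlation. The sketch below assumes rho = 0.05 purely for illustration; real values vary by indicator and country.

```python
def approx_deff(cluster_size, rho):
    """Approximate design effect for equal-sized clusters (textbook approximation)."""
    return 1 + rho * (cluster_size - 1)


def effective_sample_size(n_households, cluster_size, rho):
    """Size of a simple random sample with roughly the same precision."""
    return n_households / approx_deff(cluster_size, rho)


RHO = 0.05  # assumed intra-cluster correlation, for illustration only

for n_psus, cluster_size in [(400, 20), (200, 40)]:
    n_households = n_psus * cluster_size
    deff = approx_deff(cluster_size, RHO)
    n_eff = effective_sample_size(n_households, cluster_size, RHO)
    print(f"{n_psus} PSUs x {cluster_size}: deff = {deff:.2f}, "
          f"effective sample size = {n_eff:.0f}")
```

Under this assumption the 400 x 20 design behaves like a considerably larger simple random sample than the 200 x 40 design, which is the reliability gain the example refers to.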
16 MICS Sampling Option 1
USE AN EXISTING SAMPLE
- Piggy-back MICS onto DHS or other survey if timely and feasible.
- Or, use the sample from a previous survey and re-interview households for MICS.
- Or, use old survey sample EAs and construct a new listing of households to select for MICS.
- Old sample must be probability-based and national in scope.
- Possibilities: DHS, other national health survey, recent labour force or household expenditure surveys
- Important: design parameters must be known (such as selection probability, stratification, etc.)
17 OPTION 1 - USE OF AN EXISTING SAMPLE, continued
Advantages of old sample
- cost savings
- maps available for interviewers
- design rigor
- simplicity
Limitations of old sample
- burden on respondents
- sample design may need modification
  * sample size
  * sub-national coverage
  * number of PSUs or clusters
=> Balance between loss and gain
18 MICS Sampling Option 2
USE NEW SAMPLE WITH HOUSEHOLD LISTING OPERATION
- Design new MICS sample based on prototype
- Two stages with census as frame (see comprehensive discussion in Chapter 4 on frame construction and up-dating old frames)
- Use of implicit stratification, systematic selection of census EAs at first stage with pps
- Create standard segments (DHS approach)
- List households in selected segments
- Select households systematically from list
- Interview only the selected households; no replacement will be allowed
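The first-stage step (implicit stratification followed by systematic pps selection of EAs) can be sketched as follows. This is an illustrative implementation of the generic technique, not code from the MICS manual; the frame, its field layout and the function name are hypothetical.

```python
import random
from itertools import accumulate


def pps_systematic(sizes, n_select, rng=None):
    """Return indices of units chosen by systematic pps selection.

    `sizes` must already be in frame order (sorted for implicit
    stratification). A unit larger than the sampling interval can be
    selected more than once.
    """
    rng = rng or random.Random()
    cumulative = list(accumulate(sizes))
    interval = cumulative[-1] / n_select
    start = interval * (1 - rng.random())  # random start in (0, interval]
    selected, idx = [], 0
    for k in range(n_select):
        point = start + k * interval
        # Advance to the first unit whose cumulative size reaches the point.
        while idx < len(cumulative) - 1 and cumulative[idx] < point:
            idx += 1
        selected.append(idx)
    return selected


# Hypothetical frame of census EAs: (region, residence, census households).
frame = [
    ("North", "urban", 120), ("South", "rural", 80), ("North", "rural", 150),
    ("South", "urban", 200), ("North", "urban", 90), ("South", "rural", 110),
]
frame.sort(key=lambda ea: (ea[0], ea[1]))  # implicit stratification
chosen = pps_systematic([ea[2] for ea in frame], n_select=3,
                        rng=random.Random(0))
sample_eas = [frame[i] for i in chosen]
print(sample_eas)
```

Sorting the frame before applying a fixed interval is what spreads the sample across regions and residence types without forming explicit strata.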
19 OPTION 2 - NEW SAMPLE WITH HOUSEHOLD LISTING, continued
Advantages of option 2
- simple design
- probability-based
- if possible self-weighting (national level)
Limitations of option 2
- expense of listing households
- time necessary to list households
[Example, sample size of 5000 households may need to households to be listed.]
20 DHS Method - Option 2
- Create "standard" segments.
- Divide the census population in each EA by 500 to determine the number of standard segments.
- Sketch a map of the segments in each EA.
- Choose 1 segment at random.
- List households in the selected segment only (instead of the entire EA).
- Purpose is to reduce the listing workload to a manageable size.
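The segmentation arithmetic can be sketched in a few lines. The divisor of 500 comes from the slide; rounding to the nearest whole number with a minimum of one segment is an assumption, since the slide does not say how fractions are handled, and the function names are illustrative.

```python
import random

STANDARD_SEGMENT_POPULATION = 500  # divisor given on the slide


def n_standard_segments(ea_census_population):
    """Number of standard segments for one EA.

    Assumption: round half up to a whole number, with at least one segment.
    """
    return max(1, int(ea_census_population / STANDARD_SEGMENT_POPULATION + 0.5))


def choose_segment(ea_census_population, rng=None):
    """Pick one segment number (1-based) at random with equal probability."""
    rng = rng or random.Random()
    return rng.randint(1, n_standard_segments(ea_census_population))


# Hypothetical EA with a census population of 1800 persons.
segments = n_standard_segments(1800)
print(segments, choose_segment(1800, random.Random(3)))
```

For this hypothetical EA, listing covers one segment of roughly 450 persons instead of all 1800, which is the workload reduction the method aims for.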
21 MICS Sampling Option 3
USE NEW SAMPLE WITHOUT HOUSEHOLD LISTING OPERATION
(Modified Segment, or Cluster, Design)
- Design new MICS sample based on prototype.
- Two stages with census as frame
- Use of implicit stratification, systematic selection of census EAs at first stage with pps
- Pre-determine number of segments based on desired cluster size.
- Sketch a map of the segments in each EA.
- Choose 1 segment at random.
- Interview all households in the selected segment
22 OPTION 3 - NEW SAMPLE WITHOUT HOUSEHOLD LISTING, continued
Illustration:
- Suppose the desired cluster size is 20 households.
- Suppose the first sample EA contains 112 census households (according to the frame).
- Divide 112 by 20 = 5.6 (round to 6).
- Sketch a map of exactly 6 segments based on a canvass of the EA.
- Select one segment at random.
- Interview all households (no matter how many are currently in the selected segment).
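The illustration can be extended to show why this design is approximately self-weighting. In the sketch below, an EA is selected with probability proportional to its census households and one of its segments is then chosen with equal probability; the frame total of 600,000 households and the 300 sample EAs are hypothetical figures, and the function names are illustrative.

```python
def n_segments(ea_census_households, desired_cluster_size):
    """Number of segments for one EA (round half up, at least one)."""
    return max(1, int(ea_census_households / desired_cluster_size + 0.5))


def household_selection_probability(ea_households, total_households,
                                    n_sample_eas, desired_cluster_size):
    """Overall selection probability of a household in the given EA."""
    # First stage: pps selection of the EA.
    p_ea = n_sample_eas * ea_households / total_households
    # Second stage: one segment chosen at random within the EA.
    p_segment = 1 / n_segments(ea_households, desired_cluster_size)
    return p_ea * p_segment


# The EA from the illustration: 112 census households, cluster size 20.
print(n_segments(112, 20))  # 6

# Hypothetical frame: 600,000 households, 300 sample EAs.
p = household_selection_probability(112, 600_000, 300, 20)
ideal = 300 * 20 / 600_000  # value if no rounding of segments were needed
print(round(p, 5), ideal)
```

Because the number of segments is roughly the EA size divided by the cluster size, the EA size cancels and every household has about the same chance of selection; the rounding of 5.6 to 6 is what makes it approximate rather than exact.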
23 OPTION 3 - NEW SAMPLE WITHOUT HOUSEHOLD LISTING, continued
Advantages of option 3
- avoids listing completely
- probability-based
- self-weighting (national level)
Limitations of option 3
- less reliable than option 2 (households are "clustered" together in compact segments)
- segmentation itself can be time-consuming and complicated
- difficult to control sample size
24 Special Topics
- Sub-national estimates, domains
- Water and sanitation estimates
- Survey weighting, sampling errors
- Other: sample frame construction, selection techniques
- Country examples
25 Sub-national Estimates, Domains
- The number of separate areas (domains) for which separate, equally reliable estimates are wanted affects sample size.
- If, say, 5 regional estimates are wanted, then, theoretically, the sample should be increased by a factor of 5.
- One must therefore be careful in producing separate estimates for domains:
  * Either limit the number of domains to avoid a large increase in sample size,
  * Or be prepared to accept domain estimates with much higher sampling errors than national.
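The trade-off between the two choices can be quantified with a rough sketch. It assumes equal-sized domains with the same coverage rate and design effect as the national sample, so that sampling error scales with one over the square root of the sample size; these simplifications and the function names are not from the slides.

```python
import math


def total_sample_for_equal_reliability(national_sample_size, n_domains):
    """Households needed if every domain must match national reliability."""
    return national_sample_size * n_domains


def domain_relative_margin(national_rel_margin, n_domains):
    """Approximate relative margin of error per domain if the national
    sample is simply split into `n_domains` equal parts."""
    return national_rel_margin * math.sqrt(n_domains)


# Using the Example 1 sample size and the recommended 12 percent margin.
print(total_sample_for_equal_reliability(5941, 5))   # 29705
print(round(domain_relative_margin(0.12, 5), 3))     # 0.268
```

With five domains, the choice is between roughly five times as many households or domain estimates whose relative margin of error is more than double the national one.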
26 Water and Sanitation Estimates
- These are an important component of MICS.
- Sampling errors will be high, however (extremely high in some cases).
- The MICS sample is designed primarily for person variables rather than household variables such as water/sanitation.
- Sample design effects for water and sanitation indicators will be much higher than for other indicators.
- Consequently, sampling reliability is very low.
- Estimates can nevertheless be useful for estimating trends in water/sanitation if previous surveys exist with which to make comparisons.
27 Survey Weighting and Sampling Errors
- All analysis based on survey data must apply survey weights in order to prevent biased results.
- Survey weighting is design-specific.
- Non-response must be taken into account.
- Formulas for calculating weights depend on the exact sample design used in each country.
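As a hedged illustration of what a weight calculation can look like, the sketch below uses the generic two-stage form: the design weight is the inverse of the overall selection probability, multiplied by a non-response adjustment. The actual formulas depend on each country's design; the function names and the numbers are hypothetical.

```python
def design_weight(p_first_stage, p_second_stage):
    """Inverse of the overall selection probability in a two-stage design."""
    return 1.0 / (p_first_stage * p_second_stage)


def nonresponse_adjustment(n_selected, n_responded):
    """Inflate weights so respondents also represent non-respondents.

    Assumption: the adjustment is made within a cluster or stratum.
    """
    return n_selected / n_responded


def household_weight(p_first_stage, p_second_stage, n_selected, n_responded):
    """Final household weight: design weight times non-response adjustment."""
    return (design_weight(p_first_stage, p_second_stage)
            * nonresponse_adjustment(n_selected, n_responded))


# Hypothetical cluster: EA selected with probability 0.05, households within
# it with probability 0.2, and 20 of 25 selected households interviewed.
print(round(household_weight(0.05, 0.2, 25, 20), 2))  # 125.0
```

In a self-weighting design the design weight is the same for every household, so only the non-response adjustment makes the final weights differ.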
28 Sampling Error Estimation
- Calculation of sampling errors is necessary to evaluate the reliability of survey estimates
- Should be done for important indicators
- Methodology is complex and design-specific
- There are several options for sampling error calculations:
  * May use existing software (Clusters, WesVar, CenVar, PCCarp, etc.)
  * The latest version of SPSS is currently being evaluated to determine whether its new sampling error routines are appropriate for MICS3 surveys
  * Routines in CSPro can be used
  * Or use a simple variance spreadsheet that will be available on the MICS website
29 Sampling Error Estimation, continued
With the spreadsheet, it is only necessary to enter:
- Survey weights for each cluster
- Unweighted indicator estimate for each cluster
Then:
- Sampling error is automatically calculated
- Confidence limits and design effect are automatically calculated
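To give a feel for the kind of calculation involved, here is a minimal sketch of a cluster-level variance estimate from the same two inputs. It is not the MICS spreadsheet's actual method: it assumes a single stratum, clusters treated as sampled with replacement, a per-cluster weight equal to the total survey weight of the cluster's cases, and a standard linearization for a ratio. The number of cases needed for the design effect is an extra input introduced here.

```python
import math


def cluster_sampling_error(cluster_weights, cluster_estimates, n_cases=None):
    """Weighted estimate, standard error, approximate 95% limits and
    (optionally) design effect from per-cluster weights and proportions."""
    k = len(cluster_weights)
    total_weight = sum(cluster_weights)
    estimate = sum(w * p for w, p in
                   zip(cluster_weights, cluster_estimates)) / total_weight
    # Linearized cluster residuals; they sum to zero by construction.
    residuals = [w * (p - estimate) for w, p in
                 zip(cluster_weights, cluster_estimates)]
    variance = k / (k - 1) * sum(z * z for z in residuals) / total_weight ** 2
    se = math.sqrt(variance)
    result = {
        "estimate": estimate,
        "se": se,
        "lower": estimate - 2 * se,  # factor 2, as in the "4" of the formula
        "upper": estimate + 2 * se,
    }
    if n_cases:
        srs_variance = estimate * (1 - estimate) / n_cases
        result["deff"] = variance / srs_variance
    return result


# Hypothetical data: four equally weighted clusters of 20 cases each.
out = cluster_sampling_error([1, 1, 1, 1], [0.2, 0.4, 0.6, 0.8], n_cases=80)
print({name: round(value, 4) for name, value in out.items()})
```

A stratified design would apply the same calculation within each stratum and add the stratum variances, which is one reason the methodology is described as design-specific.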
30 Other Topics
Other key information to be included in the MICS3 manual for the sampling statistician to review:
- Sample frame construction
  * When a new sample is used for MICS
  * Especially important if the frame is old
- Selection techniques
  * Details of systematic sampling
  * PPS sampling (probability proportionate to size)
- Country examples from MICS2000
  * Papua New Guinea, Lebanon, Angola