 Multistage Sampling.

Outline
Features of multi-stage sample designs
Selection probabilities in multi-stage sampling
Estimation of parameters
Calculation of standard errors
Efficiency of multi-stage samples

Introduction
Multi-stage sampling means what its name suggests: there are multiple stages in the sampling process. The number of stages can be large, although it is rare to have more than three. This topic concentrates on two-stage sampling, also known as subsampling.

Sampling Units in Multi-stage Sampling
First-stage sampling units are called primary sampling units or PSUs. Second-stage sampling units are called secondary sampling units or SSUs. Last-stage sampling units are called ultimate sampling units or USUs.

4-stage Sampling (example)
Selection proceeds through four stages: villages, EAs, dwellings, persons.

Your Examples
Estimation domains
Stratification
Number of stages
Sampling units for each stage
Sample selection scheme in each stage
Sampling frames used in each stage

Example: Maldives HIES 2002

Two-Stage Sampling
Stage One. Select a sample of clusters from the population of clusters, using any sampling scheme (usually SRSWOR, PPSWR, or LSS).
Stage Two. Select a sample of elements within each of the sample clusters. This is also referred to as a 'subsample' of elements within a cluster. Subsampling can also be done using any sampling scheme.
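The two stages can be sketched in a few lines of Python. This is a minimal illustration assuming SRSWOR at both stages; the function name `two_stage_sample`, the village and household labels, and the sample sizes a and b are hypothetical.

```python
import random

def two_stage_sample(clusters, a, b, seed=None):
    """Select a clusters by SRSWOR, then up to b elements by SRSWOR in each."""
    rng = random.Random(seed)
    # Stage one: simple random sample of cluster labels, without replacement.
    sampled_ids = rng.sample(sorted(clusters), a)
    # Stage two: an independent subsample of elements inside each sampled cluster.
    return {
        cid: rng.sample(clusters[cid], min(b, len(clusters[cid])))
        for cid in sampled_ids
    }

# Hypothetical population: 6 clusters (villages) of 5 households each.
population = {
    f"village_{i}": [f"hh_{i}_{j}" for j in range(5)] for i in range(6)
}
sample = two_stage_sample(population, a=3, b=2, seed=1)
for village, households in sample.items():
    print(village, households)
```

Replacing either call to `rng.sample` with a different selection routine changes the scheme at that stage without touching the other, which is the flexibility the slide refers to.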

Most Large-Scale Surveys Use Multi-stage Sampling Because …
Sampling frames are available at higher stages but not for the ultimate sampling units.
Construction of sampling frames at each lower stage becomes less costly.
Using clusters at higher stages of selection is cost-efficient.
Sampling units and methods of selection can be chosen flexibly at different stages.
The contributions of different stages to the sampling variance can be estimated separately.

Probabilities of Selection
The probability that an element in the population is selected in a 2-stage sample is the product of:
the probability that the cluster to which it belongs is selected at the first stage, and
the probability that the element is selected at the second stage, given that the cluster to which it belongs is selected at the first stage.
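As a small worked example of this product rule, assume SRSWOR at both stages, with a of A clusters selected and b of B elements selected within a sampled cluster. The symbol A for the number of clusters in the population is an assumption here, chosen to mirror the "b from B" notation used later.

```python
from fractions import Fraction

def inclusion_probability(a, A, b, B):
    """Overall selection probability under SRSWOR at both stages."""
    stage_one = Fraction(a, A)   # P(cluster selected)
    stage_two = Fraction(b, B)   # P(element selected | cluster selected)
    return stage_one * stage_two

# Hypothetical design: 10 of 50 clusters, then 5 of 20 elements per cluster.
p = inclusion_probability(a=10, A=50, b=5, B=20)
weight = 1 / p                   # design weight is the inverse probability
print(p, weight)                 # 1/20 20
```

With equal cluster sizes every element has the same probability, so the sample is self-weighting; with unequal sizes B_i and a fixed b, the probability differs from cluster to cluster.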

Example: Two-Stage Samples

Estimation Procedures: Illustrations
SRS at stage 1 and SRS at stage 2
SRS at stage 1 and LSS at stage 2 (b from B)
PPSWR at stage 1 and SRS at stage 2 (b from B)

SRS – SRS: Estimation of Total
Estimator of Total Variance of Estimator

SRS – SRS: Variance of Estimator
Sources of Variation = {PSUs} + {SSUs}
Total variability = Variability among PSUs + Variability of SSUs

SRS-SRS: Estimating Variance
Estimator of Variance of Estimator for Total
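The formulas on these slides did not survive in the transcript. As a hedged sketch, the code below uses the standard textbook forms for SRSWOR at both stages: the total is estimated by (A/a) Σ B_i ȳ_i, and the variance estimate adds a between-PSU term and a within-PSU term. The symbols A (PSUs in the population), B_i (size of PSU i), and the function name `srs_srs_total` are assumptions, and the slides may have used different notation.

```python
from statistics import variance

def srs_srs_total(sample, A):
    """sample: list of (B_i, [y_i1, ..., y_ib_i]) for the a sampled PSUs.

    Each subsample needs at least two observations.
    """
    a = len(sample)
    # Estimated PSU totals: PSU size times the subsample mean.
    psu_totals = [B_i * sum(ys) / len(ys) for B_i, ys in sample]
    total_hat = A / a * sum(psu_totals)
    # Between-PSU component: variability among estimated PSU totals.
    between = A**2 * (1 - a / A) * variance(psu_totals) / a
    # Within-PSU component: variability of SSUs inside each sampled PSU.
    within = A / a * sum(
        B_i**2 * (1 - len(ys) / B_i) * variance(ys) / len(ys)
        for B_i, ys in sample
    )
    return total_hat, between + within

# Hypothetical data: 3 PSUs sampled out of A = 10.
data = [(4, [1, 3]), (6, [2, 4]), (5, [5, 5])]
total_hat, var_hat = srs_srs_total(data, A=10)
print(total_hat, var_hat)
```

The two terms correspond to the decomposition given earlier: variability among PSUs plus variability of SSUs.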

SRS-SRS: Estimating a Mean
Each PSU has the same number of elements, B. A subsample of b elements is selected where

… with variance estimate
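For the equal-size case, a minimal sketch of the usual textbook estimator is given below, assuming SRSWOR at both stages: the mean of the a subsample means, with variance estimate (1 − f1) s1²/a + f1 (1 − f2) s2²/(ab), where f1 = a/A and f2 = b/B. The names f1, f2, s1², s2², and A are assumptions, since the slide's own formulas are not in the transcript.

```python
from statistics import mean, variance

def srs_srs_mean(subsamples, A, B):
    """subsamples: a lists, each holding b observations from one PSU."""
    a = len(subsamples)
    b = len(subsamples[0])
    f1, f2 = a / A, b / B                      # sampling fractions per stage
    psu_means = [mean(ys) for ys in subsamples]
    y_bar = mean(psu_means)                    # overall sample mean
    s1_sq = variance(psu_means)                # variance between PSU means
    # Pooled within-PSU variance (equal b, so a plain average is enough).
    s2_sq = mean([variance(ys) for ys in subsamples])
    v = (1 - f1) * s1_sq / a + f1 * (1 - f2) * s2_sq / (a * b)
    return y_bar, v

# Hypothetical data: a = 3 PSUs out of A = 10, b = 2 elements out of B = 4.
y_bar, v = srs_srs_mean([[1, 3], [2, 4], [5, 7]], A=10, B=4)
print(y_bar, v)
```

When f1 is small the second term is negligible, which is why the between-PSU variance usually dominates the standard error.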

SRS-SRS: Population Mean (1)
PSUs have unequal sizes

SRS-SRS: Population Mean (2)
PSUs have unequal sizes

SRS-SRS: Population Mean (3)
PSUs have unequal sizes
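The estimators shown on these three slides are not recoverable from the transcript. As an illustration only, the sketch below computes three estimators commonly discussed for unequal PSU sizes under SRS at both stages: an unbiased estimator that needs the known population size M, a ratio estimator that needs only the sampled PSU sizes, and the simple mean of subsample means. Whether these match the slides, and in which order, is an assumption; A, M, and the function name are hypothetical.

```python
def mean_estimators(sample, A, M):
    """sample: list of (B_i, [y_i1, ...]) for the sampled PSUs.

    A: number of PSUs in the population; M: total number of elements.
    """
    a = len(sample)
    psu_means = [sum(ys) / len(ys) for _, ys in sample]
    sizes = [B_i for B_i, _ in sample]
    psu_totals = [B_i * m for B_i, m in zip(sizes, psu_means)]
    # Unbiased: estimated population total divided by the known size M.
    unbiased = (A / a) * sum(psu_totals) / M
    # Ratio: estimated total over estimated size; slightly biased, often more stable.
    ratio = sum(psu_totals) / sum(sizes)
    # Simple mean of PSU means: ignores sizes, biased if means vary with size.
    simple = sum(psu_means) / a
    return unbiased, ratio, simple

# Hypothetical data: 3 PSUs sampled out of A = 10, with M = 60 elements in total.
data = [(4, [1, 3]), (6, [2, 4]), (5, [5, 5])]
unbiased, ratio, simple = mean_estimators(data, A=10, M=60)
print(unbiased, ratio, simple)
```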

SRS-LSS: Estimation of Mean

PPSWR-SRS: Estimation of Total
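The slide's formula is missing from the transcript. A minimal sketch of the standard Hansen-Hurwitz form is shown below, assuming PSUs are drawn with replacement with draw probabilities p_i and elements are subsampled by SRS within each drawn PSU. The tuple layout and the name `ppswr_srs_total` are hypothetical.

```python
from statistics import variance

def ppswr_srs_total(draws):
    """draws: list of (p_i, B_i, [y_i1, ...]), one entry per PPSWR draw."""
    a = len(draws)
    # For each draw: estimated PSU total, inflated by the draw probability.
    z = [B_i * sum(ys) / len(ys) / p_i for p_i, B_i, ys in draws]
    total_hat = sum(z) / a
    # With-replacement draws: the spread of the z values estimates the
    # variance of the estimator, covering both stages at once.
    var_hat = variance(z) / a
    return total_hat, var_hat

# Hypothetical draws: (selection probability, PSU size, subsample values).
draws = [(0.2, 4, [1, 3]), (0.3, 6, [2, 4]), (0.25, 5, [5, 5])]
total_hat, var_hat = ppswr_srs_total(draws)
print(total_hat, var_hat)
```

A practical attraction of sampling with replacement at stage one is that this variance estimate needs no separate within-PSU term.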

Design Effect for 2-stage Sample
If ρ, the intraclass correlation (ICC), is positive, the design effect decreases as the subsample size b decreases. For fixed n = ab, the smaller the subsample size and hence the larger the number of clusters included in the sample, the more precise the sample mean.
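The statement above can be checked numerically with the commonly used approximation deff = 1 + (b − 1)ρ for equal-size subsamples. This formula is an assumption here, since the slide's own expression is not in the transcript; the values of n and ρ are hypothetical.

```python
def design_effect(b, rho):
    """Approximate design effect for subsamples of size b with ICC rho."""
    return 1 + (b - 1) * rho

n = 600       # fixed total sample size, n = a * b
rho = 0.05    # hypothetical intraclass correlation
results = []
for b in (30, 20, 10, 5):
    a = n // b                    # number of clusters implied by b
    deff = design_effect(b, rho)
    results.append((a, b, deff))
    print(f"a={a:4d}  b={b:3d}  deff={deff:.2f}  effective n={n / deff:.0f}")
```

With ρ > 0, the effective sample size n / deff rises as the same n is spread over more clusters.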

Designing a Cluster Sample
What overall precision is needed?
What size should the PSUs be?
How many SSUs should be sampled in each PSU selected for the sample?
How many PSUs should be sampled?

Choosing PSU Size
Often the PSU is a natural unit, so there is not much choice.
The larger the PSU size, the more variability within a PSU.
The ICC is smaller for large PSUs than for small PSUs, but if the PSU size is too large, it is less cost-efficient.
Need to study the relationship between PSU sizes, ICC, and costs.

Optimum Sample Sizes (1)
Goal: get the most information (and hence the greatest statistical efficiency) for the least cost.
Illustrative example: PSUs with equal sizes, SRSWOR at both stages.

Optimum Sample Sizes (2)
Variance function Cost function Minimize V subject to given cost C*

Optimum Sample Sizes (3)
Minimize V subject to given cost C* Optimum a=a* and b=b*

Optimum Sample Sizes (4)
Optimum b=b*
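The variance and cost functions on these slides are missing from the transcript. As a hedged sketch, assume the linear cost function C = c1·a + c2·a·b (c1 per PSU, c2 per SSU) and a variance proportional to [1 + (b − 1)ρ]/(ab), ignoring finite population corrections. Minimizing V for fixed C then gives b* = sqrt((c1/c2)·(1 − ρ)/ρ) and a* = C/(c1 + c2·b*). The symbols c1, c2, and ρ and all numeric values are assumptions.

```python
from math import sqrt

def optimum_allocation(total_cost, c1, c2, rho):
    """Return (a*, b*) under C = c1*a + c2*a*b and V ∝ [1 + (b-1)*rho] / (a*b)."""
    # Optimum subsample size: depends only on the cost ratio and rho.
    b_star = sqrt((c1 / c2) * (1 - rho) / rho)
    # Number of PSUs that exhausts the budget at that subsample size.
    a_star = total_cost / (c1 + c2 * b_star)
    return a_star, b_star

# Hypothetical costs: 100 per PSU, 4 per SSU, budget 14000, ICC 0.2.
a_star, b_star = optimum_allocation(total_cost=14000, c1=100, c2=4, rho=0.2)
print(a_star, b_star)
```

In practice a* and b* are rounded to integers. Note that b* does not depend on the budget: a larger budget buys more PSUs, not larger subsamples.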