1
Sampling Strategy for Establishment Surveys
International Workshop on Industrial Statistics
Beijing, China, 8-10 July 2013

2
Annual and Quarterly/Monthly Surveys
- Annual establishment surveys
  - Generally larger sample size to cover all sectors
  - Results needed for national accounts and planning for economic development
- Quarterly/monthly surveys
  - Smaller sample size
  - May be based on a subsample of the annual survey, or on cut-off sampling
  - Designed to measure short-term trends

3
Stratification of Sampling Frame
- Stratification is a key concept in sampling
- The sampling frame is divided into homogeneous groups (strata)
- Sample selection is independent within each stratum
- Variance is calculated only within strata, not between strata
- Stratification can reduce the sampling error of survey estimates considerably, especially for economic surveys

4
Stratification criteria
- For a business register or list frame of establishments, the main variables used for stratification are:
  - ISIC group
  - Number of employees
  - Other measure of size (such as revenue)
- ISIC groups can be defined at different levels (2 to 4 digits)
  - A 4-digit ISIC with many establishments may form an individual stratum
  - Group similar ISICs that have fewer establishments

5
Stratification criteria (continued)
- Stratification by region is also possible, depending on the domains of analysis
  - ISIC and size strata are defined within each region
  - Increases the sample size considerably, since a minimum sample size is required for each region

6
Stratification by size
- Size of establishment is important for an efficient allocation of the sample
- Size strata are defined within each ISIC group
- Establish a certainty stratum for the largest establishments, such as 20+ or 50+ employees
  - The certainty cut-off may vary by ISIC group

7
Stratification by size (continued)
- Establish other strata by reasonable ranges of employment size, for example:
  - 1 employee
  - 2-4 employees
  - 5-9 employees
  - 10-19 employees
  - 20-49 employees
  - 50+ employees
- Sampling rates will vary by size stratum
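As a minimal sketch, the example size ranges above can be turned into a lookup that assigns each frame record to a size stratum. The names `SIZE_LOWER_BOUNDS`, `SIZE_LABELS` and `size_stratum` are illustrative and not part of the original material; the boundaries are simply the example ranges from this slide.

```python
from bisect import bisect_right

# Lower bounds of every stratum after the first one (example ranges from the slide).
SIZE_LOWER_BOUNDS = [2, 5, 10, 20, 50]
SIZE_LABELS = ["1", "2-4", "5-9", "10-19", "20-49", "50+"]


def size_stratum(employees):
    """Return the size-stratum label for an establishment (hypothetical helper)."""
    if employees < 1:
        raise ValueError("number of employees must be at least 1")
    # bisect_right counts how many lower bounds are <= employees,
    # which is exactly the index of the stratum label.
    return SIZE_LABELS[bisect_right(SIZE_LOWER_BOUNDS, employees)]
```

For instance, `size_stratum(7)` falls in the "5-9" stratum and `size_stratum(50)` in the "50+" stratum.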

8
Misclassification or Change in ISIC or Size Group
- Sometimes the economic activity may be misclassified in the sampling frame
- An establishment may change its predominant activity
- The number of employees can vary over time
- Probabilities of selection are based on the original frame

9
Change in ISIC or size group (continued)
- Important to calculate weights and sampling errors based on the ORIGINAL stratification
- Possible to post-stratify the sample by the correct ISIC and size groups for the tabulations
- Both sets of classification codes should be included in the economic survey data file
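The point about keeping both sets of codes can be illustrated with a small sketch: the weight comes from the sampling rate of the stratum in which the establishment was originally selected, while the tabulation uses the corrected ISIC code. The record fields (`orig_stratum`, `current_isic`, `value`) and the function name are assumptions made for this illustration only.

```python
from collections import defaultdict


def weighted_totals_by_current_group(records, sampling_rates):
    """Tabulate weighted totals by current ISIC group.

    records: list of dicts with hypothetical keys
             'orig_stratum', 'current_isic' and 'value'.
    sampling_rates: dict mapping original stratum -> sampling rate used at selection.
    """
    totals = defaultdict(float)
    for rec in records:
        # The weight depends only on the ORIGINAL stratum (probability of selection).
        weight = 1.0 / sampling_rates[rec["orig_stratum"]]
        # The tabulation cell depends on the CURRENT (corrected) classification.
        totals[rec["current_isic"]] += weight * rec["value"]
    return dict(totals)
```

An establishment selected in one ISIC stratum but found to belong to another keeps its original weight and is simply counted in the table cell of its corrected group.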

10
Sample size
- Sample size depends on the level of precision required for survey estimates as well as resource constraints
- Precision is measured by the sampling error or coefficient of variation (CV)
  - Sampling errors are inversely proportional to the square root of the sample size
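Because the sampling error is inversely proportional to the square root of the sample size, halving the CV requires roughly four times the sample. A rough sketch of that arithmetic follows; it ignores the finite population correction and any design change, and the function name is illustrative.

```python
import math


def required_sample_size(current_n, current_cv, target_cv):
    """Approximate sample size needed to reach target_cv.

    Uses CV proportional to 1/sqrt(n); ignores the finite population
    correction, so it is only a first approximation.
    """
    return math.ceil(current_n * (current_cv / target_cv) ** 2)


# A previous survey with 400 establishments gave a CV of 10 percent;
# reaching 5 percent needs about four times as many establishments.
n_needed = required_sample_size(400, 10, 5)
```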

11
Sample size (continued)
- The required level of precision should be specified for each domain that will be included in the tabulations
- Accuracy also depends on bias from nonsampling errors
- Nonsampling errors generally increase with the sample size, since quality and operational control become more difficult

12
Calculate sampling errors for estimates from previous surveys
- Previous economic survey data are very useful for determining the required sample size
- Estimate sampling errors for the most important indicators by domain
- Identify domains where it is necessary to increase or decrease the sample size

13
Use of sampling frame data for estimating expected precision
- If the sampling frame has data on number of employees, revenue or other variables, it is possible to estimate the expected level of precision
- Simulation study to calculate the sampling error for each domain based on a specific sample size
- Use the information from the frame for the final sample to estimate the sampling errors
  - Possible to adjust the final sample size for some domains if necessary
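One way to approximate such a study without repeated draws is to apply the standard variance formula for a stratified simple random sample directly to the frame values. The sketch below does this for the estimated total of a frame variable (for example, employment) in one domain. It is an analytical shortcut rather than a true simulation, and the function and argument names are assumptions.

```python
import math
from statistics import variance


def expected_cv(frame_strata, sample_sizes):
    """Expected CV of an estimated total for a proposed allocation.

    frame_strata: dict stratum -> list of frame values (at least 2 per stratum).
    sample_sizes: dict stratum -> proposed sample size n_h (1 <= n_h <= N_h).
    """
    total = 0.0
    var_total = 0.0
    for stratum, values in frame_strata.items():
        N_h = len(values)
        n_h = sample_sizes[stratum]
        total += sum(values)
        if n_h < N_h:
            # Strata sampled with certainty (n_h == N_h) add no sampling variance.
            var_total += N_h ** 2 * (1 - n_h / N_h) * variance(values) / n_h
    return math.sqrt(var_total) / total
```

Running the function for several candidate allocations shows which domains would miss the required precision, so their sample sizes can be adjusted before selection.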

14
Sample allocation
- The first level of sample allocation depends on the domains of analysis for the survey tables
- A minimum sample is required for each ISIC group that will appear in the tables
- Examine the distribution of the frame of establishments by ISIC group and size
- Any rare but important ISIC group can be included in the sample with certainty

15
Sample allocation (continued)
- After establishing the size cut-off for the certainty stratum (for example, 50+ employees), determine the total number of certainty establishments
- Subtract the number of certainty establishments from the maximum sample size to determine the sample to be allocated to the remaining strata
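The subtraction step can be sketched as follows, assuming the frame is available as a list of employee counts. The function name and the default cut-off of 50 (taken from the example above) are illustrative.

```python
def remaining_sample(frame_employees, max_sample_size, certainty_cutoff=50):
    """Return (number of certainty establishments, sample left for other strata)."""
    n_certainty = sum(1 for e in frame_employees if e >= certainty_cutoff)
    if n_certainty > max_sample_size:
        raise ValueError("certainty stratum alone exceeds the maximum sample size")
    return n_certainty, max_sample_size - n_certainty
```

If the certainty stratum uses up too much of the budget, the cut-off can be raised and the calculation repeated.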

16
Sample allocation (continued)
- Some countries determine a practical and approximately optimum sampling rate by size stratum
- Example:

  Employment size stratum          Sampling rate
  (number of persons engaged)
  1                                2%
  2-4                              5%
  5-9                              10%
  10-19                            25%
  20-49                            50%
  50+                              100%
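Given frame counts by size stratum, the example rates above translate directly into expected sample sizes. A small sketch follows; the frame counts used in the usage line are hypothetical, and only the rates come from the example.

```python
# Example sampling rates by employment size stratum, as in the table above.
SAMPLING_RATES = {
    "1": 0.02, "2-4": 0.05, "5-9": 0.10,
    "10-19": 0.25, "20-49": 0.50, "50+": 1.00,
}


def expected_sample_sizes(frame_counts, rates=SAMPLING_RATES):
    """Expected number of sample establishments per size stratum."""
    return {stratum: round(count * rates[stratum])
            for stratum, count in frame_counts.items()}


# Hypothetical frame counts for one ISIC group.
sizes = expected_sample_sizes({"1": 5000, "2-4": 3000, "50+": 40})
```

The design weight in each stratum is the inverse of its sampling rate, so establishments in the 2% stratum each represent 50 frame units.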

17
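Optimum allocation
- Neyman optimum allocation
- Used when costs per unit are similar for each stratum
- Formula:

  $$ n_h = n \cdot \frac{N_h S_h}{\sum_{k} N_k S_k} $$

  where $n_h$ is the sample size allocated to stratum $h$, $n$ is the total sample size, $N_h$ is the number of establishments in the frame for stratum $h$, and $S_h$ is the standard deviation of the variable of interest in stratum $h$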

18
Example of Neyman allocation

  Parameter        Size stratum (number of persons engaged)
                   1          2-4        5-9        Total
  N_h              3,263      11,331     8,060      22,654
  S_h              0.26       0.79       1.28
  N_h x S_h        861.6      8,975.95   10,333.6   20,171.2
  Optimum n_h      48         498        574        1,120
  Sampling rate    1.5%       4.4%       7.1%
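The allocation in this example can be checked with a few lines of code. The sketch below allocates a total sample in proportion to the products N_h x S_h; it takes the product row from the table as given, because the S_h values shown are rounded to two decimals. The function name is illustrative.

```python
def neyman_allocation(products, n):
    """Allocate a total sample n in proportion to N_h * S_h.

    products: list of N_h * S_h values, one per stratum.
    Rounded results may not always sum exactly to n.
    """
    total = sum(products)
    return [round(n * p / total) for p in products]


# Values from the example table (strata 1, 2-4 and 5-9 persons engaged).
N_h = [3263, 11331, 8060]
products = [861.6, 8975.95, 10333.6]

n_h = neyman_allocation(products, 1120)
rates_pct = [round(100 * n / N, 1) for n, N in zip(n_h, N_h)]
```

With these inputs the allocation matches the optimum n_h row and the sampling rates of 1.5%, 4.4% and 7.1% in the table.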

19
Sample allocation (continued)
- Sampling rates by stratum from the Neyman allocation are used as a guideline to determine sampling rates by size for each ISIC group
- Ensure that each ISIC group has a sufficient number of observations for reliable estimates

20
Cut-off sampling
- Select only establishments above a certain size category
- Remaining establishments below the cut-off are excluded from the sampling frame
- Not a probability sample
- The establishments above the cut-off may represent 90% or more of the employment in the frame
- Sometimes used for monthly indicators of trends in production for the largest manufacturing establishments, for example
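Before adopting a cut-off, it is useful to check what share of frame employment the retained establishments cover. A minimal sketch, assuming a list of employee counts from the frame (the function name is illustrative):

```python
def cutoff_coverage(frame_employees, cutoff):
    """Share of total frame employment held by establishments at or above the cut-off."""
    total = sum(frame_employees)
    covered = sum(e for e in frame_employees if e >= cutoff)
    return covered / total
```

Trying several cut-off values shows the smallest size threshold at which coverage reaches the level wanted, such as 90% of employment.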

21
Sample selection procedures: list frame from census or business register
- Separate all the establishments to be included in the sample with certainty
- Sort the remaining establishments by ISIC group and size stratum, perhaps also by geography or number of employees to provide further implicit stratification
- Separate the individual ISIC group and size strata
- Select the sample systematically with equal probability
- Use the sampling rate established for the stratum
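The selection step within one stratum can be sketched as a systematic sample with a random start. The list is assumed to be already sorted (for example by geography or number of employees) so that the systematic pass provides the implicit stratification described above; the function name and the optional `rng` argument are assumptions.

```python
import random


def systematic_sample(units, rate, rng=None):
    """Equal-probability systematic sample from a sorted list of units."""
    rng = rng or random.Random()
    interval = 1.0 / rate             # sampling interval, e.g. 10 for a 10% rate
    start = rng.random() * interval   # random start in [0, interval)
    selected = []
    k = 0
    while start + k * interval < len(units):
        # Truncating the selection point gives the index of the selected unit.
        selected.append(units[int(start + k * interval)])
        k += 1
    return selected
```

Each unit in the stratum has the same selection probability (the sampling rate), so every selected establishment receives the weight `1 / rate`.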

22
Selection of establishments with probability proportional to size
- The PPS sampling procedure is used less frequently for selecting establishments
- Within each ISIC group stratum, establishments can be selected with PPS, based on employment, revenue or another measure of size

23
PPS selection (continued)
- Advantage: larger establishments are selected with higher probability
- Disadvantage: weighting procedures are more complex
  - Weights vary by establishment
- Stratification by size may be simpler to implement
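To make the weighting point concrete, here is a sketch of systematic PPS selection within one stratum that returns a separate weight for each selected establishment. It assumes that any unit as large as the sampling interval has already been moved to the certainty stratum; the function name and signature are illustrative.

```python
import random
from itertools import accumulate


def pps_systematic(sizes, n, rng=None):
    """Systematic PPS selection; returns a list of (index, weight) pairs."""
    rng = rng or random.Random()
    total = sum(sizes)
    interval = total / n
    if max(sizes) >= interval:
        raise ValueError("move units as large as the interval to a certainty stratum")
    cumulative = list(accumulate(sizes))
    start = rng.random() * interval
    selected = []
    i = 0
    for k in range(n):
        point = start + k * interval
        # Advance to the unit whose cumulative size range contains the point.
        while i < len(cumulative) - 1 and cumulative[i] <= point:
            i += 1
        probability = n * sizes[i] / total   # selection probability of unit i
        selected.append((i, 1.0 / probability))
    return selected
```

Unlike the equal-probability case, the weight differs for every establishment: it is the inverse of that establishment's own selection probability.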

24
Sample selection procedures: area sample
- Determine the measure of size for each segment (enumeration area, EA)
  - Total number of establishments (excluding those covered in the list frame)
  - Total number of employees
  - For each ISIC group, calculate the proportion of employees or revenue in each EA, and sum across EAs to obtain the measure of size
- Within each geographic stratum, select a sample of EAs systematically with PPS

25
Selection of areas (continued)
- Conduct a listing of all establishments in each sample EA
- Match the listing to the business register to exclude establishments included in the list frame
- Select sample establishments systematically with equal probability within each EA, stratified at the second stage by major economic activities


