1 Aspects of Sampling for Household Surveys Kathleen Beegle Workshop 17, Session 1c Designing and Implementing Household Surveys March 31, 2009.

Slides:



Advertisements
Similar presentations
Multiple Indicator Cluster Surveys Survey Design Workshop
Advertisements

Sampling.
AADAPT Workshop South Asia Goa, December 17-21, 2009 Kristen Himelein 1.
Sampling Strategy for Establishment Surveys International Workshop on Industrial Statistics Beijing, China, 8-10 July 2013.
Statistics for Managers Using Microsoft® Excel 5th Edition
Multiple Indicator Cluster Surveys Survey Design Workshop
NLSCY – Non-response. Non-response There are various reasons why there is non-response to a survey  Some related to the survey process Timing Poor frame.
QBM117 Business Statistics Statistical Inference Sampling 1.
Chapter 7 Sampling Distributions
Determining the Size of
Aaker, Kumar, Day Ninth Edition Instructor’s Presentation Slides
Fundamentals of Sampling Method
Basic Business Statistics, 10e © 2006 Prentice-Hall, Inc.. Chap 7-1 Chapter 7 Sampling Distributions Basic Business Statistics 10 th Edition.
The Excel NORMDIST Function Computes the cumulative probability to the value X Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc
A new sampling method: stratified sampling
Stratified Simple Random Sampling (Chapter 5, Textbook, Barnett, V
7-1 Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall Chapter 7 Sampling and Sampling Distributions Statistics for Managers using Microsoft.
Determining the Size of a Sample
Formalizing the Concepts: Simple Random Sampling.
Error and Sample Sizes PHC 6716 June 1, 2011 Chris McCarty.
Formalizing the Concepts: STRATIFICATION. These objectives are often contradictory in practice Sampling weights need to be used to analyze the data Sampling.
Impact Evaluation Session VII Sampling and Power Jishnu Das November 2006.
SAMPLING AND STATISTICAL POWER Erich Battistin Kinnon Scott Erich Battistin Kinnon Scott University of Padua DECRG, World Bank University of Padua DECRG,
FINAL REPORT: OUTLINE & OVERVIEW OF SURVEY ERRORS
Determining the Size of
Sampling Concepts Population: Population refers to any group of people or objects that form the subject of study in a particular survey and are similar.
Marketing Research Aaker, Kumar, Day Seventh Edition Instructor’s Presentation Slides.
Sampling Theory and Surveys GV917. Introduction to Sampling In statistics the population refers to the total universe of objects being studied. Examples.
Chapter 5: Descriptive Research Describe patterns of behavior, thoughts, and emotions among a group of individuals. Provide information about characteristics.
Sample Design.
United Nations Workshop on the 2010 World Programme on Population and Housing Censuses: Census Evaluation and Post Enumeration Surveys, Amman, Jordan,
The new HBS Chisinau, 26 October Outline 1.How the HBS changed 2.Assessment of data quality 3.Data comparability 4.Conclusions.
Determining Sample Size
Chapter 12: AP Statistics
Copyright 2010, The World Bank Group. All Rights Reserved. Agricultural Census Sampling Frames and Sampling Section A 1.
IB Business and Management
Sampling: Theory and Methods
Sampling: What you don’t know can hurt you Juan Muñoz.
Multiple Indicator Cluster Surveys Survey Design Workshop Sampling: Overview MICS Survey Design Workshop.
Chap 20-1 Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc. Chapter 20 Sampling: Additional Topics in Sampling Statistics for Business.
Copyright ©2011 Pearson Education 7-1 Chapter 7 Sampling and Sampling Distributions Statistics for Managers using Microsoft Excel 6 th Global Edition.
Scot Exec Course Nov/Dec 04 Survey design overview Gillian Raab Professor of Applied Statistics Napier University.
Variables, sampling, and sample size. Overview  Variables  Types of variables  Sampling  Types of samples  Why specific sampling methods are used.
Population and Sampling
© 2013 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
Lecture 9 Prof. Development and Research Lecturer: R. Milyankova
Sampling Design and Analysis MTH 494 LECTURE-12 Ossam Chohan Assistant Professor CIIT Abbottabad.
United Nations Regional Workshop on the 2010 World Programme on Population and Housing Censuses: Census Evaluation and Post Enumeration Surveys, Bangkok,
Sampling Techniques 19 th and 20 th. Learning Outcomes Students should be able to design the source, the type and the technique of collecting data.
5-4-1 Unit 4: Sampling approaches After completing this unit you should be able to: Outline the purpose of sampling Understand key theoretical.
Copyright 2010, The World Bank Group. All Rights Reserved. Part 1 Sample Design Produced in Collaboration between World Bank Institute and the Development.
Basic Business Statistics, 10e © 2006 Prentice-Hall, Inc.. Chap 7-1 Chapter 7 Sampling Distributions Basic Business Statistics.
Confidence Intervals (Dr. Monticino). Assignment Sheet  Read Chapter 21  Assignment # 14 (Due Monday May 2 nd )  Chapter 21 Exercise Set A: 1,2,3,7.
Targeting of Public Spending Menno Pradhan Senior Poverty Economist The World Bank office, Jakarta.
Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc. Chap 7-1 Chapter 7 Sampling and Sampling Distributions Basic Business Statistics 11 th Edition.
Sampling and Statistical Analysis for Decision Making A. A. Elimam College of Business San Francisco State University.
Basic Business Statistics
Errors in Sampling Objective: to identify the types of nonsampling errors.
Chapter 7 Introduction to Sampling Distributions Business Statistics: QMIS 220, by Dr. M. Zainal.
Chapter 4: Designing Studies... Sampling. Convenience Sample Voluntary Response Sample Simple Random Sample Stratified Random Sample Cluster Sample Convenience.
Sampling Design and Analysis MTH 494 LECTURE-11 Ossam Chohan Assistant Professor CIIT Abbottabad.
The accuracy of averages We learned how to make inference from the sample to the population: Counting the percentages. Here we begin to learn how to make.
As a data user, it is imperative that you understand how the data has been generated and processed…
United Nations Regional Workshop on the 2010 World Programme on Population and Housing Censuses: Census Evaluation and Post Enumeration Surveys, Addis.
Random Sampling Error and Sample Size are Related
Chapter 7 Sampling Distributions
Random sampling Carlo Azzarri IFPRI Datathon APSU, Dhaka
Sampling and Power Slides by Jishnu Das.
SAMPLING AND STATISTICAL POWER
Sample Sizes for IE Power Calculations.
Presentation transcript:

1 Aspects of Sampling for Household Surveys Kathleen Beegle Workshop 17, Session 1c Designing and Implementing Household Surveys March 31, 2009

2 Overview Households should be selected through a documented process that gives each household in the population of interest a probability of being chosen that is positive and known This permits making inferences from the sample to the entire population with known margins of error Household samples are generally not simple random samples. They are instead Stratified  by Region, by Urban/Rural, by Intervention/Control, … Selected in two stages or more  Area Units in the first stage/s (enumeration areas, clusters)  Households in the last stage (within areas, randomly drawn)

3 Outline Sampling error Stages & stratification Design effect Implementing a sample design Non-response Non-sampling error Sampling for impact evaluation

4 Sampling Error Sampling error is the result of observing a sample of n households (the sample size) rather than all N households in the country The standard error e is a measure of a sample’s precision.  The chances for the true value of an indicator being farther than 2e apart from its sampling estimate are about 95 percent.  The standard error e decreases with the square root of the sample size n.  To reduce the error to one half, the sample size must be quadrupled.

5 Sampling Error, cont. The size of the population N has almost no influence on the size of the sample that is needed to achieve a given precision!!  To obtain national estimates, big countries and small countries require samples of about the same size. Increasing the sample size will generally reduce sampling errors  However, it is also likely to increase non- sampling errors

6 Stages & stratification Why two stages?  An updated list of all households in the country is generally unavailable  A single-stage sample would be too scattered in the territory  2 stage sampling solves these problems, but the sample becomes less precise as a result of clustering Why stratification?  In order to potentially improve precision, by gaining control of the composition of the sample  In order to provide estimates for subgroups that would otherwise be poorly represented (small regions, female-headed households, etc.)  Most stratified samples select households with unequal probabilities. This implies that the survey needs to be analyzed with weights.

7 Design Effect Any 2 households from the same area/location are more homogenous than any 2 from different locations (across areas) The combined result of clustering and stratification is called design effect. Design effect (deff) depends on the cluster size: the number of households in each enumeration area/cluster  LSMS and HIES surveys try not to exceed households per cluster.  Demographic surveys may occasionally do more Tradeoff between # of households per cluster (& lower survey costs) and error (which goes up with clustering)

8 Implementing a sample design An adequate sample frame needs to be available before a sample can be selected  A sample frame is a list of all units in the population 1 st stage: the sample frame is usually the most recent list of census enumeration areas  It needs to be linked to cartography. Usually from Nat Stats Office. 2 nd stage: the sample frame is usually developed specifically for each survey, by way of a household listing operation conducted in all EAs.  The time and budget of household listing are Small enough to be considered a marginal part of the overall data collection effort Large enough to be a headache if they are forgotten or underestimated

9 Implementing a sample design, cont. Parts of the country may need to be excluded because of  Outside the program (geographically, organizationally, …)  Security reasons  Accessibility  Nomads  Etc. That can be OK, as long as long as  Decisions are properly documented  Results are not extrapolated later to the whole country

10 Implementing a sample design, cont. Basing the sample design on some pre-identified indicator of impact.  For example, for a survey whose main focus is poverty, you may estimate the standard error of consumption expenditure for a given sample design “Power calculations” for sample design  Need to narrow down focus to a small set (or even 1) indicator to design a sample  Need to make assumptions about the mean/distribution of that indicator in the country/region of the survey: usually from existing data  Sometimes, no such data exists Avoid launching surveys that have no power calculations to determine sample design Invest in hiring a sampling expert Document!!!

11 Non-response Non-response: when a household selected for inclusion is not then interviewed. Sources of non-response: listing is outdated (e.g. household moved to a new dwelling, mortality), refusal, unable to locate. None of the following is a solution for non-response  Replace non-respondents with “similar” households  Increase the sample size to compensate for it  Use correction formulas  Use imputation techniques (hot-deck, cold-deck, warm-deck, etc.) to simulate the answers of non-respondents The best way to deal with non-response is to prevent it.

12 The required sample size n is determined by The variability of the indicator of interest Var(X) Though this is unknown…proxied by existing data/evidence The maximum margin of error E we are willing to accept. E is acceptable error (a value or percentage points). How confident we want to be in that the error of our estimation will not exceed that maximum For each confidence level α there is a coefficient t α The size of the population…….. not very important… Sample Size Averages Proportions correction for finite pop (smaller than original)

13 Non-sampling error The quality of the data depends on both sampling error (the integrity of the sample design) and non-sampling error Non-sampling error is the quality of the data collected: completeness of the listing exercise, reliability of responses, mis-reporting, recoding errors, mis-measurement in general. Larger samples, larger teams, shorter time frame for field work: can lead to more non- sampling error. Harder to supervisor and monitor field work.

14 Sampling for impact evaluation (1) Sample design usually aims to produce a sample that will measure with a certain degree of confidence the difference between participants and non-participants with respect to some indicator/outcome Indicator of interest to design the sample is usually the expected size of the effect (outcome of interest for the program)  Minimum detectable effect  Picking effect size that are too optimistic (higher effect assumed) will usually result in a sample that is too small  Still want/need pre-existing data for estimate of variance of the outcome of interest

15 Sampling for impact evaluation (2) If treatment is randomized across villages (clustered design), statistical power or precision is less than for individual randomization, often by a lot -- due to the design effect. Generally, for the sample, the number of individuals in the clusters (villages) matters less than the number of clusters.  For a program implemented at the village level, difficult to compensate for too few villages (clusters) in the treatment/control groups with high number of households surveyed in each village.