Multiple Indicator Cluster Surveys Data Processing Workshop

Multiple Indicator Cluster Surveys Data Processing Workshop: Sample Weights

What are sample weights?
Sample weight: a statistical correction factor used to correct for imperfections in the sample that might lead to bias:
    Unequal probabilities of selection
    Non-response
Constant sampling weight: self-weighting sample

Self-weighting sample
Constant sampling weight: self-weighting sample
    Stratum level (e.g., urban and rural within region)
    National level: overall self-weighting sample (almost non-existent in household surveys)

Self-weighting sample
Advantages:
    Equally representative for every unit
    Reduced sampling errors
Disadvantages:
    Difficult for survey management (e.g., distributing the workload) because the sample take varies by PSU
    Difficult to control the expected sample size

Self-weighting sample
Disadvantages:
    Self-weighting is not exact because the sample takes are rounded, and this introduces bias into the survey estimates
In most MICS surveys, if not all, samples are not self-weighting. Sample weights must therefore be used when reporting national estimates.

Example - Sample Weights
For example, the weights for the North and West regions of Popstan:
    North region: 10,000/500 = 20
    West region: 10,000/250 = 40
In the North region, each selected household represents 20 households in that region; in the West, the figure is 40.
Overall, every selected household in Popstan represents 26.6667 households (20,000/750).
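
The arithmetic on this slide can be sketched in a few lines of Python. This is only an illustration: it reads 10,000 as the number of households in each region and 500 and 250 as the households selected, which is how the slide's figures appear to be built. The names `population`, `sample`, and `design_weights` are made up for this sketch and do not come from the MICS templates.

```python
# Popstan example figures, read off the slide (interpretation is an assumption).
population = {"North": 10_000, "West": 10_000}  # households in each region
sample = {"North": 500, "West": 250}            # households selected in each region


def design_weights(population, sample):
    """Households represented by each selected household, per region."""
    return {region: population[region] / sample[region] for region in population}


weights = design_weights(population, sample)

# Average number of households represented by one selected household overall.
overall = sum(population.values()) / sum(sample.values())

print(weights)
print(round(overall, 4))
```

The per-region weights come out as 20 and 40, and the overall figure matches the 20,000/750 quoted on the slide.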

Example - Sample Weights
In other words, relative to a proportional selection (which would take 375 households from each region), more households have been selected from North and fewer from West.
This has to be “compensated” for by using sample weights during analysis to re-calibrate the sample to the national level.

Example - Sample Weights
In our example, let’s assume that:
    25 percent of households in North use improved water sources
    75 percent of households in West use improved water sources
If the sample had been selected proportionally (375 households from each region), our survey estimate would be
    ((375 * 0.25) + (375 * 0.75)) / 750 = 0.50

Example - Sample Weights
If we do not weight, our national estimate will be
    ((500 * 0.25) + (250 * 0.75)) / 750 = 0.417
This is because we have over-sampled a region (North) where use of improved water sources is lower.
We need to calculate sample weights to “correct” this situation.

Example - Sample Weights
If we assigned a weight of 20 to each household in North and 40 to each household in West, this would do the trick:
    ((500 * 20 * 0.25) + (250 * 40 * 0.75)) / ((500 * 20) + (250 * 40)) = 0.50
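
To make the comparison between the unweighted and weighted estimates concrete, here is a minimal Python sketch using the Popstan figures from the slides. The dictionary layout and the function name `estimate` are assumptions made for illustration; in practice the weighting is applied in SPSS.

```python
# Sample size, sample weight, and share using improved water, per region.
regions = {
    "North": {"n": 500, "weight": 20, "rate": 0.25},
    "West": {"n": 250, "weight": 40, "rate": 0.75},
}


def estimate(regions, use_weights):
    """National share using improved water, with or without sample weights."""
    numerator = 0.0
    denominator = 0.0
    for info in regions.values():
        w = info["weight"] if use_weights else 1
        numerator += info["n"] * w * info["rate"]
        denominator += info["n"] * w
    return numerator / denominator


unweighted = estimate(regions, use_weights=False)
weighted = estimate(regions, use_weights=True)

print(round(unweighted, 3))
print(weighted)
```

The unweighted figure reproduces the biased 0.417, while the weighted one recovers the 0.50 that a proportional sample would have given.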

Example - Sample Weights
This is fine, but SPSS tables would show 20,000 households as the denominator.
We do not want this, so we normalize the weights.
We calibrate (normalize) them so that the average of the weights in the data set is equal to 1.

Example - Sample Weights
The normalized weight for the North region is calculated as (10000/500)/(20000/750) = 0.75
And for the West region, (10000/250)/(20000/750) = 1.5
When we calculate the national use of improved water sources using normalized weights:
    ((500 * 0.75 * 0.25) + (250 * 1.5 * 0.75)) / ((500 * 0.75) + (250 * 1.5)) = 375 / 750 = 0.50
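
The normalization step can be sketched as follows. This assumes, as the slides state, that normalizing means dividing each weight by the average weight across all sampled households; the function and variable names are illustrative only.

```python
sample = {"North": 500, "West": 250}            # households selected
design_weight = {"North": 20.0, "West": 40.0}   # unnormalized sample weights
rate = {"North": 0.25, "West": 0.75}            # share using improved water


def normalize(design_weight, sample):
    """Scale weights so their average over sampled households equals 1."""
    total_n = sum(sample.values())
    mean_weight = sum(design_weight[r] * sample[r] for r in sample) / total_n
    return {r: design_weight[r] / mean_weight for r in sample}


normalized = normalize(design_weight, sample)

# With normalized weights the weighted denominator equals the sample size.
weighted_n = sum(normalized[r] * sample[r] for r in sample)
national = sum(normalized[r] * sample[r] * rate[r] for r in sample) / weighted_n

print(normalized)
print(weighted_n, national)
```

The estimate is unchanged by normalization, but the weighted denominator is now 750, the actual number of sampled households, rather than 20,000.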

Sample weights
Based on the design of the sample, there are two (common) approaches to calculating weights:
    Each cluster has a unique sample weight (weights.xls)
    Each stratum has a unique sample weight (weights_alt.xls)
We have templates for both. You will need to work with your sampling expert to decide which one to use.

Sample Weights Objects
    weights.xls: spreadsheet that calculates weights
    weights_table.sps: SPSS program that provides input data for the spreadsheet
    weights.sps: SPSS program that defines the structure of the spreadsheet’s output
    weights_merge.sps: SPSS program that merges weights onto the MICS data files

Calculating sample weights
The spreadsheet weights.xls is used to calculate the sample weights.
It has two worksheets, calculations and output.
    The calculations worksheet performs the calculations.
    The output worksheet contains only the sample weights and a list of cluster numbers, a format useful for reading the data into SPSS.

Weights calculation template

Calculating and adding sample weights
weights_table.sps produces the data needed for calculating the sample weights.
weights_merge.sps adds the appropriate sample weights to the analysis files.

Steps in calculating sample weights The process of calculating sample weights and adding them to your analysis files can be broken down into six steps

Steps in calculating sample weights
Step 1: Adjust the number of rows in the calculations and output worksheets so that there is one row per cluster in your survey.
After adding or deleting rows, check that doing so did not affect the totals row in the calculations worksheet.

Steps in calculating sample weights
Step 2: Enter the required information in columns B to F and in columns H and I.

Steps in calculating sample weights
Step 3: Update the definition of strata (or domains) on lines 3 through 10 of the program weights_table.sps.
The standard programs assume that strata are formed by all combinations of area (that is, urban and rural) and region, and that there are four regions. Modify the program to reflect the strata or domains used in your sample.
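
The default assumption about strata (every combination of area and region, with four regions) can be pictured with a short Python sketch. It is not a rendering of weights_table.sps, whose contents are not shown in the slides; the region labels and the numbering of strata are hypothetical.

```python
from itertools import product

areas = ["urban", "rural"]
# Hypothetical region labels; replace with the regions in your own sample design.
regions = ["Region 1", "Region 2", "Region 3", "Region 4"]

# One stratum per (region, area) combination, numbered from 1 (numbering is an assumption).
strata = {combo: number for number, combo in enumerate(product(regions, areas), start=1)}

for (region, area), number in strata.items():
    print(number, region, area)
```

With four regions and two areas this yields eight strata. A design with a different number of regions or with other domains would change the `regions` and `areas` lists accordingly.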

Steps in calculating sample weights
Step 4: Execute the program weights_table.sps.

Steps in calculating sample weights
Step 5: Copy the information in the table and paste it into the calculations worksheet of weights.xls.
When you complete this step, weights.xls will automatically calculate the sample weights.

Steps in calculating sample weights
Step 6: Execute the program weights_merge.sps.
Once you have completed the sixth step, check the output list for error messages, then open the analysis files and confirm that the weights have been properly merged.
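
The kind of check recommended after merging can be illustrated in Python. This is a sketch of the general idea only (attach a weight to each record by cluster number, then list records that received none) and does not reproduce weights_merge.sps; the field names `cluster`, `hh`, and `weight` and all values are hypothetical.

```python
# Hypothetical output worksheet: one normalized weight per cluster number.
cluster_weights = {1: 0.75, 2: 0.75, 3: 1.5}

# Hypothetical household records from an analysis file.
households = [
    {"cluster": 1, "hh": 1},
    {"cluster": 2, "hh": 4},
    {"cluster": 3, "hh": 2},
    {"cluster": 9, "hh": 7},  # cluster missing from the weights table
]


def merge_weights(records, cluster_weights):
    """Attach a weight to each record and collect records with no match."""
    merged, unmatched = [], []
    for record in records:
        weight = cluster_weights.get(record["cluster"])
        if weight is None:
            unmatched.append(record)
        merged.append({**record, "weight": weight})
    return merged, unmatched


merged, unmatched = merge_weights(households, cluster_weights)
print(len(merged), "records merged;", len(unmatched), "without a weight")
```

Any record left without a weight points to a cluster number that is missing from, or mistyped in, the weights table, and should be resolved before tabulation.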