Calculation of Sampling Errors MICS3 Regional Workshop on Data Archiving and Dissemination Alexandria, Egypt 3-7 March, 2007
Background The sample selected in a survey is one of the many samples that could have been selected (with same design and size). Sampling errors are measures of the variability between all possible samples, which can be estimated from survey results.
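The idea of "all possible samples" can be made concrete with a toy example. The sketch below is only an illustration: it assumes a made-up population of six values and simple random sampling without replacement (not a MICS3 design), enumerates every possible sample of size three, and measures how much the sample mean varies between them.

```python
from itertools import combinations
from statistics import mean, pstdev

# Hypothetical population of six values (illustrative numbers only).
population = [2, 4, 6, 8, 10, 12]

def all_sample_means(values, n):
    """Mean of every possible simple random sample of size n (without replacement)."""
    return [mean(sample) for sample in combinations(values, n)]

means = all_sample_means(population, 3)

# The spread of the sample means across all possible samples is the
# sampling error that a survey can only estimate from its one sample.
true_standard_error = pstdev(means)
print(len(means), round(true_standard_error, 4))
```

In a real survey only one of these samples is observed, which is why the sampling error has to be estimated from the survey data itself.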
Background Calculation of sampling errors is very important:
- Provides information on the reliability of your results
- Tells you the ranges within which your estimates most probably fall
- Provides clues as to the sample sizes (and designs) to be selected in forthcoming surveys
Background MICS3 sample designs are complex designs, usually based on stratified, multi-stage cluster samples. Straightforward formulae cannot be used to calculate sampling errors for such designs; more sophisticated approaches are needed. New versions of SPSS (13 or 14) are used for this purpose. SPSS uses the Taylor linearization method of variance estimation for survey estimates that are means or proportions. This approach is used by most other package programs: WesVar, SUDAAN, SYSTAT, Epi Info, SAS.
Background In MICS3, the objective is to calculate sampling errors for a selection of variables, for the national sample as well as for selected sub-populations, such as urban and rural areas, and regions. Sampling errors will be presented as part of the final report, in Appendix C.
Standard error is the square root of the variance – a measure of the variability between all possible samples
Coefficient of variation (relative error) is the ratio of SE to the estimate
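As a small worked illustration of these two definitions, the sketch below computes the SE from a variance and the coefficient of variation from the SE and the estimate. The function names and the numbers are made up for the example.

```python
import math

def standard_error(variance):
    """SE is the square root of the sampling variance."""
    return math.sqrt(variance)

def coefficient_of_variation(estimate, se):
    """Relative error: the SE expressed as a fraction of the estimate."""
    return se / estimate

# Hypothetical indicator: estimate 0.5 with sampling variance 0.0004.
se = standard_error(0.0004)
cv = coefficient_of_variation(0.5, se)
print(round(se, 4), round(cv, 4))
```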
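The square root of the design effect (DEFT) is the ratio between the SE using the current design and the SE that would result if a simple random sample (SRS) had been used; the design effect itself (DEFF) is the corresponding ratio of variances. A DEFT value of 1.0 indicates that the sample is as efficient as an SRS.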
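The ratio can be sketched for a proportion as follows. This is a simplified illustration, not the SPSS computation: it assumes the simple-random-sample SE of a proportion is sqrt(p(1 - p)/n) with n the unweighted count, and it ignores any finite population correction. The function names and figures are hypothetical.

```python
import math

def srs_standard_error(p, n):
    """SE of a proportion p under simple random sampling with n cases (assumed formula)."""
    return math.sqrt(p * (1 - p) / n)

def deft(se_design, p, n):
    """Ratio of the SE under the actual design to the SE under simple random sampling."""
    return se_design / srs_standard_error(p, n)

# Hypothetical indicator: p = 0.5 from 2,500 cases, design-based SE of 0.015.
ratio = deft(0.015, 0.5, 2500)
print(round(ratio, 3), round(ratio ** 2, 3))  # SE ratio and its square (variance ratio)
```

A ratio above 1.0 means the clustered design is less precise than a simple random sample of the same size would have been.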
Weighted and unweighted counts
Upper and lower confidence limits are calculated as p ± 2 × SE. They indicate the range within which the estimate would fall in 95 percent of all possible samples of identical design and size.
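The confidence limit rule is simple arithmetic, sketched below with a hypothetical function name and made-up values.

```python
def confidence_limits(estimate, se):
    """Lower and upper limits computed as estimate minus and plus two standard errors."""
    return estimate - 2 * se, estimate + 2 * se

# Hypothetical indicator: estimate 0.5 with SE 0.02.
lower, upper = confidence_limits(0.5, 0.02)
print(round(lower, 3), round(upper, 3))
```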
Calculating Sampling Errors
- Customize all syntax: SE01, SE02 and SE03.
- Remember to copy your customized CHRECVAC.sps syntax into the same directory.
- Run SE01 Sampling Error Calculation.sps. This calls SE02 Strata Pairs.sps, which pairs clusters and creates the pseudo-strata necessary for these calculations.
- SE01 also calls SE03, and calculates sampling errors for the variables in SE03.
- The model syntax includes calculations for national, urban and rural areas, and 5 regions.
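The pairing step can be pictured with a short sketch. The exact rule used in SE02 Strata Pairs.sps is not given here, so the following is only one plausible scheme, stated as an assumption: clusters are sorted by number, consecutive clusters are paired, and a leftover odd cluster is folded into the last pair so that no pseudo-stratum has a single cluster. The function name is hypothetical.

```python
def pair_clusters(cluster_ids):
    """Map each cluster id to a pseudo-stratum number by pairing consecutive clusters.

    Assumed scheme for illustration only; not taken from the SE02 syntax.
    """
    ordered = sorted(cluster_ids)
    if len(ordered) < 2:
        raise ValueError("need at least two clusters to form a pair")
    pseudo = {}
    for position, cluster in enumerate(ordered):
        pseudo[cluster] = position // 2 + 1
    # A leftover odd cluster would sit alone; fold it into the previous pair.
    if len(ordered) % 2 == 1:
        pseudo[ordered[-1]] = pseudo[ordered[-2]]
    return pseudo

print(pair_clusters([5, 1, 3, 2, 4]))
```

Every pseudo-stratum then contains at least two clusters, which variance estimation needs in order to measure between-cluster variability.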
How SPSS works The COMPLEX SAMPLES module can be used to select a sample, or to indicate the design of the sample from which the data set comes, so that sampling error estimates can be calculated. Calculations can be done for means and proportions, ratios, frequencies and crosstabs. It is also possible to use general linear models and logistic regression that take the complex design into account.
How SPSS works Prepare an analysis file to indicate the parameters that define the sample design.

CSPLAN ANALYSIS
  /PLAN FILE='micsplan.csplan'
  /PLANVARS ANALYSISWEIGHT=hhweight
  /PRINT PLAN
  /DESIGN STRATA=strat CLUSTER=HH1
  /ESTIMATOR TYPE=WR.

Using the plan file, calculate sampling errors.

CSDESCRIPTIVES
  /PLAN FILE='micsplan.csplan'
  /SUMMARY VARIABLES=treated iodized
  /MEAN
  /STATISTICS SE CV COUNT DEFF DEFFSQRT
  /MISSING SCOPE=ANALYSIS CLASSMISSING=EXCLUDE.
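To give a feel for what a linearization estimate involves for a design with strata, clusters and with-replacement (WR) estimation, here is a minimal Python sketch for a weighted mean or proportion. It is a textbook-style approximation under stated assumptions, not the SPSS implementation: clusters are treated as sampled with replacement within strata, there is no finite population correction, and the record layout and function name are hypothetical.

```python
from collections import defaultdict
import math

def linearized_se(records):
    """records: iterable of (stratum, cluster, weight, y). Returns (estimate, se)."""
    records = list(records)
    total_w = sum(w for _, _, w, _ in records)
    estimate = sum(w * y for _, _, w, y in records) / total_w

    # Linearized cluster totals: weighted deviations from the estimate.
    z = defaultdict(lambda: defaultdict(float))
    for stratum, cluster, w, y in records:
        z[stratum][cluster] += w * (y - estimate) / total_w

    variance = 0.0
    for clusters in z.values():
        n_h = len(clusters)
        if n_h < 2:
            raise ValueError("each stratum needs at least two clusters")
        mean_z = sum(clusters.values()) / n_h
        # Between-cluster variability within the stratum.
        variance += n_h / (n_h - 1) * sum((v - mean_z) ** 2 for v in clusters.values())
    return estimate, math.sqrt(variance)

# Made-up data: two strata, two clusters each, two cases per cluster, weight 1.
data = [
    (1, 1, 1.0, 1), (1, 1, 1.0, 0), (1, 2, 1.0, 1), (1, 2, 1.0, 1),
    (2, 3, 1.0, 0), (2, 3, 1.0, 0), (2, 4, 1.0, 1), (2, 4, 1.0, 0),
]
estimate, se = linearized_se(data)
print(round(estimate, 3), round(se, 4))
```

The requirement of at least two clusters per stratum in this sketch is the reason single-cluster strata have to be combined before sampling errors are calculated.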
Compare these values with those in your tables; they should match exactly. If they do not, there are differences between the syntax used for tabulating the indicator and the syntax used in SE03 Calculate
The output does not include confidence limits, because SPSS cannot calculate these correctly. They are calculated later in the Excel template.
SPSS Output SPSS cannot handle normalized weights; it requires the weight variable to be always above 1. We therefore multiply the sample weight variable by 1,000,000 to enable the calculations. This is corrected later in the Excel template.
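The scaling and its later correction amount to multiplying and then dividing by the same constant. The sketch below uses hypothetical function names and made-up weights to show that the weighted count returns to its normal scale.

```python
SCALE = 1_000_000  # factor applied to the sample weights before the SPSS run

def scale_weight(weight):
    """Inflate a normalized weight so that it is always above 1."""
    return weight * SCALE

def unscale_count(scaled_weighted_count):
    """Bring a weighted count from the output back to its normal scale."""
    return scaled_weighted_count / SCALE

# Made-up normalized weights for three cases.
weights = [0.8, 1.2, 1.0]
scaled_count = sum(scale_weight(w) for w in weights)
print(scaled_count, unscale_count(scaled_count))
```

Estimates such as means and proportions are unaffected by the common factor; only the weighted counts need to be divided back.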
Using the Excel template
- Copy all values from the SPSS output to the Excel template.
- Confidence limits are calculated automatically.
- Copy the values in the last column (Weighted count, in red) onto the Weighted Count column, using Copy > Paste Special > Values. This returns the weighted counts to normal (divides by 1,000,000).
- Delete the last column.
NOTE: When copying values from the SPSS output to the Excel template, do not copy first to Word and then to Excel. This truncates the values and gives incorrect values in the template. Copy directly from SPSS to Excel.
If the value of the indicator is 0.000 or 1.000, then you should complete the row as follows. If the value is based on fewer than 50 unweighted cases, suppress all values for that indicator, with the exception of the unweighted and weighted counts.
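The rule for small bases could be applied mechanically, for example when preparing the table for the report. The sketch below is an illustration only; the row layout, key names and function name are hypothetical and do not come from the Excel template.

```python
def suppress_small_base(row, min_cases=50):
    """Blank every value except the counts when the unweighted base is below min_cases."""
    kept = ("unweighted_count", "weighted_count")
    if row["unweighted_count"] >= min_cases:
        return dict(row)
    return {key: (value if key in kept else None) for key, value in row.items()}

# Made-up row based on only 32 unweighted cases.
row = {"value": 0.4, "se": 0.05, "cv": 0.125, "deft": 1.2,
       "unweighted_count": 32, "weighted_count": 30.5}
print(suppress_small_base(row))
```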