Multiple Indicator Cluster Surveys Data Interpretation, Further Analysis and Dissemination Workshop Overview of Data Quality Issues in MICS
Data quality in MICS: It is important to maintain data of the highest possible quality, and to examine data quality carefully before and during the interpretation of survey findings.
Looking at data quality: why? To build confidence in survey results; to identify limitations in the results; to inform dissemination and policy formulation, and to avoid misleading policy makers and third parties. ALL SURVEYS ARE SUBJECT TO ERRORS.
Errors in surveys: There are two types of errors in surveys: sampling errors and non-sampling errors.
Sampling error: The difference between an estimate and the true value that arises because the survey questions a sample of respondents rather than the whole population.
Non-sampling errors: All other types of errors, arising at any stage of the survey process other than the sample design, including management decisions, data processing, fieldwork performance, etc. All survey stages are interconnected and play a role in non-sampling errors.
Control of error in surveys: Sampling errors can be estimated before data collection and measured after data collection. Non-sampling errors are more difficult to control and/or identify.
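As an illustration of how a sampling error can be estimated before data collection, the sketch below uses the textbook standard error of a proportion, inflated by a design effect to allow for cluster sampling. The function names and the `deff` parameter are assumptions made for this example and are not part of the MICS tools; the expected proportion and design effect would normally come from an earlier survey.

```python
import math


def proportion_standard_error(p, n, deff=1.0):
    """Approximate standard error of an estimated proportion.

    p    -- expected proportion (between 0 and 1)
    n    -- number of sampled cases in the denominator
    deff -- assumed design effect (1.0 means simple random sampling)
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n <= 0 or deff <= 0:
        raise ValueError("n and deff must be positive")
    return math.sqrt(deff * p * (1.0 - p) / n)


def confidence_interval(p, n, deff=1.0, z=1.96):
    """Approximate 95 per cent interval (for z=1.96), clipped to [0, 1]."""
    se = proportion_standard_error(p, n, deff)
    return max(0.0, p - z * se), min(1.0, p + z * se)
```

With a design effect of 4, the standard error is twice what simple random sampling would give for the same sample size, which is why clustered surveys need larger samples for the same precision.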
Minimizing non-sampling errors in MICS: MICS has a series of recommendations for quality assurance, covering: roles and responsibilities of fieldwork teams; easy-to-use data processing programs; training length and content; editing and supervision guidelines; survey tools. Failure to comply with the principles behind these recommendations leads to problems in data quality.
MICS data quality survey tools: Survey tools to monitor, improve and assess quality, and to identify non-sampling errors. Field check tables quantitatively identify non-sampling errors during data collection so that quality can be improved; this is possible with simultaneous data entry, when data collection is not too rapid. Data quality tables are produced at the time of the final report.
Data quality tables: A total of 28 tables. The data quality tables look at: departures from expected (demographic, biological, etc.) patterns; departures from recommended procedures; internal consistency; completeness; indicators of performance.
DQ.1 Age Distribution of Household Population: Deficit at ages 0-1? Heaping at age 5? Overall quality: heaping. Deficit: males AND females? More heaping at age 50 for females than males.
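Age heaping of the kind flagged above is often summarised with Whipple's index, a standard demographic measure that is not itself mentioned in these slides. The sketch below assumes the single-year age distribution is available as a dictionary of counts; the function name and input layout are illustrative assumptions.

```python
def whipple_index(age_counts):
    """Whipple's index of heaping on ages ending in 0 or 5.

    age_counts -- dict mapping single year of age to number of persons.
    Uses the conventional age range 23-62. A value of 100 means no
    preference for ages ending in 0 or 5; 500 means every reported age
    in the range ends in 0 or 5.
    """
    total = sum(age_counts.get(age, 0) for age in range(23, 63))
    if total == 0:
        raise ValueError("no persons reported at ages 23-62")
    # Ages 25, 30, ..., 60 are the ones ending in 0 or 5 within 23-62.
    heaped = sum(age_counts.get(age, 0) for age in range(25, 61, 5))
    return 100.0 * heaped / (total / 5.0)
```

Computing the index separately for males and females gives a quick numerical counterpart to the visual comparison of heaping by sex.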
DQ.2 Age Distribution of Eligible and Interviewed Women: Low response rates for women at young ages. Surplus at age 50-54?
DQ.3 Age Distribution of Eligible and Interviewed Men: Low response rates for men at young ages. Surplus at age 50-54? Might also want to look at the number eligible/number in the household list, by age.
DQ.4 Age Distribution of Children: Low response rates for infants? Out-transference?
DQ.5 Birth Date Reporting, Household Population: Is the inclusion of the question on date of birth justified?
DQ.6 to DQ.9 Birth Date and Age Reporting for women, men, under-5s, and children, adolescents and young people (same structure in each table)
DQ.6: Birth date and age reporting: Women. Percent distribution of women age 15-49 years by completeness of date of birth/age information, Country, Year
Columns (completeness of reporting of date of birth and age): Year and month of birth; Year of birth and age; Year of birth only; Age only; Other/DK/Missing; Total; Number of women age 15-49 years
Rows (Total column is 100.0 in each row): Total; Region: Region 1, Region 2, Region 3, Region 4, Region 5; Area: Urban, Rural
More important to have full birth dates for individual respondents, adolescents and young people.
DQ.6 to DQ.9: The target for these columns should be 100 per cent, especially for date of last birth, as it concerns eligibility and is a very recent occurrence.
DQ.11 Completeness of Reporting: In general, the target is to keep incomplete information (missing, DK, etc.) below 5 per cent. This target does not apply to all types of information, especially information that relates to eligibility.
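The 5 per cent rule of thumb can be applied mechanically across a set of questionnaire items. The sketch below is a minimal illustration, assuming each item is summarised as a pair of counts (incomplete cases, total cases); the function name, input layout and item names are hypothetical.

```python
def flag_incomplete(items, threshold=5.0):
    """Return the items whose share of incomplete responses misses the target.

    items     -- dict mapping item name to (n_incomplete, n_total)
    threshold -- percentage at or above which an item is flagged
    """
    flagged = {}
    for name, (n_incomplete, n_total) in items.items():
        if n_total <= 0:
            raise ValueError(f"no cases for item {name!r}")
        percent = 100.0 * n_incomplete / n_total
        if percent >= threshold:
            flagged[name] = percent
    return flagged


# Hypothetical counts: 2 of 200 and 16 of 200 responses incomplete.
example = {"item_a": (2, 200), "item_b": (16, 200)}
print(flag_incomplete(example))  # only item_b reaches the 5 per cent threshold
```

Because the general target is not appropriate for information that determines eligibility, a much lower `threshold` can be passed for those items.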
DQ.11 to DQ.13 Quality of anthropometric measurements: Proportion measured; outliers; incomplete date of birth.
DQ.12: Completeness of information for anthropometric indicators: Underweight. Percent distribution of children under 5 by completeness of information on date of birth and weight, Country, Year
Columns: Valid weight and date of birth; Reason for exclusion from analysis (Weight not measured; Incomplete date of birth; Weight not measured and incomplete date of birth; Flagged cases (outliers)); Total; Percent of children excluded from analysis; Number of children under 5
Rows (Total column is 100.0 in each row): Total; Age: <6 months, 6-11 months, 12-23 months, 24-35 months, 36-47 months, 48-59 months
DQ.12 Quality of underweight data: Should we actually use these data? Excluding children because of non-response or even an incomplete date of birth may not bias the results, but outliers are a big problem.
DQ.13 Quality of stunting data: Should we actually use these data?
DQ.15 Heaping in anthropometric measurements: Some heaping for height/length.
DQ.15 Heaping in anthropometric measurements: Usually, more heaping is observed in length/height measurements than in weight.
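One way to quantify heaping in measurements is to tabulate the digit recorded after the decimal point and compare it with an even spread of 10 per cent per digit. The sketch below assumes measurements are recorded to one decimal place (for example 85.5 cm); the function names are hypothetical, and the index of dissimilarity is a generic summary measure rather than something defined in the slides.

```python
from collections import Counter


def terminal_digit_shares(measurements):
    """Percent distribution of the first decimal digit (0-9)."""
    n = len(measurements)
    if n == 0:
        raise ValueError("no measurements supplied")
    # round() guards against floating-point noise such as 72.3 * 10.
    digits = Counter(round(m * 10) % 10 for m in measurements)
    return {d: 100.0 * digits.get(d, 0) / n for d in range(10)}


def dissimilarity_from_uniform(shares):
    """Index of dissimilarity from a uniform 10 per cent per digit.

    0 means no digit preference; 90 means all values share one digit.
    """
    return 0.5 * sum(abs(shares[d] - 10.0) for d in range(10))
```

Running the two functions separately on weights and on heights/lengths allows the amount of heaping in each to be compared directly.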
DQ.16 to DQ.18 Observations of birth certificates, vaccination cards and women’s health cards: Two “indicators” of data quality: performance of interviewers, and quality of the information the survey collected.
DQ.18 Women’s health cards: In all three tables, look for the proportion of existing documents the interviewers were able to see, as a performance indicator. Also look for the proportion of documents observed out of all under-5s or women; if these documents contain better quality information, that would be an indicator of the overall quality of the data.
DQ.19 Observation of bednets and places for handwashing: Added complication of “moving kettles”.
DQ.20 Person interviewed for the under-5 questionnaire: Universally good data.
DQ.21 Random selection of children: Very significant improvement in the proportion of children correctly selected.
DQ.22 School attendance by single age: Cases should fall on the diagonal; look for outliers!
DQ.23 Sex ratio at birth: Should be around 1.02 to 1.06. Sex ratios among living children should be lower than among deceased children.
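The check on the sex ratio at birth is simple arithmetic: male births divided by female births, compared with the 1.02 to 1.06 range quoted above. The sketch below is illustrative only; the function names and the counts used in the example are hypothetical.

```python
def sex_ratio(male_births, female_births):
    """Number of male births per female birth."""
    if female_births <= 0:
        raise ValueError("female_births must be positive")
    return male_births / female_births


def within_expected_range(ratio, low=1.02, high=1.06):
    """True if the ratio lies in the expected range for births."""
    return low <= ratio <= high


# Hypothetical counts from a birth history.
ratio = sex_ratio(1050, 1000)
print(round(ratio, 2), within_expected_range(ratio))
```

A ratio outside the range is a prompt to investigate, for example for omission of births of one sex, rather than proof of an error, since small samples fluctuate.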
DQ.24 to DQ.26 Tables on the quality of information collected in birth histories
DQ.24 Births by calendar years: Important data quality indicator for birth histories.
Check heaping in age at death in days: multiples of 7, and days 0 and 1. Percent early neonatal should increase by period. Compare with global numbers and earlier surveys.
Check heaping in age at death in months, especially at 12 months. Percent neonatal should increase by period. Compare with global numbers and earlier surveys.
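Heaping at a particular age at death, such as 12 months or 7 days, can be expressed as the count at that value relative to the average count at neighbouring values. The ratio below is a generic illustration and not a MICS-defined indicator; the function name, the `window` parameter and the input layout (a dictionary of deaths by reported age) are assumptions.

```python
def heaping_ratio(counts, target, window=2):
    """Count at `target` divided by the mean count at neighbouring values.

    counts -- dict mapping reported age at death (days or months) to deaths
    target -- the value suspected of attracting heaping, e.g. 12 months
    window -- how many values on each side to use as neighbours
    A ratio near 1 suggests no heaping; larger values suggest heaping.
    """
    neighbours = [
        counts.get(target + offset, 0)
        for offset in range(-window, window + 1)
        if offset != 0
    ]
    mean_neighbours = sum(neighbours) / len(neighbours)
    if mean_neighbours == 0:
        raise ValueError("no deaths reported at neighbouring values")
    return counts.get(target, 0) / mean_neighbours
```

The same function can be called with `target=7` or `target=14` on deaths reported in days to check for heaping at multiples of 7.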
DQ.27 Completeness of information on siblings: Missing information.
DQ.27 and DQ.28: Mean sibship size should increase with age, due to falling fertility. Look for sex ratios within normal ranges.