1 SADC Course in Statistics: Non-sampling errors (Session 20)
2 Learning Objectives
By the end of this session, you will be able to:
- describe the types of non-sampling errors that arise in survey work
- explain actions that may be taken to minimise commonly occurring non-sampling errors
- have a greater appreciation that sampling error is only a small component of all the errors that may arise, and that close attention to reducing non-sampling errors is equally or more important in survey work.
3 Non-sampling errors: 1
Non-sampling errors cover all errors other than those due to sampling a subset of the population.
In both surveys and censuses, it is quite usual to find non-sampling errors, because their absence would imply that the data collection process has been:
- implemented and enumerated perfectly, and
- completely free of measurement errors, i.e. inaccuracies in the recording of information from selected units.
4 Non-sampling errors: 2
Non-sampling errors are not all due to avoidable mistakes and/or deficiencies.
They can often occur because of decisions by the researchers to balance the need for good quality data with the need to obtain timely data at acceptable cost.
The problem then reduces to one of defining and minimising errors associated with the data collection and data processing procedures.
5 Types of non-sampling errors
Non-sampling errors can be of various types:
- Coverage (or frame) errors
- Non-response errors
- Measurement errors
- Data handling errors
Note that the first more often applies to sample surveys, while the last three apply to both surveys and censuses.
6 Coverage (frame) errors
In surveys, the sample is selected from a list, i.e. a sampling frame, of all population members.
An inadequate frame leads to coverage errors. Often one can have either:
- under-coverage (missing elements), or
- over-coverage (duplicates).
Both lead to biased results. See below for an example.
7 Minimising frame errors
For under-coverage, consider re-defining the population, i.e. the target population is simply taken to be the population that can be accessed through the frame.
For duplicates, develop a system to identify them, e.g. by using additional information on the recording unit.
Both under-coverage and over-coverage are minimised by using up-to-date frames, e.g. in the UK the Postcode Address File is updated every 3 months, and is hence often used by the Office for National Statistics (ONS).
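As an illustration of the duplicate-identification step on this slide, here is a minimal sketch in Python. It assumes a frame held as a list of records and uses a business name plus postcode as the "additional information"; the field names, the records and the helper functions are hypothetical and are not taken from the course material. Real frames usually need fuzzier matching than this exact comparison of normalised keys.

```python
from collections import defaultdict


def normalise(text):
    """Lower-case the text and replace punctuation with spaces, then collapse spaces."""
    kept = "".join(ch if ch.isalnum() else " " for ch in text.lower())
    return " ".join(kept.split())


def find_duplicates(frame, key_fields):
    """Group frame records that share the same normalised key fields.

    Returns a list of groups (lists of record ids) that have more than one member.
    """
    groups = defaultdict(list)
    for record in frame:
        key = tuple(normalise(str(record[field])) for field in key_fields)
        groups[key].append(record["id"])
    return [ids for ids in groups.values() if len(ids) > 1]


# Hypothetical frame of businesses; names and postcodes are invented.
frame = [
    {"id": 1, "name": "Acme Traders Ltd.", "postcode": "AB1 2CD"},
    {"id": 2, "name": "ACME  traders ltd", "postcode": "ab1 2cd"},
    {"id": 3, "name": "Baobab Farm", "postcode": "EF3 4GH"},
]

duplicates = find_duplicates(frame, ["name", "postcode"])
print(duplicates)  # [[1, 2]]
```

Records 1 and 2 differ only in case, spacing and punctuation, so they are flagged as a probable duplicate for a person to review before any unit is dropped from the frame.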
8 Non-response errors
Non-response errors are all errors arising from:
- Unit non-response, i.e. failure to obtain information from a pre-chosen sampling unit or population unit
- Item non-response, i.e. failure to get a response to a specific question or item in the data recording form.
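The two kinds of non-response can be quantified separately. The sketch below, in Python, assumes that a missing answer is stored as `None` and that item non-response is measured among responding units only; the function names, field names and figures are hypothetical and chosen purely for illustration.

```python
def unit_response_rate(n_selected, n_responding):
    """Proportion of selected units from which any information was obtained."""
    return n_responding / n_selected


def item_nonresponse_rates(records, items):
    """For each item, the share of responding units whose answer is missing (None)."""
    n = len(records)
    return {
        item: sum(1 for r in records if r.get(item) is None) / n
        for item in items
    }


# Hypothetical survey: 5 households selected, 4 returned a questionnaire.
n_selected = 5
responses = [
    {"household_size": 4, "income": None},
    {"household_size": 2, "income": 1200},
    {"household_size": None, "income": None},
    {"household_size": 6, "income": 800},
]

unit_nonresponse = 1 - unit_response_rate(n_selected, len(responses))
item_rates = item_nonresponse_rates(responses, ["household_size", "income"])

print(round(unit_nonresponse, 2))  # 0.2
print(item_rates)  # {'household_size': 0.25, 'income': 0.5}
```

Reporting the rates per item, rather than a single overall figure, shows which questions (here the hypothetical income question) are the ones respondents tend to skip.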
9 Types of non-response errors
Discussion:
- What are the typical forms of non-response (both unit and item non-response) you encounter in your work?
- What are the reasons for non-response?
- How can such non-response errors be minimised?
10 Measurement Errors
Measurement errors arise when the recorded response differs from the true value.
They can occur for a variety of reasons, e.g.
- the respondent (e.g. the head of household) gives an incorrect answer
- there is an instrument or question error
- there is an interviewer error.
Further, errors may be greater for some sub-groups of the population, e.g. those who are less literate, or those unwilling to co-operate.
11 Reasons for respondent errors
Respondent errors arise for many reasons, e.g.
- the respondent gives an incorrect answer, e.g. due to prestige or competence implications, or due to the sensitivity or social undesirability of the question
- the respondent misunderstands the requirements
- the respondent lacks motivation to give an accurate answer
- a “lazy” respondent gives an “average” answer
- the question requires memory/recall
- proxy respondents are used, i.e. answers are taken from someone other than the intended respondent.
How can such errors be minimised?
12 Instrument Errors
Instrument or question errors arise when:
- the question is unclear, ambiguous or difficult to answer
- the list of possible answers suggested in the recording instrument is incomplete
- the requested information assumes a framework unfamiliar to the respondent
- the definitions used by the survey are different from those used by the respondent (e.g. how many part-time employees do you have? See next slide for an example).
How can such errors be minimised?
13 An example of instrument error
The following example is from Ruddock (1998); see slide 18.
In the Short Term Employment Survey (STES) conducted by the Office for National Statistics in the UK, data are collected on the numbers of full-time and part-time employees on a given reference date.
Some firms ignored the reference date and gave figures for employees paid at the end of the month, thus including those who joined and those who left in that month, which led to an over-estimate.
Firms found it difficult to give details of part-time employees, as their definition of “part-time” did not agree with that used by the ONS.
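The over-estimate caused by ignoring the reference date can be seen in a small Python sketch. The employee records, dates and function names below are invented for illustration and are not data from the STES; the sketch simply counts the same firm's staff under the two definitions.

```python
from datetime import date

# Hypothetical payroll for one firm; "end" is None for current staff.
employees = [
    {"name": "A", "start": date(2023, 1, 1), "end": None},
    {"name": "B", "start": date(2024, 3, 20), "end": None},             # joined after the reference date
    {"name": "C", "start": date(2022, 6, 1), "end": date(2024, 3, 8)},  # left before the reference date
]

reference_date = date(2024, 3, 15)
month_start, month_end = date(2024, 3, 1), date(2024, 3, 31)


def employed_on(employee, day):
    """True if the employee was on the books on the given day."""
    return employee["start"] <= day and (employee["end"] is None or employee["end"] >= day)


def employed_during(employee, first, last):
    """True if the employee was on the books at any time between first and last."""
    return employee["start"] <= last and (employee["end"] is None or employee["end"] >= first)


on_reference_date = sum(employed_on(e, reference_date) for e in employees)
during_month = sum(employed_during(e, month_start, month_end) for e in employees)

print(on_reference_date)  # 1
print(during_month)       # 3
```

Counting everyone on the books at some point in the month picks up both the joiner and the leaver, so the monthly figure exceeds the figure for the reference date.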
14 Interviewer errors
Interviewer errors arise when:
- different interviewers administer a survey in different ways
- differences occur in the reactions of respondents to different interviewers, e.g. to interviewers of their own sex or own ethnic group
- interviewers are inadequately trained
- inadequate attention is paid to the selection of interviewers
- the workload for the interviewer is too high.
How can such errors be minimised?
15 Data handling errors
Data handling errors can occur from the stage of data collection up to the final stages of data analysis. Types of errors that can arise include:
- errors in transmission of data from the field to the office
- errors in preparing the data in a suitable format for computerisation, e.g. during coding of qualitative answers
- errors in computerisation of the data
- errors during data analysis, e.g. imputation and weighting.
Do any of these types of error occur in your work? If so, what can you do to minimise them?
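One common way to catch errors introduced during coding and computerisation is to run range and consistency checks on each record as it is entered. The Python sketch below is only an illustration: the field names, the limits (such as an age between 0 and 120) and the records are assumptions, not rules from the course.

```python
def check_record(record):
    """Return a list of human-readable problems found in one record."""
    problems = []

    # Range check on a single field.
    age = record.get("age")
    if age is None:
        problems.append("age missing")
    elif not 0 <= age <= 120:
        problems.append(f"age out of range: {age}")

    # Consistency check between two fields.
    size = record.get("household_size")
    children = record.get("n_children")
    if size is not None and children is not None and children > size:
        problems.append("more children than household members")

    return problems


# Hypothetical keyed-in records.
records = [
    {"id": "H001", "age": 34, "household_size": 5, "n_children": 3},
    {"id": "H002", "age": 340, "household_size": 2, "n_children": 0},  # keying slip
    {"id": "H003", "age": 51, "household_size": 2, "n_children": 4},   # inconsistent
]

# Keep only the records that raise at least one query.
queries = {}
for record in records:
    problems = check_record(record)
    if problems:
        queries[record["id"]] = problems

print(queries)
# {'H002': ['age out of range: 340'], 'H003': ['more children than household members']}
```

Flagged records are best referred back to the questionnaire or the field team rather than corrected automatically, since the check shows that something is wrong but not what the true value is.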
16 Measuring non-sampling errors
Measuring non-sampling errors is difficult and often impossible. Attempts have often been made through specific additional studies, e.g. the characteristics of non-respondents in the 1996 British Crime Survey were investigated by a mini-questionnaire to those living in 25% of non-responding addresses.
Several studies to assess non-sampling errors can be found in Ruddock (1998) (see slide 18 for the full reference) and in Lessler, J.T. and Kalsbeek, W.D. (1992) Non-sampling error in surveys; Wiley.
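A follow-up study of non-respondents can be used to gauge non-response bias. For a sample mean, the respondent mean differs from the full-sample mean by the non-respondent share multiplied by the difference between the respondent and non-respondent means. The Python sketch below applies that identity; all figures are hypothetical (they are not results from the British Crime Survey), and it assumes the follow-up subsample represents all non-respondents, which may not hold in practice.

```python
def nonresponse_bias(mean_resp, mean_nonresp, n_resp, n_nonresp):
    """Bias of the respondent mean relative to the full-sample mean.

    Uses: respondent mean - full mean = (non-respondent share) * (mean_resp - mean_nonresp).
    """
    nonresp_share = n_nonresp / (n_resp + n_nonresp)
    return nonresp_share * (mean_resp - mean_nonresp)


# Hypothetical figures: 800 respondents, 200 non-respondents.
n_resp, n_nonresp = 800, 200
mean_resp = 0.30      # proportion with the characteristic among respondents
mean_followup = 0.40  # proportion estimated from a follow-up of some non-respondents

bias = nonresponse_bias(mean_resp, mean_followup, n_resp, n_nonresp)
adjusted = mean_resp - bias

print(round(bias, 3))      # -0.02
print(round(adjusted, 3))  # 0.32
```

The adjusted value equals the weighted average (800 × 0.30 + 200 × 0.40) / 1000 = 0.32, so in this invented case relying on respondents alone would understate the proportion by two percentage points.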
17 Non-sampling errors: Key Points
Non-sampling errors are inevitable in the production of national statistics. It is important that:
- at the planning stage, all potential non-sampling errors are listed and steps to minimise them are considered
- if data are collected from other sources, the procedures adopted for data collection, and for data verification at each step of the data chain, are questioned
- the data collected are viewed critically and queries are resolved as soon as they arise
- sources of non-sampling errors are documented so that the results presented can be interpreted meaningfully.
18 References
Ruddock, V. (1998) “Measuring and Improving Data Quality”, UK Govt. Statistical Service Methodology Series No. 14, for a very comprehensive coverage of non-sampling errors. This document may be downloaded from
Lepkowski, J. (2004) Non-observation error in household surveys in developing countries. Chapter VIII, pp of the UN Publication An Analysis of Operating Characteristics of Household Surveys in Developing and Transition Countries: Survey Costs, Design Effects and Non-Sampling Errors. Available at