SADC Course in Statistics (Session 20)

Non-sampling errors (Session 20)

Learning Objectives
By the end of this session, you will be able to:
describe the types of non-sampling errors that arise in survey work;
explain actions that may be taken to minimise commonly occurring non-sampling errors;
appreciate that sampling error is only a small component of all the errors that may arise, and that close attention to reducing non-sampling errors is equally or more important in survey work.

Non-sampling errors: 1
Non-sampling errors cover all errors other than those due to sampling a subset of the population. They are usual in both surveys and censuses, because their absence would imply that the data collection process has been:
implemented and enumerated perfectly; and
completely free of measurement errors, i.e. inaccuracies in the recording of information from selected units.

Non-sampling errors: 2
Non-sampling errors are not all due to avoidable mistakes and/or deficiencies. They can often occur because of decisions by the researchers to balance the need for good quality data with the need to obtain timely data at acceptable cost. The problem then reduces to one of defining and minimising errors associated with the data collection and data processing procedures.

Types of non-sampling errors
Non-sampling errors can be of various types:
Coverage (or frame) errors
Non-response errors
Measurement errors
Data handling errors
Note that the first more often applies to sample surveys, while the last three apply to both surveys and censuses.

Coverage (frame) errors
In surveys, the sample is selected from a list of all population members, i.e. a sampling frame. An inadequate frame leads to coverage errors. These can take the form of either under-coverage (missing elements) or over-coverage (duplicates). Both lead to biased results. See below for an example.
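As a rough illustration of why under-coverage biases results, suppose the quantity of interest is a population mean. The symbols below are introduced here for illustration and are not from the slides: N_c units are covered by the frame with mean \bar{Y}_c, N_m units are missing from it with mean \bar{Y}_m, and N = N_c + N_m.

```latex
\bar{Y} = \frac{N_c \bar{Y}_c + N_m \bar{Y}_m}{N},
\qquad
\text{Bias} = \bar{Y}_c - \bar{Y} = \frac{N_m}{N}\left(\bar{Y}_c - \bar{Y}_m\right)
```

Under these assumptions, the bias grows with the proportion of the population missing from the frame and with the difference between covered and missing units. It vanishes only if the frame is complete or the two groups have the same mean.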

Minimising frame errors
For under-coverage, consider re-defining the population, i.e. the target population is simply taken to be the population that can be accessed through the frame. For duplicates, develop a system to identify them, e.g. by using additional information on the recording unit. Both under-coverage and over-coverage are minimised by using up-to-date frames, e.g. in the UK the Postcode Address File is updated every 3 months, and is hence often used by the Office for National Statistics (ONS).
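A system for identifying duplicates could be sketched as follows. This is a minimal illustration only: the record layout, the field names (`id`, `name`, `address`) and the matching rule (exact match after trimming spaces and lower-casing) are all assumptions made here, and real frames usually need more careful matching.

```python
def find_duplicates(frame, key_fields):
    """Group frame records whose normalised key fields are identical."""
    seen = {}
    for record in frame:
        # Normalise each key field so trivial differences do not hide a match.
        key = tuple(str(record[f]).strip().lower() for f in key_fields)
        seen.setdefault(key, []).append(record["id"])
    # Keep only the keys that occur more than once in the frame.
    return {key: ids for key, ids in seen.items() if len(ids) > 1}


# Hypothetical frame entries, invented for illustration.
frame = [
    {"id": 1, "name": "A. Banda", "address": "12 Market Rd"},
    {"id": 2, "name": "a. banda ", "address": "12 market rd"},
    {"id": 3, "name": "C. Phiri", "address": "4 Lake St"},
]
duplicates = find_duplicates(frame, ["name", "address"])
print(duplicates)
```

Here records 1 and 2 are flagged as likely duplicates of the same unit; a person would still need to confirm the match before removing either entry from the frame.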

Non-response errors
Non-response errors are all errors arising from:
Unit non-response, i.e. failure to obtain information from a pre-chosen sampling unit or population unit;
Item non-response, i.e. failure to get a response to a specific question or item in the data recording form.
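The distinction between unit and item non-response can be made concrete by computing a response rate for each. The sketch below assumes a simple, hypothetical data layout: a list of selected unit ids and a mapping from responding unit id to its answers, with `None` marking an unanswered item. The function name and the data are invented for illustration.

```python
def response_rates(selected_units, responses, items):
    """Return the unit response rate and a per-item response rate.

    A unit missing from `responses` counts as unit non-response;
    a value of None for an item counts as item non-response.
    """
    responding = [u for u in selected_units if u in responses]
    unit_rate = len(responding) / len(selected_units)
    item_rates = {}
    for item in items:
        answered = sum(1 for u in responding if responses[u].get(item) is not None)
        # Item rates are computed among responding units only.
        item_rates[item] = answered / len(responding) if responding else 0.0
    return unit_rate, item_rates


# Hypothetical survey: five units selected, unit 3 did not respond at all.
selected = [1, 2, 3, 4, 5]
responses = {
    1: {"age": 34, "income": None},
    2: {"age": 51, "income": 1200},
    4: {"age": None, "income": 800},
    5: {"age": 29, "income": 950},
}
unit_rate, item_rates = response_rates(selected, responses, ["age", "income"])
print(unit_rate, item_rates)
```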

Types of non-response errors
Discussion: What are the typical forms of non-response (both unit and item non-response) you encounter in your work? What are the reasons for non-response? How can such non-response errors be minimised?

Measurement Errors
Measurement errors arise when the recorded response differs from the true value. They can occur for a variety of reasons, e.g.
the respondent (e.g. the head of household) gives an incorrect answer;
the instrument or question is at fault;
the interviewer makes an error.
Further, errors may be greater for some sub-groups of the population, e.g. those less literate, or those unwilling to co-operate.

Reasons for respondent errors
Respondent errors arise for many reasons, e.g.
the respondent gives an incorrect answer, e.g. due to prestige or competence implications, or due to the sensitivity or social undesirability of the question;
the respondent misunderstands the requirements;
the respondent lacks motivation to give an accurate answer;
a “lazy” respondent gives an “average” answer;
the question requires memory/recall;
proxy respondents are used, i.e. answers are taken from someone other than the intended respondent.
How can such errors be minimised?

Instrument Errors
Instrument or question errors arise when:
the question is unclear, ambiguous or difficult to answer;
the list of possible answers suggested in the recording instrument is incomplete;
the requested information assumes a framework unfamiliar to the respondent;
the definitions used by the survey are different from those used by the respondent (e.g. how many part-time employees do you have? See next slide for an example).
How can such errors be minimised?

An example of instrument error
The following example is from Ruddock (1998); see the References slide. In the Short Term Employment Survey (STES) conducted by the Office for National Statistics (ONS) in the UK, data are collected on the numbers of full-time and part-time employees on a given reference date. Some firms ignored the reference date and gave figures for employees paid at the end of the month, thus including both those who joined and those who left in that month, which led to an over-estimate. Firms also found it difficult to give details of part-time employees, as their definition of “part-time” did not agree with that used by ONS.

Interviewer errors
Interviewer errors arise when:
different interviewers administer a survey in different ways;
respondents react differently to different interviewers, e.g. to interviewers of their own sex or own ethnic group;
interviewers are inadequately trained;
inadequate attention is paid to the selection of interviewers;
the workload for the interviewer is too high.
How can such errors be minimised?

Data handling errors
Data handling errors can occur from the stage of data collection up to the final stages of data analysis. Types of errors that can arise include:
errors in transmission of data from the field to the office;
errors in preparing the data in a suitable format for computerisation, e.g. during coding of qualitative answers;
errors in computerisation of the data;
errors during data analysis, e.g. imputation and weighting.
Do any of these types of error occur in your work? If so, what can you do to minimise them?
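One common way to catch errors in computerisation of the data is double data entry: the same forms are keyed in twice, independently, and the two versions are compared. This technique is not described in the slides; the sketch below is offered only as an illustration, and the record layout and field names are hypothetical.

```python
def compare_entries(first_pass, second_pass):
    """List (record_id, field, first_value, second_value) for every mismatch."""
    mismatches = []
    # Only records keyed in both passes can be compared field by field.
    for record_id in sorted(first_pass.keys() & second_pass.keys()):
        a, b = first_pass[record_id], second_pass[record_id]
        for field in sorted(a.keys() | b.keys()):
            if a.get(field) != b.get(field):
                mismatches.append((record_id, field, a.get(field), b.get(field)))
    return mismatches


# Hypothetical entries: the age on form 101 was transposed in the second pass.
first = {101: {"age": 34, "hh_size": 5}, 102: {"age": 27, "hh_size": 3}}
second = {101: {"age": 43, "hh_size": 5}, 102: {"age": 27, "hh_size": 3}}
mismatches = compare_entries(first, second)
print(mismatches)
```

Each mismatch would then be resolved by going back to the original paper form, not by guessing which entry is correct.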

Measuring non-sampling errors
Measuring non-sampling errors is difficult and often impossible. Attempts have often been made through specific additional studies, e.g. characteristics of non-respondents in the 1996 British Crime Survey were investigated by a mini-questionnaire to those living in 25% of non-responding addresses. Several studies to assess non-sampling errors can be found in Ruddock (1998) (see the References slide for the full reference) and in Lessler, J.T. and Kalsbeek, W.D. (1992) Non-sampling error in surveys; Wiley.
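To show how a follow-up study of non-respondents might be used, the sketch below combines the respondents' mean with the mean from a follow-up sub-sample of non-respondents. It assumes the sub-sample is a random sample of the non-respondents and that everyone in it answers; the figures are invented and are not taken from the British Crime Survey.

```python
def adjusted_mean(n_respondents, mean_respondents, n_nonrespondents, mean_followup):
    """Weight the two group means by the number of sampled units in each group."""
    n = n_respondents + n_nonrespondents
    return (n_respondents * mean_respondents + n_nonrespondents * mean_followup) / n


# Hypothetical figures: 700 respondents with a mean of 0.20, and 300
# non-respondents whose follow-up sub-sample has a mean of 0.30.
naive = 0.20
adjusted = adjusted_mean(700, 0.20, 300, 0.30)
# Estimated non-response bias of using the respondents' mean alone.
estimated_bias = naive - adjusted
print(round(adjusted, 2), round(estimated_bias, 2))
```

Under these assumptions, the gap between the respondents' mean and the adjusted mean gives a rough indication of the size of the non-response error.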

Non-sampling errors: Key Points
Non-sampling errors are inevitable in the production of national statistics. It is important that:
at the planning stage, all potential non-sampling errors are listed and steps to minimise them are considered;
if data are collected from other sources, the procedures adopted for data collection, and for data verification at each step of the data chain, are questioned;
the data collected are viewed critically, and queries are resolved as soon as they arise;
sources of non-sampling errors are documented so that the results presented can be interpreted meaningfully.

References
Ruddock, V. (1998) “Measuring and Improving Data Quality”. UK Govt. Statistical Service Methodology Series No. 14. A very comprehensive coverage of non-sampling errors. This document may be downloaded from
Lepkowski, J. (2004) Non-observation error in household surveys in developing countries. Chapter VIII, pp of the UN Publication An Analysis of Operating Characteristics of Household Surveys in Developing and Transition Countries: Survey Costs, Design Effects and Non-Sampling Errors. Available at

Practical work follows…
