The Evolution of Survey Process Quality


1 The Evolution of Survey Process Quality

2 Concepts
Survey, design, quality; quality dimensions; product quality; process quality; organizational quality

3 Quality assurance; quality control; error sources; mean squared error


5 The Concept of a Survey
A survey concerns a set of objects comprising a population. The population under study has one or more measurable properties. The goal is to describe the population by one or more parameters defined in terms of the measurable properties.

6 The Concept of a Survey (cont'd)
Access to the population requires a frame. The sample is selected in accordance with a sampling design specifying a probability mechanism and a sample size.

7 The Concept of a Survey (cont'd)
Observations are made in accordance with a measurement process. Based on the measurements, an estimation process is applied to compute estimates. The purpose is to make inference to the population (facts, decision-making).

8 Typical Shortcomings
The target population is changed during the study; selection probabilities are not known for all selected units; correct estimation formulas are not used.

9 Types of Surveys
One-time: attitudes, opinions
Repeated or continuing: official statistics (short-term indicators, agriculture, living conditions, crime); other (drug use, consumer research, behaviors)
International and comparative: official statistics (European Statistical System, poverty, water supply); student achievement, literacy, values, happiness, marketing, attitudes

10 Types of organizations
Official statistics: centralized (NSIs) or decentralized (different agencies)
General survey work: private, academic
IMF, OECD, UN

11 Stakeholders Customers and users Researchers Survey organizations
Owners Interest organizations The general public

12 A Brief History
Biblical censuses
Political arithmetic, Graunt and Eden
The 1895 ISI proposal regarding representative investigations
Bowley argues for random sampling in 1913, in an attempt to connect statistical theory and sample design
ISI agrees to promote extended investigation of representative methods in the mid-20s

13 Tschuprow, stratified random sampling, early 20s
The 1934 Neyman paper on the representative method and optimum allocation Neyman develops theories for sampling (cluster sampling, ratio estimation, two-phase sampling) and confidence intervals Fisher’s random experiments Nonsampling error theory in the 1940s

14 Interpenetration 1946, Mahalanobis
The US Census Bureau survey model Data quality, Kish, Zarkovich Total survey design, Dalenius 1968

15 Developments in other disciplines (errors and their causes)
Questions and interviewers (1917-) The response process (1968-) Sudman, Bradburn, Cannell, Tourangeau Interviewer-respondent interaction Statistical process control (SPC) Shewhart’s control chart, 1924 Administrative applications of SPC in survey work, Minton 1968

16 Quality Milestones Early quality management (building ships, maintaining roads, leading empires) Industrial revolution (Taylor, Benz, Ford 1910-)



19 The Quality Revolution Starts Here
Shewhart’s control chart for process control Dodge and Romig’s acceptance sampling A theory for statistical process control These are methods and tools to handle process variation

20 Deming's 14 points; Juran's spiral of progress; Ishikawa's 7 quality control tools; the Joiner Triangle (quality, scientific approach, teamwork); Taguchi's experimental design
Bottom line: recognition of the client/customer/user, increased competition, a need for continuous improvement

21 Just a Few More Milestones
Business excellence models (ISO, EFQM, Malcolm Baldrige), with a clear user perspective
TQM, Six Sigma, Kaizen, Lean, PDCA, BPR and more
Quality assurance and quality control
Standards and quality guidelines

22 Quality According to ISO 9001
The totality of characteristics of an entity that bear on its ability to satisfy stated and implied needs

23 Definitions of Quality
General: fitness for use; design; conformance
In the survey context: accurate, timely, accessible plus other dimensions; advanced visual display vs tables; tolerable error

24 Quality Assurance and Quality Control
QA is defined as a set of activities whose purpose is to demonstrate that an entity meets all quality requirements QC is defined as a set of activities whose purpose is to ensure that all quality requirements are met

25 Quality Product (QP) A QP is one that meets the needs and expectations of customers/clients/users

26 Eurostat’s Quality Dimensions
Relevance of statistical concepts Accuracy of estimates Timeliness and punctuality in disseminating results Accessibility and clarity of the information Comparability Coherence (Completeness)

27 The Process View Product characteristics are established together with the user The quality of the product is decided by the processes generating the product The processes are controlled via key process variables

28 Assuring and Controlling Quality
Product level: main stakeholders are users and clients; control instruments: product specs, SLA, evaluation studies, frameworks, standards; measures and indicators: frameworks, compliance, MSE, user surveys
Process level: main stakeholder is the survey designer; control instruments: SPC, charts, acceptance sampling, risk analysis, CBM, SOP, paradata, checklists, verification; measures and indicators: variation via control charts, other paradata analysis, outcomes of evaluation studies
Organization level: main stakeholders are the agency, owner, and society; control instruments: excellence models, ISO, CoP, reviews, audits, self-assessments; measures and indicators: scores, strong and weak points, user surveys, staff surveys

29 Measuring and Documenting Quality
Accuracy can be measured Other quality dimensions are qualitative and can be seen as constraints Quality profiles Quality reports Performance measures Codes of practice

30 Examples of Tools - 1
Self-assessment via excellence model or other frameworks
Checklists
Quality management (TQM, Six Sigma)
External and internal auditing
Customer satisfaction surveys

31 Examples of Tools - 2
Staff surveys
Quality control (verification, paradata)
Documentation
MSE component measures

32 Improving Quality Benchmarking Changing processes
Small steps or business process reengineering Project teams Standardization via current best methods documents or standard operating procedures and checklists Development of quality guidelines Training

33 Quality management philosophies
"This is how I run my company"
Theory a la Drucker
Improvement methodologies (TQM, Six Sigma, Lean)
Business Excellence Models (EFQM, Malcolm Baldrige)

34 EFQM Model 2010

35 Contents of ISO 20252 (sections)
Scope
Terms and definitions (Swedish translation; some terms inconsistent with terms used at Statistics Sweden)
Quality management system requirements (documentation, staff competence and training)
Managing the executive elements of research (research proposals, project schedules, questionnaire design)
Data collection (fieldworker training, validation levels and methods, also qualitative data collection)
Data management and processing (coding, data editing, data storage and data security, e.g., original data shall be kept)
Reporting on research projects

36 What Is Six Sigma? Results-oriented management
Infrastructure and competence Problem-solving methodology

37 Six Sigma focuses on… variations, customers, processes, chronic problems, results

38 Why the name Six Sigma? With a sigma level of 6σ a process has no more than 3.4 defects per million opportunities (dpmo)
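A minimal sketch of where the 3.4 dpmo figure comes from, assuming the conventional Six Sigma allowance of a 1.5σ long-term shift in the process mean (so defects correspond to the standard normal tail beyond 4.5σ); the convention, not the slide, supplies the 1.5σ:

```python
# Tail probability beyond 6 sigma minus the conventional 1.5-sigma shift.
from math import erfc, sqrt

def upper_tail(z):
    """P(Z > z) for a standard normal Z."""
    return 0.5 * erfc(z / sqrt(2))

dpmo = upper_tail(6.0 - 1.5) * 1_000_000
print(f"{dpmo:.1f} defects per million opportunities")  # about 3.4
```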

39 Control chart (example)
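The chart itself did not survive the transcript. As a stand-in, here is a minimal Python sketch of a Shewhart-style p-chart: 3-sigma control limits for a daily defect proportion, with points outside the limits flagged as possible special cause variation. All counts are hypothetical.

```python
# Hypothetical daily defect counts from a fixed inspection sample.
defects = [12, 9, 15, 11, 30, 10, 13]  # defective units per day
n = 500                                # units inspected per day

p_bar = sum(defects) / (len(defects) * n)    # centre line
sigma = (p_bar * (1 - p_bar) / n) ** 0.5     # std. error of a daily proportion
ucl = p_bar + 3 * sigma                      # upper control limit
lcl = max(0.0, p_bar - 3 * sigma)            # lower control limit

for day, d in enumerate(defects, start=1):
    p = d / n
    flag = "  <- outside limits (special cause?)" if not lcl <= p <= ucl else ""
    print(f"day {day}: p = {p:.3f} (limits {lcl:.3f} to {ucl:.3f}){flag}")
```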

40 Understanding Variation (I)
Common cause variation Common causes are the process inputs and conditions that contribute to the regular, everyday variation in a process Every process has common cause variation Example: Percentage of correctly scanned data, affected by people’s handwriting, operation of the scanner…

41 Understanding Variation (II)
Special cause variation Special causes are factors that are not always present in a process but appear because of particular circumstances The effect can be large Special cause variation is not present all the time Example: Using paper with a color unsuitable for scanning

42 Action Eliminate special cause variation
Decrease common cause variation if necessary Do not treat common cause as special cause

43 Overview of the Party Preference Survey (Partisympatiundersökningen) (006)

44 Roots of paradata Traditional global ones such as error rates (since 1940) The Bristol monograph The 1998 ASA session in Dallas The Eurostat LEG on Quality Handbook on process quality Rapid development last 10 years

45 Meta and para Prefixes derived from Greek
Meta (discussions about discussions, data about data) Para (beside, near, beyond, parallel)

46 Mick Couper's trilogy: data; metadata (data about data); paradata (data about processes)

47 Standards There are many standards for surveys; examples include
ISO 20252 OMB standards NCES statistical standards Quality guidelines developed by specific organizations (Stat Can; RTI; etc.) ESS Standards for survey reports

48 A Standard Is… A document that
describes methods and procedures for collecting, processing, storing, and presenting survey data, and
defines the (minimal) level of quality and effort that is acceptable for all survey processes

49 What purposes do survey standards serve?
Define a minimally acceptable level of quality that organizations should attain
Provide consistency across surveys in different organizations
Facilitate communication of complex concepts, formulas, procedures and methodologies
Provide transparency of the methodologies used to produce a survey data set
Transfer skills and knowledge of best survey practice

50 We Concentrate on Accuracy
Data must be of sufficient quality for decision-making Other dimensions are constraints Accuracy is much more difficult to understand It is important to convey information on error sources and their contributions to total survey error Accuracy is measured by the mean squared error, MSE

51 Two Routes to Handling Survey Errors
Get an estimate of MSE so that we get confidence or other intervals that we can trust Try to develop and use methods that are almost error-free so that the estimated variance becomes an approximation of the MSE

52 What is mean squared error?
MSE = Bias² + Variance = (B_spec + B_NR + B_frame + B_meas + B_DP)² + Var_samp + Var_meas + Var_DP
where the bias terms stem from specification, nonresponse, frame, measurement, and data processing errors, and the variance terms from sampling, measurement, and data processing.
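A small sketch of the decomposition in code; the structure (squared sum of biases plus sum of variances) is the slide's, while the component values are hypothetical placeholders:

```python
# MSE = (sum of bias components)^2 + sum of variance components.
biases = {"specification": 0.0, "nonresponse": 0.8, "frame": 0.2,
          "measurement": 0.5, "data_processing": 0.1}   # hypothetical
variances = {"sampling": 2.0, "measurement": 0.6, "data_processing": 0.1}

mse = sum(biases.values()) ** 2 + sum(variances.values())
print(f"MSE = {mse:.2f}")  # (1.6)^2 + 2.7 = 5.26
```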

53 The Survey Process Revisited


55 3M Survey Life Cycle Paradigm
Copyright CCSG - ccsg.isr.umich.edu

56 Examples of issues Research questions and survey questions
General survey design Target population Main mode or mix of modes Developing the instrument Sampling design Data collection Data processing Estimation Providing survey results Quality assurance Quality control Evaluation

57 Objective of Survey Design
Maximize survey quality for a given budget, or minimize the cost of achieving a specified level of quality

58 Survey Error
Sampling error: due to selecting a sample instead of the entire population
Nonsampling error: due to mistakes or system deficiencies

59 Nonsampling Error Specification Error Frame Error Nonresponse Error
Processing Error Measurement Error

60 Risk of Bias and Variance by Error Source

61 How do we estimate bias? Obtain measurements that are essentially error free (“gold standard measurements”) Implement preferred survey methods on a limited basis Record checks Comparisons to external gold standard estimates Census, CPS, other high quality national surveys Modelling attempts

62 Effects of Nonsampling Errors on Estimates - 1
Variable errors increase the variances of means, totals, and proportions
Confidence levels for interval estimates may be overstated
Systematic errors bias the estimates of means, totals, and proportions

63 Effects of Nonsampling Errors on Estimates - 2
Both variable and systematic errors bias estimates of correlation and regression coefficients
The nominal level of Type I error can be either too high or too low in the presence of nonsampling errors

64 Total Survey Error
Sampling is usually more efficient than a census
Sampling error is predictable; nonsampling error is not
Find the balance; use risk management

65 Conclusions
Survey design involves allocation of resources using incomplete and imperfect information. The objective should be to minimize total error subject to cost constraints.

66 Specific Error Sources

67 Specification Error
Concepts, objectives
Subject-matter problem translated into a statistical problem
Mismatch between research question and survey question
Are all research questions covered?

68 Frame Errors
Coverage errors: missing units, duplications, extraneous units
Classification errors: industry (e.g., Standard Industry Classification (SIC)), geography, size

69 Frame Errors (cont'd)
Contact errors: address incomplete or incorrect, contact name, phone number
Other errors: unit structure error, frame not current => errors

70 [Figure: the target population split into units on the frame and not on the frame; the parameter of interest is the mean for the entire target population]

71 Relative Bias Due to Coverage Error
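The formula on this slide was an image and did not survive. By analogy with the nonresponse formula on slide 75 and the tc curves on slide 72, it is presumably:

```latex
\mathrm{RB}_{\mathrm{cov}} = (1 - t_c)\,\frac{\bar{Y}_c - \bar{Y}_{nc}}{\bar{Y}}
```

where t_c is the coverage rate and Ȳc, Ȳnc are the means of the covered and noncovered units; this reconstruction is an inference, not the slide's own text.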

72 Coverage Bias as a Function of tc and the Relative Difference Between Ȳc and Ȳnc
[Figure: relative coverage bias (%) plotted against the relative difference between covered and noncovered units (%, from -50 to 50), for coverage rates tc = .50, .70, and .90]

73 Nonresponse Error
Unit nonresponse: noncontacts, refusals
Item nonresponse: individual questions skipped

74 Nonresponse Bias [Figure: the total population divided into respondents and nonrespondents]

75 Relative Bias Due to Nonresponse
RB_NR = (1 - t_R)(Ȳ_R - Ȳ_NR)/Ȳ

76 Example
t_R = response rate for a telephone survey = 75%
Ȳ_R = average income for respondents = 107 Kr
Ȳ_NR = average income for nonrespondents = 89 Kr
Ȳ = .75(107) + .25(89) = 102.50 Kr

77 Example (cont'd)
RB_NR = (.25)(107 - 89)/102.50 = .044, or 4.4%
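A short Python check of the slide's arithmetic:

```python
# Relative nonresponse bias for the slide's telephone-survey example.
t_r, y_r, y_nr = 0.75, 107.0, 89.0       # response rate, group means (Kr)
y_bar = t_r * y_r + (1 - t_r) * y_nr     # overall mean: 102.50 Kr
rb = (1 - t_r) * (y_r - y_nr) / y_bar    # 0.25 * 18 / 102.50
print(f"Y = {y_bar:.2f} Kr, RB = {rb:.3f} ({rb:.1%})")  # RB = 0.044 (4.4%)
```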

78 Response Bias as a Function of tR and the Relative Difference Between ȲR and ȲNR
[Figure: relative response bias (%) plotted against the relative difference between respondents and nonrespondents (%, from -50 to 50), for response rates tR = .50, .70, and .90]

79 Components of Response and Nonresponse

80 Estimating the Unresolved Units That Are In-Scope
(3A) = (2) × (3); together with (4) it forms the denominator of the rates on slide 83

81 In-scope Units

82 Out-of-scope Units

83 Response Rate Components (global process data)
Response rate: (6) / [(3A)+(4)]
Cooperation rate: (6) / [(6)+(13)]
Refusal rate: (13) / [(6)+(13)]
Refusal rate: (13) / [(3A)+(4)]
Nonresponse rate: (7) / [(3A)+(4)]
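A sketch computing the slide's rates in code. The keys follow the slide's row numbering, but both the counts and the interpretations given in the comments are hypothetical:

```python
# Hypothetical disposition counts keyed by the slide's row numbers.
counts = {"3A": 50,    # estimated in-scope among unresolved (assumed meaning)
          "4": 950,    # resolved in-scope units (assumed meaning)
          "6": 700,    # completed interviews (assumed meaning)
          "7": 300,    # nonrespondents (assumed meaning)
          "13": 180}   # refusals (assumed meaning)

eligible = counts["3A"] + counts["4"]
rates = {
    "response rate": counts["6"] / eligible,
    "cooperation rate": counts["6"] / (counts["6"] + counts["13"]),
    "refusal rate (of contacted)": counts["13"] / (counts["6"] + counts["13"]),
    "refusal rate (of eligible)": counts["13"] / eligible,
    "nonresponse rate": counts["7"] / eligible,
}
for name, rate in rates.items():
    print(f"{name}: {rate:.1%}")
```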

84 Factors Influencing Refusals
Survey Design: Mode Respondent rule Interview length Interview period length Survey topic Questionnaire design

85 Respondent Characteristics:
Age, gender, income, health Urban-rural Crime rate Literacy

86 Interviewer Characteristics:
Age, gender, race, perceived income, etc. Prior experience (skill, confidence) Interviewer expectations Attitude, recent experience, motivation Societal Factors: Social responsibility Legitimacy of survey objective

87 Psychological Factors (Groves, Cialdini, Couper, 1992)
Reciprocation: Compliance as repayment for a gift, payment, or concession; benefit to R Consistency: Compliance is consistent with an announced position (belief, attitude, or value) Social Validation: More willingness to comply if one believes that similar others would also comply

88 Authority: Compliance is more likely if request comes from a legitimate authority
Scarcity: More willingness to comply to secure opportunities that are scarce Liking: More willingness to comply to requests from interviewers who are liked

89 Implications for Interviewing
Prolong Interaction: Maintain conversation to identify cues to use with psychological factors Tailoring: Adapt interviewing approach to the sample unit

90 Other Methods to Handle NR
Decrease respondent burden New theory for respondent-friendly questionnaires Incentives Call scheduling algorithms Adjusting for nonresponse Dillman’s TDM Mixed mode Ensuring confidentiality

91 Introduction to Measurement Error: The Response Process and Its Implications for Questionnaire Design
Research Triangle Institute

92 [Diagram: sources of measurement error: the information system, setting, mode of data collection (interviewer-administered vs self-administered), respondent, interviewer, and instrument]

93 Response Processes
Cannell, Miller and Oksenberg 1981
Tourangeau 1984
Cantor and Edwards 1991
Biemer and Fecso 1995
Sudman, Bradburn and Schwarz 1996
Tourangeau, Rips and Rasinski 2000
Sudman, Willimack, Nichols and Mesenbourg 2000
Willimack and Nichols 2010

94 Response Processes
Individuals: encoding in memory; comprehension; retrieval; judgment and estimation; communicating an answer
Establishments: record formation; identification and selection of respondents; assessment of priorities; comprehension; retrieval; judgment and estimation; communicating an answer; data release

95 Phenomena I
Satisficing; telescoping; recency; primacy; surprise questions; context effects; response alternatives effect; middle alternatives and DK; vague terms; reference period; double-barreled questions; sensitive questions

96 Phenomena II Social desirability bias Respondent calculations
Vague quantifiers Number of scale points Progress indicators Aided recall Labelling scale points Numerical labels Acquiescence General and specific questions CAPITALIZED TEXT Images

97 Implications for Questionnaire Design
Wording; length
Format: open, closed, scales, filter
Positioning of questions
Type of question: factual, attitude, hypothetical
Layout; navigation; computer-aided

98 Encoding/Record Formation
Description: Knowledge is obtained, processed, and either stored in memory or recorded physically. To be retrieved, the information must exist.

99 Types of Errors: Proxy R error; responses from R’s who really “don’t know” Memory is incomplete, distorted, or inaccurate Records are missing, incomplete, or incompatible with survey requirements

100 Comprehension
Description: The meaning of the question, as the researcher intended it, is understood by the respondent

101 Types of Errors: Context Errors Use of technical terms
Translation problems Misleading response alternatives

102 Retrieval of Information
Description Respondent retrieves relevant information from memory or from records or other external sources

103 Types of Errors: Forgetting; telescoping; estimating; use of outdated records

104 Judgment and Formatting a Response
Description Information is evaluated and a response is formatted corresponding to the response alternatives presented

105 Types of Errors: Response alternatives are too constrained
Response alternatives suggest a response distribution Respondents are pressured into giving a “top of the head” response

106 Response Editing and Communication
Description Respondent edits response and communicates it

107 Types of Errors: Social desirability effects Fear of disclosure
Acquiescent behavior

108 Interviewers and Respondents: Errors Due to Interviewers and Interviewing
Research Triangle Institute

109 The Role of the Interviewer
School A: “Standardized” interview perspective Requires interviewers to: Read questions exactly as worded Refrain from unscripted interactions Obtain a codeable response from the respondent Avoid attempts to clarify concepts unless clarifications are prescripted

110 School B: “Collaborative” or “conversational” interview perspective
Requires interviewers to: Detect and repair respondent misunderstanding of the question Collaborate with respondent in the interview process Make common sense inferences in recording answers Redesign questions to adapt them to the respondent’s situation

111 In practice Conversational flexible interviewing approach
Mixture of standardized and conversational Person-oriented style

112 Systematic Interviewer Errors
Discussing personal opinions with the respondent Inconsistent probing Inconsistent feedback Rewording or misinterpreting questions Falsification

113 Systematic Errors
Poor questionnaire design: systematic errors across all respondents
Interviewer error: systematic errors within an interviewer's assignment

114 Design Factors that May Explain Interviewer Effects


116 Interviewer Variance [Figure: response distributions clustered within interviewers A through E, illustrating between-interviewer variance]

117 Interviewer Error Model
Observed value = true value + systematic error + variable error
Using this model, we estimate: ρ_int = variance(systematic error) / total variance of observed value

118 Consequences of Interviewer Error for Totals and Means
The variance of ȳ is increased (i.e., multiplied) by the factor 1 + (m - 1)ρ_int, where m is the average interviewer workload

119 Computation of the Increase in Variance
Example: Suppose m = 100 and ρ_int = .01. Then 1 + (m - 1)ρ_int = 1 + 99 × .01 = 1.99, approximately 2: the variance doubles.
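The factor as a one-liner, reproducing the slide's example:

```python
def interviewer_deff(m, rho_int):
    """Variance inflation factor 1 + (m - 1) * rho_int for mean workload m."""
    return 1 + (m - 1) * rho_int

print(f"{interviewer_deff(100, 0.01):.2f}")  # 1.99: the variance roughly doubles
```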



122 Evaluating Interviewer Performance
Monitoring telephone interviews Tape and video recording and behavior coding Verification recontact Reinterviews On-site observations Questionnaire review Keystroke files Mock interviews

123 Interview Mode and Setting: Data Collection Modes and Associated Errors
Research Triangle Institute

124 Data Collection Mode Modes of data collection Choosing a mode
Data quality considerations New technologies and mode

125 Modes I CAPI = Computer Assisted Personal Interviewing
ACASI = Audio CASI CATI = Computer Assisted Telephone Interviewing PAPI = Paper and Pencil Interviewing CADE = Computer Assisted Data Entry

126 Modes II TDE = Touchtone Data Entry
CASI = Computer Assisted Self Interviewing EDI = Electronic Data Interchange DBM = Disk by mail EMS = Electronic Mail Survey VRE = Voice Recognition Entry T-ACASI = Telephone ACASI

127 Face-to-face Flexible Expensive
Advantages and drawbacks of interviewer Visual aids

128 Telephone interviewing
Similar to face-to-face but less flexible
Fast
Monitoring possible
Questionnaires have to be simpler

129 Mail Good for sensitive topics No control over response process
Can be made respondent-friendly All survey materials must be crystal-clear Respondent sets the pace Question order effects reduced

130 Web Surveys Internet access varies
Differences in computer systems and browsers must be considered Good for visual stimuli Questionnaires should be short Fast and inexpensive

131 Diary
Recall error increases over time
Heavy response burden
Behavior can change temporarily
Used when the survey topic is such that the total survey period is quite long

132 Administrative records
Errors similar to those of other modes
Statisticians sometimes have no control over contents, updates, etc.
Statistical purposes come second after administrative ones
Conceptual differences common

133 Direct observation No respondents Devices and calibration problems
Various kinds of observations Counting behaviors, eye estimates, anthropology, mystery shopping, price collection, photos Observer errors (rho)

134 Mixed modes Can be an “optimal” solution
Can be a necessity due to frame problems or nonresponse problems Give respondents a choice Adjustment of questions and questionnaire seldom done

135 The Choice of Mode Each mode has advantages and disadvantages regarding Costs Measurement errors Nonresponse and coverage Flexibility Timeliness

136 The Decision Regarding Mode
Sometimes there is no real choice due to costs or practical constraints Often more than one mode must be used Pure mode effects difficult to assess The decision often concerns a main mode

137 Summary The choice of mode can be very simple or very complex
Error structures of new modes are not fully understood


139 Data Processing Quality: Data Processing Errors and Their Control
Research Triangle Institute

140 Data Processing Steps for PAPI
1. Check-in: questionnaires are collected and work units are formed
2. Scan edit: entries are inspected to avoid data entry problems
3. Data entry: questionnaire data are captured via keying, scanning or other optical sensing
4. Editing: captured data are "corrected" and "cleaned"; missing data are "imputed"

141 Data Processing Error
Relatively sparse literature
Some steps are very error prone (e.g., coding and editing)
Errors are both systematic and variable (ρ)
Increased automation and integration reduces variable error while increasing systematic error

142 Data Capture Errors Keying errors
Discovered by verification keying or editing Error rates usually small based on records, fields or characters Studies often conducted in QC environments The vital few large errors can have large effects on MSE

143 Data Capture Errors (cont'd)
Intelligent Character Recognition
Error types are substitution and rejection
Substitution errors can be systematic
The condition of incoming documents and of the equipment is crucial, which calls for continuing calibration
Might have to be complemented with manual keying

144 Editing
Definition: Editing is the identification and, if necessary, correction of errors and outliers in individual data used for statistics production. The definition does not state that all errors should be corrected or even identified. Editing can be very costly.

145 Purpose of editing To provide information about data quality (patterns and root causes) To provide information about future survey improvements To ”clean up” the data

146 Different Kinds of Editing
Micro-editing: Editing at record level Macro-editing: Editing at aggregate level Selective editing Output editing

147 The Result Is Overediting
Why: historical reasons; large budgets; editing is really QC of the data collection operation; feedback loop often missing; risk management

148 Key process variables for editing (examples)
Edit failure rate (#objects with edit failures / #objects edited) estimates the amount of verification
Correction rate (#objects corrected / #objects edited) estimates the effect
Edit success rate by variable (#objects with changes on variable X / #objects with edit failures on X) estimates how successfully the edits identify errors on X
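A minimal sketch computing the first two rates from record-level flags; the paradata records are hypothetical:

```python
# Each record carries paradata flags set by the editing system (hypothetical).
records = [
    {"edit_failure": True,  "corrected": True},
    {"edit_failure": True,  "corrected": False},
    {"edit_failure": False, "corrected": False},
    {"edit_failure": True,  "corrected": True},
]
n_edited = len(records)
edit_failure_rate = sum(r["edit_failure"] for r in records) / n_edited
correction_rate = sum(r["corrected"] for r in records) / n_edited
print(f"edit failure rate: {edit_failure_rate:.0%}")  # 75%
print(f"correction rate:   {correction_rate:.0%}")    # 50%
```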

149 Coding
A classification process where open-ended responses are classified into coding categories. Coding can be expensive, error-prone and boring. Coding can be manual (centralized or decentralized), automated, or computer-assisted.

150 The Generic Coding Process
Input: response, coding instructions, nomenclature
Action: coder judgment
Output: code number assignment

151 Coding Errors Coding is subjective in nature
Error rates and variability rates can be large Coding error occurs when there is a deviation between the assigned code number and the true code number

152 Coding Errors (cont'd)
Coding errors are identified by verification
Coding rules and nomenclatures may be incomplete
Errors are controlled by automation, dependent verification, and independent verification

153 Examples of Coding Error Rates
1970 Swedish Census: occupation 13.5%, industry 9.9%
1970 US Census: occupation 13.3%, industry 9.1%
1991 RTI: occupation 21%, industry 17%

154 Two-way Independent Verification with Adjudication
Production coding by Coder A results in code number xA; independent verification coding by Coder B results in code number xB. Compare xA and xB: if xA = xB, that code is the final, outgoing code number. If not, adjudication coding by Coder C results in code number xC. If xA = xC or xB = xC, the matching code is final; otherwise Coder D resolves the case and code number xD is the final, outgoing code.

155 Automated Coding
There should be a computer-stored dictionary. Responses are entered online or via some other medium like scanning or keying. Responses are matched with dictionary descriptions and, based on that matching, the responses are coded by the software or transferred to manual coding. By collecting and analyzing process data, the system is continually improved.
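A toy sketch of the dictionary step: an exact match codes the response automatically, anything else falls through to manual coding. The dictionary entries and code numbers are made up, and production systems also use inexact (fuzzy) matching, which is omitted here:

```python
# Hypothetical occupation dictionary: verbatim response -> code number.
dictionary = {"carpenter": "7115", "nurse": "3221", "teacher": "2341"}

def code_response(text):
    """Return (code, route): automated on an exact match, else manual."""
    key = text.strip().lower()
    if key in dictionary:
        return dictionary[key], "automated"
    return None, "manual"   # referred to a human coder

for response in ["Nurse", "software developer"]:
    print(response, "->", code_response(response))
```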

156 Levels of Automation: Computer-Assisted Coding and Automated Coding
Matching can be exact or inexact
Coding degrees obtained: purchases 73% (Sweden); industry and occupation 63% (US)

157 Key process variables in coding
Coding degree in AC and MC Effects in coding degree by updates of dictionary Coding degree by category, AC and MC Coding error rate by coders, categories, coding mode and update version CAC consultation degree by category and coder

158 File preparation Attaching weights to each unit
Final weight is a product of base weight and adjustment factors for nonresponse and noncoverage No theory for measurement error adjustment yet Computation can be difficult Application of disclosure avoidance techniques, macrodata and microdata
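The weight construction as a sketch; the structure (a product of base weight and adjustment factors) is the slide's, while the factor values are hypothetical:

```python
# Final weight = base weight * nonresponse adjustment * noncoverage adjustment.
base_weight = 250.0       # inverse of the selection probability (hypothetical)
adj_nonresponse = 1.33    # e.g., inverse response rate within a weighting class
adj_noncoverage = 1.05    # e.g., a poststratification factor

final_weight = base_weight * adj_nonresponse * adj_noncoverage
print(f"{final_weight:.3f}")  # 349.125
```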

159 The Total Survey Error Framework

160 Deming (1944) “On Errors in Surveys”
American Sociological Review! First listing of sources of problems, beyond sampling, facing surveys The 13 factors

161 Deming's 13 factors
The 13 factors that affect the usefulness of a survey
To point out the need for directing effort toward all of them in the planning process, with a view to usefulness and funds available
To point out the futility of concentrating on only one or two of them
To point out the need for theories of bias and variability that correlate accumulated experience


163 Comments on Deming (1944) Does include nonresponse, sampling, interviewer effects, mode effects, various other measurement errors, and processing errors Omits coverage errors Includes nonstatistical notions (auspices) Includes estimation step errors (wrong weighting) “Total survey error” not used as a term

164 Sampling Text Treatment of Total Survey Error
Kish, Survey Sampling, 1965 Graphic on biases 65 of 643 pages on various errors, with specified relationship among errors

165 [Figure: Kish's classification of biases]
Sampling biases: frame biases, "consistent" sampling bias, constant statistical bias
Nonsampling biases: nonobservation (noncoverage, nonresponse); observation (field: data collection; office: processing)

166 Sampling Text Treatment of Total Survey Error
Särndal, Swensson, Wretman, Model Assisted Survey Sampling, 1992 Part IV, 124 pp. of 694, coverage, nonresponse, measurement error; omits processing error Lohr, Sampling Design and Analysis, 2009 34 of 600 pages on nonresponse, and 40 on nonsampling errors and survey quality

167 Other textbooks
Cochran (1953), Sampling Techniques: 40 pages in the concluding chapter on "sources of error in surveys"
Deming (1950), Some Theory of Sampling: starts with the 1944 factors but then continues with pure sampling
Hansen, Hurwitz and Madow (1953), Sample Survey Methods and Theory, Vol 1: nine pages on survey errors
Zarkovich, Quality of Statistical Data

168 Total Survey Error (1979) Anderson, Kasper, Frankel, and Associates
Empirical studies on nonresponse, measurement, and processing errors for health survey data Initial total survey error framework in more elaborated nested structure

169 [Figure: nested decomposition of total error]
Total error: variable error and bias
Variable error: sampling; nonsampling (field, processing)
Bias: sampling (frame, consistent); nonsampling (nonobservation: noncoverage, nonresponse; observation: field, processing)

170 Survey Errors and Survey Costs (1989), Groves
Attempts conceptual linkages between the total survey error framework and psychometric true score theories, econometric measurement error, and selection bias notions
Ignores processing error
Highest conceptual break on variance vs. bias
Second conceptual break on errors of nonobservation vs. errors of observation

171 [Figure: Groves's decomposition of the mean square error]
Variance: errors of nonobservation (coverage, nonresponse, sampling) and observational errors (interviewer, respondent, instrument, mode)
Bias: errors of nonobservation (coverage, nonresponse, sampling) and observational errors (interviewer, respondent, instrument, mode)
Linked psychometric notions: construct validity (theoretical validity, empirical validity); reliability; criterion validity (predictive validity, concurrent validity)

172 Nonsampling Error in Surveys (1992), Lessler and Kalsbeek
Evokes “total survey design” more than total survey error Omits processing error

173 Components of Error Topics
Frame errors: missing elements, nonpopulation elements, unrecognized multiplicities, improper use of clustered frames
Sampling errors
Nonresponse errors: deterministic vs. stochastic view of nonresponse, unit nonresponse, item nonresponse
Measurement errors: error models of numeric and categorical data, studies with and without special data collections

174 Introduction to Survey Quality (2003), Biemer and Lyberg
Major division of sampling and nonsampling error
Adds "specification error" (a la "construct validity") or relevance error
Formally discusses process quality
Discusses "fitness for use" as quality definition

175 Sources of Error and Types of Error
Specification error: concepts, objectives, data element
Frame error: omissions, erroneous inclusions, duplications
Nonresponse error: whole unit, within unit, item, incomplete information
Measurement error: information system, setting, mode of data collection, respondent, interviewer, instrument
Processing error: editing, data entry, coding, weighting, tabulation

176 Survey Methodology (2009), Groves, Fowler, Couper, Lepkowski, Singer, Tourangeau
Notes twin inferential processes in surveys: from a datum reported to the given construct of a sampled unit, and from an estimate based on respondents to the target population parameter
Links inferential steps to error sources

177 [Figure: the twin inference chains]
Measurement: construct (construct validity) -> measurement (measurement error) -> response (processing error) -> edited data
Representation: inferential population -> target population (coverage error) -> sampling frame (sampling error) -> sample (nonresponse error) -> respondents
Both chains combine in the survey statistic

178 Key Statistical Developments in Total Survey Error 1
Errors of observers can be correlated (1902), Karl Pearson
Interpenetrating samples (1946), Mahalanobis
Criteria for true values (1951), Hansen, Hurwitz, Marks and Mauldin
Essential survey conditions, correlated response variance (1959), Hansen-Hurwitz-Bershad
Census Bureau survey model, the "mixed-error model" (1961), H-H-B

179 Key Statistical Developments in Total Survey Error 2
Interviewer effects using ANOVA (1962), Kish
Simple response variance via reinterviews (1964), Hansen-Hurwitz-Pritzker
Relaxed assumptions of zero covariance of true values and response deviations (1964, 1974), Fellegi
Errors of Measurement (1968), Cochran
Estimating model components via basic study schemes using replication, interpenetration and combinations of the two (1969), Bailar and Dalenius
Estimating nonsampling variance using mixed linear models (1978), Hartley and Rao
"Error Profile" of the Current Population Survey (1978), Brooks and Bailar
Multi-method multi-trait models on survey measures (1984), Wothke and Browne

180 Weaknesses of the Common Usage of "Total Survey Error"
Notably, a user perspective is missing
Key quality dimensions are missing in the TSE paradigm
Users often cannot, or prefer not to, question accuracy
The complexity does not invite outside scrutiny of accuracy
Users are not really informed about real levels of error or uncertainty
We don't really know how users perceive information on errors

181 Other Weaknesses of the Total Survey Error Paradigm 1
1. Lack of routine measurements: no agency does this; error/quality profiles are useful but rare
2. Ineffective influence on professional standards: little expansion beyond sampling error in practice; press releases on Federal statistics rarely contain even sampling errors; survey error research compartmentalized rather than integrated; methodologists tend to specialize; root causes of error often still missing; how about OMB's requirement of nonresponse bias studies when nonresponse is expected to exceed 20%?

182 Other Weaknesses of the Total Survey Error Paradigm 2
3. Large burden on design of some estimators: interpenetration and reinterviews for variance estimation are complicated and costly; intractable expressions for some components
4. Some assumptions unrealistic

183 Strengths of the Total Survey Error Framework
Taxonomic decomposition of errors: nomenclature for different components
Separation of phenomena affecting statistics in different ways: variance vs. bias; observation vs. nonobservation; respondent/interviewer/measurement task; processing
Conceptual foundation of the field of survey methodology: subfields defined by errors
Tool for identifying gaps in the research literature: e.g., where are the error evaluation papers on processing?

184 Needed Steps in a Research Agenda for Total Survey Error 1
1. Integrating causal models of survey errors: cognitive psychological mechanisms (anchoring, recall decay)
2. Research on the interplay of two or more error sources jointly: e.g., nonresponse and measurement error
3. Research on the interplay of biases and variances: e.g., does a simple response variance increase accompany some response bias reductions (self-administration effects)?

185 Needed Steps in a Research Agenda for Total Survey Error 2
4. Guidance on tradeoffs between quality measurement and quality maximization, and between measures and developing error-free processes: how much should we spend on quality enhancement vs. measurement of quality (Spencer, 1985)?
5. Integrating other notions of quality into the total survey error paradigm: if "fitness for use" predominates as a conceptual base, how can we launch research that incorporates error variation associated with different uses? (Australian Bureau of Statistics)

186 Needed Steps in a Research Agenda for Total Survey Error 3
6. Exploiting a multiple-mode, multiple-frame, multiple-phase survey world
7. Need for methodological studies to assist the user
8. Costs and risks
9. Develop theories for optimal design of specific operations, design principles
10. More standards?

187 Measures and Indicators of Quality

188 Assuring and Controlling Quality
Product level: main stakeholders are users and clients; control instruments: product specs, SLA, evaluation studies, frameworks, standards; measures and indicators: frameworks, compliance, MSE, user surveys
Process level: main stakeholder is the survey designer; control instruments: SPC, charts, acceptance sampling, risk analysis, CBM, SOP, paradata, checklists, verification; measures and indicators: variation via control charts, other paradata analysis, outcomes of evaluation studies
Organization level: main stakeholders are the agency, owner, and society; control instruments: excellence models, ISO, CoP, reviews, audits, self-assessments; measures and indicators: scores, strong and weak points, user surveys, staff surveys

189 Process data and Paradata

190 Definitions
A process is a series of actions or steps towards achieving a particular end
Process quality is an assessment of how far each step meets defined criteria
Process variables are factors that can vary with each repetition of the process
Key process variables are factors that have a large effect on the process end result
191 Some paradata terminology
Data, metadata, paradata
Macro paradata: global process data such as response rates, coverage rates, edit failure rates, sometimes broken down
Micro paradata: process data that concern individual records, such as flagged imputed records and keystroke data
Formal selection, collection, and analysis of key process variables that have an effect on a desired outcome, e.g., increased productivity

192 Definitions of paradata
Groves and Couper: paradata are data about the data collection process; they admit the definition is not well-evolved and subject to debate
Groves et al.: process and administrative data produced auxiliary to the survey data collection
The European term "process data" takes all survey processes into account
Developing terminology standards is usually a waste of time

193 Paradata is a subset of process data, but nothing to argue about
The important thing is: never collect data on processes that are not related to quality; every collection should be goal-driven
Collecting data on processes related to quality without using SPC and other proper analysis methods is extremely wasteful
If you don't know how to analyze, don't collect

194 Plan for continuous improvement (of a product) Marker and Morganstein 1997
Identify critical product characteristics Develop a process flow map Determine key process variables Evaluate measurement capability Determine stability of critical processes Determine process capability Establish a system for continuous process monitoring

195 Product characteristics
Ideally decided by the customer Communicating concepts and innovative ideas

196 Flow charts
Flow, decision points, customers; define owners
List process variables (those whose values can affect product characteristics)
At this stage a process "variable" is much broader than what is usually meant (factors such as prices, dates, lists of customers, etc. can be "variables")

197 Key process variables
A difficult step: the key variables are those that have the largest effect on process outputs
Collective knowledge is used in the selection process
Tools include the Pareto diagram and the cause-and-effect diagram (fishbone or Ishikawa)


199 Measurement capability
Do not reach conclusions about process stability without knowledge about measurement errors Available data may be useless Data should allow quantification of improvement Be careful when it comes to customer satisfaction surveys

200 Determine stability of critical processes
Control charts
Diagnose the type of process variation: special cause (assignable) or common cause
Take action

201 Determine system capability
After system changes (improvement projects) triggered by unacceptable common cause variation, process stability must be reevaluated to verify that the new process is capable of meeting specs such as minimum response rates, maximum error rates, deadlines, etc.
Reduced variation is maintained by adhering to SOPs or CBMs

202 System for process monitoring
Processes cannot be expected to remain stable over time: technology changes, new types of human errors appear, and customer requirements change
Thus, monitoring is necessary

203 Paradata in coding, say, occupation
Manual: error rate by coder, category, coder experience; within- and between-coder variability
Computer-assisted: degree of computer consulting; error rates combined with computer use
Automated: error rates by category; coding degree in general, by dictionary update, by dictionary type

204 New types of paradata Interviewer notes Attributes of call attempts
Nature of interaction with sample member Behaviours during the interview Flagging imputed records Keystroke data Response latency

205 Importance of paradata (I)
Continuous updates of progress and stability checks (monitoring) Control charts, standard reports Managers choose to act or not to act Early warning system Input to long-run process improvement of product quality Analysis of special and common cause variation Input to methodological changes Finding and eliminating root causes of problems Research

206 Importance of paradata (II)
Responsive designs Simultaneous monitoring of paradata and regular survey data to improve efficiency and accuracy Input to organizational change E.g., centralization, decentralization, standardization Quality profiles, client communication, public use paradata files, inference, picturing quality over time

207 Exploratory analysis of paradata
Example of multivariate situation Observing one interviewer: Large % vacant housing, unusual time of interview, short interview length, response pattern does not vary much Possible curbstoning
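A sketch of turning the slide's multivariate pattern into a screening rule: flag an interviewer only when several paradata indicators are extreme at once. The indicators, thresholds, and data are all hypothetical:

```python
# Paradata summaries per interviewer (hypothetical).
interviewers = {
    "A": {"pct_vacant": 0.04, "mean_minutes": 22.0, "answer_variance": 1.8},
    "B": {"pct_vacant": 0.31, "mean_minutes": 6.5,  "answer_variance": 0.2},
}

def suspicious(iv):
    """Flag only when several indicators are extreme simultaneously."""
    hits = [iv["pct_vacant"] > 0.20,      # unusually many vacant dwellings
            iv["mean_minutes"] < 10,      # unusually short interviews
            iv["answer_variance"] < 0.5]  # answers barely vary
    return sum(hits) >= 2

for name, iv in interviewers.items():
    if suspicious(iv):
        print(f"interviewer {name}: review for possible curbstoning")
```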

208 Risks associated with paradata
The volume can be large, since paradata arise as automatic byproducts
There can be many indirect indicators of cost and quality
Correct analysis approaches must be used
Ethical concerns
Overuse and underuse

209 Thoughts on development
Process indicators should be key Paradata are multivariate in nature and might have to be combined to be relevant We need to learn how to use paradata to intervene in the process as needed Create paradata archives to allow reanalysis so that understanding of what is key can grow or change Examine potential of partnership across organizations Communicating paradata with users

210 An Overview of Survey Error Evaluation Methods
Research Triangle Institute

211 Purpose of Survey Error Evaluation
Compare data collection modes or methods
Optimize allocation of resources
Error reduction for specific survey processes
Provide users with information on data quality
Adjusting estimates for nonsampling error

212 General Methods for Evaluation
Pretesting
Experiments
Statistical process control: key process variables, control charts, acceptance sampling
Postsurvey validation

213 Some Techniques for Survey Evaluation I
Evaluation method: expert review of questionnaires (unstructured, structured)
Stage: design
Purpose: identify problems with questionnaire layout and format, question wording, order and instructions

214 Some Techniques for Survey Evaluation II
Evaluation method: cognitive methods (behavior coding, cognitive interviewing, other cognitive lab methods)
Stage: design/pretest
Purpose: evaluate one or more stages of the response process

215 Some Techniques for Survey Evaluation III
Evaluation method: debriefings (interviewer group discussions, respondent focus groups)
Stage: pretest/survey/post-survey
Purpose: evaluate questionnaire and data collection procedures

216 Some Techniques for Survey Evaluation IV
Evaluation method: observation (supervisor observation, telephone monitoring, tape recording/CARI)
Stage: pretest/survey
Purpose: evaluate interviewer performance; identify questionnaire problems

217 Some Techniques for Survey Evaluation V
Evaluation method: post-survey analysis (experimentation, nonrandom observation, internal consistency, external validation)
Stage: post-survey
Purpose: compare alternative methods of data collection, estimate MSE components, validate estimates

218 Some Techniques for Survey Evaluation VI
Evaluation method: post-survey data collection (reinterviews, nonresponse follow-up, record checks)
Stage: post-survey
Purpose: estimate MSE components

219 Basic interview-reinterview table for a dichotomous variable
                  Reinterview = 1   Reinterview = 0   Total
Interview = 1            a                 b           a+b
Interview = 0            c                 d           c+d
Total                   a+c               b+d            n

220 Some measures
g = (b+c)/n, the gross difference rate or disagreement rate
A = (a+d)/n, the agreement rate (= 1 - g)
ndr = (b-c)/n, the net difference rate
I = g/[p1(1-p2) + p2(1-p1)], the index of inconsistency, where p1 = (a+b)/n and p2 = (a+c)/n
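The measures in code, with hypothetical cell counts:

```python
# Cell counts from the interview-reinterview table (hypothetical).
a, b, c, d = 60, 10, 8, 22
n = a + b + c + d

g = (b + c) / n                            # gross difference rate
A = (a + d) / n                            # agreement rate (= 1 - g)
ndr = (b - c) / n                          # net difference rate
p1, p2 = (a + b) / n, (a + c) / n          # marginal proportions
I = g / (p1 * (1 - p2) + p2 * (1 - p1))    # index of inconsistency
print(f"g = {g:.2f}, A = {A:.2f}, ndr = {ndr:.2f}, I = {I:.2f}")
```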

221 Practical Survey Design for Minimizing MSE

222 What Should Be Designed?
Requirements + specifications + operations
Ideal goal + defined goal + actual results
Good survey design means control of accuracy through the specs (QA) and control of operations (QC)

223 Some Early Thinking
Hansen-Hurwitz-Pritzker 1967: take all error sources into account; minimize all biases and select a minimum-variance scheme so that Var becomes an approximation of (a decent) MSE
The zero defects movement that later became Six Sigma
Dalenius 1969: total survey design

224 Alternative Criteria of Effectiveness
Minimizing MSE for a given budget while meeting other requirements Maximizing fitness for use for a given budget Maximizing comparability for a given budget All these reversed Something else?

225 The Elements of Design
Assessing the survey situation (requirements)
Choosing methods, procedures, "intensities", and controls (specifications)
Allocating resources
Assessing alternative designs
Carrying out one of them or a modification of it
Having a Plan B

226 So, What’s the Problem? No established survey planning theory
Multi-purpose, many users The information paradox Uninformed clients/users/designers Much design work is partial, not total Limited knowledge of effects of measures on MSE and cost

227 More Problems Decision theory and economics theory not used to their potential New surveys conducted without sufficient consideration of what is already known No one knows the proper allocation of resources put in before, during and after The literature is small

228 Various Skills Needed Which Calls for a Design Team
Survey methodology Subject-matter Statistics (decision theory, risk analysis, loss functions, optimization, process control) Economics (cost functions, utility) IT

229 Rules of the Road Use reliable methods
Develop a survey plan showing the resource allocation to each stage To be able to allocate resources optimally, collect information during planning and implementation

230 Rules of the Road (cont’d)
Monitor the processes that lead to the product Disseminate information on data quality to users and producers

231 The Balance Between Cost, Errors and Other Quality Features
Quality dimensions conflict Accuracy vs timeliness Accuracy vs relevance Comparability vs accuracy Cost vs error

232 Problems that Impede our Ability to Optimize Surveys
Lack of expertise The relationship between resources spent on error reduction and actual error reduction is unknown Survey errors are highly interactive

233 Problems that Impede our Ability to Optimize Surveys (cont'd)
Major surveys are multi-purpose
All quality dimensions and constraints on them limit design flexibility
It is not known how to allocate resources between pilot studies, error reduction and error measurement

234 Bad News and Good News Bad news
Cost-survey optimization can be extremely complex and much of this complexity is unknown

235 Bad News and Good News Good news
Simple models describing the relationship between cost and error are still useful because often the optimum is flat

236 The Adaptive Element
The entire survey process should be responsive to anticipated uncertainties that exist before the process begins and to real-time information obtained throughout the execution of the process; in short, use process data (paradata) to check and, if necessary, adjust the process

237 We Should Assemble What We Know
Assessment methods Design principles Trade-offs and their effects The potential offered by other disciplines We shouldn’t accept partial designs

238 Apply Design Principles
If pop is skewed then…. If pop is nested then…. If questions are sensitive then…. If a high NR rate is expected then…

239 Examples of Trade-offs
Accuracy vs timeliness Response burden vs wealth of detail Conduct survey vs other information collection Large n vs smaller n Mixed vs single mode NR bias vs measurement error NR vs interpretation by family members

240 Example of Outline for a Survey Plan
Statement of work Technical approach Management plan Schedule of activities and deliverables Budget

241 Checking out the Resources
Consult in-house experts Participate in professional activities Develop current best methods for major survey processes

242 Checking out the Resources (cont’d)
Apply findings from the survey methods literature Consult general quality guidelines developed by prominent organizations

243 Examples of resources
Conferences: ASA, AAPOR, ISI, topic conferences
Journals: JOS, Survey Methodology, POQ
Books

244 Using Pilot Studies to Inform Survey Design
Paradox: In principle, the survey designer needs information that will not be available until the survey has been completed The answer: Pilot studies on a smaller scale than the survey itself

245 Examples of Pilot Study Topics
Choice of mode Length of recall period Topic sensitivity Response burden Clarity of concepts and definitions Effect of confidentiality pledges Question wording Alternative respondent rules Time estimates Expected rates of nonsampling error Cost components

246 Documentation Survey administrative processes Survey plan
Revisions of plan Process details Process variables

247 Documentation (cont'd)
Quality reports: use a framework based on quality dimensions; report estimates of MSE components; in the absence of MSE component estimates, provide indicators of quality
Implement a rolling evaluation scheme

