Improving Data Quality. Course Description. Introduction to Data Quality: Course Outline.

Presentation transcript:

1 Improving Data Quality

2 COURSE DESCRIPTION

3 Introduction to Data Quality: Course Outline

4 What is Quality Data?

5 Does Data Quality Matter? Policy and program decisions; trends; modification of survey tools.

6 Data Quality: The degree of excellence or accuracy of the factual information collected in a survey or census, judged against what the user needs for decision-making purposes.

7 Quality Assurance Program: A good quality assurance program is an effective tool for fine-tuning the products and processes of a census or survey so that data errors are prevented before they happen, saving time and money.

8 Goals of Data Quality: Relevance, Accuracy, Timeliness, Accessibility, Interpretability, Coherence.

9 RELEVANCE: Relevance is the degree to which the data meet the users’ needs. To meet these needs, subject-matter specialists of the statistical organization must meet with the users to define: items to be measured; concepts and definitions; analytical plans; tabulation plans.

10 ACCURACY: The objective of a survey or census is to obtain estimates of the true (unknown) value of a population or economic parameter. For these estimates to have any worth, they must be close to the true value. Therefore, it is of utmost importance to establish accuracy as a primary goal for data production.

11 TIMELINESS: Timeliness refers to the length of time between data availability and the event it describes. Timely information is valuable because it can still be acted upon. Timeliness is usually a trade-off with accuracy.

12 ACCESSIBILITY: The accessibility of statistical information refers to the ease with which it can be obtained from the national statistical office. This includes the ease with which the existence of information can be ascertained, as well as the suitability of the form or medium through which the information can be accessed. The cost of the information may also be an aspect of accessibility for some users.

13 INTERPRETABILITY: The interpretability of statistical information reflects the availability of the supplementary information and metadata necessary to interpret and utilize it appropriately. This information normally covers the underlying concepts, variables, and classifications used, the methodology of collection, and indications of the accuracy of the statistical information.

14 COHERENCE: The coherence of statistical information reflects the degree to which it can be successfully brought together with other statistical information within a broad analytical framework over time. The use of standard concepts, classifications, and target populations promotes coherence, as does the use of common methodology across surveys.

15 Responsibility of the Statistical Organization: Produce timely, coherent data that satisfy users’ needs and are accessible and easily understood, while insisting on the greatest possible accuracy. (Relevance, Timeliness, Accuracy, Coherence, Accessibility, Interpretability.)

16 Benefits of High Quality Data: increased use of data; increased visibility and prestige for the statistical office; a culture of data use and demand.

17 Quality Assurance Program: Major components: a Training Program; a Quality Control Program; an Evaluation Program.

18 Purposes of Quality Control: To control the product. Census products are the results of any work produced by one group of persons that will be used by another group of persons later in the census. To control census products, we need definitions of what is acceptable for each product, decision rules to determine which products are accepted or rejected, and appropriate actions to take based on the results of the decision.
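
The decision rule described on this slide could be sketched as below. This is only an illustration: the `BatchResult` structure, the `decide` function, and the 5% threshold are assumptions made for the example and do not come from the course material.

```python
from dataclasses import dataclass


@dataclass
class BatchResult:
    """Inspection outcome for one census product (hypothetical structure)."""
    batch_id: str
    items_inspected: int
    items_in_error: int


def decide(batch: BatchResult, max_error_rate: float = 0.05) -> str:
    """Apply a simple accept/reject rule to one inspected batch.

    max_error_rate stands in for the "definition of acceptable";
    the 5% default is an assumed value, not one given in the slides.
    """
    if batch.items_inspected <= 0:
        raise ValueError("a batch must have at least one inspected item")
    error_rate = batch.items_in_error / batch.items_inspected
    return "accept" if error_rate <= max_error_rate else "reject"


# Example: 3 errors in 100 items passes, 8 errors in 100 items does not.
print(decide(BatchResult("A", 100, 3)))
print(decide(BatchResult("B", 100, 8)))
```

The action taken on a rejected batch (rework, full re-verification, retraining) would be defined separately for each product.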

19 Purposes of Quality Control: To control the process. Control the methods used to monitor the operation. Control the steps that determine when an employee needs to be retrained or released.

20 [Diagram: the survey process. Stages: PLAN the products, COLLECT the data, DELIVER the products. Activities: user meetings, design and development, data collection, post-collection processing, analysis, dissemination. Supporting functions: documentation, customer service, managing the process, quality control.]

21 Anatomy of a Survey/Census: 5 Phases: Contract Negotiation; Design and Development; Data Collection; Post-Collection Processing; Analysis and Dissemination. Each phase has its own: objective; key tasks; deliverables; documentation.

22 Contract Negotiation. Objective: Identify the sponsor’s needs and outline the survey(s) to meet those needs. Key Tasks: understand the requirements; generate the contract; negotiate to final decision; gain necessary government approval/clearance. Deliverables: approved contract; rough schedules and timelines; rough questionnaire outline. Documentation: project description; list of data products expected; contract.

23 Design and Development. Objective: Develop survey tools to meet the objectives, given time and cost parameters. Key Tasks: finalize schedule; sampling; create input files (listing); develop/revise data capture systems; develop and test the questionnaire; develop training and interviewing materials; conduct field pre-test; test systems. Deliverables: sample; approved data collection/capture modes; input files (master list); training/interview materials; analysis plan. Documentation: baseline schedule; final specifications; sampling plan; training materials; instrument documentation.

24 Data Collection. Objective: Gather raw data in a timely and cost-effective manner. Key Tasks: conduct training; field the survey; collect the data from the field; monitoring and problem solving. Deliverables: status of each case; raw data for each case. Documentation: tracking report of field problems; progress/status reports.

25 Post-Collection Processing. Objective: Generate accurate and organized final microdata. Key Tasks: data capture; data receipt (reformatting); preliminary review; clean the data; imputation; weighting; generation of preliminary tables; monitoring and problem solving. Deliverables: approved internal data file (microdata); crosstabulations and/or work tables. Documentation: data dictionary; all processing specifications (coding, editing, imputation, weighting, etc.); problem tracking and progress reports.

26 Analysis and Dissemination. Objective: Translate data into useful information that meets objectives, and distribute it to the appropriate audience. Key Tasks: send data directly to sponsor (if applicable); create public use file; table/publication generation; compile/produce final documentation; evaluation and debrief. Deliverables: tables for publication; reports; public use file; press releases. Documentation: lessons learned; procedural history; reports/publications/press releases; public use file; disclosure request.

27 Activities of a Quality Assurance Program: measurement of quality characteristics; comparison to pre-determined standards; corrective actions.

28 Quality Control Inspections. Types: Qualitative or Attribute Inspections: examination of a characteristic of interest to determine whether a certain property is present or absent. Quantitative or Variable Inspections: measurement of the characteristic of interest on a continuous scale. Methods: sample inspections; 100% verification.
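
The two inspection types and the two methods on this slide can be illustrated with a short sketch. All function names, the example codes, and the fixed random seed are assumptions made for illustration; the slides do not prescribe an implementation.

```python
import random


def attribute_inspection(value, allowed_codes):
    """Qualitative check: is the property (a valid code) present or absent?"""
    return value in allowed_codes


def variable_inspection(measured, reference):
    """Quantitative check: size of the deviation on a continuous scale."""
    return abs(measured - reference)


def select_for_inspection(items, fraction=1.0, seed=0):
    """Choose which items to inspect.

    fraction=1.0 corresponds to 100% verification; a smaller fraction
    draws a simple random sample (an assumed sampling scheme).
    """
    items = list(items)
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if fraction == 1.0 or not items:
        return items
    sample_size = min(len(items), max(1, round(len(items) * fraction)))
    return random.Random(seed).sample(items, sample_size)


# Attribute inspection: is the recorded sex code one of the allowed codes?
print(attribute_inspection("M", {"M", "F"}))
# Variable inspection: how far is a keyed age from the re-measured age?
print(variable_inspection(42, 40))
# Sample inspection of 10% of 100 questionnaires.
print(len(select_for_inspection(range(100), fraction=0.1)))
```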

29 Verification Methods. Dependent Verification: involves a production clerk and a verifier; the verifier sees the production clerk’s work. PROBLEM: In dependent verification, the verifier may agree more often than they should since they see the production clerk’s work.

30 2-Way Independent Verification (two-way match): involves a production clerk, a verifier, and a matcher. Agreements between production clerk and verifier are treated as correct; disagreements are reviewed by a matcher. PROBLEM: Independent verification is more costly since there are three clerks involved in the process: the production clerk, the verifier, and the matcher. However, independent verification is more accurate since the verifier is not influenced by the production clerk’s work.
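
A minimal sketch of the two-way match described on this slide follows. The function name, the dictionary inputs keyed by item identifier, and the `matcher` callable standing in for the human matcher are all assumptions made for illustration.

```python
def two_way_match(clerk, verifier, matcher):
    """Compare independently produced values item by item.

    clerk, verifier: dicts mapping item id -> value (assumed layout).
    matcher: callable(item_id, clerk_value, verifier_value) -> value,
             a stand-in for the person who reviews disagreements.
    Returns the final values and the list of item ids sent to the matcher.
    """
    final = {}
    sent_to_matcher = []
    for item_id, clerk_value in clerk.items():
        verifier_value = verifier[item_id]
        if clerk_value == verifier_value:
            # Agreement between clerk and verifier is treated as correct.
            final[item_id] = clerk_value
        else:
            # Disagreement is reviewed by the matcher.
            sent_to_matcher.append(item_id)
            final[item_id] = matcher(item_id, clerk_value, verifier_value)
    return final, sent_to_matcher


clerk_codes = {"case1": "01", "case2": "02"}
verifier_codes = {"case1": "01", "case2": "03"}
# For the example only, the stand-in matcher sides with the verifier.
final, reviewed = two_way_match(
    clerk_codes, verifier_codes, lambda item_id, c, v: v
)
print(final, reviewed)
```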

31 3-Way Independent Verification (three-way match): involves a production clerk, two independent verifiers, and a matcher. If all three (clerk and verifiers) agree, the answer is treated as correct. If two out of three agree, an error is charged. If all three disagree, no error is charged and the matcher decides the correct answer. PROBLEM: more costly but more accurate.
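
The three-way rule above can be sketched as follows. The slide does not say who is charged when two out of three agree; this sketch assumes the error is charged to the party in the minority. The function name, the role labels, and the `matcher` callable are likewise hypothetical.

```python
from collections import Counter


def three_way_match(clerk_value, verifier1_value, verifier2_value, matcher):
    """Resolve one item under a three-way independent match.

    Returns (final_value, charged), where charged lists the roles
    charged with an error. Charging the minority party is an
    assumption; the source only says "an error is charged".
    matcher: callable(values_by_role) -> value, a stand-in for the
             person who decides when all three disagree.
    """
    values = {
        "clerk": clerk_value,
        "verifier_1": verifier1_value,
        "verifier_2": verifier2_value,
    }
    top_value, top_count = Counter(values.values()).most_common(1)[0]
    if top_count == 3:
        # All three agree: treated as correct, nobody is charged.
        return top_value, []
    if top_count == 2:
        # Majority value stands; the dissenting role is charged (assumed).
        charged = [role for role, v in values.items() if v != top_value]
        return top_value, charged
    # All three disagree: no error is charged, the matcher decides.
    return matcher(values), []


print(three_way_match("A", "A", "A", matcher=None))
print(three_way_match("A", "B", "A", matcher=None))
print(three_way_match("A", "B", "C", matcher=lambda vals: vals["verifier_2"]))
```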

