
1 U.S. Department of the Interior U.S. Geological Survey Tutorials on Data Management Lesson 6: Manage Quality CC image by Shane Melaugh on Flickr

2 Manage Quality Provided by DataONE
Lesson Topics
- Definitions
  - Quality assurance and quality control
  - Data contamination
  - Types of errors
- QA/QC best practices
  - Before data collection
  - During data collection/entry
  - After data collection/entry
CC image by cobalt123 on Flickr

3 Learning Objectives
After completing this lesson, the participant will be able to:
- Define data quality control and data quality assurance
- Perform quality control and assurance on their data at all stages of the research cycle
CC image by 0xFCAF on Flickr

4 The Data Lifecycle

5 Information Quality Act
- Requires federal agencies to publish their own guidelines for ensuring and maximizing the quality, objectivity, utility, and integrity of the information they disseminate
Provided by Tom Chatfield, BLM

6 Information Quality Act
- Also requires agencies to establish mechanisms for receiving requests for correction of information
- Further requires that agencies report annually on the number and nature of complaints received and the steps taken to resolve them
Provided by Tom Chatfield, BLM

7 What information is covered by the Act?
- Information must be disseminated
  - Either published or publicly available on or after October 1, 2002
- Agency sponsored
  - Agency creates and uses the data or has decided to use the data
  - Could include third-party data if it is used for agency decision-making and/or can be considered “endorsed” by the agency
Provided by Tom Chatfield, BLM

8 What information is not covered?
- Party-to-party transactions
- Internet hyperlinks and other references
- Opinions
- Press releases, fact sheets, press conferences, or similar communications in any medium
- Public filings of information
- Dissemination of information by an agency-employed scientist, grantee, or contractor
Provided by Tom Chatfield, BLM

9 What information is not covered?
- Testimony and other submissions to Congress
- Inadvertent or unauthorized disclosure of information intended only for inter-agency and/or intra-agency use or communication
- Correspondence with individuals
- Records covered by other laws, including FOIA, Privacy Act, etc.
- Archived records
Provided by Tom Chatfield, BLM

10 Objectivity, Utility, & Integrity
- Objectivity: the process under which the data are collected and maintained
- Utility: usefulness of the information to the intended users
- Integrity: protections against unauthorized tampering
Provided by Tom Chatfield, BLM

11 How Is Data Objective?
- Objectivity is achieved by using the same procedures to collect, verify, record, maintain, and archive data for each occurrence
- Ensures that no possible bias exists
- Procedures must be “Transparent”
- Procedures must be “Reproducible”
Provided by Tom Chatfield, BLM

12 How to Define Transparency
- Identify:
  - the source
  - assumptions employed
  - analytical methods employed
  - statistical procedure involved
- Identify/use commonly accepted standards
- Document what you did
Provided by Tom Chatfield, BLM

13 How to Define Reproducibility
- Anyone who follows the process the agency used will get the same result
- An agency is not required to duplicate the process if challenged and is not compelled to run the process to certify that it can be duplicated
- An agency needs to be prepared to explain the process and must ensure that the process was followed in all cases
Provided by Tom Chatfield, BLM

14 How Is Utility of Data Determined?
- Based on the relevance of data to the analysis being performed
- Information Quality Act Guidelines: the data’s usefulness to the public (data must have a perceived public benefit)
- There must be a connection between strategic mission requirements and the data collected/maintained by an agency
- The intent is to eliminate collection and retention of extraneous data
Provided by Tom Chatfield, BLM

15 Data From External Sources
- Agencies must disclose what they know of the quality of the data
- Transparency and reproducibility standards apply to external information
- Agency can be challenged on “third party data” under the Information Quality Act
Provided by Tom Chatfield, BLM

16 Third Party Data Challenge
- The response should be to inform the party that possesses the data about the concern and notify the complainant
- While the agency is not responsible for correcting the data, it can be held responsible for relying on the data for decision-making purposes
Provided by Tom Chatfield, BLM

17 Data Integrity
- Integrity refers to the protection of information from unauthorized access or revision, to ensure that the information is not compromised through corruption or falsification
- Access and security controls must be sufficient to prevent contamination of the data
Provided by Tom Chatfield, BLM

18 Request for Correction of Information
- Who may request a correction of information?
  - Must be an “affected person”
  - Anyone who may use, be benefited by, or be harmed by the disseminated information
Provided by Tom Chatfield, BLM

19 Contents of Data Quality Correction Request
- Name and contact information
- A description of the information the person believes does not comply with the Guidelines
- An explanation of how the information does not comply with the Guidelines
- Recommendation of corrective action
Provided by Tom Chatfield, BLM

20 BLM Request for Information Correction Process
- USGS? REQUEST PROCESS?
Provided by Tom Chatfield, BLM

21 Definitions: Data Contamination
- Process or phenomenon, other than the one of interest, that affects the variable value
- Erroneous values
CC image by Michael Coghlan on Flickr

22 Definitions: Types of Errors
- Errors of commission
  - Incorrect or inaccurate data entered
  - Examples: malfunctioning instrument, mistyped data
- Errors of omission
  - Data or metadata not recorded
  - Examples: inadequate documentation, human error, anomalies in the field
CC image by Nick J Webb on Flickr

23 Defining QA/QC
- Strategies for preventing errors from entering a data set
- Activities to ensure quality of data before collection
- Activities that involve monitoring and maintaining the quality of data during the study

24 QA/QC Before Collection
- Define & enforce standards
  - Formats
  - Codes
  - Measurement units
  - Metadata
- Assign responsibility for data quality
  - Be sure the assigned person is educated in QA/QC
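As one way to picture what "define and enforce standards" could look like in practice, the following minimal Python sketch checks a record against an agreed date format, code list, and measurement unit. The field names, the species codes, and the choice of centimeters are hypothetical and chosen only for illustration; they do not come from the lesson.

```python
import re

# Hypothetical standards for an illustrative field survey (assumptions, not from the lesson).
ALLOWED_SPECIES_CODES = {"ACRU", "QUAL", "PIST"}   # agreed code list
DATE_FORMAT = re.compile(r"^\d{4}-\d{2}-\d{2}$")   # YYYY-MM-DD
EXPECTED_UNIT = "cm"                               # agreed measurement unit


def check_record(record):
    """Return a list of standards violations for one record (empty if none)."""
    problems = []
    if not DATE_FORMAT.match(record.get("date", "")):
        problems.append("date is not in YYYY-MM-DD format")
    if record.get("species") not in ALLOWED_SPECIES_CODES:
        problems.append("species code is not in the agreed code list")
    if record.get("unit") != EXPECTED_UNIT:
        problems.append("unit is not " + EXPECTED_UNIT)
    return problems


good = {"date": "2012-06-01", "species": "ACRU", "unit": "cm"}
bad = {"date": "6/1/12", "species": "maple", "unit": "in"}

print(check_record(good))
print(check_record(bad))
```

Running such a check at entry time, rather than after the fact, keeps nonconforming values from entering the data set in the first place.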

25 QA/QC During Data Entry
- Double entry
  - Data keyed in by two independent people
  - Check for agreement with computer verification
- Record a reading of the data and transcribe from the recording
- Use a text-to-speech program to read data back
CC image by weskriesel on Flickr
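The computer verification step of double entry can be sketched as a cell-by-cell comparison of the two independently keyed copies. This is a minimal illustration that assumes both passes are lists of dictionaries with the same row order; the function name and the sample values are hypothetical.

```python
def compare_entries(first_pass, second_pass):
    """Return (row_index, column, value_1, value_2) for every disagreement.

    Only the columns present in the first pass are compared.
    """
    if len(first_pass) != len(second_pass):
        raise ValueError("the two passes have different numbers of rows")
    mismatches = []
    for i, (row_a, row_b) in enumerate(zip(first_pass, second_pass)):
        for column in row_a:
            if row_a[column] != row_b.get(column):
                mismatches.append((i, column, row_a[column], row_b.get(column)))
    return mismatches


# Hypothetical data keyed in by two different people.
entry_a = [{"site": "A1", "depth_cm": "12.5"}, {"site": "A2", "depth_cm": "8.0"}]
entry_b = [{"site": "A1", "depth_cm": "12.5"}, {"site": "A2", "depth_cm": "80"}]

mismatches = compare_entries(entry_a, entry_b)
print(mismatches)
```

Each reported mismatch is then resolved against the original field sheet or recording, not by guessing which copy is right.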

26 QA/QC During Data Entry
- Design data storage well
  - Minimize the number of times items must be entered repeatedly
  - Use consistent terminology
  - Atomize data: one cell per piece of information
- Document changes to data
  - Avoids duplicate error checking
  - Allows undo if necessary

27 QA/QC After Data Entry
- Make sure data line up in proper columns
- Check that there are no missing, impossible, or anomalous values
- Perform statistical summaries
CC image by chesapeakeclimate on Flickr
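A minimal Python sketch of these post-entry checks follows. It assumes missing values are stored as None and that the plausible range for the variable is known in advance; the pH readings and the 0 to 14 range are hypothetical illustration values.

```python
import statistics


def screen_column(values, lower, upper):
    """Locate missing and out-of-range values, then summarize the rest."""
    missing = [i for i, v in enumerate(values) if v is None]
    impossible = [
        i for i, v in enumerate(values)
        if v is not None and not (lower <= v <= upper)
    ]
    valid = [v for v in values if v is not None and lower <= v <= upper]
    summary = None
    if valid:  # summaries are only meaningful when some valid values remain
        summary = {
            "n": len(valid),
            "min": min(valid),
            "max": max(valid),
            "mean": statistics.mean(valid),
        }
    return missing, impossible, summary


# Hypothetical pH readings: one missing, one impossible (71.0 is likely a misplaced decimal).
ph = [6.8, 7.1, None, 71.0, 6.9]
missing, impossible, summary = screen_column(ph, 0.0, 14.0)
print(missing, impossible, summary)
```

The function only reports positions; deciding whether a flagged value is a typo or a real observation remains a human judgment.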

28 QA/QC After Data Entry
- Look for outliers
  - Outliers are extreme values for a variable given the statistical model being used
  - The goal is not to eliminate outliers but to identify potential data contamination

29 QA/QC After Data Entry
- Methods to look for outliers
  - Graphical
    - Normal probability plots
    - Regression
    - Scatter plots
  - Maps
  - Subtract values from mean
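The "subtract values from mean" method can be sketched as follows: compute each value's deviation from the mean, scale it by the sample standard deviation, and flag values beyond a chosen cutoff. The cutoff of 2 standard deviations and the temperature readings are assumptions made for this illustration, not values from the lesson.

```python
import statistics


def flag_outliers(values, threshold=2.0):
    """Return (index, value) pairs whose distance from the mean exceeds
    `threshold` sample standard deviations."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        return []  # all values identical, nothing to flag
    return [
        (i, v) for i, v in enumerate(values)
        if abs(v - mean) / sd > threshold
    ]


# Hypothetical water temperatures in degrees Celsius.
temps = [14.2, 14.8, 15.1, 14.6, 41.5, 15.0, 14.9, 14.4]
flagged = flag_outliers(temps)
print(flagged)
```

In keeping with the goal stated earlier, flagged values are candidates for investigation, not automatic deletion. Note that a single extreme value also inflates the mean and standard deviation, so in small samples this method can miss outliers that a plot would reveal.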

30 Points to Remember
- Determine the relative importance of the fields you are entering (compared to other fields you are entering)
- Adjust any quality control factors (# in sample, for instance) to ensure that accuracy level is properly accounted for
- Target training and review to those fields with the highest accuracy level requirement
- Do not assume overall quality based on entries alone; ensure that the relative importance of certain entries is factored in
- Anyone can lie (or at least mislead) with statistics
Provided by Tom Chatfield, BLM

31 Error Rates
- Consider the overall number of entries and the errors for the entire data set. Assuming all entries to be equal, the total number of errors is 22 out of a total of 650 entries made. Error rate = 22/650 ≈ 3.4%, meaning that the quality of the data entered is about 97%.
- About 97% is very likely to be the desired level of accuracy (typically 95%-97% would be expected in quality organizations).
Provided by Tom Chatfield, BLM
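The arithmetic behind this slide can be checked with a short Python sketch. It uses the slide's figures of 22 errors in 650 entries and treats every entry as equally important, as the slide assumes; the function name is hypothetical.

```python
def error_rate(errors, entries):
    """Fraction of entries that contain an error (all entries weighted equally)."""
    if entries <= 0:
        raise ValueError("entries must be positive")
    return errors / entries


rate = error_rate(22, 650)
accuracy = 1 - rate
print(f"error rate: {rate:.1%}, accuracy: {accuracy:.1%}")
```

This unweighted rate says nothing about which fields the errors fall in, which is why the relative importance of fields needs to be considered alongside it.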

32 Summary
- Data contamination occurs when a factor not examined by the study alters data values
- Data error types: commission or omission
- Quality assurance and quality control are strategies for
  - preventing errors from entering a data set
  - ensuring data quality for entered data
  - monitoring and maintaining data quality throughout the project
- Identify and enforce quality assurance and quality control measures throughout the Data Lifecycle


