Data Quality Class 4. Goals Discuss Project Midterm Statistical Process Control Data Quality Rules.

Presentation transcript:

Data Quality Class 4

Goals Discuss Project Midterm Statistical Process Control Data Quality Rules

Project Information is now on the web site Final version is due on July 26 Data will be available by end of the week We will spend some time discussing goals today

Midterm Written exam on July 5th Will cover: –Cost of low data quality –Dimensions of data quality –Domains and mappings –SPC –Data Quality Rules

Statistical Process Control Developed by Shewhart at Bell Labs in the 1920s through 1950s Notions of Variation vs. Control Important in original context of both equipment manufacture and service quality

Variation Natural variations Defects Errors Mistakes Some variations are meaningful, some are not

Causes of Variation Common, or Chance causes –minor fluctuations or differences –not necessarily important to correct –observed to form a normal distribution Assignable, or Special causes –(self explanatory) We expect to see the normal variations, but assignable cause variations are interesting

Example Measure railroad on-time performance –Trains are typically on time or a few minutes late –One night, the trains are all 1 hour late due to electrical problems – a special cause

Statistical Control State in which variations observed can be attributed to common causes that do not change with time

Pareto Principle In a population that contributes to a common effect, relatively few of the contributors account for the bulk of the effect Example: code performance analysis Can be used to direct analysis
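
As a rough illustration of the code performance example, the sketch below ranks contributors and keeps the few that account for most of the effect. The function name `vital_few`, the 80% threshold, and the profile figures are all assumptions made for illustration; the slide does not specify a cutoff.

```python
def vital_few(contributions, share=0.8):
    """Return the largest contributors that together reach `share` of the total."""
    total = sum(contributions.values())
    ranked = sorted(contributions.items(), key=lambda kv: kv[1], reverse=True)
    chosen, running = [], 0.0
    for name, amount in ranked:
        if running >= share * total:
            break  # the chosen contributors already cover the target share
        chosen.append(name)
        running += amount
    return chosen

# Hypothetical time (ms) spent in each routine of a program
profile = {"parse": 620, "render": 210, "io": 90, "log": 50, "init": 30}
print(vital_few(profile))
```

Here two of the five routines cover more than 80% of the run time, so analysis would start with them.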

Control Chart

Control Chart 2 Used to look for distinct variations from the mean Goal: predictable behavior Plot series of data over time Variations are represented as distance from the mean

Control Chart 3 Center Line: can be computed as mean of variable points Upper Control Limit: three standard deviations above center line Lower Control Limit: three standard deviations below center line

Control Chart 4 As long as all points are between UCL and LCL, the variations are due to common causes, and the process is said to be in control, or stable Points above UCL or below LCL are indicative of abnormal variation, and are due to special causes – the process is not in control
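
The center line and control limits can be sketched in a few lines. This is a minimal sketch, not a full SPC implementation: it assumes the limits are estimated from a baseline period that is believed to be stable, and it uses the sample standard deviation. The function names and the train delay figures are hypothetical.

```python
from statistics import mean, stdev

def control_limits(baseline):
    """Return (LCL, center line, UCL) computed from baseline measurements."""
    center = mean(baseline)
    sigma = stdev(baseline)  # sample standard deviation (an assumption)
    return center - 3 * sigma, center, center + 3 * sigma

def out_of_control(points, lcl, ucl):
    """Return the indices of points that fall outside the control limits."""
    return [i for i, x in enumerate(points) if x < lcl or x > ucl]

# Hypothetical minutes-late figures for trains on ordinary nights
baseline = [2, 4, 3, 5, 1, 3, 4, 2, 3, 3]
lcl, center, ucl = control_limits(baseline)

# New observations, including one night with a one-hour delay
new_nights = [3, 4, 60, 2]
print(out_of_control(new_nights, lcl, ucl))
```

Estimating the limits from a separate baseline keeps a single extreme point from inflating the standard deviation and hiding itself inside the limits.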

Control Chart 5 Select variables chart or attributes chart Use data quality dimensions as guideline Select meaningful variables to measure (i.e., stuff that will point at a diagnosable problem)

Interpreting the Control Chart Lack of stability indicates potential problem Look for: –points outside of control limits –zone testing (clusters of points within certain standard deviation limits) –potential to split out data points into different logical data sets Look for cycles
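
The slide does not say which zone tests are meant. As one possible example, the sketch below applies a commonly used rule: flag a point when at least two of three consecutive points lie more than two standard deviations from the center line on the same side. The rule choice, the function name, and the sample data are assumptions.

```python
def zone_violations(points, center, sigma):
    """Return indices i where at least 2 of points[i-2..i] lie beyond
    2 sigma from the center line on the same side."""
    hits = []
    for i in range(2, len(points)):
        window = points[i - 2:i + 1]
        above = sum(1 for x in window if x > center + 2 * sigma)
        below = sum(1 for x in window if x < center - 2 * sigma)
        if above >= 2 or below >= 2:
            hits.append(i)
    return hits

# Hypothetical measurements around a center line of 10 with sigma 1
readings = [10.0, 12.5, 10.2, 12.3, 9.8, 10.1]
print(zone_violations(readings, center=10.0, sigma=1.0))
```

No single reading here is outside three standard deviations, yet the cluster near the upper limit is still flagged.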

SPC and Data Quality “The Information Factory” Use data quality dimensions as guideline for investigation Analyze the state of data as it passes through the information chain Probing can be automated with data quality rules

Inserting the Probes Find a location in the information chain that is: –nondisruptive –easy to access –easy to retool

Data Quality Rules Definitions Proscriptive Assertions Prescriptive Assertions Conditional Assertions Operational Assertions

Definitions Nulls Domains Mappings

Proscriptive Assertions Describe what is not allowed Used to figure out what is wrong with data Used for validation

Prescriptive Assertions Describe what is supposed to happen with data Can be used for data population, extraction, transformation Can also be used for validation

Conditional Assertions Define an assertion that must be true if a condition is true

Operational Assertions Define an action that must be taken if a condition is true
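
The difference between the two kinds of assertion can be sketched with plain functions: a conditional assertion is checked, while an operational assertion triggers an action. The function names and the record fields below are hypothetical and only illustrate the idea.

```python
def check_conditional(record, condition, assertion):
    """The rule holds if the condition is false or the assertion is true."""
    return (not condition(record)) or assertion(record)

def apply_operational(record, condition, action):
    """Run the action on the record when the condition is true."""
    if condition(record):
        action(record)
    return record

# Conditional: if an order has a positive total, a billing ZIP must be present
order = {"Total": 25.0, "Billing_ZIP": ""}
ok = check_conditional(order,
                       lambda r: r["Total"] > 0,
                       lambda r: bool(r["Billing_ZIP"]))

# Operational: if the rule failed, mark the order for review
order["Valid"] = ok
apply_operational(order,
                  lambda r: not r["Valid"],
                  lambda r: r.update(Review=True))
print(ok, order.get("Review"))
```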

9 Classes of Rules 1) Null value rules 2) Value rules 3) Domain membership rules 4) Domain Mappings 5) Relation rules 6) Table, Cross-table, and Cross-message assertions 7) In-Process directives 8) Operational Directives 9) Other rules

Null Value Rules Null value specification –Define GETDATE for unavailable as “fill in date” Null values allowed –Attribute A allowed nulls {GETDATE, U, X} Null values not allowed –Attribute B nulls not allowed
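
A null value rule can be checked mechanically once the null representations are listed. The sketch below uses the three representations named on the slide and, as an assumption, also treats Python's `None` as a null; the function name and sample records are hypothetical.

```python
NULL_CODES = {"GETDATE", "U", "X"}  # null representations named on the slide

def null_violations(record, nulls_allowed):
    """Return attributes that hold a null representation but are not allowed to."""
    return [attr for attr, value in record.items()
            if (value is None or value in NULL_CODES)
            and not nulls_allowed.get(attr, False)]

# Attribute A may hold nulls, attribute B may not
rules = {"A": True, "B": False}
print(null_violations({"A": "U", "B": "X"}, rules))
```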

Value Rules Value restriction rule Restrict GRADE: value >= ‘A’ AND value <= ‘F’ AND value != ‘E’
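
The GRADE restriction translates directly into a predicate. The single-character check is an added assumption: without it, a string such as "AB" would also pass the range comparison.

```python
def valid_grade(value):
    """Check the GRADE rule: between 'A' and 'F' inclusive, but not 'E'."""
    return (isinstance(value, str)
            and len(value) == 1  # assumption: grades are single letters
            and "A" <= value <= "F"
            and value != "E")

for grade in ["A", "E", "F", "G"]:
    print(grade, valid_grade(grade))
```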

Domain Rules Domain Definition Domain Membership Domain Nonmembership Domain Assignment

Mapping Rules Mapping definition Mapping membership Mapping nonmembership

Relation Rules Completeness Exemption Consistency Derivation

Completeness Defines when a record is complete (i.e., what fields must be present) IF (Orders.Total > 0.0), Complete With {Orders.Billing_Street, Orders.Billing_City, Orders.Billing_State, Orders.Billing_ZIP}
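
A minimal sketch of the completeness rule, assuming each order is a dictionary keyed by column name and that an empty string or a missing key counts as "not present". The function name is hypothetical.

```python
BILLING_FIELDS = ["Billing_Street", "Billing_City", "Billing_State", "Billing_ZIP"]

def missing_billing_fields(order):
    """Return the billing fields that are absent from an order with a positive total."""
    if order.get("Total", 0.0) > 0.0:
        return [f for f in BILLING_FIELDS if not order.get(f)]
    return []  # the rule does not apply to orders with no total

order = {"Total": 19.95, "Billing_Street": "1 Main St", "Billing_City": "Springfield"}
print(missing_billing_fields(order))
```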

Exemption Defines which fields may be missing IF (Orders.Item_Class != “CLOTHING”) Exempt {Orders.Color, Orders.Size }
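
An exemption rule is most useful alongside a list of required fields: a missing field is only a problem if it is not exempt. The sketch below assumes dictionary records; the function name and the list of required fields are hypothetical.

```python
def unexpected_gaps(order, required):
    """Return required fields that are missing and not covered by the exemption."""
    # Per the slide: non-clothing items are exempt from Color and Size
    exempt = {"Color", "Size"} if order.get("Item_Class") != "CLOTHING" else set()
    return [f for f in required if not order.get(f) and f not in exempt]

required = ["Price", "Color", "Size"]
book = {"Item_Class": "BOOK", "Price": 10}
shirt = {"Item_Class": "CLOTHING", "Price": 10, "Color": "red"}
print(unexpected_gaps(book, required), unexpected_gaps(shirt, required))
```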

Consistency Define a relationship between attributes based on field content –IF (Employees.title == “Staff Member”) Then (Employees.Salary >= AND Employees.Salary < 30000)

Derivation Prescriptive form of consistency rule Details how one attribute’s value is determined based on other attributes IF (Orders.NumberOrdered > 0) Then { Orders.Total = (Orders.NumberOrdered * Orders.Price) * 1.05 }
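
The derivation rule can be applied as a small transformation. This sketch assumes dictionary records and uses `Decimal` for the price so that the 1.05 multiplier does not introduce binary floating-point rounding; both choices, and the function name, are assumptions.

```python
from decimal import Decimal

def derive_total(order):
    """Set Total from NumberOrdered and Price when the condition holds."""
    if order["NumberOrdered"] > 0:
        order["Total"] = order["NumberOrdered"] * order["Price"] * Decimal("1.05")
    return order

order = derive_total({"NumberOrdered": 2, "Price": Decimal("10.00")})
print(order["Total"])
```

The same expression can be used for validation by comparing the derived value with a stored `Total` instead of overwriting it.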

Table and Cross-Table Rules Functional Dependence Primary Key Assertion Foreign Key Assertion (=referential integrity)

Functional Dependence Functional Dependence between columns X and Y: –For any two records R1 and R2 in a table, if field X of record R1 contains value x and field X of record R2 contains the same value x, then if field Y of record R1 contains the value y, then field Y of record R2 must contain the value y. In other words, attribute Y is said to be determined by attribute X.
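
The definition suggests a direct test: remember the first Y value seen for each X value and report any later record that disagrees. The function name and the ZIP/city sample data are hypothetical.

```python
def fd_violations(rows, x, y):
    """Return indices of rows whose y value conflicts with an earlier row
    that has the same x value."""
    first_seen = {}
    violations = []
    for i, row in enumerate(rows):
        expected = first_seen.setdefault(row[x], row[y])
        if expected != row[y]:
            violations.append(i)
    return violations

rows = [
    {"zip": "10001", "city": "New York"},
    {"zip": "10001", "city": "NYC"},
    {"zip": "94105", "city": "San Francisco"},
]
print(fd_violations(rows, "zip", "city"))
```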

Primary Key Assertion A set of attributes defined as a primary key must uniquely identify a record Enforcement = testing for duplicates across defined key set
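
Testing for duplicates across the key set can be sketched by counting key tuples. The function name and the sample rows are hypothetical.

```python
from collections import Counter

def duplicate_keys(rows, key_fields):
    """Return the key tuples that occur in more than one row."""
    counts = Counter(tuple(row[f] for f in key_fields) for row in rows)
    return [key for key, n in counts.items() if n > 1]

rows = [
    {"order_id": 1, "line": 1, "item": "pen"},
    {"order_id": 1, "line": 2, "item": "ink"},
    {"order_id": 1, "line": 1, "item": "pad"},
]
print(duplicate_keys(rows, ["order_id", "line"]))
```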

Foreign Key Assertion When the values in field f in table T are chosen from the key values in field g in table S, field T.f is said to be a foreign key referencing field S.g If f is a foreign key, each of its values must exist in table S, column g (= referential integrity)
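
Referential integrity can be checked by collecting the key values of S.g and looking for rows of T whose f value is not among them. The function name and the sample tables are hypothetical.

```python
def orphan_rows(t_rows, f, s_rows, g):
    """Return rows of T whose field f has no matching value in S.g."""
    keys = {row[g] for row in s_rows}
    return [row for row in t_rows if row[f] not in keys]

customers = [{"id": 1}, {"id": 2}]                      # table S, column g = "id"
orders = [{"order": "A", "customer_id": 1},
          {"order": "B", "customer_id": 3}]             # table T, column f = "customer_id"
print(orphan_rows(orders, "customer_id", customers, "id"))
```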

In-process Directives Definition directives (labeling information chain members) Measurement directives Trigger directives

Operational Directives Transformation Update

Other Rules Approximate Searching rules Approximate Matching rules