Data Quality: What you need to know to Create and Sustain a Data Quality Program
Panel Members: Daniel Wallace, Manager, Financial Informatics, Arkansas Blue Cross & Blue Shield; Gayle Bunn, Data Warehouse Analyst, EDW, Blue Cross and Blue Shield of Idaho; Amit Bhagat, President & Principal Consultant, Amitech Solutions
Data Quality Panel Objectives: To share information and insight on the overall organizational approach to creating and sustaining a data quality program
Panel Presentation Please provide us with a brief overview of the overall approach to creating and sustaining the data quality program in your organization
What You Need to Know to Create and Sustain a Data Quality Program Daniel Wallace Manager, Financial Informatics Arkansas Blue Cross & Blue Shield Contact Info: Phone: 501-396-4090 Email: firstname.lastname@example.org
Agenda. Creating a Data Quality Program: The People, The Scope, The Processes, The Tools. Sustaining a Data Quality Program: Policy, Communication, Demonstrate Value.
Data Quality: Creating a Data Quality Program. The People: knowledge of the business; multidiscipline staff; skill set. Skill set: ability to handle large and complex datasets; ability to test and verify systems processes to understand causes of data issues; ability to query/profile data using SQL, SAS, Excel; ability to communicate with business areas and management.
Creating a Data Quality Program. The Scope. Importance of defining scope: likely to solve a real problem; able to quantify value of DQ program. Where to begin: System level? Process level? Subject area level? Application level? Project level?
Creating a Data Quality Program. The Processes (Assess, Improve). Assessment: data profiling; define DQ rules; define measures (from DQ rules). Improvement: data cleansing; improve processes; measure quality; monitor quality.
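The assessment steps above (profile the data, define a DQ rule, derive a measure from the rule) can be sketched roughly as follows. This is a minimal illustration in plain Python, not the panelists' tooling; the row layout, the field names, the placeholder codes `DX1`/`DX2`, and the "paid amount is non-negative" rule are all assumptions made for the example.

```python
from collections import Counter

def profile_column(rows, column):
    """Return simple profile statistics for one column of a list of dict rows."""
    values = [row.get(column) for row in rows]
    present = [v for v in values if v is not None and v != ""]
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
        "most_common": Counter(present).most_common(1),
    }

def rule_pass_rate(rows, rule):
    """Measure derived from a DQ rule: share of rows for which the rule holds."""
    if not rows:
        return 1.0
    return sum(1 for row in rows if rule(row)) / len(rows)

# Hypothetical claim rows, for illustration only.
claims = [
    {"claim_id": 1, "diagnosis_code": "DX1", "paid_amount": 120.0},
    {"claim_id": 2, "diagnosis_code": "", "paid_amount": 80.0},
    {"claim_id": 3, "diagnosis_code": "DX1", "paid_amount": -5.0},
    {"claim_id": 4, "diagnosis_code": "DX2", "paid_amount": 40.0},
]

# Profiling: how many diagnosis codes are missing, and how varied are they?
diagnosis_profile = profile_column(claims, "diagnosis_code")

# Assumed DQ rule: a paid amount should never be negative.
non_negative_paid = rule_pass_rate(claims, lambda row: row["paid_amount"] >= 0)
```

Tracking the same pass rate after each cleansing or process change would give the "measure quality, monitor quality" half of the cycle.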
Creating a Data Quality Program. The Tools. Purpose/Need: understanding your data; profiling and rule discovery; data standardization; data cleansing; metadata management. People manage data quality, not tools.
Sustaining a Data Quality Program. The need for a DQ policy. Policy guidelines: treat information as a product/asset; focus on the business side; define roles and responsibilities; resolution management; proactive approach; data standards.
Sustaining a Data Quality Program. Communication: can make or break your DQ initiatives. Stakeholders: their role in the DQ/DG program; a successful DQ program must be done with them; include all functional areas that create or use data; regular meetings needed.
Sustaining a Data Quality Program. Demonstrate Value & Communicate It: identify DQ issue to target; engage management; select metrics to measure, establish baseline; implement solution. A DQ program can mitigate inefficiencies, excessive costs associated with poor data, and compliance risks, and can improve customer satisfaction.
Biography: Gayle Bunn, MBA, PMP, BSEE, Data Warehouse Analyst, Enterprise Data Warehouse (EDW), Blue Cross of Idaho. Responsible for EDW Data Quality, Support & Maintenance, Training, Customer Service, and Data Governance. Contact Info: Phone: (208)331-7487 Email: email@example.com
Current Steps at BCI: 1. Started small – EDW focus 2. Established data quality workflow 3. Established 1 automated touch point 4. Added initial data quality metrics (timeliness, completeness) 5. Socialized timeliness 6. Socialized completeness 7. Data quality evolved into many flavors 8. Established S.M.A.R.T. data quality metrics 9. Performed ongoing process improvement 10. Major milestone occurred! 11. Data governance and MDM emerge
1. Started Small – EDW Focus. Diagram: Member, Medical, Dental, Drug; Enterprise Data Warehouse (EDW); Data Analyst Community; EDW Team. Data Quality Review Team (DQRT) formed. Callouts: "We need better data quality!" "We need to work together & discuss issues!"
2. Established Data Quality Workflow. Diagram: Enterprise Data Warehouse (EDW); Data Quality Review Team (DQRT); Data Analyst Community; EDW Team. Workflow labels: Document data quality issues; Prioritize; Manual Fix; Mark Fixed! SharePoint list columns: Title, Description, Assigned To, Resolved (yes/no). Callouts: "Yay!" "The data is still wrong!" "Faster please!" "Wrong?"
3. Established 1 Automated Touch Point. Diagram: Member, Medical, Dental, Drug; Extract, X-form, TP, Load; Enterprise Data Warehouse (EDW). TP = Touch Point. 1 automated touch point (check for missing data): stop load if data is not complete! Callouts: "Some of the data is missing!" "Can we have the data faster?" "Cool!" "Yay!" "Can we have more data?" "We need Service Level Agreements (SLAs)!" "Hard to please!"
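A touch point of this kind (check for missing data, stop the load if the data is not complete) might look roughly like the sketch below. It is a hedged illustration only: the exception name, the `touch_point` and `run_load` helpers, the required fields, and the minimum row count are hypothetical and do not describe BCI's actual ETL jobs.

```python
class IncompleteDataError(Exception):
    """Raised to stop a load when the completeness check fails."""

def touch_point(rows, required_fields, expected_min_rows):
    """Return rows unchanged if they look complete; otherwise raise to stop the load."""
    if len(rows) < expected_min_rows:
        raise IncompleteDataError(
            f"expected at least {expected_min_rows} rows, got {len(rows)}"
        )
    for index, row in enumerate(rows):
        # A field counts as missing when it is absent, None, or an empty string.
        missing = [f for f in required_fields if row.get(f) in (None, "")]
        if missing:
            raise IncompleteDataError(f"row {index} is missing {missing}")
    return rows

def run_load(rows, target):
    """Extract and transform are omitted; the touch point sits just before the load."""
    checked = touch_point(
        rows, required_fields=("member_id", "claim_id"), expected_min_rows=1
    )
    target.extend(checked)  # Only reached when the touch point passes.
    return len(checked)

# Hypothetical usage: the first batch loads, the second is stopped.
warehouse = []
loaded = run_load([{"member_id": "M1", "claim_id": "C1"}], warehouse)
try:
    run_load([{"member_id": "", "claim_id": "C2"}], warehouse)
    stopped = False
except IncompleteDataError:
    stopped = True
```

Raising an exception rather than logging a warning is a deliberate choice here: it keeps incomplete data out of the target, which matches the "stop load" behavior on the slide.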
4. Added Initial Data Quality Metrics. Diagram: Member, Medical, Dental, Drug; TP, Extract, X-form, TP, Load, Fix; Enterprise Data Warehouse (EDW). New touch point. Automate fix for common problems. More data delivered: Vision, Grouper, Sales, Premium. Timeliness: jobs completed on time. Completeness: amount of data without noise. Noise = missing data in fact tables. Callouts: "Very cool!" "What does completeness mean?" "Yay!" "We need to socialize this!"
5. Socializing 1st Metric - Timeliness. Automate: graph when jobs miss SLA. Manual: track when weekly/monthly jobs complete. Tools: EDW SLA - SharePoint; SQL Server Reporting Services (SSRS). Callouts: "I can tell when jobs finish!" "I can see where to improve!"
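As a rough sketch of the timeliness metric, the snippet below computes the share of jobs that finished by their SLA deadline and lists the ones that missed it, which is the data a "jobs that missed SLA" graph would be built from. The job names, timestamps, and log layout are hypothetical; the slide's own tracking used SharePoint and SSRS, not Python.

```python
from datetime import datetime

def timeliness(job_runs):
    """Return (percent of jobs finished by their SLA deadline, jobs that missed it)."""
    missed = [run["job"] for run in job_runs if run["finished"] > run["sla_deadline"]]
    if not job_runs:
        return 100.0, missed
    on_time = len(job_runs) - len(missed)
    return 100.0 * on_time / len(job_runs), missed

# Hypothetical job log, for illustration only.
deadline = datetime(2024, 1, 8, 6, 0)
job_runs = [
    {"job": "member_load", "finished": datetime(2024, 1, 8, 5, 30), "sla_deadline": deadline},
    {"job": "medical_load", "finished": datetime(2024, 1, 8, 7, 15), "sla_deadline": deadline},
    {"job": "dental_load", "finished": datetime(2024, 1, 8, 5, 55), "sla_deadline": deadline},
    {"job": "drug_load", "finished": datetime(2024, 1, 8, 6, 0), "sla_deadline": deadline},
]

percent_on_time, missed_jobs = timeliness(job_runs)
```

A job that finishes exactly at the deadline is treated as on time in this sketch; an actual SLA may define the boundary differently.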
6. Socializing 1st Metric - Completeness. Automate: graph when noise issues occur. Automate: track noise in data. Tools: SQL Server Integration Services (SSIS) to SharePoint; SQL Server Reporting Services (SSRS). Track noise in fact tables: count when dimension data is not available in a fact record (PK<0). Dimension PK values (noise): Not Applicable; Error = -2; Missing = -3; Default = -4. Callouts: "Only 2.19% noise? The data is more complete than I thought!" "I can see where to improve!" "What do we mean when we say data quality anyway?"
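The noise count described above (a fact record whose dimension key is below zero) can be sketched as follows. This is an assumption-laden illustration: the fact table layout and key names are hypothetical, and counting a record once even when several of its keys are negative is one possible reading of the slide, not a confirmed definition.

```python
from collections import Counter

def noise_report(fact_rows, dimension_keys):
    """Return (percent of fact records with any negative dimension key, counts per code)."""
    noisy_records = 0
    codes = Counter()
    for row in fact_rows:
        # Assumption: any dimension key below zero means the dimension data is unavailable.
        negatives = [row[key] for key in dimension_keys if row[key] < 0]
        if negatives:
            noisy_records += 1
            codes.update(negatives)
    percent = 100.0 * noisy_records / len(fact_rows) if fact_rows else 0.0
    return percent, dict(codes)

# Hypothetical fact records; -2 and -3 follow the Error and Missing codes on the slide.
facts = [
    {"claim_id": 1, "member_key": 101, "provider_key": 7},
    {"claim_id": 2, "member_key": 102, "provider_key": -3},
    {"claim_id": 3, "member_key": -2, "provider_key": -3},
    {"claim_id": 4, "member_key": 104, "provider_key": 9},
]

noise_pct, noise_codes = noise_report(facts, ("member_key", "provider_key"))
```

Breaking the count down by code shows whether noise comes mostly from errors, missing values, or defaults, which helps decide where to improve first.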
7. Data Quality Evolved into Many Flavors. Accurate: reconciles appropriately. Consistent: matches source. Timely: on-time delivery. Integrity: correct business rules. Valid: appropriate data. Complete: no noise (missing data). Successfully performed in BCI's Enterprise Data Warehouse (EDW). Callouts: "I have a data quality problem!" "You mean opportunity!" "What flavor?"
8. Established Data Quality Metrics (slide labels: Data Quality Metrics; Potential Data Quality Metrics). Accuracy (Reconciles): % data loads where data reconciles; # accuracy incidents. Consistency (Match Source): % data loads where data matches source; # consistency incidents. Timeliness (Right Time): % data loads delivered on time; # timeliness incidents. Integrity (Right Rules): % loads with appropriate business rules applied; # integrity incidents. Validity (Right Data): % loads with appropriate date range; # validity incidents. Completeness (No Noise): % records without noise (missing data); # noise incidents. Accessibility: % of critical data fields provided. Uniqueness: % total where duplicate records exist. Compliance: # of regulatory noncompliance data issues with HIPAA, PHI. Efficiency: avg. time taken for data quality issues to be resolved. Diagram label: VALUE.
9. Performed Ongoing Process Improvement. Diagram: Data Sources; TP; Enterprise Service Bus (ESB); TP; Enterprise Data Warehouse (EDW); TP, Extract, X-form, TP, Load; Fix! Touch point checks: validate (matches source); accuracy (reconcile); completeness (no noise). Use data quality metrics to identify issues other TPs don't. Use the data quality process to fix source issues.
10. Major Milestone Occurred. Diagram: Enterprise Data Warehouse (EDW); Data Quality Review Team (DQRT); Data Analyst Community; EDW Team. Milestone: No issues! SharePoint list columns: Title, Description, Assigned To, Resolved (yes/no), Data Quality Area. Callouts: "Yay!" "There's one in every crowd!" "Finally!" "Yay!" "Yes!"
11. Data Governance & MDM Emerge. Diagram: Master Data Management (MDM); Data Governance; Data Quality (Accurate, Consistent, Timely, Integrity, Valid, Complete). Data Governance is emerging around Data Quality. MDM is emerging around Data Governance. With success: the small bird's chirp of data quality was heard!
Critical Success Factors at BCI: 1. Gain Steering Committee sponsorship 2. Establish a clear Mission Statement/Purpose 3. Develop Program Goals for the Team 4. Establish cross-functional DQRT representation (including across IS) 5. Create a non-blame, non-judgmental environment 6. Use a divide-and-conquer approach to issue resolution (broad participation) 7. Establish continuous improvement over time (Rome was not built in a day) 8. Keep a regular meeting schedule, with frequency dependent on need 9. Appoint a data quality champion
Data Quality: What you need to know to create and sustain a data quality program. Amit Bhagat, President & Principal Consultant, Amitech Solutions. Contact Info: Phone: 314-480-6301 Email: Amit.Bhagat@amitechsolutions.com
Agenda: DQ Symptoms; Use Case; DQ Myths & Reality; DQ Design (Approach, Business Need, Define, Profile, Remediate, Sustain)
DQ Symptoms: "The data is wrong; I will do it myself." "We spent $5 million on the claims system and it still sends incorrect payments." "We get a different member month count depending on whom we ask." "We are not sure if our MLR is correct."
Use Case. Business problem: ensure accurate risk scoring for membership under the ACA for payment transfer between carriers. Data profiling: missing or incorrect diagnosis code in claims data. Outcomes: pay other plans, potentially 2% or more of loss ratio, because we may "appear" healthier than others in our market. Focus on diagnosis code as a critical data element.
DQ Myths & Reality. Myths: Quality is solved by technology alone. Quality is an IT problem. Quality is best fixed at the point of entry. Quality is the sole responsibility of the data owners. Quality requires all data to be perfect. Reality: Quality requires people, process, culture, and technology to work in concert. Quality is a fit-for-purpose process that delivers the highest data quality over time.
DQ Design: Approach. Business Need: Member Retention & Growth. Function: Sales, Marketing, Customer Service. Data Domain: Membership. Attributes: Email, Phone Number.
DQ Design: 5-Step Process. 1. Business need 2. Define 3. Profile 4. Remediate 5. Sustain. Outcome: Optimal DQ.
1. Business Need: Determine the scope and business relevance of the DQ effort. Steps: acquire business goals; identify levers; identify components; identify candidates for DQ. Objective: Improve member retention by 5%. Business Action: Increase member satisfaction, improve customer service. Reduce hold time, improve member portal for self-service, provide mobile app for provider directory. Information Use (levers): Identify dissatisfied members. Data Components: Member Satisfaction Surveys, Customer Service, Premium & Claims. Data Candidates: Survey, Premium & Claims by Product, Membership, Customer Service call data.
2. Define: DQ Objectives & Measures. Identify completion criteria for current DQ iteration: reduce member duplicates by 10%. Determine metrics to be developed: what you are measuring (measure); when you are measuring (milestone); why you are measuring (business impact). Business driver: accurately calculate the number of net new members. Sample data quality metrics: number of duplicate members; number of members with missing SSN; number of members without Member ID; number of members with missing address.
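The four sample metrics for the "net new members" driver can be sketched as simple counts over a member list. This is only an illustration under stated assumptions: the record layout is hypothetical, the SSN values are deliberately invalid placeholders, and treating two records as duplicates when name and birth date match is an assumed matching rule, since the slide does not define one.

```python
from collections import Counter

def is_blank(value):
    """Treat None and whitespace-only strings as missing."""
    return value is None or str(value).strip() == ""

def member_metrics(members):
    """Count duplicates and missing identifiers in a list of member dicts."""
    # Assumed matching rule: same normalized name and birth date means same person.
    match_keys = Counter(
        (m["last_name"].strip().lower(), m["first_name"].strip().lower(), m["birth_date"])
        for m in members
    )
    return {
        "duplicate_members": sum(n - 1 for n in match_keys.values() if n > 1),
        "missing_ssn": sum(1 for m in members if is_blank(m.get("ssn"))),
        "missing_member_id": sum(1 for m in members if is_blank(m.get("member_id"))),
        "missing_address": sum(1 for m in members if is_blank(m.get("address"))),
    }

# Hypothetical member records, for illustration only.
members = [
    {"member_id": "M1", "last_name": "Doe", "first_name": "Jane",
     "birth_date": "1980-01-01", "ssn": "000-00-0001", "address": "1 Main St"},
    {"member_id": "M2", "last_name": "doe ", "first_name": "jane",
     "birth_date": "1980-01-01", "ssn": "", "address": "1 Main St"},
    {"member_id": None, "last_name": "Roe", "first_name": "Sam",
     "birth_date": "1975-05-05", "ssn": "000-00-0002", "address": None},
]

metrics = member_metrics(members)
```

Re-running the same counts at each milestone would show whether a completion criterion such as "reduce member duplicates by 10%" has been met.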
3. Profile: This step determines the exact sources, location, and types of techniques to use to assess DQ. Identify specific tools/techniques to be used. Review initial measures for relevance and accuracy. Verify accuracy of what was intended vs. actual: analyze data for business rule conformance; profiling reports are analyzed, and root causes and business impacts are identified and reported.
4. Remediate: Technology & Process. Technology: Apply tools to cleanse and standardize data in the ETL process to ensure required levels of quality are met. Process & Standards: Consistent application of process and standards to outline the expectations for data quality across the enterprise. Develop the immediate and ongoing technical architecture and process components required to reduce or eliminate DQ problems: use cleansing & standardization tools; develop audit, balance, and control; integrate DQ with Enterprise Information Management program; develop and implement business processes; develop workflows to fix bad data at source; develop and implement data movement controls.
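One common reading of "audit, balance, and control" is a reconciliation step that compares row counts and control totals between a source extract and the loaded target. The sketch below illustrates that reading only; the function name, field names, and tolerance are hypothetical, and the slide does not specify how the control is implemented.

```python
def balance_check(source_rows, target_rows, amount_field, tolerance=0.01):
    """Compare row counts and a control total between source and target."""
    source_total = sum(row[amount_field] for row in source_rows)
    target_total = sum(row[amount_field] for row in target_rows)
    difference = source_total - target_total
    return {
        "row_count_matches": len(source_rows) == len(target_rows),
        "row_count_difference": len(source_rows) - len(target_rows),
        "amount_matches": abs(difference) <= tolerance,
        "amount_difference": round(difference, 2),
    }

# Hypothetical source and target rows: one claim did not make it into the target.
source = [
    {"claim_id": 1, "paid_amount": 100.0},
    {"claim_id": 2, "paid_amount": 250.5},
    {"claim_id": 3, "paid_amount": 75.25},
]
target = [
    {"claim_id": 1, "paid_amount": 100.0},
    {"claim_id": 2, "paid_amount": 250.5},
]

result = balance_check(source, target, "paid_amount")
```

A failed balance check is a natural trigger for the "fix bad data at source" workflow, since it points to where records were dropped or altered in movement.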
5. Sustain: This step covers the culture change, governance, and ongoing support and progress reporting of the DQ effort. Data Governance: Provides the framework and ongoing oversight to enable effective management. Change Management: Implementation of various culture change management efforts to sustain data quality efforts.
Summary: Data quality is a known, certain problem. Existing processes that create bad data must be addressed. Technology cannot be the only road to a solution. People: Perceptions of doing bad things are inevitable. Manage resistance, politics, priorities. Culture management is mandatory. Technology: Integrate with EIM. Lots of new stuff!
Share Your Experience. Panel Members: Daniel Wallace, Manager, Financial Informatics, Arkansas Blue Cross & Blue Shield; Gayle Bunn, Data Warehouse Analyst, EDW, Blue Cross and Blue Shield of Idaho; Amit Bhagat, President & Principal Consultant, Amitech Solutions
Question # 1 How does a data quality program fit into your strategy for information management?
Question # 2 Are you able to produce "one version of the truth" throughout the whole company, or do various versions surface from different areas? What subject areas are you currently managing in your data quality program?
Question # 3 Are data definitions established at the individual, department, or enterprise level? Are you leveraging a data governance program for data quality? How?
Question # 4 Describe the impact data quality has on the delivery of business value through analytics and BI. Tell us how your organization manages data quality and how it responds to data quality issues (as a matter of project work, daily operations, planning, etc.). Does your organization have ways of measuring or quantifying poor-quality data and its results?
Question # 5 In your organization, how do the various stakeholders around any given data quality project work together?
Question # 6 Have you integrated master data in your DQ program? What was your approach? How did it go? Successes? Lessons learned?
Question # 7 What are your next steps? New efforts toward data quality?