Data Quality: Opportunities, Data, and Examples

Presentation transcript:

1 Data Quality: Opportunities, Data, and Examples

2

3 Better and More Data
– Level of analysis
Take a quick look at what/why use data
Linking data from disparate and third-party sources
– Explore data types
– Typical issues & Tricks
Cross validation and sourcing
Reverse Look-up
GIS layering
Backfill from text correlated to codes
– Information from operations
Text analytics

4 General Organizational Overview
An information business focused on risk taking. Make. Sell. Serve.
Sales and Distribution
- Producer Segmentation
- Market Planning
- Revenue Forecasting
- Cross sell and Up sell
- Retention and Profitability
Underwriting
- Risk Selection and Pricing
- Portfolio Management
- Premium Adequacy
- Billing and Collections Management
Claims
- Payment Accuracy
- Claim Collaboration
> Fraud Detection
> Subrogation
> Risk Transfer
> 3rd Party Deductible
> Reinsurance Recoverable

5 Same Problems – Different Lines of Business
- Personal – Auto, HO, Umbrella
- Small Commercial – BOP, CPP
- Middle Market Commercial – CPP w/GL, CP, Crime, CIM, B&M, WC, Auto
- Large Commercial Accounts
- Commercial Auto
- Workers Comp
- Umbrella/Excess
- Specialty Lines – D&O, EPL, E&O, Farm, FI

6 Data Types and Forms
- Structured data
- Semi-structured data
- Unstructured data
- Text
- Spatial
- Pictographic
- Graphic
- Voice
- Video

7 Multiple data systems which must be pulled together for analysis. Great opportunity for cross-validation and sourcing.
Diagram labels: Data Archive, Legacy Systems; Current System; Claim; Multiple States; Billing Systems; Finance Systems; CRM Systems, other data; Policy; Multiple Underwriting Systems; Medical Data (Bill Review, PPO, Case Management, Paradigm); Vendors/Partners; External Data
ACTIONS
- Identify Data Systems
- Get right data from right systems
- Overcome internal Organizational Barriers
- Bridge to legacy systems and archived data
- Augment to create rich data mining environment
- Expect the need to negotiate for resources
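
The cross-validation and sourcing opportunity on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two systems, the policy keys, the field names, and the values are all hypothetical, and each system is assumed to be exportable as a dictionary keyed by a shared policy number.

```python
def cross_validate(primary, secondary, fields):
    """List (key, field, primary_value, secondary_value) wherever two systems disagree."""
    mismatches = []
    for key in sorted(primary.keys() & secondary.keys()):
        for field in fields:
            a = primary[key].get(field)
            b = secondary[key].get(field)
            if a != b:
                mismatches.append((key, field, a, b))
    return mismatches


def source_missing(primary, secondary, fields):
    """Return a copy of primary with missing fields sourced from secondary."""
    merged = {key: dict(record) for key, record in primary.items()}
    for key, record in merged.items():
        for field in fields:
            if record.get(field) is None and key in secondary:
                record[field] = secondary[key].get(field)
    return merged


# Hypothetical extracts from a policy system and a claim system.
policy_system = {
    "P-100": {"zip": "06103", "class_code": "8810"},
    "P-200": {"zip": None, "class_code": "5403"},
}
claim_system = {
    "P-100": {"zip": "06105", "class_code": "8810"},
    "P-200": {"zip": "10001", "class_code": "5403"},
}

for mismatch in cross_validate(policy_system, claim_system, ["zip", "class_code"]):
    print(mismatch)
print(source_missing(policy_system, claim_system, ["zip"])["P-200"])
```

A disagreement is only a flag for review; which system is the better source for a given field is a business decision the sketch does not make.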

8 Some typical external data sources and vendors
- Dun & Bradstreet
- Experian
- Bureau of Labor Statistics
- Market Stance
- AM Best
- Equifax
- US Census
- Claritas
- Melissa Data
- ISO
- GIS vendors
- U&C Data sets
- Code Sets for ICD-s and CPT’s
- …

9 Data Glitches – historical and on-going
Systemic changes to data not process related
– Changes in data layout / data types
– Changes in scale / format
– Temporary reversion to defaults
– Missing and default values
– Gaps in time series
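
Two of the glitches listed on this slide, gaps in a time series and reversion to default values, can be screened for mechanically. The following is a minimal sketch, assuming monthly records keyed by (year, month) and a hypothetical set of sentinel defaults (0 and 9999); neither the record layout nor the sentinels come from the presentation.

```python
# Hypothetical sentinel values a system might write when a field is skipped.
DEFAULT_SENTINELS = {0, 9999}


def find_month_gaps(months):
    """Return the (year, month) pairs missing between the first and last observed month."""
    observed = set(months)
    if not observed:
        return []
    (year, month), last = min(observed), max(observed)
    missing = []
    while (year, month) < last:
        if (year, month) not in observed:
            missing.append((year, month))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return missing


def default_share(values, sentinels=DEFAULT_SENTINELS):
    """Return the fraction of values that are None or equal to a sentinel default."""
    if not values:
        return 0.0
    flagged = sum(1 for v in values if v is None or v in sentinels)
    return flagged / len(values)


months = [(2004, 11), (2004, 12), (2005, 2), (2005, 3)]
print(find_month_gaps(months))
print(default_share([120, 0, None, 9999, 80]))
```

A sudden jump in the default share for one period is a hint of a temporary reversion to defaults rather than a real change in the business.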

10 Process Reasons for poor data entry

11 Defining Issues-sample Source Data 1-Define Issues

12 MORE ISSUES… Mapping across sources: Same Fact, Different Terms
ISO 3166 English Name   ISO 3166 French Name   2-Alpha Code   3-Alpha Code   3-Numeric Code
Algeria                 L'Algérie              DZ             DZA            012
Belgium                 Belgique               BE             BEL            056
China                   Chine                  CN             CHN            156
Denmark                 Danemark               DK             DNK            208
Egypt                   Egypte                 EG             EGY            818
France                  La France              FR             FRA            250
...                     ...                    ...            ...            ...
Zimbabwe                Zimbabwe               ZW             ZWE            716
Data Elements metadata fields: Name; Context; Definition; Unique ID: 4572; Value Domain; Maintenance Org.; Steward; Classification; Registration Authority; Others
Data Element Concept metadata fields: Name: Country Identifiers; Context; Definition; Unique ID: 5769; Conceptual Domain; Maintenance Org.; Steward; Classification; Registration Authority; Others
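
One way to handle "same fact, different terms" is to index every known representation of a country to a single canonical code. This is a minimal sketch using only the rows shown on the slide; the choice of the 2-alpha code as the canonical key and the function names are assumptions made for illustration.

```python
# Rows transcribed from the slide: (English name, French name, alpha-2, alpha-3, numeric).
# Numeric codes are kept as strings so leading zeros survive.
ISO_3166_ROWS = [
    ("Algeria", "L'Algérie", "DZ", "DZA", "012"),
    ("Belgium", "Belgique", "BE", "BEL", "056"),
    ("China", "Chine", "CN", "CHN", "156"),
    ("Denmark", "Danemark", "DK", "DNK", "208"),
    ("Egypt", "Egypte", "EG", "EGY", "818"),
    ("France", "La France", "FR", "FRA", "250"),
    ("Zimbabwe", "Zimbabwe", "ZW", "ZWE", "716"),
]


def build_index(rows):
    """Map every representation (case-insensitively) to the alpha-2 code."""
    index = {}
    for row in rows:
        alpha2 = row[2]
        for term in row:
            index[term.casefold()] = alpha2
    return index


def canonical_country(term, index):
    """Return the alpha-2 code for a term, or None if the term is unknown."""
    return index.get(term.strip().casefold())


INDEX = build_index(ISO_3166_ROWS)
for term in ["Belgique", "056", "dnk", "Atlantis"]:
    print(term, "->", canonical_country(term, INDEX))
```

Unknown terms return None rather than a guess, so they can be routed to manual review.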

13 Data Filling
Manual Statistical Imputation Temporal Spatial Spatial-temporal
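
The slide names filling approaches without specifying methods. As a rough illustration only, the sketch below reads "temporal" as carrying the last observed value forward and "statistical" as filling with the median of the observed values; both readings, and the sample weekly wage series, are assumptions.

```python
from statistics import median


def fill_temporal(series):
    """Carry the last observed value forward; leading gaps stay None."""
    filled, last = [], None
    for value in series:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def fill_statistical(series):
    """Replace missing values with the median of the observed values."""
    observed = [v for v in series if v is not None]
    if not observed:
        return list(series)
    fill = median(observed)
    return [fill if v is None else v for v in series]


# Hypothetical weekly wage history with missing entries.
weekly_wage = [None, 500, None, None, 540, None]
print(fill_temporal(weekly_wage))
print(fill_statistical(weekly_wage))
```

Whichever method is used, it is worth keeping a flag that marks filled values so they can be told apart from observed ones later.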

14 Geographic Hierarchy

15 Deriving Data = Power
- Totals: Household Income
- Trends: Rate of Medical Bill Increases
- Ratios: Claims/Premium, Target/Median
- Friction: Level of inconvenience, ratio of rental to damage
- Sequences: Lawyer-Doctor, Auto-Life Policy
- Circumstances: Minimal Impact Severe Trauma
- Temporal: Loss shortly after adding collision
- Spatial: Distance to Service, proximity of stakeholders
- Logged: Progress Notes, Diaries, Who did it, When, “Why”

16 Deriving Data = Power (Cont’d)
- Behavioral: Deviation from past usage, spike buying
- Experience Profiles: Vendor, Doctor, Premium Audit
- Channel: How applied, How reported, Service Chain
- Legal Jurisdiction: Venue Disposition, Rules
- Demographics: Working, Weekly wage, lost income
- Firmographics: Industry Class Code vs Injuries Claimed
- Inflation: Wage, Medical, Goods, Auto, COLA
- Gov’t Statistics: Crime Rate, Employment, Traffic
- Other Stats: Rents, Occupancy, Zoning, Mgd Care
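
Two of the derived fields named in the "Deriving Data = Power" lists, the Claims/Premium ratio and the "loss shortly after adding collision" temporal flag, can be sketched as follows. The function names, the 30-day window, and the sample figures are assumptions for illustration, not values from the presentation.

```python
from datetime import date


def loss_ratio(claims_paid, premium):
    """Claims/Premium ratio; None when premium is not positive."""
    if premium <= 0:
        return None
    return claims_paid / premium


def loss_soon_after_coverage(coverage_added, loss_date, window_days=30):
    """True when the loss falls within window_days after the coverage was added."""
    days = (loss_date - coverage_added).days
    return 0 <= days <= window_days


print(loss_ratio(750.0, 1000.0))
print(loss_soon_after_coverage(date(2005, 3, 1), date(2005, 3, 12)))
print(loss_soon_after_coverage(date(2005, 3, 1), date(2005, 6, 1)))
```

The window length is a tuning choice; a real rule would likely vary it by coverage type.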

17 “Search” versus “Discover”
                            Search (goal-oriented)    Discover (opportunistic)
Structured Data             Data Retrieval            Data Mining
Unstructured Data (Text)    Information Retrieval     Text Mining

18 Word Replacement Lists
Input Value: [Jim]
Transformed Input Value: [JAMES]
Searching returns “Similar Matches”. All Records Found: Jimmy → JAMES, Jim → JAMES, James → JAMES
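
The word replacement idea on this slide can be sketched as a lookup applied to both the search input and the stored records before they are compared. Only the Jim/Jimmy/James to JAMES mapping comes from the slide; the function names and the extra record "Janet" are hypothetical.

```python
# Replacement list: variant -> standard form. Keys are compared in upper case.
REPLACEMENTS = {"JIM": "JAMES", "JIMMY": "JAMES"}


def transform(value):
    """Upper-case the value and swap in its standard form if one is listed."""
    key = value.strip().upper()
    return REPLACEMENTS.get(key, key)


def search(records, input_value):
    """Return every record whose transformed value matches the transformed input."""
    target = transform(input_value)
    return [record for record in records if transform(record) == target]


records = ["Jimmy", "Jim", "James", "Janet"]
print(transform("Jim"))
print(search(records, "Jim"))
```

Because the same transformation is applied on both sides, a search for any listed variant returns the same set of records.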

19 Motivation for Text Mining
- Approximately 90% of the world’s data is held in unstructured formats (source: Oracle Corporation).
- Information-intensive business processes demand that we move beyond simple document retrieval to “knowledge” discovery.
Chart labels: Structured numerical or coded information, 10%; unstructured or semi-structured information, 90%.

20 Convergence of Disciplines Example

21 Text is like a data iceberg
Techniques for attacking text data:
- Rules-based
- Statistical Text Analysis and Clustering
- Linguistic and Semantic Clustering
- Support Vector Machines
- Pattern Matching or other statistical algorithms
- Neural Networks
- Combination of methods from above

22 Claims processing – Progress notes and Diaries
Diagram labels: CLAIMS ADJUSTER; Medical Management Staff; Special Investigation Unit; NICB; Vendor Management; Consulting Engineers; Hearing Representative; Structured Settlement Unit; Recovery Staff; Legal Staff; Home Office Staff; Field Office Claim Staff; Insured Risk Manager; Agent or Broker; Service
- Diary forward – “call Dr Jones next week”
- Business Rule – large loss review
- System Reminder – update case reserves
- Correspondence Tracking – legal letter sent
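
As a small illustration of the rules-based technique named on slide 21, the sketch below tags free-text notes with the four entry types shown on this slide. The sample phrases are the slide's own; the regular expressions are hypothetical rules written for this example, and a production rule set would be far larger.

```python
import re

# Ordered (label, pattern) rules; the first matching rule wins.
RULES = [
    ("Correspondence Tracking", re.compile(r"\bletter\b.*\bsent\b", re.IGNORECASE)),
    ("System Reminder", re.compile(r"\bupdate\b.*\breserves?\b", re.IGNORECASE)),
    ("Business Rule", re.compile(r"\blarge loss\b", re.IGNORECASE)),
    ("Diary forward", re.compile(r"\bcall\b.*\bnext (week|month)\b", re.IGNORECASE)),
]


def tag_note(text):
    """Return the label of the first rule that matches, or 'Unclassified'."""
    for label, pattern in RULES:
        if pattern.search(text):
            return label
    return "Unclassified"


notes = [
    "call Dr Jones next week",
    "large loss review",
    "update case reserves",
    "legal letter sent",
    "insured left voicemail",
]
for note in notes:
    print(f"{note!r} -> {tag_note(note)}")
```

Notes that fall through to "Unclassified" are the natural candidates for the statistical and linguistic methods in the same list.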

23 Semantic processing: Named Entity Extraction
Identify and type language features. Examples:
- People names
- Company names
- Geographic location names
- Dates
- Monetary amount
- Phone #, zipcodes, SSN, FEIN
- Others… (domain specific)
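
The patterned entity types on this slide (dates, monetary amounts, phone numbers, ZIP codes, SSN, FEIN) can be approximated with regular expressions; people, company, and location names generally need dictionaries or trained models and are not attempted here. This is a minimal sketch assuming US formats. The patterns and the sample note are illustrative assumptions, and the ZIP pattern in particular will match any standalone five-digit number.

```python
import re

# Illustrative US-format patterns; real data would need broader variants.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "FEIN": re.compile(r"\b\d{2}-\d{7}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
    "ZIP": re.compile(r"\b\d{5}(?:-\d{4})?\b"),
    "MONEY": re.compile(r"\$\d{1,3}(?:,\d{3})*(?:\.\d{2})?"),
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
}


def extract_entities(text):
    """Return (type, matched text) pairs in order of appearance."""
    found = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            found.append((match.start(), label, match.group()))
    return [(label, value) for _, label, value in sorted(found)]


# Hypothetical adjuster note with made-up identifiers.
note = (
    "Spoke with insured on 03/14/2005 at 860-555-0142; reserve raised to "
    "$12,500.00. SSN 123-45-6789, mail to ZIP 06103."
)
for entity in extract_entities(note):
    print(entity)
```

Typing each match (DATE, PHONE, and so on) is what turns the free text into fields that can be joined back to structured data.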

24 Feedback to UW

25 Data Quality: Opportunities, Data, and Examples

