Connecticut is Data Rich but Information Poor. Our Vision: Connecting the Silos.

Presentation transcript:

1 Connecticut is Data Rich but Information Poor

2 Our Vision: Connecting the Silos

3 PATH Presentation, CT Data Collaborative, June 2014
- PATH Overview (Rob)
- How PATH Works (April)
- PATH Demo (Laurel and April)
- Why Integrate Data? The MAPS Study (Rob)
- PATH vs Data Ladder (Rob & April)

4 How PATH Works
- Virtual Data Warehouse
- Identity Resolution across multiple sources that don't share a Gold Standard Identifier
- HIPAA and FERPA Compliant
- Always transfers Fact data separately from Demographic data or Personally Identifiable Information
- Data Owners control which data is exported to a location outside of their data center
- Data Owners approve all queries

5 PATH History
Completed Phases:
- Established in Statute - Public Act
- Initial Development as CHIN, inclusion of 4 initial data sources
- Implemented advanced record linkage in a virtual data warehouse
- Scalability to 1M+ individuals, ability to add additional data sources and manage metadata w/o code modifications, unlimited data sources
- Implemented for P20WIN
40M Records, 1.6B Data Elements
Now Available to CT Agencies and Organizations as PATH

6 Data Categories
- People Records: Demographic Information such as Name, Address, SSN, DOB, etc. Also known as PII (Personally Identifiable Information).
- Fact Records: Education, Health, Labor, etc. Information about a person BUT without the PII. De-Identified or Anonymized Data.
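
The split between People Records and Fact Records can be sketched as follows. This is a minimal illustration, not PATH's implementation: the field names, the `split_record` helper, and the sample values are all assumptions made for the example.

```python
# Hypothetical field names; the slide only says that PII covers items
# such as name, address, SSN and DOB, and that fact records carry no PII.
PII_FIELDS = {"first_name", "last_name", "address", "ssn", "dob"}


def split_record(record_locator: str, row: dict) -> tuple[dict, dict]:
    """Split one source row into a people record and a fact record."""
    people = {k: v for k, v in row.items() if k in PII_FIELDS}
    facts = {k: v for k, v in row.items() if k not in PII_FIELDS}
    # Both halves share only an opaque locator, never a PII value.
    people["record_locator"] = record_locator
    facts["record_locator"] = record_locator
    return people, facts


# Made-up sample row.
row = {"first_name": "Jane", "last_name": "Doe", "dob": "1990-01-01",
       "major": "English", "degree_code": "BA"}
people, facts = split_record("R-0001", row)
```

Here `facts` holds only `major`, `degree_code` and the locator, so it can leave the agency without exposing who the person is.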

7 A Walk-Through of How PATH Works: P20WIN Example

8 Step 1: PATH Remote Software installed at each Participating Agency
The Agency Data Steward uses the PATH Metadata Editor to identify:
- Table/Record Schema of Agency Data
- Data at the Field or Table Level marked Available or Unavailable for Download
- Common Data Element fields used for linking records, which provide Identity Resolution across the different sources
[Diagram: Agency Data at SDE, CCC, CSU and DOL, each with a Metadata Editor & ETL]
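
One way to picture what the Data Steward records in Step 1 is a small metadata model. The sketch below is hypothetical: `FieldMeta`, `TableMeta`, and the table and field names are invented for illustration and are not taken from the PATH Metadata Editor.

```python
from dataclasses import dataclass, field


@dataclass
class FieldMeta:
    name: str
    available_for_download: bool = False  # assumption: default to the safer choice
    is_linking_field: bool = False        # a Common Data Element used for linking


@dataclass
class TableMeta:
    table: str
    fields: list[FieldMeta] = field(default_factory=list)

    def downloadable(self) -> list[str]:
        """Names of fields the steward marked Available for Download."""
        return [f.name for f in self.fields if f.available_for_download]

    def linking_fields(self) -> list[str]:
        """Names of fields used for identity resolution."""
        return [f.name for f in self.fields if f.is_linking_field]


# Hypothetical agency table.
students = TableMeta("students", [
    FieldMeta("last_name", is_linking_field=True),
    FieldMeta("dob", is_linking_field=True),
    FieldMeta("ssn", is_linking_field=True),
    FieldMeta("major", available_for_download=True),
    FieldMeta("degree_code", available_for_download=True),
])
```

In this sketch the linking fields are never downloadable, which mirrors the separation of PII from fact data described on the slides.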

9 Step 2: During Remote Initialization, the Extract/Transform/Load function of PATH builds a Record Index of the People Records from each Data Source.
[Diagram: Agency Data at SDE, CCC, CSU and DOL, each with a Metadata Editor & ETL and a Record Index]
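
The slides do not say what a Record Index entry contains. As a rough sketch, assuming it maps an opaque key to the location of a people record inside the agency, it could look like this (`build_record_index` and its layout are hypothetical):

```python
import uuid


def build_record_index(source: str, people_rows: list[dict]) -> dict[str, dict]:
    """Map an opaque index key to the location of each people record."""
    index = {}
    for row_number, _row in enumerate(people_rows):
        key = uuid.uuid4().hex  # random key: reveals nothing about the person
        index[key] = {"source": source, "row": row_number}
    return index


# Made-up people records for one agency.
ccc_index = build_record_index("CCC", [{"last_name": "Doe"}, {"last_name": "Roe"}])
```

Because the keys are random, a list of them can be shared with the main location without disclosing any demographic value.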

10 Step 3: PATH Software installed at a Main Location. For P20WIN this location is DAS/BEST.
[Diagram: DAS/BEST hosts the Probabilistic Integrator (Pi) and the UI, Security, Workflow and Query Engine; SDE, CCC, CSU and DOL each keep their Agency Data, Metadata Editor & ETL, and Record Index]

11 Step 4: During Main Initialization
- Extracts Common Data Elements from People Records using each Agency's Record Index
[Diagram: DAS/BEST hosts the Probabilistic Integrator (Pi) and the UI, Security, Workflow and Query Engine; SDE, CCC, CSU and DOL each keep their Agency Data, Metadata Editor & ETL, and Record Index]

12 Step 4: During Main Initialization
- Extracts Common Data Elements from People Records using each Agency's Record Index
- Sends them to Main & Loads into Memory ONLY
[Diagram: DAS/BEST hosts the Probabilistic Integrator (Pi) and the UI, Security, Workflow and Query Engine; SDE, CCC, CSU and DOL each keep their Agency Data, Metadata Editor & ETL, and Record Index]

13 Step 4: During Main Initialization
- Extracts Common Data Elements from People Records using each Agency's Record Index
- Sends them to Main & Loads into Memory ONLY
- Combines multiple records for individuals into Clusters via Probabilistic Integration Utility
- Builds a Data Base of Clusters containing only Agency Record Indices
- No Personally Identifiable Agency Data written to disk outside of Agency
[Diagram: DAS/BEST hosts the Probabilistic Integrator (Pi), the UI, Security, Workflow and Query Engine, and the DB of Clustered Indices; SDE, CCC, CSU and DOL each keep their Agency Data, Metadata Editor & ETL, and Record Index]
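
The clustering step can be illustrated with a deliberately simplified stand-in. The sketch below scores pairs of records by weighted field agreement and merges pairs above a threshold with union-find. It is not Pi's algorithm: the weights, the threshold, the field names and the sample records are all assumptions, and a real probabilistic linkage would derive its weights from match and non-match probabilities.

```python
from itertools import combinations

# Illustrative agreement weights and threshold, not values from PATH or Pi.
WEIGHTS = {"last_name": 3.0, "first_name": 2.0, "dob": 4.0, "sex": 1.0}
THRESHOLD = 7.0


def score(a: dict, b: dict) -> float:
    """Sum the weights of the fields on which two records agree."""
    return sum(w for f, w in WEIGHTS.items()
               if a.get(f) is not None and a.get(f) == b.get(f))


def cluster(records: dict[str, dict]) -> list[set[str]]:
    """Group record-index keys whose pairwise score reaches THRESHOLD."""
    parent = {key: key for key in records}

    def find(key: str) -> str:
        while parent[key] != key:
            parent[key] = parent[parent[key]]  # path halving
            key = parent[key]
        return key

    for a, b in combinations(records, 2):
        if score(records[a], records[b]) >= THRESHOLD:
            parent[find(a)] = find(b)

    groups: dict[str, set[str]] = {}
    for key in records:
        groups.setdefault(find(key), set()).add(key)
    return list(groups.values())


# Held in memory only; the result contains index keys, never the PII itself.
records = {
    "ccc-1": {"last_name": "doe", "first_name": "jane", "dob": "1990-01-01", "sex": "F"},
    "dol-7": {"last_name": "doe", "first_name": "jane", "dob": "1990-01-01", "sex": "F"},
    "csu-3": {"last_name": "roe", "first_name": "john", "dob": "1985-05-05", "sex": "M"},
}
clusters = cluster(records)
```

Only `clusters` would need to be persisted, which matches the slide's point that the cluster database contains Agency Record Indices and no personally identifiable data.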

14 Step 5
- Use UI features to establish user Roles, Login, etc.
[Diagram: DAS/BEST hosts the Probabilistic Integrator (Pi), the UI, Security, Workflow and Query Engine, and the DB of Clustered Indices; SDE, CCC, CSU and DOL each keep their Agency Data, Metadata Editor & ETL, and Record Index]

15 Step 5
- Use UI features to establish user Roles, Login, etc.
- Use UI features to: Create a Query, Approve a Query, Schedule a Query
[Diagram: DAS/BEST hosts the Probabilistic Integrator (Pi), the UI, Security, Workflow and Query Engine, and the DB of Clustered Indices; SDE, CCC, CSU and DOL each keep their Agency Data, Metadata Editor & ETL, and Record Index]

16 Step 5
- Use UI features to establish user Roles, Login, etc.
- Use UI features to: Create a Query, Approve a Query, Schedule a Query
- Use Query Engine to: Build Agency Query Requests, using ONLY Data Available for Download in each Query Request
[Diagram: DAS/BEST hosts the Probabilistic Integrator (Pi), the UI, Security, Workflow and Query Engine, and the DB of Clustered Indices; SDE, CCC, CSU and DOL each keep their Agency Data, Metadata Editor & ETL, and Record Index]

17 Step 6
- Query Engine uses Clusters of Indices to get the needed Agency Record Indices
- Queries only Agency Data marked Available for Download
- Transfers only data marked Available for Download to the Main
- Downloads only Approved Queries
[Diagram: as in Step 5, with De-identified Integrated Data produced at DAS/BEST]
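
The filtering described in Step 6 can be sketched as a function that returns only the requested fields an agency has marked Available for Download. The function name, the data layout and the sample values below are assumptions for illustration, not PATH's Query Engine.

```python
def run_agency_query(fact_rows: dict[str, dict], wanted_keys: list[str],
                     requested: list[str], available: set[str]) -> list[dict]:
    """Return only requested fields that the agency marked available."""
    allowed = [f for f in requested if f in available]
    return [{f: fact_rows[k].get(f) for f in allowed}
            for k in wanted_keys if k in fact_rows]


# Made-up DOL fact rows keyed by record index; "employer_name" is a
# hypothetical field that the agency has not released.
dol_facts = {
    "dol-7": {"naics_code": "6113", "qtr_wage": 12000, "employer_name": "Acme"},
}
rows = run_agency_query(dol_facts, ["dol-7", "dol-9"],
                        ["naics_code", "qtr_wage", "employer_name"],
                        available={"naics_code", "qtr_wage"})
```

The unavailable field is dropped even though the query asked for it, and the unknown key `dol-9` is skipped rather than treated as an error.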

18 PATH as P20WIN Demo: CCC & CSU, DOL

19 3 User Roles
- Data Consumer: Creates Query, Downloads Query Data
- System Admin: Approves Query Design, Schedules Execution, Approves Data Download
- Data Steward: Data Management
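
The three roles map naturally onto a permission table. The sketch below is one possible encoding; the action names are hypothetical labels for the duties listed on the slide.

```python
from enum import Enum


class Role(Enum):
    DATA_CONSUMER = "data_consumer"
    SYSTEM_ADMIN = "system_admin"
    DATA_STEWARD = "data_steward"


# Action names are hypothetical labels for the duties listed on the slide.
PERMISSIONS = {
    Role.DATA_CONSUMER: {"create_query", "download_data"},
    Role.SYSTEM_ADMIN: {"approve_query", "schedule_execution", "approve_download"},
    Role.DATA_STEWARD: {"manage_data"},
}


def can(role: Role, action: str) -> bool:
    """Return True if the role is allowed to perform the action."""
    return action in PERMISSIONS[role]
```

For example, `can(Role.DATA_CONSUMER, "approve_query")` is `False`: a consumer can request data but cannot approve their own request.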

20 Query Workflow
1. Create Query (Data Consumer)
2. Approve Query (Agency Sys Admin)
3. Schedule Execution (Agency Sys Admin)
4. Approve Download (Agency Sys Admin)
5. Download Data (Data Consumer)
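
The five workflow steps form a strict sequence in which each step belongs to one role. A minimal sketch of that sequence as a state machine follows; the `Query` class and the action and role strings are illustrative assumptions.

```python
# Each step: (action, role allowed to perform it), in slide order.
WORKFLOW = [
    ("create_query", "data_consumer"),
    ("approve_query", "agency_sys_admin"),
    ("schedule_execution", "agency_sys_admin"),
    ("approve_download", "agency_sys_admin"),
    ("download_data", "data_consumer"),
]


class Query:
    def __init__(self) -> None:
        self.step = 0  # index of the next step to perform

    def advance(self, action: str, role: str) -> None:
        """Perform the next step, rejecting wrong order or wrong role."""
        if self.step >= len(WORKFLOW):
            raise ValueError("workflow already complete")
        expected_action, expected_role = WORKFLOW[self.step]
        if action != expected_action:
            raise ValueError(f"expected {expected_action}, got {action}")
        if role != expected_role:
            raise PermissionError(f"{role} may not perform {action}")
        self.step += 1

    @property
    def done(self) -> bool:
        return self.step == len(WORKFLOW)
```

With this shape, a Data Consumer who tries to download right after creating a query gets a `ValueError`, because both approvals and the scheduling step have not happened yet.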

21 What you don't get - PII

PII:
Last Name | First Name | DOB | SSN
Appel | April | 01/01/ |
Berry | John | 02/02/ |
Carat | Colleen | 03/03/ |
Ernst | Max | 04/04/ |
Gomez | Gloria | 05/05/ |
Hurst | William | 06/06/ |
Keller | Helene | 07/07/ |
Martinez | Pedro | 08/08/ |
Rodriguez | Felix | 09/09/ |
Smith | Peggy | 10/10/ |

CCC & CSU:
Record Locator | G | Major | Grad Stat | Degree Code
 | F | English | Yes | BA
87562 | M | Physical Ed | No | AA
 | F | Computer Science | Yes | BS
 | M | Political Science | Yes | BA
 | M | Ecology | No | BS
 | M | Psychology | Yes | BA
 | F | Math | Yes | BS
 | M | Biology | Yes | BS
 | F | Anthropology | No | BA
 | M | Agriculture | Yes | AS

DOL:
Record Locator | NAICS Code | QtrWage
 | | $68,
 | | $50,
 | | $11,
 | | $37,
 | | $6,
 | | $12,
 | | $78,
 | | $44,
 | | $25,
 | | $21,215

Data Output:
Record Locator | Gender | Major | Grad Stat | Degree Code | NAICS Code | QtrWage
 | M | Ecology | No | BA | | $37,
 | M | Biology | Yes | BS | | $44,852
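
The Data Output on this slide combines education and labor facts that share a Record Locator, with no PII columns. A sketch of that combination as an inner join is shown below. The join type is an assumption, and the locators and values are made up because the slide's own figures are truncated.

```python
def integrate(education: dict[str, dict], labor: dict[str, dict]) -> list[dict]:
    """Inner-join two fact tables on their shared record locator."""
    shared = sorted(education.keys() & labor.keys())
    return [{"record_locator": loc, **education[loc], **labor[loc]}
            for loc in shared]


# Made-up locators and values; the slide's own figures are truncated.
education = {
    "A1": {"gender": "M", "major": "Ecology", "grad_stat": "No"},
    "B2": {"gender": "F", "major": "Math", "grad_stat": "Yes"},
}
labor = {
    "A1": {"naics_code": "5417", "qtr_wage": 9500},
    "C3": {"naics_code": "6113", "qtr_wage": 12000},
}
output = integrate(education, labor)
```

Only locator `A1` appears in both sources, so the output has a single row that carries education and wage facts but no name, DOB or SSN.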

22 The Mortality After Prison Study
Objective: Link DOC's OBIS system with DPH death records
2 specific questions motivating this project:
- How many former prisoners are dead?
- What do they die of, and when?

23 Methods
- Former DOC inmates linked to DPH death records using Pi
- DOC releases beginning April 1974
- DPH deaths 1980 to 2010
- Matching Fields: first name, last name, sex, date of birth
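
To make the four matching fields concrete, the sketch below links release records to death records on a normalized exact key. It is a simplification: the study used Pi's probabilistic matching, whereas this example uses exact matching, and the rule that a death must fall on or after the release date is an added assumption. All record values are made up.

```python
from datetime import date


def match_key(first: str, last: str, sex: str, dob: date) -> tuple:
    """Normalize the four matching fields into a comparable key."""
    return (first.strip().lower(), last.strip().lower(),
            sex.strip().upper(), dob.isoformat())


def link(releases: list[dict], deaths: list[dict]) -> list[tuple[dict, dict]]:
    """Pair each release with death records that share its key and whose
    death date is on or after the release date (an assumed sanity check)."""
    by_key: dict[tuple, list[dict]] = {}
    for d in deaths:
        key = match_key(d["first"], d["last"], d["sex"], d["dob"])
        by_key.setdefault(key, []).append(d)
    pairs = []
    for r in releases:
        key = match_key(r["first"], r["last"], r["sex"], r["dob"])
        for d in by_key.get(key, []):
            if d["died"] >= r["released"]:
                pairs.append((r, d))
    return pairs


# Made-up records for illustration only.
releases = [{"first": " John ", "last": "ROE", "sex": "m",
             "dob": date(1950, 1, 1), "released": date(1975, 6, 1)}]
deaths = [
    {"first": "john", "last": "roe", "sex": "M",
     "dob": date(1950, 1, 1), "died": date(1990, 3, 2)},
    {"first": "john", "last": "roe", "sex": "M",
     "dob": date(1951, 1, 1), "died": date(1990, 3, 2)},
]
pairs = link(releases, deaths)
```

Exact keys miss misspellings and transposed dates, which is the gap that a probabilistic matcher is meant to close.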

24 Causes of Death of Former Inmates
Cause of Death (ICD 10), CT Male Population vs CT Former Inmates

CT Male Population % | Cause of Death | CT Former Inmates % | Cause of Death
9 | Atherosclerotic heart disease | 8.1 | Accidental poisoning by and exposure to narcotics
7 | Malignant neoplasm of bronchus or lung | 6.3 | Malignant neoplasm of bronchus or lung
5.1 | Acute myocardial infarction | 4.8 | Atherosclerotic heart disease
3.6 | Chronic obstructive pulmonary disease | 4.4 | Atherosclerotic cardiovascular disease
3.4 | Atherosclerotic cardiovascular disease | 3.5 | Assault by other and unspecified firearm discharge
3 | Malignant neoplasm of prostate | 2.8 | Acute myocardial infarction
2.4 | Stroke | 2.3 | Accidental poisoning by drugs, medicaments and biological substances
2.3 | Pneumonia | 2.2 | Cirrhosis of liver
2 | Unspecified dementia | 2.2 | Chronic obstructive pulmonary disease
1.8 | Malignant neoplasm of colon | 2.1 | Intentional self-harm by hanging, strangulation and suffocation
1.8 | Cardiac arrest | 1.9 | Person injured in unspecified motor-vehicle accident, traffic
1.8 | Heart failure | 1.4 | HIV disease resulting in other specified conditions
1.7 | Sepsis | 1.3 | Unspecified diabetes mellitus
1.6 | Malignant neoplasm of pancreas | 1.3 | Sepsis
1.5 | Unspecified diabetes mellitus | 1.1 | Liver cell carcinoma
1.5 | Alzheimer's disease | 1.1 | Cardiomyopathy
 | | 1.1 | Malignant neoplasm of pancreas
 | | 1.1 | Hypertensive heart disease
 | | 1.1 | Intentional self-harm by firearm discharge

26 Causes of Death of Former Inmates
Same table as slide 24, with these causes among CT Former Inmates highlighted:
- Cirrhosis of liver
- Liver cell carcinoma
- Accidental poisoning by and exposure to narcotics
- Accidental poisoning by drugs, medicaments and biological substances
Combined: 13.7%
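
The 13.7% figure matches the sum of the former-inmate percentages for the four highlighted causes in the table. A quick check, using `Decimal` to avoid floating-point rounding noise:

```python
from decimal import Decimal

# Former-inmate percentages for the four highlighted causes, from the table.
highlighted = {
    "Cirrhosis of liver": Decimal("2.2"),
    "Liver cell carcinoma": Decimal("1.1"),
    "Accidental poisoning by and exposure to narcotics": Decimal("8.1"),
    "Accidental poisoning by drugs, medicaments and biological substances": Decimal("2.3"),
}
total = sum(highlighted.values())
print(total)  # 13.7
```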

27 Why Should the State Care?
- Former DOC inmates enrolled in Medicaid upon release
- Alcohol and drug abuse/dependence associated with HIGH rates of ER utilization
- Inpatient hospital costs for substance dependent patients significantly higher
- Deaths due to long term alcohol/drug abuse costly

28 PATH Components
Remote Components:
- Metadata Editor
- Extract, Transform and Load Module
Main Components:
- Integration Engine
- User Interface
- Security
- Workflow Module
- Query Engine with Filtering
[Diagram: DAS/BEST hosts the Integration Engine (Probabilistic Integrator - Pi), the UI, Security, Workflow and Query Engine, the DB of Clustered Indices, and the De-identified Integrated Data; each agency keeps its Agency Data, Metadata Editor & ETL, and Record Index]

29 PATH Functionality
Security:
- Personally Identifiable information never written outside of Agency Data Center
- Encrypted transfer of all data
- PII and Facts never transmitted together
- Audit logs
- No Query without Owner's Approval
Ease of Use:
- System Administration
- Data Management
- Query Filtering
- Query results delivered as de-identified data
[Diagram: PATH component diagram annotated with callouts: PII & Facts separate Xfer; Encrypted Xfer; Query Filtering; No PII; Sys Admin Approval req'd; Audit logs; Data Mgmt]
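
Two of the security properties above, keeping PII and Facts in separate transfers and writing audit logs, can be sketched together. Everything here is hypothetical: the `send` and `audit` helpers, the payload kinds and the field names are invented for illustration, and encryption is assumed to be handled by the transport rather than shown.

```python
import json
from datetime import datetime, timezone

audit_log: list[dict] = []


def audit(event: str, **details) -> None:
    """Append a timestamped entry to the in-memory audit log."""
    audit_log.append({"at": datetime.now(timezone.utc).isoformat(),
                      "event": event, **details})


def send(payload_kind: str, rows: list[dict], pii_fields: set[str]) -> str:
    """Serialize one payload, refusing to mix PII and fact fields."""
    for row in rows:
        has_pii = any(f in pii_fields for f in row)
        if payload_kind == "facts" and has_pii:
            raise ValueError("fact payload contains PII fields")
        if payload_kind == "pii" and not set(row) <= pii_fields | {"record_locator"}:
            raise ValueError("PII payload contains fact fields")
    audit("transfer", kind=payload_kind, rows=len(rows))
    return json.dumps(rows)  # encryption in transit is assumed to happen elsewhere


PII = {"ssn", "dob", "last_name"}  # hypothetical PII field names
body = send("facts", [{"record_locator": "A1", "major": "Math"}], PII)
```

A mixed row such as `{"record_locator": "A1", "ssn": "...", "major": "Math"}` is rejected under either payload kind, and every accepted transfer leaves an audit entry.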

30 Competitor Components
Remote Components:
- Metadata Editor
- Extract, Transform and Load Module
Main Components:
- Integration Engine
- User Interface
- Security
- Workflow Module
- Query Engine with Filtering
[Diagram: PATH component diagram annotated with callouts: PII & Facts separate Xfer; Encrypted Xfer; Query Filtering; No PII; Sys Admin Approval req'd; Audit logs; Data Mgmt]

31 Competitor Deficits: Desktop Integration Engine
1. Minimal Security
   - No Encrypted Transfer of Data
   - No Audit Logs
   - Dump of Facts with PII
   - No Secure Logins
   - FTP or Thumb Drive Transfers
   - No Anonymized Data
2. No Access Control - No Approval Workflow
3. No Chain of Custody Assurance - Possibility for Cherry-Picked Data
[Diagram: Copies of Agency Data feed a desktop Integration Engine; PII is visible in the Integrated Data]

32 Take a Test Drive
- Get a Login & Password
- Quick Start Guide
- Test Report Summary
- Full Documentation
- Full Test Report

