Knowledge Discovery From Massive Healthcare Claims Data


1 Knowledge Discovery From Massive Healthcare Claims Data
Varun Chandola, Sreenivas Sukumar, Jack Schryver Presented by Anatoli Shein

2 Motivation: US health care
2008: 15.2% of GDP 2017: 19.5% of GDP Anatoli Shein 5/25/2018

3 Goal: Improve cost-care ratio
Improve healthcare operations. Reduce fraud, waste, and abuse.

4 Big Data Analytics in HealthCare

5 Big Data in HealthCare Categorized

6 Data quality and availability
Clinical data, behavior data, and pharmaceutical data: useful but unavailable.

7 Data quality and availability
Health insurance data: available but needs preparation.

8 State of the Art Analytics for Massive HealthCare Data:
Network analysis, text mining, temporal analysis, higher-order feature construction.

9 Health Insurance
85% of Americans have it. Its data is stored to: track payments, address fraud, and address economic challenges. It offers strong analytic insight into healthcare.

10 Health Insurance Data Model
Fee-for-service model: Provider -> Service -> Patient -> Cost -> Justification -> Payor
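
The chain on this slide can be read as the fields of a single claim record. The following is only a minimal sketch under that assumption; the class name `Claim`, its field names, and the helper `total_billed_by_provider` are hypothetical and do not come from the presentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    """One fee-for-service claim (hypothetical field names)."""
    provider_id: str    # who delivered the service
    service_code: str   # what was delivered
    patient_id: str     # who received it
    cost: float         # amount billed
    justification: str  # diagnosis supporting the service
    payor_id: str       # who pays the bill


def total_billed_by_provider(claims):
    """Sum the billed cost per provider."""
    totals = {}
    for claim in claims:
        totals[claim.provider_id] = totals.get(claim.provider_id, 0.0) + claim.cost
    return totals


# Made-up example records, for illustration only.
example_claims = [
    Claim("P1", "C10", "B1", 120.0, "G250", "PAYOR_A"),
    Claim("P1", "C11", "B2", 80.0, "G692", "PAYOR_A"),
    Claim("P2", "C10", "B1", 50.0, "G250", "PAYOR_B"),
]
totals = total_billed_by_provider(example_claims)
print(totals)
```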

11 Data Maintained for Operation
Claims information, patient enrollment and eligibility, provider enrollment.

12 Challenges and Opportunities
Fraud, waste, abuse.

13 Fraud
Billing for services not provided. Large-scale fraud.

14 Waste
Improper payments, double payments, duplicate claims, outdated fee schedule.
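
Duplicate claims are the most mechanical item on this slide to check for. As a rough sketch (not the authors' method), one could group claims on a key and report keys that occur more than once. The record layout, the choice of key, and all identifiers below are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical records: (claim_id, provider, beneficiary, procedure, service_date, amount)
claims = [
    ("c1", "P1", "B1", "C100", "2012-03-01", 80.0),
    ("c2", "P1", "B1", "C100", "2012-03-01", 80.0),  # same service billed twice
    ("c3", "P2", "B1", "C100", "2012-03-01", 80.0),
    ("c4", "P1", "B2", "C100", "2012-03-02", 80.0),
]


def find_duplicate_claims(records):
    """Group claims sharing provider, beneficiary, procedure, and service date."""
    groups = defaultdict(list)
    for claim_id, provider, beneficiary, procedure, date, _amount in records:
        groups[(provider, beneficiary, procedure, date)].append(claim_id)
    # Keep only the keys that were billed more than once.
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


duplicates = find_duplicate_claims(claims)
print(duplicates)
```

An exact-match key like this only catches verbatim resubmissions; near-duplicates (for example, a shifted date) would need a looser comparison.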

15 Abuse
Prospective payment system, upcoding.

16 Data Used
Claims data (48 million beneficiaries in the US) from transactional data warehouses. Provider enrollment data (from private organizations). Fraudulent providers (from the Office of Inspector General’s exclusion list); the rest are treated as non-fraudulent.

17 Claims Data

18 Analysis
Identification of typical treatment profiles. Identification of costly areas.

19 Text Analysis, profile building
Apache Mahout, Hadoop-based technology, MapReduce.

20 Entities as Documents
Document-term matrices over P (providers), B (beneficiaries), C (procedures), G (diagnoses), D (drugs). Ex: PG (providers/diagnoses).
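
To make the "entities as documents" idea concrete, here is a minimal sketch of building a PG matrix: each provider is a document and each diagnosis code it bills is a term. The claim tuples, codes, and function names are hypothetical; the presentation itself used Apache Mahout on Hadoop, which this plain-Python sketch does not reproduce.

```python
from collections import Counter, defaultdict

# Hypothetical claim lines: (provider, beneficiary, procedure, diagnosis, drug)
claims = [
    ("P1", "B1", "C10", "G250", "D7"),
    ("P1", "B2", "C10", "G250", "D7"),
    ("P1", "B3", "C11", "G692", "D9"),
    ("P2", "B1", "C12", "G401", "D3"),
]

# Position of each entity type in a claim tuple, using the slide's letters.
FIELDS = {"P": 0, "B": 1, "C": 2, "G": 3, "D": 4}


def document_term_counts(records, doc_field, term_field):
    """Treat each doc_field entity as a document and count its term_field codes."""
    doc_idx, term_idx = FIELDS[doc_field], FIELDS[term_field]
    counts = defaultdict(Counter)
    for record in records:
        counts[record[doc_idx]][record[term_idx]] += 1
    return counts


def to_dense(counts):
    """Return sorted row labels, sorted column labels, and a dense count matrix."""
    docs = sorted(counts)
    terms = sorted({term for row in counts.values() for term in row})
    matrix = [[counts[doc][term] for term in terms] for doc in docs]
    return docs, terms, matrix


pg = document_term_counts(claims, "P", "G")
docs, terms, matrix = to_dense(pg)
print(docs, terms, matrix)
```

Swapping the two letters (for example `"B", "D"`) gives any of the other entity pairings without changing the code.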

22 Interesting find
Some seemingly different diagnosis codes were grouped into the same topics. Ex: diabetes and dermatoses.

23 Social Network Analysis
Estimate the risk of provider fraud before the provider makes any claims by constructing a social network.

24 Provider Network

25 Texas Provider Network

26 Extracting Features from Provider Network

27 Information complexity measure
The most distinguishing features proved to be: node degree, number of fraudulent providers in the 2-hop network, eigenvector centrality, and current-flow closeness centrality.
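
Three of the four features listed here are simple enough to sketch without a graph library. The toy network, the fraud labels, and the function names below are made up for illustration. Eigenvector centrality is computed by power iteration on the adjacency matrix plus the identity, a common shift that keeps the iteration from oscillating. Current-flow closeness centrality is left out because it requires solving a graph Laplacian system; graph libraries such as NetworkX provide it.

```python
from collections import defaultdict

# Hypothetical undirected provider network and known-fraud labels.
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("B", "D"), ("D", "E")]
fraudulent = {"A", "E"}


def build_adjacency(edge_list):
    """Map each node to the set of its neighbors."""
    adj = defaultdict(set)
    for u, v in edge_list:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def fraud_within_two_hops(adj, node, fraud_set):
    """Count known-fraudulent providers at distance 1 or 2 from node."""
    reach = set(adj[node])
    for neighbor in adj[node]:
        reach |= adj[neighbor]
    reach.discard(node)
    return len(reach & fraud_set)


def eigenvector_centrality(adj, iterations=200):
    """Power iteration on (A + I), normalized to unit Euclidean length."""
    score = {n: 1.0 for n in adj}
    for _ in range(iterations):
        nxt = {n: score[n] + sum(score[m] for m in adj[n]) for n in adj}
        norm = sum(v * v for v in nxt.values()) ** 0.5
        score = {n: v / norm for n, v in nxt.items()}
    return score


adj = build_adjacency(edges)
degree = {n: len(neighbors) for n, neighbors in adj.items()}
two_hop_fraud = {n: fraud_within_two_hops(adj, n, fraudulent) for n in adj}
centrality = eigenvector_centrality(adj)
print(degree, two_hop_fraud)
```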

29 Temporal Feature Construction
By looking at provider data over time we can find anomalies: an increase in the number of patients, or taking patients with conditions that differ from the provider's past profile.
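
Both anomaly types on this slide can be approximated with simple statistics. The sketch below is an illustration only, not the method from the paper: it flags a month whose patient count sits far above the provider's recent history (a z-score test), and it compares past and recent diagnosis mixes with cosine similarity. The window length, the threshold, and all data are assumed values.

```python
from collections import Counter
from statistics import mean, stdev

# Hypothetical monthly counts of distinct patients seen by one provider.
monthly_patients = [40, 42, 38, 41, 39, 43, 40, 95]


def flag_spikes(series, history=6, threshold=3.0):
    """Flag indices more than `threshold` standard deviations above the
    mean of the preceding `history` values."""
    flagged = []
    for i in range(history, len(series)):
        window = series[i - history:i]
        mu, sigma = mean(window), stdev(window)
        if sigma > 0 and (series[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


def cosine_similarity(a, b):
    """Cosine similarity between two code-count profiles."""
    dot = sum(count * b.get(code, 0) for code, count in a.items())
    norm_a = sum(v * v for v in a.values()) ** 0.5
    norm_b = sum(v * v for v in b.values()) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical diagnosis-code counts for the same provider in two periods.
past_profile = Counter({"G250": 8, "G401": 2})
recent_profile = Counter({"G692": 9, "G250": 1})

spikes = flag_spikes(monthly_patients)
similarity = cosine_similarity(past_profile, recent_profile)
print(spikes, round(similarity, 3))
```

A low similarity between the two periods suggests the provider's case mix has shifted away from its past profile.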

30 Fraudulent Provider Detection

31 Conclusions
Introduced the domain of “big” healthcare claims data. Analyzed health care claims data at a country level using state-of-the-art analytics for massive data. The problem was transformed into well-known analysis problems in the data mining community. Several approaches were presented for identifying fraud, waste, and abuse.

32 Thank you. Questions?

