Modeling and Detecting Anomalous Topic Access Siddharth Gupta 1, Casey Hanson 2, Carl A Gunter 3, Mario Frank 4, David Liebovitz 5, Bradley Malin 6 1,2,3,4.


1 Modeling and Detecting Anomalous Topic Access. Siddharth Gupta 1, Casey Hanson 2, Carl A Gunter 3, Mario Frank 4, David Liebovitz 5, Bradley Malin 6. 1,2,3,4 Department of Computer Science; 3,5 Department of Medicine; 6 Department of Biomedical Informatics. 1,2,3 University of Illinois at Urbana-Champaign; 4 University of California, Berkeley; 5 Northwestern University; 6 Vanderbilt University

2 Outline of the talk: Motivation and Challenges; Our Contributions; Dataset Description; Random Topic Access (RTA) Model; Random Topic Access Detection (RTAD) Model; Evaluation and Results

3 EMR Access Breach. Reported in April 2013 at the University of Florida: 2 offenders illegitimately accessed the records of 15,000 patients over 3 years (March 2009 to October 2012). Personal information, including names, addresses, dates of birth, medical record numbers, and Social Security numbers, was compromised for the purpose of billing fraud. One of the offenders was an insider at the hospital without prior. How can we efficiently model and detect these types of attacks in the healthcare system?

4 Motivation. Two broad classes of threats. Inside threats: behaviors of hospital users (staff) that adversely affect the healthcare institution, such as financial fraud, medical identity theft, and curiosity-driven access to EMRs. Outside threats: an outside entity hires an insider to commit fraud; a visitor accesses records on open computers in some scenarios; an untrustworthy patient seeks information about other patients' records. Ramifications: irreversible violation of patient privacy and subsequent high costs for hospitals. Deterrent: the current legal deterrent is a set of regulations, such as HIPAA and HITECH, which impose specific privacy rules for patients and financial penalties for violating them.

5 Classical Detection Methodologies. Build a classifier on labeled data to differentiate anomalous users from legitimate users. Real healthcare data is not labeled, so current methods inject synthetic anomalous users and evaluate on them.

6 Random Object Access. In healthcare information systems, the primary mechanism for generating anomalous users is to associate users with random patients in the dataset. We call such a system ROA (random object access). The resulting user does not appear to be a plausible attacker in a real hospital setting.
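The ROA idea described on this slide can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' generator: the patient identifiers, the number of accesses, and the function name `simulate_roa_user` are all hypothetical, and patients are drawn uniformly with replacement.

```python
import random

def simulate_roa_user(patient_ids, n_accesses, rng):
    """Return a synthetic access list: patients drawn uniformly at random.

    Assumption: sampling is with replacement, so the same record
    may be opened more than once by the synthetic user.
    """
    return [rng.choice(patient_ids) for _ in range(n_accesses)]

# Hypothetical patient identifiers; real audit logs would supply these.
patient_ids = [f"patient_{i}" for i in range(1000)]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
fake_user_accesses = simulate_roa_user(patient_ids, n_accesses=50, rng=rng)
```

Because every patient is equally likely, the synthetic user's accesses share no diagnosis theme, which is why such a user does not resemble a plausible attacker.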

7 Our Contributions. Random Topic Access (RTA): we introduce and study a random topic access (RTA) model aimed at users whose access may be illegitimate but is not fully random, because it is focused on common semantic themes. User Simulation: we use the latent topic framework to simulate illegitimate users, modeling them as samples from a Dirichlet distribution over topic multinomials. Anomaly Detection Framework: we study RTA to detect and evaluate users with suspicious access patterns.

8 Data Set. Fig. a) Summary Statistics for Audit Logs. Fig. b) Summary Statistics for Patient Records.

9 Random Topic Access (RTA) Model. A mechanism that uses latent topic structures to represent real users in the population and to allow the synthetic generation of semantically relevant anomalous users. Topic modeling can provide a concise description of how a user behaves in the context of his or her peers and of the meaning of that behavior. Users are modeled as samples from a Dirichlet distribution over topic multinomials.

10 Latent Dirichlet Allocation (LDA). Figure: LDA maps each patient's raw diagnosis features (a row of 0/1 indicators) to diagnosis topic features (topic proportions, shown for patient 1 as 0.2, 0.1, 0.7).
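For reference, the standard LDA generative process can be written as below, treating each patient as a "document" and each diagnosis code as a "word". The slide does not give this notation, so the symbols are assumptions taken from the usual LDA formulation; in particular, the α here is the generic LDA prior and is not necessarily the α used on the user-simulation slides.

```latex
\begin{align*}
\phi_k &\sim \mathrm{Dirichlet}(\eta)
  && \text{distribution over diagnosis codes for topic } k \\
\theta_p &\sim \mathrm{Dirichlet}(\alpha)
  && \text{topic proportions for patient } p \\
z_{p,n} \mid \theta_p &\sim \mathrm{Multinomial}(\theta_p)
  && \text{topic of the } n\text{-th diagnosis of patient } p \\
w_{p,n} \mid z_{p,n}, \phi &\sim \mathrm{Multinomial}(\phi_{z_{p,n}})
  && \text{observed diagnosis code}
\end{align*}
```

Under this reading, the diagnosis topic feature for a patient is the inferred vector θ_p, whose entries are non-negative and sum to 1, as in the 0.2, 0.1, 0.7 example.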

11 Topic Distributions

12 Topic Distributions. Diagnosis Topics: Neoplasm Topic, Obstetric Topic, Kidney Topic

13 Characterizing Users

14 Multidimensional Scaling: Patient Diagnosis

15 RTA: Simulating Users. a) Directed or Masquerading User (α<1): an anomalous user of some specialty gains sole access to the terminal of another user in the hospital. b) Purely Random User (α=1): the user is characterized by completely random behavior, with little semantic congruence to the hospital setting. c) Indirect User (α>1): the user type resembles an even blend of the topics of many specialized users.
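The effect of α on the simulated user types can be illustrated with a small sketch. It assumes a symmetric Dirichlet over a hypothetical 10 topics and samples it with the standard normalized-Gamma construction; the function name, the number of topics, and the sample count are illustrative choices, not values from the talk.

```python
import random

def sample_topic_mixture(alpha, n_topics, rng):
    """Draw one topic mixture from a symmetric Dirichlet(alpha).

    Uses the normalized-Gamma construction. For very small alpha every
    Gamma draw can underflow to 0.0, so we redraw in that rare case.
    """
    while True:
        draws = [rng.gammavariate(alpha, 1.0) for _ in range(n_topics)]
        total = sum(draws)
        if total > 0.0:
            return [d / total for d in draws]

rng = random.Random(0)  # fixed seed for reproducibility
n_topics = 10           # hypothetical number of diagnosis topics

# Average weight of the dominant topic for each alpha shown on the slides.
mean_max_weight = {}
for alpha in (0.01, 0.1, 1.0, 100.0):
    mixtures = [sample_topic_mixture(alpha, n_topics, rng) for _ in range(200)]
    mean_max_weight[alpha] = sum(max(m) for m in mixtures) / len(mixtures)
    print(f"alpha={alpha}: mean dominant-topic weight = {mean_max_weight[alpha]:.2f}")
```

Small α concentrates almost all mass on one topic (a directed or masquerading user), α = 1 is uniform over all possible mixtures (a purely random user), and large α yields a nearly even blend of topics (an indirect user).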

16 Population Distribution. Figure panels: α = 0.01, α = 0.1, α = 1, α = 100. Groups: A. Directed Users; B. Purely Random Users; C. Indirect Users.

17 Role Distribution. Figure labels: NMH Resident Fellow CPOE; Anomalous Users (Masquerading Users, Purely Random Users, Indirect Users); Real Users.

18 Random Topic Access Detection (RTAD)

19 Results - I

20 Results - II

21 Thank You! Contact: sid88in@gmail.com

