Experience-based access management & privacy-preserving record linkage. Elizabeth Ashley Durham. Thursday, November 11, 2010.


1 Experience-based access management & privacy-preserving record linkage. Elizabeth Ashley Durham. Thursday, November 11, 2010.

2 roadmap
experience-based access management
privacy-preserving record linkage
– definition
– steps in record linkage
– experiment
– conclusions
– open research questions in record linkage
TRUST 2010

3 roadmap
experience-based access management
privacy-preserving record linkage
– definition
– steps in record linkage
– experiment
– conclusions
– open research questions in record linkage

4 access management
Least Privilege: How can we limit provider access to only the information required to do their job?
Identity and Access Management (IAM)
– ex: role-based access controls
IAM in health care organizations
– complex workflow
– routine emergencies

5 the problem with access controls
[Figure labels: Ideal Model, Enforced Control, the problem]
Study: 43% of providers accessed records for which they did not have permissions.
L. Røstad and Ø. Nytrø. "Access control and integration of health care systems: an experience report and future challenges." Proc. Availability, Reliability & Security, 2007.

6 the experience-based access management (EBAM) lifecycle
[Figure labels: Ideal Model, Enforced Control, Access Log, Expected Model]
C. Gunter, D. Liebovitz, and B. Malin. “EBAM: Experience-Based Access Management for Healthcare.” USENIX HealthSec’10 workshop.
For more information, see: USENIX Health Security workshop; copy of the paper; video of the presentation.

7 record linkage in surveillance
[Figure labels: access logs; “Karen Lewis”, human resources; “Karen Lewis”, hospital privacy office]

8 roadmap
experience-based access management
privacy-preserving record linkage
– definition
– steps in record linkage
– experiment
– conclusions
– open research questions in record linkage

9 privacy-preserving record linkage (pprl)
set of records from dataholder A
First Name  Last Name  Birth Day  Birth Month  Birth Year  Gender
Karyn       Lewis      28         Sept         1990        F
Marty       Smith      19         Apr          1982        M
Jon         Smyth      04         Feb          1960        M
Joy         Beck       08         May          1980        F
Laura       Root       27         Aug          1945        F
set of records from dataholder B
First Name  Last Name  Birth Day  Birth Month  Birth Year  Gender
John        Smith      01         Feb          1960        M
Bob         Beck       19         Mar          1980        M
Bob         Taylor     07         Jun          1972        M
Karen       Lewis      28         Sept         1990        F
Alice       Todd       27         Aug          1965        F

10 roadmap
experience-based access management
privacy-preserving record linkage
– definition
– applications
– steps in record linkage
– experiment
– conclusions
– open research questions in record linkage

11 steps in record linkage
blocking
field comparison
record pair comparison
record pair classification
matches / non-matches
* I assume a common schema and method of data standardization. I also assume that the records from an institution have been deduplicated (i.e., record linkage has been applied within each institution so that each individual is represented by only a single record within that institution).

12 field comparison
fields: First Name, Last Name, Birth Day, Birth Month, Birth Year, Gender
[Figure: record a and record b are compared field by field to produce a comparison vector; the one record value that survives is John, Smith, 04, Mar, 1962, M]

13 roadmap
experience-based access management
privacy-preserving record linkage
– definition
– applications
– steps in record linkage
– experiment
– conclusions
– open research questions in record linkage

14 privacy-preserving field comparison experiment: the dataset
1,000 records from the North Carolina Voter Registration database
fields: Last Name, First Name, Middle Name, Birth State, City, State, Street Name, Street Type, Street Suffix, Race, Gender
1,000 “corrupted” records produced by a data corrupter (example: KATHRYN MCMILLAN becomes KATHY MEMILLAN)
repeated 100 times to examine statistical significance
P. Christen and A. Pudjijono, “Accurate Synthetic Generation of Realistic Personal Information.” Proceedings of the 13th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining.

15 privacy-preserving field comparison experiment
option 1: hash & compare
option 2: secure edit similarity
option 3: bloom filter

16 privacy-preserving field comparison option 1: hash & compare
SHA-1; “salting” used to prevent dictionary attack
[Figure: the fields of record a (John, Smith, 04, Mar, 1962, M) and record b are hashed and compared to produce a comparison vector; sample hashed values: xy9lbr3fxtvesvr3dns]
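The hash & compare option can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the experiment's implementation: the salt value, the lowercase normalization, the helper names, and record b are all hypothetical. It shows that two fields agree only when their salted SHA-1 digests are identical, so a near match such as John / Jon counts as a full disagreement.

```python
import hashlib

SALT = "shared-secret-salt"  # hypothetical salt agreed on by both data holders


def salted_sha1(value: str, salt: str) -> str:
    """Return the hex SHA-1 digest of the salt followed by the normalized value."""
    normalized = value.strip().lower()  # assumption: simple standardization
    return hashlib.sha1((salt + normalized).encode("utf-8")).hexdigest()


def encode_record(record, salt):
    """Hash every field of a record; each data holder would run this locally."""
    return [salted_sha1(field, salt) for field in record]


def compare_encoded(encoded_a, encoded_b):
    """Field-by-field comparison: 1 if the digests are equal, else 0."""
    return [int(x == y) for x, y in zip(encoded_a, encoded_b)]


record_a = ["John", "Smith", "04", "Mar", "1962", "M"]
record_b = ["Jon", "Smith", "04", "Mar", "1962", "M"]  # hypothetical record b

comparison_vector = compare_encoded(
    encode_record(record_a, SALT), encode_record(record_b, SALT)
)
print(comparison_vector)  # [0, 1, 1, 1, 1, 1]
```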

17 privacy-preserving field comparison experiment option 2: secure edit similarity
edit distance: the minimal number of insertions, deletions, and substitutions required to convert one string into another
edit similarity:
“secure” edit distance: calculated by iteratively using homomorphic encryption to compute the value of each cell of the matrix used in the dynamic programming algorithm to calculate edit distance
W. Du, M. J. Atallah, “Protocols for Secure Remote Database Access with Approximate Matching,” Technical Report, CERIAS, Purdue University.
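The dynamic programming matrix mentioned above can be illustrated in plaintext. The sketch below is ordinary (non-secure) edit distance; the secure protocol would compute each cell of the same matrix under homomorphic encryption, which is not shown here. The function name is chosen only for illustration.

```python
def edit_distance(s: str, t: str) -> int:
    """Minimal number of insertions, deletions, and substitutions turning s into t."""
    rows, cols = len(s) + 1, len(t) + 1
    # d[i][j] holds the edit distance between s[:i] and t[:j]
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters of s[:i]
    for j in range(cols):
        d[0][j] = j  # insert all j characters of t[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            substitution_cost = 0 if s[i - 1] == t[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,                      # deletion
                d[i][j - 1] + 1,                      # insertion
                d[i - 1][j - 1] + substitution_cost,  # substitution or match
            )
    return d[-1][-1]


print(edit_distance("john", "jon"))           # 1
print(edit_distance("MCMILLAN", "MEMILLAN"))  # 1
```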

18 privacy-preserving field comparison experiment option 3: Bloom filters
record a: john, bigrams: _j jo oh hn n_
record b: jon, bigrams: _j jo on n_
[Figure labels: h1, h2, α, β]
1,000 bits & 30 hash functions (all variations of SHA-1; “salting” used to prevent dictionary attack)
Rainer Schnell, Tobias Bachteler, and Jorg Reiher. “Privacy-preserving record linkage using Bloom filters,” BMC Medical Informatics and Decision Making (9).
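A minimal sketch of the Bloom filter encoding, using the slide's parameters (1,000 bits, 30 hash functions, padded bigrams). Several details are assumptions made for illustration: the 30 "variations of SHA-1" are derived here by prefixing the salt and a hash index, the salt value is hypothetical, and similarity is measured with the Dice coefficient, which the slide does not name.

```python
import hashlib

NUM_BITS = 1000    # filter length from the slide
NUM_HASHES = 30    # number of hash functions from the slide
SALT = "shared-secret-salt"  # hypothetical salt


def bigrams(name: str):
    """Split a name into overlapping 2-character substrings, padded with '_'."""
    padded = "_" + name.lower() + "_"
    return [padded[i:i + 2] for i in range(len(padded) - 1)]


def bloom_encode(name: str, salt: str = SALT):
    """Return the set of bit positions set to 1 in the name's Bloom filter."""
    bits = set()
    for gram in bigrams(name):
        for k in range(NUM_HASHES):
            # assumption: hash function k is SHA-1 over salt, index k, and bigram
            digest = hashlib.sha1(f"{salt}|{k}|{gram}".encode("utf-8")).digest()
            bits.add(int.from_bytes(digest, "big") % NUM_BITS)
    return bits


def dice(bits_a, bits_b) -> float:
    """Dice coefficient of two filters: 2 * |common bits| / (|a| + |b|)."""
    if not bits_a and not bits_b:
        return 1.0
    return 2 * len(bits_a & bits_b) / (len(bits_a) + len(bits_b))


print(bigrams("john"))  # ['_j', 'jo', 'oh', 'hn', 'n_']
print(bigrams("jon"))   # ['_j', 'jo', 'on', 'n_']
print(dice(bloom_encode("john"), bloom_encode("jon")))
```

Unlike hash & compare, names that share most of their bigrams end up with filters that share most of their set bits, so small spelling differences lower the similarity instead of erasing it.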

19 privacy-preserving field comparison experiment: run time
2.5 GHz quad core PC with 4 GB of memory
Elizabeth Durham, Yuan Xue, Murat Kantarcioglu, and Bradley Malin. Submitted to Information Fusion.

20 privacy-preserving field comparison experiment: correctness
Elizabeth Durham, Yuan Xue, Murat Kantarcioglu, and Bradley Malin. Submitted to Information Fusion.

21 roadmap
experience-based access management
privacy-preserving record linkage
– definition
– applications
– steps in record linkage
– experiment
– conclusions
– open research questions in record linkage

22 conclusions
[Table: columns hash & compare, bloom filter, secure edit distance; rows accuracy, speed, security, overall; cell values not preserved]

23 roadmap
experience-based access management
privacy-preserving record linkage
– definition
– applications
– steps in record linkage
– experiment
– conclusions
– open research questions in record linkage

24 open research questions in record linkage
centralized
distributed

25 thanks
NLM 2-T15LM
NIH R01 LM
NSF CNS (EBAM)
NSF CCF (TRUST)
ebam; privacy-preserving record linkage

26 roadmap
experience-based access management
privacy-preserving record linkage
– definition
– applications
– steps in record linkage: blocking, field comparison, record pair comparison, record pair classification
– experiment: design, results
– open research questions in record linkage

27 blocking
one record set: John Smith, …; Bob Beck, …; Bob Taylor, …; Karen Lewis, …; Alice Todd, …
other record set: Jon Smyth, …; Joy Beck, …; Marty Smith, …; Karyn Lewis, …; Laura Root, …
no blocking: |A||B| = 25 record pair comparisons
blocking (first letter of last name): 4 record pair comparisons
[Figure legend: match, non-match]
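The comparison counts on this slide can be reproduced with a short sketch. The names come from the slide; the data layout and function names are assumptions for illustration. Records are grouped by the first letter of the last name, and only records falling in the same block are paired.

```python
from collections import defaultdict
from itertools import product

# (first name, last name) pairs taken from the slide
set_a = [("John", "Smith"), ("Bob", "Beck"), ("Bob", "Taylor"),
         ("Karen", "Lewis"), ("Alice", "Todd")]
set_b = [("Jon", "Smyth"), ("Joy", "Beck"), ("Marty", "Smith"),
         ("Karyn", "Lewis"), ("Laura", "Root")]


def blocking_key(record):
    """Blocking key: first letter of the last name."""
    return record[1][0].upper()


def candidate_pairs(records_a, records_b):
    """Pair only records that share the same blocking key."""
    blocks_b = defaultdict(list)
    for rec in records_b:
        blocks_b[blocking_key(rec)].append(rec)
    return [(a, b) for a in records_a for b in blocks_b.get(blocking_key(a), [])]


all_pairs = list(product(set_a, set_b))        # no blocking
blocked_pairs = candidate_pairs(set_a, set_b)  # blocking on last-name initial

print(len(all_pairs), len(blocked_pairs))  # 25 4
```

The trade-off is that a true match whose blocking key differs (for example, a typo in the first letter of the last name) is never compared at all.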

28 roadmap
experience-based access management
privacy-preserving record linkage
– definition
– applications
– steps in record linkage: blocking, field comparison, record pair comparison, record pair classification
– experiment: design, results
– open research questions in record linkage

29 continuous fellegi-sunter
* Note this assumes a uniform distribution of similarity scores.
Edward H. Porter and William E. Winkler, “Approximate String Comparison and its Effect on an Advanced Record Linkage System”, Research Report RR97/02, U.S. Census Bureau.

30 record pair comparison: Fellegi-Sunter (FS)
conditional probability vectors (calculated once per record linkage over all record pairs):
m[i] = P(a[i] == b[i] | (a,b) is a match)*
u[i] = P(a[i] == b[i] | (a,b) is a non-match)*
where i = 1, …, # fields
weight vectors:
agreement weight: w_a[i] = log(m[i] / u[i])
disagreement weight: w_d[i] = log((1 - m[i]) / (1 - u[i]))
scoring (calculated for each record pair):
* The Expectation Maximization (EM) algorithm, or a subset of records for which the true match status is known, can be used to determine these conditional probabilities.
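The weight formulas above can be worked through in a short sketch. The m and u values below are hypothetical, natural logarithms are assumed (the slide does not give a base), and the scoring rule, which sums the agreement weight for each agreeing field and the disagreement weight for each disagreeing field, is the standard Fellegi-Sunter form rather than something recovered from the slide.

```python
import math


def fs_weights(m, u):
    """Per-field agreement and disagreement weights from the m and u vectors."""
    agree = [math.log(mi / ui) for mi, ui in zip(m, u)]
    disagree = [math.log((1 - mi) / (1 - ui)) for mi, ui in zip(m, u)]
    return agree, disagree


def fs_score(comparison_vector, agree, disagree):
    """Sum w_a[i] where the fields agree (1) and w_d[i] where they disagree (0)."""
    return sum(wa if c else wd
               for c, wa, wd in zip(comparison_vector, agree, disagree))


# hypothetical conditional probabilities for three fields
m = [0.9, 0.9, 0.95]   # P(field agrees | record pair is a match)
u = [0.1, 0.2, 0.5]    # P(field agrees | record pair is a non-match)

agree, disagree = fs_weights(m, u)       # computed once per linkage
score_all_agree = fs_score([1, 1, 1], agree, disagree)   # computed per pair
score_one_differs = fs_score([0, 1, 1], agree, disagree)
print(score_all_agree, score_one_differs)
```

A field that agrees far more often among matches than among non-matches (large m[i] / u[i]) gets a large positive agreement weight, so it moves the score more than a field such as gender, which often agrees by chance.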

31 fellegi-sunter
conditional probability vectors:
weight vectors:
I. Fellegi and A. Sunter, “A theory for record linkage.” Journal of the American Statistical Association.

32 roadmap
experience-based access management
privacy-preserving record linkage
– definition
– applications
– steps in record linkage: blocking, field comparison, record pair comparison, record pair classification
– experiment: design, results
– open research questions in record linkage

33 record pair classification
[Figure: record pairs placed along a match score axis and labeled match or non-match. Records shown: mimi williams f; bill rogers m; jack abbott m; momo williams f; william rogers m]

34 open research questions in record linkage
one record set: John Smith, …; Bob Beck, …; Bob Taylor, …; Karen Lewis, …; Alice Todd, …
other record set: Jon Smyth, …; Joy Beck, …; Marty Smith, …; Karyn Lewis, …; Laura Root, …
no blocking: |A||B| = 25 record pair comparisons
blocking (first letter of last name): 4 record pair comparisons
[Figure legend: match, non-match]

35 open research questions in record linkage
[Figure: two record sets with fields first name, last name, birth date, gender, shown twice under the labels actual and predicted]
one record set: bill rogers m; mimi williams f; jack abbott m
other record set: bill rogers m; momo williams f; william rogers m

