Published by Estella Cory Newton. Modified over 6 years ago.
1
Real-time Protection for Open Beacon Network
Open Data and Big Science S07
Diyue Bu, Xiaofeng Wang, Haixu Tang
School of Informatics, Computing and Engineering, Indiana University Bloomington
2
Disclosure I and my spouse/partner have no relevant relationships with commercial interests to disclose. AMIA | amia.org
3
Learning Objectives
After participating in this session, the learner should be better able to:
  Understand what the Beacon Network is.
  Identify the potential attacks on the Beacon Network.
  Explain the mechanism of the real-time flipping mitigation method.
4
Presentation Outline
Background Introduction
  Beacon Network
  Attacks on Beacon Network
  Existing Mitigation Methods
Proposed Method: Real-time Flipping (RTF) Method
Experiments & Results
Secure-Beacon Implementation
5
The Beacon Network
“The Beacon Network is a search engine across the world's public beacons. It enables global discovery of genetic mutations, federated across a large and growing network of shared genetic datasets.”
Reference: Global Alliance for Genomics and Health. A federated ecosystem for sharing genomic, clinical data. Science 352, 1278–1280 (2016).
What's a Beacon?
Beacon is a genetic mutation sharing platform developed by the Global Alliance for Genomics and Health. A beacon is a web service that any institution can implement to share genetic data. It answers questions of the form "Do you have information about the following mutation?" and responds with "Yes" or "No", potentially along with more information. A site offering this service is called a "beacon". This open web service is designed to be technically simple while giving data generators options for distributing data through proportional safeguards.
6
Attacks on Beacon Network
Shringarpure and Bustamante (SB) Attack [1]
“Optimal” Attack [2]
Inference attack using linkage disequilibrium (LD) & Markov chain model [3]
Reference:
1. Shringarpure, Suyash S., and Carlos D. Bustamante. "Privacy risks from genomic data-sharing beacons." The American Journal of Human Genetics 97.5 (2015).
2. Raisaro, Jean Louis, et al. "Addressing Beacon re-identification attacks: quantification and mitigation of privacy risks." Journal of the American Medical Informatics Association (2017).
3. von Thenen, Nora, et al. "Inference Attacks Against Genomic Data-Sharing Beacons." GenoPri17.
7
SB (LRT) Attack
Given the responses of n queries:
H0: The queried victim’s genome is not in the target database.
H1: The queried victim’s genome is in the target database.
The power of the test: re-identification risk of an individual genome in a genomic database
Reference: Shringarpure, Suyash S., and Carlos D. Bustamante. "Privacy risks from genomic data-sharing beacons." The American Journal of Human Genetics 97.5 (2015).
Power of the test: the confidence with which an attacker can conclude that the victim (with the queried variants) is present in the target database, i.e. the probability of correctly rejecting the null hypothesis over multiple tests.
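The hypothesis test above can be sketched in code. This is a minimal illustration and not the authors' implementation: it assumes independent SNPs with known alternate-allele frequencies, approximates the probability that none of N diploid genomes carries an allele as (1 - f)^(2N), and uses a hypothetical mismatch rate `delta`. All function and parameter names are illustrative.

```python
import math

def none_carry_prob(freq, n_individuals):
    """Probability that none of n diploid genomes carries the allele
    (assumes independent alleles with frequency freq in (0, 1))."""
    return (1.0 - freq) ** (2 * n_individuals)

def lrt_statistic(responses, freqs, n_individuals, delta=1e-6):
    """Return log L(H0) - log L(H1) for a sequence of Yes/No beacon responses.

    responses[i] is True for "Yes"; freqs[i] is the allele frequency of the
    i-th queried variant, which the victim is assumed to carry.
    delta is an assumed genotype mismatch rate (hypothetical value).
    """
    stat = 0.0
    for is_yes, f in zip(responses, freqs):
        d_n = none_carry_prob(f, n_individuals)        # H0: victim absent
        d_n1 = none_carry_prob(f, n_individuals - 1)   # H1: victim present
        if is_yes:
            stat += math.log(1.0 - d_n) - math.log(1.0 - delta * d_n1)
        else:
            stat += math.log(d_n) - math.log(delta * d_n1)
    return stat
```

Under these assumptions, a strongly negative statistic favours H1 (victim present), while a single "No" for a variant the victim carries pushes the statistic sharply towards H0.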
8
Attacks on Beacon Network
Shringarpure and Bustamante (SB) Attack [1]
  An inference attack based on a log-likelihood ratio test
“Optimal” Attack [2]
  Query variants in rare-first order
Inference attack using linkage disequilibrium (LD) & Markov chain model [3]
Reference:
1. Shringarpure, Suyash S., and Carlos D. Bustamante. "Privacy risks from genomic data-sharing beacons." The American Journal of Human Genetics 97.5 (2015).
2. Raisaro, Jean Louis, et al. "Addressing Beacon re-identification attacks: quantification and mitigation of privacy risks." Journal of the American Medical Informatics Association (2017).
3. von Thenen, Nora, et al. "Inference Attacks Against Genomic Data-Sharing Beacons." GenoPri17.
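The rare-first query order used by the “optimal” attack can be illustrated as a simple sort. This sketch assumes the attacker holds a mapping from each of the victim's variants to a known allele frequency; the function name and the variant identifiers are hypothetical.

```python
def rare_first_order(victim_variants, allele_freqs):
    """Order the victim's variants from rarest to most common.

    allele_freqs maps a variant identifier to its assumed population
    allele frequency; ties keep their original relative order.
    """
    return sorted(victim_variants, key=lambda v: allele_freqs[v])

# Hypothetical variant identifiers and frequencies, for illustration only.
freqs = {"varA": 0.20, "varB": 0.0005, "varC": 0.03}
query_order = rare_first_order(["varA", "varB", "varC"], freqs)
```

Querying the rarest variants first concentrates the most informative responses at the start of the attack, which is why fewer queries are needed.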
9
Previous Mitigation Methods
Random Flipping (RF) Method [1]
  Randomly mask a proportion (ℇ) of rare SNPs
Query Budget Method [1]
  Remove an individual’s genome information if a high re-identification risk is detected
Reference:
1. Raisaro, Jean Louis, et al. "Addressing Beacon re-identification attacks: quantification and mitigation of privacy risks." Journal of the American Medical Informatics Association (2017).
Rare SNPs: SNPs carried by only one individual in the database.
Mask/flip SNPs: a mechanism used by several mitigation methods, which switches the answer to a query from Yes to No. The answer is never switched from No to Yes, because such a search result could confuse researchers.
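The random flipping idea described above can be sketched as follows. This is an illustrative reading of the slide, not the published implementation: it assumes the beacon stores a carrier count per SNP, treats SNPs with exactly one carrier as rare, and fixes the masked set once so that repeated queries get consistent answers. The names `choose_masked_snps`, `beacon_answer`, and `seed` are hypothetical.

```python
import random

def choose_masked_snps(carrier_counts, epsilon, seed=0):
    """Pick a fixed random subset (proportion epsilon) of the rare SNPs.

    carrier_counts maps SNP id -> number of individuals carrying it.
    A SNP is treated as rare when exactly one individual carries it.
    """
    rare = sorted(snp for snp, count in carrier_counts.items() if count == 1)
    rng = random.Random(seed)
    n_mask = round(epsilon * len(rare))
    return set(rng.sample(rare, n_mask))

def beacon_answer(snp, carrier_counts, masked):
    """Answer Yes/No; masked rare SNPs are flipped from Yes to No only."""
    present = carrier_counts.get(snp, 0) > 0
    return present and snp not in masked
```

Because only Yes answers are ever flipped, a SNP that is absent from the database still returns No, in line with the note above.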
10
Previous Mitigation Methods
Strategic Flipping (SF) Method [1]
  Mask k percent of variants with the largest discriminative power
Eliminating Random Positions & Biased Randomized Response [2]
  Pitfall: flips out-of-target variants
Reference:
1. Wan, Zhiyu, et al. "Controlling the signal: Practical privacy protection of genomic data sharing through Beacon services." BMC medical genomics.
2. Al Aziz, Md Momin, et al. "Aftermath of bustamante attack on genomic beacon service." BMC medical genomics.
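The selection step of strategic flipping can be illustrated as a top-k-percent cut. This sketch takes the discriminative power of each variant as a precomputed input, since the slide does not define how it is calculated; the function name and score values are hypothetical.

```python
import math

def strategic_mask(discriminative_power, k_percent):
    """Return the k percent of variants with the largest scores.

    discriminative_power maps variant id -> score (assumed precomputed);
    the number of masked variants is rounded down.
    """
    n_mask = math.floor(len(discriminative_power) * k_percent / 100)
    ranked = sorted(discriminative_power,
                    key=discriminative_power.get, reverse=True)
    return set(ranked[:n_mask])

# Hypothetical scores: variant "s19" has the largest discriminative power.
scores = {f"s{i}": float(i) for i in range(20)}
masked_variants = strategic_mask(scores, k_percent=5)
```

Unlike random flipping, the masked set here is deterministic given the scores, so the same variants are hidden from every user.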
11
Real-time Flipping (RTF) Method
Mask a different proportion of rare SNPs from each individual
More efficient masking
Real-time performance
Better utility
Safer environment
Lower re-identification risk
Utility: number of correctly answered queries / total number of queries
The smaller the p-value, the larger the noise
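The utility measure defined above is a simple ratio and can be computed as follows. This is a minimal sketch; the function name and the boolean encoding of answers (True for Yes) are assumptions made for illustration.

```python
def utility(true_answers, returned_answers):
    """Fraction of queries whose returned answer matches the true answer."""
    if len(true_answers) != len(returned_answers):
        raise ValueError("answer lists must have the same length")
    if not true_answers:
        raise ValueError("at least one query is required")
    correct = sum(t == r for t, r in zip(true_answers, returned_answers))
    return correct / len(true_answers)

# One of four Yes answers was flipped to No, so three of four are correct.
example_utility = utility([True, True, False, True],
                          [True, False, False, True])
```

Every flipped answer lowers this ratio, which is why a method that flips fewer SNPs for the same protection has better utility.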
12
Experiments
Phase 3 of 1000 Genomes Project
  Beacon database: 1,235 unrelated individuals
  Control group: 300 genomes
Perform the LRT attack with queries in the following orders:
  Random
  Rare-first (“optimal” attack) [1]
  Discriminative-first [2]
  Typical user (statistics from Beacon Browser logs) [1]
Reference:
1. Raisaro, Jean Louis, et al. “Addressing Beacon re-identification attacks: quantification and mitigation of privacy risks.” Journal of the American Medical Informatics Association (2017).
2. Wan, Zhiyu, et al. "Controlling the signal: Practical privacy protection of genomic data sharing through Beacon services." BMC medical genomics.
Allele frequencies (AFs) are assumed to be known (ExAC).
The control group is used to validate the power of the LRT attack.
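One common way to estimate the power of such a test from a control group is sketched below. It is an assumption about the evaluation procedure, not a description of the authors' code: the rejection threshold is set so that at most a fraction `alpha` of control genomes (known to be outside the beacon) are falsely flagged, and power is the fraction of beacon members flagged. It also assumes that lower statistics indicate membership; the function name and `alpha` default are hypothetical.

```python
def empirical_power(member_stats, control_stats, alpha=0.05):
    """Estimate test power at false-positive rate alpha (0 < alpha < 1).

    member_stats: test statistics for genomes inside the beacon.
    control_stats: test statistics for genomes outside the beacon.
    A genome is flagged as a member when its statistic is strictly
    below the threshold derived from the control group.
    """
    if not member_stats or not control_stats:
        raise ValueError("both groups must be non-empty")
    ordered = sorted(control_stats)
    k = int(alpha * len(ordered))      # controls allowed below the threshold
    threshold = ordered[k]
    flagged = sum(s < threshold for s in member_stats)
    return flagged / len(member_stats)
```

A power close to 1.0 means nearly every member can be re-identified at the chosen false-positive rate, while a low power indicates that the mitigation is masking the signal.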
13
Experiments Statistics
Beacon database
  5,046,666 variants from Chr10 and Chr21
  2,002,246 (39.7%) rare variants

  Chromosome   Variants    Rare variants
  Chr10        3,992,219   1,588,903 (39.8%)
  Chr21        1,054,447     413,343 (39.2%)
  Total        5,046,666   2,002,246 (39.7%)

Parameters
  Random flipping method: ℇ = 0.15 (flip 15% of rare SNPs)
  Strategic flipping method: k = 5 (flip the 5% of SNPs with the largest discriminative power)
14
Re-identification Risk (power of LRT test)
Under rare-first order: RF’s power increases to 1.0 (100% re-identification risk [1]) when 1000 rare SNPs are queried, while RTF’s remains around 0.3 (low re-identification risk [1]).
Under the other three query orders: SF’s power increases to 1.0 (100% re-identification risk [1]) when 1000 rare SNPs are queried, while RTF’s remains around 0.3 (low re-identification risk [1]).
Reference:
1. Raisaro, Jean Louis, et al. "Addressing Beacon re-identification attacks: quantification and mitigation of privacy risks." Journal of the American Medical Informatics Association (2017).
Chr1 & Chr21 for random & typical user
15
The percentage of flipped rare SNPs
Rare first: RF’s power increases to 1.0 when 1000 rare SNPs are queried, while RTF’s remains around 0.3.
Other three: SF’s power increases to 1.0 when 1000 rare SNPs are queried, while RTF’s remains around 0.3.
16
The percentage of flipped rare SNPs
17
The percentage of flipped SNPs
The percentage of flipped rare SNPs under different query patterns
18
The percentage of flipped SNPs
19
Secure-Beacon Workflow
20
Secure-Beacon Interface
22
Acknowledgements
NIH R01HG and U01EB
NSF CNS
Indiana University Initiative of Precision Health
24
Thank you!
Email me at: diybu(at)indiana.edu