Private Release of Graph Statistics using Ladder Functions
J. Zhang, G. Cormode, M. Procopiuc, D. Srivastava, X. Xiao


Private Release of Graph Statistics using Ladder Functions
J. Zhang, G. Cormode, M. Procopiuc, D. Srivastava, X. Xiao

Overview
• The Problem: Private Release of Graph Statistics
• Differential Privacy on Graphs
• Two "Solutions": Global Sensitivity (GS) and Local Sensitivity (LS)
  • Global Sensitivity
  • Local Sensitivity
• Ladder Functions: From LS to GS
• Formal Results and Contributions
• Experiments

Data Release: a company or institute releases data to the public, which may include an adversary.

Private Data Release: a private algorithm answers the user's query with a noisy answer.
• Objective 1: the noisy answer should reveal little about any individual in the database.
• Objective 2: the noisy answer should be as accurate as possible.

Differential Privacy on Graphs: a differentially private algorithm injects noise into the query answer in order to cover the maximum impact of a single relationship (an edge). Differentially private answer: binary answer + 1*noise (the true answer is 1 or 0).

Differential Privacy on Graphs. Query: how many edges? Differentially private answer: true answer + 1*noise. (Slide figure: neighboring graphs whose edge counts differ by one, 6 vs. 5 and 15 vs. 14.)

Differential Privacy on Graphs. Query: how many triangles? Differentially private answer: true answer + 4*noise. (Slide figure: neighboring graphs with 20 vs. 16 triangles.) If there are n nodes: true answer + (n-2)*noise.
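The (n-2) factor is the global sensitivity of triangle counting. A short justification (not on the slide, but standard): adding a single edge (u, v) creates one new triangle for each common neighbor of u and v, and two nodes can share at most n-2 common neighbors, so
\[
GS_{\triangle} \;=\; \max_{g \sim g'} \bigl| f_{\triangle}(g) - f_{\triangle}(g') \bigr| \;=\; n - 2 .
\]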

Differential Privacy on Graphs. Query: how many triangles? The example graph needs only 4*noise, but the noise must cover the worst case over all graphs. In a real dataset, CondMat from SNAP, #triangles = 173,361 while n-2 = 23,131.
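To make the slide's point concrete with a standard calculation (not shown on the slide; ε = 1 is an assumed value), GS-calibrated Laplace noise would give
\[
\tilde{f}(G) \;=\; f(G) + \mathrm{Lap}\!\left(\tfrac{n-2}{\varepsilon}\right),
\qquad
\mathbb{E}\,\bigl|\tilde{f}(G) - f(G)\bigr| \;=\; \tfrac{n-2}{\varepsilon} \;=\; 23{,}131 \ \text{ for } \varepsilon = 1,
\]
i.e. roughly 13% of the true count 173,361 in expectation, from the noise alone.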

Formal Definition [TCC’06]
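For reference (the formula itself does not survive in the transcript), the standard ε-differential privacy definition from [TCC'06], stated for neighboring graphs g ~ g' that differ in one edge, is:
\[
\Pr[\mathcal{A}(g) \in S] \;\le\; e^{\varepsilon} \cdot \Pr[\mathcal{A}(g') \in S]
\qquad \text{for all neighboring } g \sim g' \text{ and all output sets } S .
\]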

Global Sensitivity [TCC’06]
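For reference (again not reproduced in the transcript), the standard definitions from [TCC'06]: the global sensitivity of a query f, and the Laplace mechanism calibrated to it, are
\[
GS_f \;=\; \max_{g \sim g'} \bigl| f(g) - f(g') \bigr|,
\qquad
\mathcal{A}(g) \;=\; f(g) + \mathrm{Lap}\!\left(\frac{GS_f}{\varepsilon}\right),
\]
and this mechanism satisfies ε-differential privacy.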

Global Sensitivity

Local Sensitivity [STOC'07]: on the example graph g, GS = 4 but LS(g) = 1.
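For reference, local sensitivity [STOC'07] takes the maximum only over neighbors of the actual input graph g:
\[
LS_f(g) \;=\; \max_{g' \,\sim\, g} \bigl| f(g) - f(g') \bigr| \;\le\; GS_f .
\]
Note that noise calibrated directly to LS_f(g) is not differentially private, since the noise scale itself depends on g and can leak information, which is presumably why the overview puts "Solutions" in quotes.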

Global Sensitivity

Local Sensitivity

Example: GS = 4, but LS(g) = 1.

Local Sensitivity

Ladder Functions: we change the slope gradually.

Ladder Functions: Summary

Formal Results. *In the paper, we present a discrete version of ladder functions.

Graph Statistics: triangle counting

Formal Results. For efficient sampling, we adopt the Exponential Mechanism (please refer to Section 3.2 of the paper).
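As a rough illustration of exponential-mechanism sampling (a minimal sketch only; the quality function, sensitivity value, and candidate set below are placeholder assumptions and do not reproduce the paper's ladder-based scores from Section 3.2):

```python
import math
import random

def exponential_mechanism(candidates, quality, sensitivity, epsilon):
    """Sample one candidate with probability proportional to
    exp(epsilon * quality(c) / (2 * sensitivity)).

    `quality` is a placeholder scoring function; in the ladder mechanism the
    score of a candidate output depends on which "rung" (distance band around
    the true answer) it falls into, which this sketch does not reproduce.
    """
    # Shift scores by their maximum for numerical stability before exponentiating.
    scores = [epsilon * quality(c) / (2.0 * sensitivity) for c in candidates]
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    # Roulette-wheel sampling proportional to the weights.
    r = random.uniform(0.0, total)
    acc = 0.0
    for c, w in zip(candidates, weights):
        acc += w
        if r <= acc:
            return c
    return candidates[-1]

# Hypothetical usage: candidates closer to the true count score higher.
true_count = 42
cands = list(range(0, 100))
pick = exponential_mechanism(cands, lambda c: -abs(c - true_count),
                             sensitivity=1.0, epsilon=0.5)
```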

Our Contributions (triangle counting): these queries have been solved before by solutions that are also based on local sensitivity, but they either achieve a weakened version of differential privacy or are less accurate. Our solution is the most accurate, with a pure differential privacy guarantee.

Our Contributions (triangle counting): our solution is the first local-sensitivity-based solution with a pure differential privacy guarantee, and it is the most accurate.

Experiment Results: triangle counting on CondMat.

Conclusions
• We show that a pure differential privacy guarantee can be obtained for graph statistics, specifically subgraph counts.
• We propose ladder functions, which combine the merits of both GS and LS.
• Future work includes extending the ladder framework to:
  • large-scale graphs such as the Facebook friendship graph
  • functions outside the domain of graphs, e.g., the median of an array and machine learning tasks
Thank you!