Private Analysis of Graphs

Private Analysis of Graphs
Sofya Raskhodnikova, Penn State University (on sabbatical at BU for the 2013-2014 privacy year)
Joint work with Shiva Kasiviswanathan (GE Research), Kobbi Nissim (Ben-Gurion, Harvard, BU), and Adam Smith (Penn State, BU)

Publishing information about graphs

Many types of data can be represented as graphs, where nodes correspond to individuals and edges capture relationships between them. Examples include "friendships" in online social networks, financial transactions, email communication, health networks (of doctors and patients), and romantic relationships. In many situations, somebody wants to publish or release some information about these graphs, say for research, oversight, or advertising purposes. However, one has to be careful about how it is done, because these graphs contain very sensitive information. Privacy is a big issue!

Consider the graph of romantic relationships in one American high school from a famous sociological study [Bearman, Moody, Stovel, American J. Sociology]. In this case, the researchers decided to publish the entire largest connected component of the graph, after removing all information associated with each node except gender. Is this graph really anonymized? Taking a closer look at one blue node and its connections, one might wonder how the researchers managed to get this boy to sign the consent form for releasing his data. Even though they released no "identifying information", only "graph data", other participants in this study who knew just a bit about him could have learned much more. That is, admittedly, the most interesting node in the graph, but there are many other curious observations one can make: for instance, there is only one pink node with four neighbors. More generally, it has been pointed out that in real-world social networks, if we look at relatively small neighborhoods, each node's neighborhood is unique.

Private analysis of graph data

Setting: a trusted curator holds a sensitive graph G. Users (government, researchers, businesses, or a malicious adversary) send queries and receive answers. There are two conflicting goals: utility and privacy.

Private analysis of graph data

The same setting: a trusted curator holds a graph G gathered from the internet and social networks, and users (or a malicious adversary) send queries and receive answers. Why is privacy hard? Because of the presence of external information: we can't assume we know what sources an adversary has access to, and "anonymization" schemes for released datasets are regularly broken.

Some published attacks

Re-identifying individuals based on external sources:
- Social networks [Backstrom Dwork Kleinberg 07, Narayanan Shmatikov 09]
- Computer networks [Coull Wright Monrose Collins Reiter 07, Ribeiro Chen Miklau Townsley 08]
- Genetic data (GWAS) [Homer et al. 08, ...]
- Microtargeted advertising [Korolova 11]
- Recommendation systems [Calandrino Kilzer Narayanan Felten Shmatikov 11]

Composition attacks: an attacker combines independent anonymized releases, e.g., from Hospital A and Hospital B [Ganta Kasiviswanathan Smith 08].

Reconstruction attacks: combining multiple noisy statistics [Dinur Nissim 03, ...].

Who'd want to de-anonymize a social network graph?

- A government agency interested in surveillance.
- A phisher or a spammer, to craft a highly individualized, believable message.
- Marketers.
- Stalkers, nosy colleagues, employers, or neighbors.

Private analysis of graph data

Recall the two conflicting goals: utility and privacy. Utility means accurate answers. But what does privacy mean here? We want a definition that quantifies privacy loss, composes, and is robust to external information.

Differential privacy (for graph data)

This is the standard definition of differential privacy; the only innovation is that the usual picture with discs has been updated to a graph. Intuition: neighbors are datasets that differ only in some information we'd like to hide (e.g., one person's data).

Differential privacy [Dwork McSherry Nissim Smith 06]. An algorithm A is $\epsilon$-differentially private if for all pairs of neighbors $G, G'$ and all sets of answers $S$:
$$\Pr[A(G) \in S] \le e^{\epsilon} \cdot \Pr[A(G') \in S].$$

What does it mean for two graphs to be neighbors?

Two variants of differential privacy for graphs

- Edge differential privacy: two graphs are neighbors if they differ in one edge.
- Node differential privacy: two graphs are neighbors if one can be obtained from the other by deleting a node and its adjacent edges.

Node differential privacy is more in the spirit of protecting the privacy of each individual. However, this definition is significantly harder to satisfy, because the algorithm has to mask much larger changes in the graph. (A sketch enumerating both kinds of neighbors follows.)
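
To make the two neighbor relations concrete, here is a minimal sketch (Python with networkx; the function names are illustrative, not from the talk). For node privacy only the deletion direction is shown; the relation also includes the reverse, adding a node with arbitrary edges.

```python
from itertools import combinations
import networkx as nx

def edge_neighbors(G):
    """Yield graphs that differ from G in exactly one edge (edge-DP neighbors)."""
    for u, v in combinations(G.nodes(), 2):
        H = G.copy()
        if H.has_edge(u, v):
            H.remove_edge(u, v)
        else:
            H.add_edge(u, v)
        yield H

def node_neighbors(G):
    """Yield graphs obtained by deleting one node and its adjacent edges
    (node-DP neighbors, deletion direction only)."""
    for v in list(G.nodes()):
        H = G.copy()
        H.remove_node(v)
        yield H
```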

Node differentially private analysis of graphs

In the trusted-curator setting, we now ask for node differential privacy. The two conflicting goals, utility and privacy, are impossible to achieve together in the worst case. Previously, there were no node differentially private algorithms that are accurate on realistic graphs.

Our contributions

- The first node differentially private algorithms that are accurate for sparse graphs: they are node differentially private for all graphs, and accurate for a subclass of graphs that includes graphs with a sublinear (not necessarily constant) degree bound, graphs where the tail of the degree distribution is not too heavy, and dense graphs.
- Techniques for designing node differentially private algorithms.
- A methodology for analyzing the accuracy of such algorithms on realistic networks.

Concurrent work on node privacy: [Blocki Blum Datta Sheffet 13].

Our contributions: algorithms

(Scale-free = a network whose degree distribution follows a power law.)

Node differentially private algorithms for releasing:
- the number of edges,
- counts of small subgraphs (e.g., triangles, $k$-triangles, $k$-stars),
- the degree distribution.

We analyze the accuracy of our algorithms for graphs with a not-too-heavy-tailed degree distribution, formalized as $\alpha$-decay for constant $\alpha > 1$.

Notation: $\bar{d}$ = average degree of G; $P_d$ = fraction of nodes in G of degree $\ge d$.

Definition. A graph G satisfies $\alpha$-decay if $P_{t \cdot \bar{d}} \le t^{-\alpha}$ for all $t > 1$. (The slide shows a degree histogram with the thresholds $\bar{d}$ and $t \cdot \bar{d}$ marked on the degree axis.)

Every graph satisfies 1-decay. Natural graphs (e.g., "scale-free" graphs and Erdos-Renyi graphs) satisfy $\alpha$-decay for some $\alpha > 1$. A checker for this condition is sketched below.
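
As a rough empirical check of the definition, the following sketch (Python/networkx; the helper is hypothetical, not part of the paper) tests the $\alpha$-decay condition at integer values of $t$, which is only an approximation to the "for all $t > 1$" requirement since $P_d$ is a step function:

```python
def satisfies_alpha_decay(G, alpha):
    """Rough empirical check of alpha-decay: P_{t*dbar} <= t^(-alpha),
    tested at integer t = 2, 3, ... until t*dbar exceeds the max degree.
    P_d = fraction of nodes of degree >= d; dbar = average degree."""
    degs = [deg for _, deg in G.degree()]
    n = len(degs)
    dbar = sum(degs) / n
    if dbar == 0:
        return True  # edgeless graph: P_d = 0 for every d > 0
    t = 2
    while t * dbar <= max(degs):
        frac = sum(1 for deg in degs if deg >= t * dbar) / n
        if frac > t ** (-alpha):
            return False
        t += 1
    return True
```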

Our contributions: accuracy analysis

For graphs satisfying $\alpha$-decay with constant $\alpha > 1$, our node differentially private algorithms achieve:
- number of edges: a $(1+o(1))$-approximation;
- counts of small subgraphs (e.g., triangles, $k$-triangles, $k$-stars): a $(1+o(1))$-approximation;
- degree distribution: $\|A_{\epsilon,\alpha}(G) - \mathrm{DegDistrib}(G)\|_1 = o(1)$.

(Recall: a graph G satisfies $\alpha$-decay if $P_{t \cdot \bar{d}} \le t^{-\alpha}$ for all $t > 1$.)

Previous work on differentially private computations on graphs

Edge differentially private algorithms:
- number of triangles, MST cost [Nissim Raskhodnikova Smith 07]
- degree distribution [Hay Rastogi Miklau Suciu 09, Hay Li Miklau Jensen 09, Karwa Slavkovic 12]
- small subgraph counts [Karwa Raskhodnikova Smith Yaroslavtsev 11]
- cuts [Blocki Blum Datta Sheffet 12]

Edge private against a Bayesian adversary (weaker privacy):
- small subgraph counts [Rastogi Hay Miklau Suciu 09]

Node zero-knowledge private (stronger privacy):
- average degree, distances to the nearest connected, Eulerian, and cycle-free graphs (privacy only for bounded-degree graphs) [Gehrke Lui Pass 12]

Differential privacy basics

In the curator model, a user requests a statistic $f$ and the algorithm A returns an approximation to $f(G)$. How accurately can an $\epsilon$-differentially private algorithm release $f(G)$?

Global sensitivity framework [DMNS'06]

The first upper bound on the error was given in the paper that defined differential privacy. The global sensitivity of a function $f$ (here over node neighbors) is
$$\partial f = \max_{\text{node neighbors } G, G'} |f(G) - f(G')|.$$
For every function $f$, there is an $\epsilon$-differentially private algorithm that w.h.p. approximates $f$ with additive error $\partial f / \epsilon$.

Examples: let $f_{-}(G)$ be the number of edges in G and $f_{\triangle}(G)$ the number of triangles in G. Then $\partial f_{-} = n$ and $\partial f_{\triangle} = \binom{n}{2}$, since deleting one node and its adjacent edges can change these counts by that much.
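
A minimal sketch of the [DMNS'06] mechanism (Python; the noise scale $\partial f/\epsilon$ is the standard Laplace calibration, though the function names here are my own):

```python
import numpy as np

def global_sensitivity_mechanism(true_value, sensitivity, epsilon, rng=None):
    """Release true_value + Lap(sensitivity / epsilon): eps-DP whenever
    `sensitivity` upper-bounds the global sensitivity of the query."""
    rng = rng or np.random.default_rng()
    return true_value + rng.laplace(scale=sensitivity / epsilon)

# Releasing the edge count under node DP: sensitivity ~ n, so on a sparse
# graph (m = O(n)) the noise is of the same order as the answer itself.
n, m, eps = 10_000, 30_000, 0.5
noisy_m = global_sensitivity_mechanism(m, sensitivity=n, epsilon=eps)
```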

"Projections" on graphs of small degree

The starting point of our algorithms is a simple observation: global sensitivity is much smaller if we restrict our attention to bounded-degree graphs. Let $\mathcal{G}$ = the family of all graphs, and $\mathcal{G}_d$ = the family of graphs of degree $\le d$.

Notation. $\partial f$ = global sensitivity of $f$ over $\mathcal{G}$; $\partial_d f$ = global sensitivity of $f$ over $\mathcal{G}_d$.

Observation. $\partial_d f$ is low for many useful $f$. Examples: $\partial_d f_{-} = d$ (compare to $\partial f_{-} = n$) and $\partial_d f_{\triangle} = \binom{d}{2}$ (compare to $\partial f_{\triangle} = \binom{n}{2}$).

Idea: "project" onto graphs in $\mathcal{G}_d$ for a carefully chosen $d \ll n$, while keeping the goal of privacy for all graphs.

Method 1: Lipschitz extensions

A function $f'$ is a Lipschitz extension of $f$ from $\mathcal{G}_d$ to $\mathcal{G}$ if $f'$ agrees with $f$ on $\mathcal{G}_d$ and $\partial f' = \partial_d f$. In other words, on $\mathcal{G}_d$, where $\partial_d f$ is low, we have $f' = f$; outside of it, where $\partial f$ is high, $f'$ still has the low sensitivity $\partial_d f$.

We then release $f'$ via the global sensitivity framework [DMNS'06]. This requires designing a Lipschitz extension for each function $f$; we base ours on maximum flow and on linear and convex programs.

Lipschitz extension of $f_{-}$: flow graph

For a graph G = (V, E), define the flow graph of G as follows: it has a source $s$, a sink $t$, and two copies $v$ and $v'$ of every node of G. Add an edge $(s, v)$ of capacity $d$ for every node $v$, an edge $(v', t)$ of capacity $d$ for every node $v$, and a unit-capacity edge $(u, v')$ iff $\{u, v\} \in E$. Let $v_{\mathrm{flow}}(G)$ be the value of the maximum flow in this graph. (The slides illustrate the construction on a small example graph.)

Lemma. $v_{\mathrm{flow}}(G)/2$ is a Lipschitz extension of $f_{-}$.

Proof: (1) $v_{\mathrm{flow}}(G) = 2 f_{-}(G)$ for all $G \in \mathcal{G}_d$: when every degree is at most $d$, the flow can saturate both unit-capacity edges arising from each edge of G. (2) $\partial\, v_{\mathrm{flow}} = 2 \cdot \partial_d f_{-} = 2d$: a node change only affects the flow through one capacity-$d$ edge at the source and one at the sink.
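
The construction translates directly into code. A sketch using networkx's max-flow routine (the function name is mine; assumes an undirected simple graph):

```python
import networkx as nx

def v_flow_extension(G, d):
    """Lipschitz extension of the edge count: build the flow graph of G
    and return v_flow(G) / 2."""
    F = nx.DiGraph()
    for v in G.nodes():
        F.add_edge("s", ("L", v), capacity=d)        # source -> copy v
        F.add_edge(("R", v), "t", capacity=d)        # copy v' -> sink
    for u, v in G.edges():
        F.add_edge(("L", u), ("R", v), capacity=1)   # (u, v') for {u,v} in E
        F.add_edge(("L", v), ("R", u), capacity=1)   # (v, u') for {u,v} in E
    value, _ = nx.maximum_flow(F, "s", "t")
    return value / 2

G = nx.path_graph(6)                  # 5 edges, max degree 2
assert v_flow_extension(G, d=2) == 5  # agrees with f_-(G) on G_d
```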

Lipschitz extensions via linear/convex programs

For a graph G = ([n], E), define an LP with a variable $x_T$ for every triangle $T$ of G:

Maximize $\sum_{T = \triangle \text{ of } G} x_T$
subject to $0 \le x_T \le 1$ for all triangles $T$,
and $\sum_{T : v \in V(T)} x_T \le \binom{d}{2} = \partial_d f_{\triangle}$ for all nodes $v$.

Let $v_{\mathrm{LP}}(G)$ be the value of this LP.

Lemma. $v_{\mathrm{LP}}(G)$ is a Lipschitz extension of $f_{\triangle}$.

This can be generalized to other counting queries; other queries use convex programs. A small implementation sketch follows.
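
A sketch of this LP with scipy (the adjacency format and helper name are mine; linprog minimizes, so the objective is negated):

```python
from itertools import combinations
from math import comb
import numpy as np
from scipy.optimize import linprog

def v_lp_extension(adj, d):
    """Lipschitz extension of the triangle count via the LP above.
    `adj` maps each node to the set of its neighbors."""
    nodes = sorted(adj)
    index = {v: i for i, v in enumerate(nodes)}
    tris = [t for t in combinations(nodes, 3)
            if t[1] in adj[t[0]] and t[2] in adj[t[0]] and t[2] in adj[t[1]]]
    if not tris:
        return 0.0
    A = np.zeros((len(nodes), len(tris)))   # one row per node-cap constraint
    for j, t in enumerate(tris):
        for v in t:
            A[index[v], j] = 1.0
    res = linprog(c=-np.ones(len(tris)),    # maximize the sum of x_T
                  A_ub=A, b_ub=np.full(len(nodes), comb(d, 2)),
                  bounds=[(0, 1)] * len(tris), method="highs")
    return -res.fun

# On a d-bounded graph the LP value equals the exact triangle count:
tri = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}   # K_3, max degree 2
print(v_lp_extension(tri, d=2))            # 1.0
```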

Method 2: Generic reduction to privacy over $\mathcal{G}_d$

Input: an algorithm B that is node-DP over $\mathcal{G}_d$.
Output: an algorithm A that is node-DP over $\mathcal{G}$ and has accuracy similar to B on "nice" graphs, with Time(A) = Time(B) + O(m + n). The reduction works for all functions $f$.

How it works: the truncation $T(G)$ outputs G with all nodes of degree $> d$ removed, and we answer queries on $T(G)$ instead of G. The mechanism releases $f(T(G)) + \mathrm{noise}(S_T(G) \cdot \partial_d f)$, where $S_T(G)$ is a smooth bound on the local sensitivity of $T$. The noise can be calibrated via the Smooth Sensitivity framework [NRS'07], or by finding a DP upper bound $\ell$ on the local sensitivity [Dwork Lei 09, KRSY'11] and running any algorithm that is $(\epsilon/\ell)$-node-DP over $\mathcal{G}_d$.

Generic reduction via truncation

The truncation $T(G)$ removes all nodes of degree $> d$. On query $f$, answer $A(G) = f(T(G)) + \mathrm{noise}$. How much noise? Consider the local sensitivity of $T$ as a map from graphs to graphs, where $\mathrm{dist}(G, G')$ = the number of node changes needed to go from G to G':
$$LS_T(G) = \max_{G'\,\text{neighbor of}\,G} \mathrm{dist}(T(G), T(G')).$$

Lemma. $LS_T(G) \le 1 + \max(n_d, n_{d+1})$, where $n_i$ = #{nodes of degree $i$}. (The nodes of degree $d$ and $d+1$, sitting right at the truncation threshold, are the ones that determine $LS_T(G)$.) The global sensitivity of $T$, by contrast, is too large to calibrate noise to.
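
Both the truncation and the lemma's bound are straightforward to compute; a sketch (Python/networkx, function names are mine):

```python
from collections import Counter

def truncate(G, d):
    """T(G): the subgraph induced on the nodes of degree <= d."""
    return G.subgraph([v for v in G.nodes() if G.degree(v) <= d]).copy()

def local_sensitivity_bound(G, d):
    """The lemma's bound LS_T(G) <= 1 + max(n_d, n_{d+1}), where n_i is
    the number of nodes of degree exactly i."""
    hist = Counter(deg for _, deg in G.degree())
    return 1 + max(hist.get(d, 0), hist.get(d + 1, 0))
```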

Smooth sensitivity of truncation

Smooth Sensitivity framework [NRS'07]: $S_f(G)$ is a smooth bound on the local sensitivity of $f$ if
- $S_f(G) \ge LS_f(G)$, and
- $S_f(G) \le e^{\epsilon} S_f(G')$ for all neighbors $G$ and $G'$.

Lemma. $S_T(G) = \max_{k \ge 0} e^{-\epsilon k} \left(1 + \sum_{i=d-(k+1)}^{d+(k+1)} n_i \right)$ is a smooth bound for $T$, computable in time $O(m+n)$.

"Chain rule": $S_T(G) \cdot \partial_d f$ is a smooth bound for $f \circ T$, so the mechanism can release $f(T(G)) + \mathrm{noise}(S_T(G) \cdot \partial_d f)$.
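
A direct sketch of the lemma's formula (the summation window $d \pm (k+1)$ is my reconstruction of the slide's garbled bounds, and this naive loop is quadratic in the worst case rather than the $O(m+n)$ of the paper's computation):

```python
import math
from collections import Counter

def smooth_sensitivity_of_truncation(G, d, epsilon):
    """S_T(G) = max over k >= 0 of e^(-eps*k) * (1 + sum_{i=d-(k+1)}^{d+(k+1)} n_i)."""
    hist = Counter(deg for _, deg in G.degree())
    n = G.number_of_nodes()
    best = 0.0
    for k in range(n + 1):                  # k > n cannot increase the max
        lo, hi = max(d - (k + 1), 0), d + (k + 1)
        window = sum(hist.get(i, 0) for i in range(lo, hi + 1))
        best = max(best, math.exp(-epsilon * k) * (1 + window))
    return best
```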

Utility of the truncation mechanism

Lemma. For every G and $d$, if we truncate to a random degree threshold in $[2d, 3d]$, then
$$\mathbb{E}[S_T(G)] \le \left(\sum_{i=d}^{n-1} n_i\right) \frac{3 \log n}{\epsilon d} + \frac{1}{\epsilon} + 1.$$

Utility: if G is $d$-bounded, the expected noise magnitude is $O\!\left(\partial_{3d} f / \epsilon^2\right)$.

Application to releasing the degree distribution: an $\epsilon$-node differentially private algorithm $A_{\epsilon,\alpha}$ such that $\|A_{\epsilon,\alpha}(G) - \mathrm{DegDistrib}(G)\|_1 = o(1)$ with probability at least $2/3$ if G satisfies $\alpha$-decay for $\alpha > 2$.
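
Putting the pieces together, a hedged end-to-end sketch for releasing the edge count (it reuses truncate and smooth_sensitivity_of_truncation from the sketches above; the Cauchy noise is one of the distributions the smooth-sensitivity framework [NRS'07] admits for pure $\epsilon$-DP, and the constants are simplified relative to the paper):

```python
import numpy as np

def node_dp_edge_count(G, d, epsilon, rng=None):
    """Release f_-(T(G)) + noise(S_T(G) * threshold), with the truncation
    threshold drawn uniformly from [2d, 3d] as in the utility lemma."""
    rng = rng or np.random.default_rng()
    threshold = int(rng.integers(2 * d, 3 * d + 1))   # random degree cutoff
    TG = truncate(G, threshold)
    S = smooth_sensitivity_of_truncation(G, threshold, epsilon)
    scale = S * threshold / epsilon   # restricted sensitivity of f_- is d
    return TG.number_of_edges() + rng.standard_cauchy() * scale
```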

Techniques used to obtain our results

Node differentially private algorithms for releasing:
- the number of edges and counts of small subgraphs (e.g., triangles, $k$-triangles, $k$-stars): via Lipschitz extensions;
- the degree distribution: via the generic reduction.

Conclusions

It is possible to design node differentially private algorithms with good utility on sparse graphs. One can even first test, privately, whether the graph is sparse.

Directions for future work:
- node-private algorithms for releasing cuts;
- node-private synthetic graphs;
- what are the right notions of privacy for graph data?