Seminar in Foundations of Privacy: 1. Adding Consistency to Differential Privacy, 2. Attacks on Anonymized Social Networks. Inbal Talgam, March 2008

1. Adding Consistency to Differential Privacy

Differential Privacy
1977, Dalenius: the risk to one's privacy should be the same with or without access to the DB.
Dwork & Naor: impossible in general (auxiliary information).
Dwork et al.: the risk is the same with or without participating in the DB. Plus: a strong mechanism of calibrated noise that achieves DP while maintaining accuracy.
Barak et al.: adding consistency.

Setting: Contingency Table and Marginals
A DB of n participants, each described by k binary attributes.
Contingency table (private): one cell per attribute setting, 2^k cells in total, each counting the participants with that setting.
Marginals (public): projections of the table onto small subsets of attributes, e.g. 2^j cells for a marginal over j attributes, with j << k.
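
A minimal sketch of this setting (not from the slides): the values of k and n and the records are hypothetical, and indexing cells by the binary expansion of the attribute setting is just a convention of the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

k = 3                 # number of binary attributes (hypothetical small example)
n = 100               # number of participants
records = rng.integers(0, 2, size=(n, k))   # one row of k attribute bits per participant

# Contingency table x: 2^k cells, one per attribute setting; bit i of the cell
# index holds the value of attribute i (a convention of this sketch).
x = np.zeros(2 ** k)
for row in records:
    idx = sum(int(bit) << i for i, bit in enumerate(row))
    x[idx] += 1

assert x.sum() == n   # ||x||_1 = n: the cells partition the n participants
print(x)
```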

Main Contribution
Solve the following consistency problem at low accuracy cost: adding noise directly to the marginals can produce values (negative or undefined entries) that do not arise from any single underlying contingency table.

Outline
Discussion of: 1. Privacy, 2. Accuracy & Consistency
Key method: the Fourier basis
The algorithm: Part I, Part II

Privacy: Definition
Intuition: the risk is the same with or without participating in the DB.
Definition: a randomized function K gives ε-differential privacy if for all DB1, DB2 differing on at most 1 element, and for every set S of outputs,
Pr[K(DB1) ∈ S] ≤ exp(ε) · Pr[K(DB2) ∈ S].

Privacy: Mechanism
Goal: instead of the exact answer f(DB), release K(DB) = f(DB) + Noise.
Laplace noise: Pr[K(DB) = a] ∝ exp(-||f(DB) - a||_1 / σ).

The Calibrated Noise Mechanism for DP
Main idea: the amount of noise added to f(DB) is calibrated to the sensitivity of f, denoted Δf.
Definition: for f : D → R^d, the L1-sensitivity of f is Δf = max ||f(DB1) - f(DB2)||_1 over all DB1, DB2 differing on at most 1 element.
All useful functions should be insensitive (e.g. marginals).

The Calibrated Noise Mechanism: How Much Noise
Main result: to ensure ε-differential privacy for a query of sensitivity Δf, add Laplace noise with σ = Δf/ε.
Why does it work? Recall Pr[K(DB) = a] ∝ exp(-||f(DB) - a||_1 / σ). For neighboring DB1, DB2 the exponents differ by at most ||f(DB1) - f(DB2)||_1 / σ ≤ Δf/σ = ε, so the two output distributions agree up to a factor of exp(ε) at every point.
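
A minimal sketch of such a calibrated Laplace release; the counting-query example and parameter values are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def laplace_mechanism(value, sensitivity, epsilon, rng=None):
    """Release value + Laplace noise with scale sensitivity/epsilon (epsilon-DP)."""
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.laplace(loc=0.0, scale=sensitivity / epsilon, size=np.shape(value))
    return value + noise

# Illustrative example: a counting query ("how many participants satisfy P?")
# has L1-sensitivity 1, since one participant changes the count by at most 1.
true_count = 42
print(laplace_mechanism(true_count, sensitivity=1.0, epsilon=0.5))
```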

Accuracy & Consistency
Option 1: Contingency table → Marginals, then add noise to the marginals. This compromises consistency: the noisy marginals (entries like -0.5 or undefined values) may not correspond to any single table, which leads to technical problems and confusion ("so smoking is one of the leading causes of statistics?").
Option 2: Contingency table + Noise, then compute marginals. This compromises accuracy: each marginal cell aggregates non-calibrated, binomial noise from exponentially many table cells, with variance Θ(2^k).

Key Approach
Move to a non-redundant representation, specific to the required marginals: a small number of coefficients in the Fourier basis. Then add noise, solve a linear program, and round.
Consistency: any set of Fourier coefficients corresponds to a (fractional and possibly negative) contingency table.
Accuracy: few Fourier coefficients are needed for low-order marginals, so low sensitivity and small error.

Accuracy: What is Guaranteed
Let C be a set of original marginals, each over at most j attributes, and let C' be the released marginals. With probability 1-δ, every released marginal is close to the original one; the guaranteed bound grows with the order j and number of the required marginals and with 1/ε (see the revisited bound below).
Remark: an advantage of working in the interactive model.

Outline
Discussion of: 1. Privacy, 2. Accuracy & Consistency
Key method: the Fourier basis
The algorithm: Part I, Part II

Notation & Preliminaries
||x||_1 = n: the cells of the contingency table count the n participants.
We say α ≤ β if β has all of α's attributes (and possibly more), e.g. 0101 ≤ 0111 but not 0110 ≤ 0101.
The linear marginal operator C_β: β determines the attributes. C_β(x) is indexed by settings γ of those attributes, and (C_β(x))_γ sums the cells x_δ over all full attribute settings δ that agree with γ on the attributes of β.
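
A small sketch of the partial order and the marginal operator, under the bit-per-attribute indexing convention assumed earlier; the 4-cell example table is made up.

```python
import numpy as np

def is_subset(alpha, beta):
    """alpha <= beta: every attribute selected by alpha is also selected by beta (bitmask test)."""
    return alpha & beta == alpha

def marginal(x, beta, k):
    """C_beta(x): project the 2^k contingency table x onto the attributes selected by beta.

    The result is indexed by settings of the selected attributes; each entry sums the
    cells of x that agree with that setting on those attributes.
    """
    attrs = [i for i in range(k) if (beta >> i) & 1]     # positions of selected attributes
    out = np.zeros(2 ** len(attrs))
    for gamma in range(2 ** k):
        sub = sum(((gamma >> i) & 1) << j for j, i in enumerate(attrs))
        out[sub] += x[gamma]
    return out

# Example with k = 2: a 4-cell table, marginal on one attribute (beta = 0b01),
# i.e. two entries, each summing over the other attribute.
x = np.array([5.0, 3.0, 2.0, 7.0])
print(marginal(x, beta=0b01, k=2))

# The slide's example of the partial order:
assert is_subset(0b0101, 0b0111) and not is_subset(0b0110, 0b0101)
```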

The Fourier Basis
An orthonormal basis {f_α} for the space of contingency tables x (R^{2^k}), with entries (f_α)_γ = (-1)^{⟨α,γ⟩} / 2^{k/2}, where ⟨α,γ⟩ counts the attributes on which both α and γ are 1.
Motivation: any marginal C_β(x) can be written as a combination of few f_α's.
How few? Depends on the order of the marginal.
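
A sketch of these basis vectors (the standard ±2^{-k/2} Walsh vectors, which is what orthonormality forces here), with a numerical orthonormality check.

```python
import numpy as np

def fourier_vector(alpha, k):
    """f_alpha in R^{2^k}: (f_alpha)_gamma = (-1)^{<alpha, gamma>} / 2^{k/2}."""
    gammas = np.arange(2 ** k)
    parities = np.array([bin(alpha & g).count("1") % 2 for g in gammas])
    return ((-1.0) ** parities) / 2 ** (k / 2)

k = 3
F = np.stack([fourier_vector(a, k) for a in range(2 ** k)])
# Orthonormality: the Gram matrix F F^T should be the identity.
assert np.allclose(F @ F.T, np.eye(2 ** k))
```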

Writing Marginals in the Fourier Basis
Theorem: the marginal of x on attributes β satisfies C_β(x) = Σ_{α ≤ β} ⟨f_α, x⟩ · C_β(f_α); in other words, it is determined by the Fourier coefficients ⟨f_α, x⟩ with α ≤ β alone.
Proof sketch: write x in the Fourier basis, x = Σ_α ⟨f_α, x⟩ f_α, and apply C_β by linearity. For any coordinate of C_β(f_α) with α ≰ β, the definition of the marginal operator and of the Fourier vector shows that the ±1 signs being summed cancel, so C_β(f_α) = 0 unless α ≤ β.
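
A self-contained numerical check of this identity on a random table; the helper definitions repeat the earlier sketches and the table values are arbitrary.

```python
import numpy as np

def fourier_vector(alpha, k):
    """(f_alpha)_gamma = (-1)^{<alpha, gamma>} / 2^{k/2}."""
    gammas = np.arange(2 ** k)
    parities = np.array([bin(alpha & g).count("1") % 2 for g in gammas])
    return ((-1.0) ** parities) / 2 ** (k / 2)

def marginal(x, beta, k):
    """C_beta(x): sum the cells of x over the attributes not selected by beta."""
    attrs = [i for i in range(k) if (beta >> i) & 1]
    out = np.zeros(2 ** len(attrs))
    for gamma in range(2 ** k):
        sub = sum(((gamma >> i) & 1) << j for j, i in enumerate(attrs))
        out[sub] += x[gamma]
    return out

k, beta = 3, 0b011
rng = np.random.default_rng(1)
x = rng.integers(0, 10, size=2 ** k).astype(float)   # a hypothetical contingency table

# Direct marginal vs. reconstruction from the coefficients with alpha <= beta only.
direct = marginal(x, beta, k)
recon = sum(np.dot(fourier_vector(a, k), x) * marginal(fourier_vector(a, k), beta, k)
            for a in range(2 ** k) if a & beta == a)
assert np.allclose(direct, recon)
print(direct)
```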

Outline
Discussion of: 1. Privacy, 2. Accuracy & Consistency
Key method: the Fourier basis
The algorithm: Part I (adding calibrated noise), Part II (non-negativity by linear programming)

Algorithm, Part I
INPUT: the required marginals {C_β}.
{f_α} = the Fourier vectors needed to write those marginals (all α ≤ β for some required β); releasing the marginals {C_β(x)} reduces to releasing these coefficients.
OUTPUT: noisy coefficients {Φ_α}.
METHOD: add calibrated Laplace noise. The sensitivity depends on the number of coefficients |{α}|, which in turn depends on the order of the C_β's.
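
A sketch of Part I under the conventions above. The sensitivity calibration used here (|{α}| / 2^{k/2} for add/remove-one-record neighbors) is an assumption of this sketch and may differ from the paper's exact constant; the input data and parameters are made up.

```python
import numpy as np

def fourier_vector(alpha, k):
    gammas = np.arange(2 ** k)
    parities = np.array([bin(alpha & g).count("1") % 2 for g in gammas])
    return ((-1.0) ** parities) / 2 ** (k / 2)

def noisy_fourier_coefficients(x, betas, k, epsilon, rng=None):
    """Part I sketch: Fourier coefficients needed for the marginals in `betas`, plus Laplace noise."""
    rng = np.random.default_rng() if rng is None else rng
    # All alpha <= beta for some required beta.
    alphas = sorted({a for b in betas for a in range(2 ** k) if a & b == a})
    coeffs = np.array([np.dot(fourier_vector(a, k), x) for a in alphas])
    # Assumption of this sketch: changing one record moves one cell of x by 1, so each
    # coefficient moves by at most 2^{-k/2}; the vector of |alphas| coefficients then has
    # L1-sensitivity |alphas| / 2^{k/2}.
    sensitivity = len(alphas) / 2 ** (k / 2)
    noisy = coeffs + rng.laplace(scale=sensitivity / epsilon, size=len(alphas))
    return dict(zip(alphas, noisy))

# Usage with hypothetical data: k = 3 attributes, marginals on attributes {0,1} and {2}.
k = 3
x = np.array([4.0, 1.0, 3.0, 2.0, 0.0, 5.0, 2.0, 3.0])
phi = noisy_fourier_coefficients(x, betas=[0b011, 0b100], k=k, epsilon=1.0)
print(phi)
```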

Part II: Non-negativity by LP
INPUT: noisy coefficients {Φ_α}.
OUTPUT: non-negative contingency table x'.
METHOD: minimize the difference between the Fourier coefficients of x' and the noisy ones:
minimize b subject to x'_γ ≥ 0 for all γ and |Φ_α - ⟨f_α, x'⟩| ≤ b for all α.
Most entries x'_γ in a vertex solution are 0, so rounding adds only small error.
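
A sketch of Part II as a linear program solved with scipy.optimize.linprog; the noisy coefficients in the usage example are made up.

```python
import numpy as np
from scipy.optimize import linprog

def fourier_vector(alpha, k):
    gammas = np.arange(2 ** k)
    parities = np.array([bin(alpha & g).count("1") % 2 for g in gammas])
    return ((-1.0) ** parities) / 2 ** (k / 2)

def fit_nonnegative_table(phi, k):
    """Part II sketch: find x' >= 0 whose Fourier coefficients are uniformly close to phi.

    Variables are (x'_0, ..., x'_{2^k - 1}, b); minimize b subject to
    |phi_alpha - <f_alpha, x'>| <= b for every alpha in phi.
    """
    n_cells = 2 ** k
    c = np.zeros(n_cells + 1)
    c[-1] = 1.0                                   # objective: minimize b
    A_ub, b_ub = [], []
    for a in sorted(phi):
        f = fourier_vector(a, k)
        # <f, x'> - b <= phi_a  and  -<f, x'> - b <= -phi_a  encode |phi_a - <f, x'>| <= b.
        A_ub.append(np.append(f, -1.0))
        b_ub.append(phi[a])
        A_ub.append(np.append(-f, -1.0))
        b_ub.append(-phi[a])
    bounds = [(0, None)] * n_cells + [(0, None)]  # x' >= 0 and b >= 0
    res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub), bounds=bounds, method="highs")
    return res.x[:n_cells], res.x[-1]

# Usage with hypothetical noisy coefficients for k = 2.
phi = {0b00: 2.6, 0b01: 0.4, 0b10: -0.3}
x_prime, err = fit_nonnegative_table(phi, k=2)
print(np.round(x_prime, 3), err)
```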

Algorithm Summary
Input: contingency table x, required marginals {C_β}. Output: the marginals {C_β} of a new contingency table x''.
Part I: determine {f_α}, the Fourier vectors needed to write the marginals, and compute noisy Fourier coefficients {Φ_α}.
Part II: find a non-negative x' with nearly the correct Fourier coefficients, then round to x''.

Accuracy Guarantee, Revisited
With probability 1-δ, each released marginal is close to the original one; the error bound scales with the number of Fourier coefficients |{α}| used (and hence with the order of the required marginals) and with 1/ε, not with the size of the database.

Summary & Open Questions
An algorithm for marginals release that guarantees privacy, accuracy and consistency.
Consistency: one can reconstruct a synthetic, consistent table.
Accuracy: error increases smoothly with the order of the marginals.
Open questions: improving efficiency; the effect of noise on the marginals' statistical properties.

Any Questions?