Privacy-MaxEnt: Integrating Background Knowledge in Privacy Quantification
Wenliang (Kevin) Du, Zhouxuan Teng, and Zutao Zhu
Department of Electrical Engineering & Computer Science, Syracuse University, Syracuse, New York

Introduction
- Privacy-preserving data publishing.
- The impact of background knowledge: How does it affect privacy? How can we measure its impact on privacy?
- Integrating background knowledge into privacy quantification: Privacy-MaxEnt, a systematic approach based on well-established theories.
- Evaluation.

Privacy-Preserving Data Publishing
- Data disguise methods: randomization, generalization (e.g., Mondrian), and bucketization (e.g., Anatomy).
- Our Privacy-MaxEnt method can be applied to both generalization and bucketization; we use bucketization in this presentation.

Data Sets
(Table columns: Identifier, Quasi-Identifier (QI), Sensitive Attribute (SA).)

Bucketized Data
(Table columns: Quasi-Identifier (QI), Sensitive Attribute (SA), grouped into buckets.)
P( Breast cancer | { female, college }, bucket = 1 ) = 1/4
P( Breast cancer | { female, junior }, bucket = 2 ) = 1/3
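To make the arithmetic concrete, here is a minimal sketch of where these numbers come from; the `buckets` table below is a hypothetical stand-in for the slide's figure, not the authors' data:

```python
from fractions import Fraction

# Hypothetical bucketized table: each bucket publishes its QI tuples and its
# SA values as two separate columns, with the within-bucket linkage destroyed.
buckets = {
    1: {"qi": [("female", "college"), ("female", "college"),
               ("male", "college"), ("female", "graduate")],
        "sa": ["Breast cancer", "Flu", "Flu", "Diabetes"]},
    2: {"qi": [("female", "junior"), ("male", "junior"), ("female", "junior")],
        "sa": ["Breast cancer", "Flu", "Flu"]},
}

def p_sa_given_qi(s, q, b):
    """Without background knowledge, every pairing of SA values with QI
    tuples inside a bucket is equally likely, so P(s | q, bucket=b) is just
    the frequency of s among the bucket's SA values (q is ignored precisely
    because the published data no longer links QI to SA within a bucket)."""
    sa = buckets[b]["sa"]
    return Fraction(sa.count(s), len(sa))

print(p_sa_given_qi("Breast cancer", ("female", "college"), 1))  # 1/4
print(p_sa_given_qi("Breast cancer", ("female", "junior"), 2))   # 1/3
```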

Impact of Background Knowledge
- Background knowledge: it is rare for males to have breast cancer.
- Knowing this, an adversary can rule out assignments that pair male QI tuples with breast cancer, sharpening P(S | Q) beyond the published bucket frequencies. Carrying out this analysis by hand is infeasible for large data sets.

Previous Studies
- Martin et al., ICDE'07: the first formal study of background knowledge.
- Chen, LeFevre, and Ramakrishnan, VLDB'07: improves on the previous work.
- Both deal with rule-based, deterministic knowledge.
- Background knowledge can be much more complicated: it can be uncertain.

Complicated Background Knowledge
- Rule-based knowledge: P(s | q) = 1, or P(s | q) = 0.
- Probability-based knowledge: P(s | q) = 0.2, or P(s | Alice) = 0.2.
- Vague background knowledge: 0.3 ≤ P(s | q) ≤ 0.5.
- Miscellaneous types: P(s | q1) + P(s | q2) = 0.7; "One of Alice and Bob has Lung Cancer."

Challenges
- How can we analyze privacy systematically for large data sets and complicated background knowledge? Directly computing P(S | Q) is hard.
- What do we want to compute? P(S | Q), given the background knowledge and the published data set; P(S | Q) is the primitive underlying most privacy metrics.

Our Approach
Consider P(S | Q) as a variable x (a vector). The background knowledge and the published data (the public information) each translate into constraints on x; we then solve for x, taking the most unbiased solution that satisfies all the constraints.

Maximum Entropy Principle
"Information theory provides a constructive criterion for setting up probability distributions on the basis of partial knowledge, and leads to a type of statistical inference which is called the maximum entropy estimate. It is the least biased estimate possible on the given information." — E. T. Jaynes, 1957.

The MaxEnt Approach
The background knowledge and the published data (the public information) become constraints on P(S | Q); the maximum entropy estimate then yields the estimated P(S | Q).

Entropy
Because H(S | Q, B) = H(Q, S, B) − H(Q, B), the constraints should use P(Q, S, B) as the variables.

Maximum Entropy Estimate
- Let the vector x = P(Q, S, B).
- Find the value of x that maximizes the entropy H(Q, S, B) while satisfying:
  h1(x) = c1, …, hu(x) = cu (equality constraints)
  g1(x) ≤ d1, …, gv(x) ≤ dv (inequality constraints)
- This is a special case of non-linear programming.
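A minimal sketch of this optimization using SciPy's general-purpose NLP solver (not the authors' implementation); `A_eq, c_eq` encode the equality constraints h(x) = c and `A_ub, d_ub` the inequalities g(x) ≤ d, all assumed linear and built elsewhere:

```python
import numpy as np
from scipy.optimize import minimize

def maxent_estimate(A_eq, c_eq, A_ub, d_ub, n):
    """Maximize H(x) = -sum_i x_i log x_i over x = P(Q, S, B) subject to
    linear equality and inequality constraints: a special case of
    non-linear programming (concave objective, linear constraints)."""
    def neg_entropy(x):
        x = np.clip(x, 1e-12, 1.0)            # guard against log(0)
        return float(np.sum(x * np.log(x)))
    cons = [{"type": "eq",   "fun": lambda x: A_eq @ x - c_eq},
            {"type": "ineq", "fun": lambda x: d_ub - A_ub @ x}]  # SLSQP wants fun >= 0
    x0 = np.full(n, 1.0 / n)                   # start from the uniform joint
    res = minimize(neg_entropy, x0, method="SLSQP",
                   bounds=[(0.0, 1.0)] * n, constraints=cons)
    return res.x                                # MaxEnt estimate of P(Q, S, B)
```

SLSQP handles both constraint types in one call, which keeps the sketch short; the Lagrange-multiplier-plus-LBFGS route the authors use (sketched later under Evaluation) scales better to many variables.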

Constraints from Knowledge
Background knowledge → constraints on P(Q, S, B).
- The linear model is quite generic.
- Conditional probability: P(S | Q) = P(Q, S) / P(Q).
- Background knowledge has nothing to do with the bucket B: P(Q, S) = P(Q, S, B=1) + … + P(Q, S, B=m).
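For illustration, a sketch of how one piece of probability-based knowledge, P(s | q) = p, becomes a linear equality on x = P(Q, S, B): rewrite it as P(q, s) − p · P(q) = 0 and expand both marginals into sums of joint cells. The `index` map from (q, s, b) cells to positions in x is hypothetical bookkeeping, not from the paper:

```python
import numpy as np

def knowledge_constraint(p, q, s, index, n_vars):
    """Return a row `a` with a @ x = 0 encoding P(s | q) = p, where
    `index[(q, s, b)]` gives the position of joint cell P(q, s, b) in x."""
    a = np.zeros(n_vars)
    for (qq, ss, bb), i in index.items():
        if qq != q:
            continue
        if ss == s:
            a[i] += 1.0   # cell contributes to P(q, s) = sum_b P(q, s, b)
        a[i] -= p         # every (q, *, *) cell contributes to -p * P(q)
    return a              # append to A_eq with right-hand side 0
```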

Constraints from Published Data
Published data set D′ → constraints on P(Q, S, B).
- The constraints must state the truth and only the truth: they must be absolutely correct for the original data set, with no inference.

Assignment and Constraints
Observation: within each bucket, the published data fixes the multiset of QI values and the multiset of SA values but hides which SA value belongs to which record, so the original data is just one of the possible assignments.
Constraint: a valid constraint must be true for all possible assignments.

QI Constraint
Constraint: for each QI value q and bucket b, Σs P(q, s, b) = (number of records in bucket b with QI value q) / N, where N is the size of the table.
Example: if q = {female, college} occurs twice in bucket 1 of an N-record table, then Σs P({female, college}, s, 1) = 2/N.

SA Constraint
Constraint: for each sensitive value s and bucket b, Σq P(q, s, b) = (number of records in bucket b with sensitive value s) / N.
Example: if "Breast cancer" occurs once in bucket 2 of an N-record table, then Σq P(q, Breast cancer, 2) = 1/N.

Zero Constraint
- P(q, s, b) = 0 if q or s does not appear in bucket b.
- These constraints let us reduce the number of variables.
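Putting the three data-derived constraint families together, a sketch that builds the QI and SA equality rows (using the formulas as reconstructed above) and respects the zero constraint by never allocating variables for (q, s, b) cells where q or s is absent from bucket b; it reuses the hypothetical `buckets` structure from the earlier sketch:

```python
import numpy as np

def build_index(buckets):
    """Allocate variables only for cells permitted by the zero constraint."""
    index = {}
    for b, bk in buckets.items():
        for q in set(bk["qi"]):
            for s in set(bk["sa"]):
                index[(q, s, b)] = len(index)
    return index

def data_constraints(buckets, index):
    """QI rows: sum_s P(q, s, b) = count(q in b) / N.
       SA rows: sum_q P(q, s, b) = count(s in b) / N."""
    N = sum(len(bk["qi"]) for bk in buckets.values())
    rows, rhs = [], []
    for b, bk in buckets.items():
        for q in set(bk["qi"]):
            a = np.zeros(len(index))
            for s in set(bk["sa"]):
                a[index[(q, s, b)]] = 1.0
            rows.append(a); rhs.append(bk["qi"].count(q) / N)
        for s in set(bk["sa"]):
            a = np.zeros(len(index))
            for q in set(bk["qi"]):
                a[index[(q, s, b)]] = 1.0
            rows.append(a); rhs.append(bk["sa"].count(s) / N)
    return np.vstack(rows), np.array(rhs)
```

Consistent with the conciseness property below, one row per bucket is redundant: a bucket's QI rows and its SA rows both sum to the same total probability mass of that bucket.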

Theoretic Properties
- Soundness: are the constraints correct? Easy to prove.
- Completeness: have we missed any constraints? See our theorems and proofs.
- Conciseness: are there redundant constraints? Only one redundant constraint in each bucket.
- Consistency: is our approach consistent with the existing methods (i.e., when the background knowledge is Ø)?

Completeness w.r.t. Equations
- Have we missed any equality constraints? Yes: if F1 = C1 and F2 = C2 are constraints, then F1 + F2 = C1 + C2 is one too. However, it is redundant.
- Completeness theorem: let U be our constraint set; all valid linear constraints can be written as linear combinations of the constraints in U.

Completeness w.r.t. Inequalities
- Have we missed any inequality constraints? Yes: if F = C is a constraint, then F ≤ C + 0.2 is also valid (but redundant).
- Completeness theorem: our constraint set is also complete in the inequality sense.

Putting Them Together
The background knowledge and the published data (the public information) become constraints on P(S | Q), and the maximum entropy estimate yields the estimated P(S | Q).
Tools: LBFGS, TOMLAB, KNITRO, etc.

Inevitable Questions
- Where do we get background knowledge? Do we have to be very, very knowledgeable?
- For P(s | q)-type knowledge, all useful knowledge is in the original data set, in the form of association rules:
  Positive: Q → S
  Negative: Q → ¬S, ¬Q → S, ¬Q → ¬S
- We bound the knowledge in our study using the top-K strongest association rules.

Knowledge about Individuals
Alice: (i1, q1); Bob: (i4, q2); Charlie: (i9, q5).
Knowledge 1: Alice has either s1 or s4. Constraint: see below.
Knowledge 2: two people among Alice, Bob, and Charlie have s4. Constraint: see below.
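The slide's constraint formulas did not survive extraction; what follows is a plausible reconstruction, assuming (from the identifier pairs above) that q1, q2, and q5 are the QI values of Alice, Bob, and Charlie respectively, and that each conditional expands into linear terms over P(Q, S, B) exactly as in the earlier knowledge-constraint sketch:

```latex
% Knowledge 1: Alice (QI value q_1) has either s_1 or s_4:
\[ P(s_1 \mid q_1) + P(s_4 \mid q_1) = 1 \]
% Knowledge 2: two of the three individuals have s_4, so the expected
% number of s_4 occurrences among their three records is 2:
\[ P(s_4 \mid q_1) + P(s_4 \mid q_2) + P(s_4 \mid q_5) = 2 \]
```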

Evaluation
- Implementation: Lagrange multipliers turn the constrained optimization into an unconstrained one; LBFGS solves the resulting unconstrained optimization problem.
- Hardware: Pentium 3 GHz CPU with 4 GB of memory.
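A sketch of that pipeline under the simplifying assumption that only equality constraints A x = c (plus normalization) are present: the Lagrange conditions force x into Gibbs form x_i(λ) ∝ exp((Aᵀλ)_i), leaving a smooth unconstrained dual that L-BFGS minimizes — here via SciPy's L-BFGS-B rather than the authors' LBFGS code:

```python
import numpy as np
from scipy.optimize import minimize

def maxent_via_dual(A, c):
    """Solve max H(x) s.t. A x = c, sum(x) = 1 through the Lagrange dual
    g(lam) = log Z(lam) - lam @ c, with Z(lam) = sum_i exp((A.T @ lam)_i)."""
    def dual_and_grad(lam):
        logits = A.T @ lam
        m = logits.max()                       # log-sum-exp stabilization
        Z = np.sum(np.exp(logits - m))
        p = np.exp(logits - m) / Z             # x(lam) in Gibbs form
        g = (np.log(Z) + m) - lam @ c          # convex dual objective
        return g, A @ p - c                    # gradient: E_p[A] - c
    res = minimize(dual_and_grad, np.zeros(A.shape[0]),
                   jac=True, method="L-BFGS-B")
    logits = A.T @ res.x
    p = np.exp(logits - logits.max())
    return p / p.sum()                         # MaxEnt estimate of P(Q, S, B)
```

The dual has one variable per constraint rather than one per joint cell, which is why this route scales better than solving the primal NLP directly.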

Privacy versus Knowledge
Estimation accuracy: the KL distance between P_MaxEnt(S | Q) and P_Original(S | Q).
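The metric itself fits in a few lines; a sketch, with the direction of the divergence and the smoothing constant as assumptions, since the slide does not specify them:

```python
import numpy as np

def kl_distance(p_orig, p_maxent, eps=1e-12):
    """KL distance between the original and MaxEnt-estimated conditionals
    P(S | q) for one QI value q, given as aligned probability arrays."""
    p = np.asarray(p_orig) + eps     # smooth to avoid division by zero
    q = np.asarray(p_maxent) + eps
    return float(np.sum(p * np.log(p / q)))
```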

Privacy versus # of QI attributes

Performance vs. Knowledge

Running Time vs. Data Size

Iterations vs. Data Size

Conclusion
- Privacy-MaxEnt is a systematic method: it models various types of background knowledge, models the information in the published data, and is based on well-established theory.
- Future work: reducing the number of constraints; vague background knowledge; background knowledge about individuals.