Multiplicative Weights Algorithms
CompSci 590.03, Lecture 13, Fall 2012
Instructor: Ashwin Machanavajjhala

This Class
– A simple multiplicative weights algorithm
– Multiplicative weights for privately publishing synthetic data

Multiple Experts Problem
Will it rain today? Some experts say Yes, others say No. What is the best prediction based on these experts?

Multiple Experts Problem
Suppose we knew the best expert (the one who makes the fewest errors); then we could just return what that expert says.
– This is the best we can hope for.
But we don't know who the best expert is.
– We can learn, though: at the end of each day we find out whether it rained.
Question: Is there an algorithm that learns over time who the best expert is, and achieves accuracy close to that of the best expert?

Weighted Majority Algorithm [Littlestone & Warmuth '94]
(Figure: the algorithm maintains a weight w_1, ..., w_4 for each expert and combines their predictions y_1, ..., y_4 into its own answer.)

Weighted Majority Algorithm [Littlestone & Warmuth '94]
(Figure: all weights start at 1; the algorithm predicts with the weighted majority, the truth is revealed, and every expert that was wrong has its weight multiplied by 1 - ε.)

Multiplicative Weights Algorithm
Maintain weights (or a probability distribution) over the experts.
Answering/Prediction: answer using the weighted majority, OR randomly pick an expert according to the current probability distribution and use that expert's answer.
Update: observe the truth, then decrease the weight (or probability) assigned to the experts who were wrong.
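
As a concrete illustration of this predict-and-update loop (a minimal sketch, not from the slides; the function name, array layout, and default ε are illustrative choices), the weighted-majority variant can be written as:

```python
import numpy as np

def weighted_majority(predictions, outcomes, eps=0.1):
    """Weighted-majority sketch.
    predictions: T x n 0/1 array, predictions[t, j] = expert j's guess at step t.
    outcomes:    length-T 0/1 array of the true answers."""
    T, n = predictions.shape
    w = np.ones(n)                        # every expert starts with weight 1
    algo = []
    for t in range(T):
        # Predict with whichever side holds more total weight.
        yes = w[predictions[t] == 1].sum()
        no = w[predictions[t] == 0].sum()
        algo.append(1 if yes >= no else 0)
        # Penalize every expert that was wrong by a (1 - eps) factor.
        w[predictions[t] != outcomes[t]] *= (1 - eps)
    return np.array(algo), w
```

The randomized variant would instead sample an expert at each step with probability w_j / Σ_k w_k and return that expert's answer.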

Error Analysis [Arora, Hazan, Kale '05]
Theorem: After t steps, let m(t, j) be the number of errors made by expert j, let m(t) be the number of errors made by the algorithm, and let n be the number of experts. Then for every expert j,
m(t) ≤ (2/ε) ln n + 2(1+ε) m(t, j).

Error Analysis: Proof
Let φ(t) = Σ_j w_j(t) be the total weight. Then φ(1) = n.
When the algorithm makes a mistake, at least half the total weight sits on wrong experts and gets multiplied by (1-ε), so φ(t+1) ≤ φ(t)(1/2 + (1/2)(1-ε)) = φ(t)(1 - ε/2).
When the algorithm is correct, φ(t+1) ≤ φ(t).
Therefore φ(t) ≤ n(1 - ε/2)^m(t).

Error Analysis: Proof (continued)
φ(t) ≤ n(1 - ε/2)^m(t). Also, w_j(t) = (1-ε)^m(t,j) and φ(t) ≥ w_j(t), so
n(1 - ε/2)^m(t) ≥ (1-ε)^m(t,j).
Hence m(t) ≤ (2/ε) ln n + 2(1+ε) m(t, j).
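
The algebra between the last two lines is omitted on the slide; filling it in (a standard calculation, using -ln(1 - ε/2) ≥ ε/2 and -ln(1 - ε) ≤ ε + ε² for 0 < ε ≤ 1/2):

```latex
n\left(1-\tfrac{\varepsilon}{2}\right)^{m(t)} \;\ge\; (1-\varepsilon)^{m(t,j)}
\;\;\Longrightarrow\;\;
\ln n + m(t)\,\ln\!\left(1-\tfrac{\varepsilon}{2}\right) \;\ge\; m(t,j)\,\ln(1-\varepsilon)
\;\;\Longrightarrow\;\;
\tfrac{\varepsilon}{2}\,m(t) \;\le\; \ln n + m(t,j)\left(\varepsilon+\varepsilon^{2}\right)
\;\;\Longrightarrow\;\;
m(t) \;\le\; \frac{2\ln n}{\varepsilon} + 2(1+\varepsilon)\,m(t,j).
```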

Multiplicative Weights [Arora, Hazan, Kale '05]
This algorithmic technique has been used to solve a number of problems:
– Packing and covering linear programs (Plotkin-Shmoys-Tardos)
– O(log n) approximations for many NP-hard problems (set cover, ...)
– Boosting
– Zero-sum games
– Network congestion
– Semidefinite programs

This Class
– A simple multiplicative weights algorithm
– Multiplicative weights for privately publishing synthetic data

Workload-aware Synthetic Data Generation
Input:
– Q, a workload of (expected/typical) linear queries of the form q(D) = Σ_x q(x), where each q(x) is in the range [-1, 1]
– D, a database instance
– T, the number of iterations
– ε, the differential privacy parameter
Output:
– A, a synthetically generated dataset such that for every q in Q, q(A) is close to q(D)
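
To make the input format concrete, here is a made-up toy example (a domain of four values and ten records), representing D as a histogram and a linear query as a coefficient vector over the domain; none of these numbers come from the lecture:

```python
import numpy as np

# Toy database over a domain of N = 4 values, given as a histogram:
# D_hist[x] = number of records taking value x. Here n = 10 records.
D_hist = np.array([4.0, 3.0, 2.0, 1.0])

# A linear query assigns each domain value a coefficient q(x) in [-1, 1];
# its answer is q(D) = sum_x q(x) * D_hist[x]. A counting query such as
# "how many records have value 0 or 1" uses 0/1 coefficients:
q = np.array([1.0, 1.0, 0.0, 0.0])
print(q @ D_hist)   # 7.0
```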

Multiplicative Weights Algorithm
Let n be the number of records in D and N the number of values in the domain.
Initialization: let A_0 be the weight function that assigns weight n/N to each value in the domain.

Multiplicative Weights
Setup as before. In each iteration i in {1, 2, ..., T}:
– Pick the query q from Q with maximum error, where error = q(D) - q(A_{i-1})

Multiplicative Weights
Setup as before. In each iteration i in {1, 2, ..., T}:
– Pick the query q from Q with maximum error
– Compute m = q(D)

Multiplicative Weights
Setup as before. In each iteration i in {1, 2, ..., T}:
– Pick the query q from Q with maximum error
– Compute m = q(D)
– Update weights: A_i(x) ← A_{i-1}(x) · exp( q(x) · (m - q(A_{i-1})) / 2n )
Output: average_i(A_i)
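
Putting the loop together, a minimal Python sketch (assuming D is given as a length-N histogram and each query as a length-N coefficient vector; the renormalization after each update keeps the total weight at n, as in the MWEM paper, even though the slide does not show it):

```python
import numpy as np

def mw_release(D_hist, queries, T):
    """Non-private multiplicative weights data release (sketch)."""
    n, N = D_hist.sum(), len(D_hist)
    A = np.full(N, n / N)                    # A_0: weight n/N on every domain value
    iterates = []
    for _ in range(T):
        # Pick the query whose answer on A is furthest from its answer on D.
        errors = np.array([q @ D_hist - q @ A for q in queries])
        q = queries[int(np.argmax(np.abs(errors)))]
        m = q @ D_hist
        # Multiplicative weights update, then renormalize back to total weight n.
        A = A * np.exp(q * (m - q @ A) / (2 * n))
        A *= n / A.sum()
        iterates.append(A.copy())
    return np.mean(iterates, axis=0)         # Output: average of A_1, ..., A_T
```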

Update Rule
A_i(x) ← A_{i-1}(x) · exp( q(x) · (m - q(A_{i-1})) / 2n )
If q(D) - q(A) > 0, then increase the weight of records with q(x) > 0 and decrease the weight of records with q(x) < 0.
If q(D) - q(A) < 0, then decrease the weight of records with q(x) > 0 and increase the weight of records with q(x) < 0.

Error Analysis
Theorem: For any database D and any set of linear queries Q, MWEM outputs an A such that:

Error Analysis: Proof
Consider the potential function: the relative entropy between the true data distribution D/n and the current synthetic distribution A_i/n.

Error Analysis: Proof (continued)

Synthetic Data Generation with Privacy
Input:
– Q, a workload of (expected/typical) linear queries of the form q(D) = Σ_x q(x), where each q(x) is in the range [-1, 1]
– D, a database instance
– T, the number of iterations
– ε, the differential privacy parameter
Output:
– A, a synthetically generated dataset such that for every q in Q, q(A) is close to q(D)

MWEM [Hardt, Ligett & McSherry '12]
Let n be the number of records in D and N the number of values in the domain.
Initialization: let A_0 be the weight function that assigns weight n/N to each value in the domain.

MWEM [Hardt, Ligett & McSherry '12]
Setup as before. In each iteration i in {1, 2, ..., T}:
– Pick the query q from Q with maximum error using the Exponential Mechanism
  – Privacy parameter: ε/2T
  – Score function: |q(A_{i-1}) - q(D)|
The mechanism is more likely to pick queries for which the answer on the synthetic data is very different from the answer on the true data.
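
A sketch of this selection step (assuming the score |q(A) - q(D)| has sensitivity 1, as it does for queries with coefficients in [-1, 1] under adding or removing one record; the helper name is mine):

```python
import numpy as np

def exp_mech_select(D_hist, A, queries, eps_step, rng):
    """Exponential mechanism for picking a high-error query (sketch).
    Samples index i with probability proportional to exp(eps_step * score_i / 2),
    where score_i = |q_i(A) - q_i(D)| and eps_step = eps / 2T."""
    scores = np.array([abs(q @ A - q @ D_hist) for q in queries])
    logits = eps_step * (scores - scores.max()) / 2   # shift for numerical stability
    probs = np.exp(logits)
    probs /= probs.sum()
    return rng.choice(len(queries), p=probs)
```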

MWEM [Hardt, Ligett & McSherry '12]
Setup as before. In each iteration i in {1, 2, ..., T}:
– Pick the query q from Q with maximum error using the Exponential Mechanism
– Compute m = q(D) using the Laplace Mechanism
  – Privacy parameter: ε/2T
  – m = q(D) + Lap(2T/ε)

MWEM [Hardt, Ligett & McSherry '12]
Setup as before. In each iteration i in {1, 2, ..., T}:
– Pick the query q from Q with maximum error using the Exponential Mechanism
– Compute m = q(D) using the Laplace Mechanism
– Update weights: A_i(x) ← A_{i-1}(x) · exp( q(x) · (m - q(A_{i-1})) / 2n )
Output: average_i(A_i)
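
Combining the three steps, a minimal MWEM sketch that reuses the exp_mech_select helper above (again assuming the histogram and coefficient-vector representations; this follows the structure on the slides, not the authors' reference implementation):

```python
import numpy as np

def mwem(D_hist, queries, T, eps, rng=None):
    """MWEM sketch: eps/2T for each exponential-mechanism selection and
    eps/2T worth of Laplace noise for each measurement."""
    rng = rng or np.random.default_rng()
    n, N = D_hist.sum(), len(D_hist)
    A = np.full(N, n / N)                             # A_0: uniform, total weight n
    iterates = []
    for _ in range(T):
        # 1. Privately select a query with large error.
        q = queries[exp_mech_select(D_hist, A, queries, eps / (2 * T), rng)]
        # 2. Measure its answer with Laplace noise of scale 2T/eps.
        m = q @ D_hist + rng.laplace(scale=2 * T / eps)
        # 3. Multiplicative weights update toward the noisy measurement.
        A = A * np.exp(q * (m - q @ A) / (2 * n))
        A *= n / A.sum()
        iterates.append(A.copy())
    return np.mean(iterates, axis=0)                  # Output: average of the iterates
```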

Update Rule
A_i(x) ← A_{i-1}(x) · exp( q(x) · (m - q(A_{i-1})) / 2n )
If the noisy estimate of q(D) - q(A) is > 0, then increase the weight of records with q(x) > 0 and decrease the weight of records with q(x) < 0.
If the noisy estimate of q(D) - q(A) is < 0, then decrease the weight of records with q(x) > 0 and increase the weight of records with q(x) < 0.

Error Analysis
Theorem: For any database D and any set of linear queries Q, with probability at least 1 - 2T/|Q|, MWEM outputs an A such that:

Error Analysis: Proof
1. The exponential mechanism picks q_i, which might not have the maximum error.

Error Analysis: Proof
1. In each iteration, with probability at least 1 - 1/|Q|, the error of the query picked by the exponential mechanism is smaller than the maximum error by at most:
2. We add noise to m = q(D), but with probability at least 1 - 1/|Q| in each iteration, the Laplace noise added is at most:

Error Analysis
Theorem: For any database D and any set of linear queries Q, with probability at least 1 - 2T/|Q|, MWEM outputs an A such that:

Optimizations
– Output A_T rather than the average.
– In the update step, use the queries picked in all previous rounds for which (m - q(A)) is large.
– Improve the starting point by initializing A_0 with noisy counts.
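
In the MWEM sketch above, the first optimization simply amounts to returning the final iterate A instead of the average of the iterates.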

Next Class
Implementations of differential privacy:
– How to write programs with differential privacy
– Security issues due to incorrect implementations
– How to convert any program to satisfy differential privacy

References
– Littlestone & Warmuth, "The Weighted Majority Algorithm", Information and Computation, 1994
– Arora, Hazan & Kale, "The Multiplicative Weights Update Method: A Meta-Algorithm and Applications", Technical Report, Princeton University, 2005
– Hardt & Rothblum, "A Multiplicative Weights Mechanism for Privacy-Preserving Data Analysis", FOCS 2010
– Hardt, Ligett & McSherry, "A Simple and Practical Algorithm for Differentially Private Data Release", NIPS 2012