Multiplicative Weights Algorithms. CompSci. Instructor: Ashwin Machanavajjhala. Lecture 13 : Fall 12
This Class: A simple multiplicative weights algorithm; multiplicative weights for privately publishing synthetic data.
Multiple Experts Problem. Will it rain today? Each expert predicts Yes or No. What is the best prediction based on these experts?
Multiple Experts Problem. Suppose we knew the best expert (the one who makes the fewest errors); then we could just return what that expert says. This is the best we can hope for. But we don't know who the best expert is. We can learn, though: at the end of each day we find out whether it rained. Question: Is there an algorithm that learns over time who the best expert is, and achieves accuracy close to that of the best expert?
Weighted Majority Algorithm. [Figure: experts 1..4 with weights w1, …, w4 send their answers y1, …, y4 to the algorithm.] [Littlestone & Warmuth '94]
Weighted Majority Algorithm. [Figure: all weights start at 1; the experts answer Yes or No, the algorithm predicts the weighted majority, and once the truth is revealed each wrong expert's weight is multiplied by 1−ε.] [Littlestone & Warmuth '94]
Multiplicative Weights Algorithm. Maintain weights (or a probability distribution) over the experts. Answering/Prediction: answer using the weighted majority, OR randomly pick an expert according to the current probability distribution and use that expert's answer. Update: observe the truth; decrease the weight (or probability) assigned to each expert who was wrong.
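The deterministic (weighted-majority) variant of the update above can be sketched as follows; the interface (lists of boolean predictions and outcomes) is an illustrative assumption, not from the slides:

```python
def weighted_majority(expert_preds, truths, eps=0.1):
    """Weighted-majority sketch: predict by weighted vote, then
    multiply each wrong expert's weight by (1 - eps).
    expert_preds: list of rounds, each a list of n boolean predictions.
    truths: list of boolean outcomes, one per round.
    Returns (algorithm's mistake count, final weights)."""
    n = len(expert_preds[0])
    w = [1.0] * n
    mistakes = 0
    for preds, truth in zip(expert_preds, truths):
        # Weighted vote: compare total weight behind Yes vs. No.
        yes = sum(wi for wi, p in zip(w, preds) if p)
        no = sum(wi for wi, p in zip(w, preds) if not p)
        guess = yes >= no
        if guess != truth:
            mistakes += 1
        # Penalize every expert that was wrong this round.
        w = [wi * (1 - eps) if p != truth else wi
             for wi, p in zip(w, preds)]
    return mistakes, w
```

With one always-correct expert, the algorithm tracks it and the wrong expert's weight decays geometrically.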
Error Analysis. Theorem: After t steps, let m(t,j) be the number of errors made by expert j, let m(t) be the number of errors made by the algorithm, and let n be the number of experts. Then for every expert j, m(t) ≤ (2/ε) ln n + 2(1+ε) m(t,j). [Arora, Hazan, Kale '05]
Error Analysis: Proof. Let φ(t) = Σ_j w_j(t). Then φ(1) = n. When the algorithm makes a mistake, at least half of the total weight sits on wrong experts and is multiplied by (1−ε), so φ(t+1) ≤ φ(t)(1/2 + (1/2)(1−ε)) = φ(t)(1 − ε/2). When the algorithm is correct, φ(t+1) ≤ φ(t). Therefore φ(t) ≤ n(1 − ε/2)^m(t).
Error Analysis: Proof. φ(t) ≤ n(1 − ε/2)^m(t). Also, w_j(t) = (1−ε)^m(t,j). Since φ(t) ≥ w_j(t), we get n(1 − ε/2)^m(t) ≥ (1−ε)^m(t,j). Hence m(t) ≤ (2/ε) ln n + 2(1+ε) m(t,j).
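The last step is obtained by taking logarithms; a sketch of the algebra, using the standard estimates $\ln(1-x) \le -x$ and $-\ln(1-\varepsilon) \le \varepsilon + \varepsilon^2$ (valid for $\varepsilon \le 1/2$):

```latex
n\,(1-\varepsilon/2)^{m(t)} \ge (1-\varepsilon)^{m(t,j)}
\;\Rightarrow\;
\ln n + m(t)\ln(1-\varepsilon/2) \ge m(t,j)\ln(1-\varepsilon)
\;\Rightarrow\;
\ln n - m(t)\,\varepsilon/2 \ge -\,m(t,j)\,(\varepsilon+\varepsilon^2)
\;\Rightarrow\;
m(t) \le \frac{2}{\varepsilon}\ln n + 2(1+\varepsilon)\,m(t,j).
```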
Multiplicative Weights. This algorithmic technique has been used to solve a number of problems: packing and covering linear programs (Plotkin-Shmoys-Tardos); log n approximations for many NP-hard problems (set cover, …); boosting; zero-sum games; network congestion; semidefinite programs. [Arora, Hazan, Kale '05]
This Class: A simple multiplicative weights algorithm; multiplicative weights for privately publishing synthetic data.
Workload-aware Synthetic Data Generation. Input: Q, a workload of (expected/typical) linear queries of the form q(D) = Σ_x q(x)·D(x), where each q(x) is in the range [-1,1]; D, a database instance; T, the number of iterations; ε, the differential privacy parameter. Output: A, a synthetically generated dataset such that for all q in Q, q(A) is close to q(D).
Multiplicative Weights Algorithm. Let n be the number of records in D, and N the number of values in the domain. Initialization: let A_0 be a weight function that assigns weight n/N to each value in the domain.
Multiplicative Weights. Let n be the number of records in D, and N the number of values in the domain. Let A_0 be a weight function that assigns weight n/N to each value in the domain. In iteration i ∈ {1, 2, …, T}: pick the query q in Q with maximum error, where error = q(D) − q(A_{i-1}).
Multiplicative Weights. Let n be the number of records in D, and N the number of values in the domain. Let A_0 be a weight function that assigns weight n/N to each value in the domain. In iteration i ∈ {1, 2, …, T}: pick the query q in Q with maximum error. Compute m = q(D).
Multiplicative Weights. Let n be the number of records in D, and N the number of values in the domain. Let A_0 be a weight function that assigns weight n/N to each value in the domain. In iteration i ∈ {1, 2, …, T}: pick the query q in Q with maximum error. Compute m = q(D). Update weights: A_i(x) ← A_{i-1}(x) · exp( q(x) · (m − q(A_{i-1})) / 2n ). Output: average_i(A_i).
Update rule: A_i(x) ← A_{i-1}(x) · exp( q(x) · (m − q(A_{i-1})) / 2n ). If q(D) − q(A) > 0, then increase the weight of records with q(x) > 0 and decrease the weight of records with q(x) < 0. If q(D) − q(A) < 0, then decrease the weight of records with q(x) > 0 and increase the weight of records with q(x) < 0.
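The non-private loop above can be sketched as follows. The dict-based representation of D and A, and the renormalization of A to total weight n after each update, are implementation assumptions, not from the slides:

```python
import math

def mw_synthetic(D, queries, T):
    """Non-private multiplicative-weights data release (sketch).
    D: dict mapping each domain value x to its count D(x).
    queries: list of functions q(x) -> [-1, 1]; q(data) = sum_x q(x)*data(x).
    Returns the average of the weight functions A_1 .. A_T."""
    n = sum(D.values())          # number of records
    N = len(D)                   # domain size
    A = {x: n / N for x in D}    # uniform initial weights, n/N each
    avg = {x: 0.0 for x in D}
    for _ in range(T):
        # Pick the query with maximum error on the current synthetic data.
        def err(q):
            return (sum(q(x) * D[x] for x in D)
                    - sum(q(x) * A[x] for x in D))
        q = max(queries, key=lambda qq: abs(err(qq)))
        m = sum(q(x) * D[x] for x in D)    # true answer q(D)
        qa = sum(q(x) * A[x] for x in D)   # synthetic answer q(A)
        # Multiplicative update, then renormalize to total weight n.
        A = {x: A[x] * math.exp(q(x) * (m - qa) / (2 * n)) for x in D}
        s = sum(A.values())
        A = {x: A[x] * n / s for x in D}
        for x in D:
            avg[x] += A[x] / T
    return avg
```

On a toy two-value domain the averaged weights drift toward the true counts while always summing to n.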
Error Analysis. Theorem: For any database D and any set of linear queries Q, the multiplicative weights algorithm outputs an A such that max_{q∈Q} |q(A) − q(D)| ≤ 2n·sqrt(log N / T).
Error Analysis: Proof. Consider the potential function Ψ_i = Σ_x (D(x)/n) · log( D(x) / A_i(x) ), the relative entropy between D and A_i. Ψ_0 ≤ log N (since A_0 is uniform), and Ψ_i ≥ 0 for all i. Each update decreases the potential by at least the squared scaled error of the chosen query: Ψ_{i-1} − Ψ_i ≥ ( (q_i(D) − q_i(A_{i-1})) / 2n )². Summing over the T iterations, log N ≥ Σ_i ( (q_i(D) − q_i(A_{i-1})) / 2n )², so the average squared error per round is at most (log N)/T. Since q_i is the maximum-error query in round i, every q in Q has error at most that of q_i in that round; averaging the A_i gives max_{q∈Q} |q(A) − q(D)| ≤ 2n·sqrt(log N / T).
Synthetic Data Generation with Privacy. Input: Q, a workload of (expected/typical) linear queries of the form q(D) = Σ_x q(x)·D(x), where each q(x) is in the range [-1,1]; D, a database instance; T, the number of iterations; ε, the differential privacy parameter. Output: A, a synthetically generated dataset such that for all q in Q, q(A) is close to q(D).
MWEM. Let n be the number of records in D, and N the number of values in the domain. Initialization: let A_0 be a weight function that assigns weight n/N to each value in the domain. [Hardt, Ligett & McSherry '12]
MWEM. Let n be the number of records in D, and N the number of values in the domain. Let A_0 be a weight function that assigns weight n/N to each value in the domain. In iteration i ∈ {1, 2, …, T}: pick the query q in Q with maximum error using the Exponential Mechanism. Privacy parameter: ε/2T. Score function: |q(A_{i-1}) − q(D)|. This makes the mechanism more likely to pick queries for which the answer on the synthetic data is very different from the answer on the true data. [Hardt, Ligett & McSherry '12]
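The selection step can be sketched as the standard exponential mechanism, sampling a query with probability proportional to exp(ε'·score/2) where ε' = ε/2T; the function interface is an illustrative assumption:

```python
import math
import random

def exponential_mechanism(queries, scores, eps_step):
    """Exponential mechanism sketch: sample element with probability
    proportional to exp(eps_step * score / 2). The score here is
    |q(A) - q(D)|, which has sensitivity 1 (one record changes any
    q(D) by at most 1), so higher-error queries are exponentially
    more likely to be selected."""
    weights = [math.exp(eps_step * s / 2) for s in scores]
    total = sum(weights)
    r = random.uniform(0, total)
    acc = 0.0
    for q, w in zip(queries, weights):
        acc += w
        if r <= acc:
            return q
    return queries[-1]  # guard against floating-point rounding
```

When one score dwarfs the others, that query is chosen almost surely.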
MWEM. Let n be the number of records in D, and N the number of values in the domain. Let A_0 be a weight function that assigns weight n/N to each value in the domain. In iteration i ∈ {1, 2, …, T}: pick the query q in Q with maximum error using the Exponential Mechanism. Compute m = q(D) using the Laplace Mechanism. Privacy parameter: ε/2T. m = q(D) + Lap(2T/ε). [Hardt, Ligett & McSherry '12]
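The measurement step needs a Lap(2T/ε) sample; a minimal inverse-CDF sampler, with the sampling method an implementation choice rather than anything prescribed by the slides:

```python
import math
import random

def laplace_noise(scale):
    """Sample from the Laplace distribution with the given scale
    (mean 0, density (1/2b) exp(-|x|/b)) via the inverse CDF."""
    u = random.uniform(-0.5, 0.5)
    return -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

# Usage in MWEM's measurement step (q, D, T, eps assumed in scope):
#   m = q_of_D + laplace_noise(2 * T / eps)
```

The empirical mean over many draws should be close to zero.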
MWEM. Let n be the number of records in D, and N the number of values in the domain. Let A_0 be a weight function that assigns weight n/N to each value in the domain. In iteration i ∈ {1, 2, …, T}: pick the query q in Q with maximum error using the Exponential Mechanism. Compute m = q(D) using the Laplace Mechanism. Update weights: A_i(x) ← A_{i-1}(x) · exp( q(x) · (m − q(A_{i-1})) / 2n ). Output: average_i(A_i). [Hardt, Ligett & McSherry '12]
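Putting the three steps together, the full MWEM loop can be sketched as below. As before, the dict representation of D and the per-iteration renormalization of A are implementation assumptions:

```python
import math
import random

def mwem(D, queries, T, eps):
    """MWEM sketch, following Hardt, Ligett & McSherry '12.
    D: dict value -> count. queries: list of q(x) -> [-1, 1].
    Budget eps is split as eps/2T per iteration for selection
    (exponential mechanism) and for measurement (Laplace)."""
    n = sum(D.values())
    N = len(D)
    A = {x: n / N for x in D}
    avg = {x: 0.0 for x in D}
    answer = lambda q, data: sum(q(x) * data[x] for x in data)
    for _ in range(T):
        # 1. Select a high-error query via the exponential mechanism.
        scores = [abs(answer(q, D) - answer(q, A)) for q in queries]
        w = [math.exp((eps / (2 * T)) * s / 2) for s in scores]
        r = random.uniform(0, sum(w))
        acc = 0.0
        for q, wi in zip(queries, w):
            acc += wi
            if r <= acc:
                break
        # 2. Measure q(D) with Laplace noise of scale 2T/eps.
        u = random.uniform(-0.5, 0.5)
        noise = -(2 * T / eps) * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))
        m = answer(q, D) + noise
        # 3. Multiplicative-weights update toward the noisy answer.
        qa = answer(q, A)
        A = {x: A[x] * math.exp(q(x) * (m - qa) / (2 * n)) for x in D}
        s = sum(A.values())
        A = {x: A[x] * n / s for x in D}
        for x in D:
            avg[x] += A[x] / T
    return avg
```

Even with noise, the averaged synthetic counts preserve the total n and move toward the true distribution.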
Update rule: A_i(x) ← A_{i-1}(x) · exp( q(x) · (m − q(A_{i-1})) / 2n ). If the noisy difference q(D) − q(A) > 0, then increase the weight of records with q(x) > 0 and decrease the weight of records with q(x) < 0. If the noisy difference q(D) − q(A) < 0, then decrease the weight of records with q(x) > 0 and increase the weight of records with q(x) < 0.
Error Analysis. Theorem: For any database D and any set of linear queries Q, with probability at least 1 − 2T/|Q|, MWEM outputs an A such that max_{q∈Q} |q(A) − q(D)| ≤ 2n·sqrt(log N / T) + (10T/ε)·log|Q|.
Error Analysis: Proof. Two new sources of error relative to the non-private analysis: 1. The exponential mechanism picks q_i, which might not have the maximum error. But in each iteration, with probability at least 1 − 1/|Q|, the error of the picked query falls short of the maximum error by at most (8T/ε)·log|Q|. 2. We add noise to m = q(D). But with probability at least 1 − 1/|Q| in each iteration, the magnitude of the Laplace noise is at most (2T/ε)·log|Q|. A union bound over the 2T events gives the failure probability 2T/|Q|, and the two per-iteration error terms contribute the extra (10T/ε)·log|Q| in the theorem.
Error Analysis. Theorem: For any database D and any set of linear queries Q, with probability at least 1 − 2T/|Q|, MWEM outputs an A such that max_{q∈Q} |q(A) − q(D)| ≤ 2n·sqrt(log N / T) + (10T/ε)·log|Q|.
Optimizations. Output A_T rather than the average. In the update step, reuse the queries picked in all previous rounds for which m − q(A) is large. The solution can also be improved by initializing A_0 with noisy counts.
Next Class. Implementations of differential privacy: how to write programs with differential privacy; security issues due to incorrect implementations; how to convert any program to satisfy differential privacy.
References. Littlestone & Warmuth, "The weighted majority algorithm", Information and Computation '94. Arora, Hazan & Kale, "The multiplicative weights update method", TR Princeton Univ. '05. Hardt & Rothblum, "A multiplicative weights mechanism for privacy-preserving data analysis", FOCS '10. Hardt, Ligett & McSherry, "A simple and practical algorithm for differentially private data release", NIPS '12.