
1 Carnegie Mellon. Selecting Observations against Adversarial Objectives. Andreas Krause, Brendan McMahan, Carlos Guestrin, Anupam Gupta

2 Observation selection problems. Place sensors for building automation; monitor rivers and lakes using robots; detect contaminations in water networks. Given a set V of possible observations (sensor locations, …), we want to pick a subset A* ⊆ V maximizing a utility F, i.e. A* = argmax_{|A| ≤ k} F(A). For most interesting utilities F, this is NP-hard!

3 Key observation: diminishing returns. With a small placement A = {S1, S2}, adding a new sensor S' helps a lot; with a larger placement B = {S1, …, S5}, adding S' doesn't help much. Formalization: submodularity. For all A ⊆ B and S' ∉ B: F(A ∪ {S'}) - F(A) ≥ F(B ∪ {S'}) - F(B). A small sanity check of this property in code follows below.
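
To make the definition concrete, here is a minimal brute-force submodularity check in Python; the coverage function and sensor names are illustrative stand-ins, not objectives from the talk:

```python
from itertools import combinations

def is_submodular(F, V, tol=1e-12):
    """Brute-force check of diminishing returns on a small ground set:
    F(A + s) - F(A) >= F(B + s) - F(B) for all A <= B <= V, s not in B.
    Exponential in |V|; only useful as a sanity check."""
    V = frozenset(V)
    subsets = [frozenset(B) for r in range(len(V) + 1)
               for B in combinations(V, r)]
    for B in subsets:
        for q in range(len(B) + 1):
            for A in combinations(B, q):
                A = frozenset(A)
                for s in V - B:
                    if F(A | {s}) - F(A) < F(B | {s}) - F(B) - tol:
                        return False
    return True

# Coverage objective: F(A) = number of cells covered by the sensors in A.
cover = {'S1': {1, 2}, 'S2': {2, 3}, 'S3': {3, 4}}
F = lambda A: len(set().union(*(cover[s] for s in A))) if A else 0
print(is_submodular(F, cover))  # True: coverage has diminishing returns
```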

4 Submodularity [with Guestrin, Singh, Leskovec, VanBriesen, Faloutsos, Glance]. We prove submodularity for: mutual information, F(A) = H(unobs) - H(unobs | A) [UAI '05, JMLR '07] (spatial prediction); outbreak detection, F(A) = impact reduction from sensing at A [KDD '07] (water monitoring, …). Also submodular: geometric coverage, F(A) = area covered; variance reduction, F(A) = Var(Y) - Var(Y | A); …

5 Why is submodularity useful? Theorem [Nemhauser et al. '78]: the greedy algorithm (forward selection, sketched below) gives a constant-factor approximation, F(A_greedy) ≥ (1 - 1/e) F(A_opt), i.e. about 63% of optimal. We can also get online (data-dependent) bounds for any algorithm, significantly speed up the greedy algorithm, and use MIP / branch & bound for optimal solutions.
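
As a sketch, the forward-selection rule in Python (the toy coverage objective is made up for illustration):

```python
def greedy(F, V, k):
    """Greedy forward selection for monotone submodular F: repeatedly add
    the element with the largest marginal gain. Nemhauser et al. '78:
    F(A_greedy) >= (1 - 1/e) * F(A_opt)."""
    A = set()
    for _ in range(k):
        A.add(max(V - A, key=lambda s: F(A | {s}) - F(A)))
    return A

# Toy coverage objective: sensors covering cells of a grid.
cover = {'S1': {1, 2, 3}, 'S2': {3, 4}, 'S3': {4, 5, 6}, 'S4': {1, 6}}
F = lambda A: len(set().union(*(cover[s] for s in A))) if A else 0
print(greedy(F, set(cover), 2))  # {'S1', 'S3'}: covers all 6 cells
```

The speed-up mentioned above comes from laziness: marginal gains can only shrink as A grows, so gains cached in a priority queue are upper bounds, and most re-evaluations can be skipped.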

6 Robust observation selection. What if … the parameters θ of the model P(X_V | θ) are unknown or change? … sensors fail? … an adversary selects the outbreak scenario? [Figure: sensors placed optimally for parameters θ_old; under θ_new there is more variability elsewhere, and an adversary would attack there.]

7 Robust prediction. The typical objective is to minimize the average variance (MSE), but this can leave a high maximum variance, often in the most interesting part! Instead: minimize the "width" of the confidence bands. For every location s ∈ V, define F_s(A) = Var(s) - Var(s | A). Minimizing the width means simultaneously maximizing all the F_s(A), and each F_s(A) is (often) submodular! [Das & Kempe '07] A sketch of F_s under a Gaussian process model follows below. [Figure: confidence bands of pH value over horizontal positions V.]
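
For intuition, a sketch of the per-location objective under a Gaussian process model; the kernel, noise level, and locations below are illustrative assumptions:

```python
import numpy as np

def variance_reduction(K, s, A, noise=1e-2):
    """F_s(A) = Var(Y_s) - Var(Y_s | Y_A) for a GP with prior covariance K.
    Var(Y_s | Y_A) = K[s,s] - k_sA (K_AA + noise*I)^{-1} k_sA^T, so the
    reduction equals the quadratic form computed below."""
    if not A:
        return 0.0
    A = sorted(A)
    K_AA = K[np.ix_(A, A)] + noise * np.eye(len(A))
    k_sA = K[s, A]
    return float(k_sA @ np.linalg.solve(K_AA, k_sA))

# Toy 1-D example: squared-exponential kernel over 5 candidate locations.
x = np.linspace(0, 1, 5)
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / 0.2 ** 2)
print(variance_reduction(K, s=2, A={1, 3}))  # variance shaved off location 2
```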

8 Adversarial observation selection. Given possible observations V and submodular functions F_1, …, F_m, we want to solve max_{|A| ≤ k} min_i F_i(A). Many problems fit this form: width of confidence bands, where F_i is the variance reduction at location i (one F_i per location); unknown parameters, where F_i is the information gain under parameters θ_i; adversarial outbreak scenarios, where F_i is the utility for scenario i; … Unfortunately, min_i F_i(A) is not submodular!

9 How does greedy do? Consider ground set {x, y, z} with two submodular objectives (ε > 0 small):

Set      F1   F2   min_i F_i
{x}      1    0    0
{y}      0    2    0
{z}      ε    ε    ε
{x,y}    1    2    1
{x,z}    1    ε    ε
{y,z}    ε    2    ε

The optimal solution for k = 2 is {x, y}. Greedy picks z first; then it can only add x or y, ending at value ε, so greedy does arbitrarily badly (reproduced in code below). Theorem: the problem max_{|A| ≤ k} min_i F_i(A) does not admit any approximation unless P = NP. Is there something better?
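
The table's functions can be written down directly (here ε = 0.01 to make it concrete); running plain greedy on the min objective reproduces the failure:

```python
eps = 0.01
V = {'x', 'y', 'z'}
w1 = {'x': 1.0, 'y': 0.0, 'z': eps}
w2 = {'x': 0.0, 'y': 2.0, 'z': eps}
F1 = lambda A: max((w1[s] for s in A), default=0.0)  # submodular (best element)
F2 = lambda A: max((w2[s] for s in A), default=0.0)
F_min = lambda A: min(F1(A), F2(A))                  # NOT submodular

A = set()
for _ in range(2):  # greedy applied to the min objective
    A.add(max(V - A, key=lambda s: F_min(A | {s})))
print(A, F_min(A))  # z is picked first; final value eps vs. 1 for optimal {x, y}
```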

10 Alternative formulation. If somebody told us the optimal value c, could we recover the optimal solution A*? We would need to solve the dual covering problem: find the smallest A with min_i F_i(A) ≥ c. Is this any easier? Yes, if we relax the constraint |A| ≤ k.

11 Solving the alternative problem. Trick: for each F_i and c, define the truncation F'_i(A) = min(F_i(A), c), and average the truncations: F'_avg,c(A) = (1/m) Σ_i F'_i(A). Then min_i F_i(A) ≥ c if and only if F'_avg,c(A) = c. Lemma: F'_avg,c(A) is submodular! (Sketched below.) For the running example with c = 1:

Set      F1   F2   F'1   F'2   F'_avg,1   min_i F_i
{x}      1    0    1     0     1/2        0
{y}      0    2    0     1     1/2        0
{z}      ε    ε    ε     ε     ε          ε
{x,y}    1    2    1     1     1          1
{x,z}    1    ε    1     ε     (1+ε)/2    ε
{y,z}    ε    2    ε     1     (1+ε)/2    ε

[Figure: F_i(A) truncated at level c, plotted against |A|.]
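
A sketch of the truncation in Python, reusing the example's two objectives (written as best-element functions that reproduce the table):

```python
def truncated_avg(Fs, c):
    """F'_avg,c(A) = (1/m) * sum_i min(F_i(A), c). Truncating a submodular
    function at c keeps it submodular, so the average is submodular too, and
    F'_avg,c(A) = c holds exactly when min_i F_i(A) >= c."""
    return lambda A: sum(min(F(A), c) for F in Fs) / len(Fs)

eps = 0.01
w1 = {'x': 1.0, 'y': 0.0, 'z': eps}
w2 = {'x': 0.0, 'y': 2.0, 'z': eps}
F1 = lambda A: max((w1[s] for s in A), default=0.0)
F2 = lambda A: max((w2[s] for s in A), default=0.0)

F_avg = truncated_avg([F1, F2], c=1.0)
print(F_avg({'x', 'y'}))  # 1.0: certifies min_i F_i >= 1
print(F_avg({'x', 'z'}))  # (1 + eps) / 2, still short of c
```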

12 Why is this useful? We can use the greedy algorithm to find an (approximate) solution! Proposition: the greedy algorithm finds A_G with |A_G| ≤ αk and F'_avg,c(A_G) = c, where α = 1 + log max_s Σ_i F_i({s}).

13 Back to our example. Guess c = 1. Greedy on F'_avg,1 first picks x, then picks y: the optimal solution! But how do we find c?

Set      F1   F2   min_i F_i   F'_avg,1
{x}      1    0    0           1/2
{y}      0    2    0           1/2
{z}      ε    ε    ε           ε
{x,y}    1    2    1           1
{x,z}    1    ε    ε           (1+ε)/2
{y,z}    ε    2    ε           (1+ε)/2

14 Submodular Saturation Algorithm. Given a set V, an integer k, and functions F_1, …, F_m: initialize c_min = 0 and c_max = min_i F_i(V), then binary search. Set c = (c_min + c_max)/2 and use the greedy algorithm to find A_G such that F'_avg,c(A_G) = c. If |A_G| > αk, c is too high: decrease c_max. If |A_G| ≤ αk, c is too low (achievable): increase c_min. Repeat until convergence. A compact sketch follows below.
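
Putting the pieces together, a compact sketch of the whole loop (the greedy partial-cover subroutine is inlined; alpha is the relaxation factor from slide 12):

```python
def saturate(Fs, V, k, alpha=1.0, tol=1e-3):
    """Binary search over the target value c. gpc(c) greedily grows A on the
    truncated average objective until it saturates at value c; the search
    keeps the largest c whose cover fits within alpha * k observations."""
    def gpc(c):
        F_avg = lambda A: sum(min(F(A), c) for F in Fs) / len(Fs)
        A = set()
        while F_avg(A) < c - 1e-12 and A != V:
            A.add(max(V - A, key=lambda s: F_avg(A | {s})))
        return A

    c_lo, c_hi = 0.0, min(F(V) for F in Fs)
    A_best = set()
    while c_hi - c_lo > tol:
        c = (c_lo + c_hi) / 2.0
        A = gpc(c)
        if len(A) > alpha * k:
            c_hi = c               # cover too large: c is too high
        else:
            c_lo, A_best = c, A    # c achievable within the relaxed budget
    return A_best

# On the running example (F1, F2 as in the sketch above), k = 2:
# saturate([F1, F2], {'x', 'y', 'z'}, 2) returns {'x', 'y'}.
```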

15 Theoretical guarantees. Theorem: the problem max_{|A| ≤ k} min_i F_i(A) does not admit any approximation unless P = NP. Theorem: Saturate finds a solution A_S such that min_i F_i(A_S) ≥ OPT_k and |A_S| ≤ αk, where OPT_k = max_{|A| ≤ k} min_i F_i(A) and α = 1 + log max_s Σ_i F_i({s}). Theorem: if there were a polytime algorithm with a better constant β < α, then NP ⊆ DTIME(n^{log log n}).

16 Experiments: minimizing the maximum variance in GP regression; robust biological experimental design; outbreak detection against adversarial contaminations. Goals: compare against the state of the art, and analyze the appropriateness of the worst-case assumption.

17 Spatial prediction. We compare to the state of the art [Sacks et al. '88, Wiens '05, …]: highly tuned simulated annealing heuristics (7 parameters). Saturate is competitive and faster, and better on larger problems. [Figures: maximum marginal variance vs. number of sensors for Greedy, Simulated Annealing, and Saturate, on environmental monitoring and precipitation data; lower is better.]

18 Maximum vs. average variance. Minimizing the worst case leads to a good average-case score, but not vice versa. [Figures: environmental monitoring and precipitation data; lower is better.]

19 Outbreak detection. The results are even more prominent on water network monitoring (12,527 nodes). [Figures: water networks; lower is better.]

20 Robust experimental design. Learn the parameters θ of a nonlinear function, y_i = f(x_i, θ) + w, by choosing stimuli x_i that facilitate maximum-likelihood estimation of θ. This is a difficult optimization problem! The common approach is linearization (illustrated below): y_i ≈ f(x_i, θ_0) + ∇f_θ0(x_i)^T (θ - θ_0) + w, which admits a nice closed-form (fractional) solution. But how should we choose θ_0?
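
To make the linearization concrete, here is a sketch using the Michaelis-Menten form f(x, θ) = θ_1 x / (θ_2 + x) from the experiments; the stimuli and initial estimate θ_0 are illustrative:

```python
import numpy as np

def michaelis_menten(x, theta):
    # f(x, theta) = theta[0] * x / (theta[1] + x)
    return theta[0] * x / (theta[1] + x)

def jacobian(f, x, theta0, h=1e-6):
    """Finite-difference gradient of f with respect to theta at theta0;
    row i is the vector grad f_theta0(x_i) of the linearized model."""
    theta0 = np.asarray(theta0, dtype=float)
    J = np.zeros((len(x), len(theta0)))
    for j in range(len(theta0)):
        tp, tm = theta0.copy(), theta0.copy()
        tp[j] += h
        tm[j] -= h
        J[:, j] = (f(x, tp) - f(x, tm)) / (2 * h)
    return J

# y_i ~ f(x_i, theta0) + J[i] @ (theta - theta0) + w is linear in theta,
# so classical linear design criteria apply, but only around the guess theta0.
x = np.array([0.5, 1.0, 2.0, 4.0])   # candidate stimuli (illustrative)
theta0 = np.array([1.0, 1.0])        # the initial estimate the design hinges on
print(jacobian(michaelis_menten, x, theta0))
```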

21 Robust experimental design. State of the art [Flaherty et al., NIPS '06]: assume a perturbation of the Jacobian ∇f_θ0(x_i), solve a robust SDP against the worst-case perturbation, and minimize the maximum eigenvalue of the estimation error (E-optimality). This paper: assume a perturbation of the initial parameter estimate θ_0, use Saturate to perform well against all initial parameter estimates, and minimize the MSE of the parameter estimate (Bayesian A-optimality, typically submodular!).

22 Experimental setup. Estimate the parameters of the Michaelis-Menten model (to compare results). Evaluate the efficiency of designs: the loss of the optimal design, which knows the true parameter θ_true, versus the loss of the robust design, which assumes a (possibly wrong) initial parameter θ_0.

23 Robust design results. Saturate is more efficient than the SDP approach when optimizing under high parameter uncertainty. [Figures: design efficiency under low vs. high uncertainty in θ_0, conditions A, B, C.]

24 Future (current) work: incorporating complex constraints (communication, etc.); dealing with large numbers of objectives via constraint generation; improved guarantees for certain objectives (sensor failures); trading off worst-case and average-case scores. [Figure: adversarial score vs. expected score trade-off curves for k = 5, 10, 15, 20.]

25 Conclusions. Many observation selection problems require optimizing an adversarially chosen submodular function, a problem that is not approximable to any factor! We presented an efficient algorithm, Saturate, which achieves the optimal score with a bounded increase in cost; this guarantee is the best possible under reasonable complexity assumptions. Saturate performs well on real-world problems: it outperforms state-of-the-art simulated annealing algorithms for sensor placement, with no parameters to tune, and compares favorably with SDP-based solutions for robust experimental design.

