Published by Joseph Nash. Modified over 3 years ago.

1
Blind online optimization
Gradient descent without a gradient
Abie Flaxman (CMU), Adam Tauman Kalai (TTI), Brendan McMahan (CMU)

2
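Standard convex optimization
Convex feasible set $S \subseteq \mathbb{R}^d$
Concave function $f : S \to \mathbb{R}$
Goal: find $x$ with $f(x) \ge \max_{z \in S} f(z) - \epsilon = f(x^*) - \epsilon$
[Figure: feasible set in $\mathbb{R}^d$ with optimum $x^*$]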

3
Steepest ascent
Move in the direction of steepest ascent
Compute $f'(x)$ ($\nabla f(x)$ in higher dimensions)
Works for convex optimization (and many other problems)
[Figure: iterates $x_1, x_2, x_3, x_4$]
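A minimal sketch of the steepest-ascent loop described on this slide, assuming the gradient can be computed directly. The objective, the fixed step size, and the iteration count are hypothetical choices for illustration, not taken from the talk.

```python
def steepest_ascent(grad, x0, step=0.1, iters=200):
    """Repeatedly move in the direction of the gradient (steepest ascent)."""
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        x = [xi + step * gi for xi, gi in zip(x, g)]
    return x


def grad_f(x):
    """Gradient of the hypothetical concave objective
    f(x) = -(x[0] - 1)^2 - (x[1] + 2)^2, which is maximized at (1, -2)."""
    return [-2.0 * (x[0] - 1.0), -2.0 * (x[1] + 2.0)]


x_final = steepest_ascent(grad_f, [0.0, 0.0])
print(x_final)  # approaches the maximizer (1, -2)
```

With this step size each coordinate's distance to the maximizer shrinks by a factor of 0.8 per iteration, so the loop converges quickly on this particular objective.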

4
Typical application
Company produces certain numbers of cars per month
Vector $x \in \mathbb{R}^d$ (#Corollas, #Camrys, …)
Profit of company is concave function of production vector
Maximize total (equivalently, average) profit
PROBLEMS

5
Problem definition and results
Sequence of unknown concave functions
Period $t$: pick $x_t \in S$ ($S$ convex), find out only $f_t(x_t)$
Theorem:

6
Online model
Holds for arbitrary sequences
Stronger than stochastic model:
– $f_1, f_2, \ldots$ i.i.d. from $D$
– $x^* = \arg\max_{x \in S} E_D[f(x)]$
Expected regret
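To make "regret" concrete, here is a small sketch. It assumes the usual definition for maximization (total payoff of the best fixed point in hindsight minus total payoff actually earned) and approximates the set $S = [0, 1]$ by a grid. The payoff functions and plays are made up for illustration.

```python
def regret(fs, plays, candidates):
    """Best fixed candidate's total payoff minus the player's total payoff."""
    earned = sum(f(x) for f, x in zip(fs, plays))
    best_fixed = max(sum(f(z) for f in fs) for z in candidates)
    return best_fixed - earned


def make_payoff(center):
    """Hypothetical concave payoff f(x) = -(x - center)^2."""
    return lambda x: -(x - center) ** 2


fs = [make_payoff(c) for c in (0.2, 0.8, 0.5, 0.5)]
plays = [0.0, 0.2, 0.8, 0.5]            # points chosen in periods 1..4
grid = [i / 100 for i in range(101)]    # finite stand-in for S = [0, 1]

r = regret(fs, plays, grid)
print(r)  # about 0.31: best fixed point 0.5 earns -0.18, the plays earn -0.49
```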

7
Outline
Problem definition
Simple algorithm
Analysis sketch
Variations
Related work & applications

8
First try
[Figure: profit vs. #Camrys; functions $f_1, \ldots, f_4$ with plays $x_1, \ldots, x_4$, values $f_t(x_t)$, and optimum $x^*$]
Zinkevich 03: If we could only compute gradients…

9
Idea: one point gradient
With probability ½, estimate = $f(x + \delta)/\delta$
With probability ½, estimate = $-f(x - \delta)/\delta$
$E[\text{estimate}] \approx f'(x)$
[Figure: profit vs. #Camrys, with points $x$, $x + \delta$, $x - \delta$]
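The one-point idea can be checked numerically. The sketch below assumes the perturbation size is a small positive number, here called `delta`; the profit function is hypothetical. Averaging the two equally likely outcomes gives the central difference $(f(x+\delta) - f(x-\delta)) / (2\delta)$, which approximates $f'(x)$.

```python
import random


def one_point_estimate(f, x, delta, rng):
    """Evaluate f at a single randomly chosen point near x."""
    if rng.random() < 0.5:
        return f(x + delta) / delta
    return -f(x - delta) / delta


def expected_estimate(f, x, delta):
    """Exact average of the two equally likely outcomes."""
    return 0.5 * (f(x + delta) / delta) + 0.5 * (-f(x - delta) / delta)


def profit(x):
    """Hypothetical concave profit; its derivative at x is -2 * (x - 3)."""
    return -(x - 3.0) ** 2 + 10.0


x, delta = 1.0, 0.01
sample = one_point_estimate(profit, x, delta, random.Random(0))
mean = expected_estimate(profit, x, delta)
print(sample, mean)  # the mean is close to f'(1) = 4; a single sample is far noisier
```

Each individual estimate has magnitude on the order of $f(x)/\delta$, so the estimator is unbiased for the central difference but has high variance when `delta` is small.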

10
d-dimensional online algorithm
[Figure: feasible set $S$ with iterates $x_1, x_2, x_3, x_4$]
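The algorithm itself appears only as a figure here, so the following is a hedged sketch of one common form of blind (bandit) gradient ascent, not necessarily the exact procedure on the slide. Assumptions: $S$ is a Euclidean ball of radius `radius` centered at the origin, the gradient estimate is $(d/\delta)\, f_t(x_t)\, u_t$ for a uniformly random unit vector $u_t$, and the step size `eta` is fixed. All names and parameter values are illustrative.

```python
import math
import random


def random_unit_vector(d, rng):
    """Uniform direction on the unit sphere via normalized Gaussians."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 0.0:
            return [c / norm for c in v]


def project_to_ball(y, radius):
    """Euclidean projection onto the origin-centered ball of the given radius."""
    norm = math.sqrt(sum(c * c for c in y))
    if norm <= radius:
        return y
    return [c * radius / norm for c in y]


def blind_gradient_ascent(fs, d, radius, delta, eta, rng):
    """Play one point per period; only the value f_t(x_t) is observed."""
    y = [0.0] * d
    plays = []
    for f in fs:
        u = random_unit_vector(d, rng)
        x = [yi + delta * ui for yi, ui in zip(y, u)]   # point actually played
        value = f(x)                                    # the only feedback
        scale = eta * (d / delta) * value               # one-point gradient estimate
        y = [yi + scale * ui for yi, ui in zip(y, u)]
        y = project_to_ball(y, radius - delta)          # keep y + delta*u inside S
        plays.append(x)
    return plays


def payoff(x):
    """Hypothetical concave payoff with maximizer (0.5, 0.5)."""
    return 1.0 - sum((c - 0.5) ** 2 for c in x)


plays = blind_gradient_ascent([payoff] * 50, d=2, radius=1.0,
                              delta=0.1, eta=0.01, rng=random.Random(0))
print(plays[-1])
```

Projecting onto the slightly shrunken ball is what guarantees that every played point $y_t + \delta u_t$ stays feasible.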

11
Outline
Problem definition
Simple algorithm
Analysis sketch
Variations
Related work & applications

12
Analysis ingredients
E[1-point estimate] is the gradient of a smoothed version $\hat{f}$ of $f$
$\hat{f} - f$ is small
Online gradient ascent analysis [Z03]
Online expected gradient ascent analysis
(Hidden complications)

13
1-pt gradient analysis
[Figure: profit vs. #Camrys, with points $x + \delta$ and $x - \delta$]

14
1-pt gradient analysis (d-dim)
E[1-point estimate] is the gradient of a smoothed version $\hat{f}$ of $f$
$\hat{f} - f$ is small
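One standard way to make these two statements precise is sketched below, assuming the smoothing is uniform averaging over a ball of radius $\delta$. The symbols $\hat{f}$, $B$ (unit ball), $\partial B$ (unit sphere), and the Lipschitz constant $L$ are notation introduced here for illustration.

```latex
\hat{f}(x) = \mathbb{E}_{v \in B}\bigl[f(x + \delta v)\bigr],
\qquad
\nabla \hat{f}(x) = \frac{d}{\delta}\,
  \mathbb{E}_{u \in \partial B}\bigl[f(x + \delta u)\, u\bigr],
\qquad
\bigl|\hat{f}(x) - f(x)\bigr| \le L\,\delta
  \quad \text{if } f \text{ is } L\text{-Lipschitz.}
```

The middle identity says that $(d/\delta)\, f(x + \delta u)\, u$, computed from a single evaluation, is an unbiased estimate of $\nabla \hat{f}(x)$; the last bound holds because $\|\delta v\| \le \delta$ for every $v$ in the unit ball.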

15
Online gradient ascent [Z03] (concave, bounded gradient)
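A one-dimensional sketch of projected online gradient ascent in the spirit of [Z03], for the sighted case where each period's gradient is available. The interval $S = [\mathit{lo}, \mathit{hi}]$, the step schedule `eta0 / sqrt(t)`, and the payoff functions are assumptions for illustration.

```python
def project_to_interval(x, lo, hi):
    """Projection onto the interval [lo, hi]."""
    return max(lo, min(hi, x))


def online_gradient_ascent(grads, x0, lo, hi, eta0):
    """grads[t-1](x) returns the derivative of f_t at x.
    The decaying step size eta0 / sqrt(t) is an assumed schedule."""
    x = x0
    plays = []
    for t, grad in enumerate(grads, start=1):
        plays.append(x)                                   # play x_t, then learn f_t
        x = project_to_interval(x + (eta0 / t ** 0.5) * grad(x), lo, hi)
    return plays


def grad_payoff(x):
    """Derivative of the hypothetical payoff f_t(x) = -(x - 0.7)^2."""
    return -2.0 * (x - 0.7)


plays = online_gradient_ascent([grad_payoff] * 200, x0=0.0, lo=0.0, hi=1.0, eta0=0.25)
print(plays[-1])  # close to the maximizer 0.7
```

Here every period uses the same function, so the plays simply converge; the online analysis covers the harder case where $f_t$ changes arbitrarily from period to period.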

16
Expected gradient ascent analysis
Regular deterministic gradient ascent on $g_t$
(concave, bounded gradient)

17
Adaptive adversary…

18
Hidden complication…
[Figure: feasible set $S$]

19
[Figure: feasible set $S$]

20
[Figure: feasible set $S$]

21
Thin sets are bad
[Figure: thin feasible set $S$]

22
Hidden complication…
Round sets are good
…reshape into isotropic position [LV03]

23
Outline
Problem definition
Simple algorithm
Analysis sketch
Variations
Related work & applications

24
Variations
Works against adaptive adversary
– Chooses $f_t$ knowing $x_1, x_2, \ldots, x_{t-1}$
Also works if we only get a noisy estimate of $f_t(x_t)$, i.e. $E[h_t(x_t) \mid x_t] = f_t(x_t)$
[Formula labels: diameter, gradient bound]

25
Related convex optimization
Regular (single f)
– Sighted (see entire function(s)): Gradient descent, ...
– Blind (evaluations only): Ellipsoid, Random walk [BV02], Sim. annealing [KV05], Finite difference
Stochastic (dist over f's or dist over errors)
– Sighted: Gradient descent (stoch.)
– Blind: 1-pt. gradient appx. [G89,S97], Finite difference
Online ($f_1, f_2, f_3, \ldots$)
– Sighted: Gradient descent (online) [Z03]
– Blind: 1-pt. gradient appx. [BKM04], Finite difference [Kleinberg04]

26
Related discrete optimization
Linear function(s) over discrete set
Regular (single f)
– Sighted (see entire function(s)): Shortest path, max, …
Stochastic (dist over f's)
– Sighted: Huffman trees, …
Online ($f_1, f_2, f_3, \ldots$)
– Sighted: Weighted majority, …, Online linear optimization [Hannan57,KV03]
– Blind aka bandit (evaluations only): Adversarial bandits, Blind linear optimization [AK04, MB04 (adaptive adversary)]

27
Switching lanes (experts)
[Figure: numeric example over the set $S$]

28
Multi-armed bandit (experts) [R52,ACFS95,…]
[Figure: numeric example over the set $S$]

29
Driving to work (online routing)
Exponentially many paths…
Exponentially many slot machines?
Finite dimensions
Exploration/exploitation tradeoff
[TW02,KV02, AK04,BM04]

30
Online product design

31
High dimensions
One-dimensional problem easy
Discretize, special case of multi-armed bandit problem
$1/\epsilon$ slot machines
No need for convexity
d-dimensional problem harder
Discretizing at granularity $\epsilon$
Exp many ($(1/\epsilon)^d$) slot machines $\Rightarrow$ exponential regret
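The blow-up from discretizing is simple arithmetic. The sketch below assumes the domain is the unit cube $[0, 1]^d$ and counts grid cells at a granularity called `eps` here (a name chosen for illustration).

```python
def num_slot_machines(eps, d):
    """Number of grid cells when [0, 1]^d is discretized at granularity eps."""
    per_axis = round(1 / eps)
    return per_axis ** d


for d in (1, 2, 5, 10):
    print(d, num_slot_machines(0.1, d))  # 10, 100, 100000, 10000000000
```

Treating each cell as a separate slot machine is workable for $d = 1$ but grows exponentially with the dimension.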

32
Non-linear applications

33
Conclusions and future work
Can learn to optimize a sequence of unrelated functions from evaluations
Answer to: What is the sound of one hand clapping?
Applications
– Cholesterol
– Paper airplanes
– Advertising
Future work
– Many players using same algorithm (game theory)
