
1 Sampling-based Approximation Algorithms for Multi-stage Stochastic Optimization Chaitanya Swamy University of Waterloo Joint work with David Shmoys Cornell University

2 Stochastic Optimization Way of modeling uncertainty. Exact data is unavailable or expensive – data is uncertain, specified by a probability distribution. Want to make the best decisions given this uncertainty in the data. Applications in logistics, transportation models, financial instruments, network design, production planning, … Dates back to the 1950s and the work of Dantzig.

3 Stochastic Recourse Models Given: probability distribution over inputs. Stage I: make some advance decisions – plan ahead or hedge against uncertainty. Uncertainty evolves through various stages; learn new information in each stage. Can take recourse actions in each stage – can augment the earlier solution, paying a recourse cost. Choose initial (stage I) decisions to minimize (stage I cost) + (expected recourse cost).

4 2-stage problem ≡ 2 decision points; k-stage problem ≡ k decision points. [Figure: two scenario trees – a 2-stage tree branching from stage I into stage II scenarios with probabilities 0.2, 0.02, 0.3, 0.1, and a k-stage tree branching from stage I through intermediate stages (edge probabilities 0.5, 0.2, 0.3, 0.4) down to the scenarios in stage k.]

5 2-stage problem ≡ 2 decision points; k-stage problem ≡ k decision points. Choose stage I decisions to minimize expected total cost = (stage I cost) + E_{all scenarios}[cost of stages 2 … k]. [Figure: the same scenario trees as on the previous slide.]

6 Stochastic Set Cover (SSC) Universe U = {e_1, …, e_n}, subsets S_1, S_2, …, S_m ⊆ U; set S has weight w_S. Deterministic problem: pick a minimum-weight collection of sets that covers each element. Stochastic version: the target set of elements to be covered is given by a probability distribution. – The target subset A ⊆ U to be covered (the scenario) is revealed after k stages. – Choose some sets S_i initially – stage I. – Can pick additional sets S_i in each stage, paying a recourse cost. [Figure: stage I decisions, then target subsets A_1 ⊆ U, …, A_k ⊆ U revealed over the stages.] Minimize Expected Total Cost = E_{scenarios A ⊆ U}[cost of sets picked for scenario A in stages 1, …, k].
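To make the model concrete, here is a minimal Python sketch (not from the talk; the greedy recourse heuristic and all instance data are illustrative) that evaluates the expected total cost of a fixed stage I choice when the distribution is given as an explicit scenario list:

def greedy_cover(uncovered, sets, weights):
    """Greedily pick sets until `uncovered` is empty; return total weight."""
    uncovered, cost = set(uncovered), 0.0
    while uncovered:
        # pick the set with the best weight / new-coverage ratio
        i = min((i for i in range(len(sets)) if sets[i] & uncovered),
                key=lambda i: weights[i] / len(sets[i] & uncovered))
        cost += weights[i]
        uncovered -= sets[i]
    return cost

def expected_cost(stage1, sets, w1, w2, scenarios):
    """stage1: set indices picked in stage I; scenarios: list of (prob, target)."""
    covered = set().union(*(sets[i] for i in stage1)) if stage1 else set()
    cost1 = sum(w1[i] for i in stage1)
    return cost1 + sum(p * greedy_cover(A - covered, sets, w2)
                       for p, A in scenarios)

sets = [{1, 2}, {2, 3}, {3, 4}]
w1, w2 = [1.0, 1.0, 1.0], [2.0, 2.0, 2.0]       # stage II twice as costly
scenarios = [(0.5, {1, 2}), (0.5, {3, 4})]
print(expected_cost([0], sets, w1, w2, scenarios))   # 1 + 0.5*0 + 0.5*2 = 2.0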

7–9 [Figures: an example instance unfolding over three stages – stage I, stage II branches with probabilities 0.2 and 0.8, stage III branches with probabilities 0.5, 0.3, 0.7 – with sets/elements A, B, C, D appearing at the tree nodes as the scenario is revealed.]

10 Stochastic Set Cover (SSC) Universe U = {e_1, …, e_n}, subsets S_1, S_2, …, S_m ⊆ U; set S has weight w_S. Deterministic problem: pick a minimum-weight collection of sets that covers each element. Stochastic version: the target set of elements to be covered is given by a probability distribution. How is the probability distribution on subsets specified? – A short (polynomial) list of possible scenarios. – Independent probabilities that each element exists. – A black box that can be sampled. (Sampler sketches for all three models appear below.)
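As a small illustration (my own sketches; the element names and probabilities are made up), the three access models correspond to three kinds of scenario samplers:

import random

def list_sampler(scenarios, rng):            # polynomial scenario model
    """scenarios: list of (prob, target_set) summing to 1."""
    r, acc = rng.random(), 0.0
    for p, A in scenarios:
        acc += p
        if r < acc:
            return A
    return scenarios[-1][1]                  # guard against float round-off

def independent_sampler(probs, rng):         # independent activation model
    """probs: {element: activation probability}."""
    return {e for e, p in probs.items() if rng.random() < p}

# black-box model: an opaque zero-argument callable the algorithm may invoke
_rng = random.Random(1)
black_box = lambda: independent_sampler({1: 0.3, 2: 0.7}, _rng)
print(black_box(), black_box())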

11 Approximation Algorithm Hard to solve the problem exactly – even special cases are #P-hard. Settle for approximate solutions: give a polytime algorithm that always finds near-optimal solutions. A is an α-approximation algorithm if: – A runs in polynomial time, – A(I) ≤ α·OPT(I) on all instances I. α is called the approximation ratio of A.

12 Previous Models Considered 2-stage problems – polynomial scenario model: Dye, Stougie & Tomasgard; Ravi & Sinha; Immorlica, Karger, Minkoff & Mirrokni. – Immorlica et al. also consider the independent activation model with proportional costs: (stage II cost) = λ·(stage I cost), e.g., w_S^A = λ·w_S for each set S, in each scenario A. – Gupta, Pál, Ravi & Sinha: black-box model, but also with proportional costs. – Shmoys, S (SS04): black-box model with arbitrary costs; gave an approximation scheme for 2-stage LPs + a rounding procedure that “reduces” stochastic problems to their deterministic versions.

13 Previous Models (contd.) Multi-stage problems – Hayrapetyan, S & Tardos: O(k)-approximation algorithm for k-stage Steiner tree. – Gupta, Pál, Ravi & Sinha: also other k-stage problems; 2k-approximation algorithm for Steiner tree; factors exponential in k for vertex cover, facility location. Both only consider proportional, scenario-dependent costs.

14 Our Results Give the first fully polynomial approximation scheme (FPAS) for a broad class of k-stage stochastic linear programs for any fixed k. – black-box model: arbitrary distribution. – no assumptions on costs. – algorithm is the Sample Average Approximation (SAA) method. First proof that SAA works for (a class of) k-stage LPs with poly-bounded sample size. – Shapiro ’05: k-stage programs, but with independent stages. – Kleywegt, Shapiro & Homem-De-Mello ’01: bounds for 2-stage programs. – S, Shmoys ’05: unpublished note that SAA works for 2-stage LPs. – Charikar, Chekuri & Pál ’05: another proof that SAA works for (a class of) 2-stage programs.

15 Results (contd.) FPAS + rounding technique of SS04 gives approximation algorithms for k-stage stochastic integer programs. – no assumptions on distribution or costs. – improve upon various results obtained in more restricted models: e.g., O(k)-approx. for k-stage vertex cover (VC), facility location. – Munagala; Srinivasan: improved the factor for k-stage VC to 2.

16 A Linear Program for 2-stage SSC
p_A: probability of scenario A ⊆ U. Let cost w_S^A = W_S for each set S and scenario A.
x_S: 1 if set S is picked in stage I; y_{A,S}: 1 if S is picked in scenario A.
Minimize ∑_S w_S·x_S + ∑_{A⊆U} p_A ∑_S W_S·y_{A,S}
s.t. ∑_{S:e∈S} x_S + ∑_{S:e∈S} y_{A,S} ≥ 1 for each A ⊆ U, e ∈ A
x_S, y_{A,S} ≥ 0 for each S, A.
Exponentially many variables and constraints.
Equivalent compact, convex program:
Minimize h(x) = ∑_S w_S·x_S + ∑_{A⊆U} p_A·f_A(x) s.t. 0 ≤ x_S ≤ 1 for each S, where
f_A(x) = min { ∑_S W_S·y_{A,S} : ∑_{S:e∈S} y_{A,S} ≥ 1 − ∑_{S:e∈S} x_S for each e ∈ A; y_{A,S} ≥ 0 for each S }.
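A minimal sketch of the flat LP above, assuming the polynomial-scenario model so the LP is actually polynomial-size, using scipy.optimize.linprog; the instance data are illustrative:

import numpy as np
from scipy.optimize import linprog

def solve_2stage_ssc(sets, w, W, scenarios):
    """sets: list of Python sets; w, W: stage I / stage II weights;
    scenarios: list of (p_A, target set A). Returns (opt value, stage I x)."""
    m, K = len(sets), len(scenarios)
    # variables: x_0..x_{m-1}, then a block of y_{A,S} per scenario
    c = np.concatenate([w] + [p * np.asarray(W, dtype=float)
                              for p, _ in scenarios])
    A_ub, b_ub = [], []
    for k, (_, target) in enumerate(scenarios):
        for e in target:
            row = np.zeros(m * (K + 1))
            for i, S in enumerate(sets):
                if e in S:
                    row[i] = -1.0                  # x_S term
                    row[m * (k + 1) + i] = -1.0    # y_{A,S} term
            A_ub.append(row); b_ub.append(-1.0)    # coverage constraint >= 1
    res = linprog(c, A_ub=np.array(A_ub), b_ub=b_ub, bounds=(0, None))
    return res.fun, res.x[:m]

val, x = solve_2stage_ssc([{1, 2}, {2, 3}, {3, 4}], [1, 1, 1], [2, 2, 2],
                          [(0.5, {1, 2}), (0.5, {3, 4})])
print(val, x)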

17 Sample Average Approximation
Sample Average Approximation (SAA) method:
– Sample N times from the distribution.
– Estimate p_A by q_A = frequency of occurrence of scenario A = n_A/N.
True problem: min_{x∈P} ( h(x) = w·x + ∑_{A⊆U} p_A·f_A(x) )  (P)
Sample average problem: min_{x∈P} ( h'(x) = w·x + ∑_{A⊆U} q_A·f_A(x) )  (SA-P)
Size of (SA-P) as an LP depends on N – how large should N be?
Wanted result: with poly-bounded N, x solves (SA-P) ⇒ h(x) ≈ OPT.
Possible approach: try to show that h'(.) and h(.) take similar values.
Problem: rare scenarios can significantly influence the value of h(.), but will almost never be sampled.

18 Sample Average Approximation (contd.)
Recap – true problem (P): min_{x∈P} h(x); sample average problem (SA-P): min_{x∈P} h'(x); wanted: with poly-bounded N, x solves (SA-P) ⇒ h(x) ≈ OPT.
Problem: rare scenarios can significantly influence the value of h(.), but will almost never be sampled.
Key insight: rare scenarios do not much affect the optimal first-stage decisions x*
⇒ instead of the function value, look at how the function varies with x
⇒ show that the “slopes” of h'(.) and h(.) are “close” to each other.
[Figure: h(x) and h'(x) plotted against x; both are minimized near the same point x*.]
(A code sketch of the SAA loop follows.)
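A minimal sketch of the SAA loop (my code; the black-box sampler is illustrative), reusing the solve_2stage_ssc helper from the LP sketch above: draw N samples, replace p_A by the empirical frequency q_A = n_A/N, and solve (SA-P):

import random
from collections import Counter

def saa(sample_scenario, N, sets, w, W, seed=0):
    """sample_scenario(rng) plays the black box: returns one target set."""
    rng = random.Random(seed)
    counts = Counter(frozenset(sample_scenario(rng)) for _ in range(N))
    empirical = [(n / N, set(A)) for A, n in counts.items()]   # (q_A, A)
    return solve_2stage_ssc(sets, w, W, empirical)

# black box: each element appears independently with prob. 0.5 (illustrative)
def sampler(rng):
    return {e for e in (1, 2, 3, 4) if rng.random() < 0.5}

val, x = saa(sampler, N=200, sets=[{1, 2}, {2, 3}, {3, 4}],
             w=[1, 1, 1], W=[2, 2, 2])
print(val, x)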

19 Closeness-in-subgradients
True problem: min_{x∈P} ( h(x) = w·x + ∑_{A⊆U} p_A·f_A(x) )  (P)
Sample average problem: min_{x∈P} ( h'(x) = w·x + ∑_{A⊆U} q_A·f_A(x) )  (SA-P)
Slope ≈ subgradient. d ∈ R^m is a subgradient of h(.) at u if, for all v, h(v) − h(u) ≥ d·(v − u). d is an ε-subgradient of h(.) at u if, for all v ∈ P, h(v) − h(u) ≥ d·(v − u) − ε·h(v) − ε·h(u). (A numeric check of these definitions appears below.)
Closeness-in-subgradients: at “most” points u in P, there exists a vector d'_u such that
(*) d'_u is a subgradient of h'(.) at u, AND an ε-subgradient of h(.) at u.
This holds with high probability for h(.) and h'(.).
Lemma: for any convex functions g(.), g'(.), if (*) holds then: x solves min_{x∈P} g'(x) ⇒ x is a near-optimal solution to min_{x∈P} g(x).
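A quick numeric check of the two definitions on a toy function (h(x) = ||x||², whose subgradient at u is 2u); the trial points and the perturbation are arbitrary:

import numpy as np

def is_eps_subgradient(h, u, d, eps, trial_points):
    """Test h(v) - h(u) >= d.(v-u) - eps*h(v) - eps*h(u) at the trial points."""
    return all(h(v) - h(u) >= d @ (v - u) - eps * h(v) - eps * h(u)
               for v in trial_points)

h = lambda x: float(x @ x)
u = np.array([1.0, -2.0])
pts = [np.array(p, dtype=float) for p in [(0, 0), (3, 1), (-1, 4)]]
print(is_eps_subgradient(h, u, 2 * u, 0.0, pts))         # exact subgradient: True
print(is_eps_subgradient(h, u, 2 * u + 0.3, 0.05, pts))  # perturbed d, eps slack: True here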

20 Closeness-in-subgradients (contd.)
(*) At “most” points u in P, d'_u is a subgradient of h'(.) at u AND an ε-subgradient of h(.) at u.
Lemma: for any convex functions g(.), g'(.), if (*) holds then: x solves min_{x∈P} g'(x) ⇒ x is a near-optimal solution to min_{x∈P} g(x).
Intuition: the minimizer of a convex function is determined by its subgradients. The ellipsoid-based algorithm of SS04 for convex minimization uses only (ε-)subgradients: it uses an (ε-)subgradient to cut the ellipsoid at a feasible point u in P.
(*) ⇒ can run the SS04 algorithm on both min_{x∈P} g(x) and min_{x∈P} g'(x), using the same vector d'_u to cut the ellipsoid at u ∈ P
⇒ the algorithm returns an x that is near-optimal for both problems.
[Figure: an ellipsoid cut at u ∈ P by the subgradient d_u, separating off the region where g(x) ≤ g(u).]
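The talk relies on the ellipsoid-based SS04 algorithm; as a much simpler stand-in illustrating the same point – that (ε-)subgradient information alone suffices for convex minimization over a box – here is a projected-subgradient sketch with a noisy subgradient oracle (all parameters illustrative, not the SS04 method itself):

import numpy as np

def projected_subgradient(subgrad, m, steps=500, eta=0.1):
    """Minimize a convex function over [0,1]^m given only a (possibly
    approximate) subgradient oracle, with step size eta/sqrt(t)."""
    x = np.full(m, 0.5)
    for t in range(1, steps + 1):
        x = np.clip(x - (eta / np.sqrt(t)) * subgrad(x), 0.0, 1.0)
    return x

# toy target: h(x) = ||x - x_star||^2; oracle returns 2(x - x_star) + noise,
# i.e., only an approximate subgradient is ever seen
x_star = np.array([0.2, 0.9, 0.4])
rng = np.random.default_rng(0)
oracle = lambda x: 2 * (x - x_star) + rng.normal(0.0, 0.01, size=3)
print(projected_subgradient(oracle, 3))   # close to x_star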

21 Proof for 2-stage SSC
True problem: min_{x∈P} ( h(x) = w·x + ∑_{A⊆U} p_A·f_A(x) )  (P)
Sample average problem: min_{x∈P} ( h'(x) = w·x + ∑_{A⊆U} q_A·f_A(x) )  (SA-P)
Let λ = max_S W_S/w_S, and let z_A be an optimal dual solution for scenario A at the point u ∈ P.
Facts from SS04:
A. the vector d_u = {d_{u,S}} with d_{u,S} = w_S − ∑_A p_A ∑_{e∈A∩S} z_{A,e} is a subgradient of h(.) at u; can write d_{u,S} = E[X_S], where X_S = w_S − ∑_{e∈A∩S} z_{A,e} in scenario A.
B. X_S ∈ [−W_S, w_S] ⇒ Var[X_S] ≤ W_S² for every set S.
C. if d' = {d'_S} is a vector such that |d'_S − d_{u,S}| ≤ ε·w_S for every set S, then d' is an ε-subgradient of h(.) at u.
A ⇒ the vector d'_u with components d'_{u,S} = w_S − ∑_A q_A ∑_{e∈A∩S} z_{A,e} = E_q[X_S] is a subgradient of h'(.) at u.
B, C ⇒ with poly(λ²/ε² · log(1/δ)) samples, d'_u is an ε-subgradient of h(.) at u with probability ≥ 1 − δ
⇒ polynomially many samples ensure that, with high probability, at “most” points u ∈ P, d'_u is an ε-subgradient of h(.) at u – property (*).
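A back-of-the-envelope version of how facts B and C give the sample bound: since X_S ∈ [−W_S, w_S] and W_S ≤ λ·w_S, Hoeffding's inequality plus a union bound over the m sets (the union bound and the concrete numbers below are my additions) gives a sample count of the poly(λ²/ε²·log(1/δ)) form stated on the slide:

import math

def samples_needed(lam, eps, m, delta):
    # Hoeffding: P(|mean - E[X_S]| > eps*w_S) <= 2 exp(-2 N eps^2 w_S^2 / (W_S + w_S)^2)
    # and W_S + w_S <= (lam + 1) w_S, so the exponent is <= -2 N eps^2 / (lam + 1)^2;
    # requiring this to be <= delta/m for every set S gives:
    return math.ceil((lam + 1) ** 2 / (2 * eps ** 2) * math.log(2 * m / delta))

print(samples_needed(lam=10, eps=0.1, m=50, delta=0.01))   # 55723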

22 3-stage SSC
[Figure: true distribution – stage I, then stage II outcome A with probability p_A rooting a 2-stage tree T_A, then stage III outcome (A,B) with probability p_{A,B}; scenario (A,B) specifies the set of elements to cover. Sampled distribution – the same tree with empirical probabilities q_A and q_{A,B}.]
– The true distribution {p_A} is estimated by {q_A}.
– The true distribution {p_{A,B}} inside T_A is only estimated by the distribution {q_{A,B}}
⇒ the true and sample-average problems solve different recourse problems for a given scenario A.
True problem: min_{x∈P} ( h(x) = w·x + ∑_A p_A·f_A(x) )  (3-P)
Sample avg. problem: min_{x∈P} ( h'(x) = w·x + ∑_A q_A·g_A(x) )  (3SA-P)
f_A(x), g_A(x) ≡ 2-stage set-cover problems specified by the tree T_A.
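A sketch (hypothetical sampler interfaces of my own; scenario objects must be hashable) of how the sampled 3-stage tree is built: sample N stage II outcomes A, and within each sampled A, sample from T_A to replace {p_{A,B}} by the empirical {q_{A,B}}:

import random
from collections import Counter

def sample_tree(sample_A, sample_B_given, N, N_inner, seed=0):
    """Returns {A: (q_A, {B: q_{A,B}})} built from empirical frequencies."""
    rng = random.Random(seed)
    outer = Counter(sample_A(rng) for _ in range(N))
    tree = {}
    for A, n in outer.items():
        inner = Counter(sample_B_given(A, rng) for _ in range(N_inner))
        tree[A] = (n / N, {B: k / N_inner for B, k in inner.items()})
    return tree

# illustrative samplers: stage II picks a subset of {1,2}; stage III may add 3
tree = sample_tree(
    lambda r: frozenset(e for e in (1, 2) if r.random() < 0.5),
    lambda A, r: frozenset(A | ({3} if r.random() < 0.3 else set())),
    N=100, N_inner=50)
print(len(tree))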

23 3-stage SSC (contd.)
True problem: min_{x∈P} ( h(x) = w·x + ∑_A p_A·f_A(x) )  (3-P)
Sample avg. problem: min_{x∈P} ( h'(x) = w·x + ∑_A q_A·g_A(x) )  (3SA-P)
Main difficulty: h(.) and h'(.) solve different recourse problems.
From the 2-stage theorem, can infer that for “most” x ∈ P, any second-stage solution y that minimizes g_A(x) also “nearly” minimizes f_A(x) – is this enough to prove the desired theorem for h(.) and h'(.)?
Suppose H(x) = min_y a(x,y) and H'(x) = min_y b(x,y), such that for all x, each y that minimizes b(x,.) also minimizes a(x,.). If x minimizes H'(.), does it also approximately minimize H(.)?
NO: e.g., a(x,y) = A(x) + (y − y_0)² and b(x,y) = B(x) + (y − y_0)², where A(.) and B(.) are different convex functions of x, so a(.), b(.) are convex functions. Every minimizer of b(x,.) (namely y = y_0) also minimizes a(x,.), yet minimizing H'(x) = B(x) says nothing about H(x) = A(x).
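A concrete instance of this counterexample (my choice of A and B): both a(x,·) and b(x,·) are minimized at y = y_0, so H(x) = A(x) and H'(x) = B(x), and the minimizer of H' can be arbitrarily bad for H:

A = lambda x: x ** 2            # H(x)  = min_y a(x,y) = A(x): minimized at x = 0
B = lambda x: (x - 1) ** 2      # H'(x) = min_y b(x,y) = B(x): minimized at x = 1
x_saa = 1.0                     # minimizes H'(.)
print(A(x_saa), A(0.0))         # H(x_saa) = 1.0, while min H = 0.0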

24 Proof sketch for 3-stage SSC
True problem: min_{x∈P} ( h(x) = w·x + ∑_A p_A·f_A(x) )  (3-P)
Sample avg. problem: min_{x∈P} ( h'(x) = w·x + ∑_A q_A·g_A(x) )  (3SA-P)
Main difficulty: h(.) and h'(.) solve different recourse problems. Will show that h(.) and h'(.) are close in subgradients.
Subgradient of h(.) at u is d_u with d_{u,S} = w_S − ∑_A p_A·(dual soln. to f_A(u)).
Subgradient of h'(.) at u is d'_u with d'_{u,S} = w_S − ∑_A q_A·(dual soln. to g_A(u)).
To show d'_u is an ε-subgradient of h(.), need that: (dual soln. to g_A(u)) is a near-optimal (dual soln. to f_A(u)).
This is a sample-average theorem for the dual of a 2-stage problem!

25 Proof sketch for 3-stage SSC (contd.)
Recall: to show d'_u is an ε-subgradient of h(.), need that each (dual soln. to g_A(u)) is a near-optimal (dual soln. to f_A(u)).
Idea: show that the two dual objective functions are close in subgradients.
Problem: cannot get closeness-in-subgradients by looking at the standard exponential-size LP dual of f_A(x), g_A(x) (shown on the next slide).

26 The recourse LP f_A(x) and its standard (exponential-size) dual:
f_A(x) = min ∑_S w_S^A·y_{A,S} + ∑_{scenarios (A,B), S} p_{A,B}·w_{B,S}^A·z_{A,B,S}
s.t. ∑_{S:e∈S} y_{A,S} + ∑_{S:e∈S} z_{A,B,S} ≥ 1 − ∑_{S:e∈S} x_S for all scenarios (A,B) and all e ∈ E(A,B)
y_{A,S}, z_{A,B,S} ≥ 0 for all scenarios (A,B) and all S.
Dual: max ∑_{A,B,e} ( 1 − ∑_{S:e∈S} x_S )·α_{A,B,e}
s.t. ∑_{scenarios (A,B), e∈S} α_{A,B,e} ≤ w_S^A for all S
∑_{e∈S} α_{A,B,e} ≤ p_{A,B}·w_{B,S}^A for all scenarios (A,B) and all S
α_{A,B,e} ≥ 0 for all scenarios (A,B) and all e ∈ E(A,B).
[Figure: the scenario tree T_A, as on slide 22.]

27 Proof sketch for 3-stage SSC (contd.)
Recall: to show d'_u is an ε-subgradient of h(.), need that each (dual soln. to g_A(u)) is a near-optimal (dual soln. to f_A(u)); the idea is to show that the two dual objective functions are close in subgradients, but this cannot be done with the standard exponential-size LP dual of f_A(x), g_A(x).
– Formulate a new compact, non-linear dual of polynomial size.
– An (approximate) subgradient of this dual objective function comes from a (near-)optimal solution to a 2-stage primal LP: use the earlier SAA result.
Recursively apply this idea to solve k-stage stochastic LPs.

28 Summary of Results Give the first approximation scheme to solve a broad class of k-stage stochastic linear programs for any fixed k. – prove that the Sample Average Approximation method works for our class of k-stage programs. Obtain approximation algorithms for k-stage stochastic integer problems – no assumptions on costs or distribution. – k·log n-approx. for k-stage set cover. (Srinivasan: log n) – O(k)-approx. for k-stage vertex cover, multicut on trees, uncapacitated facility location (FL), some other FL variants. – (1+ε)-approx. for multicommodity flow. Results improve previous results obtained in restricted k-stage models.

29 Open Questions Obtain approximation factors independent of k for k-stage (integer) problems: e.g., k-stage FL, k-stage Steiner tree. Improve the analysis of the SAA method, or obtain some other (polynomial) sampling algorithm: – any α-approx. solution to the constructed problem gives an (α+ε)-approx. solution to the true problem – better dependence on k – are exp(k) samples required? – improved sample bounds when the stages are independent?

30 Thank You.

