Scalable Multi-Stage Stochastic Programming

Presentation transcript:

1 Scalable Multi-Stage Stochastic Programming
Cosmin Petra and Mihai Anitescu, Mathematics and Computer Science Division, Argonne National Laboratory. LANS Informal Seminar, March 2010. Work sponsored by the U.S. Department of Energy, Office of Nuclear Energy, Science & Technology.

2 Problem formulation
Multi-stage stochastic programming (SP) problem with recourse: minimize the expected total cost over all stages, subject to constraints that couple each stage's decision to the previous stage's decision and to the realized uncertainty.
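The slide's formulation is an equation image; as a reference point, the standard nested form of a T-stage linear SP with recourse can be written as follows (the notation A_t, B_t, b_t, c_t is the common textbook one and may differ from the slide's):

```latex
\min_{x_1}\; c_1^\top x_1
  + \mathbb{E}\!\left[\min_{x_2}\, c_2(\xi_2)^\top x_2
  + \mathbb{E}\!\left[\;\cdots\;
  + \mathbb{E}\!\left[\min_{x_T}\, c_T(\xi_T)^\top x_T\right]\right]\right]
\quad\text{s.t.}\quad
A_1 x_1 = b_1,\qquad
B_t(\xi_t)\, x_{t-1} + A_t(\xi_t)\, x_t = b_t(\xi_t),\quad
x_t \ge 0,\quad t = 2,\dots,T,
```

where each decision x_t may depend only on the observations ξ_2, …, ξ_t made so far (nonanticipativity).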

3 Deterministic approximation
Discrete and finite random variables are represented by a scenario tree; the slide shows a three-stage tree with branching probabilities on the edges (e.g. 0.4/0.3/0.3 at the root) and realized values at the nodes. Sample average approximation (SAA) is used for continuous random variables or discrete random variables with infinite support.
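SAA replaces the expectation over the random variable by an average over a finite sample. A minimal sketch of the idea on a toy two-stage problem (the cost function, the U(0, 10) demand, and the penalty 3 are invented for illustration and are not the problem from the slides):

```python
import random

def second_stage_cost(x, xi):
    # Toy recourse cost: pay a penalty of 3 per unit of unmet demand xi
    # (a hypothetical example, not the application in the talk).
    return 3.0 * max(0.0, xi - x)

def saa_objective(x, samples):
    # Sample average approximation: replace E[Q(x, xi)] by the mean of
    # Q(x, xi) over a finite set of sampled scenarios.
    return 1.0 * x + sum(second_stage_cost(x, xi) for xi in samples) / len(samples)

random.seed(0)
samples = [random.uniform(0.0, 10.0) for _ in range(10000)]  # xi ~ U(0, 10)
# With first-stage cost 1 and penalty 3, the true optimum is the 2/3 quantile
# of demand, x* = 20/3 ~ 6.67; the SAA objective should reflect that.
vals = {x: saa_objective(x, samples) for x in [0.0, 5.0, 6.7, 10.0]}
```

As the sample grows, the SAA objective converges to the true expected cost, so minimizing the sampled problem approximates minimizing the original one.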

4 The Deterministic SAA SP Problem
Minimize the sample average of the per-scenario objectives, subject to the constraints of every sampled scenario.

5 Two-Stage SP Problem
Block-separable objective function; half-arrow-shaped Jacobian (the first-stage columns border the block-diagonal scenario blocks).

6 Multistage SP Problems
Depth-first traversal of the scenario tree (the slide shows a seven-node tree, nodes numbered 1–7); nested half-arrow-shaped Jacobian; block-separable objective function.

7 Linear Algebra of Primal-Dual Interior-Point Methods (IPM)
For a convex quadratic problem (minimize a quadratic objective subject to linear constraints), each IPM iteration solves a linear system. For a two-stage SP, this system becomes arrow-shaped via a permutation.
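For a two-stage SP with N scenarios, the permuted IPM system has the bordered block-diagonal ("arrow") shape sketched below; H_i, B_i, z_i, r_i are generic names for the scenario blocks, coupling borders, unknowns, and right-hand sides (a sketch of the structure, not the slide's exact notation):

```latex
\begin{pmatrix}
H_1 &        &     & B_1^\top \\
    & \ddots &     & \vdots   \\
    &        & H_N & B_N^\top \\
B_1 & \cdots & B_N & H_0
\end{pmatrix}
\begin{pmatrix} z_1 \\ \vdots \\ z_N \\ z_0 \end{pmatrix}
=
\begin{pmatrix} r_1 \\ \vdots \\ r_N \\ r_0 \end{pmatrix}
```

All coupling passes through the first-stage block H_0, which is why eliminating the scenario blocks leaves a small dense Schur complement S = H_0 − Σ_i B_i H_i⁻¹ B_iᵀ.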

8 Linear Algebra for Multistage SP Problems
Two-stage SP -> arrow shape. T stages (T > 2) -> nested arrow shape, obtained by a depth-first traversal of the scenario tree. Each diagonal block has the same structure as the two-stage H; effectively, a two-stage problem is solved for each non-leaf node.

9 The Direct Schur Complement Method (DSC)
Uses the arrow shape of H. Solving Hz = r proceeds in four phases: implicit factorization, back substitution, diagonal solve, and forward substitution.
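A minimal sketch of the DSC idea, with each block shrunk to a scalar so the algebra is transparent (in PIPS each h[i] is a full two-stage KKT block and the Schur complement is a dense matrix, not a number):

```python
def solve_arrow(h, b, h0, r, r0):
    """Solve the arrow-shaped system
        [diag(h)   b ] [z ]   [r ]
        [  b^T    h0 ] [z0] = [r0]
    by the direct Schur complement method, with scalar blocks for clarity.
    """
    # Implicit factorization: form the Schur complement of the scenario blocks.
    s = h0 - sum(bi * bi / hi for bi, hi in zip(b, h))
    # Substitution phase: fold the scenario right-hand sides into r0
    # (each term is independent, hence parallel over scenarios).
    rhs0 = r0 - sum(bi * ri / hi for bi, ri, hi in zip(b, r, h))
    # Dense solve on the (small) Schur complement.
    z0 = rhs0 / s
    # Recover each scenario unknown independently (again parallel).
    z = [(ri - bi * z0) / hi for ri, bi, hi in zip(r, b, h)]
    return z, z0

# Three "scenarios" with scalar blocks.
h, b, h0 = [2.0, 4.0, 5.0], [1.0, 1.0, 2.0], 10.0
r, r0 = [3.0, 6.0, 5.0], 8.0
z, z0 = solve_arrow(h, b, h0, r, r0)
```

The per-scenario work (the sums and the final recovery loop) is what the next slide distributes across processes; only the Schur complement solve is inherently serial.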

10 Parallelizing the Schur Complement
Scenario-based parallelization. The serial bottlenecks come from the Schur complement phases: implicit factorization, back substitution, diagonal solve, forward substitution. Related work: Gondzio's OOPS; Zavala et al., 2007 (in IPOPT). Cost model: S scenarios, each with cost w; cost of the Schur complement is c; P processes are used.

11 The Preconditioned Schur Complement (PSC)
Goal: "hide" the expensive Schur complement term from the parallel execution flow. How? Krylov subspace iterative methods for the Schur complement system: solve it iteratively, so that each iteration needs only products with the Schur complement matrix C rather than C fully formed. A preconditioner M for C should be cheap to invert (or cheap to compute) and should cluster the eigenvalues of the preconditioned matrix.
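The two requirements on M (cheap to apply, clusters the spectrum) can be seen in any preconditioned Krylov method. A generic preconditioned CG sketch on a small SPD system, with a Jacobi (diagonal) preconditioner standing in for the far more sophisticated stochastic/constraint preconditioner of the talk (the matrix and right-hand side are invented test data):

```python
def pcg(A, b, apply_Minv, tol=1e-10, maxit=200):
    # Preconditioned conjugate gradients on a small dense SPD system.
    # The preconditioner enters only through the cheap "apply M^{-1}"
    # operation -- the key requirement stated on the slide.
    n = len(b)
    matvec = lambda v: [sum(Ai[j] * v[j] for j in range(n)) for Ai in A]
    x = [0.0] * n
    r = b[:]                      # residual b - A x with x = 0
    z = apply_Minv(r)
    p = z[:]
    rz = sum(ri * zi for ri, zi in zip(r, z))
    for it in range(maxit):
        Ap = matvec(p)
        alpha = rz / sum(pi * qi for pi, qi in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, Ap)]
        if max(abs(ri) for ri in r) < tol:
            return x, it + 1
        z = apply_Minv(r)
        rz_new = sum(ri * zi for ri, zi in zip(r, z))
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, maxit

# Ill-scaled SPD test matrix; the diagonal preconditioner rescales it so
# the eigenvalues of M^{-1}A cluster near 1.
A = [[100.0, 1.0, 0.0], [1.0, 10.0, 1.0], [0.0, 1.0, 1.0]]
b = [1.0, 2.0, 3.0]
x, iters = pcg(A, b, lambda r: [r[i] / A[i][i] for i in range(3)])
```

PIPS itself uses BiCGStab/PCG with the constraint preconditioner (slides 15–16); the PCG above only illustrates the mechanics.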

12 The Stochastic Preconditioner
The exact Schur complement C is a (weighted) sum of per-scenario contributions. Take an i.i.d. subset of n scenarios: the stochastic preconditioner is the corresponding subsampled matrix. For C itself, use the constraint preconditioner (Keller et al., 2000).
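The subsampling idea can be sketched with the scenario contributions reduced to scalars (in reality each contribution is a dense matrix; the Gaussian data below is invented for illustration):

```python
import random

random.seed(1)
S, n = 2000, 100
# Per-scenario contributions to the Schur complement, as scalars.
contrib = [1.0 + random.gauss(0.0, 0.3) for _ in range(S)]
C = sum(contrib) / S                    # "exact" averaged Schur complement
# Stochastic preconditioner: the same average over a random subset of
# n scenarios (random.sample draws without replacement; the slide's
# analysis assumes an i.i.d. subsample, which behaves similarly here).
M = sum(random.sample(contrib, n)) / n
rel_err = abs(M - C) / abs(C)
```

The subsample mean concentrates around the full mean as n grows, which is the intuition behind the Hoeffding-based quality estimate on slide 14.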

13 But I said that … it has to be cheap to solve with the preconditioner
Solves are performed with the factors of M, and the factorization of M is done before C is computed. The cost of the Krylov solve is only slightly larger than before, even when one process is dedicated to M (a separate process).

14 Quality of the Stochastic Preconditioner
"Exponentially" better preconditioning as the sample size grows. Proof: Hoeffding's inequality, under assumptions on the problem's random data: boundedness, and uniform full rank of the random constraint matrices.

15 Quality of the Constraint Preconditioner
The preconditioned matrix has an eigenvalue 1 of high multiplicity; the remaining eigenvalues satisfy bounds that keep them clustered. Proof: based on Bergamaschi et al., 2004.

16 The Krylov Methods Used
BiCGStab with the constraint preconditioner M. Projected CG (PCG) (Gould et al., 2001): a preconditioned projection onto the null space of the constraints; it does not compute a basis for the null space explicitly, instead applying the projection implicitly through solves with M.

17 Observations
Real-life performance on linear SAA SP problems: few Krylov iterations for more than half of the IPM iterations, rising to several tens of inner iterations as the IPM approaches the solution. PCG takes fewer iterations than BiCGStab. Performance is affected by the well-known ill-conditioning of IPM systems; for convex quadratic SP, it should improve.

18 A Parallel Interior-Point Solver for Stochastic Programming (PIPS)
Solves convex QP SAA SP problems. Input: users specify the scenario tree. Object-oriented design based on OOQP. Linear algebra: tree vectors, tree matrices, tree linear systems. Scenario-based parallelism: tree nodes (scenarios) are distributed across processors; inter-process communication is based on MPI; dynamic load balancing. Mehrotra predictor-corrector IPM.

19 Tree Linear Algebra – Data, Operations & Linear Systems
Tree vectors: b, c, x, etc. Tree symmetric matrix: Q. Tree general matrix: A. Operations are defined node-by-node over the tree. Linear systems: for each non-leaf node, a two-stage problem is solved via the Schur complement methods described previously.

20 Parallelization – Tree Distribution
The tree is distributed across processes; the slide shows an example with 3 processes, each assigned a subset of the seven tree nodes. Dynamic load balancing of the tree: a number partitioning problem, handled as graph partitioning via METIS.
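PIPS reduces load balancing to graph partitioning and uses METIS; purely to illustrate the underlying number partitioning problem, here is a simple longest-processing-time greedy heuristic (the per-node loads are invented, and this is not the METIS algorithm):

```python
import heapq

def greedy_partition(loads, nprocs):
    # Longest-processing-time heuristic for number partitioning:
    # assign each node (heaviest first) to the currently least-loaded
    # process, tracked with a min-heap keyed on total load.
    heap = [(0.0, p, []) for p in range(nprocs)]  # (total load, proc id, nodes)
    heapq.heapify(heap)
    for node, w in sorted(enumerate(loads), key=lambda kv: -kv[1]):
        total, p, nodes = heapq.heappop(heap)
        nodes.append(node)
        heapq.heappush(heap, (total + w, p, nodes))
    return sorted(heap, key=lambda t: t[1])    # order by process id

# Example: 7 tree nodes with unequal per-node work, 3 processes.
parts = greedy_partition([5.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0], 3)
per_proc = [t[0] for t in parts]
```

For this data the heuristic balances the total work of 15.0 perfectly across the three processes; real scenario trees also need the partition to respect the tree's edge structure, which is where graph partitioning comes in.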

21 Numerical Experiments
Experiments on two two-stage SP problems: economic optimization of a building energy system, and unit commitment with wind power generation. Building energy system problem: the size of the problem was artificially increased; there is no practical benefit in solving the original with that many CPUs. Unit commitment problem: the original problem uses the Illinois wind-farm grid; the solved instances are 3x the original problem, with up to 13x more sampling data. Strong scaling was investigated.

22 Economic Optimization of a Building Energy System
Zavala et al., 2009. 1.2 million variables, few degrees of freedom. Size of the Schur complement: 357. Uncertainty appears only in the RHS of the SP.

23 PIPS on the Building Energy System Problem
The Direct Schur Complement – parallel scaling 97.6% parallel efficiency from 5 to 50 processors

24 Stochastic Unit Commitment with Wind Power Generation
Constantinescu et al., 2009. MILP (the relaxation is solved). Largest instance: 664k variables, 1.4M constraints, 400 scenarios, Schur complement matrix of size 2.2k. 70% parallel efficiency from 10 to 200 CPUs. Uncertainty is again only in the RHS.

25 PIPS on a Unit Commitment Problem
DSC on P processes vs. PSC on P+1 processes; 120 scenarios. There is an optimal process count for PSC: beyond it, the factorization of the preconditioner can no longer be hidden.

26 Conclusions The DSC method offers good parallelism in an IPM framework. The PSC method further improves scalability. PIPS is a solver for SP problems, and it appears ready for larger problems.

27 Future work New scalable methods for more efficient software
Linear-algebra and optimization methods that take into account the characteristics of the particular sampling method; SAA error estimates. Looking for applications: parallelizing the second-stage subproblems in a two-stage setup; using multi-stage SP with slim nodes. PIPS, continuous case: IPM hot-start, other stochastic preconditioners, etc. Use with MILP/MINLP solvers. Many other small enhancements.

28 Thank you for your attention!
Questions?

