# Introduction to Multistage Stochastic Programming

Introduction to Multistage Stochastic Programming
Ryan Goodfellow – COSMO Laboratory

Outline
- Overview of multistage optimization
- Modelling example – investment portfolio
- Application – multistage optimization of long-term production schedules for open pit mines

References
Books:
- Shapiro, Dentcheva, Ruszczyński (2009) Lectures on Stochastic Programming: Modeling and Theory. Chapter 3. E-book available through the McGill library (i.e. free for students).
- King & Wallace (2012) Modeling with Stochastic Programming.

Online lecture notes:
- Linderoth (2003) Multistage Stochastic Programming.

Application:
- Boland, Dumitrescu, Froyland (2008) A Multistage Stochastic Programming Approach to Open Pit Mine Production Scheduling with Uncertain Geology.

Introduction
Two-stage stochastic optimization:
- Make a set of decisions (first stage).
- Profit from the outcome or clean up the mess (recourse).

Multistage stochastic optimization: information is slowly revealed over time.
1. Make a decision for today (t) based on what I know today.
2. Observe the outcome of the stochastic process in time t.
3. Set t = t + 1.
4. Go to step 1.
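The decide/observe/advance loop can be sketched in a few lines of Python. This is only an illustration: the policy `decide`, the coin-flip outcome and all function names are made up here and do not come from the slides.

```python
import random

def decide(history):
    """Hypothetical policy: uses only outcomes that have already been observed."""
    if not history:
        return 1.0                      # first-stage decision, nothing observed yet
    return 1.5 if history[-1] > 0 else 0.5

def rolling_decisions(T, seed=0):
    """Run the multistage loop for T periods and return decisions and outcomes."""
    rng = random.Random(seed)
    history, decisions = [], []
    t = 1
    while t <= T:
        x_t = decide(history)           # step 1: decide with today's knowledge
        decisions.append(x_t)
        xi_t = rng.choice([-1, 1])      # step 2: observe the outcome for period t
        history.append(xi_t)
        t += 1                          # step 3: advance; the loop is step 4
    return decisions, history

decisions, history = rolling_decisions(5)
```

The point of the sketch is that `decide` never sees an outcome before it has been appended to `history`.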

Introduction - Multistage
Notations used: Let ξt, t=1,…,T, represent a sequence of random variables with a specified probability distribution (stochastic process). Let ξ[t]:=(ξ1,…, ξt) denote the history of the process up to time t. Let xt represent the decision vector, chosen at stage t.

Introduction - Multistage
Non-anticipativity: the values of xt may depend on the information ξ[t], the data available up to time t, but may not be influenced by the result of future observations (ξt+1,…, ξT). xt is a stochastic process because it depends on ξ[t] – our decisions made today are influenced by previous decisions and outcomes.

Introduction - Multistage
Nested formulation for a T-stage stochastic program, where:
- ξ1 := (c1, A1, b1) is the known first-stage information (non-random);
- ξt := (ct, Bt, At, bt), t = 2,…,T, are data vectors in which some or all elements may be random.
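The formula itself did not survive in the transcript. For orientation, the standard nested linear form, as given in the cited Shapiro, Dentcheva and Ruszczyński chapter and written with the data names defined above, would be the following. It is offered as a likely reconstruction, not as a copy of the slide; each expectation is conditional on the history observed so far.

```latex
\[
\min_{\substack{A_1 x_1 = b_1 \\ x_1 \ge 0}} c_1^\top x_1
+ \mathbb{E}\Biggl[\,
\min_{\substack{B_2 x_1 + A_2 x_2 = b_2 \\ x_2 \ge 0}} c_2^\top x_2
+ \mathbb{E}\Biggl[\, \cdots
+ \mathbb{E}\Biggl[\,
\min_{\substack{B_T x_{T-1} + A_T x_T = b_T \\ x_T \ge 0}} c_T^\top x_T
\Biggr] \Biggr] \Biggr]
\]
```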

Introduction - Multistage
Linear formulation: How can we formally introduce “history” into our models?

Introduction – Scenario Trees
Scenario trees are used to represent the history of decision making. [Figure: scenario tree spanning Periods 1 to 4. The root node in Period 1 is deterministic: f1(x1).]

Introduction – Scenario Trees
Scenario trees are used to represent the history of decision making. [Figure: scenario tree spanning Periods 1 to 4 with ten leaves, labelled S1 to S10.] A path from the root to a leaf is called a “scenario”.

Introduction – Scenario Trees
Scenario trees are used to represent the history of decision making. [Figure: scenario tree with branch probabilities shown: 0.6, 0.4 (leaving Period 1); 0.25, 0.25, 0.5 (leaving Period 2); 0.1, 0.9, 1 (leaving Period 3).] Probabilities can be assigned to branches; each defines the conditional probability of moving from one node to the next.

Introduction – Scenario Trees
Scenario trees are used to represent the history of decision making. [Figure: the path for scenario #1, with branch probabilities 0.6, 0.25 and 0.1.] Probability of scenario #1: (0.6)(0.25)(0.1) = 0.015.
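The same calculation in code: a scenario's probability is the product of the conditional branch probabilities along its path. The function name is illustrative; the branch values are the ones shown on the slides.

```python
from math import prod

def scenario_probability(branch_probs):
    """Multiply the conditional probabilities along one root-to-leaf path."""
    return prod(branch_probs)

p_scenario_1 = scenario_probability([0.6, 0.25, 0.1])
p_scenario_2 = scenario_probability([0.6, 0.25, 0.9])

print(round(p_scenario_1, 3))  # 0.015
print(round(p_scenario_2, 3))  # 0.135
```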

Introduction – Scenario Trees
Scenario trees are used to represent the history of decision making. [Figure: scenario tree spanning Periods 1 to 4.] The constraint data (A4, B4, b4) may be equal across scenarios; however, the decision history (x1, x2, x3) is different.

Introduction – Scenario Trees
Scenario trees are used to define non-anticipativity constraints. [Figure: scenario tree spanning Periods 1 to 4 with leaves labelled x1 to x10.] For the moment, let $x_t^k$ represent the decisions made for scenario k at time t. Let $x^k$ represent the set of decisions $(x_1^k,\dots,x_T^k)$.

Introduction – Multistage Formulation
Let k ∈ {1,…,K} denote the index of a given scenario. Let $p_k$ denote the probability of scenario k. Let $x_t^k$ denote the decision vector, and $(A_t^k, B_t^k, b_t^k)$ the LHS/RHS coefficients, for scenario k in period t ∈ {2,…,T}. Let $c_t^k$ denote the objective function coefficients for decision vector $x_t^k$.

Introduction – Multistage Formulation
Multistage linear formulation:
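The slide's formula is missing from the transcript. A standard scenario-wise (split-variable) linear form, using the usual superscript-k notation for scenario k and period t, would read roughly as follows; it is a sketch under that notation, not a copy of the slide. First-stage data carry no scenario index because they are deterministic.

```latex
\[
\begin{aligned}
\min \quad & \sum_{k=1}^{K} p_k \Bigl[ c_1^\top x_1^k + \sum_{t=2}^{T} (c_t^k)^\top x_t^k \Bigr] \\
\text{s.t.} \quad & A_1 x_1^k = b_1, && k = 1,\dots,K, \\
& B_t^k x_{t-1}^k + A_t^k x_t^k = b_t^k, && t = 2,\dots,T,\; k = 1,\dots,K, \\
& x_t^k \ge 0, && t = 1,\dots,T,\; k = 1,\dots,K.
\end{aligned}
\]
```

Written this way, nothing yet links the scenarios to one another, so the problem separates into K independent programs.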

Introduction – Scenario Trees
Scenario trees are used to define non-anticipativity constraints. [Figure: scenario tree spanning Periods 1 to 4 with leaves labelled x1 to x10.] Without scenario trees or histories, solving for x1,…,x10 would amount to solving each scenario independently.

Introduction – Scenario Trees
Scenario trees are used to define non-anticipativity constraints. We can group the $x_t^k$ variables according to the scenario tree:
- Period 1: $x_1^1 = x_1^2 = \dots = x_1^{10}$
- Period 2: $x_2^1 = x_2^2 = \dots = x_2^6$ and $x_2^7 = x_2^8 = x_2^9 = x_2^{10}$
- Period 3: $x_3^1 = x_3^2$, $x_3^4 = x_3^5 = x_3^6$ and $x_3^8 = x_3^9 = x_3^{10}$
- Period 4: no grouping; each of the ten scenarios has its own decisions.
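Reading the groups off the tree can be automated. The sketch below encodes which scenarios share a node in each period (as read from the slide, with scenarios 3 and 7 assumed to sit alone in Period 3) and emits one equality per adjacent pair in a group. The data layout and function name are illustrative choices.

```python
# Scenarios sharing a node in each period, for the ten-scenario tree on the slide.
nodes = {
    1: [list(range(1, 11))],
    2: [[1, 2, 3, 4, 5, 6], [7, 8, 9, 10]],
    3: [[1, 2], [3], [4, 5, 6], [7], [8, 9, 10]],
    4: [[k] for k in range(1, 11)],
}

def nonanticipativity_pairs(nodes):
    """Return (t, k, k_next) triples, each meaning x_t^k must equal x_t^k_next."""
    pairs = []
    for t, groups in sorted(nodes.items()):
        for group in groups:
            # Chaining adjacent scenarios is enough to force the whole group equal.
            for k, k_next in zip(group, group[1:]):
                pairs.append((t, k, k_next))
    return pairs

pairs = nonanticipativity_pairs(nodes)
```

Chaining adjacent pairs keeps the number of constraints linear in the group size instead of quadratic.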

Introduction – Multistage Formulation
Multistage linear formulation: Non-anticipativity constraints
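The constraint shown on the slide is not in the transcript. In the scenario notation, with $x_t^k$ the decision for scenario k in period t and $\xi_{[t]}^k$ the history of scenario k up to period t (the superscript on ξ is an assumption of this sketch), non-anticipativity is usually written as:

```latex
\[
x_t^k = x_t^{k'} \qquad \text{for all } t = 1,\dots,T \text{ and all scenarios } k, k'
\text{ such that } \xi_{[t]}^{k} = \xi_{[t]}^{k'} .
\]
```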

Introduction – Aggregation
Multistage stochastic programming is useful for integrating flexibility into models. However, as the number of periods or branches increases, the scenario tree grows exponentially, making it very challenging to optimize. Many types of multistage problems are meant to be re-solved in each period:
1. Solve the multistage problem to get a good answer for today.
2. Tomorrow, observe the outcome of the random variables and re-optimize the decisions.

To control the computational explosion, we can aggregate decisions.
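To get a feel for the growth, the sketch below counts leaves and nodes of a tree in which every node has the same number of branches. Uniform branching is an assumption made for simplicity; the tree on the slides is not uniform.

```python
def count_scenarios(branches, periods):
    """Leaves of a tree where every node has `branches` children."""
    return branches ** (periods - 1)

def count_nodes(branches, periods):
    """Total nodes over all periods; each node carries its own decision vector."""
    return sum(branches ** t for t in range(periods))

for periods in (4, 10, 40):
    print(periods, count_scenarios(2, periods), count_nodes(2, periods))
```

Even with only two branches per node, 40 periods already give 2^39 scenarios, which is why aggregation or re-solving period by period becomes necessary.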

Introduction – Aggregation
Scenario trees may be simplified through aggregation. [Figure: scenario tree with the branch probabilities 0.6; 0.25, 0.25, 0.5; 0.1, 0.9, 1, and Period 4 decisions $x_4^1,\dots,x_4^{10}$ at the leaves.] As the depth of the scenario tree increases, the total probability of a deep node decreases, so it may be useful to simplify the problem.

Introduction – Aggregation
Scenario trees may be simplified through aggregation. [Figure: scenario tree with Period 4 decisions $x_4^1,\dots,x_4^{10}$ at the leaves. Probabilities shown: 0.015, 0.135, 0.3.]

Introduction – Aggregation
Scenario trees may be simplified through aggregation. [Figure: aggregated tree. Period 4 decisions are grouped as $x_4^1 = x_4^2$; $x^3$; $x_4^4 = x_4^5 = x_4^6$; $x^7$; $x_4^8 = x_4^9 = x_4^{10}$. Probabilities shown: 0.15, 0.3, 0.15.]
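Aggregating leaves means the merged node carries the sum of their probabilities. A small sketch, using the two leaf probabilities given on the slides for scenarios 1 and 2 (the function name is illustrative):

```python
# Leaf probabilities for scenarios 1 and 2, as shown on the slides.
leaf_probs = {1: 0.015, 2: 0.135}

def aggregate_probability(leaf_probs, group):
    """Probability of a merged node: the sum over the scenarios it absorbs."""
    return sum(leaf_probs[k] for k in group)

p_merged = aggregate_probability(leaf_probs, [1, 2])
print(round(p_merged, 3))  # 0.15
```

The merged value matches the 0.15 that the aggregated tree shows for the first group.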

Introduction – Aggregation
Scenario trees may be simplified through aggregation. [Figure: aggregated tree with Period 4 decisions grouped as $x_4^1 = x_4^2$; $x^3$; $x_4^4 = x_4^5 = x_4^6$; $x^7$; $x_4^8 = x_4^9 = x_4^{10}$.] Most models that include the time value of money are not heavily influenced by decisions made very far in the future, so the impact of aggregation may be negligible.

Example – Ryan & Veronica’s Retirement Jobs
Ryan and Veronica wish to have a large nest egg for their retirement jobs. Primary goal: gastro-pub. Secondary goal: goats!

Example – Ryan & Veronica’s Retirement Jobs
We have a set N of stocks that we can invest in. We have T={1,…,40} investment periods before retirement. Let $\omega_{it}$, i ∈ N, t ∈ T, be the return on stock i in period t. If we exceed goal G, we get \$y to invest in the goat farm. If we don’t meet G, we will need to win \$w from the lottery to make up for the loss. We can invest \$$b_t$ in period t (hopefully not stochastic).

Example – Ryan & Veronica’s Retirement Jobs
Variables:
- $x_{it}$: amount of money to invest in stock i during period t.
- y: money that we can spend on goats after retiring.
- w: money needed from lottery winnings to open the gastro-pub.

Example – Ryan & Veronica’s Retirement Jobs
Deterministic formulation, with the role of each constraint:
- Deterministic decision made today.
- Amount of money available for re-investment next period.
- Determine how much \$ we have or need.
- Non-negativity constraints.
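The formula did not survive in the transcript. One formulation consistent with the four constraint roles and the variable definitions is sketched below. Several things are assumptions of this sketch and not taken from the slide: the objective (reward the surplus y, penalize the shortfall w with a hypothetical weight λ > 0), the reading of $\omega_{it}$ as a gross return multiplier, and the use of $|T|$ for the last period.

```latex
\[
\begin{aligned}
\max \quad & y - \lambda w \\
\text{s.t.} \quad & \sum_{i \in N} x_{i1} = b_1, \\
& \sum_{i \in N} x_{it} = b_t + \sum_{i \in N} \omega_{i,t-1}\, x_{i,t-1}, && t = 2,\dots,|T|, \\
& \sum_{i \in N} \omega_{i,|T|}\, x_{i,|T|} - y + w = G, \\
& x_{it} \ge 0, \quad y \ge 0, \quad w \ge 0 .
\end{aligned}
\]
```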

Example – Ryan & Veronica’s Retirement Jobs
Multistage formulation: I may be naïve, but not enough to think that investing in the stock market is a deterministic problem. I can simulate prices for stocks and combine them into scenarios. Scenarios do not have to depend on the performance of a single stock; they can reflect the general performance over all stocks in a given year. Create a binomial scenario tree based on “High/Low” returns for a given period. Let ps define the probability of scenario s occurring. For each scenario s and period t, define the set of scenarios that are indistinguishable from scenario s in period t.

Example – Ryan & Veronica’s Retirement Jobs
Investment Portfolio Binomial Tree. [Figure: binomial scenario tree spanning Periods 1 to 4; each node branches into a “Low” and a “High” return outcome.]
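A binomial tree like this one is easy to enumerate. The sketch below lists every High/Low path and, for a scenario s and period t, returns the scenarios that cannot yet be told apart from s. It assumes the decision in period t is made before that period's return is seen, so only the first t − 1 outcomes count; function names are illustrative.

```python
from itertools import product

def binomial_scenarios(periods):
    """All Low/High paths through a tree with `periods` decision periods."""
    return list(product("LH", repeat=periods - 1))

def indistinguishable(scenarios, s, t):
    """Scenarios that share the first t-1 outcomes with s (same node in period t)."""
    return [r for r in scenarios if r[:t - 1] == s[:t - 1]]

scenarios = binomial_scenarios(4)
s = scenarios[0]  # the all-Low path
sizes = [len(indistinguishable(scenarios, s, t)) for t in (1, 2, 3, 4)]
print(sizes)  # [8, 4, 2, 1]
```

The set halves each period: all eight scenarios coincide at the root, and by Period 4 each scenario stands alone.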

Example – Ryan & Veronica’s Retirement Jobs
Multistage formulation: Non-anticipativity constraints

A Multistage Stochastic Programming Approach to Open Pit Mine Production Scheduling with Uncertain Geology
Authors: Natashia Boland, Irina Dumitrescu, Gary Froyland
Written: October 2008

Outline
- Introduction
- Deterministic case: MIP production schedule
- Stochastic case: scenario-dependent mining & processing decisions
- Conclusions
- Discussion

Introduction From geological estimation to generating cash flows
The first step in starting a mine is to drill and interpolate the remaining volume. Traditional geostatistical interpolation (kriging) produces a single “image” of what the deposit looks like. [Figure: drillholes and orebody; estimated (kriged) deposit, shown in cross-section and plan view.]

Introduction From geological estimation to generating cash flows
Open pit mine production scheduling problem: schedule the extraction sequence of blocks such that the net present value (NPV) of the mine is maximized. The cash flows generated are directly related to when a block is mined. The catch: we aren’t certain about the value of a block until we have mined it. How can we decide which block to mine first?
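Why timing matters can be seen with a two-block toy example. The block values, the 10% discount rate and the end-of-period discounting convention are all hypothetical numbers chosen for illustration.

```python
def schedule_npv(block_values, mining_period, discount_rate):
    """Sum of block values, each discounted by the period in which it is mined."""
    return sum(
        value / (1.0 + discount_rate) ** mining_period[block]
        for block, value in block_values.items()
    )

block_values = {"ore": 100.0, "waste": -20.0}   # hypothetical undiscounted values

npv_ore_first = schedule_npv(block_values, {"ore": 1, "waste": 2}, 0.10)
npv_waste_first = schedule_npv(block_values, {"ore": 2, "waste": 1}, 0.10)

print(npv_ore_first > npv_waste_first)  # True
```

Mining the valuable block earlier brings its cash flow forward and pushes the cost of the waste block back, so the same two blocks give a higher NPV.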

Introduction Definitions
Volume of earth is discretized into K blocks. [Figure: empty volume of rock; discretized volume of rock.] Set of blocks: $\mathcal{K} = \{1, 2, \dots, K\}$.

Introduction Definitions
In order to reduce the number of variables, the blocks are combined into N “aggregates”. Aggregates:

Introduction Variables

Deterministic Case MIP Production Schedule (D-MIP)
Components of the formulation:
- Objective function
- Mine/mill extraction relationship
- Mill production capacity
- Mine production capacity
- Precedence/time constraints
- Variable definitions

Stochastic Case Motivation:
The deterministic MIP ignores geological uncertainty. Use sequential simulation to produce many equiprobable “images”/scenarios of what the deposit looks like. [Figure: simulated deposits.] Goal: create a schedule that can adapt to changes in information using stochastic geological simulations. A deterministic schedule is only applicable if your single model is correct (not likely). What happens when you apply the same D-MIP to all scenarios?

Stochastic Case Mining has endogenous uncertainty.
More information is revealed as the operation proceeds. Uncertainty is generally reduced as more information is revealed. The information that we reveal is based on the decisions that we made.

Stochastic Case
Multistage stochastic programming accommodates endogenous uncertainty through “adaptive” scheduling. Rather than producing a single schedule to accommodate all scenarios well, produce a series of schedules telling you how to adapt under certain conditions. Two proposed models for stochastic scheduling:
1. Scenario-dependent processing decisions
2. Scenario-dependent mining & processing decisions

We are allowed to dynamically change the mining sequence and processing decisions based on new information.

Stochastic Case Non-anticipativity constraints:
If multiple scenarios have a common history, then we must make the same set of decisions for each of those scenarios. In the mining context: if we have mined the exact same areas of the mine and thus revealed the same geological information, we must make the same future mining/processing decisions for all of those scenarios. We cannot act based on future information.
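One way to picture this is to group scenarios by what they would show in the aggregates mined so far. The sketch below is not the paper's formulation: the grade table, the tolerance `alpha`, and the rule that two scenarios look the same when every mined aggregate's grade differs by at most `alpha` are all assumptions made for illustration.

```python
def same_history(grades, r, s, mined, alpha):
    """True if scenarios r and s agree (within alpha) on every mined aggregate."""
    return all(abs(grades[r][a] - grades[s][a]) <= alpha for a in mined)

def must_share_decisions(grades, mined, alpha):
    """Scenario pairs that are still indistinguishable given what has been mined."""
    scenarios = sorted(grades)
    return [
        (r, s)
        for i, r in enumerate(scenarios)
        for s in scenarios[i + 1:]
        if same_history(grades, r, s, mined, alpha)
    ]

# Hypothetical grades per scenario and aggregate.
grades = {1: {"a1": 1.0, "a2": 2.0}, 2: {"a1": 1.0, "a2": 0.2}}

before_mining = must_share_decisions(grades, set(), alpha=0.1)
after_a1 = must_share_decisions(grades, {"a1"}, alpha=0.1)
after_a1_a2 = must_share_decisions(grades, {"a1", "a2"}, alpha=0.1)
```

In this toy data the two scenarios stay tied together until aggregate `a2` is mined, which is the sense in which the information revealed depends on the decisions taken.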

Stochastic Case More definitions
Set of scenarios (geostatistical simulations): α-{r,s}-differentiator:

Stochastic Case Scenario-Dependent Mining & Processing Decisions
Premise: Mines are slow to react to changes. We can change our mining sequence; however, there is a (1 year) delay between when we get information and when we can change the sequence. Processing decisions are made ad hoc. We need to redefine our variables for each scenario:

Stochastic Case Scenario-Dependent Processing Decisions (SMP-MIP)
Components of the formulation:
- Objective function
- Mine/mill relationship
- Mill production capacity
- Mine production capacity
- Non-anticipativity constraints
- Precedence/time constraints
- Variable definitions

Stochastic Case Scenario-Dependent Mining & Processing Decisions (SMP-MIP)
Non-anticipativity constraints:

Stochastic Case Scenario-Dependent Mining & Processing Decisions (SMP-MIP)
Year 1: At the beginning of production, we start with the exact same information! We must make the same mining/extraction decisions (not necessarily processing)!

Stochastic Case Scenario-Dependent Mining & Processing Decisions (SMP-MIP)
Year 2: The SMP-MIP is forced to satisfy the non-anticipativity constraints (NACs). It will choose to mine the area highlighted in the figure, even if we reveal Scenario 1 to be the real deposit. [Figure callouts mark the area we would want to mine under Scenario 1 and the area we would want to mine under Scenario 2.]

Stochastic Case Scenario-Dependent Mining & Processing Decisions (SMP-MIP)
Year 3: We have 2 possible outcomes:
1. Don’t mine the aggregate (and stop entirely).
2. Mine the aggregate.

The optimizer will see a slight increase in profit, but if we look at the individual scenario, our profits are higher because we are able to control what we send to the mill. [Figure callouts: under Scenario 1, we want to mine the highlighted area; under Scenario 2, we want to STOP.]

Stochastic Case Scenario-Dependent Mining & Processing Decisions (SMP-MIP)
Year 3: Don’t mine the aggregate (stop mining entirely). Why? Expected value of scenarios < 0.

Stochastic Case Scenario-Dependent Mining & Processing Decisions (SMP-MIP)
Year 3: Mine the aggregate. Why? Expected value of scenarios > 0. Outcome: we finally get out of the non-anticipativity constraints! (Yay!) We can now process the two scenarios however we need.
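The Year 3 choice comes down to the sign of an expected value taken over the scenarios that are still tied together. A minimal sketch, with hypothetical aggregate values and equiprobable scenarios assumed by default:

```python
def expected_value(values, probs=None):
    """Probability-weighted value of an aggregate across scenarios."""
    if probs is None:
        probs = [1.0 / len(values)] * len(values)   # equiprobable simulations
    return sum(p * v for p, v in zip(probs, values))

def should_mine(values, probs=None):
    """Mine only when the expected value over the tied scenarios is positive."""
    return expected_value(values, probs) > 0

# Hypothetical values of the aggregate under Scenario 1 and Scenario 2.
mine_case = should_mine([10.0, -4.0])    # expected value  3.0
stop_case = should_mine([10.0, -14.0])   # expected value -2.0
```

Because the decision is shared, a scenario in which the aggregate is valuable can be outvoted by one in which it is not, and the reverse.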

Stochastic Case Scenario-Dependent Mining & Processing Decisions (SMP-MIP)
Year 4: We are finally free of all NACs! We can continue mining under Scenario 1.

Conclusions
- This paper attempts to solve the problem of endogenous uncertainty using multistage stochastic programming.
- The output from the optimizer for the stochastic formulation is a set of policies for how to react if we reveal certain information.
- In a benchmark comparison, the SMP-MIP yielded an average increase in NPV of 2.13% over the D-MIP (out of a maximum 5.09% if we have perfect information before we mine).
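Put another way, the two benchmark figures can be combined into the share of the perfect-information gain that the stochastic model captures. The ratio below is simple arithmetic on the reported numbers, not a result stated in the paper.

```python
gain_stochastic = 2.13   # average NPV increase over the D-MIP, in percent
gain_perfect = 5.09      # maximum increase with perfect information, in percent

share_captured = gain_stochastic / gain_perfect
print(f"{share_captured:.1%}")  # 41.8%
```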

Discussion
Fundamental issues:
- Use of simulations: COSMO uses simulations to describe uncertainty, whereas the multistage approach uses simulations to describe possibilities. What if the simulations don’t cover all possibilities? Re-optimize! The definition of history requires a subjective parameter (α).
- Use of block aggregation & non-anticipativity constraints: this method cannot be applied without block aggregation, which ultimately destroys value and may give different solutions (Δr,s).
- Use of output schedule policy: the solution has multiple options for how to mine the deposit. How can an engineer take the solution and make it “mineable”?