MIT and James Orlin © 2003
Stochastic Dynamic Programming
–Review
–DP with probabilities

2 Overview
Objective: illustrate the use of DP with probabilities.
–It seems more complex because each stage involves a more complex decision.
–But the optimal decision at each stage can still be determined from the results of the previous stages.

3 Review of DP using stages
Capital Budgeting, again
Investment budget = $14,000

4 The Dynamic programming stages and states
Let f(k, B) be the best NPV limited to stocks 1, 2, …, k only and using a budget of at most B (in thousands of dollars).
Stages: at stage k consider only stocks 1, 2, …, k.
State: B is the budget.
Compute f(1, B) for B = 0 to 14. Then compute f(2, B) for B = 0 to 14. Then compute f(3, B) for B = 0 to 14, etc.

5 Capital Budgeting: stage 1
Consider stock 1: cost $5, NPV: $16.
f(1, B) = 0 for B = 0 to 4
f(1, B) = 16 for B >= 5

Budget B   0   1   2   3   4   5   6   7   8   9  10  11  12  13  14
f(1, B)    0   0   0   0   0  16  16  16  16  16  16  16  16  16  16

6 Capital Budgeting: stage 2
Consider stock 2: cost $7, NPV: $22 (stock 1, as before: cost $5, NPV: $16).
f(2, B) = 0 for B = 0 to 4
f(2, B) = 16 for B = 5, 6
f(2, B) = 22 for B = 7 to 11
f(2, B) = 38 for B = 12 to 14

Budget B   0   1   2   3   4   5   6   7   8   9  10  11  12  13  14
f(1, B)    0   0   0   0   0  16  16  16  16  16  16  16  16  16  16
f(2, B)    0   0   0   0   0  16  16  22  22  22  22  22  38  38  38

7 Capital Budgeting: stage 3, using DP
Consider stock 3: cost $4, NPV: $12.
We can compute f(3, B) using f(2, ·) as input. We illustrate on f(3, 9).
–Don't buy stock 3: the value is f(2, 9) = $22.
–Buy stock 3: the value is $12 + f(2, 5) = $12 + $16 = $28.
Choose the best decision: buy stock 3, so f(3, 9) = $28.

8 On the DP for the Capital Budgeting Problem
f(3, 9) = max [ 12 + f(2, 5), f(2, 9) ] = max [ 12 + 16, 22 ] = 28
f(3, B) = f(2, B) for B = 0, 1, 2, 3
f(3, B) = max [ 12 + f(2, B-4), f(2, B) ] for B = 4 to 14
In general, f(k, B) can be computed from f(k-1, ·).
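The recursion above can be sketched in a few lines of Python. This is a minimal illustration, not the course's own code: the function name `capital_budgeting` and the list `STOCKS` are hypothetical, and only the three stocks whose cost and NPV appear in the transcript are included.

```python
# (cost, NPV) for stocks 1..3, in thousands of dollars, as given in stages 1-3.
STOCKS = [(5, 16), (7, 22), (4, 12)]
BUDGET = 14


def capital_budgeting(stocks, budget):
    """Return a table f where f[k][B] is the best NPV using stocks 1..k and budget at most B."""
    f = [[0] * (budget + 1)]  # f[0][B] = 0: no stocks considered yet
    for cost, npv in stocks:
        prev = f[-1]
        row = []
        for b in range(budget + 1):
            skip = prev[b]  # don't buy stock k
            # Buying stock k is only possible when the budget covers its cost.
            buy = npv + prev[b - cost] if b >= cost else skip
            row.append(max(skip, buy))
        f.append(row)
    return f


f = capital_budgeting(STOCKS, BUDGET)
print(f[3][9])  # f(3, 9) = max(12 + f(2, 5), f(2, 9))
```

Each stage reads only the previous row, which is the sense in which f(k, B) is computed from f(k-1, ·).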

9 Decision Diagrams
–Don't buy stock 3: $22
–Buy stock 3: $12 + $16 = $28
The choice above is a decision diagram. The optimal decision at each stage can be determined from decisions at previous stages. We may view the diagram as a "local decision diagram" since it involves only a small part of the overall decision. We use an extension of this approach when we deal with dynamic programming under uncertainty.

10 Dynamic Programming under uncertainty
Next, we will permit uncertainties in our DPs. This is usually where DP becomes much more powerful as a tool, but also more complex. We illustrate with an example in warfare, or gaming if you prefer.

11 Destroying an enemy target: a bomber example
You are a pilot in enemy territory. Your mission is to destroy an important target. You must get through. You have four minutes to reach your target, and have just been spotted by radar. The enemy can launch up to one bomber per minute to prevent you from reaching the target. The probability that a bomber is launched in minute i is q_i, for i = 1 to 4.

12 A bomber example, continued
To protect yourself, you have M missiles. Each has a probability p_j of destroying the bomber. Whenever you see a bomber, you must decide how many missiles to launch. If you do not destroy the bomber, then you will be destroyed. Determine a strategy for how many missiles to launch at each time, assuming you see a bomber attacking you.
–Let f(k, m) be the number of missiles to launch assuming that you have k minutes left and have m missiles on hand.
–A strategy is to determine f(k, m) for k = 1 to 4 and m = 1 to M.

13 Simulating the bomber example
Each person has a die and a page describing the probabilities. Simulate 1 or more instances of the game.
–We will discuss the results.
–Then we will show how to determine an optimal strategy using DP.
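The dice game can also be played by a short program. The sketch below is only an illustration under stated assumptions: it uses the constant probabilities that appear in the worked slides (a bomber is launched with probability 2/3 each minute and each missile hits with probability 1/3), and the names `simulate_once`, `estimate_survival`, and `fire_all` are hypothetical. A strategy is any function that takes the minutes left and the missiles on hand and returns how many missiles to fire.

```python
import random

Q_BOMBER = 2 / 3  # assumed: probability a bomber is launched in any minute
P_HIT = 1 / 3     # assumed: probability a single missile destroys the bomber


def simulate_once(minutes, missiles, strategy, rng):
    """Play one game; return True if the pilot reaches the target."""
    for k in range(minutes, 0, -1):  # k = minutes left
        if rng.random() >= Q_BOMBER:
            continue  # no bomber this minute
        fire = min(strategy(k, missiles), missiles)
        missiles -= fire
        # The bomber is destroyed if at least one fired missile hits.
        if not any(rng.random() < P_HIT for _ in range(fire)):
            return False
    return True


def estimate_survival(minutes, missiles, strategy, trials=20000, seed=0):
    """Estimate the survival probability of a strategy by repeated play."""
    rng = random.Random(seed)
    wins = sum(simulate_once(minutes, missiles, strategy, rng) for _ in range(trials))
    return wins / trials


def fire_all(k, m):
    """Naive strategy: fire every remaining missile at the first bomber."""
    return m


print(estimate_survival(1, 4, fire_all))  # estimate for 1 minute left, 4 missiles
print(estimate_survival(4, 4, fire_all))  # estimate for 4 minutes left, 4 missiles
```

Comparing estimates for different strategy functions gives a feel for why spreading missiles across minutes matters before the exact DP is worked out.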

14 What is the probability of surviving with 1 minute remaining and 4 missiles left?
There is one minute left. You have 4 missiles remaining. The probability of a launched bomber is 2/3. The probability of a missile hitting the bomber is 1/3. If a bomber is launched, how many missiles do you fire? What is the probability of survival?
Step 1. Draw the diagram.
–Bomber launched? No: you win. Yes: fire 1, 2, 3, or 4 missiles.
–Hit? Yes: you win. No: you lose.
Firing all missiles is clearly optimal with one minute to go.

15 Step 2. Fill in probabilities and end-values
1 minute left, 4 missiles. The probability of a launched bomber is 2/3. The probability of a missile hitting the bomber is 1/3. What is the probability of survival?
Fill in end values (probability of survival): "You win!" = 1, "You lose." = 0.
Fill in probabilities of events:
–Bomber launched: 2/3. No bomber launched: 1/3.
–Probability of 4 missiles missing is (2/3)^4 = 16/81, so the probability of a hit is 65/81.

16 Step 3. Compute values at each node.
1 minute left, 4 missiles. The probability of a launched bomber is 2/3. The probability of a missile hitting the bomber is 1/3.
Compute values at each node, moving from right to left. B is the "bomber launched?" node, F is the "fire" decision node, and H is the "hit?" node.
Value(H) = 65/81 × 1 + 16/81 × 0 = 65/81
Value(F) = Value(H) = 65/81
Value(B) = 1/3 × 1 + 2/3 × 65/81 = 211/243 = .868

17 Carry out similar calculations for other values at stage 1, that is, one minute remaining
Calculations for stage 1:

Number of missiles remaining   0     1     2     3     4     5     6     7     8     9     10    11
Probability of surviving      .333  .556  .704  .802  .868  .912  .941  .961  .974  .983  .988  .992

We next do a stage 2 calculation, which will be typical of all other calculations.
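The stage 1 values follow one formula: with one minute left it is best to fire all m missiles, so the survival probability is 1/3 × 1 + 2/3 × (1 - (2/3)^m). A minimal sketch with exact fractions, where the function name `survive_one_minute` is a hypothetical label and the probabilities are the ones used in the slides:

```python
from fractions import Fraction

Q = Fraction(2, 3)  # probability a bomber is launched
P = Fraction(1, 3)  # probability a single missile hits


def survive_one_minute(m):
    """Survival probability with one minute left when all m missiles are fired."""
    hit = 1 - (1 - P) ** m          # at least one of the m missiles hits
    return (1 - Q) * 1 + Q * hit    # no bomber, or bomber launched and destroyed


for m in range(12):
    value = survive_one_minute(m)
    print(m, value, f"{float(value):.3f}")
```

Working in fractions avoids rounding drift; for m = 4 the result is exactly 211/243.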

18 Diagram for Determining Number of Missiles to Fire
There are two minutes left. You have 4 missiles remaining. The probability of a launched bomber is 2/3. The probability of a missile hitting the bomber is 1/3. If a bomber is launched, how many missiles do you fire?
Step 1. Lay out the diagram.
–Bomber launched? No: go on to the last minute with 4 missiles. Yes: fire 1, 2, 3, or 4 missiles.
–Hit? Yes: go on to the last minute with the missiles that remain. No: lose.

19 Step 2. Fill in end values
2 minutes left, 4 missiles remaining. The probability of a launched bomber is 2/3. The probability of a missile hitting the bomber is 1/3.
Fill in end values, taken from the stage 1 calculations for the missiles that remain:
–No bomber launched (4 missiles remain): .868
–Fire 1 missile and hit (3 remain): .802
–Fire 2 missiles and hit (2 remain): .704
–Fire 3 missiles and hit (1 remains): .556
–Fire 4 missiles and hit (0 remain): .333
–Miss (lose): 0

20 Step 3. Fill in probabilities for events
2 minutes left, 4 missiles remaining. The probability of a launched bomber is 2/3. The probability of a missile hitting the bomber is 1/3.
Fill in probabilities:
–Bomber launched: 2/3. No bomber launched: 1/3.
–Fire 1 missile: hit 1/3, miss 2/3.
–Fire 2 missiles: hit 5/9, miss 4/9.
–Fire 3 missiles: hit 19/27, miss 8/27.
–Fire 4 missiles: hit 65/81, miss 16/81.
End values as before: .868 if no bomber is launched; .802, .704, .556, .333 after a hit with 1, 2, 3, 4 missiles fired; 0 after a miss.

21 Step 4. Determine values of nodes and make decisions.
2 minutes left, 4 missiles remaining. The probability of a launched bomber is 2/3. The probability of a missile hitting the bomber is 1/3.
Determine node values. H1 to H4 are the "hit?" nodes after firing 1 to 4 missiles, F is the "fire" decision node, and B is the "bomber launched?" node.
Value(H1) = 1/3 × .802 + 2/3 × 0 = .2673
Value(H2) = 5/9 × .704 + 4/9 × 0 = .3909
Value(H3) = 19/27 × .556 + 8/27 × 0 = .3909
Value(H4) = 65/81 × .333 + 16/81 × 0 = .2673
Value(F) = max[Value(H1), Value(H2), Value(H3), Value(H4)] = .3909
Value(B) = 1/3 × .868 + 2/3 × .3909 = .550
H2 and H3 tie (both equal 95/243 exactly), so firing 2 or 3 missiles is equally good.
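The stage 2 node values can be checked with exact fractions, which also makes the tie between firing 2 and 3 missiles visible. This is a sketch under the slide's probabilities; `stage1` and `stage2_nodes` are hypothetical helper names.

```python
from fractions import Fraction

Q = Fraction(2, 3)  # probability a bomber is launched
P = Fraction(1, 3)  # probability a single missile hits


def stage1(m):
    """Survival probability with one minute left and m missiles (fire them all)."""
    return 1 - Q * (1 - P) ** m


def stage2_nodes(m):
    """Value of each 'hit?' node with two minutes left: fire j, then survive stage 1."""
    return {j: (1 - (1 - P) ** j) * stage1(m - j) for j in range(1, m + 1)}


nodes = stage2_nodes(4)
value_f = max(nodes.values())                    # best decision at the fire node
value_b = (1 - Q) * stage1(4) + Q * value_f      # value at the bomber node

for j, v in nodes.items():
    print(f"fire {j}: {v} = {float(v):.4f}")
print(f"Value(F) = {value_f}, Value(B) = {value_b} = {float(value_b):.3f}")
```

The small differences from the slide's .2673 come from the slide multiplying rounded end values; the exact value of H1 and H4 is 65/243.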

22 Node values: again
Fire 1 missile: Value = 1/3 × .802 + 2/3 × 0 = .2673
Fire 2 missiles: Value = 5/9 × .704 + 4/9 × 0 = .3909
Fire 3 missiles: Value = 19/27 × .556 + 8/27 × 0 = .3909
Fire 4 missiles: Value = 65/81 × .333 + 16/81 × 0 = .2673

23 Some comments on DP
Seems complex, but the computations are all very similar.
–easy to program (not so easy in Excel)
–very efficient
Useful in finance
–investments over time
–the outcome of an investment is uncertain
Useful in inventory control
–demands are uncertain
–supplies must be ordered in advance

24 Probabilities of surviving
Probability of reaching the target, by time remaining (rows) and missiles remaining (columns):

Missiles     0     1     2     3     4     5     6     7     8     9     10    11
1 minute    .333  .556  .704  .802  .868  .912  .941  .961  .974  .983  .988  .992
2 minutes   .111  .259  .358  .473  .550  .634  .690  .750  .789  .830  .858  .886
3 minutes   .037  .111  .177  .254  .316  .387  .452  .508  .561  .616  .655  .696
4 minutes   .012  .045  .084  .126  .171  .223  .270  .318  .368  .417  .460  .504

Bomber spreadsheet
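The whole table comes from repeating the stage 2 calculation for every state. The following is a minimal sketch rather than the spreadsheet itself, assuming constant probabilities of 2/3 (bomber launched) and 1/3 (missile hits) in every minute; the names `survive` and `fight` are hypothetical. Note that `survive` returns the survival probability, while the number of missiles to launch, which the slides call f(k, m), is the second element returned by `fight`.

```python
from fractions import Fraction
from functools import lru_cache

Q = Fraction(2, 3)  # assumed constant: probability a bomber is launched each minute
P = Fraction(1, 3)  # assumed constant: probability a single missile hits


@lru_cache(maxsize=None)
def survive(k, m):
    """Probability of reaching the target with k minutes and m missiles left, playing optimally."""
    if k == 0:
        return Fraction(1)  # target reached
    no_bomber = (1 - Q) * survive(k - 1, m)
    return no_bomber + Q * fight(k, m)[0]


@lru_cache(maxsize=None)
def fight(k, m):
    """Best (value, missiles to fire) once a bomber appears; (0, 0) if no missiles are left."""
    best = (Fraction(0), 0)
    for j in range(1, m + 1):
        # Fire j missiles: at least one must hit, then continue with m - j missiles.
        value = (1 - (1 - P) ** j) * survive(k - 1, m - j)
        if value > best[0]:  # strict '>' keeps the smallest j on ties
            best = (value, j)
    return best


for k in range(1, 5):
    print(k, [f"{float(survive(k, m)):.3f}" for m in range(12)])
print(fight(2, 4))  # decision with 2 minutes left and 4 missiles
```

Under these assumptions, `survive(2, 4)` is exactly 401/729, which rounds to the .550 shown for 2 minutes and 4 missiles.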

25 Summary for dynamic programming
–Useful in decision making over time
–Uses stages, states, optimal value functions
–Uses recursion
–Can incorporate probabilities
–Useful in inventory management, finance, shortest path, and much more

