MIT and James Orlin © 2003
Stochastic Dynamic Programming
–Review
–DP with probabilities


Overview
Objective: illustrate the use of DP with probabilities. It seems more complex because the decision at each stage is more complex, but the optimal decision at each stage still depends on the previous stages.

Review of DP using stages
Capital budgeting, again. Investment budget = $14,000.

The dynamic programming stages and states
Let f(k, B) be the best NPV limited to stocks 1, 2, …, k only and using a budget of at most B.
Stages: at stage k, consider only stocks 1, 2, …, k.
State: B is the budget.
Compute f(1, B) for B = 0 to 14. Then compute f(2, B) for B = 0 to 14. Then compute f(3, B) for B = 0 to 14, and so on.

Capital budgeting: stage 1
Consider stock 1: cost $5, NPV $16.
f(1, B) = 0 for B = 0 to 4
f(1, B) = 16 for B >= 5

Capital budgeting: stage 2
Stock 1: cost $5, NPV $16. Now consider stock 2: cost $7, NPV $22.
f(2, B) = 0 for B = 0 to 4
f(2, B) = 16 for B = 5, 6
f(2, B) = 22 for B = 7 to 11
f(2, B) = 38 for B = 12 to 14

Capital budgeting: stage 3, using DP
Consider stock 3: cost $4, NPV $12.
We can compute f(3, B) using f(2, ·) as input. We illustrate on f(3, 9).
Don't buy stock 3: f(2, 9) = $22.
Buy stock 3: $12 + f(2, 5) = $12 + $16 = $28.
Choose the best decision.

On the DP for the capital budgeting problem
Don't buy stock 3: $22. Buy stock 3: $12 + $16 = $28.
f(3, 9) = max [ 12 + f(2, 5), f(2, 9) ]
f(3, B) = f(2, B) for B = 0, 1, 2, 3
f(3, B) = max [ 12 + f(2, B-4), f(2, B) ] for B = 4 to 14
In general, f(k, B) can be computed from f(k-1, ·).
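The recursion above can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own implementation: the function name `capital_budgeting` is hypothetical, only the three stocks whose cost and NPV appear in the slides are included, and amounts are assumed to be in thousands of dollars so that the budget runs from 0 to 14.

```python
def capital_budgeting(stocks, budget):
    """Return table f where f[k][b] is the best NPV using stocks 1..k
    with a budget of at most b. `stocks` is a list of (cost, npv) pairs."""
    f = [[0] * (budget + 1)]          # stage 0: no stocks considered yet
    for cost, npv in stocks:
        prev = f[-1]                  # f(k-1, .) is the only input needed
        row = []
        for b in range(budget + 1):
            skip = prev[b]            # don't buy stock k
            # buying is only possible when the budget covers the cost
            buy = npv + prev[b - cost] if b >= cost else skip
            row.append(max(skip, buy))
        f.append(row)
    return f


# Stocks 1 to 3 as given in the slides: (cost, NPV).
stocks = [(5, 16), (7, 22), (4, 12)]
f = capital_budgeting(stocks, 14)
print(f[3][9])  # best NPV with stocks 1..3 and budget 9
```

Each row is built only from the previous row, which mirrors the statement that f(k, B) can be computed from f(k-1, ·).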

Decision diagrams
[Diagram: a decision node with two branches. Don't buy stock 3: $22. Buy stock 3: $12 + $16 = $28.]
The above diagram is a decision diagram. The optimal decision at each stage can be determined from decisions at previous stages. We may view the diagram as a "local decision diagram" since it involves only a small part of the overall decision. We use an extension of this approach when we deal with dynamic programming under uncertainty.

Dynamic programming under uncertainty
Next, we will permit uncertainties in our DPs. This is usually where DP becomes much more powerful as a tool, but also more complex. We illustrate with an example in warfare, or gaming if you prefer.

Destroying an enemy target: a bomber example
You are a pilot in enemy territory. Your mission is to destroy an important target. You must get through. You have four minutes to reach your target, and have just been spotted by radar. The enemy can launch up to one bomber per minute to prevent you from reaching the target. The probability of the enemy launching a bomber in minute i is q_i, for i = 1 to 4.

A bomber example, continued
To protect yourself, you have M missiles. Each has a probability p_j of destroying the bomber. Whenever you see a bomber, you must decide how many missiles to launch. If you do not destroy the bomber, then you will be destroyed. Determine a strategy for how many missiles to launch at each time, assuming you see a bomber attacking you.
–Let f(k, m) be the number of missiles to launch assuming that you have k minutes left and have m missiles on hand.
–A strategy is to determine f(k, m) for k = 1 to 4 and m = 1 to M.

Simulating the bomber example
Each person has a die and a page describing the probabilities. Simulate 1 or more instances of the game.
–We will discuss the results.
–Then we will show how to determine an optimal strategy using DP.
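The classroom exercise uses a die; the same game can be simulated in code. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the launch probability 2/3 and hit probability 1/3 are the values used in the later slides (held constant over all minutes), and M = 4 missiles is assumed as a default. A strategy is any function of (minutes left, missiles on hand) that returns how many missiles to fire.

```python
import random


def simulate_once(strategy, minutes=4, missiles=4, q=2/3, p=1/3, rng=random):
    """Play one game. Return True if the pilot reaches the target."""
    for k in range(minutes, 0, -1):              # k = minutes left
        if rng.random() < q:                     # a bomber is launched
            # clamp the strategy's answer to what is actually on hand
            fired = min(max(strategy(k, missiles), 0), missiles)
            missiles -= fired
            # the pilot is destroyed if every fired missile misses
            if all(rng.random() >= p for _ in range(fired)):
                return False
    return True


def estimate_survival(strategy, trials=10000, seed=0, **kwargs):
    """Estimate the survival probability of a strategy by repeated play."""
    rng = random.Random(seed)
    wins = sum(simulate_once(strategy, rng=rng, **kwargs) for _ in range(trials))
    return wins / trials


# Hypothetical strategy: always fire a single missile.
def fire_one(k, m):
    return 1


print(estimate_survival(fire_one))
```

Comparing the estimates for a few hand-written strategies gives a feel for the trade-off between firing now and saving missiles, which the DP then resolves exactly.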

What is the probability of surviving with 1 minute remaining and 4 missiles left?
There is one minute left. You have 4 missiles remaining. The probability of a launched bomber is 2/3. The probability of a missile hitting the bomber is 1/3. If a bomber is launched, how many missiles do you fire? What is the probability of survival?
Step 1. Draw the diagram.
[Diagram, 1 minute left, 4 missiles: chance node "bomber launched?" (no: you win). If yes, decision node "Fire" with options 1, 2, 3, or 4 missiles, followed by chance node "hit?" (yes: you win; no: you lose).]
Firing all missiles is clearly optimal with one minute to go.

Step 2. Fill in probabilities and end values
The probability of a launched bomber is 2/3. The probability of a missile hitting the bomber is 1/3. What is the probability of survival?
Fill in end values (probability of survival): 1 for "You win", 0 for "You lose".
Fill in probabilities of events: bomber launched 2/3, not launched 1/3.
Probability of 4 missiles missing is (2/3)^4 = 16/81, so the probability of a hit is 65/81.

Step 3. Compute values at each node
The probability of a launched bomber is 2/3. The probability of a missile hitting the bomber is 1/3. Compute values at each node, moving from right to left (B = "bomber launched?", F = "Fire", H = "hit?").
Value(H) = 65/81 × 1 + 16/81 × 0 = 65/81
Value(F) = Value(H) = 65/81
Value(B) = 1/3 × 1 + 2/3 × 65/81 = 211/243 = .868

Calculations for stage 1
Carry out similar calculations for other values at stage 1, that is, one minute remaining.
[Table: probability of surviving, by number of missiles remaining.]
We next do a stage 2 calculation, which will be typical of all other calculations.
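The stage 1 values can be recomputed directly from the probabilities given in the slides. The sketch below is illustrative only: the function name `stage1_survival` is hypothetical, and it assumes (as the slides state) that firing all remaining missiles is optimal with one minute to go. Exact fractions are used so the result can be compared with 211/243.

```python
from fractions import Fraction


def stage1_survival(missiles, q=Fraction(2, 3), p=Fraction(1, 3)):
    """Survival probability with one minute left, firing all missiles
    if a bomber is launched."""
    miss_all = (1 - p) ** missiles     # every missile misses
    hit = 1 - miss_all                 # at least one missile hits
    # no bomber: survive for sure; bomber: survive only on a hit
    return (1 - q) * 1 + q * hit


for m in range(5):
    v = stage1_survival(m)
    print(m, v, round(float(v), 4))
```

With no missiles left, survival depends only on no bomber being launched, so the value is 1/3.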

Diagram for determining the number of missiles to fire
There are two minutes left. You have 4 missiles remaining. The probability of a launched bomber is 2/3. The probability of a missile hitting the bomber is 1/3. If a bomber is launched, how many missiles do you fire?
Step 1. Lay out the diagram.
[Diagram, 2 minutes left, 4 missiles: chance node "bomber launched?" (yes or no). If yes, decision node "Fire" with options 1, 2, 3, or 4 missiles, each followed by its own chance node "hit?" (yes or no); every "no" branch ends in Lose.]

Step 2. Fill in end values
2 minutes left, 4 missiles remaining. The probability of a launched bomber is 2/3. The probability of a missile hitting the bomber is 1/3.
[Diagram: the same tree as in Step 1, with end values filled in; every miss branch ends in Lose.]

Step 3. Fill in probabilities for events
2 minutes left, 4 missiles remaining. The probability of a launched bomber is 2/3 (no bomber: 1/3). The probability of a missile hitting the bomber is 1/3.

Missiles fired    P(hit)    P(miss)
1                 1/3       2/3
2                 5/9       4/9
3                 19/27     8/27
4                 65/81     16/81

Step 4. Determine values of nodes and make decisions
2 minutes left, 4 missiles remaining. The probability of a launched bomber is 2/3. The probability of a missile hitting the bomber is 1/3. After a hit with j missiles fired, the end value is the stage 1 survival probability with 4 - j missiles remaining.
Value(H1) = 1/3 × 65/81 + 2/3 × 0 = .2675
Value(H2) = 5/9 × 19/27 + 4/9 × 0 = .3909
Value(H3) = 19/27 × 5/9 + 8/27 × 0 = .3909
Value(H4) = 65/81 × 1/3 + 16/81 × 0 = .2675
Value(F) = max[Value(H1), Value(H2), Value(H3), Value(H4)] = .3909
Value(B) = 1/3 × .868 + 2/3 × .3909 = .550
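The stage 2 calculation generalizes to any number of minutes and missiles. The following is a minimal sketch rather than the lecture's spreadsheet: the function name `survival_table` is hypothetical, and it assumes constant probabilities (launch 2/3, hit 1/3, as in the worked example) instead of the general q_i and p_j of the problem statement. When several choices tie, it reports the smallest number of missiles.

```python
from fractions import Fraction

Q = Fraction(2, 3)  # probability a bomber is launched in a given minute
P = Fraction(1, 3)  # probability a single missile destroys the bomber


def survival_table(minutes, missiles, q=Q, p=P):
    """Return (value, fire): value[k][m] is the best survival probability
    with k minutes left and m missiles; fire[k][m] is the smallest optimal
    number of missiles to fire if a bomber appears."""
    value = [[Fraction(1)] * (missiles + 1)]   # k = 0: target reached
    fire = [[0] * (missiles + 1)]
    for k in range(1, minutes + 1):
        row_v, row_f = [], []
        for m in range(missiles + 1):
            best, best_j = Fraction(0), 0      # with no missiles, a bomber is fatal
            for j in range(1, m + 1):
                hit = 1 - (1 - p) ** j         # at least one of j missiles hits
                cand = hit * value[k - 1][m - j]
                if cand > best:
                    best, best_j = cand, j
            # no bomber: keep all missiles; bomber: take the best firing choice
            row_v.append((1 - q) * value[k - 1][m] + q * best)
            row_f.append(best_j)
        value.append(row_v)
        fire.append(row_f)
    return value, fire


value, fire = survival_table(4, 4)
print(float(value[2][4]), fire[2][4])
```

For 2 minutes and 4 missiles, firing 2 or 3 missiles gives the same value, so either choice is optimal under these assumptions.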

Node values, again
Firing 1 missile (H1): Value = 1/3 × 65/81 + 2/3 × 0 = .2675
Firing 2 missiles (H2): Value = 5/9 × 19/27 + 4/9 × 0 = .3909
Firing 3 missiles (H3): Value = 19/27 × 5/9 + 8/27 × 0 = .3909
Firing 4 missiles (H4): Value = 65/81 × 1/3 + 16/81 × 0 = .2675
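The node calculations all follow one pattern, which can be written as a single recursion. The notation here is introduced for illustration and is not from the slides: V(k, m) is the best survival probability with k minutes left and m missiles, q is the (assumed constant) launch probability, and p is the (assumed constant) hit probability of one missile. The maximum over an empty set (m = 0) is taken to be 0.

```latex
\begin{align*}
V(0, m) &= 1, \\
V(k, m) &= (1 - q)\, V(k-1, m)
  + q \max_{1 \le j \le m} \bigl[\, 1 - (1 - p)^{j} \,\bigr]\, V(k-1, m-j),
  \qquad k \ge 1 .
\end{align*}
```

With q = 2/3 and p = 1/3 this gives V(1, 4) = 211/243 ≈ .868 and V(2, 4) = 401/729 ≈ .550, matching the values computed at nodes B in the one-minute and two-minute diagrams.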

Some comments on DP
Seems complex, but the computations are all very similar.
–easy to program (not so easy in Excel)
–very efficient
Useful in finance
–investments over time
–the outcome of an investment is uncertain
Useful in inventory control
–demands are uncertain
–supplies must be ordered in advance

Probabilities of surviving
[Table: probability of reaching the target, by number of missiles and minutes remaining. Source: bomber spreadsheet.]

Summary for dynamic programming
–Useful in decision making over time
–Uses stages, states, optimal value functions
–Uses recursion
–Can incorporate probabilities
–Useful in inventory management, finance, shortest path, and much more