Dynamic Programming Lecture 13 (5/21/2014) - A Forest Thinning Example


Dynamic Programming Lecture 13 (5/21/2014)

A Forest Thinning Example
[Figure: a three-stage thinning network; stages 1-3 correspond to stand ages 10, 20, and 30, returning to age 10 after final harvest.]
Source: Dykstra's Mathematical Programming for Natural Resource Management (1984)

Dynamic Programming Cont.
–Stages, states, and actions (decisions)
–Backward and forward recursion

Solving Dynamic Programs
Recursive relation at stage i (Bellman equation), in its standard backward form:
f_i(s) = max_{d ∈ D_i(s)} { r_i(s, d) + f_{i+1}(s') }
where s is the current state, D_i(s) the set of feasible decisions, r_i(s, d) the stage-i return, s' the state reached at stage i+1 when decision d is taken in state s, and f_i(s) the best value attainable from stage i onward.

Dynamic Programming
Structural requirements of DP:
–The problem can be divided into stages
–Each stage has a finite set of associated states (discrete-state DP)
–The impact of a policy decision that transforms a state in a given stage into a state in the subsequent stage is deterministic
–Principle of optimality: given the current state of the system, the optimal policy for the remaining stages is independent of any policy adopted previously

Examples of DP
The Floyd-Warshall algorithm (used in the Bucket formulation of ARM) rests on the recurrence
d_ij^(k) = min( d_ij^(k-1), d_ik^(k-1) + d_kj^(k-1) )
where d_ij^(k) is the shortest distance from node i to node j using only nodes 1, ..., k as intermediates; a sketch follows below.
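A minimal Python sketch of this recurrence, run on a hypothetical 4-node weighted graph (the graph and its weights are illustrative, not from the lecture):

```python
import math

def floyd_warshall(w):
    """All-pairs shortest paths via the DP recurrence
    d[i][j] = min(d[i][j], d[i][k] + d[k][j]), taken over each intermediate k."""
    n = len(w)
    d = [row[:] for row in w]  # distance matrix, initialized to edge weights
    for k in range(n):         # allow node k as an intermediate vertex
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d

# Hypothetical example; math.inf marks "no direct edge"
INF = math.inf
w = [[0, 3, INF, 7],
     [8, 0, 2, INF],
     [5, INF, 0, 1],
     [2, INF, INF, 0]]
print(floyd_warshall(w))
```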

Examples of DP cont.
Minimizing the risk of losing an endangered species (non-linear DP).
[Figure: a three-stage network whose nodes are the budget states $0, $1M, and $2M at each stage; arcs represent spending part of the remaining budget on the current project.]
Source: Buongiorno and Gilles' (2003) Decision Methods for Forest Resource Management

Minimizing the risk of losing an endangered species (DP example)
–Stages (t = 1, 2, 3) represent $ allocation decisions for Projects 1, 2, and 3;
–States (i = 0, 1, 2) represent the budget ($M) available at each stage t;
–Decisions (j = 0, 1, 2, with j ≤ i) represent the budget left available at stage t+1, so i - j is spent on Project t;
–q_t(i - j) denotes the probability of failure of Project t if decision j is made (i.e., i - j is spent on Project t); and
–f_t(i) is the smallest probability of total failure from stage t onward, starting in state i and making the best decision j*.
Then, assuming the projects fail independently, the recursive relation can be stated as:
f_t(i) = min_{0 ≤ j ≤ i} { q_t(i - j) · f_{t+1}(j) }, with boundary condition f_4(i) = 1.
A numerical sketch of this backward recursion follows.
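A minimal Python sketch of the backward recursion above. The failure probabilities q are illustrative placeholders, not the values from Buongiorno and Gilles (2003), and project failures are assumed independent, so the total-failure probability multiplies across stages:

```python
# q[t][x] = probability that Project t fails if $x million is spent on it.
# Illustrative placeholder values, NOT the data from Buongiorno and Gilles (2003).
q = {
    1: {0: 0.40, 1: 0.20, 2: 0.15},
    2: {0: 0.60, 1: 0.40, 2: 0.20},
    3: {0: 0.80, 1: 0.50, 2: 0.30},
}

def solve(budget=2, stages=(1, 2, 3)):
    # f[i] = smallest probability of total failure from the current stage
    # onward with $i million still available.  Work backward through stages.
    f = {i: 1.0 for i in range(budget + 1)}       # boundary: beyond last stage
    policy = {}
    for t in reversed(stages):
        g = {}
        for i in range(budget + 1):               # state: budget available
            best_j, best_p = None, float("inf")
            for j in range(i + 1):                # decision: budget kept for t+1
                p = q[t][i - j] * f[j]            # failures assumed independent
                if p < best_p:
                    best_j, best_p = j, p
            g[i] = best_p
            policy[(t, i)] = best_j               # remember the best decision j*
        f = g
    return f[budget], policy

p_fail, policy = solve()
print(f"minimal probability that all projects fail: {p_fail:.3f}")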

Markov Chains (based on Buongiorno and Gilles 2003)
–Transition probability matrix P; the state distribution evolves as p_{t+1} = p_t P
–The vector p_t converges to a vector of steady-state probabilities p*, which satisfies p* = p* P
–p* is independent of the initial distribution p_0!
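A short sketch of this convergence using a hypothetical 3-state transition matrix; iterating p_{t+1} = p_t P from two different initial distributions reaches the same p*:

```python
import numpy as np

# Hypothetical 3-state transition matrix (rows sum to 1);
# states could be, e.g., forest cover types.
P = np.array([[0.7, 0.2, 0.1],
              [0.3, 0.5, 0.2],
              [0.2, 0.3, 0.5]])

p = np.array([1.0, 0.0, 0.0])    # some initial distribution p0
for _ in range(200):             # iterate p_{t+1} = p_t P
    p = p @ P
print("steady state p*:", p.round(4))

# Starting elsewhere gives the same p*, illustrating independence from p0
p2 = np.array([0.0, 0.0, 1.0])
for _ in range(200):
    p2 = p2 @ P
print("from a different p0:", p2.round(4))
```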

Markov Chains cont.
–Berger-Parker Landscape Index (a dominance measure computed from the steady-state probabilities)
–Mean Residence Times (expected number of consecutive periods the system stays in state i once entered: 1/(1 - p_ii))
–Mean Recurrence Times (expected number of periods between successive visits to state i: 1/p*_i)
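A sketch of the two time measures for the same hypothetical matrix, using the standard formulas 1/(1 - p_ii) for residence and 1/p*_i for recurrence:

```python
import numpy as np

P = np.array([[0.7, 0.2, 0.1],
              [0.3, 0.5, 0.2],
              [0.2, 0.3, 0.5]])

# Steady state p* as the left eigenvector of P for eigenvalue 1
vals, vecs = np.linalg.eig(P.T)
pi = np.real(vecs[:, np.argmax(np.real(vals))])
pi = pi / pi.sum()               # normalize to a probability vector

residence = 1.0 / (1.0 - np.diag(P))  # expected stay once a state is entered
recurrence = 1.0 / pi                 # expected time between visits to a state
print("mean residence times:", residence.round(2))
print("mean recurrence times:", recurrence.round(2))
```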

Markov Chains cont.
Forest dynamics: expected revenues or biodiversity with vs. without management.

Markov Chains cont. Present value of expected returns
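Assuming the standard Markov-reward formulation, the vector v of state-wise present values solves v = r + δPv, i.e., v = (I - δP)^{-1} r, where r holds the per-period expected returns by state and δ is the discount factor. A sketch with hypothetical returns and a 5% discount rate:

```python
import numpy as np

P = np.array([[0.7, 0.2, 0.1],
              [0.3, 0.5, 0.2],
              [0.2, 0.3, 0.5]])
r = np.array([100.0, 60.0, 20.0])  # hypothetical per-period returns by state
delta = 1 / 1.05                   # discount factor for a 5% rate

# v solves v = r + delta * P v, i.e. v = (I - delta P)^{-1} r
v = np.linalg.solve(np.eye(3) - delta * P, r)
print("present value of expected returns by starting state:", v.round(1))
```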

Markov Decision Processes