Dynamic Programming Lecture 13 (5/21/2014) - A Forest Thinning Example


Dynamic Programming Lecture 13 (5/21/2014)

A Forest Thinning Example
[Figure: a three-stage thinning network; stages 1-3 correspond to stand ages 10, 20, and 30, returning to age 10 after final harvest.]
Source: Dykstra's Mathematical Programming for Natural Resource Management (1984)

Dynamic Programming Cont.
–Stages, states, and actions (decisions)
–Backward and forward recursion

Solving Dynamic Programs
Recursive relation at stage i (Bellman equation), in its standard backward form:
f_i(s) = max_{d ∈ D_i(s)} { r_i(s, d) + f_{i+1}(s') }
where s is the current state, D_i(s) the set of feasible decisions, r_i(s, d) the stage-i return, s' the state reached at stage i+1 when decision d is taken in state s, and f_i(s) the best value attainable from stage i onward.

Dynamic Programming
Structural requirements of DP:
–The problem can be divided into stages
–Each stage has a finite set of associated states (discrete-state DP)
–The impact of a policy decision that transforms a state in a given stage into a state in the subsequent stage is deterministic
–Principle of optimality: given the current state of the system, the optimal policy for the remaining stages is independent of any policy adopted previously

Examples of DP
The Floyd-Warshall algorithm (used in the Bucket formulation of ARM) rests on the recurrence
d_ij^(k) = min( d_ij^(k-1), d_ik^(k-1) + d_kj^(k-1) )
where d_ij^(k) is the shortest distance from node i to node j using only nodes 1, ..., k as intermediates; a sketch follows below.
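A minimal Python sketch of this recurrence, run on a hypothetical 4-node weighted graph (the graph and its weights are illustrative, not from the lecture):

```python
import math

def floyd_warshall(w):
    """All-pairs shortest paths via the DP recurrence
    d[i][j] = min(d[i][j], d[i][k] + d[k][j]), taken over each intermediate k."""
    n = len(w)
    d = [row[:] for row in w]  # distance matrix, initialized to edge weights
    for k in range(n):         # allow node k as an intermediate vertex
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d

# Hypothetical example; math.inf marks "no direct edge"
INF = math.inf
w = [[0, 3, INF, 7],
     [8, 0, 2, INF],
     [5, INF, 0, 1],
     [2, INF, INF, 0]]
print(floyd_warshall(w))
```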

Examples of DP cont.
Minimizing the risk of losing an endangered species (non-linear DP).
[Figure: a three-stage network whose nodes are the budget states $0, $1M, and $2M at each stage; arcs represent spending part of the remaining budget on the current project.]
Source: Buongiorno and Gilles' (2003) Decision Methods for Forest Resource Management

Minimizing the risk of losing an endangered species (DP example)
–Stages (t = 1, 2, 3) represent $ allocation decisions for Projects 1, 2, and 3;
–States (i = 0, 1, 2) represent the budget ($M) available at each stage t;
–Decisions (j = 0, 1, 2, with j ≤ i) represent the budget left available at stage t+1, so i - j is spent on Project t;
–q_t(i - j) denotes the probability of failure of Project t if decision j is made (i.e., i - j is spent on Project t); and
–f_t(i) is the smallest probability of total failure from stage t onward, starting in state i and making the best decision j*.
Then, assuming the projects fail independently, the recursive relation can be stated as:
f_t(i) = min_{0 ≤ j ≤ i} { q_t(i - j) · f_{t+1}(j) }, with boundary condition f_4(i) = 1.
A numerical sketch of this backward recursion follows.
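A minimal Python sketch of the backward recursion above. The failure probabilities q are illustrative placeholders, not the values from Buongiorno and Gilles (2003), and project failures are assumed independent, so the total-failure probability multiplies across stages:

```python
# q[t][x] = probability that Project t fails if $x million is spent on it.
# Illustrative placeholder values, NOT the data from Buongiorno and Gilles (2003).
q = {
    1: {0: 0.40, 1: 0.20, 2: 0.15},
    2: {0: 0.60, 1: 0.40, 2: 0.20},
    3: {0: 0.80, 1: 0.50, 2: 0.30},
}

def solve(budget=2, stages=(1, 2, 3)):
    # f[i] = smallest probability of total failure from the current stage
    # onward with $i million still available.  Work backward through stages.
    f = {i: 1.0 for i in range(budget + 1)}       # boundary: beyond last stage
    policy = {}
    for t in reversed(stages):
        g = {}
        for i in range(budget + 1):               # state: budget available
            best_j, best_p = None, float("inf")
            for j in range(i + 1):                # decision: budget kept for t+1
                p = q[t][i - j] * f[j]            # failures assumed independent
                if p < best_p:
                    best_j, best_p = j, p
            g[i] = best_p
            policy[(t, i)] = best_j               # remember the best decision j*
        f = g
    return f[budget], policy

p_fail, policy = solve()
print(f"minimal probability that all projects fail: {p_fail:.3f}")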

Markov Chains (based on Buongiorno and Gilles 2003)
–Transition probability matrix P; the state distribution evolves as p_{t+1} = p_t P
–The vector p_t converges to a vector of steady-state probabilities p*, which satisfies p* = p* P
–p* is independent of the initial distribution p_0!
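A short sketch of this convergence using a hypothetical 3-state transition matrix; iterating p_{t+1} = p_t P from two different initial distributions reaches the same p*:

```python
import numpy as np

# Hypothetical 3-state transition matrix (rows sum to 1);
# states could be, e.g., forest cover types.
P = np.array([[0.7, 0.2, 0.1],
              [0.3, 0.5, 0.2],
              [0.2, 0.3, 0.5]])

p = np.array([1.0, 0.0, 0.0])    # some initial distribution p0
for _ in range(200):             # iterate p_{t+1} = p_t P
    p = p @ P
print("steady state p*:", p.round(4))

# Starting elsewhere gives the same p*, illustrating independence from p0
p2 = np.array([0.0, 0.0, 1.0])
for _ in range(200):
    p2 = p2 @ P
print("from a different p0:", p2.round(4))
```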

Markov Chains cont.
–Berger-Parker Landscape Index (a dominance measure computed from the steady-state probabilities)
–Mean Residence Times (expected number of consecutive periods the system stays in state i once entered: 1/(1 - p_ii))
–Mean Recurrence Times (expected number of periods between successive visits to state i: 1/p*_i)
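A sketch of the two time measures for the same hypothetical matrix, using the standard formulas 1/(1 - p_ii) for residence and 1/p*_i for recurrence:

```python
import numpy as np

P = np.array([[0.7, 0.2, 0.1],
              [0.3, 0.5, 0.2],
              [0.2, 0.3, 0.5]])

# Steady state p* as the left eigenvector of P for eigenvalue 1
vals, vecs = np.linalg.eig(P.T)
pi = np.real(vecs[:, np.argmax(np.real(vals))])
pi = pi / pi.sum()               # normalize to a probability vector

residence = 1.0 / (1.0 - np.diag(P))  # expected stay once a state is entered
recurrence = 1.0 / pi                 # expected time between visits to a state
print("mean residence times:", residence.round(2))
print("mean recurrence times:", recurrence.round(2))
```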

Markov Chains cont.
Forest dynamics: expected revenues or biodiversity with vs. without management.

Markov Chains cont. Present value of expected returns
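Assuming the standard Markov-reward formulation, the vector v of state-wise present values solves v = r + δPv, i.e., v = (I - δP)^{-1} r, where r holds the per-period expected returns by state and δ is the discount factor. A sketch with hypothetical returns and a 5% discount rate:

```python
import numpy as np

P = np.array([[0.7, 0.2, 0.1],
              [0.3, 0.5, 0.2],
              [0.2, 0.3, 0.5]])
r = np.array([100.0, 60.0, 20.0])  # hypothetical per-period returns by state
delta = 1 / 1.05                   # discount factor for a 5% rate

# v solves v = r + delta * P v, i.e. v = (I - delta P)^{-1} r
v = np.linalg.solve(np.eye(3) - delta * P, r)
print("present value of expected returns by starting state:", v.round(1))
```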

Markov Decision Processes