Chapter 18 Deterministic Dynamic Programming

Chapter 18 Deterministic Dynamic Programming to accompany Operations Research: Applications and Algorithms 4th edition by Wayne L. Winston Copyright (c) 2004 Brooks/Cole, a division of Thomson Learning, Inc.

Description Dynamic programming is a technique that can be used to solve many optimization problems. In most applications, dynamic programming obtains solutions by working backward from the end of the problem toward the beginning, thus breaking up a large, unwieldy problem into a series of smaller, more tractable problems.

18.1 Two Puzzles Example: We show how working backward can make a seemingly difficult problem almost trivial to solve. Suppose there are 30 matches on a table. I begin by picking up 1, 2, or 3 matches. Then my opponent must pick up 1, 2, or 3 matches. We continue in this fashion until the last match is picked up. The player who picks up the last match is the loser. How can I (the first player) be sure of winning the game?

If I can ensure that it will be my opponent's turn when 1 match remains, I will certainly win, because my opponent must then pick up the last match. Working backward one step, if I can ensure that it will be my opponent's turn when 5 matches remain, I will win: whatever my opponent picks up, I can leave exactly 1 match. Continuing in this fashion, if I can force my opponent to play when 5, 9, 13, 17, 21, 25, or 29 matches remain, I am sure of victory. Thus, since 30 - 1 = 29, I cannot lose if I pick up 1 match on my first turn.
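The same backward reasoning can be checked mechanically. The short Python sketch below is not from the text; it simply encodes the rules stated above (30 matches, moves of 1, 2, or 3) and labels every match count as a winning or losing position for the player about to move.

```python
# Backward induction for the match game above: the player who picks up the last match loses.
# Assumes 30 matches and legal moves of 1, 2, or 3, matching the example.

def winning_positions(total_matches=30, moves=(1, 2, 3)):
    # win[n] is True if the player about to move with n matches left can force a win.
    win = [False] * (total_matches + 1)   # win[1] stays False: you must take the last match
    for n in range(2, total_matches + 1):
        # n is a winning position if some legal move leaves the opponent in a losing one
        win[n] = any(n - m >= 1 and not win[n - m] for m in moves)
    return win

win = winning_positions()
print([n for n in range(1, 31) if not win[n]])
# [1, 5, 9, 13, 17, 21, 25, 29] -- the counts to leave for your opponent
```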

18.2 A Network Problem Many applications of dynamic programming reduce to finding the shortest (or longest) path that joins two points in a given network. For larger networks, dynamic programming is much more efficient at determining a shortest path than the explicit enumeration of all paths.
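As a small illustration of such a recursion, the sketch below solves a staged shortest-path problem by working backward from the destination. The network, node names, and arc lengths are invented for illustration and are not from the text.

```python
# Backward recursion for a shortest path through a staged network.
# The node names and arc lengths below are illustrative only.

stages = [["A"], ["B", "C"], ["D", "E"], ["F"]]        # nodes grouped by stage
arcs = {("A", "B"): 2, ("A", "C"): 4,
        ("B", "D"): 7, ("B", "E"): 3,
        ("C", "D"): 1, ("C", "E"): 6,
        ("D", "F"): 5, ("E", "F"): 4}                  # arc lengths

f = {"F": 0}          # f[node] = shortest distance from node to the final node
best_next = {}        # records the arc chosen at each node
for t in range(len(stages) - 2, -1, -1):               # work backward, stage by stage
    for node in stages[t]:
        f[node], best_next[node] = min(
            (arcs[(node, nxt)] + f[nxt], nxt)
            for nxt in stages[t + 1] if (node, nxt) in arcs)

path, node = ["A"], "A"
while node != "F":
    node = best_next[node]
    path.append(node)
print(f["A"], path)   # shortest length from A to F and one shortest path
```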

Characteristics of Dynamic Programming Applications
Characteristic 1: The problem can be divided into stages, with a decision required at each stage.
Characteristic 2: Each stage has a number of states associated with it. By a state, we mean the information that is needed at any stage to make an optimal decision.
Characteristic 3: The decision chosen at any stage describes how the state at the current stage is transformed into the state at the next stage.

Characteristic 4: Given the current state, the optimal decision for each of the remaining stages must not depend on previously reached states or previously chosen decisions. This idea is known as the principle of optimality.
Characteristic 5: If the states for the problem have been classified into one of T stages, there must be a recursion that relates the cost or reward earned during stages t, t+1, …, T to the cost or reward earned from stages t+1, t+2, …, T.

18.3 An Inventory Problem Dynamic programming can be used to solve an inventory problem with the following characteristics: Time is broken up into periods, the present period being period 1, the next being period 2, and the final period being period T. At the beginning of period 1, the demand during each period is known. At the beginning of each period, the firm must determine how many units should be produced. Production capacity during each period is limited.

Each period's demand must be met on time from inventory or current production. During any period in which production takes place, a fixed cost of production as well as a variable per-unit cost is incurred. The firm has limited storage capacity, which is reflected by a limit on end-of-period inventory. A per-unit holding cost is incurred on each period's ending inventory. The firm's goal is to minimize the total cost of meeting the demands for periods 1, 2, …, T on time.

In this model, the firm's inventory position is reviewed at the end of each period, and then the production decision is made. Such a model is called a periodic review model. It contrasts with the continuous review model, in which the firm knows its inventory position at all times and may place an order or begin production at any time.

18.4 Resource-Allocation Problems Resource-allocation problems, in which limited resources must be allocated among several activities, are often solved by dynamic programming. To use linear programming for resource allocation, three assumptions must be made: Assumption 1: The amount of a resource assigned to an activity may be any nonnegative number. Assumption 2: The benefit obtained from each activity is proportional to the amount of the resource assigned to the activity.

Assumption 3: The benefit obtained from more than one activity is the sum of the benefits obtained from the individual activities. Even if assumptions 1 and 2 do not hold, dynamic programming can be used to solve resource-allocation problems efficiently when assumption 3 is valid and when the amount of the resource allocated to each activity is a member of a finite set.

Generalized Resource Allocation Problem The problem of determining the allocation of resources that maximizes total benefit subject to the limited resource availability may be written as: maximize r1(x1) + r2(x2) + … + rT(xT) subject to g1(x1) + g2(x2) + … + gT(xT) ≤ w, where w is the amount of the resource available and each xt must be a member of {0, 1, 2, …}. Here rt(xt) is the benefit earned from activity t and gt(xt) is the amount of the resource used by activity t when the decision xt is chosen.

To solve this problem by dynamic programming, define ft(d) to be the maximum benefit that can be obtained from activities t, t+1, …, T if d units of the resource may be allocated to activities t, t+1, …, T. We may generalize the recursions to this situation by writing fT+1(d) = 0 for all d and, for t ≤ T, ft(d) = max{rt(xt) + ft+1(d - gt(xt))}, where the maximum is taken over all xt that are non-negative integers satisfying gt(xt) ≤ d.
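A sketch of how this recursion might be implemented is shown below. The benefit functions rt, the resource-usage functions gt, the resource amount w, and the assumption that gt(0) = 0 with gt nondecreasing are all illustrative choices, not data from the text.

```python
# Backward recursion for the generalized resource-allocation problem:
#   f_t(d) = max over feasible x_t of  r_t(x_t) + f_{t+1}(d - g_t(x_t)),   f_{T+1}(d) = 0.
# Assumes g_t(0) = 0 and g_t nondecreasing, so enumerating x_t = 0, 1, 2, ... terminates.

def allocate(r, g, w):
    T = len(r)
    f = [[0] * (w + 1) for _ in range(T + 1)]      # f[T][d] = 0 plays the role of f_{T+1}
    best_x = [[0] * (w + 1) for _ in range(T)]
    for t in range(T - 1, -1, -1):                 # work backward over the stages
        for d in range(w + 1):
            x = 0
            while g[t](x) <= d:                    # feasible decisions: g_t(x_t) <= d
                value = r[t](x) + f[t + 1][d - g[t](x)]
                if value > f[t][d]:
                    f[t][d], best_x[t][d] = value, x
                x += 1
    return f, best_x

# Illustrative data: three activities, 6 units of resource, each unit of x uses one unit.
r = [lambda x: 7 * x, lambda x: 3 * x * x, lambda x: 4 * x]
g = [lambda x: x, lambda x: x, lambda x: x]
f, best_x = allocate(r, g, 6)
print(f[0][6])  # maximum total benefit with 6 units available
```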

A Turnpike Theorem Turnpike results abound in the dynamic programming literature. Why are the results referred to as a turnpike theorem? Think about taking an automobile trip in which our goal is to minimize the time needed to complete the trip. For a long trip it may be advantageous to go slightly out of our way so that most of the trip will be spent on a turnpike, on which we can travel at the greatest speed.

18.5 Equipment Replacement Problems Many companies and customers face the problem of determining how long a machine should be utilized before it should be traded in for a new one. Problems of this type are called equipment-replacement problems and can be solved by dynamic programming. An equipment replacement model was actually used by Phillips Petroleum to reduce costs associated with maintaining the company’s stock of trucks.
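One common way to cast such a problem as a dynamic program is sketched below. The planning horizon, purchase price, maintenance costs m(age), and trade-in values s(age) are hypothetical numbers chosen only to make the sketch runnable; they are not from the text or the Phillips Petroleum application.

```python
# Equipment replacement over T years by backward recursion (illustrative formulation).
# State: age of the machine at the start of a year. Decision: keep it or replace it.
# g(t, age) = minimum cost of years t..T given the machine's age; g(T+1, age) = -s(age),
# i.e. the machine is salvaged at the end of the horizon.

from functools import lru_cache

T = 5
p = 100                                              # purchase price (assumed)
m = {0: 10, 1: 20, 2: 40, 3: 70, 4: 100, 5: 140}     # maintenance cost by age (assumed)
s = {0: 0, 1: 60, 2: 40, 3: 25, 4: 10, 5: 5, 6: 2}   # trade-in / salvage value by age (assumed)

@lru_cache(maxsize=None)
def g(t, age):
    if t == T + 1:
        return -s[age]                               # salvage the machine at the end
    keep = m[age] + g(t + 1, age + 1)                # run the current machine one more year
    replace = p - s[age] + m[0] + g(t + 1, 1)        # trade in, buy new, run the new machine
    return min(keep, replace)

print(g(1, 1))  # minimum total cost starting year 1 with a one-year-old machine
```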

18.6 Formulating Dynamic Programming Recursions In many dynamic programming problems, a given stage simply consists of all the possible states that the system can occupy at that stage. If this is the case, then the dynamic programming recursion can often be written in the following form: ft(i) = min{(cost during stage t) + ft+1(new state at stage t + 1)}, where the minimum is taken over all decisions that are allowable, or feasible, when the state at stage t is i.

Not all recursions are of the form shown before. Correct formulation of a recursion of this form requires that we identify three important aspects of the problem:
Aspect 1: The set of decisions that is allowable, or feasible, for the given state and stage.
Aspect 2: How the cost during the current time period (stage t) depends on the value of t, the current state, and the decision chosen at stage t.
Aspect 3: How the state at stage t+1 depends on the value of t, the state at stage t, and the decision chosen at stage t.
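These three aspects map directly onto a generic backward-recursion skeleton. The function names below (terminal_value, allowable_decisions, stage_cost, next_state) are placeholders invented for illustration, and the tiny demo at the end is likewise made up.

```python
# Generic backward recursion
#   f_t(i) = min over feasible d of [ stage_cost(t, i, d) + f_{t+1}(next_state(t, i, d)) ],
# with f_{T+1}(i) = terminal_value(i).  States must be hashable so they can be memoized.

from functools import lru_cache

def solve(T, terminal_value, allowable_decisions, stage_cost, next_state):
    @lru_cache(maxsize=None)
    def f(t, state):
        if t == T + 1:
            return terminal_value(state)
        return min(stage_cost(t, state, d) + f(t + 1, next_state(t, state, d))
                   for d in allowable_decisions(t, state))   # aspect 1, 2, 3 in one line
    return f

# Tiny demo: 3 stages, decision 0 or 1 each stage, cost equal to the decision,
# state counts how many 1s have been chosen so far (purely illustrative).
f = solve(T=3,
          terminal_value=lambda s: 0,
          allowable_decisions=lambda t, s: (0, 1),
          stage_cost=lambda t, s, d: d,
          next_state=lambda t, s, d: s + d)
print(f(1, 0))  # 0: choosing d = 0 at every stage is cheapest
```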

A Fishery Example The owner of a lake must decide how many bass to catch and sell each year. If she sells x bass during year t, then a revenue r(x) is earned. The cost of catching x bass during a year is a function c(x, b) of the number of bass caught during the year and of b, the number of bass in the lake at the beginning of the year. Of course, bass do reproduce.

A Fishery Example To model this, we assume that the number of bass in the lake at the beginning of a year is 20% more than the number of bass left in the lake at the end of the previous year. Assume that there are 10,000 bass in the lake at the beginning of the first year. Develop a dynamic programming recursion that can be used to maximize the owner’s net profit over a T-year horizon.

A Fishery Example In problems where decisions must be made at several points in time, there is often a trade-off of current benefits against future benefits. At the beginning of year T, the owner of the lake need not worry about the effect that the capture of bass will have on the future population of the lake, so the year T problem is relatively easy to solve. For this reason, we let time be the stage. At each stage, the owner of the lake must decide how many bass to catch.

A Fishery Example We define xt to be the number of bass caught during year t. To determine an optimal value of xt, the owner of the lake need only know the number of bass (call it bt) in the lake at the beginning of year t. Therefore, the state at the beginning of year t is bt. We define ft (bt) to be the maximum net profit that can be earned from bass caught during years t, t+1, …T given that bt bass are in the lake at the beginning of year t.

A Fishery Example We may now address aspects 1-3 of the recursion.
Aspect 1: What are the allowable decisions? During any year we can't catch more bass than there are in the lake. Thus, in each state and for all t, 0 ≤ xt ≤ bt must hold.
Aspect 2: What is the net profit earned during year t? If xt bass are caught during a year that begins with bt bass in the lake, then the net profit is r(xt) - c(xt, bt).
Aspect 3: What will be the state during year t+1? The year t+1 state will be 1.2(bt - xt).

A Fishery Example After year T, there are no future profits to consider, so fT+1(b) = 0 for all b. For t = 1, 2, …, T, ft(bt) = max{r(xt) - c(xt, bt) + ft+1[1.2(bt - xt)]}, where 0 ≤ xt ≤ bt. We use this equation to work backward until f1(10,000) has been computed. Then, to determine the optimal fishing policy, we begin by choosing x1 to be any value attaining the maximum in the equation for f1(10,000). Then year 2 will begin with 1.2(10,000 - x1) bass in the lake.

A Fishery Example This means that x2 should be chosen to be any value attaining the maximum in the equation for f2(1.2(10,000-x1)). Continue in this fashion until optimal values of x3, x4,…xT have been determined.
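A hedged, discretized sketch of this fishery recursion appears below. The revenue function r, the cost function c, the five-year horizon, and the choice to count bass in thousands (rounding the population so the state space stays finite) are assumptions made only to keep the example small; they are not the functions from the text.

```python
# Fishery recursion f_t(b) = max over 0 <= x <= b of  r(x) - c(x, b) + f_{t+1}(1.2 * (b - x)),
# with f_{T+1}(b) = 0.  Bass are counted in thousands; r and c below are illustrative.

from functools import lru_cache

T = 5
r = lambda x: 3.0 * x                    # revenue from selling x thousand bass (assumed)
c = lambda x, b: 2.0 * x * x / (b + 1)   # catching cost, cheaper when the lake is full (assumed)

@lru_cache(maxsize=None)
def f(t, b):
    if t == T + 1:
        return 0.0
    best = float("-inf")
    for x in range(b + 1):                       # aspect 1: 0 <= x_t <= b_t
        profit = r(x) - c(x, b)                  # aspect 2: net profit this year
        nxt = round(1.2 * (b - x))               # aspect 3: next year's bass population
        best = max(best, profit + f(t + 1, nxt))
    return best

print(f(1, 10))  # maximum net profit starting year 1 with 10 (thousand) bass in the lake
```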

Incorporating the Time Value of Money into Dynamic Programming Formulations A weakness of the current formulation is that profits received during the later years are weighted the same as profits received during the earlier years. Suppose that for some ß < 1, $1 received at the beginning of year t+1 is equivalent to ß dollars received at the beginning of year t.

We can incorporate this idea into the dynamic programming recursion by replacing the previous equation with ft(bt) = max{r(xt) - c(xt, bt) + ßft+1[1.2(bt - xt)]}, where 0 ≤ xt ≤ bt. Then we redefine ft(bt) to be the maximum net profit, in year t dollars, that can be earned during years t, t+1, …, T. This approach can be used to account for the time value of money in any dynamic programming formulation.

Computational Difficulties in Using Dynamic Programming There is a problem that limits the practical application of dynamic programming. In many problems, the state space becomes so large that excessive computational time is required to solve the problem by dynamic programming.

18.7 The Wagner-Whitin Algorithm and the Silver-Meal Heuristic The inventory example in this chapter is a special case of the dynamic lot-size model. Description of the Dynamic Lot-Size Model: Demand dt during period t (t = 1, 2, …, T) is known at the beginning of period 1. Demand for period t must be met on time from inventory or from period t production. The cost c(x) of producing x units during any period is given by c(0) = 0 and, for x > 0, c(x) = K + cx, where K is a fixed cost for setting up production during a period and c is a variable per-unit cost of production.

At the end of period t, the inventory level it is observed, and a holding cost hit is incurred, where h is the per-unit holding cost. We let i0 denote the inventory level before period 1 production occurs. The goal is to determine a production level xt for each period t that minimizes the total cost of meeting (on time) the demands for periods 1, 2, …, T. There is a limit ct placed on period t's ending inventory and a limit rt placed on period t's production. Wagner and Whitin have developed a method that greatly simplifies the computation of optimal production schedules for dynamic lot-size models.

Lemmas 1 and 2 are necessary for the development of the Wagner-Whitin algorithm. Lemma 1: Suppose it is optimal to produce a positive quantity during a period t. Then for some j=0, 1, …, T-t, the amount produced during period t must be such that after period t’s production, a quantity dt + dt+1 + … + dt+j will be in stock. In other words, if production occurs during period t, we must (for some j) produce an amount that exactly suffices to meet the demands for periods t, t+1,…, t+j.

Lemma 2: If it is optimal to produce anything during period t, then it-1 < dt. In other words, production cannot occur during period t unless there is insufficient stock to meet period t demand. With the possible exception of the first period, production will occur only during periods in which beginning inventory is zero, and during each period in which beginning inventory is zero (and dt ≠ 0), production must occur.

Using this insight, Wagner and Whitin developed a recursion that can be used to determine an optimal production policy.
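One standard way to state such a recursion, built on Lemma 1 (each production run exactly covers the demands of a block of consecutive periods), is sketched below. The setup cost K, unit cost c, holding cost h, and the demand vector are illustrative placeholders, not data from the text, and the sketch ignores the inventory and production limits for simplicity.

```python
# Wagner-Whitin style recursion.  F(t) = minimum cost of meeting demands for periods t..T
# starting with zero inventory, where a period-t production run covers demands d_t..d_j:
#   F(t) = min over j >= t of [ K + c*(d_t + ... + d_j) + holding cost of carrying that run + F(j+1) ].
# K, c, h, and the demand vector are illustrative numbers.

from functools import lru_cache

d = [20, 30, 10, 40, 30]        # demands for periods 1..T (assumed)
K, c, h = 50.0, 2.0, 1.0        # setup, per-unit, and per-unit holding costs (assumed)
T = len(d)

@lru_cache(maxsize=None)
def F(t):                       # t is 0-indexed here
    if t == T:
        return 0.0
    best = float("inf")
    for j in range(t, T):       # produce d[t] + ... + d[j] in period t
        produced = sum(d[t:j + 1])
        holding = sum(h * (k - t) * d[k] for k in range(t, j + 1))  # d[k] is held k - t periods
        best = min(best, K + c * produced + holding + F(j + 1))
    return best

print(F(0))  # minimum total cost of meeting all demands on time
```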

The Silver-Meal Heuristic The Silver-Meal (S-M) heuristic involves less work than the Wagner-Whitin algorithm and can be used to find a near-optimal production schedule. The S-M heuristic chooses each production quantity to minimize the average cost per period covered by that production run. In extensive testing, the S-M heuristic usually yielded a production schedule costing less than 1% above the optimal policy obtained by the Wagner-Whitin algorithm.
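A sketch of the S-M rule under the same illustrative costs is shown below: keep extending the current production run by one more period of demand as long as doing so lowers the average cost per period covered by that run. The demands and costs are placeholders, not data from the text.

```python
# Silver-Meal heuristic (illustrative costs, same notation as the sketch above):
# extend the current lot one period at a time while the average cost per period keeps falling.

d = [20, 30, 10, 40, 30]        # demands for periods 1..T (assumed)
K, h = 50.0, 1.0                # setup and per-unit holding costs (assumed)

def silver_meal(d, K, h):
    lots, t, T = [], 0, len(d)
    while t < T:
        j, cost = t, K          # start a lot in period t that covers at least d[t]
        avg = cost              # average cost per period when the lot covers only period t
        while j + 1 < T:
            extra_holding = h * (j + 1 - t) * d[j + 1]   # d[j+1] would be held j+1-t periods
            new_avg = (cost + extra_holding) / (j + 2 - t)
            if new_avg > avg:   # stop as soon as the average cost per period rises
                break
            cost += extra_holding
            avg, j = new_avg, j + 1
        lots.append((t + 1, sum(d[t:j + 1])))            # (period, quantity), 1-based periods
        t = j + 1
    return lots

print(silver_meal(d, K, h))     # a list of (production period, lot size) pairs
```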

18.8 Using Excel to Solve Dynamic Programming Problems Excel can often be used to solve DP problems. The examples in the book demonstrate how spreadsheets can be used to solve these problems. Refer to these Excel files for more information: Dpknap.xls, Dpresour.xls, and Dpinv.xls.