ECES 741: Stochastic Decision & Control Processes – Chapter 1: The DP Algorithm
Slide 1: The DP Algorithm

To do:
- sequential decision-making
- state
- random elements
- discrete-time stochastic dynamic system
- optimal control/decision problem
- actions vs. strategy (information gathering, feedback)

These ideas are illustrated via examples; the general model is described later on.

Slide 2: Example – Inventory Control Problem

We track the quantity of a certain item, e.g. gas in a service station, oil in a refinery, cars in a dealership, spare parts in a maintenance facility, etc. The stock is checked at equally spaced points in time, e.g. every morning, or at the end of each week. At those times, a decision must be made as to what quantity of the item to order, so that demand over the present period is "satisfactorily" met (we will give this a quantitative meaning).

[Figure: timeline of periods 0, 1, ..., k-1, k, k+1, ..., N-1, N; at the start of the kth period the stock is checked and an order is placed.]

Slide 3: Example – Inventory Control Problem

Stochastic difference equation: x_{k+1} = x_k + u_k - w_k
- x_k: stock at the beginning of the kth period
- u_k: quantity ordered at the beginning of the kth period; assume it is delivered during the kth period
- w_k: demand during the kth period; {w_k} is a stochastic process
- assume real-valued variables
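The stock recursion above is easy to exercise numerically. The following is a minimal sketch (the function name and the demand values are illustrative, not from the slides):

```python
import random

def simulate_inventory(x0, orders, demands):
    """Simulate x_{k+1} = x_k + u_k - w_k over the given periods.
    Returns the stock trajectory [x_0, x_1, ..., x_N]."""
    xs = [x0]
    for u, w in zip(orders, demands):
        xs.append(xs[-1] + u - w)  # new stock = old stock + order - demand
    return xs

# Example: 3 periods, a fixed order schedule, random demands in {0, 1, 2}.
random.seed(0)
demands = [random.choice([0, 1, 2]) for _ in range(3)]
print(simulate_inventory(x0=1, orders=[1, 1, 1], demands=demands))
```

Note that the trajectory can go negative, matching the interpretation of negative stock as backlogged demand.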

Slide 4: Example – Inventory Control Problem

Negative stock is interpreted as excess demand, which is backlogged and filled as soon as possible. Cost of operation:
1. purchasing cost: c u_k (c = cost per unit)
2. H(x_{k+1}): penalty for holding and storage of extra stock (x_{k+1} > 0), or for shortage (x_{k+1} < 0)

Cost for period k: c u_k + H(x_k + u_k - w_k) = g(x_k, u_k, w_k), where x_k + u_k - w_k = x_{k+1}.

Slide 5: Example – Inventory Control Problem

[The formulas on this slide, giving two equivalent expressions for the per-period cost, did not survive the transcript.]

Slide 6: Example – Inventory Control Problem

Objective: to minimize, in some meaningful sense, the total cost of operation over a finite number of periods (a finite "horizon"):

total cost over N periods = Σ_{k=0}^{N-1} g(x_k, u_k, w_k)

Slide 7: Example – Inventory Control Problem

Two distinct situations can arise.

Deterministic case: x_0 is perfectly known, and the demands are known in advance to the manager.
1. At k = 0, all future demands {w_0, w_1, ..., w_{N-1}} are known. We can select all orders at once so as to exactly meet the demand, making x_1 = x_2 = ... = x_{N-1} = 0:
   0 = x_1 = x_0 + u_0 - w_0, so u_0 = w_0 - x_0 (assuming x_0 <= w_0), and u_k = w_k for 1 <= k <= N-1.
   This is a fixed order schedule.
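The fixed order schedule for the deterministic case can be written down directly (the helper name is illustrative):

```python
def deterministic_orders(x0, demands):
    """Exactly meet known demand: u_0 = w_0 - x_0, and u_k = w_k for k >= 1.
    Requires x_0 <= w_0 so the first order is nonnegative."""
    assert x0 <= demands[0], "need x_0 <= w_0"
    return [demands[0] - x0] + list(demands[1:])

print(deterministic_orders(2, [5, 3, 4]))  # [3, 3, 4]
```

With these orders, the stock after the first period is x_1 = x_0 + (w_0 - x_0) - w_0 = 0, and it stays at 0 thereafter.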

Slide 8: Example – Inventory Control Problem

In the first situation we select a set of fixed "actions" (numbers, i.e. a precomputed order schedule).
2. At the beginning of period k, w_k becomes known (a perfect forecast). Here we must gather information and make decisions sequentially. A "strategy" is a rule for making decisions based on information (the forecast) as it becomes available.

Slide 9: Stochastic Case

x_0 is perfectly known (this can be generalized to the case where only its distribution is known), but {w_k} is a random process. Assume the w_k are i.i.d. real-valued random variables with pdf f_w, independent of k:

P_w(B) = Prob(w_k in B) = ∫_B f_w(w) dw

Here P_w is the probability distribution (measure), i.e. P_w(B) is the probability that w_k takes a value in the set B.

Slide 10: Stochastic Case

Note that the stock x_k is now a random variable. Alternatively, we can describe the evolution of the system in terms of a transition law:

Prob(x_{k+1} in B | x_k, u_k) = Prob(w_k in {w : x_k + u_k - w in B}) = P_w({w : x_k + u_k - w in B})

Slide 11: Stochastic Case

The cost is now also a random quantity, so we minimize the expected cost. This is a difficult problem!

Action: select all orders (numbers) at k = 0. This is most likely not "optimal" (it reduces to a nonlinear programming problem).

vs.

Strategy: select a sequence of functions mu_k such that u_k = mu_k(x_k), where x_k is the information available at the kth period. The optimization is over a function space.

Slide 12: Stochastic Dynamic Program

Let pi = (mu_0, mu_1, ..., mu_{N-1}) be a control/decision strategy (policy, law), and let Pi be the set of all admissible strategies (e.g. those with mu_k(x) >= 0). The stochastic DP problem is then:

minimize J_pi(x_0) = E[ Σ_{k=0}^{N-1} g(x_k, mu_k(x_k), w_k) ] over pi in Pi.

If the problem is feasible, then there exists an optimal strategy pi*, i.e. J_{pi*}(x_0) = min_{pi in Pi} J_pi(x_0).
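The expected cost J_pi(x_0) of any given strategy can be estimated by Monte Carlo simulation. A sketch, assuming a simple purchase-plus-holding/shortage cost (the constants c, h, p and the demand distribution below are hypothetical, not the slides' data):

```python
import random

def period_cost(x, u, w, c=1.0, h=0.5, p=2.0):
    """Hypothetical per-period cost g(x, u, w): purchase cost c*u plus a
    holding penalty h per unit of leftover stock, or a shortage penalty p
    per unit of backlog, evaluated on the next stock x + u - w."""
    x_next = x + u - w
    return c * u + (h * x_next if x_next >= 0 else -p * x_next)

def expected_cost(policy, x0, N, demand_sampler, n_runs=10000):
    """Monte Carlo estimate of J_pi(x0) = E[ sum_k g(x_k, mu_k(x_k), w_k) ]."""
    total = 0.0
    for _ in range(n_runs):
        x, cost = x0, 0.0
        for k in range(N):
            u = policy(k, x)          # strategy: order depends on current stock
            w = demand_sampler()      # demand is revealed after ordering
            cost += period_cost(x, u, w)
            x = x + u - w
        total += cost
    return total / n_runs

random.seed(1)
# A naive strategy: always order one unit, regardless of the stock.
J = expected_cost(lambda k, x: 1.0, x0=0.0, N=5,
                  demand_sampler=lambda: random.choice([0.0, 1.0, 2.0]))
print(round(J, 2))
```

This only *evaluates* a fixed strategy; the hard part of the DP problem is the minimization over all admissible strategies.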

Slide 13: Summary of the Problem

1. Discrete-time stochastic system. System equation: x_{k+1} = x_k + u_k - w_k (equivalently, a transition law).
Note: with no backlogging, the system equation becomes x_{k+1} = max(0, x_k + u_k - w_k).

Slide 14: Stochastic Dynamic Program

2. Stochastic element: {w_k}, assumed i.i.d. for example; this will be generalized to distributions depending on x_k and u_k.
3. Control constraint: u_k >= 0; if there is a maximum storage capacity M, then 0 <= u_k <= M - x_k.
4. Additive cost: E[ Σ_{k=0}^{N-1} g(x_k, u_k, w_k) ].

Slide 15: Stochastic Dynamic Program

5. Optimization over admissible strategies.

We will see later on that this problem has a neat closed-form solution: for some threshold levels T_k,

mu_k(x) = T_k - x if x < T_k, and 0 otherwise

(a base-stock policy).
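A base-stock (order-up-to) policy is a one-liner per stage. A sketch, with arbitrary illustrative thresholds:

```python
def base_stock_policy(thresholds):
    """Base-stock policy: mu_k(x) = T_k - x if x < T_k, else 0.
    When the stock is below the stage-k threshold, order up to it."""
    def mu(k, x):
        T = thresholds[k]
        return T - x if x < T else 0.0
    return mu

mu = base_stock_policy([5.0, 4.0, 3.0])
print(mu(0, 2.0))  # stock 2 is below T_0 = 5, so order up to 5: 3.0
print(mu(2, 7.0))  # stock 7 exceeds T_2 = 3: order 0.0
```

The point of the closed form is that the whole function space of strategies collapses to N numbers T_0, ..., T_{N-1}.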

Slide 16: Role of Information – Actions vs. Strategies

Example: consider a two-stage problem (the slide's system and cost equations did not survive the transcript), where w_0 is a random variable taking the values +1 and -1, each with a given probability.

Slide 17: Role of Information – Actions vs. Strategies

Problem A: choose actions (u_0, u_1) in advance (open loop, a control schedule) to minimize the expected cost; equivalently, with N = 2, minimize the two-stage cost subject to the system equation (*).

Slide 18: Role of Information – Actions vs. Strategies

Solution A, case (i): [the slide's derivation did not survive the transcript].

Slide 19: Role of Information – Actions vs. Strategies

Case (ii): [the slide's derivation did not survive the transcript].

Slide 20: Role of Information – Actions vs. Strategies

No information gathering: we choose both controls at the start and do not take x_1 into consideration at the beginning of stage 1. The outcome can be anything, so the controls must be chosen appropriately in advance.

Slide 21: Role of Information – Actions vs. Strategies

Problem B: choose u_0 and u_1 sequentially, using the observed value of x_1 (sequential decision-making, feedback control). Thus, to decide u_1, we wait until the outcome x_1 becomes available and act accordingly.

Solution B: from (*), we select the minimizing u_1 for each observed x_1 [the slide's derivation did not survive the transcript].

Slide 22: Role of Information – Actions vs. Strategies

Note: information gathering does not always help. If w_0 is known in advance (the deterministic case), we do not gain anything by making decisions sequentially.
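Since the example's equations were lost, the gap between open-loop actions and feedback strategies can still be illustrated on a stand-in two-stage problem; the system x_{k+1} = x_k + u_k + w_k, the terminal cost x_2^2, the probability 1/2, and the control grid below are all assumptions, not the slides' data:

```python
import itertools

# Stand-in two-stage problem: x_{k+1} = x_k + u_k + w_k with x_0 = 0,
# w_1 = 0, terminal cost x_2^2, and w_0 = +1 or -1 with probability 1/2.
W0 = [+1.0, -1.0]
US = [u / 2 for u in range(-4, 5)]  # candidate controls -2.0, -1.5, ..., 2.0

def cost(u0, u1, w0):
    x1 = 0.0 + u0 + w0
    x2 = x1 + u1
    return x2 ** 2

# Problem A (open loop): fix the pair (u0, u1) before w0 is revealed.
open_loop = min(sum(cost(u0, u1, w) for w in W0) / 2
                for u0, u1 in itertools.product(US, US))

# Problem B (feedback): choose u1 after observing x1, i.e. minimize
# over u1 separately for each outcome of w0.
feedback = min(sum(min(cost(u0, u1, w) for u1 in US) for w in W0) / 2
               for u0 in US)

print(open_loop, feedback)  # prints 1.0 0.0
```

Open loop cannot do better than the residual variance of w_0, while feedback cancels x_1 exactly: information strictly helps here, and it would not if w_0 were deterministic.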

Slide 23: Discrete-Time Stochastic Dynamic System Model and the Optimal Decision/Control Problem

1. Discrete-time stochastic dynamic system (the index t or k can count time or events):

x_{k+1} = f_k(x_k, u_k, w_k), k = 0, 1, ..., N-1

where x_k lies in S_k (the state space at time k), u_k in C_k (the control space), and w_k in D_k (the disturbance space, countable). Also, depending on the state of the system, there are constraints on the actions that can be taken: u_k in U_k(x_k), a nonempty subset of C_k.

Slide 24: Discrete-Time Stochastic Dynamic System Model and the Optimal Decision/Control Problem

2. Stochastic disturbance {w_k}: w_k has probability measure (distribution) P_k(· | x_k, u_k), which may depend explicitly on the time, the current state, and the action, but not on the previous disturbances w_{k-1}, ..., w_0.

Slide 25: Discrete-Time Stochastic Dynamic System Model and the Optimal Decision/Control Problem

3. Admissible control/decision laws (strategies, policies) define information patterns!
- Feasible policies: pi = (mu_0, ..., mu_{N-1}) with mu_k(x_k) in U_k(x_k) for all x_k.
- Markov policies: deterministic or randomized.

Slide 26: Discrete-Time Stochastic Dynamic System Model and the Optimal Decision/Control Problem

4. Finite-horizon optimal control/decision problem. Given an initial state x_0 and cost functions g_k, k = 0, ..., N-1, find pi in Pi that minimizes the cost functional

J_pi(x_0) = E[ Σ_{k=0}^{N-1} g_k(x_k, mu_k(x_k), w_k) ]

subject to the system equation constraint x_{k+1} = f_k(x_k, mu_k(x_k), w_k), k = 0, ..., N-1.

Slide 27: Discrete-Time Stochastic Dynamic System Model and the Optimal Decision/Control Problem

We say that pi* in Pi is optimal for the initial state x_0 if

J_{pi*}(x_0) = min_{pi in Pi} J_pi(x_0) =: J*(x_0),

where J* is the optimal N-stage cost (or value) function. Likewise, for a given epsilon > 0, a policy pi_epsilon is said to be epsilon-optimal if J_{pi_epsilon}(x_0) <= J*(x_0) + epsilon.

Slide 28: Discrete-Time Stochastic Dynamic System Model and the Optimal Decision/Control Problem

This stochastic optimal control problem is difficult: we are optimizing over strategies. The Dynamic Programming algorithm will give us necessary and sufficient conditions to decompose this problem into a sequence of coupled minimization problems over actions, from which we will obtain the optimal strategy. DP is the only general approach for sequential decision-making under uncertainty.
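For finite state, action, and disturbance sets, the decomposition the DP algorithm performs can be sketched as a backward recursion: J_N(x) = g_N(x) and J_k(x) = min over u of E_w[ g_k(x,u,w) + J_{k+1}(f_k(x,u,w)) ]. The function below is a minimal sketch, and the small inventory instance at the bottom (capacity 3, demand 0 or 1, unit purchase cost, shortage penalty 3) is hypothetical:

```python
def dp_finite(N, states, actions, f, g, gN, w_dist):
    """Backward DP recursion over a finite horizon:
    J_N(x) = gN(x),
    J_k(x) = min over u in actions(x) of E_w[ g(x,u,w) + J_{k+1}(f(x,u,w)) ]."""
    J = {x: gN(x) for x in states}  # terminal cost-to-go
    policy = []
    for k in range(N - 1, -1, -1):
        Jk, mu = {}, {}
        for x in states:
            best_u, best_v = None, float("inf")
            for u in actions(x):
                # expected stage cost plus cost-to-go at the next state
                v = sum(p * (g(x, u, w) + J[f(x, u, w)]) for w, p in w_dist)
                if v < best_v:
                    best_u, best_v = u, v
            Jk[x], mu[x] = best_v, best_u
        J = Jk
        policy = [mu] + policy
    return J, policy

# Hypothetical tiny instance: stock in {0,..,3}, order up to 3 - x,
# demand 0 or 1 with probability 1/2, cost = order + 3 * shortage.
J0, policy = dp_finite(
    N=2,
    states=tuple(range(4)),
    actions=lambda x: range(4 - x),
    f=lambda x, u, w: max(0, x + u - w),
    g=lambda x, u, w: u + 3 * max(0, w - x - u),
    gN=lambda x: 0.0,
    w_dist=[(0, 0.5), (1, 0.5)],
)
print(J0, policy[0])
```

Note the recursion minimizes over *actions* at each (k, x) pair, yet the collected minimizers mu_k form a full feedback strategy, which is exactly the decomposition described above.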

Slide 29: Alternative System Description

Given a dynamic description of a system via a system equation x_{k+1} = f_k(x_k, u_k, w_k), we can alternatively describe the system via a transition law.

Slide 30: Alternative System Description

Given x_k and u_k, the next state x_{k+1} has distribution

P_k(B | x_k, u_k) = Prob( f_k(x_k, u_k, w_k) in B | x_k, u_k ),

so the system equation induces the system transition law P.
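For a countable disturbance space this correspondence is mechanical: push the disturbance probabilities through f. A sketch (the helper name and the no-backlog inventory instance are illustrative):

```python
from collections import defaultdict

def transition_law(f, w_dist):
    """Turn a system equation x' = f(x, u, w) plus a disturbance
    distribution [(w, p), ...] into P(x' | x, u) = Prob{w : f(x, u, w) = x'}."""
    def P(x, u):
        probs = defaultdict(float)
        for w, p in w_dist:
            probs[f(x, u, w)] += p  # disturbances mapping to the same x' add up
        return dict(probs)
    return P

# Inventory example without backlogging: x' = max(0, x + u - w),
# with demand w in {0, 1, 2} having probabilities 1/4, 1/2, 1/4.
P = transition_law(lambda x, u, w: max(0, x + u - w),
                   [(0, 0.25), (1, 0.5), (2, 0.25)])
print(P(1, 1))  # {2: 0.25, 1: 0.5, 0: 0.25}
```

The max(0, ·) clamp shows why the transition law can be many-to-one: several demand values can lead to the same next state, and their probabilities accumulate.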

