

2 A SIMPLE INTRODUCTION TO DYNAMIC PROGRAMMING

3 PROBLEM SET-UP The problem is arrayed as a set of decisions made over time. The system has a discrete state. Each decision results in some reward or cost, and moves the system to another state. There is usually a finite number of transitions. Transitions can be probabilistic, as can the rewards. The solution is a decision strategy that maximizes summed reward (or minimizes summed cost).

4 Notation N = finite planning horizon. S_n(x) = cost of optimally operating from n to N given state x at time n. d_n*(x) = the optimal policy at stage n given state x at time n. x(d_n) = the state resulting from taking decision d_n at stage n. c(d_n) = the cost of taking decision d_n.

5 EXAMPLE You have moved to Singapore, and you need to operate a car for 3 yrs. You plan to sell the car when you leave. Your quality of life (QOL) is not affected by your wheels. Cost/resale of cars and operating costs are below. [Table: sale price and operating cost for car ages 0, 1, 2, 3]

6 MAPPING TO THE NOTATION State: age of your car. Stage: years you have been in Singapore. Policy: age of the car you buy at the END of the year.

7 COST EXAMPLE You have a 2 yr old car. You operate it for the year ($600), sell it as a 3 yr old car (-$150), and buy a new (to you) 1 yr old used car ($800). TOTAL: $600 - $150 + $800 = $1250.
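The arithmetic on this slide can be checked in a few lines. The function name `yearly_cost` and its parameter names are illustrative choices, not part of the slides.

```python
def yearly_cost(operating_cost, resale_value, purchase_price):
    """Net cost of one year: operate, sell at year end, then buy the replacement."""
    return operating_cost - resale_value + purchase_price


# Figures from the slide: operate a 2 yr old car, sell it as a 3 yr old,
# then buy a 1 yr old used car.
total = yearly_cost(operating_cost=600, resale_value=150, purchase_price=800)
print(total)  # 1250
```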

8 [Figure: decision network running from "start" to "finish", labelled 0, 1, 2, 3]

9 [Table: car age vs. "cost" at end of yr]

10 CONTINUED COST EXAMPLE It’s the beginning of yr 2, and you possess a 2 yr old car. You can: operate the car and keep it (600 + S_3(3 yr old car)); operate the car, sell it, and buy a new car (600 - 150 + price of a new car + S_3(new car)); operate the car, sell it, and buy a 1 yr old car (600 - 150 + 800 + S_3(1 yr old car)); and so on for the other car ages.
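A minimal sketch of the car-replacement recursion, assuming a price table that the transcript does not preserve. Only three numbers come from the slides (a 1 yr old car sells for 800, a 3 yr old car for 150, and a 2 yr old car costs 600 to operate); every other entry in `PRICE` and `OP_COST`, along with the names `S`, `N` and `MAX_AGE`, is a hypothetical placeholder.

```python
from functools import lru_cache

# Sale price and yearly operating cost, indexed by car age in years.
# From the slides: PRICE[1] = 800, PRICE[3] = 150, OP_COST[2] = 600.
# All other values are hypothetical.
PRICE = {0: 1200, 1: 800, 2: 500, 3: 150, 4: 50}
OP_COST = {0: 200, 1: 350, 2: 600, 3: 900}
N = 4        # stage at which you leave (after years 1, 2 and 3)
MAX_AGE = 3  # oldest car you may start a year with (assumption)


@lru_cache(maxsize=None)
def S(n, x):
    """Minimum cost from the start of year n onward, holding a car of age x."""
    if n == N:
        return -PRICE[x]  # sell the car when you leave
    aged = x + 1          # the car's age at the end of the year
    options = []
    if aged <= MAX_AGE:
        options.append(S(n + 1, aged))  # keep the car
    for b in range(MAX_AGE + 1):
        # sell the aged car, buy a car of age b
        options.append(-PRICE[aged] + PRICE[b] + S(n + 1, b))
    return OP_COST[x] + min(options)


# Cheapest way to start: buy a car of some age b, then follow the recursion.
best_start = min(PRICE[b] + S(1, b) for b in range(MAX_AGE + 1))
```

`S(2, 2)` corresponds to the situation on this slide: the start of yr 2 with a 2 yr old car. Its value is the minimum over the listed options, so it can never exceed `1250 + S(3, 1)`.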

11 [Table: "cost" at end of yr, columns 1, 2, 3]

12 [Table: "cost" at end of yr, columns 1, 2, 3]

13 BELLMAN’S EQUATION Sometimes it’s easy to get your name on something!
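In the notation of slide 4, the standard finite-horizon form of Bellman's equation can be written as below. The terminal condition is an assumption about how the boundary is set; in the car example it would be the (negative) resale value of the car held at time N.

```latex
S_n(x) = \min_{d_n} \Bigl\{ c(d_n) + S_{n+1}\bigl(x(d_n)\bigr) \Bigr\},
\qquad n = N-1, \dots, 1,
\qquad S_N(x) \text{ given},

d_n^*(x) = \arg\min_{d_n} \Bigl\{ c(d_n) + S_{n+1}\bigl(x(d_n)\bigr) \Bigr\}.
```

For a reward-maximizing problem the minimum is replaced by a maximum.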

14 EXEMPLAR A specialized tool is available during the period 9am, ..., 3pm. Each hour, a bid for the asset is made according to the table below. The asset is busy for 3 hr if the bid is accepted. [Table: bid offered at each hour]

15-21 [Figures: diagram of the problem, terminating at an "end" node]

22 Note 1: Once the diagram is drawn, the problem can be solved by a shortest (longest) path algorithm. Note 2: Dynamic Programming = Shortest Path.
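The longest-path view of the tool-bidding exemplar can be sketched as follows. The bid values in `BIDS` are hypothetical, since the bid table is not part of this transcript, and the rule that a job must finish by 3pm is an assumption. Each hour is a node, rejecting a bid is an arc to the next hour with weight 0, and accepting is an arc three hours ahead with weight equal to the bid.

```python
# Hypothetical hourly bids, keyed by hour on a 24 h clock (9 = 9am).
BIDS = {9: 4, 10: 7, 11: 3, 12: 6, 13: 5, 14: 2}
END = 15        # 3pm
BUSY_HOURS = 3  # from the slide: an accepted bid ties up the tool for 3 hr


def best_profit(bids, end=END, busy=BUSY_HOURS):
    """Longest path from each hour to `end`, found by backward recursion."""
    value = {end: 0}
    decision = {}
    for t in sorted(bids, reverse=True):
        reject = value[t + 1]                    # arc t -> t+1, weight 0
        accept = float("-inf")
        if t + busy <= end:                      # assumption: job must finish by `end`
            accept = bids[t] + value[t + busy]   # arc t -> t+busy, weight = bid
        value[t] = max(accept, reject)
        decision[t] = "ACCEPT" if accept > reject else "REJECT"
    return value, decision


value, decision = best_profit(BIDS)
print(value[9])  # 10: accept the 9am bid (4) and the 12pm bid (6)
```

Processing the hours from last to first is what makes this dynamic programming: each node's value is computed once from nodes that are already solved.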

23 PROBABILISTIC TRANSITIONS 1. c(d) is a random variable. 2. x(d) is random. 3. The “trial” takes place after the decision.

24 EXEMPLAR (Probabilistic) An “asset” is available during the period 8pm, 9pm, ..., 3am. Each hour, a bid for the asset is drawn from the discrete probability distribution below. The asset is busy for 3 hr if the bid is accepted. [Table: probability of each bid size]

25 MANY APPROACHES TO FORMULATION N = 4am. S_n(x) = profit of optimally operating from n to N given state x at time n. d_n*(x) = the optimal policy at stage n given state x at time n (ACCEPT, REJECT). c(d_n) = the profit of taking decision d_n. x(d_n) = the proposed bid (3, 6, 9) or the number of hours left in the remaining engagement (1 hr, 2 hr).

26 RECURSION [Table: time vs. hours before asset is available again] See DP Example.xls
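One way to sketch this recursion in code, under stated assumptions: the bid sizes 3, 6, 9 come from the slides, but the probabilities in `BID_PROBS` are hypothetical, and a bid accepted in the last hours is assumed to pay in full even if the job runs past 4am. Instead of tracking "hours left" as a separate state, the sketch jumps straight to the stage where the asset is free again.

```python
# Bid sizes from the slide; the probabilities are hypothetical.
BID_PROBS = {3: 0.5, 6: 0.3, 9: 0.2}
N = 8     # stages 0..7 stand for 8pm..3am; stage 8 is 4am
BUSY = 3  # hours the asset is busy after an accepted bid


def solve(bid_probs=BID_PROBS, horizon=N, busy=BUSY):
    """Expected-profit recursion; returns values and the ACCEPT/REJECT policy."""
    # free[n]: expected profit from stage n on, asset free, bid not yet seen.
    # Entries at or beyond the horizon stay 0.
    free = [0.0] * (horizon + busy + 1)
    policy = {}
    for n in range(horizon - 1, -1, -1):
        expected = 0.0
        for bid, p in bid_probs.items():
            accept = bid + free[n + busy]  # take the bid, free again after `busy` hours
            reject = free[n + 1]           # wait for the next hour's bid
            policy[(n, bid)] = "ACCEPT" if accept >= reject else "REJECT"
            expected += p * max(accept, reject)
        free[n] = expected
    return free, policy


free, policy = solve()
```

Under these numbers the policy accepts any bid in the final hour, but turns down a bid of 3 two hours before the end because waiting has a higher expected value.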

27 UNLOCKING THE JARGON x(d) can be governed by a Markov chain, with a different transition matrix P_{i,j} for each decision d. The result is a Markov Decision Process.
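A minimal sketch of a finite-horizon Markov Decision Process with one transition matrix per decision. The two states, the decision names "stay" and "switch", and all probabilities and rewards are made up for illustration; terminal values of zero are also an assumption.

```python
# Hypothetical example: P[d][i][j] is the probability of moving from
# state i to state j under decision d. All numbers are made up.
P = {
    "stay":   [[0.9, 0.1], [0.2, 0.8]],
    "switch": [[0.3, 0.7], [0.6, 0.4]],
}
REWARD = {"stay": [1.0, 0.0], "switch": [0.5, 0.5]}  # reward[d][i]


def backward_induction(P, reward, horizon):
    """Maximize expected summed reward over `horizon` stages."""
    n_states = len(next(iter(P.values())))
    value = [0.0] * n_states  # terminal values S_N(x) = 0 (assumption)
    policy = []
    for _ in range(horizon):
        new_value, stage_policy = [], []
        for i in range(n_states):
            # Immediate reward plus expected value of the next state.
            scores = {
                d: reward[d][i] + sum(P[d][i][j] * value[j] for j in range(n_states))
                for d in P
            }
            best = max(scores, key=scores.get)
            stage_policy.append(best)
            new_value.append(scores[best])
        value = new_value
        policy.insert(0, stage_policy)  # policy[n][i]: decision at stage n, state i
    return value, policy


value, policy = backward_induction(P, REWARD, horizon=3)
```

The only change from the deterministic recursion is that the value of the next state is replaced by its expectation under the row of P chosen by the decision.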

