
Chapter 13 DETERMINISTIC DYNAMIC PROGRAMMING Math 305 2008 We will cover 9.1-9.4 plus some material not in the text.


1 Chapter 13 DETERMINISTIC DYNAMIC PROGRAMMING Math 305 2008 We will cover 9.1-9.4 plus some material not in the text

2 Dynamic Programming
A technique for making a sequence of interrelated decisions
Problem-solving strategies, e.g. find a route from here to LA
– forward: enumerate all possibilities
– backward: figure out which ways one can get to the desired end
General type of problem: consecutive stages
– at each stage you are in one of a number of possible states
– each state has one or more possible policies from which to choose
– the policy you choose determines your state at the next stage
Method
– start with a solution for a small part of the problem
– expand
A useful approach when you don't want to check all possibilities

3 Prototype problem (stage coach)
A traveller is going from the east coast to the west coast on a series of stage coaches that travel from one state to another, and wants the safest route.
S/he selects a life insurance policy for each stage, reasoning that cheapest = safest.

4 Approach
4 stages. States at each stage:
– stage 1: 1
– stage 2: 2, 3, 4
– stage 3: 5, 6, 7
– stage 4: 8, 9
First stage: travel from 1 to 2, 3, or 4
Second stage: travel from the state at that stage to a state at the next stage
Decision variables: $x_n$ ($n = 1, \dots, 4$) = destination on the $n$th stage

5 Approach
$f_n(s, x_n)$ = total cost of the best policy for the remaining stages, given you are in state $s$ at stage $n$ and select policy $x_n$ ($n$ = stage, $s$ = state, $x_n$ = decision)
$f_n(s)$ = cost of the best policy given you are in state $s$ at stage $n$ = $\min_{x_n} f_n(s, x_n)$ over the possible choices for $x_n$
$f_n(s, x_n) = c_{s x_n} + f_{n+1}(x_n)$ ($c_{s x_n}$ = cost of the policy from $s$ to $x_n$)
Goal: find $f_1(1)$
$f_1(1)$ = min of
$f_1(1, 2) = 2 + f_2(2)$
$f_1(1, 3) = 5 + f_2(3)$
$f_1(1, 4) = 1 + f_2(4)$

6 Start at Stage 4
$f_2^*(2) = ?$ ... Better: go backward
Find the best policy from stage 4 on, stage 3 on, ...
Stage 4: states 8 and 9 → we want $f_4^*(8)$ and $f_4^*(9)$
$f_4(8, 10) = 1 = f_4^*(8)$
$f_4(9, 10) = 4 = f_4^*(9)$

| $s$ | $f_4^*(s)$ | $x_4^*$ |
|---|---|---|
| 8 | 1 | 10 |
| 9 | 4 | 10 |

7 Stage 3
States 5, 6, 7 → we want $f_3^*(5)$, $f_3^*(6)$, $f_3^*(7)$
$f_3(5, 8) = 7 + f_4^*(8) = 8$
$f_3(5, 9) = 5 + f_4^*(9) = 9$
$f_3(6, 8) = 3 + f_4^*(8) = 4$
$f_3(6, 9) = 4 + f_4^*(9) = 8$
$f_3(7, 8) = 7 + f_4^*(8) = 8$
$f_3(7, 9) = 1 + f_4^*(9) = 5$
So far this isn't any better than exhaustive search

| $s$ \ $x_3$ | 8 | 9 | $f_3^*(s)$ | $x_3^*$ |
|---|---|---|---|---|
| 5 | 8 | 9 | 8 | 8 |
| 6 | 4 | 8 | 4 | 8 |
| 7 | 8 | 5 | 5 | 9 |

8 Stage 2
States 2, 3, 4
$f_2(2, 5) = 10 + f_3^*(5) = 18$
$f_2(2, 6) = 12 + f_3^*(6) = 16$
$f_2(3, 5) = 5 + f_3^*(5) = 13$
$f_2(3, 6) = 10 + f_3^*(6) = 14$
$f_2(3, 7) = 7 + f_3^*(7) = 12$
$f_2(4, 6) = 15 + f_3^*(6) = 19$

| $s$ \ $x_2$ | 5 | 6 | 7 | $f_2^*(s)$ | $x_2^*$ |
|---|---|---|---|---|---|
| 2 | 18 | 16 | - | 16 | 6 |
| 3 | 13 | 14 | 12 | 12 | 7 |
| 4 | - | 19 | 18 | 18 | 7 |

9 Stage 1
$f_1(1, 2) = 2 + f_2^*(2) = 18$
$f_1(1, 3) = 5 + f_2^*(3) = 17$
$f_1(1, 4) = 1 + f_2^*(4) = 19$

| $s$ \ $x_1$ | 2 | 3 | 4 | $f_1^*(s)$ | $x_1^*$ |
|---|---|---|---|---|---|
| 1 | 18 | 17 | 19 | 17 | 3 |

10 Shortest route: 1 → 3 → 7 → 9 → 10
Read the route off the stage tables from the front: $x_1^*(1) = 3$, $x_2^*(3) = 7$, $x_3^*(7) = 9$, $x_4^*(9) = 10$, for a total cost of $f_1^*(1) = 17$.
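The backward recursion of slides 6 to 10 can be sketched in Python. This is a minimal sketch, not the course's own code: the arc costs are read off the stage calculations above, and the cost 13 for the arc 4 → 7 is an assumption inferred from the table entry $f_2(4,7) = 18$ together with $f_3^*(7) = 5$. The names `COST`, `DEST`, and `best` are illustrative.

```python
from functools import lru_cache

# Arc costs c[s][x], read off the stage 1-4 calculations on the slides.
COST = {
    1: {2: 2, 3: 5, 4: 1},
    2: {5: 10, 6: 12},
    3: {5: 5, 6: 10, 7: 7},
    4: {6: 15, 7: 13},  # 13 is inferred: f2(4,7) = 18 and f3*(7) = 5
    5: {8: 7, 9: 5},
    6: {8: 3, 9: 4},
    7: {8: 7, 9: 1},
    8: {10: 1},
    9: {10: 4},
}
DEST = 10


@lru_cache(maxsize=None)
def best(s):
    """Return (f*(s), route): cheapest cost from state s to DEST and a route achieving it."""
    if s == DEST:
        return 0, (DEST,)
    options = []
    for x, c in COST[s].items():
        cost_to_go, route = best(x)
        options.append((c + cost_to_go, (s,) + route))
    return min(options)  # compares cost first, then the route tuple


cost, route = best(1)
print(cost, route)  # 17 (1, 3, 7, 9, 10)
```

Memoizing `best` is what keeps each state from being solved more than once, which is the saving over listing every path.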

11 Computational Efficiency
This method? 16 additions, 9 comparisons
How else could we solve this?
– list all paths (14) and total # additions (3 on each) => 42
– shortest route ? shortest route
Compare: $n+1$ stages with $n$ choices at each stage except the last
Dynamic programming: $n$ nodes with $n$ additions each = $n^2$
Exhaustive search: $n^n$ paths with $n$ additions each = $n^{n+1}$
E.g. $n = 10$: 100 versus $10^{11}$

12 13.3 Inventory Theory
What is inventory?
– something that is produced
– has a demand
– needs to be stored until used
– cookies at Dories, beer at the Fargo, flash drives at the bookstore
What is an inventory policy?
– when to order/produce more
– how much at a time
What costs are associated with inventory?
– cost per unit (variable)
– setup or ordering
– holding
– shortage

13 Inventory Theory
What are we trying to optimize?
Assumptions
– time is broken into periods
– production occurs at the beginning of the period
– each period has an associated demand, which is met from items held over from the last period and/or produced in the current period

14 13.3 Example
Demand for a product

| Month | Demand |
|---|---|
| 1 | 1 |
| 2 | 3 |
| 3 | 2 |
| 4 | 4 |

Costs: $c(x)$ = cost of producing $x$ units = $3 + 1x$ (for $x > 0$; producing nothing costs 0)

| Cost | Amount |
|---|---|
| Setup | 3 |
| Variable | 1 |
| Holding | .50 |

Other restrictions
– at most 5 units can be produced each month
– at most 4 units can be carried over to the next month
– 0 units on hand at the beginning of month 1
What are the stages, states, and decisions?
– stage: beginning of a month
– state: entering inventory
– decision: $x_n(s)$ = amount produced at the beginning of month $n$, given $s$ units on hand

15 Stage 4
$f_4(i)$ = cost of entering period 4 with $i$ units = cost of producing $4 - i$ units = $c(4 - i)$
$f_4(0)$ = set up + cost of 4 units = 3 + 4
$f_4(1)$ = set up + cost of 3 units = 3 + 3
Demand is 1 3 2 4

| $i$ | $f_4(i)$ | $x_4^*$ |
|---|---|---|
| 0 | 7 | 4 |
| 1 | 6 | 3 |
| 2 | 5 | 2 |
| 3 | 4 | 1 |
| 4 | 0 | 0 |

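16 Stage 3
$f_3(i) = \min_x \{ c(x) + 0.5(x + i - 2) + f_4(i + x - 2) \}$ for $x = 0, 1, \dots, 5$
The three terms are ordering cost, holding cost on the ending inventory, and the cost from stage 4 on.
Demand is 1 3 2 4

| $i$ \ $x$ | 0 | 1 | 2 | 3 | 4 | 5 | $f_3(i)$ | $x_3(i)$ |
|---|---|---|---|---|---|---|---|---|
| 0 | | | 12 | 12.5 | 13 | 13.5 | 12 | 2 |
| 1 | | 11 | 11.5 | 12 | 12.5 | 10 | 10 | 5 |
| 2 | 7 | 10.5 | 11 | 11.5 | 9 | | 7 | 0 |
| 3 | 6.5 | 10 | 10.5 | 8 | | | 6.5 | 0 |
| 4 | 6 | 9.5 | 7 | | | | 6 | 0 |
| 5 | 5.5 | 6 | | | | | 5.5 | 0 |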

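17 Alternate Notation for Stage 3

| $i$ | $x$ | cost | $f_3(i)$ | $x_3(i)$ |
|---|---|---|---|---|
| 0 | 2 | 12 | 12 | 2 |
| 0 | 3 | 12.5 | | |
| 0 | 4 | 13 | | |
| 0 | 5 | 13.5 | | |
| 1 | 1 | 11 | 10 | 5 |
| 1 | 2 | 11.5 | | |
| 1 | 3 | 12 | | |
| 1 | 4 | 12.5 | | |
| 1 | 5 | 10 | | |
| 2 | 0 | 7 | 7 | 0 |
| 2 | 1 | 10.5 | | |
| 2 | 2 | 11 | | |
| 2 | 3 | 11.5 | | |
| 2 | 4 | 9 | | |
| 3 | 0 | 6.5 | 6.5 | 0 |
| 3 | 1 | 10 | | |
| 3 | 2 | 10.5 | | |
| 3 | 3 | 8 | | |
| 4 | 0 | 6 | 6 | 0 |
| 4 | 1 | 9.5 | | |
| 4 | 2 | 7 | | |
| 5 | 0 | 5.5 | 5.5 | 0 |
| 5 | 1 | 6 | | |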

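18 Stage 2
$f_2(i) = \min_x \{ c(x) + 0.5(x + i - 3) + f_3(i + x - 3) \}$ for $x = 0, 1, \dots, 5$
Demand is 1 3 2 4

| $i$ \ $x$ | 0 | 1 | 2 | 3 | 4 | 5 | $f_2(i)$ | $x_2(i)$ |
|---|---|---|---|---|---|---|---|---|
| 0 | | | | 18 | 17.5 | 16 | 16 | 5 |
| 1 | | | 17 | 16.5 | 15 | 16 | 15 | 4 |
| 2 | | 16 | 15.5 | 14 | 15 | 16 | 14 | 3 |
| 3 | 12 | 14.5 | 13 | 14 | 15 | | 12 | 0 |
| 4 | 10.5 | 12 | 13 | 14 | | | 10.5 | 0 |
| 5 | ... | | | | | | | |

The entries for $(i, x) = (0, 4)$ and $(1, 3)$ follow from the recursion with $f_3(1) = 10$: $7 + 0.5 + 10 = 17.5$ and $6 + 0.5 + 10 = 16.5$.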

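19 Alternate Notation for Stage 2

| $i$ | $x$ | cost | $f_2(i)$ | $x_2(i)$ |
|---|---|---|---|---|
| 0 | 3 | 18 | 16 | 5 |
| 0 | 4 | 17.5 | | |
| 0 | 5 | 16 | | |
| 1 | 2 | 17 | 15 | 4 |
| 1 | 3 | 16.5 | | |
| 1 | 4 | 15 | | |
| 1 | 5 | 16 | | |
| 2 | 1 | 16 | 14 | 3 |
| 2 | 2 | 15.5 | | |
| 2 | 3 | 14 | | |
| 2 | 4 | 15 | | |
| 2 | 5 | 16 | | |
| 3 | 0 | 12 | 12 | 0 |
| 3 | 1 | 14.5 | | |
| 3 | 2 | 13 | | |
| 3 | 3 | 14 | | |
| 3 | 4 | 15 | | |
| 4 | 0 | 10.5 | 10.5 | 0 |
| 4 | 1 | 12 | | |
| 4 | 2 | 13 | | |
| 4 | 3 | 14 | | |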

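20 Stage 1
$f_1(i) = \min_x \{ c(x) + 0.5(x + i - 1) + f_2(i + x - 1) \}$ for $x = 0, 1, \dots, 5$
Demand is 1 3 2 4
(text also finds $f_1(i)$ for $i = 1, 2, 3, 4$)

| $i$ \ $x$ | 1 | 2 | 3 | 4 | 5 | $f_1(i)$ | $x_1(i)$ |
|---|---|---|---|---|---|---|---|
| 0 | 20 | 20.5 | 21 | 20.5 | | 20 | 1 |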
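The four stage recursions above share one form, so they can be checked with a short Python sketch. This is an illustration under stated assumptions rather than the course's code: holding cost is charged on ending inventory (as in the recursions), inventory left after month 4 is worth nothing, and the names `f`, `plan`, and the constants are hypothetical.

```python
from functools import lru_cache

DEMAND = [1, 3, 2, 4]           # months 1..4
SETUP, UNIT, HOLD = 3.0, 1.0, 0.5
MAX_PROD, MAX_STOCK = 5, 4      # production limit, carry-over limit


def production_cost(x):
    """c(x) = 3 + 1x when something is produced, 0 otherwise."""
    return 0.0 if x == 0 else SETUP + UNIT * x


@lru_cache(maxsize=None)
def f(n, i):
    """Return (min cost from month n on, best production); n is 0-based, i = entering inventory."""
    if n == len(DEMAND):
        return 0.0, None  # assumption: nothing to pay after the last month
    best = (float("inf"), None)
    for x in range(MAX_PROD + 1):
        end = i + x - DEMAND[n]
        if 0 <= end <= MAX_STOCK:
            cost = production_cost(x) + HOLD * end + f(n + 1, end)[0]
            if cost < best[0]:
                best = (cost, x)
    return best


def plan(i=0):
    """Follow the optimal decisions forward from entering inventory i."""
    total = f(0, i)[0]
    schedule = []
    for n in range(len(DEMAND)):
        x = f(n, i)[1]
        schedule.append(x)
        i = i + x - DEMAND[n]
    return total, schedule


print(plan())  # (20.0, [1, 5, 0, 4])
```

The schedule produces 1 unit in month 1, 5 units in month 2 (covering months 2 and 3), nothing in month 3, and 4 units in month 4.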

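21 There must be an easier way (not in text)
Only produce when entering inventory = 0
– don't carry inventory to meet part of the demand if you will have to produce in the period
Demand is 1 3 2 4

Stage 4

| $i$ | $f_4(i)$ | $x_4^*$ |
|---|---|---|
| 0 | 7 | 4 |
| 4 | 0 | 0 |

Stage 3

| $i$ | $x$ | cost | $f_3(i)$ | $x_3(i)$ |
|---|---|---|---|---|
| 0 | 2 | 12 | 12 | 2 |
| 2 | 0 | 7 | 7 | 0 |

Stage 2

| $i$ | $x$ | cost | $f_2(i)$ | $x_2(i)$ |
|---|---|---|---|---|
| 0 | 3 | 18 | 16 | 5 |
| 0 | 4 | 17.5 | | |
| 3 | 0 | 12 | 12 | 0 |

Stage 1

| $i$ | $x$ | cost | $f_1(i)$ | $x_1(i)$ |
|---|---|---|---|---|
| 0 | 1 | 20 | 20 | 1 |
| 0 | 4 | 20.5 | | |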

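22 Network Representation
[Figure: network whose nodes are labeled $(i, j)$, where $i$ = period and $j$ = beginning inventory; the nodes shown include (1,0), (2,0), (2,1), (2,2), (2,3), (2,4), and (5,0).]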

23 13.4 Resource Allocation
I have 5 blocks of time to study and want to maximize the sum of my grades
$g_i(x_i)$ = grade in subject $i$ given I study $x_i$ blocks
3 decisions: $x_n$ = # blocks to spend on subject $n$
stage = subject
state = time available to allocate to remaining stages

| hours | Eng | Econ | Phys |
|---|---|---|---|
| 0 | 40 | 65 | 40 |
| 1 | 65 | 70 | 55 |
| 2 | 75 | 89 | 63 |
| 3 | 88 | 91 | 78 |
| 4 | 93 | 95 | 81 |
| 5 | 99 | 98 | 85 |

24 Resource Allocation
$g_i(x_i)$ = grade in subject $i$ given $x_i$ blocks
Objective: $\max \sum_i g_i(x_i)$ subject to $\sum_i x_i = 5$
$f_n(s)$ = max effectiveness of $s$ hours in stages $n$ through 3
$f_n(s, x_n) = g_n(x_n) + f_{n+1}(s - x_n)$ = grade from $x_n$ hours on subject $n$ plus the best effect of the remaining hours in later stages
$f_n(s) = \max_{x_n = 0, \dots, s} \{ g_n(x_n) + f_{n+1}(s - x_n) \}$

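25 Resource Allocation
$f_3(s)$

| $s$ | $f_3(s)$ | $x_3$ |
|---|---|---|
| 0 | 40 | 0 |
| 1 | 55 | 1 |
| 2 | 63 | 2 |
| 3 | 78 | 3 |
| 4 | 81 | 4 |
| 5 | 85 | 5 |

$f_2(s)$

| $s$ \ $x_2$ | 0 | 1 | 2 | 3 | 4 | 5 | $f_2(s)$ | $x_2$ |
|---|---|---|---|---|---|---|---|---|
| 0 | 105 | | | | | | 105 | 0 |
| 1 | 120 | 110 | | | | | 120 | 0 |
| 2 | 128 | 125 | 129 | | | | 129 | 2 |
| 3 | 143 | 133 | 144 | 131 | | | 144 | 2 |
| 4 | 146 | 148 | 152 | 146 | 135 | | 152 | 2 |
| 5 | 150 | 151 | 167 | 154 | 150 | 138 | 167 | 2 |

$f_1(s)$

| $s$ \ $x_1$ | 0 | 1 | 2 | 3 | 4 | 5 | $f_1(s)$ | $x_1$ |
|---|---|---|---|---|---|---|---|---|
| 5 | 207 | 217 | 219 | 217 | 213 | 204 | 219 | 2 |

Each $f_2$ entry is $g_2(x_2) + f_3(s - x_2)$ from the grade table, e.g. $f_2(1, 1) = 70 + 40 = 110$ and $f_2(5, 2) = 89 + 78 = 167$.
Solution: English 2, Econ 2, Physics 1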
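The tables above can be reproduced with a short backward recursion. The following Python sketch is illustrative only: the grade table is the one from slide 23, and the names `GRADES` and `allocate` are assumptions, not part of the course material.

```python
GRADES = {
    "English": [40, 65, 75, 88, 93, 99],
    "Econ":    [65, 70, 89, 91, 95, 98],
    "Physics": [40, 55, 63, 78, 81, 85],
}


def allocate(grades, blocks):
    """Backward DP over subjects; returns (best total grade, blocks per subject)."""
    subjects = list(grades)
    f_next = [0] * (blocks + 1)  # f_{N+1}(s) = 0: no subjects left
    decisions = []               # one list of best x per subject, indexed by s
    for subject in reversed(subjects):
        g = grades[subject]
        f_now, x_now = [], []
        for s in range(blocks + 1):
            # best number of blocks for this subject when s blocks remain
            x_best = max(range(s + 1), key=lambda x: g[x] + f_next[s - x])
            x_now.append(x_best)
            f_now.append(g[x_best] + f_next[s - x_best])
        decisions.append(x_now)
        f_next = f_now
    decisions.reverse()

    # Walk forward through the stored decisions to recover the allocation.
    plan, s = {}, blocks
    for subject, x_now in zip(subjects, decisions):
        plan[subject] = x_now[s]
        s -= x_now[s]
    return f_next[blocks], plan


print(allocate(GRADES, 5))  # (219, {'English': 2, 'Econ': 2, 'Physics': 1})
```

Because every grade column is increasing, the last subject always absorbs whatever time is left, so the constraint that all 5 blocks are used holds without being imposed separately.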

26 Continuous Functions
Suppose we can allocate fractional units of time and grade is a continuous function of time spent
English: $g_1(x) = 65 - x^2 + 11x$, max at $(5.5, 95.25)$
Econ: $g_2(x) = 80 - 2x^2 + 13x$, max at $(3.25, 101.125)$
Physics: $g_3(x) = 55 + 7x$
$f_3(s) = 55 + 7s$, with $x_3 = s$
$f_2(s, x_2) = g_2(x_2) + f_3(s - x_2) = 80 - 2x_2^2 + 13x_2 + 55 + 7(s - x_2) = 135 - 2x_2^2 + 6x_2 + 7s$
$df_2/dx_2 = -4x_2 + 6 = 0$ when $x_2 = 3/2$
$d^2 f_2/dx_2^2 = -4$ → max at $x_2 = 3/2$ (if $s \ge 3/2$); otherwise max at $x_2 = s$

27 Continuous Functions
Case 1, $s < 3/2$: $x_2 = s$, $f_2(s) = 80 - 2s^2 + 13s + 55 + 7(s - s) = 135 + 13s - 2s^2$
Case 2, $3/2 \le s$: $x_2 = 3/2$, $f_2(s) = 80 - 2(3/2)^2 + 13(3/2) + 55 + 7(s - 3/2) = 139.5 + 7s$
Both cases give $f_2(3/2) = 150$, so $f_2$ is continuous at $s = 3/2$.
Implications
– if available time < 3/2, put it all into econ
– if ≥ 3/2, put 3/2 into econ and the surplus into physics (compare the slopes of the two graphs before and after 3/2)

| | $x_2$ | $f_2(s)$ |
|---|---|---|
| $s < 3/2$ | $s$ | $135 + 13s - 2s^2$ |
| $3/2 \le s$ | $3/2$ | $139.5 + 7s$ |

28 Continuous Functions
$f_1(5, x_1) = g_1(x_1) + f_2(5 - x_1)$
Case 1, $5 - x_1 < 3/2$ ($x_1 > 3.5$):
$f_1(5, x_1) = 65 - x_1^2 + 11x_1 + 135 + 13(5 - x_1) - 2(5 - x_1)^2$
$df_1/dx_1 = -2x_1 + 11 - 13 + 4(5 - x_1) = 18 - 6x_1 < 0$ for $3.5 < x_1 \le 5$
→ on this range the best choice is at the boundary $x_1 = 3.5$, where $f_1(5, 3.5) = 241.25$
Case 2, $5 - x_1 \ge 3/2$ ($0 \le x_1 \le 3.5$):
$f_1(5, x_1) = 65 - x_1^2 + 11x_1 + 139.5 + 7(5 - x_1) = 239.5 + 4x_1 - x_1^2$
$df_1/dx_1 = 4 - 2x_1 = 0$ at $x_1 = 2$
$d^2 f_1/dx_1^2 = -2$ → max at $x_1 = 2$
$f_1(5) = f_1(5, 2) = 243.5$

| | $x_2$ | $f_2(s)$ |
|---|---|---|
| $s < 3/2$ | $s$ | $135 + 13s - 2s^2$ |
| $3/2 \le s$ | $3/2$ | $139.5 + 7s$ |

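29 Continuous Functions
Decision: $x_1 = 2$
Thus $5 - x_1 = 3$ is left for the remaining stages
$3 > 3/2$ → $x_2 = 3/2$ → $x_3 = 3 - 3/2 = 3/2$
Solution: $x_1 = 2$, $x_2 = 3/2$, $x_3 = 3/2$, sum of grades = $g_1(2) + g_2(3/2) + g_3(3/2) = 83 + 95 + 65.5 = 243.5$

| | $x_3$ | $f_3(s)$ |
|---|---|---|
| $0 \le s \le 5$ | $s$ | $55 + 7s$ |

| | $x_2$ | $f_2(s)$ |
|---|---|---|
| $s < 3/2$ | $s$ | $135 + 13s - 2s^2$ |
| $3/2 \le s$ | $3/2$ | $139.5 + 7s$ |

| range of $x_1$ | best $x_1$ | $f_1(5, x_1)$ |
|---|---|---|
| $0 \le x_1 < 3.5$ | 2 | 243.5 |
| $3.5 \le x_1 \le 5$ | 3.5 | 241.25 |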
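As a numerical cross-check on the calculus, the continuous problem can be searched on a grid. The sketch below is an illustration, not part of the slides: it assumes a grid step of 0.25 blocks (the name `STEP` and the helper functions are hypothetical) and simply evaluates every split of the 5 blocks.

```python
def g1(x):  # English
    return 65 - x**2 + 11 * x


def g2(x):  # Econ
    return 80 - 2 * x**2 + 13 * x


def g3(x):  # Physics
    return 55 + 7 * x


STEP = 0.25  # assumed grid resolution, in blocks of time
N = 20       # 5 blocks / 0.25

# Enumerate (total grade, x1, x2, x3) over all grid splits with x1 + x2 + x3 = 5.
best = max(
    (g1(a * STEP) + g2(b * STEP) + g3((N - a - b) * STEP),
     a * STEP, b * STEP, (N - a - b) * STEP)
    for a in range(N + 1)
    for b in range(N + 1 - a)
)
print(best)  # (243.5, 2.0, 1.5, 1.5)
```

At this optimum the marginal grades agree, $g_1'(2) = g_2'(3/2) = g_3' = 7$, which is what the two-case analysis produces.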

30 Probabilistic Models
The state at the next stage is not completely determined by the decision at the current stage; rather, the decision determines a probability distribution for the next stage.
Objective: maximize the expected value.
Example: job interview
– a job candidate has up to three interviews
– at each, she will be offered a job which is terrific, good, or fair
– she must decide then whether to accept the job or interview again
– a terrific job is worth 3 points, good: 2, fair: 1
Stages: interviews
State: job status at stage $n$ (T, G, or F)
Decision: interview or accept

| Job | Prob. | Value |
|---|---|---|
| T | .2 | 3 |
| G | .5 | 2 |
| F | .3 | 1 |

31 Probabilistic Models
$f_n(s)$ = max expected value if in state $s$ at stage $n$
$f_n(s, x_n)$ = max expected value if in state $s$ at stage $n$ and make decision $x_n$ ($x_n$ = i or a)
[Figure: decision tree starting at node 0, with branches labeled T, G, and F at each interview.]

32 Stage 3
$f_3(T, i) = 3(.2) + 2(.5) + 1(.3) = 1.9$, $f_3(T, a) = 3$ → $f_3(T) = 3$, $x_3 = a$
$f_3(G, i) = 3(.2) + 2(.5) + 1(.3) = 1.9$, $f_3(G, a) = 2$ → $f_3(G) = 2$, $x_3 = a$
$f_3(F, i) = 3(.2) + 2(.5) + 1(.3) = 1.9$, $f_3(F, a) = 1$ → $f_3(F) = 1.9$, $x_3 = i$

| $s$ \ $x_3$ | i | a | $f_3(s)$ | $x_3$ |
|---|---|---|---|---|
| T | 1.9 | 3 | 3 | a |
| G | 1.9 | 2 | 2 | a |
| F | 1.9 | 1 | 1.9 | i |

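33 Stage 2
$f_2(T, i) = p(T) f_3(T) + p(G) f_3(G) + p(F) f_3(F) = .2(3) + .5(2) + .3(1.9) = 2.17$
$f_2(T, a) = 3$ → $f_2(T) = 3$, $x_2 = a$
The value of interviewing again does not depend on the current offer, so $f_2(G, i) = f_2(F, i) = 2.17$ as well.

| $s$ \ $x_2$ | i | a | $f_2^*(s)$ | $x_2$ |
|---|---|---|---|---|
| T | 2.17 | 3 | 3 | a |
| G | 2.17 | 2 | 2.17 | i |
| F | 2.17 | 1 | 2.17 | i |

Stage 1
$f_1(i) = .2 f_2(T) + .5 f_2(G) + .3 f_2(F) = .2(3) + .5(2.17) + .3(2.17) = 2.336$
Strategy at stage I: interview; II: interview in G or F; III: interview in F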
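The expected-value recursion for the interview example can be sketched as follows. This is a minimal illustration that follows the slides' convention: offers are judged at stages 3 and 2, and a further interview after stage 3 ends with whatever is offered. The names `PROB`, `VALUE`, and `solve` are assumptions for the example.

```python
PROB = {"T": 0.2, "G": 0.5, "F": 0.3}
VALUE = {"T": 3, "G": 2, "F": 1}


def solve():
    """Return (expected value before the first interview, policy by stage and state)."""
    # Expected value of one more interview whose offer must be accepted.
    cont = sum(PROB[s] * VALUE[s] for s in PROB)
    policy = {}
    for stage in (3, 2):
        # Accept ("a") if the offer in hand beats the value of interviewing again ("i").
        policy[stage] = {s: ("a" if VALUE[s] >= cont else "i") for s in PROB}
        f = {s: max(VALUE[s], cont) for s in PROB}
        # Expected value of arriving at this stage without knowing the offer yet.
        cont = sum(PROB[s] * f[s] for s in PROB)
    return cont, policy


value, policy = solve()
print(round(value, 3))  # 2.336
print(policy[2])        # {'T': 'a', 'G': 'i', 'F': 'i'}
print(policy[3])        # {'T': 'a', 'G': 'a', 'F': 'i'}
```

The only change from the deterministic recursions is that the cost-to-go is replaced by a probability-weighted average over the next state.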

34 Knapsack Problem (back to 13.4)
A thief breaks into a house. Around the thief are various objects: a diamond ring, a silver candelabra, a Bose Wave Radio, a large portrait of Elvis Presley painted on a black velvet background (a "velvet-elvis"), and a large Tiffany crystal vase. The thief has a knapsack that can only hold a certain capacity (8). Each of the items has a value and a size, and the knapsack cannot hold all of the items. Which items should the thief take?
There are three thieves: greedy, foolish and slow, and wise (ref for this example)

| Item | Size | Value |
|---|---|---|
| Ring | 1 | 15 |
| Candelabra | 5 | 10 |
| Radio | 3 | 9 |
| Elvis | 4 | 5 |

35 Knapsack Problem
The greedy thief breaks into the window, and sees the items. He makes a mental list of the items, and grabs the most expensive item first. The ring goes in first, leaving a capacity of 7, and a value of 15. Next, he grabs the candelabra, leaving a knapsack of size 2 and a value of 25. No other items will fit in his knapsack, so he leaves.
The foolish and slow thief climbs in the window, and sees the items. This thief was a programmer, downsized as part of the "dot-bomb" blowout. Possessing a solid background in boolean logic, he figures that he can simply compute all combinations of the objects and choose the best. So, he starts going through the binary combinations of objects, all $2^5$ of them. While he is still drawing the truth table, the police show up, and arrest him. Although his solution would certainly have given him the best answer, it just took too long to compute.
Items (size, value): Ring (1, 15), Candelabra (5, 10), Radio (3, 9), Elvis (4, 5)
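The first two thieves' methods are easy to put side by side in code. The sketch below is illustrative only: it uses the four items that appear in the table (the vase has no size or value listed, so it is left out), and the function names `greedy` and `exhaustive` are hypothetical.

```python
from itertools import combinations

# name: (size, value), from the item table
ITEMS = {"Ring": (1, 15), "Candelabra": (5, 10), "Radio": (3, 9), "Elvis": (4, 5)}
CAPACITY = 8


def greedy(items, capacity):
    """Take items in decreasing order of value whenever they still fit."""
    taken, value = [], 0
    for name, (size, val) in sorted(items.items(), key=lambda kv: -kv[1][1]):
        if size <= capacity:
            taken.append(name)
            capacity -= size
            value += val
    return value, taken


def exhaustive(items, capacity):
    """Try every subset of items and keep the most valuable one that fits."""
    best = (0, ())
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            size = sum(items[n][0] for n in combo)
            value = sum(items[n][1] for n in combo)
            if size <= capacity and value > best[0]:
                best = (value, combo)
    return best


print(greedy(ITEMS, CAPACITY))      # (25, ['Ring', 'Candelabra'])
print(exhaustive(ITEMS, CAPACITY))  # (29, ('Ring', 'Radio', 'Elvis'))
```

With four items the exhaustive search inspects only 16 subsets, but the count doubles with every additional item, which is the foolish thief's problem.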

36 Knapsack Problem, Wise Thief
The wise thief appears, and observes the items. He notes that an empty knapsack has a value of 0. He notes that a knapsack can either contain each item, or not. Further, his decision to include an item will be based on a quick calculation: either the knapsack with some combination of the previous items will be worth more, or else the knapsack of a size that will fit the current item was worth more. So, he does this quick computation, and figures out that the best knapsack he can take is made up of items 1, 3, and 4, for a total value of 29.
Items (size, value): Ring (1, 15), Candelabra (5, 10), Radio (3, 9), Elvis (4, 5)

37 Generalized Resource Problem (p. 767)
$w$ units of a resource available
$T$ activities to which the resource can be allocated
$x_t$: the level at which activity $t$ is implemented
$g_t(x_t)$: # of units of the resource used by activity $t$
$r_t(x_t)$: the resulting benefit
Stages: each activity
States: how much of the resource is available for remaining stages
Decision: how much to use at this stage
Formulation: maximize $\sum_{t=1}^{T} r_t(x_t)$ s.t. $\sum_{t=1}^{T} g_t(x_t) \le w$
$f_t(d)$ = max benefit if $d$ units are allocated to activities $t$ through $T$
$f_t(d) = \max_{x_t} \{ r_t(x_t) + f_{t+1}(d - x_t) \}$
$f_{T+1}(d) = 0$

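38 Wise Thief, Stages 4 and 3
Items (size, value): Ring (1, 15), Candelabra (5, 10), Radio (3, 9), Elvis (4, 5)

| $s$ \ $x_4$ | 0 | 4 | $f_4^*(s)$ | $x_4$ |
|---|---|---|---|---|
| 0 | 0 | | 0 | 0 |
| 4 | 0 | 5 | 5 | 4 |

$f_3(d) = \max \{ r_3(x_3) + f_4(d - x_3) \}$
$f_3(3) = \max \{ r_3(0) + f_4(3) = 0 + 0,\; r_3(3) + f_4(0) = 9 + 0 \} = 9$
$f_3(4) = \max \{ r_3(0) + f_4(4) = 5,\; r_3(3) + f_4(1) = 9 \} = 9$
$f_3(7) = \max \{ r_3(0) + f_4(7) = 5,\; r_3(3) + f_4(4) = 9 + 5 = 14 \} = 14$

| $s$ \ $x_3$ | 0 | 3 | $f_3^*(s)$ | $x_3$ |
|---|---|---|---|---|
| 0 | 0 | | 0 | 0 |
| 3 | 0 | 9 | 9 | 3 |
| 4 | 5 | 9 | 9 | 3 |
| 7 | 5 | 14 | 14 | 3 |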

39 Stage 2
Items (size, value): Ring (1, 15), Candelabra (5, 10), Radio (3, 9), Elvis (4, 5)
$f_2(d) = \max \{ r_2(x_2) + f_3(d - x_2) \}$
$f_2(3) = \max \{ r_2(0) + f_3(3) = 9 \} = 9$
$f_2(4) = \max \{ r_2(0) + f_3(4) = 9 \} = 9$
$f_2(5) = \max \{ r_2(0) + f_3(5) = 9,\; r_2(5) + f_3(0) = 10 \} = 10$
$f_2(7) = \max \{ r_2(0) + f_3(7) = 14,\; r_2(5) + f_3(2) = 10 \} = 14$
$f_2(8) = \max \{ r_2(0) + f_3(8) = 14,\; r_2(5) + f_3(3) = 10 + 9 = 19 \} = 19$

| $s$ \ $x_2$ | 0 | 5 | $f_2^*(s)$ | $x_2$ |
|---|---|---|---|---|
| 0 | 0 | | 0 | 0 |
| 3 | 9 | | 9 | 0 |
| 4 | 9 | | 9 | 0 |
| 5 | 9 | 10 | 10 | 5 |
| 7 | 14 | 10 | 14 | 0 |
| 8 | 14 | 19 | 19 | 5 |

40 Stage 1
Items (size, value): Ring (1, 15), Candelabra (5, 10), Radio (3, 9), Elvis (4, 5)
$f_1(8) = \max \{ r_1(0) + f_2(8) = 0 + 19,\; r_1(1) + f_2(7) = 15 + 14 \} = 29$

| $s$ \ $x_1$ | 0 | 1 | $f_1^*(s)$ | $x_1$ |
|---|---|---|---|---|
| 8 | 19 | 29 | 29 | 1 |
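The wise thief's stage-by-stage calculation is the recursion $f_t(d)$ applied to four items. The following Python sketch is one way to write it, under the assumption that each item is either taken once or left (as on the slides); the names `ITEMS`, `f`, and `chosen` are illustrative.

```python
from functools import lru_cache

# (name, size, value) in stage order: stage 1 = Ring, ..., stage 4 = Elvis
ITEMS = [("Ring", 1, 15), ("Candelabra", 5, 10), ("Radio", 3, 9), ("Elvis", 4, 5)]


@lru_cache(maxsize=None)
def f(t, d):
    """Max value from items t, t+1, ... (0-based) with d units of space left."""
    if t == len(ITEMS):
        return 0  # f_{T+1}(d) = 0
    _, size, value = ITEMS[t]
    skip = f(t + 1, d)
    take = value + f(t + 1, d - size) if size <= d else float("-inf")
    return max(skip, take)


def chosen(d):
    """Recover which items an optimal knapsack of capacity d contains."""
    picks = []
    for t, (name, size, value) in enumerate(ITEMS):
        if size <= d and value + f(t + 1, d - size) > f(t + 1, d):
            picks.append(name)
            d -= size
    return picks


print(f(0, 8), chosen(8))  # 29 ['Ring', 'Radio', 'Elvis']
```

Here `f(1, 8)` and `f(1, 7)` correspond to $f_2(8) = 19$ and $f_2(7) = 14$ in the stage 2 table, because the code counts stages from 0.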

41 How is this an LP?
$x_i$ = # of item $i$
max $15x_1 + 10x_2 + 9x_3 + 5x_4$
s.t. $x_1 + 5x_2 + 3x_3 + 4x_4 \le 8$

42 Turnpike Theorem
Let $c_j$ = benefit from item $j$, $w_j$ = weight of item $j$
Order items by benefit per unit weight: $c_1/w_1 \ge c_2/w_2 \ge c_3/w_3 \ge \dots$
If there is a unique "best" item #1, i.e. $c_1/w_1 > c_2/w_2$, then when the max weight $w \ge w^* = \dfrac{c_1 w_1}{c_1 - w_1 (c_2/w_2)}$ the optimal solution contains at least one of item 1
Thief problem: $15/1 > 9/3 > 10/5 > 5/4$
$w^* = 15 \cdot 1 / (15 - 1 \cdot 9/3) = 15/12 = 1.25 < 8$
Why is this any use?
– start with one ring and reduce computation
Why turnpike?
– for a long trip you might go a little out of your way to maximize time on a turnpike.
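The threshold $w^*$ is a one-line calculation, shown here with exact fractions. This is a small sketch: the function name `turnpike_threshold` is an assumption, and the inputs are the ring (15, 1) and the radio (9, 3), the two items with the highest value per unit size.

```python
from fractions import Fraction


def turnpike_threshold(c1, w1, c2, w2):
    """Return w* = c1*w1 / (c1 - w1*c2/w2) for items with c1/w1 > c2/w2."""
    if Fraction(c1, w1) <= Fraction(c2, w2):
        raise ValueError("item 1 must have strictly the best benefit per unit weight")
    return Fraction(c1 * w1) / (c1 - w1 * Fraction(c2, w2))


w_star = turnpike_threshold(15, 1, 9, 3)
print(w_star, float(w_star))  # 5/4 1.25
```

Since the knapsack capacity of 8 exceeds 1.25, the condition of the theorem is met for the thief problem.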

43 Proof of Turnpike Theorem
Without using any type 1 items we cannot do better than include $w/w_2$ type 2 items
This would earn $c_2 w/w_2$
Suppose we fill the knapsack with as many type 1 items as possible
We can fit in at least $(w/w_1 - 1)$ type 1 items
These items would earn a benefit of $c_1 (w/w_1 - 1)$
Thus if
$c_1 (w/w_1 - 1) \ge c_2 w/w_2$  (1)
there must be an optimal solution using a type 1 item
(1) holds if $w (c_1/w_1 - c_2/w_2) \ge c_1$, or
$w \ge \dfrac{c_1 w_1}{c_1 - c_2 w_1/w_2} = w^*$
Thus if the knapsack can hold at least $w^*$ pounds, there will be an optimal solution using at least one type 1 item

