Approximate Dynamic Programming for High-Dimensional Asset Allocation Ohio State April 16, 2004 Warren Powell CASTLE Laboratory Princeton University © 2003 Warren B. Powell, Princeton University
© 2004 Warren B. Powell Slide 2 Yellow Freight System © 2002 Warren B. Powell, Princeton University
Air Mobility Command: Fuel, Cargo Handling, Ramp Space, Maintenance, Cargo Holding
The optimization challenge
Special equipment
Schneider National
The dynamic assignment problem (figure: resources and tasks at time 0, time 1, and time 2)
Multidimensional attribute spaces (figure: decision d, then decision d’, leading to attributes marked “?”)
State variables Modeling the fleet management problem: »State variables: »Control variables:
State variables We can formulate the problem of determining what to do with our truck as a dynamic program:
Hierarchical aggregation Attribute vectors tend to become increasingly complex:
State variables If we only have N=1 truck:
State variables What if we have N > 1 trucks?
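The jump from N=1 to N > 1 trucks can be made concrete with a small counting sketch. It assumes (the slides do not show the formula) that the fleet state is a vector of counts, one count per attribute value, so the number of states is the number of ways to place N identical trucks into |A| attribute bins. The function name `num_fleet_states` and the choice of 100 attribute values are illustrative only.

```python
from math import comb


def num_fleet_states(num_trucks: int, num_attributes: int) -> int:
    """Count state vectors (R_a) with sum_a R_a = num_trucks.

    Assumption: trucks with the same attribute are interchangeable, so a
    state is just a count per attribute value (stars-and-bars counting).
    """
    return comb(num_trucks + num_attributes - 1, num_attributes - 1)


if __name__ == "__main__":
    # Hypothetical attribute space with 100 values (e.g. 100 locations).
    for n in (1, 2, 10, 100):
        print(n, num_fleet_states(n, 100))
```

With a single truck the state space is just the attribute space; with many trucks the count grows combinatorially, which is one way to read the "state space" curse of dimensionality.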
Computational challenge A sequence of algorithmic challenges: »Step 1: Managing multiple, simple assets. »Step 2: Managing a single, complex asset. »Step 3: Managing multiple, complex assets.
Outline An algorithmic strategy for high-dimensional dynamic programs.
Approximate dynamic programming Systems evolve through a cycle of exogenous and endogenous information. (figure: timeline)
Approximate dynamic programming Using this state variable, we obtain the optimality equations: Problem: Curse of dimensionality. Three curses: State space, Outcome space, Action space (feasible region)
Approximate dynamic programming The computational challenge: How do we find ? How do we compute the expectation? How do we find the optimal solution?
Approximate dynamic programming Approximation methodology: Can’t compute this!!! Don’t know what this is!
Adaptive dynamic programming Alternative: Change the definition of the state variable: (figure: timeline)
Adaptive dynamic programming Now our optimality equation looks like: We drop the expectation and solve the conditional problem: Finally, we substitute in our approximation:
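The equations for these three steps did not survive in the transcript. As a hedged sketch only, the block below writes them in the form commonly used in the approximate dynamic programming literature. All notation is assumed rather than recovered from the slides: S_t is the pre-decision state, S^x_t the post-decision state, x_t the decision, C_t the one-period contribution, \omega^n the sample path at iteration n, and \bar{V}_t the value function approximation.

```latex
% Assumed notation; not taken from the original slides.
\begin{align}
% Standard optimality equation around the pre-decision state:
V_t(S_t) &= \max_{x_t \in \mathcal{X}_t}
  \Big( C_t(S_t, x_t) + \mathbb{E}\big[ V_{t+1}(S_{t+1}) \mid S_t \big] \Big) \\
% Optimality equation around the post-decision state:
V^x_{t-1}(S^x_{t-1}) &= \mathbb{E}\Big[ \max_{x_t \in \mathcal{X}_t}
  \big( C_t(S_t, x_t) + V^x_t(S^x_t) \big) \,\Big|\, S^x_{t-1} \Big] \\
% Drop the expectation (condition on the sample \omega^n) and
% substitute the approximation from the previous iteration:
\hat{v}^n_t &= \max_{x_t \in \mathcal{X}_t}
  \Big( C_t\big(S_t(\omega^n), x_t\big) + \bar{V}^{n-1}_t(S^x_t) \Big)
\end{align}
```

Under this reading, the maximization in the last line is a deterministic problem for a single sample, which is what makes it tractable when x_t is a high-dimensional vector.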
Adaptive dynamic programming Approximating the value function: »We choose approximations of the form:
Outline Managing multiple, simple assets
Norfolk Southern
When a boxcar becomes empty, we have three options: Customers Regional depots General depots
Option 1: Send directly to customers
Option 2: Send to distribution areas
Option 3: Send to general depots
Forecasts of Car Demands (figure: forecast vs. actual)
Multiple, simple assets
Multiple, simple assets Our basic strategy: Separable approximation
Multiple, simple assets Two-stage resource allocation under uncertainty
Multiple, simple assets We estimate the functions by sampling from our distributions. Marginal value:
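To illustrate estimating marginal values by sampling, here is a minimal sketch for a single location. Everything specific in it is an assumption for illustration: the second-stage value is taken to be revenue times the number of demands served, demand is drawn uniformly from 0 to 5, and the names `second_stage_value` and `sample_marginal_values` are hypothetical.

```python
import random


def second_stage_value(resources: int, demand: int, revenue: float = 1.0) -> float:
    """Assumed second-stage value: revenue for each unit of demand served."""
    return revenue * min(resources, demand)


def sample_marginal_values(max_resources, demand_sampler, num_samples, seed=0):
    """Average, over sampled demands, the value of the (r+1)-th resource."""
    rng = random.Random(seed)
    slopes = [0.0] * max_resources
    for _ in range(num_samples):
        demand = demand_sampler(rng)
        for r in range(max_resources):
            # Marginal value of one more unit when r units are already there.
            slopes[r] += (second_stage_value(r + 1, demand)
                          - second_stage_value(r, demand))
    return [s / num_samples for s in slopes]


# Hypothetical demand distribution: uniform on {0, ..., 5}.
slopes = sample_marginal_values(6, lambda rng: rng.randint(0, 5), 2000)

if __name__ == "__main__":
    print([round(s, 3) for s in slopes])
```

Because every sampled function here is concave in the number of resources, the averaged slopes come out non-increasing, which gives a piecewise-linear concave estimate for that location.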
Two-stage problems The time t subproblem: (figure: network with nodes t, (i-1,t+3), (i,t+1), (i+1,t+5))
Two-stage problems Left and right gradients are found by solving flow augmenting path problems. Gradients: The right derivative (the value of one more unit of that resource) is the cost of a flow augmenting path from that node to the supersink.
Two-stage problems Left and right derivatives are used to build up a nonlinear approximation of the subproblem. (figure labels: right derivative, left derivative)
Two-stage problems Each iteration adds new segments, as well as refining old ones.
Two-stage problems (figure: approximate value function vs. number of resources)
Two-stage problems It is important to maintain concavity:
Two-stage problems A concave value function has monotonically decreasing slopes. But updating the function with a stochastic gradient may violate this property.
A projection algorithm (SPAR)
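The details of the projection step are not recoverable from the transcript. The sketch below shows one reasonable reading: smooth a sampled slope into the current estimate, then restore concavity by projecting the slope vector onto the set of non-increasing sequences. The Euclidean projection is computed here with a pool-adjacent-violators pass, which is a choice made for this illustration and not necessarily the exact rule used on the slides. The names `project_nonincreasing` and `spar_update` are hypothetical.

```python
def project_nonincreasing(values):
    """Euclidean projection onto non-increasing sequences (pool adjacent violators)."""
    blocks = []  # each block is [sum of values, number of values]
    for v in values:
        blocks.append([v, 1])
        # Merge while the previous block's mean is below the latest block's mean,
        # since that ordering would violate the non-increasing requirement.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] < blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    result = []
    for total, count in blocks:
        result.extend([total / count] * count)
    return result


def spar_update(slopes, index, sampled_slope, stepsize):
    """Smooth one sampled slope into the estimate, then restore concavity."""
    updated = list(slopes)
    updated[index] = (1 - stepsize) * updated[index] + stepsize * sampled_slope
    return project_nonincreasing(updated)


if __name__ == "__main__":
    # The sampled slope at index 2 pushes that segment above its left neighbor;
    # the projection averages the two violating segments.
    print(spar_update([10.0, 8.0, 6.0, 4.0], index=2, sampled_slope=14.0, stepsize=0.5))
```

In the example, the raw update gives slopes 10, 8, 10, 4, which is not concave; the projection replaces the violating pair with its average, giving 10, 9, 9, 4.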
Two-stage problems Separability implies:
Two-stage problems We only sample the points in the second stage which were optimal given the second stage approximation. A stochastic allocation problem:
Two-stage problems Real problems are nonseparable.
Two-stage problems Exact solutions using Benders: “L-Shaped” decomposition (Van Slyke and Wets); Stochastic decomposition (Higle and Sen); CUPPS (Chen and Powell)
Two-stage problems (figure: profits vs. iterations for a point forecast, variations on Benders decomposition, and the SPAR algorithm)
Multiple, simple assets A dynamic network:
Multiple, simple assets Stepping through time:
Multistage problems (figure: resource state-type vs. time)
Backward pass (figure: resource state-type vs. time)
Multistage deterministic problems A pure (deterministic) network:
Multistage deterministic problems (figure: the mathematical optimum vs. approximate dynamic programming)
Multistage stochastic problems (figure: time evolution of the simulation from time 0 to T, with the planning horizon)
Multistage stochastic problems (figure: percent of posterior bound vs. planning horizon, for the posterior bound, the deterministic rolling horizon, and approximate DP)
A car distribution problem For railroads, customers call in orders the week before: (timeline: requirement becomes known, then requirement becomes actionable)
A car distribution problem Using value function approximations, we may reposition cars before orders become known: repositioning movements based on forecasts; assignments to booked orders.
A car distribution problem (figure: profits, total revenue, empty repositioning costs, and late service penalties vs. iterations)
A car distribution problem (figure: empty miles as a percent of total miles, for history, “optimized” without adaptive learning, and “optimized” with adaptive learning)
Outline Estimating the value of a single, complex asset
Single, complex asset A simple asset:
Single, complex asset A complex asset:
Single, complex asset (figure: PA, TX)
Single, complex asset The attributes of a driver:
Single, complex asset (figure: NE region, PA, TX)
Single, complex asset We can use a family of aggregation functions:
(figure: at time t, resource attributes for which we have values, and a new resource marked “?”)
An approximation strategy The value of a truck at different points in the region.
(figure: at time t+1, resource attributes for which we have values, and a new resource)
We can use different levels of aggregation to capture the value of an asset:
Hierarchical aggregation Alternative: »Use multiple levels of aggregation at the same time (equation labels: weight on gth level of aggregation; estimate at gth level of aggregation)
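The weighted combination across aggregation levels can be sketched as follows. The weighting rule is an assumption, since the slide's equation is missing: each level g is weighted in proportion to the inverse of its estimated mean squared error (variance plus squared bias), and the weights are normalized to sum to one. The function names and the numbers in the example are hypothetical.

```python
def inverse_mse_weights(variances, biases):
    """Weights proportional to 1 / (variance + bias^2), normalized to sum to 1.

    Assumption: variance + bias^2 is strictly positive at every level.
    """
    precisions = [1.0 / (var + bias ** 2) for var, bias in zip(variances, biases)]
    total = sum(precisions)
    return [p / total for p in precisions]


def combine_estimates(estimates, weights):
    """Weighted sum of the value estimates from each aggregation level."""
    return sum(w * v for w, v in zip(weights, estimates))


# Hypothetical two-level example: level 0 is disaggregate (noisy, unbiased),
# level 1 is aggregate (less noisy, but biased).
weights = inverse_mse_weights(variances=[4.0, 1.0], biases=[0.0, 1.0])
value = combine_estimates([100.0, 90.0], weights)

if __name__ == "__main__":
    print(weights, value)
```

Under this rule, a disaggregate level with few observations has high variance and gets little weight; as observations accumulate its variance falls and its weight grows, while the biased aggregate levels lose weight.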
Hierarchical aggregation (figure: f(x) vs. x, showing bias and noise, with regions of high bias, moderate bias, and zero bias)
Hierarchical weighting strategy (figure: weight on disaggregate level, Bayesian weights vs. optimal weights)
Hierarchical aggregation Optimal weights change as the algorithm progresses: (figure: weights vs. iterations by aggregation level, showing the weight on the most disaggregate level and the weight on the most aggregate levels)
Hierarchical aggregation (figure: aggregate, disaggregate, weighted combination)
Outline Managing multiple, complex assets
Yellow Freight System
Driver availability, colored by domicile, 24 hours from now
Fleet management Questions: »What if we have more team drivers? »What if we hire more drivers domiciled in Texas? »What is the value of moving more freight from Chicago to Denver?