Approximate Dynamic Programming for High-Dimensional Asset Allocation Ohio State April 16, 2004 Warren Powell CASTLE Laboratory Princeton University

Presentation transcript:

Approximate Dynamic Programming for High-Dimensional Asset Allocation Ohio State April 16, 2004 Warren Powell CASTLE Laboratory Princeton University © 2003 Warren B. Powell, Princeton University

© 2004 Warren B. Powell Slide 2 Yellow Freight System © 2002 Warren B. Powell, Princeton University

© 2004 Warren B. Powell Slide 3

© 2004 Warren B. Powell Slide 4

© 2004 Warren B. Powell Slide 5

© 2004 Warren B. Powell Slide 6 Air Mobility Command: Fuel, Cargo Handling, Ramp Space, Maintenance, Cargo Holding

© 2004 Warren B. Powell Slide 7 The optimization challenge

Special equipment

Schneider National

© 2004 Warren B. Powell Slide 16 Resources Tasks Time 0 The dynamic assignment problem

© 2004 Warren B. Powell Slide 17 Resources Tasks Time 0 Time 1 The dynamic assignment problem

© 2004 Warren B. Powell Slide 18 Resources Tasks Time 0 Time 1 Time 2 The dynamic assignment problem

© 2004 Warren B. Powell Slide 19 Multidimensional attribute spaces decision d

© 2004 Warren B. Powell Slide 20 Multidimensional attribute spaces decision d’

© 2004 Warren B. Powell Slide 21 Multidimensional attribute spaces ? ?

© 2004 Warren B. Powell Slide 22 State variables Modeling the fleet management problem: »State variables: »Control variables:

© 2004 Warren B. Powell Slide 23 State variables We can formulate the problem of determining what to do with our truck as a dynamic program:

© 2004 Warren B. Powell Slide 24 Hierarchical Aggregation Attribute vectors tend to become increasingly complex:

© 2004 Warren B. Powell Slide 25 State variables If we only have N=1 truck:

© 2004 Warren B. Powell Slide 35 State variables What if we have N > 1 trucks?
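
To give a feel for why moving from N=1 to N > 1 trucks is hard, here is a minimal sketch that counts states. It assumes (this is not stated on the slide) that the fleet state is a vector of truck counts, one entry per attribute value, so the number of states is the number of ways to distribute N identical trucks over the attribute values. The function name and the figure of 50 attribute values are hypothetical.

```python
from math import comb

def num_fleet_states(num_trucks: int, num_attributes: int) -> int:
    """Count nonnegative integer vectors of length num_attributes
    whose entries sum to num_trucks (stars-and-bars formula)."""
    return comb(num_trucks + num_attributes - 1, num_attributes - 1)

# Hypothetical setting: 50 possible attribute values (e.g. locations).
for n in (1, 10, 100):
    print(n, "trucks:", num_fleet_states(n, 50), "states")
```

With a single truck the state space is just the 50 attribute values; with many trucks the count grows combinatorially, which is one way to see the state-space curse.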

© 2004 Warren B. Powell Slide 36 Computational challenge A sequence of algorithmic challenges: »Step 1: Managing multiple, simple assets. »Step 2: Managing a single, complex asset. »Step 3: Managing multiple, complex assets.

Outline An algorithmic strategy for high-dimensional dynamic programs.

© 2004 Warren B. Powell Slide 38 Approximate dynamic programming Systems evolve through a cycle of exogenous and endogenous information Time

© 2004 Warren B. Powell Slide 39 Approximate dynamic programming Systems evolve through a cycle of exogenous and endogenous information Time

© 2004 Warren B. Powell Slide 40 Approximate dynamic programming Using this state variable, we obtain the optimality equations: Problem: Curse of dimensionality. Three curses: state space, outcome space, action space (feasible region).

© 2004 Warren B. Powell Slide 41 Approximate dynamic programming The computational challenge: How do we find ? How do we compute the expectation? How do we find the optimal solution?

© 2004 Warren B. Powell Slide 42 Approximate dynamic programming Approximation methodology: Can’t compute this!!! Don’t know what this is!

© 2004 Warren B. Powell Slide 43 Adaptive dynamic programming Alternative: Change the definition of the state variable: Time

© 2004 Warren B. Powell Slide 44 Adaptive dynamic programming Now our optimality equation looks like: We drop the expectation and solve the conditional problem: Finally, we substitute in our approximation:
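
The three steps on this slide (optimality equation around the new state variable, dropping the expectation, substituting an approximation) can be sketched on a toy problem. Everything below is an assumption made for illustration, not the model on the slides: a single stock of identical units, uniform demand, a lookup-table approximation `v_bar`, and a small amount of random exploration so the toy example visits more than one state.

```python
import random

def adp_post_decision(iterations=200, horizon=20, max_units=10, price=5.0,
                      cost=3.0, gamma=0.9, step=0.1, explore=0.2, seed=0):
    """Toy forward-pass ADP around a post-decision state (illustrative only)."""
    rng = random.Random(seed)
    # v_bar[r] approximates the value of holding r units AFTER the decision,
    # before the next demand is revealed (the post-decision state).
    v_bar = [0.0] * (max_units + 1)
    for _ in range(iterations):
        post = 0
        for _ in range(horizon):
            demand = rng.randint(0, 6)   # exogenous information arrives
            sold = min(post, demand)
            pre = post - sold            # pre-decision state
            # Deterministic decision problem: no expectation inside the max.
            options = range(max_units - pre + 1)
            best_x = max(options,
                         key=lambda x: -cost * x + gamma * v_bar[pre + x])
            best_val = -cost * best_x + gamma * v_bar[pre + best_x]
            # The sampled value updates the approximation at the PREVIOUS
            # post-decision state.
            v_hat = price * sold + best_val
            v_bar[post] = (1 - step) * v_bar[post] + step * v_hat
            # Follow the greedy decision, except for occasional exploration.
            x = rng.choice(list(options)) if rng.random() < explore else best_x
            post = pre + x
    return v_bar

print([round(v, 2) for v in adp_post_decision()])
```

The point of the sketch is the structure of the loop: the decision is chosen by a deterministic maximization that uses the approximation, and the expectation is handled implicitly by averaging sampled values over iterations.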

© 2004 Warren B. Powell Slide 45 Adaptive dynamic programming Approximating the value function: »We choose approximations of the form:

Outline Managing multiple, simple assets

© 2004 Warren B. Powell Slide 47

© 2004 Warren B. Powell Slide 48 Norfolk Southern

When a boxcar becomes empty, we have three options: Customers Regional depots General depots

Option 1: Send directly to customers

Option 2: Send to distribution areas

Option 1: Send directly to customers Option 2: Send to distribution areas Option 3: Send to general depots

© 2004 Warren B. Powell Slide 53 Forecasts of Car Demands: Forecast, Actual

© 2004 Warren B. Powell Slide 54 Multiple, simple assets

© 2004 Warren B. Powell Slide 55 Multiple, simple assets Our basic strategy: Separable approximation
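
One reason a separable approximation is attractive: if each location's value function is concave and piecewise linear, then allocating a fixed number of identical assets reduces to repeatedly taking the largest remaining marginal value. The sketch below assumes exactly that setting; the function name, the slope numbers, and the rule that assets may be left unassigned when no positive slope remains are all hypothetical.

```python
import heapq

def allocate(slopes, total):
    """Greedy allocation for a sum of separable concave functions.

    slopes[i][k] is the marginal value of the (k+1)-th unit at location i,
    assumed nonincreasing in k. Returns (units per location, total value).
    """
    alloc = [0] * len(slopes)
    # Max-heap (via negated values) of the next available slope per location.
    heap = [(-s[0], i) for i, s in enumerate(slopes) if s]
    heapq.heapify(heap)
    value = 0.0
    for _ in range(total):
        if not heap or -heap[0][0] <= 0:
            break  # no remaining unit adds value; leave assets unassigned
        neg_slope, i = heapq.heappop(heap)
        value -= neg_slope
        alloc[i] += 1
        if alloc[i] < len(slopes[i]):
            heapq.heappush(heap, (-slopes[i][alloc[i]], i))
    return alloc, value

# Hypothetical slopes for three locations, four assets to place.
print(allocate([[10, 6, 1], [8, 7, 2], [5, 4, 3]], 4))
```

The greedy rule relies on concavity: because slopes never increase within a location, the best next unit is always at the top of the heap.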

© 2004 Warren B. Powell Slide 56 Multiple, simple assets Two-stage resource allocation under uncertainty

© 2004 Warren B. Powell Slide 57 Multiple, simple assets

© 2004 Warren B. Powell Slide 58 Multiple, simple assets

© 2004 Warren B. Powell Slide 59 Multiple, simple assets

© 2004 Warren B. Powell Slide 60 Multiple, simple assets

© 2004 Warren B. Powell Slide 61 Multiple, simple assets We estimate the functions by sampling from our distributions. Marginal value:
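
A minimal sketch of estimating marginal values by sampling, under assumptions that are not on the slide: a single location, a reward of `price` per unit of demand served, and uniformly distributed integer demand. Under these assumptions, the sampled marginal value of one more unit at level r is `price` if demand exceeds r and zero otherwise.

```python
import random

def estimate_slopes(max_units=5, price=10.0, samples=2000, seed=1):
    """Estimate the marginal value of each additional unit by sampling demand."""
    rng = random.Random(seed)
    slopes = [0.0] * max_units
    for n in range(1, samples + 1):
        demand = rng.randint(0, max_units)   # hypothetical demand distribution
        for r in range(max_units):
            # Value of unit r+1 for this sample: it is used only if demand > r.
            sample_slope = price if demand > r else 0.0
            # Running average (step size 1/n) of the sampled marginal values.
            slopes[r] += (sample_slope - slopes[r]) / n
    return slopes

print([round(s, 2) for s in estimate_slopes()])
```

Each estimated slope approaches `price` times the probability that demand exceeds r, so the estimates decrease as r grows, which is the concave shape the later slides rely on.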

© 2004 Warren B. Powell Slide 62 Two-stage problems The time t subproblem: t (i-1,t+3) (i,t+1) (i+1,t+5)

© 2004 Warren B. Powell Slide 63 Two-stage problems Left and right gradients are found by solving flow augmenting path problems. Gradients: The right derivative (the value of one more unit of that resource) is the value of a flow augmenting path from that node to the supersink. [Figure: network with nodes t, i, (i-1,t+3)]

© 2004 Warren B. Powell Slide 64 Two-stage problems Left and right derivatives are used to build up a nonlinear approximation of the subproblem.

© 2004 Warren B. Powell Slide 65 Two-stage problems Left and right derivatives are used to build up a nonlinear approximation of the subproblem. Right derivative, left derivative

© 2004 Warren B. Powell Slide 66 Two-stage problems Each iteration adds new segments, as well as refining old ones.

© 2004 Warren B. Powell Slide 67 Two-stage problems Number of resources Approximate value function

© 2004 Warren B. Powell Slide 68 Two-stage problems It is important to maintain concavity:

© 2004 Warren B. Powell Slide 69 Two-stage problems A concave value function has monotonically decreasing slopes. But updating the function with a stochastic gradient may violate this property.

© 2004 Warren B. Powell Slide 70 A projection algorithm (SPAR)

© 2004 Warren B. Powell Slide 71 A projection algorithm (SPAR)

© 2004 Warren B. Powell Slide 72 A projection algorithm (SPAR)

© 2004 Warren B. Powell Slide 73 A projection algorithm (SPAR)

© 2004 Warren B. Powell Slide 74 A projection algorithm (SPAR)
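
The slides show the projection graphically, so the details are not recoverable from this transcript. As a rough illustration of the idea only, the sketch below smooths one slope toward a sampled value and then applies a simple leveling correction to restore nonincreasing slopes. This leveling step is one way to restore monotonicity and may differ from the projection actually used in SPAR; the function name and numbers are hypothetical.

```python
def update_slopes(slopes, k, sample_slope, step):
    """Smooth slope k toward a sampled slope, then restore nonincreasing order.

    Assumes the input slopes are already nonincreasing.
    """
    new = list(slopes)
    new[k] = (1 - step) * new[k] + step * sample_slope
    # Slopes to the left of k must be at least as large as the updated slope.
    for i in range(k):
        if new[i] < new[k]:
            new[i] = new[k]
    # Slopes to the right of k must be no larger than the updated slope.
    for i in range(k + 1, len(new)):
        if new[i] > new[k]:
            new[i] = new[k]
    return new

# A large sampled slope at index 2 would break concavity without the correction.
print(update_slopes([10, 8, 5, 2], k=2, sample_slope=15, step=0.5))
```

In this example the smoothed slope at index 2 rises to 10, above its left neighbor of 8, so the neighbor is raised to match and the sequence stays nonincreasing.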

© 2004 Warren B. Powell Slide 75 Two-stage problems Separability implies:

© 2004 Warren B. Powell Slide 76 Two-stage problems We only sample the points in the second stage which were optimal given the second stage approximation. A stochastic allocation problem:

© 2004 Warren B. Powell Slide 77 Two-stage problems

© 2004 Warren B. Powell Slide 78 Two-stage problems Real problems are nonseparable.

© 2004 Warren B. Powell Slide 79 Two-stage problems Exact solutions using Benders: “L-Shaped” decomposition (Van Slyke and Wets) Stochastic decomposition (Higle and Sen) CUPPS (Chen and Powell)

© 2004 Warren B. Powell Slide 80 Two-stage problems Point forecast Profits Iterations

© 2004 Warren B. Powell Slide 81 Two-stage problems Variations on Benders decomposition Point forecast Profits Iterations

© 2004 Warren B. Powell Slide 82 Two-stage problems Variations on Benders decomposition SPAR algorithm Point forecast Profits Iterations

© 2004 Warren B. Powell Slide 83 A dynamic network: Multiple, simple assets t

© 2004 Warren B. Powell Slide 84 Multiple, simple assets Stepping through time:

© 2004 Warren B. Powell Slide 85 Multiple, simple assets Stepping through time:

© 2004 Warren B. Powell Slide 86

© 2004 Warren B. Powell Slide 87

© 2004 Warren B. Powell Slide 88

© 2004 Warren B. Powell Slide 89

© 2004 Warren B. Powell Slide 90 Resource State-Type Time Multistage problems

© 2004 Warren B. Powell Slide 91 Time Resource State-Type Multistage problems

© 2004 Warren B. Powell Slide 92 Time Resource State-Type Multistage problems

© 2004 Warren B. Powell Slide 93 Resource State-Type Time Multistage problems

© 2004 Warren B. Powell Slide 94 Time Resource State-Type Multistage problems

© 2004 Warren B. Powell Slide 95 Time Resource State-Type Multistage problems

© 2004 Warren B. Powell Slide 96 Backward pass

© 2004 Warren B. Powell Slide 97 Time Resource State-Type Backward pass

© 2004 Warren B. Powell Slide 98 Time Resource State-Type Backward pass

© 2004 Warren B. Powell Slide 99 Time Resource State-Type Backward pass

© 2004 Warren B. Powell Slide 100 Time Resource State-Type Backward pass

© 2004 Warren B. Powell Slide 101 A pure (deterministic) network: Multistage deterministic problems

© 2004 Warren B. Powell Slide 102 The mathematical optimum Approximate dynamic programming Multistage deterministic problems

© 2004 Warren B. Powell Slide 103 Multistage stochastic problems [Figure: time evolution of the simulation, with a planning horizon rolling forward from each time period]

© 2004 Warren B. Powell Slide 104 Multistage stochastic problems Planning horizon Percent of posterior bound Deterministic, rolling horizon Posterior bound

© 2004 Warren B. Powell Slide 105 Multistage stochastic problems Planning horizon Percent of posterior bound Posterior bound Using approximate DP Deterministic, rolling horizon

© 2004 Warren B. Powell Slide 106 A car distribution problem For railroads, customers call in orders the week before: Requirement becomes known, Requirement becomes actionable Time

© 2004 Warren B. Powell Slide 107

© 2004 Warren B. Powell Slide 108 A car distribution problem Repositioning movements based on forecasts Assignments to booked orders. Using value function approximations, we may reposition cars before orders become known:

© 2004 Warren B. Powell Slide 109 A car distribution problem Profits Total revenue Empty repositioning costs Late service penalties Iterations

© 2004 Warren B. Powell Slide 110 A car distribution problem Empty miles as a percent of total miles History “Optimized” without adaptive learning

© 2004 Warren B. Powell Slide 111 A car distribution problem Empty miles as a percent of total miles History “Optimized” without adaptive learning “Optimized” with adaptive learning

Outline Estimating the value of a single, complex asset

© 2004 Warren B. Powell Slide 113 Single, complex asset A simple asset:

© 2004 Warren B. Powell Slide 114 Single, complex asset A complex asset:

© 2004 Warren B. Powell Slide 115 PA TX Single, complex asset

© 2004 Warren B. Powell Slide 116 Single, complex asset The attributes of a driver:

© 2004 Warren B. Powell Slide 117 NE region PA TX Single, complex asset

© 2004 Warren B. Powell Slide 118 Single, complex asset We can use a family of aggregation functions:
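
A family of aggregation functions can be pictured as a list of maps from the full attribute vector to successively coarser keys. The sketch below is purely illustrative: the attribute layout (location, domicile, hours on duty), the state-to-region table, and the function names are hypothetical, not taken from the slides.

```python
# Hypothetical state -> region lookup used by the coarsest aggregation level.
REGION = {"PA": "NE", "NJ": "NE", "TX": "SW"}

def g0(a):
    """Level 0: the full attribute vector (location, domicile, hours)."""
    return a

def g1(a):
    """Level 1: ignore hours, keep location and domicile."""
    return (a[0], a[1])

def g2(a):
    """Level 2: keep only the region of the current location."""
    return (REGION[a[0]],)

AGGREGATIONS = [g0, g1, g2]

driver_a = ("PA", "TX", 8)
driver_b = ("NJ", "TX", 3)
for level, g in enumerate(AGGREGATIONS):
    print(level, g(driver_a), g(driver_b))
```

The two drivers differ at levels 0 and 1 but map to the same key at level 2, so an observation of one driver's value can inform the estimate for the other at the regional level.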

Resource attributes for which we have values New resource Time t ?

© 2004 Warren B. Powell Slide 120 An approximation strategy The value of a truck at different points in the region.

Resource attributes for which we have values New resource Time t+1

© 2004 Warren B. Powell Slide 126 An approximation strategy

© 2004 Warren B. Powell Slide 127 An approximation strategy

© 2004 Warren B. Powell Slide 128 An approximation strategy

© 2004 Warren B. Powell Slide 129 An approximation strategy

© 2004 Warren B. Powell Slide 130 An approximation strategy

© 2004 Warren B. Powell Slide 134 An approximation strategy

We can use different levels of aggregation to capture the value of an asset:

© 2004 Warren B. Powell Slide 136 Hierarchical aggregation Alternative: »Use multiple levels of aggregation at the same time Estimate at gth level of aggregation Weight on gth level of aggregation
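
The weighted combination can be sketched as follows. The slide's weighting formula is not recoverable from this transcript, so the sketch assumes one common choice: each level's weight is inversely proportional to its estimated variance plus squared bias, normalized to sum to one. The function name and all numbers are hypothetical.

```python
def combine(estimates, variances, biases):
    """Combine per-level value estimates with normalized inverse-error weights.

    Assumed weighting (not confirmed by the slides):
    weight_g is proportional to 1 / (variance_g + bias_g ** 2).
    """
    raw = [1.0 / (var + bias ** 2) for var, bias in zip(variances, biases)]
    total = sum(raw)
    weights = [r / total for r in raw]
    value = sum(w * v for w, v in zip(weights, estimates))
    return value, weights

# Hypothetical inputs: a noisy unbiased disaggregate estimate and a
# low-noise but biased aggregate estimate.
value, weights = combine(estimates=[100.0, 120.0],
                         variances=[4.0, 1.0],
                         biases=[0.0, 3.0])
print(round(value, 3), [round(w, 3) for w in weights])
```

Under this rule the disaggregate level gains weight as more observations reduce its variance, which is consistent with the later slide noting that the weights change as the algorithm progresses.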

© 2004 Warren B. Powell Slide 137 x f(x) Hierarchical aggregation

© 2004 Warren B. Powell Slide 138 x f(x) Hierarchical aggregation

© 2004 Warren B. Powell Slide 139 x f(x) Bias Noise Hierarchical aggregation

© 2004 Warren B. Powell Slide 140 x f(x) High bias Moderate bias Zero bias Hierarchical aggregation

© 2004 Warren B. Powell Slide 141 Bayesian weights Optimal weights Weight on disaggregate level Hierarchical weighting strategy

© 2004 Warren B. Powell Slide 142 Hierarchical aggregation Optimal weights change as the algorithm progresses: [Plot: weights vs. iterations by aggregation level, showing the weight on the most disaggregate level and the weight on the most aggregate levels]

© 2004 Warren B. Powell Slide 143 Hierarchical aggregation Aggregate Disaggregate Weighted Combination

Outline Managing multiple, complex assets

© 2004 Warren B. Powell Slide 145 Yellow Freight System © 2002 Warren B. Powell, Princeton University

© 2004 Warren B. Powell Slide 146 Driver availability, colored by domicile, 24 hours from now

© 2004 Warren B. Powell Slide 147

© 2004 Warren B. Powell Slide 148

© 2004 Warren B. Powell Slide 149

© 2004 Warren B. Powell Slide 151

© 2004 Warren B. Powell Slide 152 Fleet management Questions: »What if we have more team drivers? »What if we hire more drivers domiciled in Texas? »What is the value of moving more freight from Chicago to Denver?

© 2004 Warren B. Powell Slide 153