Stochastic Network Optimization with Non-Convex Utilities and Costs Michael J. Neely University of Southern California

Presentation transcript:

Stochastic Network Optimization with Non-Convex Utilities and Costs
Michael J. Neely, University of Southern California
Information Theory and Applications Workshop (ITA), Feb
*Sponsored in part by the DARPA IT-MANET Program, NSF Career CCF , ARL
[Figure: K-queue network with arrivals $a_1(t), a_2(t), \ldots, a_K(t)$; plot of utility versus attribute x]

Problem Description: K-Queue Network with backlogs $(Q_1(t), \ldots, Q_K(t))$
Slotted time, $t \in \{0, 1, 2, \ldots\}$
$\omega(t)$ = “Random Network Event” (e.g., arrivals, channels, etc.)
$\alpha(t)$ = “Control Action” (e.g., power allocation, routing, etc.)
Decision: Observe $\omega(t)$ every slot. Choose $\alpha(t) \in \mathcal{A}_{\omega(t)}$.
The decision affects arrivals, service, and “Network Attributes”:
$a_k(t) = a_k(\omega(t), \alpha(t))$ = arrivals to queue k on slot t
$b_k(t) = b_k(\omega(t), \alpha(t))$ = service to queue k on slot t
$x_m(t) = x_m(\omega(t), \alpha(t))$ = Network Attribute m on slot t
(These are general functions, possibly non-convex and discontinuous.)
[Figure: queue k with arrivals $a_k(t)$ and service $b_k(t)$]

What are “Network Attributes”? $x(t) = (x_1(t), \ldots, x_M(t))$
Traditional:
- Packet Admissions / Throughput
- Power Expenditures
- Packet Drops
Emerging Attributes for Network Science:
- Quality of Information (QoI) Metrics
- Distortions
- Profit
- Real-Valued Meta-Data

Define Time Averages: $\overline{x} = (\overline{x}_1, \ldots, \overline{x}_M)$
Goal: Minimize: $f(\overline{x})$
Subject to:
1) $g_n(\overline{x}) \le 0$ for $n \in \{1, \ldots, N\}$
2) $\overline{x} \in \mathcal{X}$
3) All queues $Q_k(t)$ stable
Where:
$\mathcal{X}$ is an abstract convex set
$g_n(x)$ are convex functions
$f(x)$ is a possibly non-convex function!

Example Problem 1: Maximizing non-concave throughput-utility
$\overline{x} = (\overline{x}_1, \ldots, \overline{x}_M)$ = time average “throughput” attribute vector
$f(x)$ = Non-Concave Utility = $f_1(x_1) + f_2(x_2) + \ldots + f_M(x_M)$
Utility is only large when throughput exceeds a threshold.
Global optimality can be as hard as combinatorial bin-packing.
[Figure: utility $f_m(x)$ versus attribute x]

Example Problem 2: Risk-Aware Networking (Variance Minimization)
Let p(t) = “Network Profit” on slot t.
Define Attributes: $x_1(t) = p(t)$, $x_2(t) = p(t)^2$
Then: $\mathrm{Var}(p) = \mathbb{E}\{p^2\} - \mathbb{E}\{p\}^2 = \overline{x}_2 - (\overline{x}_1)^2$
Minimizing variance minimizes a non-convex function of a time average!
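As a small numerical check of the identity on this slide, the following Python sketch computes the variance of p(t) from the two time-average attributes and compares it with a direct computation. The profit sequence and the function name are made up for illustration and do not come from the talk.

```python
import statistics


def variance_from_attributes(profits):
    """Return Var(p) computed as x2_bar - x1_bar**2 from per-slot attributes."""
    n = len(profits)
    x1_bar = sum(profits) / n                 # time average of x1(t) = p(t)
    x2_bar = sum(p * p for p in profits) / n  # time average of x2(t) = p(t)^2
    return x2_bar - x1_bar ** 2


profits = [1.0, 3.0, 2.0, 6.0]  # hypothetical per-slot profits
var_attr = variance_from_attributes(profits)
var_direct = statistics.pvariance(profits)  # population variance, for comparison
print(var_attr, var_direct)
```

The objective $\overline{x}_2 - (\overline{x}_1)^2$ is concave in $\overline{x}_1$, which is why it cannot be treated as a convex cost of the time-average vector.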

Prior Work on Non-Stochastic (static) Non-Convex Network Optimization:
- Lee, Mazumdar, Shroff, TON 2005
- Chiang 2008
[Figure: utility $f_m(x)$ versus attribute x]

Prior Work on Stochastic, Convex Network Optimization:
Dual-Based:
- Neely 2003, 2005; Georgiadis, Neely, Tassiulas F&T 2006: Explicit optimality, performance, and convergence analysis via a “drift-plus-penalty” algorithm, with an [O(1/V), O(V)] performance-delay tradeoff
- Eryilmaz, Srikant 2005 (“fluid model,” infinite backlog)
Primal-Dual-Based:
- Agrawal, Subramanian 2002 (no queues, infinite backlog)
- Kushner, Whiting 2002 (no queues, infinite backlog)
- Stolyar 2005, 2006 (with queues, but “fluid model”): Proves optimality over a “fluid network.” Conjectures that the actual network utility approaches optimal when a parameter is scaled.

Summary:
1) Optimizing a time average of a non-convex function, $\overline{f(x)}$, is Easy! (Can find the global optimum: Georgiadis, Neely, Tassiulas F&T 2006.)
2) Optimizing a non-convex function of a time average, $f(\overline{x})$, is Hard! (CAN WE FIND A LOCAL OPTIMUM??)
Drift-Plus-Penalty with “Pure-Dual” Algorithm:
- Works great for convex problems
- Robust to changes, has explicit performance and convergence bounds
- BUT: For non-convex problems, it would find the global optimum of the time average $\overline{f(x)}$, which is not necessarily even a local optimum of $f(\overline{x})$.
Drift-Plus-Penalty with “Primal-Dual” Component:
- OUR NEW RESULT: Works well for non-convex problems! Can find a local optimum of $f(\overline{x})$!
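To make the distinction between the two objectives concrete, here is a minimal Python sketch. The cost function f and the two attribute sequences are hypothetical, chosen only so the numbers are easy to follow. Both sequences have the same time average, so f of the time average cannot tell them apart, while the time average of f can.

```python
def f(x):
    """Hypothetical non-convex cost (concave in x), used only for illustration."""
    return -x * x


def time_average(values):
    """Plain arithmetic mean over the observed slots."""
    return sum(values) / len(values)


policy_a = [0.0, 2.0] * 50  # attribute alternates between 0 and 2
policy_b = [1.0] * 100      # attribute is constant at 1

for name, xs in (("A", policy_a), ("B", policy_b)):
    avg_of_f = time_average([f(x) for x in xs])  # time average of f(x(t))
    f_of_avg = f(time_average(xs))               # f of the time average
    print(name, avg_of_f, f_of_avg)
```

An algorithm that minimizes the time average of f would prefer sequence A, even though both sequences give the same value of f at the time average.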

Solving the Problem via a Transformation:
Original Problem:
Min: $f(\overline{x})$
Subject to:
1) $g_n(\overline{x}) \le 0$, $n \in \{1, \ldots, N\}$
2) $\overline{x} \in \mathcal{X}$
3) All Queues Stable
Transformed Problem:
Min: $f(\overline{x})$
Subject to:
1) $\overline{g_n(\gamma)} \le 0$, $n \in \{1, \ldots, N\}$
2) $\overline{\gamma}_m = \overline{x}_m$, for all m
3) $\gamma(t) \in \mathcal{X}$, for all t
4) All Queues Stable
Auxiliary Variables: $\gamma(t) = (\gamma_1(t), \ldots, \gamma_M(t))$. These act as a proxy for $x(t) = (x_1(t), \ldots, x_M(t))$.
Constraints in the new problem are time averages of functions, not functions of time averages! And the problems are equivalent!

Solving the Problem via a Transformation: Next Step: Lyapunov Optimization (applied to the transformed problem):
- Define a virtual queue for each inequality and equality constraint.
- $\Theta(t)$ = vector of virtual and actual queues.
- Use a quadratic Lyapunov function, with drift $\Delta(t)$.
- Use Min Drift-Plus-Penalty…
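The slide does not give the virtual queue updates, so the Python sketch below assumes the forms commonly used in the drift-plus-penalty literature: an inequality queue updated as max(Z + g, 0), an equality queue updated as H + gamma_m - x_m, and a Lyapunov function equal to half the sum of squared queue values. All function names and the sample numbers are hypothetical.

```python
def update_inequality_queue(z, g_value):
    """Virtual queue for 'time average of g_n(gamma) <= 0' (assumed update form)."""
    return max(z + g_value, 0.0)


def update_equality_queue(h, gamma_m, x_m):
    """Virtual queue for 'avg gamma_m = avg x_m' (assumed form); may go negative."""
    return h + gamma_m - x_m


def lyapunov(queues):
    """Quadratic Lyapunov function: half the sum of squared queue values."""
    return 0.5 * sum(q * q for q in queues)


z, h = 0.0, 0.0
# Each tuple is one slot: (g_n(gamma(t)), gamma_m(t), x_m(t)), made-up values.
for g_value, gamma_m, x_m in [(1.0, 2.0, 1.0), (-3.0, 0.0, 1.5), (0.5, 1.0, 1.0)]:
    before = lyapunov([z, h])
    z = update_inequality_queue(z, g_value)
    h = update_equality_queue(h, gamma_m, x_m)
    print(z, h, lyapunov([z, h]) - before)  # last value: one-slot Lyapunov change
```

Under these assumed updates, keeping the virtual queues stable is what enforces the corresponding time-average constraints.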

Use a “Primal” Derivative in Drift-Plus-Penalty:
Every slot t, observe $\omega(t)$ and current queues $\Theta(t)$. Choose $\alpha(t) \in \mathcal{A}_{\omega(t)}$, $\gamma(t) \in \mathcal{X}$ to minimize:
\[ \Delta(t) + V \sum_{m} \frac{\partial f(\overline{x}(t))}{\partial x_m}\, x_m(\omega(t), \alpha(t)) \]
where $\overline{x}(t) = (\overline{x}_1(t), \ldots, \overline{x}_M(t))$ = empirical running time average up to time t (starting from time 0).
Doesn’t need knowledge of traffic or channel statistics!
Can “approx” minimize to within a constant C of the infimum.
Note: The “Pure Dual” algorithm minimizes $\Delta(t) + V f(\gamma(t))$. It does not need the running time average, is more robust to varying parameters, and provides stronger guarantees, but it only works for convex f() functions!
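The per-slot decision can be sketched in Python as follows. This is a simplified illustration, not the algorithm as stated in the talk: it assumes a finite action set, replaces the drift by the usual queue-weighted term sum_k Q_k (a_k - b_k), and leaves out the auxiliary variables and virtual queues. The function `choose_action` and the toy single-queue model are hypothetical.

```python
def choose_action(actions, omega, queues, grad, V, arrivals, service, attributes):
    """Pick the action minimizing sum_k Q_k*(a_k - b_k) + V * sum_m grad_m * x_m.

    grad holds the partial derivatives of f evaluated at the running time average.
    """
    def score(alpha):
        a = arrivals(omega, alpha)
        b = service(omega, alpha)
        x = attributes(omega, alpha)
        drift_term = sum(q * (ak - bk) for q, ak, bk in zip(queues, a, b))
        penalty_term = sum(g * xm for g, xm in zip(grad, x))
        return drift_term + V * penalty_term

    return min(actions, key=score)


# Toy model (all hypothetical): one queue, one attribute (power use).
def arrivals(omega, alpha):
    return [1.0]              # one packet arrives every slot


def service(omega, alpha):
    return [omega * alpha]    # channel rate omega is used only if alpha = 1


def attributes(omega, alpha):
    return [float(alpha)]     # one unit of power when transmitting


grad = [1.0]  # derivative of f(x) = x; a non-convex f would need f'(running average)

idle = choose_action([0, 1], 2.0, [0.0], grad, 1.0, arrivals, service, attributes)
busy = choose_action([0, 1], 2.0, [5.0], grad, 1.0, arrivals, service, attributes)
print(idle, busy)
```

With an empty queue the sketch stays idle to save power, and with a backlog of 5 it transmits, which is the queue-versus-penalty tradeoff that the parameter V controls.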

Theorem: Assuming the constraints are feasible, then for any parameter choice V ≥ 0, we have:
1. All required constraints are satisfied.
2. All queues are strongly stable with: E{Delay} ≤ O(V)
3. Assuming the attribute vector converges with prob. 1, the time average attribute vector $\overline{x}$ is a “Near-Local-Min”:
\[ \sum_{m} \frac{\partial f(\overline{x})}{\partial x_m}\,(x_m^* - \overline{x}_m) \ge -\frac{B + C}{V} \]
where $x^* = (x_1^*, \ldots, x_M^*)$ is any other feasible time average vector.

Proof Sketch: Very Simple Proof!
Because we take actions to minimize the drift-plus-penalty every slot (given current queue states) to within C, we have:
\[ \Delta(t) + V \sum_{m} \frac{\partial f(\overline{x}(t))}{\partial x_m}\, x_m(\omega(t), \alpha(t)) \le C + \Delta^*(t) + V \sum_{m} \frac{\partial f(\overline{x}(t))}{\partial x_m}\, x_m(\omega(t), \alpha^*(t)) \]
where $\Delta^*(t)$ and $\alpha^*(t)$ are the drift and decision under any other (possibly randomized) decision choices!
But for any feasible time average vector $x^*$, there are choices that make the drift zero (plus a constant B that is independent of queue state), so:
\[ \Delta(t) + V \sum_{m} \frac{\partial f(\overline{x}(t))}{\partial x_m}\, x_m(\omega(t), \alpha(t)) \le C + B + V \sum_{m} \frac{\partial f(\overline{x}(t))}{\partial x_m}\, x_m^* \]
The rest follows by (see [Georgiadis, Neely, Tassiulas, F&T 2006]):
- Iterated Expectations: E{E{X|Y}} = E{X}
- Telescoping Sums: [f(4) – f(3)] + [f(3) – f(2)] + [f(2) – f(1)] + [f(1) – f(0)] = f(4) – f(0)
- Rearranging Terms and Taking Limits
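One way to fill in the three listed steps is sketched below. It is a hedged outline rather than the proof from the cited reference. It assumes V > 0, that the drift is $\Delta(t) = \mathbb{E}\{L(t+1) - L(t) \mid \Theta(t)\}$ for a nonnegative Lyapunov function $L$ (the symbol $L$ is introduced here for illustration), that $\mathbb{E}\{L(0)\}$ is finite, that the attributes are bounded, that $f$ has continuous partial derivatives, and that the limit $\overline{x}$ can be treated as deterministic.

```latex
% Step 1 (iterated expectations + telescoping sums): take expectations of the
% final inequality, sum over t = 0, ..., T-1, and divide by VT.
\[
\frac{\mathbb{E}\{L(T)\} - \mathbb{E}\{L(0)\}}{VT}
+ \frac{1}{T}\sum_{t=0}^{T-1} \mathbb{E}\Big\{ \sum_{m}
    \frac{\partial f(\overline{x}(t))}{\partial x_m}\, x_m(t) \Big\}
\le \frac{B + C}{V}
+ \frac{1}{T}\sum_{t=0}^{T-1} \mathbb{E}\Big\{ \sum_{m}
    \frac{\partial f(\overline{x}(t))}{\partial x_m}\, x_m^* \Big\}
\]

% Step 2 (taking limits): drop E{L(T)} >= 0, let T -> infinity, and use
% \overline{x}(t) -> \overline{x} together with continuity of the partials.
\[
\sum_{m} \frac{\partial f(\overline{x})}{\partial x_m}\, \overline{x}_m
\le \frac{B + C}{V}
+ \sum_{m} \frac{\partial f(\overline{x})}{\partial x_m}\, x_m^*
\]

% Step 3 (rearranging terms): this is the "Near-Local-Min" condition.
\[
\sum_{m} \frac{\partial f(\overline{x})}{\partial x_m}\,
  (x_m^* - \overline{x}_m) \ge -\frac{B + C}{V}
\]
```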

Extension 1: Using a “Variable V(t)” algorithm with increasing V(t):
$V(t) = (1+t)^d$ (for 0 < d < 1) gives a true local min:
\[ \sum_{m} \frac{\partial f(\overline{x})}{\partial x_m}\,(x_m^* - \overline{x}_m) \ge 0 \]
where $x^* = (x_1^*, \ldots, x_M^*)$ is any other feasible time average vector.
All constraints are still satisfied with this Variable-V algorithm.
However, queues are only “mean rate stable” (input rate = output rate) and have infinite average congestion and delay!
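The short Python sketch below evaluates the schedule V(t) = (1+t)^d and, purely as a heuristic illustration, the fixed-V gap (B + C)/V from the theorem with V replaced by V(t). The constants B, C, and d are made-up values, and the printed gap is not a bound proved for the Variable-V algorithm. It only shows why a growing V suggests a vanishing gap.

```python
def variable_v(t, d):
    """V(t) = (1 + t)**d with 0 < d < 1, as on the slide."""
    if not 0.0 < d < 1.0:
        raise ValueError("d must satisfy 0 < d < 1")
    return (1.0 + t) ** d


B, C, d = 2.0, 1.0, 0.5  # hypothetical constants, for illustration only
for t in (0, 99, 9999):
    v = variable_v(t, d)
    print(t, v, (B + C) / v)  # fixed-V gap (B + C)/V evaluated at V = V(t)
```

Because V(t) grows without bound, the O(V) congestion bound from the fixed-V theorem no longer stays finite, which matches the slide's remark about infinite average congestion and delay.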

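Extension 2: A 3-phase algorithm in the special case when the utility function $\phi(x)$ is entrywise non-decreasing:
Phase 1: Pick directions $\{\theta_1, \ldots, \theta_N\}$. For each direction $\theta_n$, solve the convex stochastic net opt problem via the pure dual method:
Maximize: $\beta$
Subject to:
1) $\overline{x} = \beta \theta_n$
2) desired constraints
3) All queues stable
Phase 2: Solve (to a local optimum) the deterministic problem:
Max: $\phi(x_1, \ldots, x_M)$
S.t.: $(x_1, \ldots, x_M) \in \mathrm{Conv}\{\beta_1 \theta_1, \ldots, \beta_N \theta_N\}$
This gives an optimal $x^*$.
[Figure: unknown “Attribute Region” with the directions $\theta_n$ and the scaled points $\beta_n \theta_n$]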

Extension 2: A 3-phase algorithm in the special case when the utility function $\phi(x)$ is entrywise non-decreasing:
Phase 3: Solve the convex stochastic net opt problem via the pure dual method:
Maximize: $\beta$
Subject to:
1) $\overline{x} = \beta x^*$
2) desired constraints
3) All queues stable
This involves 1 purely deterministic non-convex phase (any static solver can be used) and 2 purely convex stochastic network optimizations!
[Figure: attribute region with the point $x^*$]

Conclusions:
- We have studied techniques for non-convex stochastic network optimization.
- First approach: “Primal-Dual” partial derivative information is used with the Drift-Plus-Penalty metric to achieve a local min. It requires a running time average, is not as robust to changes, and its convergence time is unclear.
- Second approach: uses 3 phases. The stochastic parts are purely convex, so we can use the pure-dual method to provide stronger performance guarantees.

Some Possible Questions:
1) Why do we use auxiliary variables?
- They allow treatment of the abstract set constraint.
- They allow the constraints of the problem to be transformed into constraints on time averages of functions, rather than functions of time averages. This enables explicit bounds on convergence times. It also ensures the constraint satisfaction is robust to system changes, even if the non-convex utility optimization is not.

Some Possible Questions:
2) How is the first method different from prior stochastic primal-dual methods?
- We use auxiliary variables.
- We treat the convex inequality constraints via a “pure-dual” method (no derivatives), which gives a stronger proof that all constraints are met, and within a known convergence time.
- We treat abstract set constraints.
- We treat the non-convex problem. (The lack of convergence time knowledge for the utility part is due to the “primal” component, but this is the price of treating non-convex problems!)
- We treat joint queue stability and utility optimization, with a proof that is even simpler than the fluid limit proof given for the special case of convex problems in Stolyar 05, 06.

Some Possible Questions:
3) Why do we consider the 3-phase algorithm?
- It uses 2 purely convex stochastic problems, so the stochastic parts have stronger and more explicit convergence time guarantees and do not require derivatives to exist.
- The 1 non-convex optimization is a purely deterministic problem, for which we can use any known deterministic solver (such as “brute force,” “Nelder-Mead,” or “Newton-type” methods that do not necessarily restrict to small step sizes).