Stochastic Optimization for Markov Modulated Networks with Application to Delay Constrained Wireless Scheduling Michael J. Neely University of Southern.

Presentation transcript:

Stochastic Optimization for Markov Modulated Networks with Application to Delay Constrained Wireless Scheduling
Michael J. Neely, University of Southern California
Proc. 48th IEEE Conf. on Decision and Control (CDC), Dec
*Sponsored in part by the DARPA IT-MANET Program, NSF OCE , NSF Career CCF
[Figure: arrival processes $A_1(t), A_2(t), \ldots, A_L(t)$ and a Markov chain (State 1, State 2, State 3) with control-dependent transition probabilities]

Motivating Problem: Delay Constrained Opportunistic Scheduling
[Figure: K queues with arrivals $A_1(t), A_2(t), \ldots, A_K(t)$ and channel states $S_1(t), S_2(t), \ldots, S_K(t)$]
Status Quo:
Lyapunov Based Max-Weight: [Georgiadis, Neely, Tassiulas F&T 2006]
  Treats stability/energy/throughput-utility with low complexity
  Cannot treat average delay constraints
Dynamic Programming / Markov Decision (MDP) Theory:
  Curse of dimensionality
  Need to know traffic/channel probabilities

Insights for Our New Approach: Combine Lyapunov/Max-Weight Theory with Renewals/MDP
Consider a "small" number of control-driven Markov states.
Example:
  K queues with average delay constraints (K "small")
  N queues with stability constraints (N arbitrarily large)
Tools combined:
  Lyapunov functions, max-weight theory, virtual queues
  Renewal theory, stochastic shortest paths, MDP theory
[Figure: queues with arrivals $A_1(t), \ldots, A_K(t)$ and channels $S_1(t), \ldots, S_K(t)$ are delay constrained; queues with arrivals $A_{K+1}(t), \ldots, A_M(t)$ and channels $S_{K+1}(t), \ldots, S_M(t)$ are not delay constrained]

Key Results:
Unify Lyapunov/Max-Weight Theory with Renewals/MDP: "Max Weight (MW)" and "Weighted Stochastic Shortest Path (WSSP)"
Treat general Markov decision networks
Use Lyapunov analysis and virtual queues to optimize and compute performance bounds
Use existing SSP approximation algorithms (Robbins-Monro) to implement
For the example delay problem:
  Meet all K average delay constraints, stabilize all N other queues
  Utility close to optimal, with a tradeoff in the delay of the N other queues
  All delays and convergence times are polynomial in (N+K)
  Per-slot complexity is geometric in K

General Problem Formulation (slotted time $t \in \{0, 1, 2, \ldots\}$):
  $Q_n(t)$ = collection of N queues to be stabilized
  $S(t)$ = random event (e.g. random traffic, channels)
  $Z(t)$ = Markov state variable ($|Z|$ states)
  $I(t)$ = control action (e.g. service, resource allocation)
  $x_m(t)$ = additional penalties incurred by the action on slot t
[Figure: queue $Q_n(t)$ with arrivals $R_n(t)$ and service $\mu_n(t)$]
General functions for $\mu(t)$, $R(t)$, $x(t)$:
  $\mu_n(t) = \mu_n(I(t), S(t), Z(t))$
  $R_n(t) = R_n(I(t), S(t), Z(t))$
  $x_m(t) = x_m(I(t), S(t), Z(t))$
Control-dependent transition probabilities: the transition from $Z(t)$ to $Z(t+1)$ depends on $I(t)$ and $S(t)$.
[Figure: Markov chain with State 1, State 2, State 3]
Goal:
  Minimize: $\overline{x}_0$
  Subject to: $\overline{x}_m \le x_m^{av}$ for all m
              $Q_n$ stable for all n

Applications of this Formulation:
For K of the queues, let $Z(t) = (Q_1(t), \ldots, Q_K(t))$.
These K queues have finite buffers: $Q_k(t) \in \{0, 1, \ldots, B_{max}\}$.
Cardinality of the state space: $|Z| = (B_{max}+1)^K$.
Recall that penalties have the form $x_m(t) = x_m(I(t), S(t), Z(t))$.
1) Penalty for Congestion:
  Define penalty: $x_k(t) = Z_k(t)$
  Can then do one of the following (for example):
    Minimize: $\overline{x}_k$
    Minimize: $\overline{x}_1 + \ldots + \overline{x}_K$
    Constraints: $\overline{x}_k \le x_k^{av}$
2) Penalty for Packet Drops:
  Define penalty: $x_k(t) = Drops_k(t)$
  Can then do one of the following (for example):
    Minimize: $\overline{x}_k$
    Minimize: $\overline{x}_1 + \ldots + \overline{x}_K$
    Constraints: $\overline{x}_k \le x_k^{av}$
3) A Nice Trick for Average Delay Constraints:
  Suppose we want $\overline{W}_k \le 5$ slots.
  Define penalty: $x_k(t) = Q_k(t) - 5 \times Arrivals_k(t)$
  Then by Little's Theorem, with $\lambda_k$ the time average arrival rate:
    $\overline{x}_k \le 0$
    equivalent to: $\overline{Q}_k - 5\lambda_k \le 0$
    equivalent to: $\overline{W}_k \lambda_k - 5\lambda_k \le 0$
    equivalent to: $\overline{W}_k \le 5$
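The Little's Theorem trick for average delay constraints can be checked numerically. The following is a minimal sketch, assuming a hypothetical single FIFO queue with Bernoulli arrivals and Bernoulli service (the slides do not specify a traffic model); `simulate`, `arrival_prob`, `service_prob`, and `target` are illustrative names. Q(t) is sampled after arrivals and before service, and a packet's delay counts its arrival slot, so the sum of Q(t) equals the sum of packet delays once the queue has drained.

```python
import random
from collections import deque


def simulate(arrival_prob, service_prob, slots, target=5, seed=0):
    """Toy single-queue simulation (hypothetical model, not from the slides).

    Returns (q_bar, lam, w_bar, x_bar), where the penalty on each slot is
    x(t) = Q(t) - target * Arrivals(t).
    """
    rng = random.Random(seed)
    fifo = deque()   # arrival slot of each queued packet
    sum_q = 0        # running sum of Q(t)
    sum_w = 0        # running sum of packet delays
    arrivals = 0
    t = 0
    # Generate arrivals for `slots` slots, then keep serving until empty.
    while t < slots or fifo:
        if t < slots and rng.random() < arrival_prob:
            fifo.append(t)
            arrivals += 1
        sum_q += len(fifo)  # Q(t): after arrivals, before service
        if fifo and rng.random() < service_prob:
            sum_w += t - fifo.popleft() + 1  # delay includes the arrival slot
        t += 1
    total = t
    q_bar = sum_q / total                         # time average backlog
    lam = arrivals / total                        # time average arrival rate
    w_bar = sum_w / arrivals                      # average delay per packet
    x_bar = (sum_q - target * arrivals) / total   # time average penalty
    return q_bar, lam, w_bar, x_bar


if __name__ == "__main__":
    q_bar, lam, w_bar, x_bar = simulate(0.3, 0.5, slots=20000)
    print(f"Q_bar={q_bar:.4f}  lambda*W_bar={lam * w_bar:.4f}  x_bar={x_bar:.4f}")
    print("x_bar <= 0:", x_bar <= 0, "| W_bar <= 5:", w_bar <= 5)
```

In this toy model the sign of the time average penalty and the delay test always agree, which is the equivalence the trick relies on.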

Solution to the General Problem:
  Minimize: $\overline{x}_0$
  Subject to: $\overline{x}_m \le x_m^{av}$ for all m
              $Q_k$ stable for all k
Define Virtual Queues for Each Penalty Constraint:
[Figure: virtual queue $Y_m(t)$ with arrivals $x_m(t)$ and service $x_m^{av}$]
Define Lyapunov Function: $L(t) = \sum_k Q_k(t)^2 + \sum_m Y_m(t)^2$
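A virtual queue for a penalty constraint can be sketched as follows. The update rule Y(t+1) = max(Y(t) + x(t) - x_av, 0) is an assumption here: it is the usual form in Lyapunov optimization and matches the figure (arrivals x_m(t), service x_m^av), but the slide does not spell it out. The function names are hypothetical.

```python
def virtual_queue_step(y, x, x_av):
    """One slot of the assumed update Y(t+1) = max(Y(t) + x(t) - x_av, 0)."""
    return max(y + x - x_av, 0.0)


def lyapunov(queues, virtual_queues):
    """L(t) = sum_k Q_k(t)^2 + sum_m Y_m(t)^2, as defined on the slide."""
    return sum(q * q for q in queues) + sum(y * y for y in virtual_queues)


def run_virtual_queue(penalties, x_av):
    """Feed a penalty sequence through the virtual queue, starting at Y(0) = 0."""
    y = 0.0
    for x in penalties:
        y = virtual_queue_step(y, x, x_av)
    return y


if __name__ == "__main__":
    penalties = [3.0, 0.0, 1.0, 4.0, 0.0, 0.0, 2.0, 1.0]
    x_av = 1.5
    horizon = len(penalties)
    y_final = run_virtual_queue(penalties, x_av)
    avg = sum(penalties) / horizon
    # Because Y(t+1) >= Y(t) + x(t) - x_av, the time average penalty is at
    # most x_av + Y(T)/T when Y(0) = 0.
    print("average penalty:", avg, "bound:", x_av + y_final / horizon)
```

Under this update, keeping the virtual queue stable (Y(T)/T going to zero) forces the time average penalty to meet its constraint, which is why the constraints can be handled through the same Lyapunov function as the actual queues.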

Solution to the General Problem:
Define Forced Renewals: every slot, i.i.d. with probability $\delta$.
[Figure: Markov chain with State 1, State 2, State 3 and Renewal State 0]
Example for the K Delay-Constrained Queue Problem: every slot, with probability $\delta$, drop all packets in all K delay-constrained queues (loss rate $\le B_{max}\delta$).
Renewals "reset" the system.
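The effect of forced renewals on the renewal period length can be illustrated with a short simulation. This is a sketch under the assumption that forced renewals are the only renewals, so the period length is geometric with mean 1/delta, where delta stands for the forced renewal probability; `renewal_durations` and `drop_rate_bound` are hypothetical helper names.

```python
import random


def renewal_durations(delta, num_periods, seed=0):
    """Sample renewal period lengths when each slot renews with probability delta."""
    rng = random.Random(seed)
    durations = []
    for _ in range(num_periods):
        t = 1
        while rng.random() >= delta:  # no renewal on this slot, keep going
            t += 1
        durations.append(t)
    return durations


def drop_rate_bound(b_max, delta):
    """Per-queue bound on packets dropped per slot due to forced renewals.

    Each forced renewal drops at most b_max packets from a queue, and
    renewals occur on a fraction delta of the slots.
    """
    return b_max * delta


if __name__ == "__main__":
    delta = 0.1
    samples = renewal_durations(delta, num_periods=20000)
    print("empirical mean period:", sum(samples) / len(samples))
    print("expected mean period :", 1.0 / delta)
    print("drop rate bound      :", drop_rate_bound(b_max=10, delta=0.01))
```

A smaller delta lowers the forced loss rate but lengthens the renewal period over which each control decision must be planned.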

Solution to the General Problem:
Define the Variable Slot Lyapunov Drift over a Renewal Period:
  $\Delta_T(Q(t), Y(t)) = E\{L(t+T) - L(t) \mid Q(t), Y(t)\}$
  where T = random renewal period duration (from slot t to slot t+T).
Control Rule: at every renewal time t, observe the queues and take actions to minimize the following over one renewal period:
  Minimize: $\Delta_T(Q(t), Y(t)) + V E\{\sum_{\tau=t}^{t+T-1} x_0(\tau) \mid Q(t), Y(t)\}$
*Generalizes our previous max-weight rule from [F&T 2006]!

Minimize: $\Delta_T(Q(t), Y(t)) + V E\{\sum_{\tau=t}^{t+T-1} x_0(\tau) \mid Q(t), Y(t)\}$
From Max-Weight (MW) to Weighted Stochastic Shortest Path (WSSP).
Suppose we implement a $(C, \epsilon)$-approximate SSP, so that every renewal period we have:
  Achieved Cost $\le$ Optimal SSP $+ C + \epsilon[\sum_k Q_k + \sum_m Y_m + V]$
Can achieve this using approximate DP theory, neuro-dynamic programming, etc. (see [Bertsekas, Tsitsiklis, Neuro-Dynamic Programming]), together with a Delayed-Queue-Analysis.

Theorem: If there exists a policy that meets all constraints with "$\epsilon_{max}$ slackness," then any $(C, \epsilon)$-approximate SSP implementation yields:
1) All (virtual and actual) queues are stable, and:
   $E\{Q_{sum}\} \le \frac{(B/\delta + C\delta) + V(\epsilon + x_{max})}{\epsilon_{max} - \epsilon}$
2) All time average constraints are satisfied ($\overline{x}_m \le x_m^{av}$)
3) Time average cost satisfies:
   $\overline{x}_0 \le x_0^{optimal} + \frac{B/\delta + C\delta}{V} + \epsilon(1 + x_{max}/\epsilon_{max})$
(Recall that $\delta$ = forced renewal probability.)
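The weighted stochastic shortest path problem solved over each renewal period can be sketched with plain value iteration. This is an illustration under simplifying assumptions that are not taken from the slides: the random event S(t) is already averaged into the costs and transition probabilities, a forced renewal ends the period after each slot with probability delta (so the remaining cost is discounted by 1 - delta), and the one-slot weight is V*x0 - sum_k Q_k*(mu_k - R_k) + sum_m Y_m*x_m with the queue values frozen at the renewal time. `slot_weight` and `weighted_ssp` are hypothetical names.

```python
def slot_weight(V, x0, Q, mu, R, Y, x):
    """Weighted one-slot cost: V*x0 - sum_k Q_k*(mu_k - R_k) + sum_m Y_m*x_m."""
    service = sum(q * (m - r) for q, m, r in zip(Q, mu, R))
    penalty = sum(y * xm for y, xm in zip(Y, x))
    return V * x0 - service + penalty


def weighted_ssp(costs, trans, delta, iters=200):
    """Value iteration for the expected weighted cost until the next renewal.

    costs[z][a] : weighted one-slot cost of action a in Markov state z
    trans[z][a] : list of next-state probabilities when no renewal occurs
    delta       : forced renewal probability (assumed to end the period)
    """
    n = len(costs)

    def backup(J, z, a):
        future = sum(p * J[z2] for z2, p in enumerate(trans[z][a]))
        return costs[z][a] + (1.0 - delta) * future

    J = [0.0] * n
    for _ in range(iters):
        J = [min(backup(J, z, a) for a in range(len(costs[z])))
             for z in range(n)]
    policy = [min(range(len(costs[z])), key=lambda a: backup(J, z, a))
              for z in range(n)]
    return J, policy


if __name__ == "__main__":
    # Two Markov states; state 0 has two actions, state 1 has one.
    costs = [[1.0, 0.0], [2.0]]
    trans = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0]]]
    J, policy = weighted_ssp(costs, trans, delta=0.5)
    print("cost-to-renewal:", J, "policy:", policy)
```

Because the forced renewals act like a discount factor in this sketch, the iteration is a contraction and converges from any starting point. Its cost grows with the number of Markov states, which is why the slides want the number of delay-constrained queues K to be small.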

Proof Sketch (consider exact SSP for simplicity):
  $\Delta_T(Q(t), Y(t)) + V E\{\sum_{\tau=t}^{t+T-1} x_0(\tau) \mid Q(t), Y(t)\}$
  $\le B + V E\{\sum_{\tau=t}^{t+T-1} x_0(\tau) \mid Q(t), Y(t)\}$
  $\quad - \sum_k Q_k(t) E\{\sum_{\tau=t}^{t+T-1} [\mu_k(\tau) - R_k(\tau)] \mid Q(t), Y(t)\}$
  $\quad - \sum_m Y_m(t) E\{\sum_{\tau=t}^{t+T-1} [x_m^{av} - x_m(\tau)] \mid Q(t), Y(t)\}$
[We take control actions to minimize the right-hand side above over the renewal period. This is the Weighted SSP problem of interest.]
[We can thus plug in any alternative control policy on the right-hand side, including the one that yields the optimum time average subject to all time average constraints. Marking the quantities of that policy with *:]
  $\Delta_T(Q(t), Y(t)) + V E\{\sum_{\tau=t}^{t+T-1} x_0(\tau) \mid Q(t), Y(t)\}$
  $\le B + V E\{\sum_{\tau=t}^{t+T-1} x_0^*(\tau) \mid Q(t), Y(t)\}$
  $\quad - \sum_k Q_k(t) E\{\sum_{\tau=t}^{t+T-1} [\mu_k^*(\tau) - R_k^*(\tau)] \mid Q(t), Y(t)\}$
  $\quad - \sum_m Y_m(t) E\{\sum_{\tau=t}^{t+T-1} [x_m^{av} - x_m^*(\tau)] \mid Q(t), Y(t)\}$
[Note: by RENEWAL THEORY, the infinite horizon time average is exactly achieved over any renewal period, so the $x_0^*$ term equals $x_0^{optimum} E\{T\}$. The * policy also meets every stability and penalty constraint, so the last two expectations are nonnegative and those terms can be dropped.]
  $\Delta_T(Q(t), Y(t)) + V E\{\sum_{\tau=t}^{t+T-1} x_0(\tau) \mid Q(t), Y(t)\} \le B + V x_0^{optimum} E\{T\}$
[Sum the resulting telescoping series to get the utility performance bound!]
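The telescoping step can be written out as follows. This is a sketch under stated assumptions: $t_0 < t_1 < \ldots$ denote successive renewal times, every renewal period has the same mean duration $E\{T\}$, and $B$ is the per-renewal-period constant from the proof sketch (how it relates to the constant in the theorem is not shown on the slides).

```latex
% Final inequality of the proof sketch, taken in expectation at renewal time t_r:
\[
E\{L(t_{r+1}) - L(t_r)\}
  + V\, E\Big\{\sum_{\tau=t_r}^{t_{r+1}-1} x_0(\tau)\Big\}
  \le B + V\, x_0^{optimum}\, E\{T\}
\]
% Summing over r = 0, ..., R-1, the Lyapunov terms telescope:
\[
E\{L(t_R)\} - E\{L(t_0)\}
  + V\, E\Big\{\sum_{\tau=t_0}^{t_R-1} x_0(\tau)\Big\}
  \le R\,B + V\, x_0^{optimum}\, R\, E\{T\}
\]
% Using L(t_R) >= 0 and dividing by V R E{T}:
\[
\frac{1}{R\,E\{T\}}\, E\Big\{\sum_{\tau=t_0}^{t_R-1} x_0(\tau)\Big\}
  \le x_0^{optimum} + \frac{B}{V\,E\{T\}} + \frac{E\{L(t_0)\}}{V\,R\,E\{T\}}
\]
```

As $R \to \infty$ the last term vanishes, so the time average cost is within a gap proportional to $1/V$ of the optimum.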

Implementation of Approximate Weighted SSP:
Use a simple 1-step Robbins-Monro iteration with a past history of W samples $\{S(t_1), S(t_2), \ldots, S(t_W)\}$.
To avoid subtle correlations between samples and queue weights, use a Delayed Queue Analysis.
The algorithm requires no a-priori knowledge of statistics, and takes roughly $|Z|$ operations per slot to perform Robbins-Monro. Convergence time and delay scale as $\log(|Z|)$.
For K delay-constrained queues, $|Z| = (B_{max}+1)^K$ (geometric in K).
Can modify the implementation for constant per-slot complexity, but then convergence time is geometric in K. (Either way, we want K small.)
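A Robbins-Monro style iteration of the kind mentioned above can be sketched as follows. This is an illustration and not the algorithm from the talk: it assumes a fixed point of the form J(z) = E_S[min_a {cost(z, a, S) + (1 - delta) * sum_z' P(z'|z, a) J(z')}], cycles through a stored list of past samples of S, uses step size 1/k, and leaves out the Delayed Queue Analysis. `robbins_monro_ssp` and its parameters are hypothetical names.

```python
def robbins_monro_ssp(cost, trans, delta, samples, num_iters):
    """Stochastic-approximation estimate of the cost-to-renewal J(z).

    cost(z, a, s) : weighted one-slot cost for state z, action a, sample s
    trans[z][a]   : list of next-state probabilities when no renewal occurs
    delta         : forced renewal probability (assumed to end the period)
    samples       : stored history of random events S(t_1), ..., S(t_W)
    """
    n = len(trans)
    J = [0.0] * n
    for k in range(1, num_iters + 1):
        s = samples[(k - 1) % len(samples)]  # reuse the stored history
        step = 1.0 / k                       # diminishing step size
        target = []
        for z in range(n):                   # roughly |Z| updates per iteration
            best = min(
                cost(z, a, s)
                + (1.0 - delta) * sum(p * J[z2] for z2, p in enumerate(trans[z][a]))
                for a in range(len(trans[z]))
            )
            target.append(best)
        # Move each estimate a fraction `step` toward its sampled target.
        J = [(1.0 - step) * J[z] + step * target[z] for z in range(n)]
    return J


if __name__ == "__main__":
    # Single Markov state, two actions: pay the sampled cost s, or pay 1.
    trans = [[[1.0], [1.0]]]
    cost = lambda z, a, s: s if a == 0 else 1.0
    J = robbins_monro_ssp(cost, trans, delta=0.5,
                          samples=[0.0, 2.0], num_iters=20000)
    # Fixed point: J = E[min(S, 1)] / delta = 0.5 / 0.5 = 1.
    print("estimated J:", J[0])
```

Each iteration touches every Markov state once, which is consistent with the roughly |Z| operations per slot quoted above, and no knowledge of the distribution of S is needed beyond the stored samples.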

Conclusions:
Treat general Markov decision networks
Generalize Max-Weight/Lyapunov optimization to Min Weighted Stochastic Shortest Path (W-SSP)
Can solve delay constrained network problems:
  Convergence times and delays are polynomial in (N+K)
  Per-slot computation complexity of solving Robbins-Monro is geometric in K (want K small)