Traffic Engineering for ISP Networks Jennifer Rexford Computer Science Department Princeton University


Outline
• Internet routing
  – Overview of the Internet routing architecture
  – Shortest-path link-state routing between edge routers
• Optimization: Tune routing to the traffic
  – Optimizing routing given a topology and traffic matrix
  – Local search to select the integer link weights
• Design for optimizability: Design routing protocol
  – Optimal traffic engineering with link-state routing
• Tomography: Infer the traffic matrix
  – Estimating traffic matrix from routing and link load

Internet Routing

Autonomous Systems (ASes)
• Internet is divided into Autonomous Systems
  – Distinct regions of administrative control
  – Routers/links managed by a single “institution”
  – Service provider, company, university, …
• Hierarchy of Autonomous Systems
  – Large, tier-1 provider with a nationwide backbone
  – Medium-sized regional provider with smaller backbone
  – Small network run by a single company or university
• Cooperate to ensure end-to-end reachability

Interdomain Routing
• AS-level topology
  – Destinations are IP prefixes (e.g., /8)
  – Nodes are Autonomous Systems (ASes)
  – Edges are links and business relationships
[Figure: AS-level topology with a client and a web server]

Example Backbone: Abilene Internet2 Network

Points-of-Presence (PoPs)
• Inter-PoP links
  – Long distances
  – High bandwidth
• Intra-PoP links
  – Short cables between racks or floors
  – Aggregated bandwidth
• Links to other networks
  – Wide range of media and bandwidth
[Figure: PoP diagram labeling intra-PoP links, inter-PoP links, and links to other networks]

Intradomain Routing: Shortest-Path Routing
• Path-selection model
  – Destination-based
  – Load-insensitive (e.g., static link weights)
  – Minimum hop count or sum of link weights

Computing Shortest Paths: Link-State Routing
• Topology discovery
  – Routers flood information to learn the topology
  – Each router constructs a link-state database

Computing Shortest Paths: Link-State Routing
• Shortest-path computation
  – Each router runs Dijkstra’s shortest-path algorithm
  – Computes the “next hop” to reach other routers

Computing Shortest Paths: Link-State Routing
• Packet forwarding
  – Each router maintains a forwarding table
  – To forward incoming packets to the right next-hop link
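The shortest-path and forwarding steps above can be sketched in a few lines of Python. This is a minimal illustration, not a router implementation: the function name `forwarding_table`, the adjacency-dictionary format, and the five-router example topology are all assumptions made for this sketch. Equal-cost ties are ignored here (the first path found wins).

```python
import heapq

def forwarding_table(graph, source):
    """Run Dijkstra from `source`; return {destination: (distance, next_hop)}.

    `graph` is a hypothetical link-state database: {router: {neighbor: weight}}.
    """
    dist = {source: 0}
    next_hop = {}
    heap = [(0, source, None)]  # (distance, router, first hop on the path)
    done = set()
    while heap:
        d, node, hop = heapq.heappop(heap)
        if node in done:
            continue  # stale heap entry
        done.add(node)
        if hop is not None:
            next_hop[node] = hop
        for neighbor, weight in graph[node].items():
            nd = d + weight
            if neighbor not in dist or nd < dist[neighbor]:
                dist[neighbor] = nd
                # The first hop is the neighbor itself when leaving the source.
                first = neighbor if node == source else hop
                heapq.heappush(heap, (nd, neighbor, first))
    return {dst: (dist[dst], next_hop[dst]) for dst in next_hop}

# Illustrative topology (assumed): static, symmetric link weights.
graph = {
    "u": {"x": 1, "w": 4},
    "x": {"u": 1, "y": 1},
    "y": {"x": 1, "z": 1, "w": 1},
    "w": {"u": 4, "y": 1},
    "z": {"y": 1},
}
table = forwarding_table(graph, "u")
```

Here router u reaches w through x rather than over its direct link, because the path u, x, y, w has total weight 3 while the direct link has weight 4.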

Our Focus: Traffic Engineering (TE)
• Adjusting routing to the flow of traffic
  – How should network administrators run their networks?
  – Specifically, how should they set the link weights?
• Designing protocols for better traffic engineering
  – How should future routing protocols be designed?
  – Specifically, how to make TE efficient and easy?
• Collecting measurements of the offered traffic
  – How should administrators learn the traffic matrix?
  – Specifically, how to infer the matrix from link loads?

Optimization: Tuning Routing to the Traffic
Joint work with Bernard Fortz and Mikkel Thorup

Link Weights Control the Flow of Traffic
• Routers compute paths
  – Shortest paths as sum of link weights
• Operators set the link weights
  – To control where the traffic goes

Heuristics for Setting the Link Weights
• Proportional to physical distance
  – Cross-country links have higher weights than local ones
  – Minimizes end-to-end propagation delay
• Inversely proportional to link capacity
  – Smaller weights for higher-bandwidth links
  – Attracts more traffic to links with more capacity
• Tuned based on the offered traffic
  – Network-wide optimization of weights based on traffic
  – Directly minimizes key metrics like max link utilization

Why Are the Link Weights Static?
• Strawman alternative: load-sensitive routing
  – Link metrics based on traffic load
  – Flood dynamic metrics as they change
  – Adapt automatically to changes in offered load
• Reasons why this is typically not done
  – Delay-based routing unsuccessful in the early days
  – Oscillation as routers adapt to out-of-date information
  – Most Internet transfers are very short-lived
• Research and standards work continues…
  – … but operators have to do what they can today

Big Picture: Measure, Model, and Control
[Figure: measure/control loop between the operational network and a network-wide “what if” model; labels: topology/configuration, offered traffic, changes to the network]

Traffic Engineering in an ISP Backbone
• Topology
  – Connectivity and capacity of routers and links
• Traffic matrix
  – Offered load between points in the network
• Link weights
  – Configurable parameters for Interior Gateway Protocol
• Performance objective
  – Balanced load, low latency, service level agreements, …
• Question: Given the topology and traffic matrix in an IP network, which link weights should be used?

Key Ingredients of Our Approach
• Measurement
  – Topology: monitoring of the routing protocols
  – Traffic matrix: widely deployed traffic measurement
• Network-wide models
  – Representations of topology and traffic
  – “What-if” models of shortest-path routing
• Network optimization
  – Efficient algorithms to find good configurations
  – Operational experience to identify key constraints

Formalizing the Optimization Problem
• Input: graph $G(R,L)$
  – $R$ is the set of routers
  – $L$ is the set of unidirectional links
  – $c_l$ is the capacity of link $l$
• Input: traffic matrix
  – $M_{i,j}$ is traffic load from router $i$ to $j$
• Output: setting of the link weights
  – $w_l$ is weight on unidirectional link $l$
  – $P_{i,j,l}$ is fraction of traffic from $i$ to $j$ traversing link $l$

Multiple Shortest Paths With Even Splitting
[Figure: values of $P_{i,j,l}$]

Defining the Objective Function
• Computing the link utilization
  – Link load: $u_l = \sum_{i,j} M_{i,j} P_{i,j,l}$
  – Utilization: $u_l / c_l$
• Objective functions
  – $\min \max_l (u_l / c_l)$
  – $\min \sum_l f(u_l / c_l)$
[Figure: plot of $f(x)$ against $x$, with $x = 1$ marked]
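As a small worked illustration of the two objectives above, the sketch below computes link loads from a traffic matrix and a set of routing fractions, then evaluates both the maximum utilization and a sum of per-link penalties. The three-router example, the dictionary layout, and the penalty `f(x) = x * x` are assumptions for illustration only; the slides do not specify the exact shape of `f`.

```python
def link_loads(traffic, routing):
    """u_l = sum over (i, j) of M[i, j] * P[i, j, l]."""
    loads = {}
    for (i, j), demand in traffic.items():
        for link, fraction in routing.get((i, j), {}).items():
            loads[link] = loads.get(link, 0.0) + demand * fraction
    return loads

def max_utilization(loads, capacity):
    """First objective: the largest u_l / c_l over all links."""
    return max(loads.get(l, 0.0) / c for l, c in capacity.items())

def total_penalty(loads, capacity, f):
    """Second objective: sum over links of f(u_l / c_l)."""
    return sum(f(loads.get(l, 0.0) / c) for l, c in capacity.items())

# Hypothetical example: a->c traffic splits evenly over a-c and a-b-c.
traffic = {("a", "c"): 12, ("b", "c"): 4}
routing = {
    ("a", "c"): {("a", "c"): 0.5, ("a", "b"): 0.5, ("b", "c"): 0.5},
    ("b", "c"): {("b", "c"): 1.0},
}
capacity = {("a", "c"): 10, ("a", "b"): 10, ("b", "c"): 10}

loads = link_loads(traffic, routing)
worst = max_utilization(loads, capacity)
penalty = total_penalty(loads, capacity, lambda x: x * x)  # placeholder f
```

In this example link b-c carries 6 units of a-to-c traffic plus 4 units of b-to-c traffic, so it is the bottleneck at full utilization while the other two links sit at 60%.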

Complexity of the Optimization Problem
• NP-hard optimization problem
  – No efficient algorithm to find the link weights
  – Even for the simple convex objective functions
• Why can’t we just do multi-commodity flow?
  – E.g., solve the multi-commodity flow problem…
  – … and the link weights pop out as the dual
  – Because IP routers cannot split arbitrarily over ties
• What are the implications?
  – Have to resort to searching through weight settings

Optimization Based on Local Search
• Start with an initial setting of the link weights
  – E.g., same integer weight on every link
  – E.g., weights inversely proportional to link capacity
  – E.g., existing weights in the operational network
• Compute the objective function
  – Compute the all-pairs shortest paths to get $P_{i,j,l}$
  – Apply the traffic matrix $M_{i,j}$ to get link loads $u_l$
  – Evaluate the objective function from the $u_l / c_l$
• Generate a new setting of the link weights, and repeat
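The loop described above can be sketched as follows. This is a deliberately naive illustration under stated assumptions, not the algorithm used in the joint work: it changes one randomly chosen weight per step, accepts only improvements, recomputes all shortest paths from scratch, and uses maximum utilization as the objective. The function names, the `max_weight` bound, and the three-link example are all hypothetical.

```python
import heapq
import random

def distances_to(dest, nodes, weights):
    """Dijkstra over reversed links: shortest distance from each node to dest."""
    dist = {n: float("inf") for n in nodes}
    dist[dest] = 0
    heap = [(0, dest)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist[v]:
            continue  # stale heap entry
        for (a, b), w in weights.items():
            if b == v and d + w < dist[a]:
                dist[a] = d + w
                heapq.heappush(heap, (dist[a], a))
    return dist

def link_loads(nodes, weights, traffic):
    """Route each demand with even splitting over all shortest paths."""
    loads = {l: 0.0 for l in weights}
    for dest in nodes:
        dist = distances_to(dest, nodes, weights)
        inflow = {n: traffic.get((n, dest), 0.0) for n in nodes}
        # Farthest routers first, so upstream traffic is known before splitting.
        for u in sorted(nodes, key=lambda n: -dist[n]):
            if u == dest or inflow[u] == 0 or dist[u] == float("inf"):
                continue
            hops = [(a, b) for (a, b), w in weights.items()
                    if a == u and dist[b] + w == dist[u]]
            for a, b in hops:
                loads[(a, b)] += inflow[u] / len(hops)
                inflow[b] += inflow[u] / len(hops)
    return loads

def max_utilization(nodes, weights, traffic, capacity):
    loads = link_loads(nodes, weights, traffic)
    return max(loads[l] / capacity[l] for l in weights)

def local_search(nodes, capacity, traffic, weights, steps=200, max_weight=3, seed=0):
    rng = random.Random(seed)
    best = dict(weights)
    best_cost = max_utilization(nodes, best, traffic, capacity)
    links = sorted(best)
    for _ in range(steps):
        candidate = dict(best)
        candidate[rng.choice(links)] = rng.randint(1, max_weight)  # change one weight
        cost = max_utilization(nodes, candidate, traffic, capacity)
        if cost < best_cost:  # keep only strict improvements
            best, best_cost = candidate, cost
    return best, best_cost

# Hypothetical example: 12 units from a to c, every link has capacity 10.
nodes = ["a", "b", "c"]
capacity = {("a", "c"): 10, ("a", "b"): 10, ("b", "c"): 10}
traffic = {("a", "c"): 12}
unit = {l: 1 for l in capacity}
start_cost = max_utilization(nodes, unit, traffic, capacity)
found, found_cost = local_search(nodes, capacity, traffic, unit)
```

With unit weights all traffic takes the direct link, which is overloaded. Raising the weight of a-c to 2 creates a tie with the two-hop path, so the load splits evenly and no link exceeds 60% utilization. A practical implementation would add the refinements listed on the next slide, such as incremental shortest-path updates and signatures of visited settings.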

Making the Search Efficient
• Avoid repeating the same weight setting
  – Keep track of past values of the weight setting
  – … or keep a small signature (e.g., a hash) of past values
  – Do not evaluate a weight setting if signatures match
• Avoid computing the shortest paths from scratch
  – Explore weight settings that change just one weight
  – Apply fast incremental shortest-path algorithms
• Limit the number of unique values of link weights
  – Do not explore all $2^{16}$ possible values for each weight
• Stop early, before exploring the whole search space

Incorporating Operational Realities
• Minimize number of changes to the network
  – Changing just 1 or 2 link weights is often enough
• Tolerate failure of network equipment
  – Weight settings usually remain good after failure
  – … or can be fixed by changing one or two weights
• Limit dependence on measurement accuracy
  – Good weights remain good, despite random noise
• Limit frequency of changes to the weights
  – Joint optimization for day and night traffic matrices

Application to AT&T’s Backbone Network
• Performance of the optimized weights
  – Search finds a good solution within a few minutes
  – Much better than link capacity or physical distance
  – Competitive with multi-commodity flow solution
• How AT&T changes the link weights
  – Maintenance done every night from midnight to 6am
  – Predict effects of removing link(s) from the network
  – Reoptimize the link weights to avoid congestion
  – Configure new weights before disabling equipment

Example from My Visit to AT&T’s Operations Center
• Amtrak repairing/moving part of the train track
  – Need to move some of the fiber optic cables
  – Or, heightened risk of the cables being cut
  – Amtrak notifies us of the time the work will be done
• AT&T engineers model the effects
  – Determine which IP links go over the affected fiber
  – Pretend the network no longer has these links
  – Evaluate the new shortest paths and traffic flow
  – Identify whether link loads will be too high

Example Continued
• If load will be too high
  – Reoptimize the weights on the remaining links
  – Schedule the time for the new weights to be configured
  – Roll back to the old weight setting after Amtrak is done
• Same process applied to other cases
  – Assessing the network’s risk to possible failures
  – Planning for maintenance of existing equipment
  – Adapting the link weights to installation of new links
  – Adapting the link weights in response to traffic shifts

Conclusions on Traffic Engineering
• IP networks do not adapt on their own
  – Routers compute shortest paths based on static weights
• Service providers need to adapt the weights
  – Due to failures, congestion, or planned maintenance
• Leads to an interesting optimization problem
  – Optimize link weights based on topology and traffic
• Optimization problem is computationally difficult
  – Forces the use of efficient local-search techniques
• Results of the local search are pretty good
  – Near-optimal solutions that minimize disruptions

Ongoing Work
• Robust link-weight assignments
  – Link/node failures
  – Range of traffic matrices
• More complex routing models
  – Hot-potato routing
  – BGP routing policies
• Interaction between ASes
  – Inter-AS negotiation for joint optimization
  – Grappling with scalability and trust issues

Design for Optimizability: Optimal Link-State Routing Protocol
Joint work with Dahai Xu and Mung Chiang

Revisiting TE With Link-State Routing Protocols
• Advantages of link weights
  – One parameter for each unidirectional link
  – Hop-by-hop forwarding (no tunneling, no per-flow state)
  – New routes computed automatically after failure
  – Changing just a few weights can alleviate congestion
• Disadvantages of link weights
  – Computationally expensive optimization
  – Suboptimal distribution of traffic
  – (Disruptions when changing the link weights)

Example of Inefficient TE
• Simple topology
• Demand of 300 units:
  – All on top path: 300% utilization of top path
  – All on bottom path: 150% utilization of bottom path
  – Even splitting: 150% on top path, 75% on bottom
[Figure: two parallel paths from s to t, with capacities $c_1 = 100$ (top) and $c_2 = 200$ (bottom)]
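The arithmetic behind this example can be checked directly. The short sketch below recomputes the three utilizations from the slide and, as an extra case not shown on the slide, a split proportional to capacity, which is what arbitrary splitting would permit. The variable and function names are made up for this illustration.

```python
capacities = {"top": 100, "bottom": 200}
demand = 300

def utilization(split):
    """`split` maps each path to the fraction of the demand placed on it."""
    return {path: demand * split[path] / capacities[path] for path in capacities}

all_top = utilization({"top": 1.0, "bottom": 0.0})
all_bottom = utilization({"top": 0.0, "bottom": 1.0})
even = utilization({"top": 0.5, "bottom": 0.5})

# Extra case (not on the slide): split in proportion to capacity.
total_capacity = sum(capacities.values())
proportional = utilization({p: c / total_capacity for p, c in capacities.items()})
```

Even splitting leaves the top path at 150% while the bottom path has spare room. A 1:2 split would put 100 units on top and 200 on the bottom, so both paths run at exactly 100%, which is the best achievable when demand equals total capacity.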

Stepping Back: Design for Optimizability
• Two research approaches
  – Bottom up: do the best with what you have
  – Top down: design systems that are easier to manage
• Design for manageability
  – “If you are both the professor and the student, you create exam questions that are easy to answer.” – Mung Chiang
• Knowing what we know now…
  – How should intradomain routing protocols work…
  – … to make TE more efficient and hopefully easier?

Optimal TE With Multicommodity Flow
• Problem with shortest-path routing
  – Inflexible even splitting over shortest paths
• Optimal distribution of traffic
  – Send traffic over any paths in any proportions
  – Using tunneling to force traffic on the paths
  – Realizable with MultiProtocol Label Switching (MPLS)
• Disadvantage of MPLS: high overhead
  – Large number of paths between pairs of routers
  – Must adapt the splitting ratios after each failure

Can We Have Link-State Routing and Optimal TE?
• Link-state routing and hop-by-hop forwarding
  – Single weight on each link
  – Local rule to compute splitting over paths
  – Each router forwards based only on the destination
• Link-state routing != shortest-path routing
  – Routers could use other traffic-splitting rules
  – … as long as they are locally computable
  – … only from the link weights

Forward Packet Based on Link Weights
• Available information at router $u$
  – $w_{u,v}$: weights for all links
  – $d_u^t$: shortest distance from $u$ to $t$
  – $h_{u,v}^t$: distance gap, $h_{u,v}^t = d_v^t + w_{u,v} - d_u^t$
[Figure: example outgoing links labeled with a distance gap of 0 and a distance gap of 1]

Traffic-Splitting Function
• Relative flow distributed on outgoing links
  – $G(h_{u,v}^t)$: proportion sent out the link to $v$ toward $t$
• Split traffic to $t$ in proportion $G(h_{u,v}^t) / \sum_j G(h_{u,j}^t)$
• Even splitting
  – $G(h_{u,v}^t)$ is 1 if $h_{u,v}^t = 0$ (all traffic on shortest paths)
  – $G(h_{u,v}^t)$ is 0 if $h_{u,v}^t > 0$ (no traffic on longer paths)

Exponential Splitting
• Exponentially diminishing traffic on longer paths
  – Proportion on path $i$ proportional to $\exp(-p_i)$
  – … where $p_i$ is the cost of path $i$
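To make the distance gap and the splitting rule concrete, here is a small sketch that computes the gaps at one router and turns them into split ratios. It assumes a per-hop function `G(h) = exp(-h)` purely for illustration; the slides state the exponential rule per path, and the exact per-hop form used by the protocol may differ. The topology, the precomputed distance table, and all names are hypothetical.

```python
import math

def distance_gaps(u, t, weights, dist):
    """h[u,v,t] = d[v][t] + w[u,v] - d[u][t] for every outgoing link (u, v)."""
    return {v: dist[v][t] + w - dist[u][t]
            for (a, v), w in weights.items() if a == u}

def split_ratios(gaps, G):
    """Share of traffic on each outgoing link: G(h_v) / sum_j G(h_j)."""
    total = sum(G(h) for h in gaps.values())
    return {v: G(h) / total for v, h in gaps.items()}

def even(h):
    """Even splitting: traffic only on links with a distance gap of 0."""
    return 1.0 if h == 0 else 0.0

def exponential(h):
    """Assumed per-hop form: exponentially less traffic as the gap grows."""
    return math.exp(-h)

# Hypothetical example: u reaches t via a (cost 1 + 1) or via b (cost 2 + 1).
weights = {("u", "a"): 1, ("u", "b"): 2, ("a", "t"): 1, ("b", "t"): 1}
dist = {"u": {"t": 2}, "a": {"t": 1}, "b": {"t": 1}, "t": {"t": 0}}

gaps = distance_gaps("u", "t", weights, dist)
even_split = split_ratios(gaps, even)
exp_split = split_ratios(gaps, exponential)
```

The link to a has a gap of 0 and the link to b has a gap of 1. Even splitting sends everything through a, while the exponential rule still sends roughly 27% through b, which is what gives the operator finer control than shortest-path routing allows.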

Optimal TE
• A surprising result
  – This kind of link-state routing can achieve optimal TE
• Optimality
  – Can realize the multicommodity flow traffic distribution
  – Expressible in terms of settings of link weights
• Efficient algorithm
  – Computationally tractable to compute optimal weights
  – … for a given traffic matrix and capacitated topology

Intuition Behind the Theory
[Figure: sets labeled “feasible flow routing”, “optimal flow routing”, and “realizable with link-state routing”]

Finding Link-State Protocols That Achieve Optimal TE
• Need an additional objective function
  – To find solutions expressible in terms of link weights
• But we already have an objective function
  – So, how can we add another one?
• First, solve the original optimization problem
  – To determine the load on each link at optimality
  – … i.e., the “necessary capacity” of each link
• Then, solve a second optimization problem
  – On this new topology, with our new objective

TE Optimization Problem: Compute Necessary Capacity
• Convex objective
  – Minimize the sum of $f(\cdot)$ over all links
• Constraints
  – Flow conservation: must carry the traffic matrix
  – Capacity constraint: cannot exceed link capacity
• Variables
  – Flow along each path
• Given
  – Traffic matrix and link capacities

New Optimization Problem
• Necessary link capacity
  – Flow on link (u,v) in the multicommodity-flow solution
  – … becomes the capacity of the link in the new problem
• In the new optimization problem
  – Any feasible solution is “optimal”
  – … relative to the original optimization problem
• So, now we can pick a new objective
  – Key intuition: maximizing “entropy”

Entropy Maximization
• Assume we could enumerate all paths from $s$ to $t$
  – (Though this would not be feasible in practice)
• Entropy
  – $x_k^{s,t}$: fraction of traffic from $s$ to $t$ put on path $k$
  – $z(x) = -x \log(x)$: entropy function
• New objective: maximize entropy
  – $\sum_{s,t} M_{s,t} \left( \sum_k z(x_k^{s,t}) \right)$
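The entropy objective is easy to evaluate once the path fractions are given. The sketch below does so for a single source-destination pair with two paths; the demand of 300 units and the two candidate splits are assumptions chosen to echo the earlier two-path example, and the function names are invented for this illustration.

```python
import math

def z(x):
    """Entropy term -x * log(x), taking z(0) = 0 by convention."""
    return -x * math.log(x) if x > 0 else 0.0

def entropy_objective(traffic, path_fractions):
    """Sum over pairs (s, t) of M[s, t] * sum over paths k of z(x_k)."""
    return sum(demand * sum(z(x) for x in path_fractions[pair])
               for pair, demand in traffic.items())

# Hypothetical single pair with two paths.
traffic = {("s", "t"): 300}
even = entropy_objective(traffic, {("s", "t"): [0.5, 0.5]})
skewed = entropy_objective(traffic, {("s", "t"): [1 / 3, 2 / 3]})
single = entropy_objective(traffic, {("s", "t"): [1.0, 0.0]})
```

Taken alone, the objective favors spreading traffic evenly: the even split scores highest and a single path scores zero. In the new optimization problem the maximization is constrained by the necessary capacities, so the solver picks the most spread-out split among those that reproduce the optimal link loads.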

High-Level Overview of the Details
• The NEM (network entropy maximization) problem always has a solution
  – Earlier multicommodity flow solution
• Solving directly is not efficient
  – Need to avoid enumerating all the paths
• Solving with dual decomposition
  – Derivation leads to the exponential function
  – … for splitting traffic over the multiple paths
• Derivation also leads to weight-setting algorithm
  – Computationally efficient, better than local search

Conclusions
• Protocols induce optimization problems
  – E.g., setting link weights to do traffic engineering
• Complexity of the optimization problems
  – A symptom that the protocol is not quite right
  – E.g., NP-hard problem and suboptimal traffic flow
• Design for optimizability
  – Design the protocol to be easy to optimize
  – … using optimization theory as a protocol design tool

Tomography: Inferring the Traffic Matrix
Work by Yin Zhang, Matthew Roughan, Nick Duffield, and Albert Greenberg

Computing the Traffic Matrix $M_{i,j}$
• Hard to measure the traffic matrix
  – IP networks transmit data as individual packets
  – Routers do not keep traffic statistics, except link utilization on (say) a five-minute time scale
• Need to infer the traffic matrix $M_{i,j}$ from
  – Current topology $G(R,L)$
  – Current routing $P_{i,j,l}$
  – Current link load $u_l$
  – Link capacity $c_l$

Inference: Network Tomography
From link counts to the traffic matrix
[Figure: sources and destinations, with link counts of 4 Mbps, 3 Mbps, and 5 Mbps]

Tomography: Formalizing the Problem
• Ingress-egress pairs
  – $p$ is an ingress-egress pair of nodes $(i,j)$
  – $x_p$ is the (unknown) traffic volume for this pair, i.e., $M_{i,j}$
• Routing
  – $P_{l,p}$ is the proportion of $p$’s traffic that traverses $l$
• Links in the network
  – $l$ is a unidirectional edge
  – $u_l$ is the observed traffic volume on this link
• Relationship: $u = Px$ (work backwards to get $x$)

Tomography: One Observation Not Enough
• Linear system of $n$ nodes is underdetermined
  – Number of links $e$ is around $O(n)$
  – Number of ingress-egress pairs $c$ is $O(n^2)$
  – Dimension of solution sub-space at least $c - e$
• Multiple observations are needed
  – $k$ independent observations (over time)
  – Stochastic model with Poisson iid counts
  – Maximum likelihood estimation to infer matrix
• Doesn’t work all that well in practice…
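A tiny example shows why a single set of link loads cannot pin down the traffic matrix. The sketch below builds the routing matrix for a hypothetical network in which two sources reach two destinations through one middle node, then exhibits two different traffic vectors that produce identical link loads under $u = Px$. The topology, node names, and traffic values are all invented for this illustration.

```python
# Hypothetical topology: sources s1, s2 -> middle node m -> destinations d1, d2.
pairs = [("s1", "d1"), ("s1", "d2"), ("s2", "d1"), ("s2", "d2")]
links = [("s1", "m"), ("s2", "m"), ("m", "d1"), ("m", "d2")]

def routing_matrix(links, pairs):
    """P[l][p] = 1 if the path of pair p = (s, d), which is s -> m -> d, uses link l."""
    return [[1.0 if link in ((s, "m"), ("m", d)) else 0.0 for (s, d) in pairs]
            for link in links]

def link_loads(P, x):
    """u = P x, computed row by row."""
    return [sum(row[p] * x[p] for p in range(len(x))) for row in P]

P = routing_matrix(links, pairs)

# Two different traffic matrices (in Mbps), flattened in the order of `pairs`.
x_a = [3, 1, 2, 1]
x_b = [4, 0, 1, 2]

u_a = link_loads(P, x_a)
u_b = link_loads(P, x_b)
```

Both vectors give loads of 4 and 3 Mbps on the source links and 5 and 2 Mbps on the destination links, so the observed counts alone cannot distinguish them. With four unknowns and only three independent equations here, a whole family of traffic matrices fits the data.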

Approach Used at AT&T: Tomo-gravity
• Gravitational assumption
  – Ingress point $a$ has traffic $v_a^i$
  – Egress point $b$ has traffic $v_b^e$
  – Pair $(a,b)$ has traffic proportional to $v_a^i \cdot v_b^e$
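The gravitational assumption translates into a one-line estimate per pair. The sketch below assumes the constant of proportionality is one over the total traffic, a common normalization that makes each row of the estimate sum to the measured ingress volume; the slide only states proportionality. The ingress and egress volumes are invented for this illustration.

```python
def gravity_estimate(ingress, egress):
    """Estimate traffic for pair (a, b) as ingress[a] * egress[b] / total.

    Dividing by the total is an assumed normalization, not from the slides.
    """
    total = sum(ingress.values())
    return {(a, b): ingress[a] * egress[b] / total
            for a in ingress for b in egress}

# Hypothetical edge measurements in Mbps.
ingress = {"s1": 4.0, "s2": 3.0}
egress = {"d1": 5.0, "d2": 2.0}

estimate = gravity_estimate(ingress, egress)
from_s1 = estimate[("s1", "d1")] + estimate[("s1", "d2")]
```

For these numbers the model assigns 20/7 Mbps to the pair (s1, d1), and the two estimates leaving s1 add back up to its 4 Mbps of ingress traffic. The estimate uses only edge totals, so it can disagree with the loads observed on interior links.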

Approach Used at AT&T: Tomo-gravity
• Problem with gravity model
  – Gravity model ignores the load on the inside links
  – Gravity assumption isn’t always 100% correct
  – Resulting traffic matrix might not satisfy the link loads
• Combining the two techniques
  – Gravity: find a traffic matrix using the gravity model
  – Tomography: find the family of traffic matrices consistent with all link load statistics
  – Tomo-gravity: find the tomography solution that is closest to the output of the gravity model
• Works extremely well (and fast) in practice

Conclusions on Tomography
• Routers don’t reveal much information about traffic
  – Measurement provides a network-wide view
  – E.g., network topology and traffic matrix
• Available data induces a tomography problem
  – Input: network topology, routing, and link loads
  – Output: inferred traffic matrix
• Design for tomography?
  – Design future monitoring systems to induce easier-to-solve tomography problems?

Conclusions
• Internet routing is a rich problem space
  – Designed incrementally as Internet evolved
  – Not designed with network management in mind
• Network management: bottom up
  – Working with what you have
  – Tuning link weights, and inferring traffic matrices
• Exciting new area: design for manageability
  – Protocols that are easy to tune
  – Measurements that make inference easy