Traffic Engineering for ISP Networks Jennifer Rexford Computer Science Department Princeton University

Presentation transcript:


Outline
• Overview of Internet routing
  – IP addressing and forwarding
  – Interdomain and intradomain routing
  – Constructing the forwarding table
• Role of optimization in traffic engineering
  – Heuristics for setting the link weights
  – Optimizing routing to prevailing traffic
  – Local search to select the integer link weights
• Conclusion and ongoing work

IP Service Model: Best-Effort Packet Delivery
• Packet switching
  – Send data in packets
  – Header with source & destination address
• Best-effort delivery
  – Packets may be lost
  – Packets may be corrupted
  – Packets may be delivered out of order
[Figure: source and destination connected through an IP network]

Packet Delivery Based on Destination IP Address
• 32-bit number in dotted-quad notation ( )
• Divided into network & host portions (left and right)
• /24 is a 24-bit prefix with $2^8$ addresses
[Figure: address split into Network (24 bits) and Host (8 bits)]
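The network/host split can be checked with Python's standard `ipaddress` module. This is a minimal sketch; the prefix 192.0.2.0/24 and the address 192.0.2.77 are illustrative assumptions (a documentation range), not values from the slides.

```python
import ipaddress

# Assumed example prefix and address (documentation range, not from the slides).
net = ipaddress.ip_network("192.0.2.0/24")
addr = ipaddress.ip_address("192.0.2.77")

host_bits = 32 - net.prefixlen                    # 8 host bits for a /24
network_part = int(addr) >> host_bits             # leftmost 24 bits
host_part = int(addr) & ((1 << host_bits) - 1)    # rightmost 8 bits

print(net.num_addresses)   # number of addresses covered by the prefix
print(addr in net)         # whether the address falls inside the prefix
print(host_part)           # host portion of the address
```

A /24 leaves 8 host bits, so the prefix covers $2^8 = 256$ addresses.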

Longest-Prefix Match Forwarding
• Forwarding tables in IP routers
  – Map each IP prefix to next-hop link(s)
• Destination-based forwarding
  – Packet has a destination address
  – Router identifies longest-matching prefix
[Figure: a destination address matched against the prefixes in a forwarding table, selecting outgoing link Serial0/0.1]
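A small sketch of longest-prefix matching follows. The table contents (prefixes and link names) are hypothetical, and the linear scan is chosen for readability; real routers typically use tries or hardware lookup structures instead.

```python
import ipaddress

# Hypothetical forwarding table: prefixes and link names are made up for illustration.
FORWARDING_TABLE = {
    ipaddress.ip_network("0.0.0.0/0"): "Serial0/1.0",
    ipaddress.ip_network("10.0.0.0/8"): "Serial0/0.0",
    ipaddress.ip_network("10.1.0.0/16"): "Serial0/0.1",
    ipaddress.ip_network("10.1.2.0/24"): "Serial0/0.2",
}

def lookup(destination, table=FORWARDING_TABLE):
    """Return the outgoing link for the longest prefix that contains destination."""
    addr = ipaddress.ip_address(destination)
    matches = [prefix for prefix in table if addr in prefix]
    if not matches:
        return None  # no matching prefix and no default route
    best = max(matches, key=lambda prefix: prefix.prefixlen)
    return table[best]

print(lookup("10.1.2.3"))   # matches /0, /8, /16 and /24; the /24 wins
print(lookup("10.9.9.9"))   # matches /0 and /8; the /8 wins
```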

Where do Forwarding Tables Come From?
• Routers have forwarding tables
  – Map prefix to outgoing link(s)
• Entries can be statically configured
  – E.g., “map /24 to Serial0/0.1”
• But, this doesn’t adapt
  – To failures
  – To new equipment
  – To the need to balance load
• That is where routing protocols come in…

Two-Tiered Internet Routing Architecture
• Goal: distributed management of resources
  – Internetworking of multiple networks
  – Networks under separate administrative control
• Solution: two-tiered routing architecture
  – Intradomain: inside a region of control
    » Okay for routers to share topology information
    » Routers configured to achieve a common goal
  – Interdomain: between regions of control
    » Not okay to share complete information
    » Networks may have different/conflicting goals

Autonomous Systems (ASes)
• Autonomous Systems
  – Distinct regions of administrative control
  – Routers and links managed by an institution
  – Service provider, company, university, …
• AS hierarchy
  – Tier-1 provider with national or global backbone
  – Regional provider with smaller backbone
  – Campus or corporate network
• Interaction between ASes
  – Internal topology is not shared between ASes
  – … but, neighboring ASes interact to coordinate routing

AS Numbers (ASNs)
• Level 3: 1
• MIT: 3
• Harvard: 11
• Yale: 29
• Princeton: 88
• AT&T: 7018, 6341, 5074, …
• UUNET: 701, 702, 284, 12199, …
• Sprint: 1239, 1240, 6211, 6242, …
• …
ASNs represent units of routing policy. Currently around 20,000 in use.

Traffic Traverses Multiple ASes
[Figure: traffic between a client and a web server crossing several ASes; path: 6, 5, 4, 3, 2, 1]

Interdomain Routing: Border Gateway Protocol
• ASes exchange info about who they can reach
  – IP prefix: block of destination IP addresses
  – AS path: sequence of ASes along the path
• Policies configured by the AS’s network operator
  – Path selection: which of the paths to use?
  – Path export: which neighbors to tell?
[Figure: route announcements “I can reach /24” and “I can reach /24 via AS 1”, with the resulting data traffic]

Interior Gateway Protocol (Within an AS)
• Routers flood information to learn the topology
  – Routers determine “next hop” to reach other routers…
  – By computing shortest paths based on the link weights
• Link weights configured by the network operator
[Figure: routers joined by weighted links; labels Serial0/ and /24]

Constructing the Forwarding Table
• Two routing protocols
  – BGP: learn the external route at some border router
  – IGP: learn outgoing link on path to other router
• Router joins the data
  – Prefix /24 reached through red router
  – Red router reached via link Serial0/0.1
  – Forwarding entry: /24 → Serial0/0.1
• Router forwards packets
  – Lookup destination in table
  – Forward packet out link Serial0/0.1
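The join between the two protocols can be sketched as a dictionary lookup. The data below is hypothetical: the prefix 192.0.2.0/24 and the router named "blue" are assumptions added for illustration, while "red" and Serial0/0.1 echo the slide.

```python
# Hypothetical routing state at one router (illustrative values only).
bgp_routes = {"192.0.2.0/24": "red"}     # prefix -> border router, learned via BGP
igp_next_link = {                         # router -> outgoing link, learned via the IGP
    "red": "Serial0/0.1",
    "blue": "Serial0/0.2",
}

def build_forwarding_table(bgp, igp):
    """Join BGP (prefix -> egress router) with IGP (router -> outgoing link)."""
    table = {}
    for prefix, egress in bgp.items():
        if egress in igp:                 # skip prefixes whose egress is unreachable
            table[prefix] = igp[egress]
    return table

forwarding_table = build_forwarding_table(bgp_routes, igp_next_link)
print(forwarding_table)
```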

What if There are Multiple Choices?
• This router has two BGP routes to /24.
• Hot potato: get traffic off of your network as soon as possible. Go for egress 1!
[Figure: a router with IGP distances to egress 1 and egress 2]
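Hot-potato selection reduces to picking the egress with the smallest IGP distance. In this sketch the distances are hypothetical, since the values in the slide's figure are not available; real BGP implementations apply further tie-breaking rules that are omitted here.

```python
def hot_potato_egress(igp_distance):
    """Return the egress point with the smallest IGP distance from this router."""
    return min(igp_distance, key=igp_distance.get)

# Hypothetical IGP distances from this router to each candidate egress.
igp_distance = {"egress 1": 3, "egress 2": 9}
chosen = hot_potato_egress(igp_distance)
print(chosen)
```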

Two Kinds of Routing Protocols
Link State
• Topology information is flooded within the routing domain
• Best end-to-end paths are computed locally at each router
• Best end-to-end paths determine next-hops
• Based on minimizing some notion of distance
• Works only if policy is shared and uniform
• Examples: OSPF, IS-IS
Vectoring
• Each router knows little about network topology
• Only best next-hops are chosen by each router for each destination
• Best end-to-end paths result from composition of all next-hop choices
• Does not require any notion of distance
• Does not require uniform policies at all routers
• Examples: RIP, BGP

Role of Optimization in Traffic Engineering

Link Weights Control the Flow of Traffic
• Routers compute paths
  – Shortest paths as sum of link weights
• Operators set the link weights
  – To control where the traffic goes
[Figure: example topology with link weights]

Heuristics for Setting the Link Weights
• Proportional to physical distance
  – Cross-country links have higher weights than local ones
  – Minimizes end-to-end propagation delay
• Inversely proportional to link capacity
  – Smaller weights for higher-bandwidth links
  – Attracts more traffic to links with more capacity
• Tuned based on the offered traffic
  – Network-wide optimization of weights based on traffic
  – Directly minimizes key metrics like max link utilization
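The inverse-capacity heuristic is simple enough to sketch directly. The reference bandwidth, the link names, and the capacities are assumptions for illustration; the clamp to 65535 reflects a 16-bit weight field.

```python
def inverse_capacity_weights(capacity_mbps, reference_mbps=100_000):
    """Assign each link an integer weight inversely proportional to its capacity.

    reference_mbps is a hypothetical scaling constant: a link of that
    capacity gets weight 1, and slower links get proportionally larger weights.
    """
    weights = {}
    for link, capacity in capacity_mbps.items():
        raw = round(reference_mbps / capacity)
        weights[link] = max(1, min(65535, raw))   # keep within a 16-bit weight
    return weights

# Hypothetical link capacities in Mbps.
capacities = {"core": 10_000, "regional": 2_500, "access": 155}
print(inverse_capacity_weights(capacities))
```

Higher-capacity links receive smaller weights, so shortest-path routing is drawn toward them.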

Why Are the Link Weights Static?
• Strawman alternative: load-sensitive routing
  – Link metrics based on traffic load
  – Flood dynamic metrics as they change
  – Adapt automatically to changes in offered load
• Reasons why this is typically not done
  – Delay-based routing unsuccessful in the early days
  – Oscillation as routers adapt to out-of-date information
  – Most Internet transfers are very short-lived
• Research and standards work continues…
  – … but operators have to do what they can today

Big Picture: Measure, Model, and Control
[Figure: control loop. Topology/configuration and offered traffic are measured from the operational network and fed into a network-wide “what if” model, which in turn drives changes to the network (control).]

Traffic Engineering in an ISP Backbone
• Topology
  – Connectivity and capacity of routers and links
• Traffic matrix
  – Offered load between points in the network
• Link weights
  – Configurable parameters for Interior Gateway Protocol
• Performance objective
  – Balanced load, low latency, service level agreements …
• Question: Given the topology and traffic matrix in an IP network, which link weights should be used?

Key Ingredients of Our Approach
• Instrumentation
  – Topology: monitoring of the routing protocols
  – Traffic matrix: widely deployed traffic measurement
• Network-wide models
  – Representations of topology and traffic
  – “What-if” models of shortest-path routing
• Network optimization
  – Efficient algorithms to find good configurations
  – Operational experience to identify key constraints

Formalizing the Optimization Problem
• Input: graph $G(R, L)$
  – $R$ is the set of routers
  – $L$ is the set of unidirectional links
  – $c_l$ is the capacity of link $l$
• Input: traffic matrix
  – $M_{i,j}$ is traffic load from router $i$ to $j$
• Output: setting of the link weights
  – $w_l$ is weight on unidirectional link $l$
  – $P_{i,j,l}$ is fraction of traffic from $i$ to $j$ traversing link $l$

Multiple Shortest Paths With Even Splitting
[Figure: values of $P_{i,j,l}$ when traffic is split evenly over multiple shortest paths]

Defining the Objective Function
• Computing the link utilization
  – Link load: $u_l = \sum_{i,j} M_{i,j} P_{i,j,l}$
  – Utilization: $u_l / c_l$
• Objective functions
  – $\min \max_l (u_l / c_l)$
  – $\min \sum_l f(u_l / c_l)$
[Figure: plot of $f(x)$ against $x$, with $x = 1$ marked on the axis]
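The link-load and utilization formulas can be evaluated on a toy instance. Everything below is assumed for illustration: the four links, their capacities, the two demands, and the routing fractions (an even split of a→d over two paths). The penalty `f` is a stand-in convex function, since the slides only show f as a plot.

```python
# Hypothetical traffic matrix M[i, j] (illustrative units).
M = {("a", "d"): 10.0, ("b", "d"): 6.0}

# Hypothetical routing fractions P[i, j, l]: a->d is split evenly over two paths.
P = {
    ("a", "d", "a-b"): 0.5, ("a", "d", "b-d"): 0.5,
    ("a", "d", "a-c"): 0.5, ("a", "d", "c-d"): 0.5,
    ("b", "d", "b-d"): 1.0,
}

# Hypothetical link capacities c[l].
c = {"a-b": 10.0, "b-d": 10.0, "a-c": 10.0, "c-d": 10.0}

def link_loads(M, P, links):
    """u_l = sum over (i, j) of M[i, j] * P[i, j, l]."""
    u = {l: 0.0 for l in links}
    for (i, j, l), fraction in P.items():
        u[l] += M.get((i, j), 0.0) * fraction
    return u

def f(x):
    """Stand-in convex penalty (an assumption; the slides do not define f)."""
    return x * x

u = link_loads(M, P, c)
utilization = {l: u[l] / c[l] for l in c}
max_utilization = max(utilization.values())          # first objective
total_penalty = sum(f(x) for x in utilization.values())  # second objective
print(utilization, max_utilization, total_penalty)
```

Here link b-d carries half of the a→d demand plus all of the b→d demand, so it is the most heavily utilized link.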

Complexity of the Optimization Problem
• NP-complete optimization problem
  – No efficient algorithm to find the link weights
  – Even for the simple convex objective functions
• Why can’t we just do multi-commodity flow?
  – E.g., solve the multi-commodity flow problem…
  – … and the link weights pop out as the dual
  – Because IP routers cannot split arbitrarily over ties
• What are the implications?
  – Have to resort to searching through weight settings

Optimization Based on Local Search
• Start with an initial setting of the link weights
  – E.g., same integer weight on every link
  – E.g., weights inversely proportional to link capacity
  – E.g., existing weights in the operational network
• Compute the objective function
  – Compute the all-pairs shortest paths to get $P_{i,j,l}$
  – Apply the traffic matrix $M_{i,j}$ to get link loads $u_l$
  – Evaluate the objective function from the $u_l / c_l$
• Generate a new setting of the link weights, and repeat
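The loop above can be sketched end to end on a toy network. This is an illustrative simplification, not the tool used in practice: it recomputes shortest paths from scratch for each candidate, changes one random weight at a time, accepts only strict improvements in maximum utilization, and uses a hypothetical four-node topology with made-up demands.

```python
import heapq
import random

def distances_to(dest, nodes, links, w):
    """Dijkstra on reversed links: shortest distance from every node to dest."""
    dist = {n: float("inf") for n in nodes}
    dist[dest] = 0
    heap = [(0, dest)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist[v]:
            continue
        for (a, b) in links:                      # simple scan; fine for a sketch
            if b == v and d + w[(a, b)] < dist[a]:
                dist[a] = d + w[(a, b)]
                heapq.heappush(heap, (dist[a], a))
    return dist

def link_loads(nodes, links, w, demand):
    """Route all demands on shortest paths, splitting evenly over ties."""
    load = {l: 0.0 for l in links}
    for dest in nodes:
        dist = distances_to(dest, nodes, links, w)
        inflow = {n: demand.get((n, dest), 0.0) for n in nodes}
        # Farthest nodes first, so all upstream traffic has arrived before a
        # node forwards it (weights are positive, so distance strictly drops).
        for n in sorted(nodes, key=lambda node: -dist[node]):
            if n == dest or inflow[n] == 0 or dist[n] == float("inf"):
                continue
            next_links = [(a, b) for (a, b) in links
                          if a == n and w[(a, b)] + dist[b] == dist[n]]
            for l in next_links:
                share = inflow[n] / len(next_links)
                load[l] += share
                inflow[l[1]] += share
    return load

def max_utilization(nodes, links, cap, w, demand):
    load = link_loads(nodes, links, w, demand)
    return max(load[l] / cap[l] for l in links)

def local_search(nodes, links, cap, demand, w0, iterations=200, max_weight=20, seed=0):
    """Change one weight at a time and keep strict improvements."""
    rng = random.Random(seed)
    best_w = dict(w0)
    best = max_utilization(nodes, links, cap, best_w, demand)
    seen = {tuple(sorted(best_w.items()))}        # weight settings already evaluated
    for _ in range(iterations):
        cand = dict(best_w)
        cand[rng.choice(links)] = rng.randint(1, max_weight)
        key = tuple(sorted(cand.items()))
        if key in seen:
            continue
        seen.add(key)
        score = max_utilization(nodes, links, cap, cand, demand)
        if score < best:
            best_w, best = cand, score
    return best_w, best

# Hypothetical topology: two parallel two-hop paths from A to D.
nodes = ["A", "B", "C", "D"]
links = [("A", "B"), ("B", "D"), ("A", "C"), ("C", "D")]
cap = {l: 10.0 for l in links}
demand = {("A", "D"): 10.0, ("B", "D"): 6.0}
unit = {l: 1 for l in links}

start = max_utilization(nodes, links, cap, unit, demand)
best_w, best = local_search(nodes, links, cap, demand, unit)
print(start, best, best_w)
```

With unit weights, the A→D demand splits evenly and overloads link B→D, which also carries the B→D demand. Raising a single weight on the path through B moves all A→D traffic onto the path through C, which lowers the maximum utilization.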

Making the Search Efficient
• Avoid repeating the same weight setting
  – Keep track of past values of the weight setting
  – … or keep a small signature (e.g., a hash) of past values
  – Do not evaluate a weight setting if signatures match
• Avoid computing the shortest paths from scratch
  – Explore weight settings that change just one weight
  – Apply fast incremental shortest-path algorithms
• Limit the number of unique values of link weights
  – Do not explore all $2^{16}$ possible values for each weight
• Stop early, before exploring the whole search space
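One way to keep a small signature of past weight settings is to hash a canonical encoding of the weights. The encoding and the truncation to 16 hex digits below are assumptions for illustration; a truncated hash can collide, in which case an unexplored setting would be skipped.

```python
import hashlib

def signature(weights):
    """Short digest of a weight setting, independent of dictionary order."""
    canonical = ",".join(f"{link}={w}" for link, w in sorted(weights.items()))
    return hashlib.sha1(canonical.encode()).hexdigest()[:16]

seen = set()

def already_evaluated(weights):
    """Record the setting's signature; return True if it was seen before."""
    sig = signature(weights)
    if sig in seen:
        return True
    seen.add(sig)
    return False

# Hypothetical weight settings keyed by link name.
print(already_evaluated({"a-b": 1, "b-d": 3}))   # first time this setting is seen
print(already_evaluated({"b-d": 3, "a-b": 1}))   # same setting, different key order
```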

Incorporating Operational Realities
• Minimize number of changes to the network
  – Changing just 1 or 2 link weights is often enough
• Tolerate failure of network equipment
  – Weight settings usually remain good after failure
  – … or can be fixed by changing one or two weights
• Limit dependence on measurement accuracy
  – Good weights remain good, despite random noise
• Limit frequency of changes to the weights
  – Joint optimization for day and night traffic matrices

Application to AT&T’s Backbone Network
• Performance of the optimized weights
  – Search finds a good solution within a few minutes
  – Much better than link capacity or physical distance
  – Competitive with multi-commodity flow solution
• How AT&T changes the link weights
  – Maintenance done every night from midnight to 6am
  – Predict effects of removing link(s) from the network
  – Reoptimize the link weights to avoid congestion
  – Configure new weights before disabling equipment

Example from My Visit to AT&T’s Operations Center
• Amtrak repairing/moving part of the train track
  – Need to move some of the fiber optic cables
  – Or, heightened risk of the cables being cut
  – Amtrak notifies us of the time the work will be done
• AT&T engineers model the effects
  – Determine which IP links go over the affected fiber
  – Pretend the network no longer has these links
  – Evaluate the new shortest paths and traffic flow
  – Identify whether link loads will be too high

Example Continued
• If load will be too high
  – Reoptimize the weights on the remaining links
  – Schedule the time for the new weights to be configured
  – Roll back to the old weight setting after Amtrak is done
• Same process applied to other cases
  – Assessing the network’s risk to possible failures
  – Planning for maintenance of existing equipment
  – Adapting the link weights to installation of new links
  – Adapting the link weights in response to traffic shifts

Conclusion
• IP networks do not adapt on their own
  – Routers compute shortest paths based on static weights
• Service providers need to adapt the weights
  – Due to failures, congestion, or planned maintenance
• Leads to interesting optimization problems
  – Optimize link weights based on topology and traffic
• Optimization problem is NP-complete
  – Forces the use of efficient local-search techniques
• Results of the local search are good
  – Near-optimal solutions that minimize disruptions

Ongoing Work
• Robust link-weight assignments
  – Link/node failures
  – Range of traffic matrices
• More complex routing models
  – Hot-potato routing
  – BGP routing policies
• Interaction between ASes
  – Inter-AS negotiation for joint optimization
  – Grappling with scalability and trust issues
• Optimization as a first-class citizen
  – Protocols that lead to tractable optimization problems
  – Centralized computation of the forwarding tables

Extra Material: Computing the Traffic Matrix

Computing the Traffic Matrix $M_{i,j}$
• Hard to measure the traffic matrix
  – IP networks transmit data as individual packets
  – Routers do not keep traffic statistics, except link utilization on (say) a five-minute time scale
• Need to infer the traffic matrix $M_{i,j}$ from
  – Current topology $G(R, L)$
  – Current routing $P_{i,j,l}$
  – Current link load $u_l$
  – Link capacity $c_l$

Inference: Network Tomography
• From link counts to the traffic matrix
[Figure: sources and destinations joined by links carrying 4 Mbps, 3 Mbps, and 5 Mbps]

Tomography: Formalizing the Problem
• Ingress-egress pairs
  – $p$ is an ingress-egress pair of nodes $(i, j)$
  – $x_p$ is the (unknown) traffic volume for this pair, i.e. $M_{i,j}$
• Routing
  – $P_{l,p}$ is proportion of $p$’s traffic that traverses $l$
• Links in the network
  – $l$ is a unidirectional edge
  – $u_l$ is the observed traffic volume on this link
• Relationship: $u = Px$ (work backwards to get $x$)

Tomography: One Observation Not Enough
• Linear system of $n$ nodes is underdetermined
  – Number of links $e$ is around $O(n)$
  – Number of ingress-egress pairs $c$ is $O(n^2)$
  – Dimension of solution sub-space at least $c - e$
• Multiple observations are needed
  – $k$ independent observations (over time)
  – Stochastic model with Poisson iid counts
  – Maximum likelihood estimation to infer matrix
• Doesn’t work all that well in practice…
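A tiny instance shows why one set of link counts does not pin down the traffic matrix. The topology is hypothetical: two sources and two destinations attached to a single router R, so every ingress-egress pair crosses one ingress link and one egress link.

```python
# Hypothetical star topology around a router R (not from the slides).
pairs = [("S1", "D1"), ("S1", "D2"), ("S2", "D1"), ("S2", "D2")]
links = ["S1->R", "S2->R", "R->D1", "R->D2"]

# Routing matrix P: one row per link, one column per pair, 1 if the pair uses the link.
P = [
    [1, 1, 0, 0],   # S1->R carries both pairs that start at S1
    [0, 0, 1, 1],   # S2->R carries both pairs that start at S2
    [1, 0, 1, 0],   # R->D1 carries both pairs that end at D1
    [0, 1, 0, 1],   # R->D2 carries both pairs that end at D2
]

def link_counts(P, x):
    """u = P x: the load each link would carry under traffic volumes x."""
    return [sum(p * v for p, v in zip(row, x)) for row in P]

x_first = [1, 2, 3, 4]    # one candidate traffic matrix (per-pair volumes)
x_second = [2, 1, 2, 5]   # a different candidate

print(link_counts(P, x_first))
print(link_counts(P, x_second))
```

Both candidates produce identical link counts, so the observed loads alone cannot distinguish between them.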

Approach Used at AT&T: Tomo-gravity
• Gravitational assumption
  – Ingress point $a$ has traffic $v^{i}_{a}$
  – Egress point $b$ has traffic $v^{e}_{b}$
  – Pair $(a, b)$ has traffic proportional to $v^{i}_{a} \cdot v^{e}_{b}$
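The gravitational assumption can be sketched in a few lines. The normalization by total egress traffic is an assumption (the slide only says "proportional to"), chosen so that each ingress point's estimates sum to its measured total; the ingress and egress volumes are made up.

```python
def gravity_estimate(ingress, egress):
    """Estimate traffic for each (a, b) pair as ingress[a] * egress[b] / total.

    Dividing by the total egress traffic is an assumed normalization.
    """
    total = sum(egress.values())
    return {(a, b): ingress[a] * egress[b] / total
            for a in ingress for b in egress}

# Hypothetical measured totals at the edge of the network.
ingress = {"S1": 3.0, "S2": 7.0}
egress = {"D1": 4.0, "D2": 6.0}

estimate = gravity_estimate(ingress, egress)
for pair, volume in sorted(estimate.items()):
    print(pair, round(volume, 2))
```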

Approach Used at AT&T: Tomo-gravity
• Problem with gravity model
  – Gravity model ignores the load on the inside links
  – Gravity assumption isn’t always 100% correct
  – Resulting traffic matrix might not satisfy the link loads
• Combining the two techniques
  – Gravity: find a traffic matrix using the gravity model
  – Tomography: find the family of traffic matrices consistent with all link load statistics
  – Tomo-gravity: find the tomography solution that is closest to the output of the gravity model
• Works extremely well (and fast) in practice
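One simple way to make "closest" precise is unweighted least squares. This is an assumed formalization for illustration; the slides do not state the distance measure, and the deployed method may weight the terms differently. Writing $x_g$ for the gravity-model estimate and ignoring the constraint $x \ge 0$:

```latex
\hat{x} \;=\; \arg\min_{x}\ \lVert x - x_g \rVert_2^2
\quad \text{subject to} \quad P x = u
\qquad\Longrightarrow\qquad
\hat{x} \;=\; x_g + P^{+}\,(u - P x_g)
```

Here $P^{+}$ is the Moore-Penrose pseudoinverse of $P$, and the closed form holds when the link-load equations $Px = u$ are consistent. The correction term $P^{+}(u - P x_g)$ vanishes when the gravity estimate already reproduces the observed link loads.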

Conclusions
• Managing IP networks is challenging
  – Routers don’t adapt on their own to congestion
  – Routers don’t reveal much information about traffic
• Measurement provides a network-wide view
  – Topology
  – Traffic matrix
• Optimization enables the network to adapt
  – Inferring the traffic matrix from the link loads
  – Optimizing the link weights based on the traffic matrix