What to Change, and Where? Add another layer about network routing –Routing functionality in overlay networks Change the routing protocols –To improve scalability, security, convergence, … Change the division of functionality –Data, control, and management planes Change the division of responsibility –End users, third parties, and service providers ???
Removing Routing from Routers: Routing Control Platform, Routing as a Service, 4D Control Plane, Ethane, …
Network Operators Network-wide views –Network topology (e.g., routers, links) –Mapping to lower-level equipment –Traffic matrix Network-level objectives –Load balancing –Survivability –Reachability –Security Direct control –Explicit configuration of data-plane mechanisms
What Should Routers Do? Forward packets: yes –Must be done at high speed –… in line-card hardware on fast routers –So, needs to be done on the routers Collect measurement data: yes –Traffic statistics –Topology information Compute routes: no??? –Distributed computation of forwarding tables –Doesn’t inherently need to run on the routers
Reasons to Remove Routing From Routers Routing is hard to do in a distributed fashion –Beyond single-path and/or shortest-path routing Difficult to make load-sensitive routing stable –Over-reacting to out-of-date information Poor visibility to drive good decisions –Incomplete local views of topology and load Not flexible enough for end users –Cannot easily select customized routes Difficult to extend over time –Hard-coded into the underlying routers
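The instability of load-sensitive routing can be seen in a toy model. The sketch below is illustrative only: it assumes two parallel links named A and B (hypothetical), and assumes that every source reacts to the same stale load report by moving all of its traffic to the link that looked less loaded.

```python
def simulate_load_sensitive_routing(total_traffic, rounds):
    """Return the (load_a, load_b) history for two parallel links."""
    load = {"A": total_traffic, "B": 0.0}   # all traffic starts on link A
    history = [(load["A"], load["B"])]
    for _ in range(rounds):
        # Every source sees the same out-of-date measurement and moves
        # all of its traffic to the link that looked less loaded.
        target = "A" if load["A"] < load["B"] else "B"
        load = {"A": 0.0, "B": 0.0}
        load[target] = total_traffic
        history.append((load["A"], load["B"]))
    return history


history = simulate_load_sensitive_routing(total_traffic=100.0, rounds=4)
for step, (a, b) in enumerate(history):
    print(f"round {step}: link A = {a:5.1f}, link B = {b:5.1f}")
```

In this model the traffic flips between the two links every round and never settles, which is the over-reaction the slide describes.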
Routing Control Platform Goal: Move beyond today’s artifacts, while remaining compatible with the legacy routers RCP computes routes for the routers –Network-wide visibility and control Backwards compatibility –RCP speaks to routers using BGP protocol
Example Services Selective denial-of-service attack blackholing –Identify entry point and victim of attack –Drop offending traffic at the entry point Planned maintenance dryout –Drain traffic off of an edge router –Before bringing it down for maintenance Flexible egress point selection –Multiple ways to reach the same destination –Giving customers control over the decision Enhanced interdomain routing security –Anomaly detection or security protocols
Routing As a Service Goal: third parties pick end-to-end paths for clients to satisfy diverse user objectives Forwarding infrastructure –Basic routing (e.g., default routing) –Primitives for inserting routes Route selector –Aggregates network information –Selects routes on behalf of clients –Competes with other selectors for customers End host –Queries route selector to set up paths
Feasibility Fast reaction to failures –Routers are closer to the failures –Can a service react quickly enough? Scalability with network size –State and computation grow with the topology –Can a service manage a large network? Reliability? –Service is now a point of failure –Is simple replication enough? Security? –Service is now a natural point of attack –Easier (or harder) to protect than the routers?
Routing Change: Before and After Before the change –AS 1 uses (1,0) –AS 2 uses (2,0) –AS 3 uses (3,1,0) After the change –AS 1 uses (1,2,0) –AS 2 uses (2,0) –AS 3 uses (3,2,0)
Routing Change: Path Exploration AS 1 –Delete the route (1,0) –Switch to next route (1,2,0) –Send route (1,2,0) to AS 3 AS 3 –Sees (1,2,0) replace (1,0) –Compares to route (2,0) –Switches to using AS 2
Routing Change: Path Exploration Initial situation –Destination 0 is alive –All ASes use direct path When destination dies –All ASes lose direct path –All switch to longer paths –Eventually withdrawn E.g., AS 2 –(2,0) → (2,1,0) –(2,1,0) → (2,3,0) –(2,3,0) → (2,1,3,0) –(2,1,3,0) → null Candidate paths –AS 1: (1,0), (1,2,0), (1,3,0) –AS 2: (2,0), (2,1,0), (2,3,0), (2,1,3,0) –AS 3: (3,0), (3,1,0), (3,2,0)
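The sequence at AS 2 can be replayed with a small sketch. This is not a full BGP implementation: it assumes a simplified decision process (shortest AS path, ties broken by lowest next-hop AS number) and a hypothetical arrival order for the updates from AS 1 and AS 3. The function names `best_path` and `replay` are made up for illustration.

```python
def best_path(me, rib_in):
    """Pick the shortest loop-free path; break ties by lowest next-hop AS."""
    candidates = [
        (me,) + path
        for path in rib_in.values()
        if path is not None and me not in path   # path-vector loop check
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda p: (len(p), p[1]))


def replay(me, rib_in, updates):
    """Apply neighbor updates one at a time and record each new best path."""
    rib_in = dict(rib_in)
    history = [best_path(me, rib_in)]
    for neighbor, path in updates:
        rib_in[neighbor] = path                  # None means "withdrawn"
        new_best = best_path(me, rib_in)
        if new_best != history[-1]:
            history.append(new_best)
    return history


# AS 2's view while destination 0 is alive: one route per neighbor.
initial = {0: (0,), 1: (1, 0), 3: (3, 0)}

# Assumed order in which updates reach AS 2 after destination 0 dies.
updates = [
    (0, None),        # AS 2 loses its direct route to 0
    (1, (1, 3, 0)),   # AS 1 lost (1,0) and now routes via AS 3
    (3, None),        # AS 3 withdraws its route
    (1, None),        # AS 1 withdraws its route as well
]

history = replay(2, initial, updates)
print(history)
```

Under these assumptions AS 2 steps through (2,0), (2,1,0), (2,3,0), and (2,1,3,0) before ending with no route, matching the example on the slide. A different message ordering would produce a different exploration sequence.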
Convergence Overhead and Delay Path exploration is expensive –Large number of possible paths –Might have to explore (nearly) all of them Much slower than link-state routing –Simply floods the topology –And routers compute shortest path Any way to reduce BGP convergence time? –Avoid exploring paths with the same failure? –Hybrids of path vector and link state?
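To get a feel for "large number of possible paths", the sketch below counts the loop-free paths from one AS to a destination. It assumes a worst case that the slides do not state: a full mesh of ASes with no policy filtering. Real topologies and policies allow far fewer paths. The function name is hypothetical.

```python
import math


def simple_paths_in_full_mesh(n_ases):
    """Count loop-free paths from one AS to a destination in a full mesh.

    n_ases includes the source and the destination. A path may pass
    through any ordered selection of k of the remaining ASes.
    """
    others = n_ases - 2
    return sum(math.perm(others, k) for k in range(others + 1))


for n in (4, 5, 10):
    print(n, simple_paths_in_full_mesh(n))
```

With 4 ASes there are 5 such paths per source, and the count grows factorially with the number of ASes. A link-state router, by contrast, runs one shortest-path computation over the flooded topology.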
HLP: Hybrid Link-state/Path-vector Assume hierarchical AS structure –Provider-customer relationships dominate –And some peer-peer edges –(Are we willing to cook in these assumptions?) Hybrid of link state and path vector –Link state within a sub-tree –Path vector across peer-peer links Route on AS numbers –Rather than prefixes
Add New Features in an Overlay: Resilient Overlay Networks
RON: Resilient Overlay Networks Premise: by building an application-level overlay network, hosts can increase the performance and reliability of routing Example: a two-hop (application-level) Berkeley-to-Princeton route through an application-layer router at Yale http://nms.csail.mit.edu/ron/
RON Circumvents Policy Restrictions IP routing depends on AS routing policies –But hosts may pick paths that circumvent policies [Figure: example with nodes labeled "USLEC", "PU", "Patriot", "ISP", "me", and "My home computer"]
RON Adapts to Network Conditions Start experiencing bad performance –Then, start forwarding through intermediate host [Figure: hosts A, B, and C]
RON Customizes to Applications VoIP traffic: low-latency path Bulk transfer: high-bandwidth path [Figure: hosts A, B, and C, with separate voice and bulk-transfer paths]
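Per-application path choice might be sketched as follows. The link measurements are invented for illustration. The sketch assumes that latency adds up along a path while bandwidth is limited by the slowest hop.

```python
def hops(path):
    """Split a path such as ("A", "C", "B") into its consecutive links."""
    return list(zip(path, path[1:]))


def latency_ms(path, links):
    """Latency is additive across the hops of a path."""
    return sum(links[h]["latency_ms"] for h in hops(path))


def bandwidth_mbps(path, links):
    """Bandwidth is limited by the bottleneck hop."""
    return min(links[h]["bandwidth_mbps"] for h in hops(path))


# Hypothetical measurements for the overlay links among hosts A, B, and C.
links = {
    ("A", "B"): {"latency_ms": 30, "bandwidth_mbps": 10},
    ("A", "C"): {"latency_ms": 25, "bandwidth_mbps": 100},
    ("C", "B"): {"latency_ms": 20, "bandwidth_mbps": 80},
}
candidates = [("A", "B"), ("A", "C", "B")]

voip_path = min(candidates, key=lambda p: latency_ms(p, links))
bulk_path = max(candidates, key=lambda p: bandwidth_mbps(p, links))
print("VoIP:", voip_path)
print("Bulk:", bulk_path)
```

With these numbers the voice traffic stays on the direct A-to-B path, while the bulk transfer goes through C.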
How Does RON Work? Keeping it small to avoid scaling problems –A few friends who want better service –Just for their communication with each other –E.g., VoIP, gaming, collaborative work, etc. Send probes between each pair of hosts
How Does RON Work? Exchange the results of the probes –Each host shares results with every other host –Essentially running a link-state protocol! –So, every host knows the performance properties Forward through intermediate host when needed
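The probe-and-exchange idea above can be sketched in a few lines. This is not RON's actual code: the latency values, the function names, and the choice of latency as the only metric are assumptions for illustration.

```python
import math


def merge_reports(reports):
    """Combine each host's probe results into one full table (link-state style)."""
    table = {}
    for src, row in reports.items():
        for dst, latency in row.items():
            table[(src, dst)] = latency
    return table


def best_route(src, dst, hosts, table):
    """Return (path, latency) for the best direct or one-intermediate path."""
    best = ((src, dst), table.get((src, dst), math.inf))
    for mid in hosts:
        if mid in (src, dst):
            continue
        total = table.get((src, mid), math.inf) + table.get((mid, dst), math.inf)
        if total < best[1]:
            best = ((src, mid, dst), total)
    return best


# Hypothetical probe results in milliseconds, as reported by each host.
reports = {
    "A": {"B": 250.0, "C": 40.0},   # A's direct path to B is performing badly
    "B": {"A": 250.0, "C": 35.0},
    "C": {"A": 40.0, "B": 35.0},
}
hosts = sorted(reports)
table = merge_reports(reports)

path, latency = best_route("A", "B", hosts, table)
print(path, latency)
```

Because every host holds the same merged table, each one can decide locally whether to forward through an intermediate host. Here A reaches B through C, while traffic from A to C stays direct.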
RON Works in Practice Faster reaction to failure –RON reacts in a few seconds –BGP sometimes takes a few minutes Single-hop indirect routing –No need to go through many intermediate hosts –One extra hop circumvents the problems Better end-to-end paths –Circumventing routing policy restrictions –Sometimes the RON paths are actually shorter
RON Limited to Small Deployments Extra latency through intermediate hops –Software delays for packet forwarding –Propagation delay across the access link Overhead on the intermediate node –Imposing CPU and I/O load on the host –Consuming bandwidth on the access link Overhead for probing the virtual links –Bandwidth consumed by frequent probes –Trade-off probe overhead vs. detection speed Possibility of causing instability –Moving traffic in response to poor performance –May lead to congestion on the new paths
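The probing overhead can be estimated with simple arithmetic. In the sketch below, the probe size and probe interval are hypothetical parameters, not RON's actual settings. Each host is assumed to probe every other host once per interval.

```python
def probe_pairs(n_hosts):
    """Number of directed host pairs that must be probed."""
    return n_hosts * (n_hosts - 1)


def total_probe_bps(n_hosts, probe_bytes, interval_s):
    """Aggregate probe traffic in bits per second across the overlay."""
    return probe_pairs(n_hosts) * probe_bytes * 8 / interval_s


# Assumed values: 64-byte probes sent every 10 seconds.
for n in (5, 50, 500):
    print(n, probe_pairs(n), total_probe_bps(n, probe_bytes=64, interval_s=10.0))
```

The number of probed pairs grows quadratically with the number of hosts, and halving the probe interval to detect failures faster doubles the traffic. That is the trade-off between probe overhead and detection speed.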
Future Routing Architecture Who is in charge? –Network administrators? –End hosts? –Third-party overlays? –Third-party routing providers? Build on top of today’s network? –New AS-level control plane? –Overlays on top of existing Internet? Assume (restricted) economic models? –To improve scalability and convergence?