1 Route Control Platform Making the Network Act Like One Big Router Jennifer Rexford Princeton University
1 Route Control Platform Making the Network Act Like One Big Router Jennifer Rexford Princeton University http://www.cs.princeton.edu/~jrex http://www.cs.princeton.edu/~jrex/papers/rcp.pdf
2 Outline Internet architecture –Complexity of network management Moving control from routers to servers –Reducing complexity and increasing flexibility Traffic engineering example –Today’s approach vs. the RCP Making the RCP real –Deployability, scalability, and reliability Example applications –Security, maintenance, and customer control
3 Internet Architecture The Internet is –Decentralized: loose confederation of peers –Self-configuring: no global registry of topology –Stateless: limited information in the routers –Connectionless: no fixed connection between hosts These attributes contribute –To the success of the Internet –To the rapid growth of the Internet –… and the difficulty of controlling the Internet! [Figure: sender and receiver]
4 A Well-Studied Architecture Question Smart hosts, dumb network: network moves IP packets between hosts; services implemented on hosts; keep state at the edges. How to partition function vertically? [Figure labels: Edge, Network, IP]
5 Inside a Single Network Data Plane –Distributed routers –Forwarding, filtering, queueing Control Plane –Multiple routing processes on each router –Each router with different configuration program –Huge number of control knobs: metrics, ACLs, policy Management Plane –Figure out what is happening in network –Decide how to change it [Figure labels: shell scripts, traffic engineering, databases, planning tools, configs, SNMP, netflow, modems, OSPF, BGP, link metrics, FIB, routing policies, packet filters]
6 Inside a Single Network Data Plane –Distributed routers –Forwarding, filtering, queueing –Based on FIB or labels Control Plane –Multiple routing processes on each router –Each router with different configuration program –Huge number of control knobs: metrics, ACLs, policy Management Plane –Figure out what is happening in network –Decide how to change it State everywhere! –Dynamic state in forwarding tables –Configured state in settings, policies, packet filters –Programmed state in magic constants, timers –Many dependencies between bits of state –State updated in uncoordinated, decentralized way! [Figure labels: shell scripts, traffic engineering, databases, planning tools, configs, SNMP, netflow, modems, OSPF, BGP, link metrics, FIB, routing policies, packet filters]
7 How Did We Get in This Mess? Initial IP architecture –Bundled packet handling and control logic –Distributed the functions across routers –Didn’t anticipate the need for management Rapid growth in features –Sudden popularity and growth of the Internet –Increasing demands for new functionality –Incremental extensions to protocols & router software Challenges of distributed algorithms –Some functions are hard to do in a distributed fashion
8 What Does the Operator Want? Network-wide views –Network topology –Mapping to lower-level equipment –Traffic matrix Network-level objectives –Load balancing –Survivability –Reachability –Security Direct control –Explicit configuration of data-plane mechanisms
9 What Architecture Would Achieve This? Management plane → Decision plane –Responsible for all decision logic and state –Operates on network-wide view and objectives –Directly controls the behavior of the data plane Control plane → Discovery plane –Responsible for providing the network-wide view –Topology discovery, traffic measurement, etc. Data plane –Queues, filters, and forwards data packets –Accepts direct instruction from the decision plane
10 Example Application: Traffic Engineering Problem: Adapt routing to the traffic demands –Inputs: network topology and traffic matrix –Outputs: routing of traffic that balances load Three ways to solve the problem –Extend the control plane to adapt to load –Management plane, with today’s control plane –Decision plane
11 Interior Gateway Protocol (OSPF/IS-IS) Routers flood information to learn the topology –Determine “next hop” to reach other routers… –Compute shortest paths based on the link weights Link weights configured by the network operator [Figure: example topology with link weights 3, 2, 2, 1, 1, 3, 1, 4, 5, 3]
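The shortest-path step on this slide can be illustrated with a minimal Python sketch. It assumes each router already holds the full weighted topology (as it would after flooding) and runs Dijkstra's algorithm to derive a next hop per destination. The four-router topology, its weights, and the function name `shortest_paths` are made up for illustration and are not taken from the slide's figure.

```python
import heapq

def shortest_paths(graph, source):
    """Dijkstra over configured link weights.

    Returns (dist, next_hop): the cost to each router and the first hop
    the source would use to reach it.
    """
    dist = {source: 0}
    next_hop = {}
    heap = [(0, source, None)]  # (cost so far, router, first hop on that path)
    done = set()
    while heap:
        d, node, first = heapq.heappop(heap)
        if node in done:
            continue  # stale heap entry; a cheaper path was already settled
        done.add(node)
        if first is not None:
            next_hop[node] = first
        for neighbor, weight in graph[node].items():
            nd = d + weight
            if neighbor not in done and nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                # Paths leaving the source start at the neighbor itself.
                heapq.heappush(heap, (nd, neighbor, neighbor if first is None else first))
    return dist, next_hop

# Hypothetical topology: router -> {neighbor: operator-configured link weight}
GRAPH = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 1},
    "D": {"B": 5, "C": 1},
}

dist, next_hop = shortest_paths(GRAPH, "A")
print(dist)      # cost from A to every router
print(next_hop)  # first hop A uses for each destination
```

Because the weights are the only input the operator controls, changing a single weight can move many next hops at once, which is what the traffic-engineering slides that follow exploit.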
12 Control Plane: Let the Routers Adapt Strawman alternative: load-sensitive routing –Link metrics based on traffic load –Flood dynamic metrics as they change –Adapt automatically to changes in offered load Reasons why this is typically not done –Delay-based routing unsuccessful in the early days –Oscillation as routers adapt to out-of-date information –Most Internet transfers are very short-lived Research and standards work continues… –… but operators have to do what they can today
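13 Management Plane: Measure, Model, Control [Figure: control loop. Measure: topology/configuration and offered traffic from the operational network feed a network-wide “what if” model. Optimize: search over the model. Control: apply changes to link weights in the operational network.]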
14 Management Plane Approach Topology –Connectivity and capacity of routers and links Traffic matrix –Offered load between points in the network Link weights –Configurable parameters for routing protocol Performance objective –Balanced load, low latency, service agreements … Question: Given the topology and traffic matrix, which link weights should be used?
15 Management Plane Solution Measure –Topology: monitoring of the routing protocols –Traffic matrix: widely deployed traffic measurement Model –Representations of topology and traffic –“What-if” models of the routing protocol Optimize –Efficient local-search algorithms to find good settings –Operational experience to identify key constraints http://www.cs.princeton.edu/~jrex/papers/ieeecomm02.pdf
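The model-and-optimize loop can be sketched in a few lines of Python. This is a toy under stated assumptions, not the algorithm from the cited paper: routing is single-path (no equal-cost splitting), the objective is maximum link utilization, and the search simply tries one weight change at a time. The topology, capacities, demands, and all function names are hypothetical.

```python
import heapq

def shortest_path(weights, src, dst):
    """Return the directed links on a least-weight path from src to dst."""
    dist, prev, heap = {src: 0}, {}, [(0, src)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale entry
        for (u, v), w in weights.items():
            if u == node and d + w < dist.get(v, float("inf")):
                dist[v], prev[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    path, node = [], dst
    while node != src:
        path.append((prev[node], node))
        node = prev[node]
    return path

def max_utilization(weights, capacity, demands):
    """'What-if' model: route each demand on its shortest path, report the worst link."""
    load = {link: 0.0 for link in weights}
    for (src, dst), volume in demands.items():
        for link in shortest_path(weights, src, dst):
            load[link] += volume
    return max(load[link] / capacity[link] for link in weights)

def local_search(weights, capacity, demands, candidates=(1, 2, 3, 4, 5), rounds=10):
    """Change one link weight at a time; keep a change only if it lowers the objective."""
    best = dict(weights)
    best_cost = max_utilization(best, capacity, demands)
    for _ in range(rounds):
        improved = False
        for link in sorted(best):
            for w in candidates:
                trial = dict(best)
                trial[link] = w
                cost = max_utilization(trial, capacity, demands)
                if cost < best_cost:
                    best, best_cost, improved = trial, cost, True
        if not improved:
            break
    return best, best_cost

# Hypothetical three-router network with directed links.
WEIGHTS = {("A", "B"): 3, ("A", "C"): 1, ("C", "B"): 1}
CAPACITY = {link: 10.0 for link in WEIGHTS}
DEMANDS = {("A", "B"): 6.0, ("C", "B"): 6.0}

start_cost = max_utilization(WEIGHTS, CAPACITY, DEMANDS)
best, best_cost = local_search(WEIGHTS, CAPACITY, DEMANDS)
print(start_cost, best_cost, best)
```

In this example both demands initially share link C→B; lowering the weight of A→B moves one of them onto the direct link. Note that `max_utilization` re-implements the routers' own path computation, which is exactly the duplication the next slide lists as a limitation.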
16 This Works, But Has Some Limitations “What-if” model –Repeats the logic implemented in the control plane –Duplication of functionality, and debugging Optimization techniques –Local search because the problem is intractable –Too much computation to explore all possibilities Network effects –Link-weight changes are disruptive –Routers must converge after each change –Leads to transient packet loss and delay
17 Decision Plane Solution Measure –Topology: monitoring of the routing protocols –Traffic matrix: widely deployed traffic measurement Optimize the routing –Compute desired forwarding paths directly –Simpler than optimizing the link weights Instruct the routers –Could change one router at a time to gradually switch to the new routes –Avoid transient packet loss and delays
18 More Network-Level Objectives Survivability –Routing that can tolerate any single equipment failure –Incorporate knowledge of shared risk groups Reachability policies –Control which pairs of hosts can communicate –Install packet filters and forwarding-table entries Security –Install “blackhole” routes that drop attack traffic –Keep routing tables within router storage limits Etc.
19 Is The Decision Plane Feasible? Deployability: any path from here to there? –Must be compatible with today’s routers –Must provide incentives for deployment Speed: can it run fast enough? –Must respond quickly to network events –Needs to be as fast as a router Reliability: single point of failure? –Must be replicated to tolerate failure –Replicas must behave consistently
20 Deployability Take a lesson from Ethernet –Change anything but the message format Border Gateway Protocol (BGP) –Interdomain routing protocol for the Internet Widely implemented on existing routers Widely used, especially in backbone networks –Three main aspects of BGP Protocol: standard messages sent between routers Vendors: path-selection logic on individual routers Operators: configuration of policies for path selection –Logic and policies are complex, but messages simple
21 Deployment in a Single Network Before: conventional use of BGP in a backbone network After: RCP learns external routes and sends answers to the routers Only one AS has to change its architecture! [Figure labels, before and after: iBGP, eBGP, RCP]
22 Longer Term, Wide-Spread Deployment Represents an AS as a single logical entity –Complete view of AS’s routes –Computes routes for all routers inside an AS Exchanges routing information with other ASes –Using BGP or a new inter-AS protocol –While still using BGP to talk to the routers [Figure labels: AS 1, AS 2, AS 3, RCP, iBGP, physical peering, inter-AS protocol]
23 RCP Architecture Scalability through decomposition; reliability through replication [Figure: routers (R) and two RCP instances, each with a BGP Engine, an OSPF Viewer, and a Route Control Server; labels: Brain, Brawn]
24 Scalability: Three-Part RCP Architecture OSPF viewer –Continuous view of network topology –Passive monitoring of link-state advertisements BGP engine –Collecting BGP updates from border routers –Sending chosen routes to the routers –Lots of TCP connections, like a Web server Route Control Server –Logic for computing answers for the routers –Configuration for controlling the logic –Operates on real-time feeds from the monitors
25 Scalability: Initial Prototype Implementation platform –3.2 GHz Pentium-4 –8 GB memory –Linux 2.6.5 kernel Workload –Routing/topology changes in AT&T’s network RCP performance –Memory usage: less than 2GB –Speed, BGP changes: less than 40 msec –Speed, topology changes: 0.1-0.8 seconds System is able to keep up…
26 Reliability Replication: avoid a single point of failure –Multiple RCPs per network –Connected at different places Consistency: replicas act as one –Replicas performing the same algorithm on the same input get the same answer (eventually) –Each replica has a complete view of each partition it sees [Figure: replicas and the partitions (A, B) that each one sees]
27 Application: DDoS Blackholing Blackholing of denial-of-service attacks –Preconfigure a “null” route on each router –Identify address of attack victim (from DoS system) –RCP assigns the destination address to the null route [Figure labels: iBGP; RCP; victim 18.104.22.168; attack (detected by traffic analysis); RCP message “Use null route for 22.214.171.124/32”]
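The blackholing steps can be sketched as follows, assuming a simplified router whose table maps prefixes to next hops and uses longest-prefix match. The null next-hop address, the victim address (taken from a documentation range, not from the slide), and the helper names `blackhole_routes` and `lookup` are all hypothetical; real routers would receive these routes as iBGP updates rather than dictionary entries.

```python
import ipaddress

# Assumption: every router is preconfigured so that traffic whose next hop is
# this address is discarded (the "null" route on the slide).
NULL_NEXT_HOP = ipaddress.ip_address("192.0.2.1")

def blackhole_routes(victim, routers):
    """Routes the RCP would send: a /32 for the victim pointing at the null next hop."""
    prefix = ipaddress.ip_network(f"{victim}/32")
    return {router: (prefix, NULL_NEXT_HOP) for router in routers}

def lookup(table, dst):
    """Longest-prefix match; returns 'drop', a next-hop string, or None if no route."""
    dst = ipaddress.ip_address(dst)
    matches = [(prefix, hop) for prefix, hop in table.items() if dst in prefix]
    if not matches:
        return None
    _, hop = max(matches, key=lambda m: m[0].prefixlen)
    return "drop" if hop == NULL_NEXT_HOP else str(hop)

# Hypothetical router table before the attack is reported.
table = {ipaddress.ip_network("203.0.113.0/24"): ipaddress.ip_address("10.0.0.2")}
before = lookup(table, "203.0.113.7")

# The DoS detection system reports the victim; the RCP pushes the more specific route.
for router, (prefix, hop) in blackhole_routes("203.0.113.7", ["r1"]).items():
    table[prefix] = hop

after_victim = lookup(table, "203.0.113.7")
after_neighbor = lookup(table, "203.0.113.8")
print(before, after_victim, after_neighbor)
```

Only the victim's /32 is affected: the more specific route wins the longest-prefix match, while other addresses in the same /24 keep their normal next hop.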
28 Application: Maintenance Dry-Out Dry-out of traffic before maintenance –Plan to take a router out of service –RCP assigns routes via new egress points in advance [Figure: router r is about to undergo maintenance. Before: traffic to destination d uses r. RCP sends “Use route via s for d” over iBGP. After: traffic to d uses s.]
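A minimal sketch of the dry-out decision, assuming the RCP knows each ingress router's IGP distance to every egress and normally picks the closest one. The router names, distances, and the `choose_egress` helper are hypothetical.

```python
def choose_egress(ingress, igp_distance, egresses, draining=frozenset()):
    """Pick the closest egress (by IGP distance) that is not being drained."""
    usable = [e for e in egresses if e not in draining]
    if not usable:
        raise ValueError("no egress left for this destination")
    # Ties are broken by router name so every replica picks the same answer.
    return min(usable, key=lambda e: (igp_distance[ingress][e], e))

# Hypothetical IGP distances from ingress routers x and y to egresses r and s.
IGP = {"x": {"r": 1, "s": 4}, "y": {"r": 3, "s": 2}}

before = {i: choose_egress(i, IGP, ["r", "s"]) for i in IGP}
after = {i: choose_egress(i, IGP, ["r", "s"], draining={"r"}) for i in IGP}
print(before)  # normal operation
print(after)   # r marked for maintenance
```

Marking r as draining ahead of the maintenance window shifts its traffic to s while r is still forwarding, so the later shutdown finds no traffic to disrupt.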
29 Application: Customized Egress Selection Customer-controlled selection of egress points –Customer with two data centers and many sites –Customer wants to control the load balance –RCP customization (not simply closest egress) [Figure: RCP sends “Use route via r for d” toward Site #1 and “Use route via s for d” toward Site #2 over iBGP; egress routers r and s, destination d]
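One way to picture the customization is as a per-site override layered on top of the closest-egress default. The sketch below assumes exactly that; the site names, distances, policy format, and the `assign_egress` helper are hypothetical and say nothing about how the RCP prototype actually encodes customer preferences.

```python
def assign_egress(sites, igp_distance, egresses, customer_policy):
    """Per-site egress: the customer's explicit choice if given, else the closest egress."""
    chosen = {}
    for site in sites:
        if site in customer_policy:
            egress = customer_policy[site]
            if egress not in egresses:
                raise ValueError(f"{egress} is not an egress for this destination")
        else:
            egress = min(egresses, key=lambda e: (igp_distance[site][e], e))
        chosen[site] = egress
    return chosen

# Hypothetical distances: every site happens to be closer to egress r.
IGP = {
    "site1": {"r": 1, "s": 5},
    "site2": {"r": 2, "s": 3},
    "site3": {"r": 1, "s": 4},
}

default = assign_egress(IGP, IGP, ["r", "s"], customer_policy={})
custom = assign_egress(IGP, IGP, ["r", "s"], customer_policy={"site2": "s"})
print(default)  # closest egress everywhere
print(custom)   # customer moves site2 to the data center behind s
```

With closest-egress routing alone, all three sites would load the data center behind r; the override lets the customer shift one site without touching link weights or router configuration.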
30 Conclusion Managing IP networks is too hard –IP architecture not designed for management –Complex, distributed operation of routers Reducing complexity is the key –Network-wide views & objectives, and direct control –Removing control logic and state from the routers New architecture is feasible –RCP is deployable, scalable, reliable –RCP solves important operations problems