
Rethinking Routers in the Age of Virtualization Jennifer Rexford Princeton University


1 Rethinking Routers in the Age of Virtualization Jennifer Rexford Princeton University http://www.cs.princeton.edu/~jrex/virtual.html

2 Traditional View of a Router
A big, physical device…
– Processors
– Multiple links
– Switching fabric
… that directs Internet traffic
– Connects to other routers
– Computes routes
– Forwards packets

3 Times Are Changing

4 Backbone Links are Virtual
Flexible underlying transport network
– Layer-3 links are multi-hop paths at layer 2
[Figure labels: Chicago, New York, Washington D.C.]

5 Routing Separate From Forwarding
Separation of functionality
– Control plane: computes paths
– Forwarding plane: forwards packets
[Figure labels: Switching Fabric, Processor, Line card, data plane, control plane]

6 Multiple Virtual Routers
Multiple virtual routers on same physical one
– Virtual Private Networks (VPNs)
– Router consolidation for smaller footprint
[Figure labels: Switching Fabric, data plane, control plane]

7 Capitalizing on Virtualization
Simplify network management
– Hide planned changes in the physical topology
Improve router reliability
– Survive bugs in complex routing software
Deploy new value-added services
– Customized protocols in virtual networks
Enable new network business models
– Separate service providers from the infrastructure
What should the router “hypervisor” look like?

8 VROOM: Virtual Routers On the Move With Yi Wang, Eric Keller, Brian Biskeborn, and Kobus van der Merwe

9 The Two Notions of “Router”
IP-layer logical functionality, and physical equipment
[Figure labels: Logical (IP layer), Physical]

10 Tight Coupling of Physical & Logical
Root of many network-management challenges (and “point solutions”)
[Figure labels: Logical (IP layer), Physical]

11 VROOM: Breaking the Coupling
Re-mapping logical node to another physical node
VROOM enables this re-mapping of logical to physical through virtual router migration.
[Figure labels: Logical (IP layer), Physical]

12 Case 1: Planned Maintenance
NO reconfiguration of VRs, NO reconvergence
[Figure labels: A, B, VR-1]


15 Case 2: Service Deployment/Evolution
Move (logical) router to more powerful hardware

16 Case 2: Service Deployment/Evolution
VROOM guarantees seamless service to existing customers during the migration

17 Case 3: Power Savings
Hundreds of millions of dollars per year in electricity bills

18 Case 3: Power Savings
Contract and expand the physical network according to the traffic volume


21 Virtual Router Migration: Challenges
1. Migrate an entire virtual router instance
   All control-plane processes & data-plane states

22 Virtual Router Migration: Challenges
1. Migrate an entire virtual router instance
2. Minimize disruption
   Data plane: millions of packets/sec on a 10 Gbps link
   Control plane: less strict (with routing-message retransmission)
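
To put the "millions of packets/sec" figure in context, here is a back-of-the-envelope calculation. It is only a sketch: the slide gives the link speed but not the framing, so the 20 bytes of per-frame wire overhead (Ethernet preamble plus inter-frame gap) and the two frame sizes are assumptions chosen for illustration.

```python
LINK_BPS = 10e9            # 10 Gbps link, as stated on the slide
WIRE_OVERHEAD_BYTES = 20   # assumption: 8-byte preamble/SFD + 12-byte inter-frame gap


def packets_per_second(frame_bytes: int) -> float:
    """Maximum frame rate on the link for a given Ethernet frame size."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return LINK_BPS / bits_on_wire


def packets_at_risk(frame_bytes: int, outage_seconds: float) -> float:
    """Packets that would arrive during a data-plane outage of the given length."""
    return packets_per_second(frame_bytes) * outage_seconds


for size in (64, 1518):  # assumed minimum and maximum standard frame sizes
    print(f"{size:>5}-byte frames: {packets_per_second(size) / 1e6:.2f} Mpps, "
          f"{packets_at_risk(size, 0.001):,.0f} packets per millisecond of outage")
```

Even a one-millisecond gap in forwarding would put thousands of packets at risk under these assumptions, which is why the data plane is the stricter of the two constraints.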

23 Virtual Router Migration: Challenges
1. Migrate an entire virtual router instance
2. Minimize disruption
3. Link migration


25 VROOM Architecture
[Figure labels: Dynamic Interface Binding, Data-Plane Hypervisor]

26 VROOM’s Migration Process
Key idea: separate the migration of control and data planes
1. Migrate the control plane
2. Clone the data plane
3. Migrate the links

27 Control-Plane Migration
Leverage virtual server migration techniques
Router image
– Binaries, configuration files, etc.

28 Control-Plane Migration
Leverage virtual server migration techniques
Router image
Memory
– 1st stage: iterative pre-copy
– 2nd stage: stall-and-copy (when the control plane is “frozen”)

29 Control-Plane Migration
Leverage virtual server migration techniques
Router image
Memory
[Figure labels: Physical router A, Physical router B, DP, CP]
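
The two memory-migration stages named on the control-plane slides can be illustrated with a small simulation. This is a toy model, not the OpenVZ mechanism: the function name, the page counts, the `dirty_ratio` parameter (pages re-dirtied per page copied), and the stop threshold are all assumptions made for the sketch.

```python
import random


def simulate_migration(num_pages, dirty_ratio, stop_threshold=8, max_rounds=10, seed=0):
    """Toy model of pre-copy followed by stall-and-copy.

    Returns (pages copied in each pre-copy round, pages copied while frozen).
    """
    rng = random.Random(seed)
    pending = set(range(num_pages))      # the first round must send every page
    precopy_rounds = []

    # Stage 1: iterative pre-copy. The control plane keeps running, so pages
    # are dirtied while each round is being copied and must be re-sent.
    while len(pending) > stop_threshold and len(precopy_rounds) < max_rounds:
        precopy_rounds.append(len(pending))
        writes = int(len(pending) * dirty_ratio)   # copy time scales with pages sent
        pending = {rng.randrange(num_pages) for _ in range(writes)}

    # Stage 2: stall-and-copy. The control plane is frozen, so nothing is
    # dirtied any more and the remaining pages are copied exactly once.
    return precopy_rounds, len(pending)


rounds, frozen_pages = simulate_migration(num_pages=1000, dirty_ratio=0.2)
print("pre-copy rounds:", rounds)
print("pages copied while frozen:", frozen_pages)
```

With a dirty ratio below 1 the set of pages to re-send shrinks each round, so only a small remainder has to be copied while the control plane is frozen. If the ratio were 1 or more, the loop would not converge and `max_rounds` would force the freeze.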

30 Data-Plane Cloning
Clone the data plane by repopulation
– Enable migration across different data planes
– Avoid copying duplicate information
[Figure labels: Physical router A, Physical router B, CP, DP-old, DP-new]
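
Cloning by repopulation means the new data plane is rebuilt from the control plane's routes rather than copied from the old forwarding table, so the two data planes may use different internal formats. A minimal sketch of that idea follows; the classes `DictFIB` and `TableFIB`, the `repopulate` helper, and the sample routes are hypothetical and exist only for this example.

```python
import ipaddress


class DictFIB:
    """Hypothetical software data plane: routes kept in a dictionary."""

    def __init__(self):
        self.routes = {}

    def install(self, prefix, next_hop):
        self.routes[ipaddress.ip_network(prefix)] = next_hop

    def lookup(self, addr):
        ip = ipaddress.ip_address(addr)
        matches = [net for net in self.routes if ip in net]
        if not matches:
            return None
        return self.routes[max(matches, key=lambda net: net.prefixlen)]


class TableFIB:
    """Hypothetical table-based data plane: entries ordered longest prefix first."""

    def __init__(self):
        self.entries = []

    def install(self, prefix, next_hop):
        self.entries.append((ipaddress.ip_network(prefix), next_hop))
        # Re-sorting on every install is simple but slow; acceptable for a sketch.
        self.entries.sort(key=lambda entry: entry[0].prefixlen, reverse=True)

    def lookup(self, addr):
        ip = ipaddress.ip_address(addr)
        for net, next_hop in self.entries:
            if ip in net:
                return next_hop
        return None


def repopulate(rib, fib):
    """Rebuild a data plane from the control plane's routes (prefix -> next hop)."""
    for prefix, next_hop in rib.items():
        fib.install(prefix, next_hop)
    return fib


rib = {"0.0.0.0/0": "IF 1", "12.0.0.0/8": "IF 2", "12.1.0.0/16": "IF 1"}
dp_old = repopulate(rib, DictFIB())
dp_new = repopulate(rib, TableFIB())   # different format, same forwarding behavior
print(dp_old.lookup("12.9.9.9"), dp_new.lookup("12.9.9.9"))
```

Because only the route set crosses between the two routers, the old table's layout never has to be understood by the new hardware.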

31 Remote Control Plane
Data-plane cloning takes time
– Installing 250k routes may take several seconds
Control & old data planes need to be kept “online”
Solution: redirect routing messages through tunnels
[Figure labels: Physical router A, Physical router B, CP, DP-old, DP-new]

32 Remote Control Plane
Data-plane cloning takes time
– Installing 250k routes takes over 20 seconds
Control & old data planes need to be kept “online”
Solution: redirect routing messages through tunnels
[Figure labels: Physical router A, Physical router B, CP, DP-old, DP-new]


34 Double Data Planes
At the end of data-plane cloning, both data planes are ready to forward traffic
[Figure labels: CP, DP-old, DP-new]

35 Asynchronous Link Migration
With the double data planes, links can be migrated independently
[Figure labels: A, CP, DP-old, DP-new, B]
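
The following sketch models why double data planes allow links to move one at a time: while both planes hold the same forwarding state, a link can be terminated by either one without changing the forwarding decision. The class names, the `binding` map, and the string-keyed FIB are assumptions for illustration, and traffic that has to cross between the two physical routers mid-migration is not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class DataPlane:
    name: str
    fib: dict                      # destination prefix -> outgoing link

    def forward(self, dst):
        return self.fib.get(dst)


@dataclass
class VirtualRouter:
    dp_old: DataPlane
    dp_new: DataPlane
    binding: dict = field(default_factory=dict)   # link -> data plane terminating it

    def receive(self, link, dst):
        """Forward a packet using whichever data plane currently owns the link."""
        plane = self.binding[link]
        return plane.name, plane.forward(dst)

    def migrate_link(self, link):
        """Move one link to the new data plane; other links are untouched."""
        self.binding[link] = self.dp_new

    def old_plane_idle(self):
        """True once every link has moved, so DP-old can be torn down."""
        return all(plane is self.dp_new for plane in self.binding.values())


fib = {"10.0.0.0/8": "A", "20.0.0.0/8": "B"}
old = DataPlane("DP-old", fib)
new = DataPlane("DP-new", dict(fib))             # cloned forwarding state
vr = VirtualRouter(old, new, {"A": old, "B": old})

vr.migrate_link("A")                              # link B is still on DP-old
print(vr.receive("A", "20.0.0.0/8"), vr.receive("B", "10.0.0.0/8"))
vr.migrate_link("B")
print("DP-old idle:", vr.old_plane_idle())
```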

36 Prototype Implementation
Virtualized operating system
– OpenVZ, supports VM migration
Routing protocols
– Quagga software suite
Packet forwarding
– NetFPGA hardware
Router hypervisor
– Our extensions for repopulating data plane, remote control plane, double data planes, …

37 Experimental Results
Data plane: NetFPGA
– No packet loss or extra delay
Control plane: Quagga routing software
– All routing-protocol adjacencies stay up
– Core router migration (intradomain only)
  Inject an unplanned link failure at another router
  At most one retransmission of an OSPF message
– Edge router migration (intra and interdomain)
  Control-plane downtime: 3.56 seconds
  Within reasonable keep-alive timer intervals
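
The claim that 3.56 seconds falls within reasonable keep-alive intervals can be checked with simple arithmetic. The timer values below do not come from the talk: they are assumptions, namely a 40-second OSPF router dead interval (a widely used default) and the 90-second BGP hold time suggested in the BGP-4 specification. Operators often configure different values.

```python
CONTROL_PLANE_DOWNTIME_S = 3.56   # measured downtime reported on the slide

# Assumed timer values (not from the talk); real deployments may differ.
ASSUMED_TIMEOUTS_S = {
    "OSPF router dead interval": 40.0,
    "BGP hold time": 90.0,
}


def adjacency_survives(downtime_s: float, timeout_s: float) -> bool:
    """A neighbor keeps the adjacency up if it hears from us before its timer expires."""
    return downtime_s < timeout_s


for name, timeout in ASSUMED_TIMEOUTS_S.items():
    ok = adjacency_survives(CONTROL_PLANE_DOWNTIME_S, timeout)
    print(f"{name}: {timeout:.0f} s, survives={ok}, "
          f"margin={timeout - CONTROL_PLANE_DOWNTIME_S:.2f} s")
```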

38 Conclusions on VROOM
Useful network-management primitive
– Break the tight coupling between physical and logical
– Simplify management, enable new applications
Evaluation of prototype
– No disruption in packet forwarding
– No noticeable disruption in routing protocols
Ongoing work
– Migration scheduling as an optimization problem
– Extensions to hypervisor for other applications

39 VERB: Virtually Eliminating Router Bugs With Eric Keller, Minlan Yu, and Matt Caesar

40 Router Bugs Are Important
Routing software is complicated
– Leads to programming errors (aka “bugs”)
– Recent string of high-profile outages
Bugs different from traditional failures
– Byzantine failures, don’t simply crash the router
– Violate protocol, and cause cascading outages
The problem is getting worse
– Software is getting more complicated
– Other outages becoming less common
– Vendors allowing third-party software

41 Exploit Software and Data Diversity
Many sources of diversity
– Diverse code (Quagga, XORP, BIRD)
– Diverse protocols (OSPF and IS-IS)
– Diverse environment (timing, ordering, memory)
Reasonable overhead
– Extra processor blade for hardware reliability
– Multi-core processors, separate route servers, …
Special properties of routing software
– Clear interfaces to data plane and other routers
– Limited dependence on past history

42 Handling Bugs at Run Time
Diverse replication
– Run multiple control planes in parallel
– Vote on routing messages and forwarding table
[Figure labels: UPDATE VOTER, FIB VOTER, REPLICA MANAGER, Hypervisor, Forwarding Table (FIB), IF 1, IF 2, Protocol daemon + RIB (×3)]

43 Replicating Incoming Routing Messages
No need for protocol parsing – operates at socket level
[Figure labels: UPDATE VOTER, FIB VOTER, REPLICA MANAGER, Hypervisor, FIB, IF 1, IF 2, 12.0.0.0/8 Update, Protocol daemon + RIB (×3)]

44 Voting: Updates to Forwarding Table
Transparent by intercepting calls to “Netlink”
[Figure labels: UPDATE VOTER, FIB VOTER, REPLICA MANAGER, Hypervisor, FIB, IF 1, IF 2, 12.0.0.0/8 → IF 2, 12.0.0.0/8 Update, Protocol daemon + RIB (×3)]

45 Voting: Control-Plane Messages
Transparent by intercepting socket system calls
[Figure labels: UPDATE VOTER, FIB VOTER, REPLICA MANAGER, Hypervisor, FIB, IF 1, IF 2, 12.0.0.0/8 → IF 2, 12.0.0.0/8 Update, Protocol daemon + RIB (×3)]
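
The voting step on the slides above can be sketched as a strict-majority vote over opaque byte strings. This is a minimal illustration, not the VERB implementation: the function name `majority_vote`, the replica labels, and the byte-string encoding of a forwarding-table update are assumptions made for the example.

```python
from collections import Counter


def majority_vote(outputs):
    """Vote on opaque outputs from replicated routing daemons.

    outputs maps a replica label to the bytes that replica produced.
    Returns (winning bytes or None, sorted list of dissenting replicas).
    """
    if not outputs:
        return None, []
    value, votes = Counter(outputs.values()).most_common(1)[0]
    if votes * 2 <= len(outputs):
        return None, []            # no strict majority; nothing is released
    dissenters = sorted(r for r, out in outputs.items() if out != value)
    return value, dissenters


# The voter never parses the payload; it only compares bytes for equality.
fib_updates = {
    "quagga": b"12.0.0.0/8 -> IF 2",
    "xorp": b"12.0.0.0/8 -> IF 2",
    "bird": b"12.0.0.0/8 -> IF 1",   # hypothetical buggy replica
}
winner, suspects = majority_vote(fib_updates)
print(winner, suspects)
```

Comparing raw bytes keeps the voter small and independent of any routing protocol, at the cost of treating semantically equal but differently encoded outputs as disagreement.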

46 Simple Voting and Recovery
Tolerate transient periods of disagreement
– During routing-protocol convergence (tens of sec)
Several different voting mechanisms
– Master-slave vs. wait-for-consensus
Small, trusted software component
– No parsing, treats data as opaque strings
– Just 514 lines of code in our implementation
Recovery
– Kill faulty instance, and invoke a new one
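
The slide names the two voting mechanisms without defining them, so the sketch below shows one plausible reading rather than the authors' design. It assumes that wait-for-consensus holds an update until a strict majority of all replicas has produced it, and that master-slave releases the master's output immediately and demotes the master if the others later outvote it. All function and replica names are hypothetical.

```python
from collections import Counter


def wait_for_consensus(outputs, num_replicas):
    """Release an update only once a strict majority of all replicas agrees.

    outputs holds the replicas that have reported so far (label -> bytes).
    Returns the agreed bytes, or None while the vote is still undecided.
    """
    if not outputs:
        return None
    value, votes = Counter(outputs.values()).most_common(1)[0]
    return value if votes * 2 > num_replicas else None


def master_slave(outputs, master):
    """Release the master's output at once, then cross-check it.

    Returns (released bytes, replica that should be master next).
    """
    released = outputs[master]
    agreed = wait_for_consensus(outputs, len(outputs))
    if agreed is not None and agreed != released:
        # The master was outvoted: hand over to a replica in the majority.
        # The demoted instance is the one to kill and restart.
        new_master = min(r for r, out in outputs.items() if out == agreed)
        return released, new_master
    return released, master


reported = {"a": b"update-1"}
print(wait_for_consensus(reported, num_replicas=3))   # still waiting
reported["b"] = b"update-1"
print(wait_for_consensus(reported, num_replicas=3))   # majority reached

print(master_slave({"a": b"bad", "b": b"update-1", "c": b"update-1"}, master="a"))
```

Under this reading the trade-off is latency against exposure: master-slave adds no delay but can briefly emit a faulty update, while wait-for-consensus never emits a minority output but waits for the slower replicas.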

47 Conclusion on Bug-Tolerant Router
Seriousness of routing software bugs
– Cause serious outages, misbehavior, vulnerability
– Violate protocol semantics, so not handled by traditional failure detection and recovery
Software and data diversity
– Effective, and has reasonable overhead
Design and prototype of bug-tolerant router
– Works with Quagga, XORP, and BIRD software
– Low overhead, and small trusted code base

48 Conclusions for the Talk
Router virtualization is exciting
– Enables wide variety of new networking techniques
– … for network management & service deployment
– … and even rethinking the Internet architecture
Fascinating space of open questions
– Other possible applications of router virtualization?
– What is the right interface to router hardware?
– What is the right programming environment for customized protocols on virtual networks?
http://www.cs.princeton.edu/~jrex/virtual.html

