BCube: A High Performance, Server-centric Network Architecture for Modular Data Centers Chuanxiong Guo 1, Guohan Lu 1, Dan Li 1, Haitao Wu 1, Xuan Zhang.


1 BCube: A High Performance, Server-centric Network Architecture for Modular Data Centers
Chuanxiong Guo 1, Guohan Lu 1, Dan Li 1, Haitao Wu 1, Xuan Zhang 2, Yunfeng Shi 3, Tian Chen 4, Yongguang Zhang 1, Songwu Lu 5
1: Microsoft Research Asia (MSR-A), 2: Tsinghua U, 3: PKU, 4: HUST, 5: UCLA
Presented by Wei Bai

2 Data Center Network
Interconnect for distributed computing workloads
1000s of server ports
web, app, db, map-reduce, HPC, monitoring, cache

3 Traditional Tree Topology
Limited network capacity
Single point of failure

4 Clos-based Multi-rooted Tree
High network capacity and good fault tolerance
Network core is free of persistent congestion

5 Clos-based Multi-rooted Tree
Network edge is the bottleneck
Bad for one-to-X communication

6 Clos-based Multi-rooted Tree
[Figure: hosts H1 to H9 on the TX side and H1 to H9 on the RX side, with ingress and egress capacity constraints]

7 Container-based modular DC
Sun Project Black Box: 242 systems in 20'
Rackable Systems Container: 2800 servers in 40'
Core benefits of shipping-container DCs:
– Easy deployment: high mobility; just plug in power, network, and chilled water
– Increased cooling efficiency
– Manufacturing and H/W admin. savings
servers in a single container

8 BCube design goals
High network capacity for various traffic patterns
– One-to-one unicast
– One-to-all and one-to-several reliable groupcast
– All-to-all data shuffling
Only use low-end, commodity switches
Graceful performance degradation
– Performance degrades gracefully as server and switch failures increase

9 BCube structure
[Figure: a BCube_1 built from BCube_0s, with servers, level-0 switches, and level-1 switches]
Connecting rule
- The i-th server in the j-th BCube_0 connects to the j-th port of the i-th level-1 switch
A BCube_k network supports $n^{k+1}$ servers
- n is the number of servers in a BCube_0
- k is the level of that BCube
A server is assigned a BCube address $(a_k, a_{k-1}, \ldots, a_0)$, where $a_i \in [0, n-1]$
Neighboring server addresses differ in only one digit
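The addressing and connecting rule can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' code: the function names (`bcube_servers`, `neighbors`, `switch_of`) are hypothetical, addresses are tuples written (a_k, ..., a_0), and a switch is identified here by its level plus the address digits shared by all servers attached to it.

```python
from itertools import product

def bcube_servers(n, k):
    """All server addresses (a_k, ..., a_0) in a BCube_k built from n-port switches."""
    return list(product(range(n), repeat=k + 1))

def neighbors(addr, n):
    """One-hop neighbors: addresses that differ from addr in exactly one digit."""
    result = []
    for pos in range(len(addr)):
        for digit in range(n):
            if digit != addr[pos]:
                result.append(addr[:pos] + (digit,) + addr[pos + 1:])
    return result

def switch_of(addr, level):
    """Level-`level` switch of a server: the level plus every digit except a_level.
    The server plugs into port a_level of that switch."""
    k = len(addr) - 1
    pos = k - level  # tuple index that holds digit a_level
    return (level, addr[:pos] + addr[pos + 1:])

servers = bcube_servers(n=4, k=1)
print(len(servers))                # 16, i.e. n^(k+1)
print(len(neighbors((2, 3), 4)))   # 6, i.e. (k+1) * (n-1)
print(switch_of((2, 3), 1))        # (1, (3,)): server 3 of BCube_0 number 2 uses level-1 switch 3
```

With n = 4 and k = 1 this reproduces the 16-server layout used in the slides: server 23 shares level-0 switch 2 with servers 20 to 22 and level-1 switch 3 with servers 03, 13, and 33.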

10 BCube: Server-centric network
[Figure: BCube0 and BCube1 with example packets; labeled fields are dst and src MAC addr, BCube addr, and data]

Switch MAC table
MAC addr   port
MAC03      0
MAC13      1
MAC23      2
MAC33      3

Switch MAC table
MAC addr   port
MAC20      0
MAC21      1
MAC22      2
MAC23      3

Server-centric BCube
- Switches never connect to other switches and only act as L2 crossbars
- Servers control routing, load balancing, fault-tolerance

11 Multi-paths for one-to-one traffic
THEOREM 1. The diameter of a BCube_k is k+1.
THEOREM 3. There are k+1 parallel paths between any two servers in a BCube_k.
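A minimal sketch of why the diameter is k+1: a path can be built by correcting one differing address digit per hop, so at most k+1 hops are needed. The function name `bcube_route` and the `order` parameter are assumptions made for illustration. Changing the order in which digits are corrected gives node-disjoint paths in the simple case where the two addresses differ in every digit; this sketch does not cover the general construction of all k+1 parallel paths.

```python
def bcube_route(src, dst, order=None):
    """Path from src to dst that fixes one differing digit per hop.

    src, dst: address tuples (a_k, ..., a_0).
    order: tuple indices in the order digits are corrected (default: left to right).
    """
    if order is None:
        order = range(len(src))
    path = [src]
    cur = list(src)
    for pos in order:
        if cur[pos] != dst[pos]:
            cur[pos] = dst[pos]
            path.append(tuple(cur))
    return path

path_a = bcube_route((0, 0), (1, 3))                 # correct a_1 first
path_b = bcube_route((0, 0), (1, 3), order=[1, 0])   # correct a_0 first
print(path_a)  # [(0, 0), (1, 0), (1, 3)]
print(path_b)  # [(0, 0), (0, 3), (1, 3)]
```

Both paths take two hops (k+1 for k = 1) and share no intermediate server.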

12 Speedup for one-to-several traffic
THEOREM 4. Server A and a set of servers {d_i | d_i is A's level-i neighbor} form an edge-disjoint complete graph of diameter 2.
[Figure: paths P1 and P2]

13 Speedup for one-to-all traffic
THEOREM 5. There are k+1 edge-disjoint spanning trees in a BCube_k.
The one-to-all and one-to-several SPTs can be implemented by TCP unicast to achieve reliability.
[Figure: spanning trees rooted at src]

14 Aggregate bottleneck throughput for all-to-all traffic
Aggregate bottleneck throughput (ABT) is the total number of flows times the throughput of the bottleneck flow under the all-to-all communication pattern.
THEOREM 6. The ABT for a BCube network is $\frac{n}{n-1}(N-1)$, where n is the switch port number and N is the total server number.
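As a quick worked example, the ABT expression n/(n-1) · (N-1) from the BCube paper can be evaluated directly. The helper name `bcube_abt` is hypothetical, and the result is assumed to be in units of one link's bandwidth (link rate normalized to 1).

```python
def bcube_abt(n, k):
    """ABT of a BCube_k with n-port switches, in units of one link's bandwidth."""
    N = n ** (k + 1)              # total number of servers
    return n * (N - 1) / (n - 1)

print(bcube_abt(4, 1))  # 20.0 for the 16-server case
```

Because n/(n-1) is close to 1, the ABT grows almost linearly with the number of servers N.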

15 BCube Source Routing (BSR)
Server-centric source routing
– Source server decides the best path for a flow by probing a set of parallel paths
– Source server adapts to network conditions by re-probing periodically or after failures
BSR design rationale
– Network structural property
– Scalability
– Routing performance
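The path-selection step can be illustrated with a small sketch. It assumes that probing yields an available-bandwidth figure for each server-to-server hop and that the source prefers the path with the largest bottleneck bandwidth, breaking ties by hop count. The names `bottleneck` and `select_path`, the `avail` dictionary, and the bandwidth values are all hypothetical; the slides do not specify the selection metric in this detail.

```python
def bottleneck(path, avail):
    """Smallest available bandwidth over the consecutive hops of a path."""
    return min(avail[(a, b)] for a, b in zip(path, path[1:]))

def select_path(candidates, avail):
    """Pick the probed path with the largest bottleneck; prefer fewer hops on ties."""
    return max(candidates, key=lambda p: (bottleneck(p, avail), -len(p)))

# Hypothetical probe results (fraction of link rate still available per hop).
avail = {
    ("00", "10"): 0.4, ("10", "13"): 0.9,
    ("00", "03"): 0.7, ("03", "13"): 0.6,
}
candidates = [["00", "10", "13"], ["00", "03", "13"]]
best = select_path(candidates, avail)
print(best)  # ['00', '03', '13']
```

Re-running `select_path` with fresh probe results models the periodic re-probing: a failed hop can be represented by an available bandwidth of 0.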

16 Path compression and fast packet forwarding
Traditional address array needs 16 bytes: Path(00,13) = {02,22,23,13}
The Next Hop Index (NHI) array needs 4 bytes: Path(00,13) = {0:2,1:2,0:3,1:1}

Forwarding table of server 23
NHI   Output port   MAC
0:0   0             Mac20
0:1   0             Mac21
0:2   0             Mac22
1:0   1             Mac03
1:1   1             Mac13
1:3   1             Mac

[Figure: forwarding node and next hop]
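The NHI encoding can be reproduced with a short sketch: each hop is stored as level:value, where level is the position of the digit that changes and value is its new content. The function names are hypothetical, addresses are single-digit strings written a_k…a_0, and the one-byte layout (2 bits of level, 6 bits of value) is an assumption made here to match the 4-byte figure on the slide.

```python
def to_nhi(src, path):
    """Convert a hop-by-hop address list into (level, value) pairs."""
    nhi = []
    cur = src
    for nxt in path:
        diff = [i for i in range(len(cur)) if cur[i] != nxt[i]]
        if len(diff) != 1:
            raise ValueError(f"{cur} -> {nxt} is not a one-hop move")
        pos = diff[0]
        level = len(cur) - 1 - pos      # string index 0 holds digit a_k
        nhi.append((level, int(nxt[pos])))
        cur = nxt
    return nhi

def pack_nhi(nhi, value_bits=6):
    """Pack each (level, value) pair into one byte (assumed layout: level bits, then value bits)."""
    for level, value in nhi:
        if value >= (1 << value_bits) or level >= (1 << (8 - value_bits)):
            raise ValueError("level or value does not fit in one byte")
    return bytes((level << value_bits) | value for level, value in nhi)

nhi = to_nhi("00", ["02", "22", "23", "13"])
print(nhi)                  # [(0, 2), (1, 2), (0, 3), (1, 1)]
print(len(pack_nhi(nhi)))   # 4
```

A relay such as server 23 can then use the NHI directly as a lookup key into a forwarding table like the one above, rather than comparing full addresses.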

17 Graceful degradation
The metric: aggregate bottleneck throughput (ABT) under different server and switch failure rates
[Figure: two panels, server failure and switch failure, comparing BCube, DCell, and Fat-tree]

18 Routing to external networks
Ethernet has a two-level link rate hierarchy
– 1G for end hosts and 10G for uplink
[Figure: aggregator and gateway servers (G)]

19 Implementation
[Figure: BCube software and hardware stack]
app: BCube configuration
kernel: TCP/IP protocol driver, BCube driver (intermediate driver), Ethernet miniport driver
BCube driver functions: BSR path probing & selection, flow-path cache, neighbor maintenance, Ava_band calculation, packet send/recv, packet fwd
software and hardware variants both show: neighbor maintenance, Ava_band calculation, packet fwd
hardware: Intel® PRO/1000 PT Quad Port Server Adapter, NetFPGA
server ports: IF 0, IF 1, …, IF k

20 Testbed
A BCube testbed
– 16 servers (Dell Precision 490 workstation with Intel 2.00GHz dual-core CPU, 4GB DRAM, 160GB disk)
– 8 8-port mini-switches (DLink 8-port Gigabit switch DGS-1008D)
NIC
– Intel Pro/1000 PT quad-port Ethernet NIC
– NetFPGA

21 Bandwidth-intensive application support
[Figure: per-server throughput]

22 Support for all-to-all traffic
[Figure: total throughput for all-to-all]

23 Related work
Speedup

24 Related work
UCSD08 and Portland
– Rearrangeable non-blocking Clos network
– No server change needed
– Destination addr based routing
VL2
– Reduces cables by using 10G Ethernet
– Leverages existing OSPF and ECMP
– Randomized Valiant routing
DCell
– Built for different purposes but shares the same design philosophy
– BCube provides better load-balancing and network capacity

25 Summary
A novel network architecture for container-based, modular data centers
– Enables speedup for one-to-x and x-to-one traffic
– Provides high network capacity to all-to-all traffic
– Purely constructed from low-end commodity switches
– Graceful performance degradation
Internetworking for modular mega data centers
IEEE Spectrum Feb.

26 My comments
BCube has good properties in theory, but …

27 My comments
Server-centric network architecture relies on servers acting as switches.

28 My comments
Server-centric network architecture relies on servers acting as switches.
Using server CPU to forward packets reduces the CPU capacity available for computing.

29 Q & A

