Presentation on theme: "2nd Symposium on Networked Systems Design & Implementation (NSDI) Boston, MA May 2-4, 2005 Guohui Wang 3, David G. Andersen 2, Michael Kaminsky 1, Michael."— Presentation transcript:
1 Your Data Center Is a Router: The Case for Reconfigurable Optical Circuit Switched Paths Guohui Wang (3), David G. Andersen (2), Michael Kaminsky (1), Michael Kozuch (1), T. S. Eugene Ng (3), Dina Papagiannaki (1), Madeleine Glick (1) and Lily Mummert (1). Affiliations: 1. Intel Labs Pittsburgh; 2. Carnegie Mellon University; 3. Rice University. 2nd Symposium on Networked Systems Design & Implementation (NSDI), Boston, MA, May 2-4, 2005
2 Data Center Network Today's data center network: data-intensive applications (e.g. video data processing, MapReduce …) are experiencing bandwidth bottlenecks in tree-structured data center networks. Figure labels: Core Switch, End of Row Switch, Top of Rack Switch. Picture from: James Hamilton, Architecture for Modular Data Centers
3 Full bisection bandwidth solutions Restructure the data center network to provide full bisection bandwidth among all servers. Drawback: a complicated network structure that is hard to construct and expand. Figure labels: Tree, FatTree, BCube. Picture from: Ken Hall, Green Data Centers
4 Full bisection bandwidth may not be necessary Spatial Traffic Locality –Nodes communicate with only a small number of partners. –e.g. Earthquake simulation Temporal Traffic Locality –Applications might hit CPU, disk I/O or synchronization bounds. –e.g. MapReduce Many measurement studies have found evidence of traffic locality. –[SC05][WREN09][IMC09][HotNets09] Full bisection bandwidth solutions provide more bandwidth than such workloads need, at high cost.
5 An alternative design: hybrid data center network A hybrid network may give us the best of both worlds: –Optical circuit-switched paths for data-intensive transfers. –Electrical packet-switched paths for timely delivery. Figure: nodes A, B, C, D, E and F attached to both an optical circuit-switched network and an electrical packet-switched network.
6 Optical Circuit Switching MEMS Optical Switching Module Switches at whatever rate is modulated on the input/output ports. Physical reconfiguration time of up to tens of ms. Picture from: http://www.ntt.co.jp/milab/en/project/pr05_3Dmems.html
7 Optical Channels Ultra-high bandwidth: 40 Gbps and 100 Gbps technology has been developed; 15.5 Tbps over a single fiber! Dropping prices. Price data from: Joe Berthold, Hot Interconnects09
8 Optical circuits in datacenters Example circuit configurations between nodes A, B, C and nodes D, E, F: (A - E, B - D, C - F); (A - D, B - E, C - F); (A - F, B - E, C - D). Advantages: –Simple and flexible: easy to construct, expand and manage –Ultra-high bandwidth –Low power Disadvantages: –Fat pipes are not all-to-all. –Reconfiguration overhead
9 Research questions Is there enough traffic locality in data centers to leverage optical paths? Can optical paths be reconfigured fast enough to meet dynamic traffic? How can optical circuits be integrated into data centers at low cost? How should optical paths be managed and leveraged? How do applications behave over the hybrid network?
10 Is there enough traffic locality? Analysis of a production data center traffic trace: –7 racks, 155 servers, 1060 cores –One week of NetFlow traces collected at all servers –Configure the 3 optical paths, out of 21 total cross-rack paths, that carry the maximum traffic; reconfigure every 10 s. Traffic locality: a few optical paths have the potential to offload a significant amount of traffic from the electrical network. Figure: timeline of traffic matrices (TM), one per 10 s interval.
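The selection step on this slide can be sketched as follows. This is a minimal illustration, not the authors' analysis code: it assumes each rack can terminate at most one optical path (which is why 7 racks allow at most 3 paths), and the function name `best_optical_paths` and the traffic volumes are hypothetical.

```python
from itertools import combinations

def best_optical_paths(traffic, k=3):
    """Pick up to k rack pairs, no rack used twice, maximizing carried bytes.

    traffic maps an unordered rack pair (a, b), a < b, to the bytes exchanged
    in both directions during one 10-second interval.
    """
    pairs = list(traffic)
    best, best_bytes = (), 0
    for size in range(1, k + 1):
        for choice in combinations(pairs, size):
            racks = [r for pair in choice for r in pair]
            if len(racks) != len(set(racks)):
                continue  # ASSUMPTION: one optical port per rack
            carried = sum(traffic[p] for p in choice)
            if carried > best_bytes:
                best, best_bytes = choice, carried
    return best, best_bytes

# Hypothetical 10-second traffic matrix for 7 racks: 21 cross-rack pairs,
# light background traffic plus a few heavy pairs.
traffic = {pair: 1 for pair in combinations(range(1, 8), 2)}
traffic[(1, 2)] = 40
traffic[(1, 3)] = 35
traffic[(3, 4)] = 30
traffic[(5, 6)] = 20

paths, carried = best_optical_paths(traffic)
offload_fraction = carried / sum(traffic.values())
```

Exhaustive search is only practical at this scale; with 21 candidate pairs there are at most 1330 three-pair combinations to check.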
11 Can optical paths be reconfigured fast enough? - Optical Path Configuration Algorithm Graph G: (V, E), with racks R1, R2, R3, R4, R5, R6, R7, R8 as vertices and edge weights w_xy = vol(Rx, Ry) + vol(Ry, Rx). Edges shown in the figure: w_12, w_14, w_43, w_38, w_68, w_36, w_35, w_27, w_47. Optical path configuration is a maximum weight perfect matching on graph G. Solved by the polynomial-time Edmonds algorithm [1]. [1] J. Edmonds, Paths, trees and flowers, Canadian J. of Mathematics, pp 449-467, 1965
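To make the matching formulation concrete, here is a small sketch that finds a maximum weight perfect matching on the eight-rack graph from the slide. It is not the Edmonds algorithm: it uses memoized exhaustive search over sets of unmatched racks, which is exponential and only suitable for tiny examples. The edge set follows the slide; the weights are hypothetical.

```python
from functools import lru_cache

def max_weight_perfect_matching(n, weights):
    """Exact search over subsets of unmatched racks (NOT Edmonds' algorithm).

    Racks are numbered 1..n; weights maps (x, y) with x < y to w_xy.
    Returns (total weight, list of pairs), or (None, []) if no perfect
    matching exists.
    """
    NEG = float("-inf")

    @lru_cache(maxsize=None)
    def solve(free):
        if free == 0:
            return 0, ()
        x = (free & -free).bit_length()  # lowest-numbered unmatched rack
        best = (NEG, ())
        for y in range(x + 1, n + 1):
            if free >> (y - 1) & 1 and (x, y) in weights:
                rest = free & ~(1 << (x - 1)) & ~(1 << (y - 1))
                sub_weight, sub_pairs = solve(rest)
                candidate = weights[(x, y)] + sub_weight
                if candidate > best[0]:
                    best = (candidate, ((x, y),) + sub_pairs)
        return best

    total, pairs = solve((1 << n) - 1)
    return (None, []) if total == NEG else (total, list(pairs))

# Edges from the slide's figure; the weight values are made up.
weights = {(1, 2): 5, (1, 4): 9, (3, 4): 7, (3, 8): 4, (6, 8): 6,
           (3, 6): 8, (3, 5): 3, (2, 7): 2, (4, 7): 1}

total, circuits = max_weight_perfect_matching(8, weights)
```

With these weights the heavy edge R3-R6 is not selected, because R5 can only be matched through R3. That is the kind of constraint a simple heaviest-edge-first choice would miss.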
12 Can optical paths be reconfigured fast enough? - Optical Path Configuration Time Several time factors: –Computation time: 640 ms for a 1000-rack data center using the Edmonds algorithm. –Signaling time: < 1 ms in data centers. –Physical reconfiguration time: up to tens of ms for MEMS optical switches. Even in very large data centers, optical paths can still be reconfigured at small time scales (< 1 sec).
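The sub-second claim follows from adding the three factors. A rough check, where the physical switching value is an assumed single point within "tens of ms" and the 10 s interval is the reconfiguration period used in the trace analysis:

```python
COMPUTATION_MS = 640     # from the slide: Edmonds algorithm, 1000 racks
SIGNALING_MS = 1         # slide gives "< 1 ms"; 1 ms used as an upper bound
PHYSICAL_SWITCH_MS = 50  # ASSUMPTION: one illustrative value for "tens of ms"
INTERVAL_MS = 10_000     # 10 s reconfiguration interval from the trace analysis

total_ms = COMPUTATION_MS + SIGNALING_MS + PHYSICAL_SWITCH_MS
headroom_ms = 1000 - total_ms             # margin under the 1 s claim
interval_share = total_ms / INTERVAL_MS   # share of each interval spent reconfiguring
```

Under these assumptions computation dominates the total, so a faster matching algorithm would shrink the overhead more than a faster switch would.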
13 How to manage optical paths in data centers? Routing over a dynamic dual-path (electrical/optical) network: Ethernet Spanning Tree? –NO, one of the dual paths will be blocked. Link State Routing? –NO, long routing convergence time after each reconfiguration.
14 How to manage optical paths in data centers? VLAN-based dual-path routing: VLAN1: Electrical VLAN2: Optical Advantages: –Leverages both electrical and optical paths by tagging packets –No route convergence delay after optical reconfiguration –No need to modify switches
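The tagging decision can be sketched as a lookup at the sender. This is only an illustration of the idea: the `optical_peer` table and the `choose_vlan` function are hypothetical names, and the rack letters are borrowed from the earlier circuit example.

```python
ELECTRICAL_VLAN = 1  # slide: VLAN1 is the electrical network
OPTICAL_VLAN = 2     # slide: VLAN2 is the optical network

def choose_vlan(src_rack, dst_rack, optical_peer):
    """Return the VLAN tag for traffic from src_rack to dst_rack.

    optical_peer (hypothetical structure) maps a rack to the rack its
    optical circuit currently reaches; a missing key means no circuit.
    """
    if optical_peer.get(src_rack) == dst_rack:
        return OPTICAL_VLAN
    return ELECTRICAL_VLAN

# Configuration A - E, B - D, C - F
optical_peer = {"A": "E", "E": "A", "B": "D", "D": "B", "C": "F", "F": "C"}
before = choose_vlan("A", "E", optical_peer)

# After reconfiguring to A - D, B - E, C - F only the table changes;
# no routing protocol has to reconverge.
optical_peer = {"A": "D", "D": "A", "B": "E", "E": "B", "C": "F", "F": "C"}
after = choose_vlan("A", "E", optical_peer)
```

Traffic from A to E moves from the optical VLAN back to the electrical VLAN as soon as the table is updated, which is the "no route convergence delay" property listed above.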
15 How to manage optical paths in data centers? How to measure application traffic demand? Extensive buffering at servers: –Measures traffic demand –Aggregates traffic and batches it for optical transfer Per-rack virtual output queuing: –Avoids head-of-line blocking Figure labels: User, Kernel, Apps, Per-rack Virtual Output Queue, Scheduler, Network Interface, Servers.
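A per-rack virtual output queue can be sketched in user-space Python as one FIFO per destination rack. The slide places the real queues in the kernel, so this is only a model of the data structure; the class and method names are hypothetical.

```python
from collections import deque

class PerRackVOQ:
    """One FIFO per destination rack, with a byte count per queue."""

    def __init__(self):
        self.queues = {}        # destination rack -> deque of packets (bytes)
        self.queued_bytes = {}  # destination rack -> bytes currently waiting

    def enqueue(self, dst_rack, packet):
        self.queues.setdefault(dst_rack, deque()).append(packet)
        self.queued_bytes[dst_rack] = self.queued_bytes.get(dst_rack, 0) + len(packet)

    def demand(self):
        """Queue occupancy doubles as the traffic demand measurement."""
        return dict(self.queued_bytes)

    def drain(self, dst_rack, budget_bytes):
        """Send a batch toward one rack without touching other racks' queues."""
        sent = []
        queue = self.queues.get(dst_rack, deque())
        while queue and len(queue[0]) <= budget_bytes:
            packet = queue.popleft()
            budget_bytes -= len(packet)
            self.queued_bytes[dst_rack] -= len(packet)
            sent.append(packet)
        return sent

voq = PerRackVOQ()
voq.enqueue("E", b"x" * 1500)
voq.enqueue("E", b"x" * 1500)
voq.enqueue("D", b"y" * 100)

# Rack D's packet is not stuck behind rack E's earlier packets.
sent_to_d = voq.drain("D", 1000)
```

Because each destination rack has its own queue, traffic waiting for an optical path to E does not delay traffic to D, and the per-queue byte counts give the demand figures without extra instrumentation.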
16 How to manage optical paths in data centers? How to configure optical paths and schedule traffic to them? A centralized manager controls the optical path configuration. A configurable virtual output queue scheduler controls the traffic sent to optical paths. Figure labels: Configuration Manager, Daemon, Apps, User, Kernel, Per-rack Virtual Output Queue, Scheduler, Network Interface, Servers, Switches with VLAN settings, A, B, C, D; arrows labeled Stats, Config and Traffic.
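One round of the manager's control loop might look like the sketch below: gather per-server queue statistics, fold them into rack-pair demand, choose circuits, and tell each server which rack its optical path reaches. All function names and the statistics format are assumptions, and the circuit choice uses a simple heaviest-pair-first heuristic as a stand-in rather than the maximum weight matching described earlier in the talk.

```python
def rack_demand(server_stats, rack_of):
    """Sum per-server queue reports into symmetric rack-pair demand."""
    demand = {}
    for server, queued in server_stats.items():
        src = rack_of[server]
        for dst, nbytes in queued.items():
            if dst == src:
                continue  # intra-rack traffic never crosses the optical switch
            pair = tuple(sorted((src, dst)))
            demand[pair] = demand.get(pair, 0) + nbytes
    return demand

def greedy_circuits(demand):
    """Heaviest pair first; a stand-in heuristic, not an optimal matching."""
    used, circuits = set(), []
    for (a, b), _ in sorted(demand.items(), key=lambda kv: -kv[1]):
        if a not in used and b not in used:
            circuits.append((a, b))
            used.update((a, b))
    return circuits

def server_configs(circuits, rack_of):
    """Per-server config: the rack reachable over optics, or None."""
    peer = {}
    for a, b in circuits:
        peer[a], peer[b] = b, a
    return {server: peer.get(rack) for server, rack in rack_of.items()}

# Hypothetical servers, racks and queue statistics (bytes queued per rack).
rack_of = {"s1": "A", "s2": "A", "s3": "B", "s4": "C", "s5": "D"}
server_stats = {"s1": {"B": 500, "C": 100}, "s2": {"B": 300},
                "s3": {"A": 200, "D": 50}, "s4": {"D": 400}, "s5": {}}

demand = rack_demand(server_stats, rack_of)
circuits = greedy_circuits(demand)
configs = server_configs(circuits, rack_of)
```

Keeping the decision in one place means servers only report statistics and apply the configuration they receive, which matches the Stats and Config arrows in the figure.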
17 Challenges TCP/IP reacting to optical path reconfiguration. Potential long delays caused by extensive queuing at servers. Collecting traffic demand from a million servers. Choosing the right buffer sizes and reconfiguration intervals.
18 Summary Adding optical circuit-switched paths to data centers. Potential benefits: a simpler and more flexible data center network design; relieving data-intensive applications of network bottlenecks.