1 Data Center Networking CS 6250: Computer Networking, Fall 2011

2 Cloud Computing
Elastic resources
– Expand and contract resources
– Pay-per-use
– Infrastructure on demand
Multi-tenancy
– Multiple independent users
– Security and resource isolation
– Amortize the cost of the (shared) infrastructure
Flexible service management
– Resiliency: isolate failures of servers and storage
– Workload movement: move work to other locations

3 Trend of Data Centers
(Source: J. Nicholas Hoover, InformationWeek, June 17)
[Figure fragment: "200 million Euro"]
Data centers will grow larger and larger in the coming cloud computing era to benefit from economies of scale, with server counts moving from tens of thousands to hundreds of thousands.
The most important concerns in data center management:
– Economies of scale
– High utilization of equipment
– Maximizing revenue
– Amortizing administration costs
– Low power consumption
(http://www.energystar.gov/ia/partners/prod_development/downloads/EPA_Datacenter_Report_Congress_Final1.pdf)

4 Cloud Service Models
Software as a Service
– Provider licenses applications to users as a service
– e.g., customer relationship management, …
– Avoids costs of installation, maintenance, patches, …
Platform as a Service
– Provider offers a software platform for building applications
– e.g., Google's App Engine
– Avoids worrying about scalability of the platform
Infrastructure as a Service
– Provider offers raw computing, storage, and network
– e.g., Amazon's Elastic Compute Cloud (EC2)
– Avoids buying servers and estimating resource needs

5 Multi-Tier Applications
Applications consist of tasks
– Many separate components
– Running on different machines
Commodity computers
– Many general-purpose computers
– Not one big mainframe
– Easier scaling
[Diagram: a front-end server feeding aggregators, which feed workers]

6 Enabling Technology: Virtualization
– Multiple virtual machines on one physical machine
– Applications run unmodified, as on a real machine
– VMs can migrate from one computer to another

7 Status Quo: Virtual Switch in Server

8 Top-of-Rack Architecture
Rack of servers
– Commodity servers
– And a top-of-rack switch
Modular design
– Preconfigured racks
– Power, network, and storage cabling
Aggregate to the next level

9 Modularity
– Containers
– Many containers

10 Data Center Challenges
– Traffic load balancing
– Support for VM migration
– Achieving high bisection bandwidth
– Power savings / cooling
– Network management (provisioning)
– Security (dealing with multiple tenants)

11 Data Center Costs (Monthly)
– Servers: 45% (CPU, memory, disk)
– Infrastructure: 25% (UPS, cooling, power distribution)
– Power draw: 15% (electrical utility costs)
– Network: 15% (switches, links, transit)

12 Common Data Center Topology
[Diagram: the Internet connects through layer-3 core routers to layer-2/3 aggregation switches, then layer-2 access switches, then servers]

13 Data Center Network Topology
[Diagram: Internet – CR – AR – S – A hierarchy; ~1,000 servers/pod]
Key: CR = Core Router, AR = Access Router, S = Ethernet Switch, A = Rack of app. servers

14 Requirements for Future Data Centers
To keep up with the trend toward mega data centers, DCN technology should meet the following requirements:
– High scalability
– Transparent VM migration (high agility)
– Easy deployment requiring little human administration
– Efficient communication
– Loop-free forwarding
– Fault tolerance
Current DCN technology cannot meet these requirements:
– Layer 3 protocols cannot support transparent VM migration.
– Current Layer 2 protocols are not scalable, due to forwarding-table size and native broadcasting for address resolution.

15 Problems with Common Topologies
– Single point of failure
– Oversubscription of links higher up in the topology
– Tradeoff between cost and provisioning

16 Capacity Mismatch
[Diagram: oversubscription grows up the hierarchy – roughly 5:1, then 40:1, then 200:1 at successive levels]

17 Data-Center Routing
[Diagram: Internet at top; DC-Layer 3 spans the core and access routers, DC-Layer 2 spans the switches and racks; each layer-2 island is one IP subnet; ~1,000 servers/pod]
Key: CR = Core Router (L3), AR = Access Router (L3), S = Ethernet Switch (L2), A = Rack of app. servers

18 Reminder: Layer 2 vs. Layer 3
Ethernet switching (layer 2)
– Cheaper switch equipment
– Fixed addresses and auto-configuration
– Seamless mobility, migration, and failover
IP routing (layer 3)
– Scalability through hierarchical addressing
– Efficiency through shortest-path routing
– Multipath routing through equal-cost multipath (ECMP)
So, as in enterprises…
– Data centers often connect layer-2 islands by IP routers

19 Need for Layer 2
– Certain monitoring apps require servers with the same role to be on the same VLAN
– Using the same IP on dual-homed servers
– Allows organic growth of server farms
– Migration is easier

20 Review of Layer 2 & Layer 3
Layer 2
– One spanning tree for the entire network: prevents loops, but ignores alternate paths
Layer 3
– Shortest-path routing between source and destination
– Best-effort delivery

21 Fat-Tree-Based Solution
Connect end hosts together using a fat-tree topology
– Infrastructure consists of cheap devices; each port supports the same speed as an end host
– All devices can transmit at line speed if packets are distributed along existing paths
– A fat tree built from k-port switches can support k³/4 hosts (see the sketch below)
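To make the k³/4 figure concrete, here is a minimal Python sketch (ours, not from the slides) that computes the component counts of a k-ary fat tree built from identical k-port switches:

```python
# Sizes of a k-ary fat tree built from identical k-port switches,
# following Al-Fares et al. (SIGCOMM '08). The helper name is ours.

def fat_tree_size(k: int) -> dict:
    assert k % 2 == 0, "k must be even"
    return {
        "pods": k,
        "edge_switches": k * (k // 2),         # k/2 edge switches per pod
        "aggregation_switches": k * (k // 2),  # k/2 aggregation per pod
        "core_switches": (k // 2) ** 2,
        "hosts": k ** 3 // 4,                  # (k/2)^2 hosts/pod * k pods
    }

if __name__ == "__main__":
    for k in (4, 16, 48):
        print(k, fat_tree_size(k))  # k=48 supports 27,648 hosts
```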

22 Fat-Tree Topology

23 Problems with a Vanilla Fat Tree
– Layer 3 will use only one of the existing equal-cost paths
– Packet reordering occurs if layer 3 blindly takes advantage of path diversity

24 Modified Fat Tree
Enforce a special addressing scheme in the DC (illustrated below)
– Allows hosts attached to the same switch to route only through that switch
– Allows intra-pod traffic to stay within the pod
– Addresses take the form unused.PodNumber.SwitchNumber.EndHost
Use two-level lookups to distribute traffic and maintain packet ordering.
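As an illustration of the scheme (our encoding, following the 10.pod.switch.host convention of Al-Fares et al.; the leading "unused" octet shown as 10 is an assumption):

```python
# Illustrative fat-tree address construction: unused.Pod.Switch.EndHost.
# The leading octet (10) and the helper name are assumptions, not the slides'.

def fat_tree_addr(pod: int, switch: int, host: int) -> str:
    return f"10.{pod}.{switch}.{host}"

print(fat_tree_addr(pod=2, switch=0, host=5))  # -> 10.2.0.5
```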

25 Two-Level Lookups
First level is a prefix lookup
– Used to route down the topology to an end host
Second level is a suffix lookup
– Used to route up toward the core
– Diffuses and spreads out traffic
– Maintains packet ordering by using the same port for the same end host
A minimal sketch of the lookup follows.
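This is a minimal sketch of the idea in Python (our data structures, not the paper's TCAM layout); the prefixes, suffixes, and port numbers are example values:

```python
import ipaddress

# First level: longest-prefix match routes *down* toward a pod/host.
# If no prefix matches, a second-level *suffix* match on the host octet
# routes *up*, spreading traffic across uplinks while keeping each
# end host pinned to one port (preserving packet ordering).

PREFIXES = [  # (network, out_port): downward routes; assumed example values
    (ipaddress.ip_network("10.2.0.0/24"), 0),
    (ipaddress.ip_network("10.2.1.0/24"), 1),
]
SUFFIXES = [  # (host_octet, out_port): upward routes keyed by last octet
    (2, 2),
    (3, 3),
]

def lookup(dst: str) -> int:
    addr = ipaddress.ip_address(dst)
    for net, port in PREFIXES:            # first level: prefix lookup
        if addr in net:
            return port
    host_octet = int(addr) & 0xFF         # second level: suffix lookup
    for octet, port in SUFFIXES:
        if host_octet == octet:
            return port
    raise KeyError(f"no route for {dst}")

print(lookup("10.2.0.5"))   # -> 0: matched a prefix, routed down
print(lookup("10.4.1.2"))   # -> 2: fell through to suffix, routed up
```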

26 Diffusion Optimizations
Flow classification
– Eliminates local congestion
– Assigns traffic to ports on a per-flow basis instead of a per-host basis (sketched below)
Flow scheduling
– Eliminates global congestion
– Prevents long-lived flows from sharing the same links
– Assigns long-lived flows to different links
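A hedged sketch of per-flow port assignment (our hashing scheme, not the paper's classifier): hashing the flow 5-tuple keeps every packet of one flow on one uplink, preserving ordering, while spreading distinct flows across uplinks.

```python
import hashlib

def uplink_for_flow(src_ip, dst_ip, src_port, dst_port, proto, n_uplinks):
    # Hash the 5-tuple; same flow -> same uplink, different flows spread out.
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % n_uplinks

# Two packets of the same flow always pick the same uplink:
print(uplink_for_flow("10.0.1.2", "10.2.0.5", 5000, 80, "tcp", 4))
print(uplink_for_flow("10.0.1.2", "10.2.0.5", 5000, 80, "tcp", 4))
```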

27 Drawbacks
– No inherent support for VLAN traffic
– Data center is fixed in size
– Ignores connectivity to the Internet
– Wastes address space; requires NAT at the border

28 Data Center Traffic Engineering: Challenges and Opportunities

29 Wide-Area Network
[Diagram: clients reach replicated data centers across the Internet; a DNS server performs DNS-based site selection among the data centers' routers and servers]

30 Wide-Area Network: Ingress Proxies
[Diagram: client traffic enters through ingress proxies in front of the data centers' routers and servers]

31 Traffic Engineering Challenges
Scale
– Many switches, hosts, and virtual machines
Churn
– Large number of component failures
– Virtual machine (VM) migration
Traffic characteristics
– High traffic volume and dense traffic matrix
– Volatile, unpredictable traffic patterns
Performance requirements
– Delay-sensitive applications
– Resource isolation between tenants

32 Traffic Engineering Opportunities
Efficient network
– Low propagation delay and high capacity
Specialized topology
– Fat tree, Clos network, etc.
– Opportunities for hierarchical addressing
Control over both network and hosts
– Joint optimization of routing and server placement
– Can move network functionality into the end host
Flexible movement of workload
– Services replicated at multiple servers and data centers
– Virtual machine (VM) migration

33 PortLand: Main Idea
[Animations: adding a new host; forwarding a packet]
Key features:
– A layer 2 protocol based on a tree topology
– PMACs (pseudo MACs) encode position information, and data forwarding proceeds based on PMACs (sketched below)
– Edge switches are responsible for mapping between PMACs and AMACs (actual MACs), keeping PMACs invisible to end hosts
– Each switch node can identify its own position by itself
– A fabric manager is responsible for address resolution and keeps information about the overall topology; on a fault, it notifies the affected nodes
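As a sketch of the position encoding, here is the PortLand-style pod.position.port.vmid layout packed into a 48-bit MAC (the 16/8/8/16-bit split follows the PortLand paper; the helper name is ours):

```python
# Pack (pod, position, port, vmid) into a 48-bit pseudo MAC address.

def encode_pmac(pod: int, position: int, port: int, vmid: int) -> str:
    value = (pod << 32) | (position << 24) | (port << 16) | vmid
    raw = value.to_bytes(6, "big")
    return ":".join(f"{b:02x}" for b in raw)

print(encode_pmac(pod=5, position=2, port=1, vmid=3))  # 00:05:02:01:00:03
```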

34 Questions from the Discussion Board
Question about the fabric manager:
– How do we make the fabric manager robust? How do we build a scalable fabric manager?
– Redundant deployment or a clustered fabric manager could be a solution.
Question about the realism of the baseline tree topology:
– Is the tree topology common in the real world?
– Yes, the multi-rooted tree has been a traditional data center topology. [A Scalable, Commodity Data Center Network Architecture, Mohammad Al-Fares et al., SIGCOMM '08]

35 Discussion Points
– Is PortLand applicable to other topologies? The idea of central ARP management could be; to solve the forwarding-loop problem, a TRILL-like header would be necessary.
– The benefits of PMAC? Without it, larger forwarding tables would be required.
– Benefits from VM migration: it helps reduce traffic going through aggregate/core switches. What about changing user requirements? What about power consumption?
– Etc.: feasibility of mimicking PortLand in a layer 3 protocol (e.g., using pseudo IP addresses); delay to boot up a whole data center.

36 Status Quo: Conventional DC Network
(Reference: Data Center: Load Balancing Data Center Services, Cisco, 2004)
[Diagram: Internet at top; DC-Layer 3 spans the core and access routers, DC-Layer 2 spans the switches and racks; each layer-2 island is one IP subnet; ~1,000 servers/pod]
Key: CR = Core Router (L3), AR = Access Router (L3), S = Ethernet Switch (L2), A = Rack of app. servers

37 Conventional DC Network Problems
[Diagram: oversubscription of roughly 5:1, 40:1, and 200:1 at successive levels of the hierarchy]
– Dependence on high-cost proprietary routers
– Extremely limited server-to-server capacity

38 And More Problems…
[Diagram: two IP subnets (VLANs) partition the racks; ~200:1 oversubscription]
– Resource fragmentation, significantly lowering utilization (and cost-efficiency)

39 And More Problems…
[Same diagram as the previous slide]
– Resource fragmentation, significantly lowering utilization (and cost-efficiency)
– Complicated manual L2/L3 reconfiguration

40 All We Need Is Just a Huge L2 Switch, or an Abstraction of One
[Diagram: the conventional CR/AR/S hierarchy replaced by one big virtual layer-2 switch connecting all servers]

41 VL2 Approach
– Layer 2 based, using future commodity switches
– Two-level hierarchy: access switches (top of rack) and load-balancing switches
– Eliminate spanning tree: flat routing allows the network to take advantage of path diversity
– Prevent MAC-address learning: a 4D-style architecture distributes data-plane information; ToR switches only need to learn addresses of the intermediate switches, and core switches only those of the ToR switches
– Support efficient grouping of hosts (a VLAN replacement)

42 VL2

43 VL2 Components
Top-of-rack switch
– Aggregates traffic from the 20 end hosts in a rack
– Performs IP-to-MAC translation
Intermediate switch
– Disperses traffic and balances it among switches
– Used for Valiant load balancing
Decision element
– Places routes in switches
– Maintains a directory service of IP-to-MAC mappings
End host
– Performs IP-to-MAC lookups

44 Routing in VL2
End host checks its flow cache for the MAC of the flow
– If not found, it asks the agent to resolve it
– The agent returns the MACs for the destination server and for the intermediate switches
Send traffic to the top-of-rack switch
– Traffic is triple-encapsulated
Traffic is sent to the intermediate destination, then to the destination's top-of-rack switch (see the sketch below)
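A rough sketch of the sender-side steps (invented names, not VL2's wire format): resolve the destination's ToR from the directory, pick a random intermediate switch for Valiant load balancing, and stack the headers so the packet visits intermediate, then destination ToR, then the destination host.

```python
import random

DIRECTORY = {"10.0.1.7": "ToR4"}           # app address -> ToR (assumed)
INTERMEDIATES = ["Int1", "Int2", "Int3"]   # intermediate switches (assumed)

def encapsulate(payload: bytes, dst_ip: str):
    tor = DIRECTORY[dst_ip]                # name -> location lookup
    mid = random.choice(INTERMEDIATES)     # Valiant indirection
    # Outermost header first: intermediate, then ToR, then the true dst.
    return (mid, tor, dst_ip, payload)

print(encapsulate(b"hello", "10.0.1.7"))
```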

45 The Illusion of a Huge L2 Switch
1. L2 semantics
2. Uniform high capacity
3. Performance isolation
[Diagram: all servers attached to a single virtual layer-2 switch]

46 Objectives and Solutions
Objective                                 | Approach                                           | Solution
1. Layer-2 semantics                      | Employ flat addressing                             | Name-location separation & resolution service
2. Uniform high capacity between servers  | Enforce hose model using existing mechanisms only  | Flow-based random traffic indirection (Valiant LB)
3. Performance isolation                  | Guarantee bandwidth for hose-model traffic         | TCP

47 Name/Location Separation
– Servers use flat names; switches run link-state routing and maintain only the switch-level topology
– A directory service maintains the name-to-location mapping (e.g., x → ToR2, y → ToR3, z → ToR4); senders look up the destination's ToR, and the mapping is simply updated when a host moves (e.g., z migrating from ToR4 to ToR3)
– Copes with host churn with very little overhead
Benefits: allows the use of low-cost switches, protects the network and hosts from host-state churn, and obviates host and switch reconfiguration (a sketch follows)
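A minimal sketch of the directory-service idea (names are illustrative): a mutable name-to-location map that absorbs host churn, such as a VM migration, as a single update, leaving switch state untouched.

```python
class Directory:
    def __init__(self):
        self._loc = {}                 # application address -> ToR switch

    def update(self, app_addr: str, tor: str):
        self._loc[app_addr] = tor      # a VM migration is just a remap

    def lookup(self, app_addr: str) -> str:
        return self._loc[app_addr]

d = Directory()
d.update("10.0.1.7", "ToR4")
d.update("10.0.1.7", "ToR3")           # VM moved; switches see no change
print(d.lookup("10.0.1.7"))            # -> ToR3
```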

48 Clos Network Topology
[Diagram: ToR switches (20 servers each), aggregation switches, and intermediate switches in a Clos topology]
– K aggregation switches with D ports support 20 × (DK/4) servers
– Offers huge aggregate capacity and multiple paths at modest cost (checked in the snippet below)
D (# of 10G ports) | Max DC size (# of servers)
48                 | 11,520
96                 | 46,080
144                | 103,680
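A quick check of the slide's capacity formula (our helper; the table assumes K = D, so the count reduces to 5·D²):

```python
# K aggregation switches with D ports support 20 * (D*K / 4) servers.

def max_servers(d_ports, k_aggr=None):
    k = d_ports if k_aggr is None else k_aggr
    return 20 * (d_ports * k) // 4

for d in (48, 96, 144):
    print(d, max_servers(d))   # 11520, 46080, 103680
```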

49 Valiant Load Balancing: Indirection
[Diagram: traffic from x to y is bounced off a randomly chosen intermediate switch I; separate links are used for up paths and down paths]
Requirements: (1) must spread traffic, (2) must ensure destination independence
Approach: equal-cost multipath (ECMP) forwarding + IP anycast
– Copes with arbitrary traffic matrices with very little overhead
– Harnesses huge bisection bandwidth
– Obviates esoteric traffic engineering or optimization
– Ensures robustness to failures
– Works with switch mechanisms available today

50 Properties of Desired Solutions
– Backwards compatible with existing infrastructure: no changes to applications; support for layer 2 (Ethernet)
– Cost-effective: low power consumption and heat emission; cheap infrastructure
– Allows host communication at line speed
– Zero configuration
– No loops
– Fault tolerant

51 Research Questions
What topology to use in data centers?
– Reducing wiring complexity
– Achieving high bisection bandwidth
– Exploiting capabilities of optics and wireless
Routing architecture?
– Flat layer-2 network vs. hybrid switch/router
– Flat vs. hierarchical addressing
How to perform traffic engineering?
– Over-engineering vs. adapting to load
– Server selection, VM placement, or optimizing routing
Virtualization of NICs, servers, switches, …

52 Research Questions (continued)
Rethinking TCP congestion control?
– Low propagation delay and high bandwidth
– Incast problem leading to bursty packet loss
Division of labor for TE, access control, …
– VM, hypervisor, ToR, and core switches/routers
Reducing energy consumption
– Better load balancing vs. selectively shutting down
Wide-area traffic engineering
– Selecting the least-loaded or closest data center
Security
– Preventing information leakage and attacks

