Data Center Networking


1 Data Center Networking
CS 6250: Computer Networking Fall 2011

2 Cloud Computing
Elastic resources: expand and contract resources; pay-per-use; infrastructure on demand
Multi-tenancy: multiple independent users; security and resource isolation; amortize the cost of the (shared) infrastructure
Flexible service management: resiliency (isolate failures of servers and storage); workload movement (move work to other locations)

3 Trend of Data Centers
(By J. Nicholas Hoover, InformationWeek, June 17; a 200 million Euro facility)
Data centers will become larger and larger in the future cloud computing era to benefit from economies of scale
Tens of thousands to hundreds of thousands of servers
Most important things in data center management: economies of scale, high utilization of equipment, maximizing revenue, amortizing administration cost, low power consumption

4 Cloud Service Models
Software as a Service: provider licenses applications to users as a service (e.g., customer relationship management, …); avoids costs of installation, maintenance, patches, …
Platform as a Service: provider offers a software platform for building applications (e.g., Google's App Engine); avoids worrying about scalability of the platform
Infrastructure as a Service: provider offers raw computing, storage, and network (e.g., Amazon's Elastic Compute Cloud, EC2); avoids buying servers and estimating resource needs

5 Multi-Tier Applications
Applications consist of tasks: many separate components running on different machines
Commodity computers: many general-purpose computers, not one big mainframe; easier scaling
(Figure: a front-end server feeding a tree of aggregators, which fan out to workers)

6 Enabling Technology: Virtualization
Multiple virtual machines on one physical machine
Applications run unmodified, as on a real machine
A VM can migrate from one computer to another

7 Status Quo: Virtual Switch in Server

8 Top-of-Rack Architecture
Rack of servers: commodity servers and a top-of-rack switch
Modular design: preconfigured racks with power, network, and storage cabling
Aggregate to the next level

9 Modularity
Containers
Many containers

10 Data Center Challenges
Traffic load balancing
Support for VM migration
Achieving high bisection bandwidth
Power savings / cooling
Network management (provisioning)
Security (dealing with multiple tenants)

11 Data Center Costs (Monthly Costs)
Servers: 45% (CPU, memory, disk)
Infrastructure: 25% (UPS, cooling, power distribution)
Power draw: 15% (electrical utility costs)
Network: 15% (switches, links, transit)

12 Common Data Center Topology
(Figure: Internet at the top, connected through layer-3 core routers, then layer-2/3 aggregation switches, then layer-2 access switches, down to the servers)
The commonly used three-tier data center topology: at the top is the core tier, whose layer-3 routers connect the data center to the Internet or to the rest of the campus network. At the bottom is the access tier, containing the layer-2 switches into which servers are plugged. In between are the layer-2/3 switches of the aggregation tier, where middleboxes are commonly deployed. Multiple redundant links connect the various switches and servers; to prevent forwarding loops, mechanisms such as spanning tree construction block some of the links.

13 Data Center Network Topology
(Figure: Internet connected via core routers (CR) to access routers (AR), then Ethernet switches (S), then racks of application servers (A); ~1,000 servers per pod)
Key: CR = Core Router, AR = Access Router, S = Ethernet Switch, A = Rack of app servers

14 Requirements for Future Data Centers
To keep up with the trend toward mega data centers, DCN technology should meet the following requirements:
High scalability
Transparent VM migration (high agility)
Easy deployment requiring less human administration
Efficient communication
Loop-free forwarding
Fault tolerance
Current DCN technology cannot meet these requirements: layer-3 protocols cannot support transparent VM migration, and current layer-2 protocols are not scalable due to the size of the forwarding table and the native broadcasting used for address resolution.

15 Problems with Common Topologies
Single point of failure
Oversubscription of links higher up in the topology
Tradeoff between cost and provisioning

16 Capacity Mismatch
(Figure: oversubscription increases up the hierarchy: roughly 5:1 from servers to access switches, ~40:1 at the aggregation layer, and ~200:1 at the core; see the arithmetic sketch below)
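A small illustration of how such ratios arise; the port counts and speeds below are assumptions for the example, not the slide's actual hardware. The oversubscription ratio at a layer is the server-facing capacity divided by the uplink capacity, and the ratios compound as you move up the hierarchy.

```python
# Illustrative oversubscription arithmetic (assumed port counts and speeds,
# not the slide's actual hardware): ratio = downlink capacity / uplink capacity.
def oversubscription(down_ports, down_gbps, up_ports, up_gbps):
    return (down_ports * down_gbps) / (up_ports * up_gbps)

# e.g., a ToR with 40 x 1G server ports and 2 x 10G uplinks is 2:1 oversubscribed;
# stacking such ratios layer by layer compounds toward the ~200:1 seen at the core.
print(oversubscription(40, 1, 2, 10))   # -> 2.0
```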

17 Data-Center Routing
(Figure: Internet connected via core routers (CR, DC-layer 3) to access routers (AR), then Ethernet switches (S, DC-layer 2), then racks of app servers (A); ~1,000 servers per pod == one IP subnet)
Key: CR = Core Router (L3), AR = Access Router (L3), S = Ethernet Switch (L2), A = Rack of app servers

18 Reminder: Layer 2 vs. Layer 3
Ethernet switching (layer 2): cheaper switch equipment; fixed addresses and auto-configuration; seamless mobility, migration, and failover
IP routing (layer 3): scalability through hierarchical addressing; efficiency through shortest-path routing; multipath routing through equal-cost multipath (ECMP)
So, as in enterprises, data centers often connect layer-2 islands by IP routers

19 Need for Layer 2
Certain monitoring apps require servers with the same role to be on the same VLAN
Using the same IP address on dual-homed servers
Allows organic growth of server farms
Migration is easier

20 Review of Layer 2 & Layer 3
Layer 2: one spanning tree for the entire network; prevents loops but ignores alternate paths
Layer 3: shortest-path routing between source and destination; best-effort delivery

21 Fat-Tree-Based Solution
Connect end hosts together using a fat-tree topology
Infrastructure consists of cheap devices; each port supports the same speed as an end host
All devices can transmit at line speed if packets are distributed along the existing paths
A fat tree built from k-port switches can support k^3/4 hosts (see the sketch below)
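A minimal sketch of the k^3/4 sizing arithmetic; the switch radices used below are example values, not prescribed by the slide.

```python
# Sizing a k-ary fat tree built from k-port switches.
def fat_tree_size(k):
    """Return (pods, core switches, hosts) for a k-ary fat tree."""
    pods = k                      # k pods, each with k/2 edge + k/2 aggregation switches
    core = (k // 2) ** 2          # (k/2)^2 core switches
    hosts = k ** 3 // 4           # k pods * (k/2 edge switches) * (k/2 hosts each)
    return pods, core, hosts

for k in (4, 24, 48):
    pods, core, hosts = fat_tree_size(k)
    print(f"k={k}: {pods} pods, {core} core switches, {hosts} hosts")
# k=48 gives 27,648 hosts, matching the k^3/4 formula.
```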

22 Fat-Tree Topology

23 Problems with a Vanilla Fat-tree
Layer 3 will only use one of the existing equal-cost paths
Packet re-ordering occurs if layer 3 blindly takes advantage of path diversity

24 Modified Fat Tree
Enforce a special addressing scheme in the data center: unused.PodNumber.SwitchNumber.EndHost
Allows hosts attached to the same switch to route only through that switch
Allows intra-pod traffic to stay within the pod
Use two-level lookups to distribute traffic and maintain packet ordering (see the address sketch below)
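A small sketch of how such position-encoded addresses might look, assuming the 10.pod.switch.id convention of the fat-tree paper; the helper name and example values are illustrative.

```python
# Sketch: position-encoded addresses of the form unused.pod.switch.host,
# using "10" for the unused leading octet (an assumption, following the
# fat-tree paper's 10.pod.switch.id convention).
def host_address(pod, edge_switch, host_port):
    return f"10.{pod}.{edge_switch}.{host_port}"

# Example: the 3rd host on edge switch 1 in pod 2.
print(host_address(pod=2, edge_switch=1, host_port=3))   # -> 10.2.1.3
```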

25 Two-Level Lookups
First level is a prefix lookup: used to route down the topology toward the end host
Second level is a suffix lookup: used to route up toward the core; diffuses and spreads out traffic
Maintains packet ordering by using the same port for the same end host (see the sketch below)
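A minimal sketch of the two-level lookup; the table entries and port names are assumptions for illustration, not actual switch configuration.

```python
# Two-level lookup: match on a destination prefix to route down toward a
# local host; if no prefix matches, fall back to a suffix match on the
# host octet to pick an uplink toward the core.
import ipaddress

prefix_table = {              # routes "down" toward hosts in this pod
    "10.2.0.0/24": "port 0",
    "10.2.1.0/24": "port 1",
}
suffix_table = {              # routes "up": spread by the host ID's last octet
    0: "uplink 2",
    1: "uplink 3",
}

def lookup(dst):
    addr = ipaddress.ip_address(dst)
    for prefix, port in prefix_table.items():
        if addr in ipaddress.ip_network(prefix):
            return port                                    # first level: prefix match
    host_id = int(dst.split(".")[-1])
    return suffix_table[host_id % len(suffix_table)]       # second level: suffix match

print(lookup("10.2.1.3"))   # down toward a local host -> port 1
print(lookup("10.4.1.3"))   # up toward the core, chosen by host suffix -> uplink 3
```

Because the uplink depends only on the destination host suffix, packets of the same flow always take the same port, which preserves ordering while still spreading different destinations across uplinks.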

26 Diffusion Optimizations
Flow classification: eliminates local congestion by assigning traffic to ports on a per-flow basis instead of a per-host basis (see the sketch below)
Flow scheduling: eliminates global congestion by preventing long-lived flows from sharing the same links; assign long-lived flows to different links
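A minimal sketch of per-flow port assignment by hashing the flow 5-tuple; the uplink names and choice of hash are assumptions for illustration.

```python
# Per-flow uplink assignment: hashing the 5-tuple keeps all packets of one
# flow on one uplink (preserving ordering) while spreading different flows
# across the available uplinks.
import hashlib

UPLINKS = ["uplink0", "uplink1", "uplink2", "uplink3"]   # assumed 4 uplinks

def pick_uplink(src_ip, dst_ip, src_port, dst_port, proto="tcp"):
    flow_key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}/{proto}".encode()
    digest = hashlib.sha256(flow_key).digest()
    return UPLINKS[digest[0] % len(UPLINKS)]

# Two flows between the same pair of hosts can land on different uplinks:
print(pick_uplink("10.2.1.3", "10.4.0.2", 5001, 80))
print(pick_uplink("10.2.1.3", "10.4.0.2", 5002, 80))
```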

27 Drawbacks
No inherent support for VLAN traffic
Data center size is fixed by the topology
Connectivity to the Internet is largely ignored
Wastes address space; requires NAT at the border

28 Data Center Traffic Engineering
Challenges and Opportunities

29 Wide-Area Network
(Figure: clients reach replicated data centers across the Internet; a DNS server performs DNS-based site selection to direct each client to one data center's servers, fronted by a router)

30 Wide-Area Network: Ingress Proxies
(Figure: client traffic enters through ingress proxies, then traverses routers to reach servers in the data centers)

31 Traffic Engineering Challenges
Scale: many switches, hosts, and virtual machines
Churn: large number of component failures; virtual machine (VM) migration
Traffic characteristics: high traffic volume and dense traffic matrix; volatile, unpredictable traffic patterns
Performance requirements: delay-sensitive applications; resource isolation between tenants

32 Traffic Engineering Opportunities
Efficient network: low propagation delay and high capacity
Specialized topology: fat tree, Clos network, etc.; opportunities for hierarchical addressing
Control over both network and hosts: joint optimization of routing and server placement; can move network functionality into the end host
Flexible movement of workload: services replicated at multiple servers and data centers; virtual machine (VM) migration

33 PortLand: Main Idea
Layer-2 protocol based on a multi-rooted tree topology
PMAC (pseudo MAC) encodes a host's position; data forwarding proceeds based on PMACs
Edge switches are responsible for mapping between PMAC and AMAC (actual MAC), keeping PMACs invisible to end hosts (see the sketch below)
Fabric manager is responsible for address resolution, keeps information about the overall topology, and notifies affected nodes when a fault occurs
Each switch can identify its own position by itself
(Slide walks through two cases: adding a new host and transferring a packet)
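A minimal sketch of PMAC encoding, assuming the 16/8/8/16-bit pod.position.port.vmid layout described in the PortLand paper; the example values and the AMAC are made up.

```python
# Encode a host's location into a 48-bit pseudo MAC (PMAC).
# Assumed layout (from the PortLand paper): 16-bit pod, 8-bit position,
# 8-bit port, 16-bit vmid.
def encode_pmac(pod, position, port, vmid):
    value = (pod << 32) | (position << 24) | (port << 16) | vmid
    raw = value.to_bytes(6, "big")
    return ":".join(f"{b:02x}" for b in raw)

# The edge switch keeps the PMAC <-> AMAC mapping; end hosts only see AMACs.
pmac_to_amac = {}
pmac = encode_pmac(pod=2, position=1, port=3, vmid=1)
pmac_to_amac[pmac] = "00:19:b9:fa:88:e2"    # hypothetical actual MAC
print(pmac)                                  # -> 00:02:01:03:00:01
```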

34 Questions in the Discussion Board
Questions about the fabric manager: How do we make the fabric manager robust? How do we build a scalable fabric manager? Redundant deployment or a cluster of fabric managers could be a solution.
Question about the realism of the baseline tree topology: Is the tree topology common in the real world? Yes, the multi-rooted tree has been a traditional data center topology. ["A Scalable, Commodity Data Center Network Architecture," Mohammad Al-Fares et al., SIGCOMM '08]

35 Discussion Points
Is PortLand applicable to other topologies? The idea of central ARP management could be; to solve the forwarding-loop problem, a TRILL-like header would be necessary. What are the benefits of PMAC? Without position-encoded addresses, larger forwarding tables would be required.
Questions about the benefits of VM migration: VM migration helps reduce traffic going through aggregation/core switches. What about changing user requirements? What about power consumption?
Etc.: feasibility of mimicking PortLand with a layer-3 protocol (e.g., using pseudo IP addresses); delay to boot up a whole data center

36 Status Quo: Conventional DC Network
(Figure: Internet connected via core routers (CR, DC-layer 3) to access routers (AR), then Ethernet switches (S, DC-layer 2), then racks of app servers (A); ~1,000 servers per pod == one IP subnet)
Key: CR = Core Router (L3), AR = Access Router (L3), S = Ethernet Switch (L2), A = Rack of app servers
Reference: "Data Center: Load Balancing Data Center Services", Cisco, 2004

37 Conventional DC Network Problems
(Figure: oversubscription of roughly 5:1 at the access layer, ~40:1 at aggregation, and ~200:1 at the core)
Dependence on high-cost proprietary routers
Extremely limited server-to-server capacity

38 And More Problems …
(Figure: servers beneath different access routers are partitioned into IP subnet (VLAN) #1 and IP subnet (VLAN) #2)
Resource fragmentation, significantly lowering utilization (and cost-efficiency)

39 And More Problems …
(Figure: same partitioned topology, with IP subnet (VLAN) #1 and #2 under different access routers)
Complicated manual L2/L3 re-configuration
Resource fragmentation, significantly lowering cloud utilization (and cost-efficiency)

40 All We Need is Just a Huge L2 Switch, or an Abstraction of One
(Figure: every rack of servers plugged into a single giant layer-2 switch)

41 VL2 Approach
Layer-2 based, using future commodity switches
Two-level hierarchy: access (top-of-rack) switches and load-balancing (intermediate) switches
Eliminate spanning tree; flat routing allows the network to take advantage of path diversity
Prevent MAC address learning; use a 4D-style architecture to distribute data-plane information
ToR switches only need to learn addresses for the intermediate switches; core switches learn only the ToR switches
Support efficient grouping of hosts (VLAN replacement)

42 VL2

43 VL2 Components
Top-of-rack switch: aggregates traffic from 20 end hosts in a rack; performs IP-to-MAC translation
Intermediate switch: disperses and balances traffic among switches; used for Valiant load balancing
Decision element: places routes in switches; maintains a directory service of IP-to-MAC mappings
End host: performs IP-to-MAC lookup

44 Routing in VL2
The end host checks its flow cache for the MAC of the flow; if not found, it asks the agent to resolve it
The agent returns the MACs for the destination server and for the intermediate switches
The host sends traffic to its top-of-rack switch; traffic is triple-encapsulated
Traffic is sent to an intermediate switch, then to the destination's top-of-rack switch (see the sketch below)
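A rough sketch of the agent-side logic described above; the directory contents, switch names, and helper functions are hypothetical, and the header structure is only schematic rather than VL2's actual packet format.

```python
# Sketch of VL2-style routing through a randomly chosen intermediate switch.
# All addresses, directory contents, and names are illustrative only.
import random

directory = {                      # application address -> (dst ToR, dst server MAC)
    "20.0.0.55": ("tor-7", "00:1b:21:aa:bb:07"),
}
intermediate_switches = ["int-1", "int-2", "int-3"]

def encapsulate(dst_aa, payload):
    dst_tor, dst_mac = directory[dst_aa]                   # resolve via directory service
    intermediate = random.choice(intermediate_switches)    # Valiant load balancing
    # Outer headers: intermediate switch, then destination ToR, then the server itself.
    return {"outer": intermediate, "middle": dst_tor, "inner": dst_mac,
            "payload": payload}

print(encapsulate("20.0.0.55", b"hello"))
```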

45 The Illusion of a Huge L2 Switch
1. L2 semantics
2. Uniform high capacity
3. Performance isolation
(Figure: all servers attached to one giant virtual layer-2 switch)

46 Objectives and Solutions (Objective / Approach / Solution)
1. Layer-2 semantics / employ flat addressing / name-location separation & resolution service
2. Uniform high capacity between servers / guarantee bandwidth for hose-model traffic / flow-based random traffic indirection (Valiant LB)
3. Performance isolation / enforce hose model using existing mechanisms only / TCP

47 Name/Location Separation
Cope with host churn with very little overhead
VL2 directory service maps flat server names to their current ToR (e.g., x -> ToR2, y -> ToR3, z -> ToR4; after z migrates, z -> ToR3)
Switches run link-state routing and maintain only switch-level topology: allows low-cost switches, protects the network and hosts from host-state churn, and obviates host and switch reconfiguration
Stale cache entries are updated lazily: when a packet erroneously reaches the old ToR, it forwards to the directory server, which updates the sender via unicast
Servers use flat names (see the sketch below)
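A minimal sketch of the name-to-ToR directory with a lazy cache update; all names are illustrative.

```python
# VL2-style directory service: flat server names map to the ToR they
# currently sit behind, and the mapping is rewritten when a VM migrates.
directory = {"x": "ToR2", "y": "ToR3", "z": "ToR4"}
sender_cache = {"z": "ToR4"}          # a sender's (possibly stale) cached mapping

def migrate(host, new_tor):
    directory[host] = new_tor         # authoritative mapping updated

def resolve(host):
    return directory[host]            # lookup & response

migrate("z", "ToR3")
# The sender's cached entry is now stale; a misdelivered packet at the old
# ToR would trigger a lazy update from the directory service:
sender_cache["z"] = resolve("z")
print(sender_cache["z"])              # -> ToR3
```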

48 Clos Network Topology (VL2)
Offers huge aggregate capacity and multiple paths at modest cost
K aggregation switches, each with D ports: D/2 ports facing up and D/2 facing down; each ToR connects 20 servers; total capacity 20*(DK/4) servers
D (# of 10G ports) vs. max DC size (# of servers): 48 -> 11,520; 96 -> 46,080; 144 -> 103,680 (see the worked sizing below)
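A worked check of the slide's sizing table, assuming K = D (the number of aggregation switches equals their port count), which reproduces the listed numbers.

```python
# Sizing the VL2 Clos fabric: with D-port aggregation switches and K
# aggregation switches, the fabric supports 20 * D * K / 4 servers.
# Assumption: K = D, which matches the slide's table.
def max_servers(d_ports, k_aggr_switches=None):
    k = k_aggr_switches if k_aggr_switches is not None else d_ports
    return 20 * d_ports * k // 4

for d in (48, 96, 144):
    print(d, max_servers(d))
# -> 48 11520, 96 46080, 144 103680, matching the slide's table.
```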

49 Valiant Load Balancing: Indirection
Cope with arbitrary traffic matrices (TMs) with very little overhead
Requirements: 1. must spread traffic, 2. must ensure destination independence
Mechanism: bounce each flow off a randomly chosen intermediate switch, selected by equal-cost multipath (ECMP) forwarding toward an IP anycast address (IANY) shared by the intermediate switches (see the sketch below)
Harnesses huge bisection bandwidth; obviates esoteric traffic engineering or optimization; ensures robustness to failures; works with switch mechanisms available today
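A minimal sketch of destination-independent spreading across intermediate switches via flow hashing, in the spirit of ECMP toward the shared IANY address; the switch names and hash choice are illustrative.

```python
# Valiant load balancing with ECMP-style hashing: all intermediate switches
# share one anycast locator (IANY), and a flow hash picks which of them
# actually carries a given flow, independent of the destination.
import hashlib

intermediates = ["int-1", "int-2", "int-3"]   # all answer to the anycast IANY

def pick_intermediate(flow_tuple):
    digest = hashlib.sha256(repr(flow_tuple).encode()).digest()
    return intermediates[digest[0] % len(intermediates)]   # destination-independent spread

flow = ("10.0.0.5", "10.0.3.9", 44123, 80, "tcp")
print(pick_intermediate(flow))   # the same flow always bounces off the same switch
```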

50 Properties of Desired Solutions
Backwards compatible with existing infrastructure: no changes to applications; support for layer 2 (Ethernet)
Cost-effective: low power consumption and heat emission; cheap infrastructure
Allows host communication at line speed
Zero-configuration
No loops
Fault-tolerant

51 Research Questions
What topology to use in data centers? Reducing wiring complexity; achieving high bisection bandwidth; exploiting capabilities of optics and wireless
Routing architecture? Flat layer-2 network vs. hybrid switch/router; flat vs. hierarchical addressing
How to perform traffic engineering? Over-engineering vs. adapting to load; server selection, VM placement, or optimizing routing
Virtualization of NICs, servers, switches, …

52 Research Questions
Rethinking TCP congestion control? Low propagation delay and high bandwidth; "incast" problem leading to bursty packet loss
Division of labor for TE, access control, …: VM, hypervisor, ToR, and core switches/routers
Reducing energy consumption: better load balancing vs. selectively shutting down equipment
Wide-area traffic engineering: selecting the least-loaded or closest data center
Security: preventing information leakage and attacks

