CMPE 252A : Computer Networks


CMPE 252A : Computer Networks Chen Qian Computer Engineering UCSC Baskin Engineering Lecture 12

Jellyfish: Networking Data Centers Randomly. Paper by Ankit Singla et al., NSDI 2012. Some figures are from slides presented by Chi-Yao Hong, UIUC.

Facebook, Amazon: ‘Add capacity on a DAILY BASIS’. Data centers are useful, real-world, and growing. http://news.netcraft.com/archives/2013/05/20/amazon-web-services-growth-unrelenting.html

Fat-Tree Topology Incremental growth??

Structured networks Hypercube, Fattree

Fat tree: Structure vs. Limit. A 3-level fat tree built from k-port switches uses 5k^2/4 switches and supports k^3/4 hosts: 24-port -> 3456 hosts; 32-port -> 8192 hosts; 48-port -> 27648 hosts. What about 10000 hosts? Over-utilize, or leave ports unused? Over-utilizing (modifying the leaves of the tree) leads to congestion. Leaving ports unused means a huge cost.
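The sizing problem on this slide can be checked with a few lines of arithmetic. The sketch below uses the standard 3-level fat-tree formulas (k^3/4 hosts, 5k^2/4 switches); the function names are illustrative and not from the lecture.

```python
def fat_tree_hosts(k: int) -> int:
    """Hosts supported by a 3-level fat tree of k-port switches."""
    return k ** 3 // 4


def fat_tree_switches(k: int) -> int:
    """Switches needed: k pods of k switches each, plus (k/2)^2 core switches."""
    return 5 * k ** 2 // 4


def smallest_fat_tree(hosts: int) -> int:
    """Smallest even port count k whose fat tree holds at least `hosts` hosts."""
    k = 2
    while fat_tree_hosts(k) < hosts:
        k += 2
    return k


for k in (24, 32, 48):
    print(k, fat_tree_hosts(k), fat_tree_switches(k))

k = smallest_fat_tree(10000)
print(k, fat_tree_hosts(k) - 10000)  # port count and number of unused host slots
```

For 10000 hosts the next fitting size leaves more than a thousand host slots unused, which is the "leave unused ports" cost the slide refers to.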

No structure = no restriction. Goals: Bandwidth & Capacity (better VM placement -> reduce traffic; better topology -> avoid bottlenecks). Robustness (failure resistance). Flexibility: incremental expansion (easy to add VMs, easy to remove VMs). Let's consider what the goals are when building a data center. Jellyfish is based on the intuition that without a strict topology structure, we do not need to follow structural rules when modifying the network, which gives high flexibility. Later slides show that this unstructured topology also provides high bandwidth.

Jellyfish: no structure. The Jellyfish approach is to construct a random graph at the top-of-rack switch layer. Servers are connected at the border of the random graph.

Visualized Jellyfish topology. Each switch has 12 neighbor switches. What does it look like? A large core with antennae. Topology of a Jellyfish network for 432 servers, 180 switches, degree = 12.

Random graph, regular graph, random regular graph. Before introducing the Jellyfish details, we first introduce some concepts. Regular graph RG(n,r): each vertex has the same degree r. Random regular graph: a graph sampled at random from all RG(n,r). Hard to generate (NP-hard). Question: how to generate one?

Not-so-uniform random regular graph RRG(n,r). Procedure to modify RRG(n-1,r) into RRG(n,r); example with r=3: RRG(4,3) -> RRG(5,3). We actually use a not-so-uniform random regular graph, a little different from the paper. r refers to k-r …. RRG(n,r) is required to have at least r+1 nodes (the full graph).
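The slide does not spell out the expansion procedure, so the following is only a sketch under an assumption: it uses the incremental step described in the Jellyfish paper (remove a random existing link (v, w), then connect the new switch to both v and w, repeated while the new switch has at least two free ports). The names `complete_graph` and `add_switch` are hypothetical.

```python
import random


def complete_graph(n):
    """Full graph on n nodes, used as the starting point RRG(r+1, r)."""
    return {v: {w for w in range(n) if w != v} for v in range(n)}


def add_switch(adj, r, rng):
    """Assumed expansion step: splice a new node into r // 2 random links."""
    u = len(adj)
    adj[u] = set()
    for _ in range(r // 2):
        # Links not touching u whose endpoints are not yet neighbors of u.
        candidates = [(v, w) for v in adj for w in adj[v]
                      if v < w and u not in (v, w)
                      and v not in adj[u] and w not in adj[u]]
        if not candidates:
            break
        v, w = rng.choice(candidates)
        adj[v].discard(w)
        adj[w].discard(v)
        for x in (v, w):
            adj[u].add(x)
            adj[x].add(u)
    return u


rng = random.Random(0)
adj = complete_graph(4)      # RRG(4, 3) is the full graph on 4 nodes
add_switch(adj, 3, rng)      # 5 nodes: the new node uses 2 of its 3 ports
print({v: len(n) for v, n in adj.items()})
```

With r = 3 the new node ends up with degree 2 and one free port, since a 3-regular graph on 5 nodes cannot exist; this is one way to read "not-so-uniform" on the slide.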

Goals: Bandwidth & Capacity (better VM placement -> reduce traffic; better topology -> avoid bottlenecks). Incremental Expansion (easy to add VMs, easy to remove VMs). Next, some examples show that Jellyfish is a better topology.

About the evaluation. Bisection bandwidth: theoretical calculation for RRGs, using Bollobás' theoretical lower bounds. Throughput: random permutation traffic, where each host chooses one other host to send to (at full speed).

Jellyfish vs. LEGUP. LEGUP attempts to maximize bandwidth; it optimizes for bisection bandwidth. The drop … occurs because the number of servers increases in that step.

Vs. FatTree: bisection bandwidth. Jellyfish: larger bisection bandwidth using the same number of switches and servers. Jellyfish: more servers under the same bisection bandwidth and number of switches.

Lower cost

Better failure resilience

Larger throughput. Shows the number of servers at full throughput, under the assumption of optimal routing. Calculated, not simulated.

Jellyfish vs. Small World. Small world: grid + random links. Small World Ring (2 regular + 4 random); Small World 2D Torus (4 regular + 2 random); Small World 3D Hexagon Torus (5 regular + 1 random).

Jellyfish has higher capacity than the (same-equipment) small-world data center topologies [41] built using a ring, a 2D-Torus, and a 3D-Hex-Torus as the underlying lattice. Results are averaged over 10 runs. The paper does not clearly state what the value 1 means; I think the normalized throughput is the ratio of the Jellyfish throughput to the total server bandwidth.

Reason of better performance

redundancy

Better than Jellyfish? Knowing that Jellyfish has less redundancy than a fat tree without producing congestion, we might ask whether we can do better: can we link more hosts using the same number of switches? That is, we try to build a larger network topology in which each switch has the same number of ports as the switches in Jellyfish, while the routing cost, indicated by the diameter of the graph, does not increase. Stated as a more specific question in combinatorial mathematics: how many switches can be connected, with 3 switch-to-switch ports each and switch-to-switch path length <= 2? The answer is 10: the Petersen graph.
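The answer of 10 switches can be verified directly. The sketch below builds the Petersen graph, checks that every node has degree 3 and that the diameter is 2, and compares the node count with the Moore bound (the general upper limit on nodes for a given degree and diameter). The helper names are illustrative.

```python
from collections import deque


def moore_bound(degree, diameter):
    """Upper bound on the number of nodes for a given degree and diameter."""
    return 1 + degree * sum((degree - 1) ** i for i in range(diameter))


def petersen_graph():
    """Outer 5-cycle (0..4), inner pentagram (5..9), and five spokes."""
    adj = {v: set() for v in range(10)}

    def link(a, b):
        adj[a].add(b)
        adj[b].add(a)

    for i in range(5):
        link(i, (i + 1) % 5)          # outer cycle
        link(i, i + 5)                # spoke
        link(i + 5, (i + 2) % 5 + 5)  # inner pentagram
    return adj


def diameter(adj):
    """Longest shortest path, via BFS from every node (graph must be connected)."""
    worst = 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        worst = max(worst, max(dist.values()))
    return worst


g = petersen_graph()
print(len(g), {len(n) for n in g.values()}, diameter(g), moore_bound(3, 2))
```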

Degree-diameter-graph Generating a large delta-

Degree-diameter graphs have (nearly) the highest throughput. Jellyfish is only a little worse.

But… Practical constraint: Routing / Congestion Control Cable

Routing & Congestion Control: utilize capacity without structure (no layers!). Routing: ECMP fails to provide large path diversity; K shortest paths. Congestion control: TCP / multipath TCP. If all available capacity is fully utilized,

K-shortest path. Complexity: O(k^2 N * ShortestPath(N)). Different paths: S-e1-e2-e3-…ex…-en-T and S-e1-e2-e3-…ey…-em-T.
Algorithm to find the 2nd-shortest path:
Find a shortest path P from S to T in G.
For each edge e in P:
  Remove e from G.
  Calculate the shortest path on G, namely SP(e).
  Add e back to G.
Return min(SP(e)).
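The 2nd-shortest-path procedure above can be sketched as follows. This is a minimal version that assumes an unweighted, undirected graph stored as a dict of neighbor sets and uses BFS as the shortest-path routine; the function names are illustrative.

```python
from collections import deque


def shortest_path(adj, s, t):
    """BFS shortest path from s to t as a list of nodes, or None."""
    prev = {s: None}
    queue = deque([s])
    while queue:
        v = queue.popleft()
        if v == t:
            path = []
            while v is not None:
                path.append(v)
                v = prev[v]
            return path[::-1]
        for w in adj[v]:
            if w not in prev:
                prev[w] = v
                queue.append(w)
    return None


def second_shortest_path(adj, s, t):
    """Shortest path that avoids at least one edge of the shortest path P."""
    p = shortest_path(adj, s, t)
    if p is None:
        return None
    best = None
    for a, b in zip(p, p[1:]):
        adj[a].remove(b)              # remove e from G
        adj[b].remove(a)
        cand = shortest_path(adj, s, t)   # SP(e)
        adj[a].add(b)                 # add e back to G
        adj[b].add(a)
        if cand is not None and (best is None or len(cand) < len(best)):
            best = cand
    return best


graph = {
    "S": {"A", "B"}, "A": {"S", "T"}, "B": {"S", "C"},
    "C": {"B", "T"}, "T": {"A", "C"},
}
print(shortest_path(graph, "S", "T"))
print(second_shortest_path(graph, "S", "T"))
```

Any path different from P must miss at least one edge of P, which is why trying each removal in turn and taking the minimum is sufficient.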

K-shortest path forwarding. Shortest paths (S,T): S-A-B1-C1-D-T, S-A-B2-C2-D-T, S-A-B3-C2-D-T. [Figure: example topology with switches S, A, B1, B2, B3, C1, C2, D, T. Forwarding-table entries shown include (S,T): A and (A,T): B1, B2, B3; entries are also shown for (B1,T) and (B4,T).] A switch chooses a random node from its routing-table entry to forward to, e.g., A. Interestingly, a packet going from S to T can, when it passes through B, be forwarded backward to A. This is because the example graph is not well connected; a random regular graph is theoretically well connected, and its degrees are much larger. The thick edges in the figure indicate how many paths an edge is on: the edge (S,A) is on three paths, while C2-D is on two. For the links between switches, if a link can be used by different sessions, meaning that different connection pairs can use the same link, then we can utilize the link more.

About 1000 links are each on at most 1 routing path. Inter-switch link's path count in ECMP and k-shortest-path routing for random permutation traffic at the server level on a typical Jellyfish of 686 servers. For each link, we count the number of distinct paths it is on.

Multi Path TCP (MPTCP) http://blogs.citrix.com/2013/08/23/networking-beyond-tcp-the-mptcp-way/

Packet simulation results for different routing and congestion control protocols

Cabling: Jellyfish uses 20% fewer cables.

Cabling in large data centers: the topology is generated automatically, but cables are connected manually (10% of cost). Error detection: link-layer discovery protocol.

Jellyfish of Jellyfish Restrict some connections in pod Result: 2-layered random Graph


Cables between pods can be aggregated

Conclusion. Bandwidth & Capacity: enough, even higher than fat tree. Incremental Expansion. Lower Cost. Limitations: slow to compute forwarding paths; large forwarding tables.

Space Shuffle: A Scalable, Flexible, and High-Bandwidth Data Center Network Ye Yu and Chen Qian

Motivation: Goals of Data Center Design. High bandwidth: data center applications generate a high volume of internal and external communication. Flexibility: adding servers and expanding network bandwidth incrementally. Scalability: routing and forwarding should rely on small forwarding state.

Motivation: Existing Data Center Architectures
Architecture | Network bandwidth | Incremental growth (flexible) | Forwarding state per switch (scalability)
FatTree [SIGCOMM'08] | Good | No | Fixed
SWDC [SOCC'11] | Fair | Yes | Constant
Jellyfish [NSDI'12] | Better than FatTree & SWDC | Yes | Large and grows fast
SWDC (greedy routing): no shortest paths; does not support multipath well. Jellyfish (random interconnection): k-shortest-path routing is inefficient; big forwarding state.

Motivation: Goal of Space Shuffle (S2) How to build a flexible data center architecture that achieves high-throughput and scalability ? Approach: Greedy routing on random interconnection. Challenges: How to build a random interconnection that enables greedy routing? How does the greedy routing protocol achieve high-throughput and near-optimal path length?

Outline Motivation Space Shuffle Data Center Topology The Routing Protocol in Space Shuffle Data Center Discussion & Evaluation

S2 Topology Construction: -Assign Servers. Servers and top-of-rack switches. Uniformly assign servers to switches. Connect servers to switches. The remaining ports are used for inter-switch connections.

S2 Topology Construction: -Virtual Coordinates

S2 Topology Construction: -Virtual Spaces
Switch ID | Coor. 1 | Coor. 2
A | 0.05 | 0.17
B | 0.13 | 0.62
C | 0.23 | 0.91
D | 0.36 | 0.42
E | 0.53 |
F | 0.51 | 0.58
G | 0.63 | 0.73
H | 0.78 | 0.26
I | 0.97 |
[Figure: switches A to I placed on two rings, Space 1 and Space 2, according to their coordinates.]

S2 Topology Construction: -Connect the switches. [Figure: the rings of Space 1 and Space 2 with switches A to I.] A switch is physically connected to the switches that are adjacent to it in at least one space.

S2 Topology Construction: -Connect the switches. [Figure: the resulting topology connecting switches A to I.]

S2 Topology Construction: -Deploy-as-a-whole Construction Step 1 Assign hosts / switches Step 2 Generate coordinates (randomly) Step 3 Wire the network according to the coordinates.
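Steps 2 and 3 can be sketched in a few lines. This is an illustrative version only: it draws plain uniform random coordinates (not the balanced generator discussed later in the deck), and the function name `build_s2` and its parameters are assumptions, not taken from the S2 work.

```python
import random


def build_s2(switches, num_spaces, rng):
    """Assign each switch one random coordinate per space, then link ring neighbors."""
    coords = {s: tuple(rng.random() for _ in range(num_spaces)) for s in switches}
    links = set()
    for i in range(num_spaces):
        # Order the switches around the ring of space i by their coordinate.
        ring = sorted(switches, key=lambda s: coords[s][i])
        for a, b in zip(ring, ring[1:] + ring[:1]):
            if a != b:
                links.add(frozenset((a, b)))  # undirected inter-switch link
    return coords, links


switches = list("ABCDEFGHI")
coords, links = build_s2(switches, 2, random.Random(1))
degree = {s: sum(1 for link in links if s in link) for s in switches}
print(len(links), degree)
```

With L spaces every switch needs at most 2L inter-switch ports, because it has two ring neighbors per space and a neighbor shared between spaces needs only one cable.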

S2 Topology Construction: -Incremental Construction. Add a new switch T into an existing S2 network:
Assign coordinates for T.
For each space:
  Place T on the circle.
  Find the switches SL and SR on the left and right side of T.
  Disconnect SL and SR.
  Connect T and SL; connect T and SR.

Routing Protocol in S2: -Routable Address. S2 uses greedy routing and greedy forwarding. In S2, a switch decides which port to forward a packet to by estimating the distance between the destination and the possible next-hop switches. The key to greedy routing is how to represent the destination and how to estimate the distance to it. The routable address of a packet to host h is defined as a pair (x_h, id_h), where x_h refers to the switch connected to h, and id_h … The routing protocol of S2 is defined as follows. Step 1: greedily route the packet to …. Step 2: the …
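For a single space, the insertion step amounts to finding the ring neighbors of the new coordinate. The sketch below assumes the ring is kept as a sorted list of (coordinate, switch) pairs and is non-empty; `insert_switch` is a hypothetical helper name.

```python
import bisect


def insert_switch(ring, name, x):
    """Insert switch `name` at coordinate x; return its neighbors (SL, SR).

    `ring` is a non-empty list of (coordinate, switch) pairs sorted by coordinate.
    """
    i = bisect.bisect_left(ring, (x, name))
    left = ring[i - 1][1]            # i == 0 wraps around to the last switch
    right = ring[i % len(ring)][1]   # i == len(ring) wraps around to the first
    ring.insert(i, (x, name))
    return left, right


space1 = [(0.05, "A"), (0.13, "B"), (0.23, "C"), (0.36, "D")]
sl, sr = insert_switch(space1, "T", 0.30)
print(sl, sr)  # the link SL-SR is replaced by the links T-SL and T-SR
```

In a multi-space network the SL-SR cable should stay in place if SL and SR are also adjacent in another space; that check is left out of this sketch.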

Routing Protocol in S2: -Definition of Distance
Switch | Coor. 1 | Coor. 2
A | 0.05 | 0.17
C | 0.23 | 0.91
Circular distance (CD) in space 1: CD(0.05, 0.23) = |0.23 - 0.05| = 0.18. In space 2: CD(0.17, 0.91) = 0.17 + (1 - 0.91) = 0.26. Minimum circular distance: MCD2(A,C) = min(0.18, 0.26) = 0.18.

Routing Protocol in S2: -Forwarding Decision using MCD
Switch | MCD to the destination
H | 0.35
A | 0.18
D | 0.13
G | 0.19
I | 0.06
The switch with the minimum MCD to the destination gets the packet (here, I). Minimum of minimum CD: greediest.

Routing Protocol in S2: -Multipath. Next-hop candidates: all neighbor switches with a smaller MCD to the destination than the current switch. Doing this selection only at the first switch of the path already provides enough path diversity.
Switch | MCD to the destination
Current | 0.3
Neighbor 1 | 0.5
Neighbor 2 | 0.1
Neighbor 3 | 0.2
Neighbor 4 | 0.4
The packet reaches the destination as long as the MCD decreases at every hop.

Routing Protocol in S2: -Balanced Random Coordinates. Links with small end-to-end MCD values carry more traffic. Uniformly distributed coordinates improve load balancing. A purely random generator may produce crowded coordinates, which may lead to heavily loaded links and hurt network performance. The Balanced Random Coordinate Generator avoids crowded coordinates and provides better link fairness and better network performance.
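The distance definition can be written out directly. This sketch assumes coordinates lie in [0, 1) on a ring, so the circular distance is the shorter way around; `cd` and `mcd` are illustrative names.

```python
def cd(x, y):
    """Circular distance between two coordinates on a ring of circumference 1."""
    d = abs(x - y)
    return min(d, 1 - d)


def mcd(a, b):
    """Minimum circular distance over all spaces; a and b are coordinate tuples."""
    return min(cd(x, y) for x, y in zip(a, b))


A = (0.05, 0.17)
C = (0.23, 0.91)
print(round(cd(A[0], C[0]), 2), round(cd(A[1], C[1]), 2), round(mcd(A, C), 2))
```

For switches A and C from the slide, the wrap-around in space 2 gives 0.17 + (1 - 0.91) = 0.26, so the minimum over both spaces comes from space 1.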
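The forwarding decision and the multipath candidate rule from these two slides can be sketched as two small functions. The MCD values are taken from the slide tables; the function names are illustrative, and a real switch would compute the MCDs from coordinates rather than receive them in a dict.

```python
def greediest(neighbor_mcd):
    """Neighbor with the minimum MCD to the destination."""
    return min(neighbor_mcd, key=neighbor_mcd.get)


def next_hop_candidates(current_mcd, neighbor_mcd):
    """All neighbors strictly closer to the destination than the current switch."""
    return sorted(n for n, m in neighbor_mcd.items() if m < current_mcd)


forwarding = {"H": 0.35, "A": 0.18, "D": 0.13, "G": 0.19, "I": 0.06}
print(greediest(forwarding))

multipath = {"Neighbor 1": 0.5, "Neighbor 2": 0.1, "Neighbor 3": 0.2, "Neighbor 4": 0.4}
print(next_hop_candidates(0.3, multipath))
```

Because every candidate strictly reduces the MCD, any choice among them still makes progress toward the destination, which is what lets the first switch spread flows over several paths.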


Evaluation Topology property Routing efficiency Practical throughput

Evaluation -Topology Property. S2 and Jellyfish: flexible. FatTree: fixed. [Figure: bisection bandwidth vs. # of switches.] S2 and Jellyfish topologies share similar theoretical throughput, better than FatTree.

Evaluation -Routing Table Length 10 inter-switch ports

Evaluation -Routing Path Length (12 inter-switch ports). SWDC: long routing paths, lower throughput. S2: near-optimal routing paths. Jellyfish: optimal paths, but expensive.

Evaluation -Practical Throughput Greedy routing of S2 exploits the path diversity. S2 achieves near-Jellyfish throughput. S2 & Jellyfish both outperform SWDC 250-switch 500-host network

Comparing S2 with Jellyfish
Aspect | S2 | Jellyfish
Construction | Coordinates, ring topology | Generate an ‘almost’ random regular graph
Routing | Greediest | K-shortest path
It is hard to fit a Jellyfish topology into a routable coordinate space.

Key-based Routing: -Definition. Key-value stores https://www.facebook.com/photo.php?fbid= 677700648959984 Key-based routing: route to the destination using the key of the content (the IP address does not need to be known). IP-based routing: route using the IP address of the destination.

Key-based Routing: -Delivery Guarantee. For any destination coordinate X, greediest routing will route the packet to a switch S that is closest to X in at least one space. Solution: keep one replica in each of the first r spaces and route using MCDr, r <= L. For data a with key Ka, use a global hash function H to calculate the destination coordinate X = H(Ka). In each of the r spaces, the access switch of the server for a is selected using the global hash function H(Ka).
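The replica placement can be sketched as follows. The slides do not say which hash function H is, so SHA-256 truncated to 48 bits is purely an assumption here, and `key_to_coordinate` and `replica_switches` are hypothetical names. The switch coordinates are the rows of the earlier Virtual Spaces table that list both values.

```python
import hashlib


def key_to_coordinate(key):
    """Assumed global hash H: map a key to a coordinate in [0, 1)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:6], "big") / 2 ** 48


def cd(x, y):
    """Circular distance on a ring of circumference 1."""
    d = abs(x - y)
    return min(d, 1 - d)


def replica_switches(key, coords, r):
    """For each of the first r spaces, the switch closest to X = H(key) in that space."""
    x = key_to_coordinate(key)
    return [min(coords, key=lambda s: cd(coords[s][i], x)) for i in range(r)]


coords = {
    "A": (0.05, 0.17), "B": (0.13, 0.62), "C": (0.23, 0.91), "D": (0.36, 0.42),
    "F": (0.51, 0.58), "G": (0.63, 0.73), "H": (0.78, 0.26),
}
print(key_to_coordinate("photo:42"), replica_switches("photo:42", coords, 2))
```

Keeping one replica per space matches the delivery guarantee above: greediest routing ends at a switch closest to X in at least one space, and each such switch holds a copy.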

S2 Topology Construction: -Overview. H servers and N top-of-rack switches. Uniformly assign servers to switches. Generate virtual coordinates (x1, x2, …) for the switches. Connect the switches according to the coordinates, using the remaining ports for inter-switch connections.

Summary. High bandwidth: S2 demonstrates high bandwidth and high network throughput. Flexibility: S2 supports incremental construction. Scalability: greedy routing in S2 requires only constant-size routing state.

Thank you! Q & A