Yiting Xia, T. S. Eugene Ng Rice University


1 Flat-tree: A Convertible Data Center Network Architecture from Clos to Random Graph
Yiting Xia, T. S. Eugene Ng, Rice University

2 Clos Topology
3-stage folded Clos
- standard data center network architecture
Figure labels: Core Switches, Aggregation Switches, Edge Switches, Pods

3 Clos Topology
Implementation friendly
- central wiring
- flexible scale and oversubscription
- Pod modular design
Suboptimal performance
- long paths
- congested network core

4 Random Graph
Good performance
- low average path length
- rich bandwidth
- optimal throughput for uniform traffic
Hard to implement
- complicated neighbor-to-neighbor wiring
[Jellyfish NSDI’12]

5 Can we combine the best of both worlds?
Tree Network vs. Flat Network
- tree network: easy implementation
- flat network: good performance

6 Why fixed topology?
Fat-tree SIGCOMM’08, BCube SIGCOMM’09, DCell SIGCOMM’08, HyperX SC’09
Figure labels: Easy implementation, Good performance
Fluid data center traffic
- each topology has sweet spots
- one-size-fits-all topology impossible
Cloud service constantly changing
- fixed topology not adaptive to new demands

7 Convertible Network
Flat-tree: convertible between Tree Network and Flat Network

8 Design Highlights
Flat-tree starts from a Clos network and converts the topology to approximate random graphs.
Challenges:
- Relocate servers from edge switches to aggregation and core switches
- Connect edge and core switches directly
- Easy peer-wise wiring between switches
- Random graphs of different scales
- Combinations of different topologies
- Packaging in Pods

9 Converter Switch
Small port-count
Low cost
- as packet switch
  * simple switching logic
  * no bandwidth contention
  * no expensive processor/buffering
- as circuit switch
  * not sensitive to delay
  * small scale
Physical layer device
Figure labels: ports A, B, C, D

10 Converter Switch Configurations

11 Flat-tree Example: Clos Pod
Figure legend: Core Switch, Aggregation Switch, Edge Switch, Server

12 Flat-tree Example: Flat-tree Pod
Figure legend: Core Switch, Aggregation Switch, Edge Switch, Converter Switch, Server

13 Clos Network
Figure legend: Core Switch, Aggregation Switch, Edge Switch, Converter Switch, Server

14 Approximate Random Graph
Figure legend: Core Switch, Aggregation Switch, Edge Switch, Converter Switch, Server

15 Approximate Local Random Graph
Figure legend: Core Switch, Aggregation Switch, Edge Switch, Converter Switch, Server

16 Flat-tree Pod
Figure label: Blade B

17 Flat-tree Pod

18 Pod-Core Wiring

19 Server Distribution
Choice of m and n
- how many servers per switch of different types
- flat-tree maintains structure → not purely random
  * Clos connections between edge and aggregation switches
  * Pod-core connections
  * peer-wise connections between adjacent Pods
- place servers to leverage shorter paths
Network profiling
- vary m and n
- minimize average path length

20 Inter-Pod Wiring
Simple shifting wiring pattern
- <i, j> in Pod p → <i, (d/2-1-j+i)%(d/2)> in Pod p+1
No repeated connection
Same number of “side” and “cross” connections
Multi-link connectors
- streamline the connection between adjacent Pods
- hide wiring complexity
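The shifting pattern on this slide can be sketched in a few lines of Python. This is an illustration only: the slide does not define the index ranges, so the sketch assumes that d is an even switch port count and that both i and j run from 0 to d/2-1. The function names are hypothetical.

```python
def next_pod_port(i: int, j: int, d: int) -> tuple[int, int]:
    """Map connector <i, j> in Pod p to its peer in Pod p+1.

    Assumption (not stated on the slide): d is even and
    0 <= i, j < d/2.
    """
    half = d // 2
    return i, (half - 1 - j + i) % half


def wiring_for_row(i: int, d: int) -> dict[int, int]:
    """Return {j: j'} for every connector in row i."""
    half = d // 2
    return {j: next_pod_port(i, j, d)[1] for j in range(half)}


if __name__ == "__main__":
    d = 8  # hypothetical port count
    for i in range(d // 2):
        mapping = wiring_for_row(i, d)
        # For a fixed i the map j -> j' is a bijection, so no
        # target index is used twice within a row.
        assert sorted(mapping.values()) == list(range(d // 2))
        print(i, mapping)
```

Under these assumptions, each row i maps its d/2 connectors onto d/2 distinct targets, which is consistent with the "no repeated connection" property listed on the slide.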

21 Evaluation
Compared networks
- fat-tree
- random graph
- two-level random graph
- flat-tree global (approximated global random graph)
- flat-tree local (approximated pod-level random graph)
- flat-tree hybrid (part flat-tree global and part flat-tree local)
Metric
- average path length
- throughput
  * optimal routing
  * server links unbounded
  * linear programming solution
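As a rough illustration of the average path length metric, the sketch below runs a breadth-first search from every node of an unweighted graph and averages the hop counts over all ordered pairs. It is not the authors' evaluation code: the adjacency-list format and the function name are assumptions, the graph is assumed to be connected with at least two nodes, and the slides may measure paths between servers only rather than between all nodes.

```python
from collections import deque


def average_path_length(adj: dict[int, list[int]]) -> float:
    """Mean shortest-path hop count over all ordered node pairs.

    Assumes an unweighted, connected graph with >= 2 nodes,
    given as {node: [neighbors]}.
    """
    total_hops = 0
    pair_count = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:  # standard BFS from src
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total_hops += sum(dist.values())
        pair_count += len(dist) - 1  # exclude src itself
    return total_hops / pair_count


if __name__ == "__main__":
    # Toy 4-node ring, used only to exercise the function.
    ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
    print(average_path_length(ring))
```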

22 Evaluation
Traffic patterns
- hot spots: broadcast/incast traffic in 1000-server clusters
- clusters: all-to-all traffic in 20-server clusters
Locality
- (strong) locality
  * workload placed continuously across servers
- weak locality
  * workload placed randomly in Pods
- no locality
  * workload placed randomly in the entire network
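One possible reading of the three locality levels is sketched below. The slide does not specify the placement procedure, so everything here is an assumption: servers are numbered so that consecutive IDs sit in the same Pod, "weak" is interpreted as shuffling servers within each Pod, and "none" as shuffling across the whole network. The function and parameter names are hypothetical.

```python
import random


def place_clusters(num_servers, servers_per_pod, cluster_size,
                   locality, seed=0):
    """Split servers into clusters under an assumed locality model."""
    rng = random.Random(seed)
    servers = list(range(num_servers))
    if locality == "strong":
        # Consecutive server IDs form each cluster.
        order = servers
    elif locality == "weak":
        # Shuffle inside each Pod; clusters stay within a Pod
        # when cluster_size divides servers_per_pod.
        order = []
        for start in range(0, num_servers, servers_per_pod):
            pod = servers[start:start + servers_per_pod]
            rng.shuffle(pod)
            order.extend(pod)
    elif locality == "none":
        # Shuffle across the entire network.
        order = servers[:]
        rng.shuffle(order)
    else:
        raise ValueError(f"unknown locality: {locality}")
    return [order[k:k + cluster_size]
            for k in range(0, num_servers, cluster_size)]


if __name__ == "__main__":
    for mode in ("strong", "weak", "none"):
        print(mode, place_clusters(16, 8, 4, mode))
```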

23 Summary of Simulation Results
Global Average Path Length
- Flat-tree: ~4.75
- Random Graph: ~4.6
- Clos: ~5.9
Pod-Level Average Path Length
- Flat-tree: ~3.4
- Two-Level Random Graph: ~3.6
- Random Graph: ~4.6
- Clos: ~3.9
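To put the approximate averages on this slide in relative terms, the short sketch below computes the percentage difference between flat-tree and each baseline. The inputs are the rounded values read from the slide, so the results are only indicative; the helper name is hypothetical.

```python
def relative_change(value: float, baseline: float) -> float:
    """Percentage difference of value relative to baseline."""
    return (value - baseline) / baseline * 100.0


# Approximate values as read from the slide.
global_apl = {"flat-tree": 4.75, "random graph": 4.6, "Clos": 5.9}
pod_apl = {"flat-tree": 3.4, "two-level random graph": 3.6,
           "random graph": 4.6, "Clos": 3.9}

if __name__ == "__main__":
    for name, table in (("global", global_apl), ("pod-level", pod_apl)):
        ours = table["flat-tree"]
        for other, value in table.items():
            if other != "flat-tree":
                pct = relative_change(ours, value)
                print(f"{name}: flat-tree vs {other}: {pct:+.1f}%")
```

With these rounded inputs, the global flat-tree average is roughly 3% above the random graph and roughly 19% below Clos.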

24 Summary of Simulation Results
Throughput of hot-spot traffic
- flat-tree ≈ random graph
- flat-tree = 1.5x Clos
Throughput of small-clustered traffic
- flat-tree > two-level random graph for 1/3 cases
- flat-tree >= 91% two-level random graph
- flat-tree = 1.15x random graph
- flat-tree = 1.6x Clos

25 Global Average Path Length

26 Pod-Level Average Path Length

27 Throughput of Hot Spots Traffic

28 Throughput of Clustered Traffic

29 Conclusion
Flat-tree converts between Clos topology and random graphs of different scales
Low cost
- inexpensive converter switches
Easy implementation
- changes packaged in Pods
- regular Pod-core wiring patterns
- multi-links between adjacent Pods
Hybrid mode
- network zones with different topologies
Performance similar to random graphs
- < 5% longer average path length
- < 9% lower throughput

30 Impact and Inspiration
Flat-tree is one design point of convertible networks
Motivates further study of the relationship between different topologies
Traffic optimization
- joint optimization with routing and workload placement
Network management
- self recovery from failures
- automatic up/down scaling of the network at busy/idle time

