© Sudhakar Yalamanchili, Georgia Institute of Technology (except as indicated) Topologies.

1 © Sudhakar Yalamanchili, Georgia Institute of Technology (except as indicated) Topologies

2 ECE 8813a (2) Overview Direct Networks Indirect Networks Cost Model Comparison of Direct and Indirect Networks

3 ECE 8813a (3) Classification: shared medium networks (example: backplane buses); direct networks (example: k-ary n-cubes, meshes, and trees); indirect networks (example: multistage interconnection networks); hybrid networks (example: hypergraph topologies)

4 ECE 8813a (4) Direct Networks: buses do not scale, electrically or in bandwidth; full connectivity is too expensive (not the same as crossbars); the network is built on point-to-point transfers; topologies: strongly and weakly orthogonal. [Figure: a node with processor, memory, and router, with injection and ejection channels]

5 ECE 8813a (5) System View [Figure: a node with processor, memory, and router, with injection and ejection channels, next to a system diagram labeled Processor, NB, SB, PCIe, and NI, with the annotations "High latency region" and "Performance critical". From http://www.psc.edu/publications/tech_reports/PDIO/CrayXT3-ScalableArchitecture.jpg]

6 ECE 8813a (6) Common Properties: diameter; node degree; bisection BW; channel length; regularity and symmetry; latency; I/O BW (pin-out); routing and path diversity. [Figure labels: Throughput, Latency]

7 ECE 8813a (7) Evaluation Metrics: bisection bandwidth is the minimum bandwidth across any bisection of the network; bisection bandwidth is a limiting attribute of performance. [Figure: a network bisection]
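
To make the definition concrete, the following sketch brute-forces the bisection width of a very small network by trying every split of the nodes into two equal halves and counting the links that cross. It assumes every link has the same bandwidth, so the result is a link count; the function names and the edge-list representation are illustrative choices, not part of the slides.

```python
from itertools import combinations

def bisection_width(num_nodes, edges):
    """Minimum number of links cut by any split into two equal halves.

    Exhaustive search, so only practical for very small networks.
    """
    best = None
    for half in combinations(range(num_nodes), num_nodes // 2):
        side = set(half)
        # A link crosses the cut when exactly one endpoint lies in `side`.
        cut = sum((u in side) != (v in side) for u, v in edges)
        if best is None or cut < best:
            best = cut
    return best

def ring_edges(n):
    """Links of an n-node ring."""
    return [(i, (i + 1) % n) for i in range(n)]

def hypercube_edges(dims):
    """Links of a binary hypercube with 2**dims nodes (each link listed once)."""
    n = 1 << dims
    return [(i, i ^ (1 << d)) for i in range(n) for d in range(dims)
            if i < (i ^ (1 << d))]

print(bisection_width(8, ring_edges(8)))
print(bisection_width(8, hypercube_edges(3)))
```

For 8 nodes the search covers only 70 candidate splits; the cost grows combinatorially, which is why closed-form expressions are used for real topologies.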

8 ECE 8813a (8) Engineering Considerations Distinguish between layout (physical) and topology (logical) Average channel wire length

9 ECE 8813a (9) Extensions to Higher Dimensions: an interleaved layout significantly reduces the wire/cable length; it improves packaging modularity (note the end-around connections); it impacts performance and cost. Adapted from “Scalable Switching Fabrics for Internet Routers,” by W. J. Dally (can be found at www.avici.com)

10 ECE 8813a (10) Common Topologies Binary hypercube Torus Multidimensional mesh

11 ECE 8813a (11) Common Topologies: definition; basic connectivity properties (diameter; I/O, also referred to as node size or pin-out; bisection bandwidth); routing
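
As an illustration of how the diameter and node degree follow from a topology's definition, the sketch below builds a k-ary n-cube (torus) implicitly and measures its diameter with a breadth-first search from every node. Representing nodes as digit tuples and the function names are assumptions made for this example.

```python
from collections import deque
from itertools import product

def torus_neighbors(node, k, n):
    """Neighbors of `node` (a tuple of n digits in 0..k-1) in a k-ary n-cube."""
    result = set()
    for dim in range(n):
        for step in (-1, 1):
            digits = list(node)
            digits[dim] = (digits[dim] + step) % k  # end-around connection
            result.add(tuple(digits))
    result.discard(node)  # guards against self-loops when k == 1
    return result

def diameter(k, n):
    """Largest shortest-path hop count between any pair of nodes."""
    worst = 0
    for source in product(range(k), repeat=n):
        dist = {source: 0}
        queue = deque([source])
        while queue:
            current = queue.popleft()
            for nxt in torus_neighbors(current, k, n):
                if nxt not in dist:
                    dist[nxt] = dist[current] + 1
                    queue.append(nxt)
        worst = max(worst, max(dist.values()))
    return worst

print(diameter(4, 2), len(torus_neighbors((0, 0), 4, 2)))
```

With k = 2 the two directions in each dimension reach the same neighbor, so the same code also describes a binary hypercube.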

12 ECE 8813a (12) Metrics
Network | Bisection Width | Node Size
k-ary n-cube | 2Wk^(n-1) | 2Wn
Binary n-cube | NW/2 | nW
n-dimensional mesh | Wk^(n-1) | 2Wn
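
The expressions in the Metrics table can be evaluated directly. The sketch below assumes W is the channel width (the slide does not define it), N = 2^n for the binary n-cube, and uses hypothetical function names.

```python
def k_ary_n_cube(k, n, w):
    """Bisection width 2*W*k^(n-1) and node size 2*W*n."""
    return {"bisection_width": 2 * w * k ** (n - 1), "node_size": 2 * w * n}

def binary_n_cube(n, w):
    """Bisection width N*W/2 with N = 2^n nodes, and node size n*W."""
    nodes = 2 ** n
    return {"bisection_width": nodes * w // 2, "node_size": n * w}

def n_dimensional_mesh(k, n, w):
    """Bisection width W*k^(n-1) and node size 2*W*n."""
    return {"bisection_width": w * k ** (n - 1), "node_size": 2 * w * n}

# 64 nodes in three organizations, all with unit-width channels (W = 1).
print(k_ary_n_cube(8, 2, 1))
print(binary_n_cube(6, 1))
print(n_dimensional_mesh(8, 2, 1))
```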

13 ECE 8813a (13) Less Common Topologies Routing Basic properties

14 ECE 8813a (14) Less Common Topologies (cont.) Routing Basic properties A note on irregular topologies

15 ECE 8813a (15) Generalized Hypercubes: generalization of tori to multiple dimensions and multiple radices (a unique radix in each dimension); preserves the structure of addressing and routing techniques
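
One way to read "preserves the structure of addressing" is that a node identifier is still a string of digits, one per dimension, except that each dimension has its own radix. A minimal sketch of that mixed-radix addressing follows; the digit order (least significant dimension first) and the function names are assumptions for illustration.

```python
def to_digits(node_id, radices):
    """Mixed-radix address of node_id; radices[0] is the least significant dimension."""
    digits = []
    for k in radices:
        digits.append(node_id % k)
        node_id //= k
    return digits

def from_digits(digits, radices):
    """Inverse of to_digits."""
    node_id = 0
    for digit, k in zip(reversed(digits), reversed(radices)):
        node_id = node_id * k + digit
    return node_id

radices = [4, 3, 2]               # 4 x 3 x 2 = 24 nodes
print(to_digits(23, radices))     # one digit per dimension
print(from_digits([3, 2, 1], radices))
```

Routing that corrects one digit at a time works on these addresses exactly as it does when every dimension shares the same radix.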

16 ECE 8813a (16) Indirect Networks Switches may or may not host end-points

17 ECE 8813a (17) Multistage Interconnection Networks [Figure: 8-port MIN with inputs and outputs numbered 7 to 0, showing the switch states] Interconnect specified as a permutation; number of stages = log_2 N; can be generalized to K x K switches; networks defined by inter-stage permutations © T.M. Pinkston, J. Duato, with major contributions by J. Filch

18 ECE 8813a (18) The Shuffle Interconnection shuffle(i)

19 ECE 8813a (19) The Baseline Interconnection baseline(i)

20 ECE 8813a (20) The Butterfly Interconnection butterfly(i)

21 ECE 8813a (21) The Cube Interconnection cube(i) switch
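
The four inter-stage permutations named on the preceding slides can be written as bit manipulations on an n-bit port address. The sketch below assumes the definitions commonly used in the MIN literature (perfect shuffle as a left rotation, butterfly as a swap of bit b with bit 0, cube as complementing bit b, baseline as a right rotation of the b+1 low-order bits); the slides themselves give only the names, so treat these as assumed definitions.

```python
def shuffle(i, n):
    """Perfect shuffle: rotate the n-bit address left by one position."""
    mask = (1 << n) - 1
    return ((i << 1) | (i >> (n - 1))) & mask

def butterfly(i, b):
    """Butterfly permutation: swap bit b with bit 0."""
    if ((i >> b) & 1) != (i & 1):
        i ^= (1 << b) | 1  # the two bits differ, so flipping both swaps them
    return i

def cube(i, b):
    """Cube permutation: complement bit b."""
    return i ^ (1 << b)

def baseline(i, b):
    """Baseline permutation: rotate the b+1 low-order bits right by one."""
    width = b + 1
    low_mask = (1 << width) - 1
    low = i & low_mask
    rotated = (low >> 1) | ((low & 1) << (width - 1))
    return (i & ~low_mask) | rotated

n = 4
for name, perm in [("shuffle", lambda i: shuffle(i, n)),
                   ("butterfly_3", lambda i: butterfly(i, 3)),
                   ("cube_1", lambda i: cube(i, 1)),
                   ("baseline_2", lambda i: baseline(i, 2))]:
    print(name, [perm(i) for i in range(1 << n)])
```

Each function is a bijection on the 2^n addresses, which is what allows it to describe a wiring pattern between two stages.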

22 ECE 8813a (22) Omega Network [Figure: 16-port Omega network, ports labeled 0000 to 1111 on both sides, stages connected by the shuffle permutation] © T.M. Pinkston, J. Duato, with major contributions by J. Filch

23 ECE 8813a (23) Baseline Network [Figure: 16-port baseline network, ports labeled 0000 to 1111 on both sides, stages connected by sub-shuffle i permutations] © T.M. Pinkston, J. Duato, with major contributions by J. Filch

24 ECE 8813a (24) Butterfly [Figure: 16-port butterfly network, ports labeled 0000 to 1111 on both sides, stages connected by butterfly i permutations] © T.M. Pinkston, J. Duato, with major contributions by J. Filch

25 ECE 8813a (25) Cube Network [Figure: 16-port cube network, ports labeled 0000 to 1111 on both sides, stages connected by cube i permutations] © T.M. Pinkston, J. Duato, with major contributions by J. Filch

26 ECE 8813a (26) Routing in MINs: routing can be modeled as a sequence of address transformations (each stage transforms a bit of the source address into a bit of the destination address). Routing implementation: a single bit of the destination address determines the output port
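
The address-transformation view can be traced in a few lines. This sketch assumes an Omega network (a perfect shuffle in front of every stage of 2 x 2 switches) and destination-tag routing that consumes destination bits from most significant to least significant; the function name is illustrative.

```python
def omega_route(src, dst, n):
    """Link positions a message occupies after each stage of a 2**n-port
    Omega network under destination-tag routing."""
    mask = (1 << n) - 1
    position = src
    path = []
    for stage in range(n):
        # Perfect-shuffle wiring in front of the stage.
        position = ((position << 1) | (position >> (n - 1))) & mask
        # The switch selects its output port from a single destination bit.
        dst_bit = (dst >> (n - 1 - stage)) & 1
        position = (position & ~1) | dst_bit
        path.append(position)
    return path

print(omega_route(0b101, 0b010, 3))  # [2, 5, 2]
```

After each stage one more source bit has been replaced by a destination bit, so the final position always equals the destination address.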

27 ECE 8813a (27) Basic Properties Diameter, path length and pin-out Bisection bandwidth

28 ECE 8813a (28) Blocking vs. Non-blocking Networks [Figure: an 8-port blocking topology, with a conflict marked X, next to an 8-port non-blocking topology; ports numbered 7 to 0] Consider the permutation behavior: model the input-output requests as permutations of the source addresses © T.M. Pinkston, J. Duato, with major contributions by J. Filch

29 ECE 8813a (29) Blocking Behavior: strictly non-blocking (a new connection can always be set up; every permutation can be realized); weakly non-blocking (strictly non-blocking only under some routing protocols); blocking (some permutations cannot be realized); rearrangeable (every permutation can be realized by rearranging existing connections)
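
To see what "some permutations cannot be realized" means in practice, the sketch below routes every source of a permutation through an Omega network at once and reports whether two messages ever need the same link. The Omega wiring, the MSB-first destination-tag routing, and the function name are assumptions chosen for the example; other blocking MINs would need their own wiring function.

```python
def is_blocked_in_omega(perm):
    """True if perm (perm[src] = dst) makes two messages share a link in a
    single pass through an Omega network; len(perm) must be a power of two."""
    size = len(perm)
    n = size.bit_length() - 1      # number of stages
    mask = size - 1
    positions = list(range(size))
    for stage in range(n):
        used = set()
        for src, dst in enumerate(perm):
            p = positions[src]
            p = ((p << 1) | (p >> (n - 1))) & mask          # shuffle wiring
            p = (p & ~1) | ((dst >> (n - 1 - stage)) & 1)   # destination bit
            if p in used:
                return True        # two messages want the same output link
            used.add(p)
            positions[src] = p
    return False

identity = list(range(8))
bit_reversal = [0, 4, 2, 6, 1, 5, 3, 7]
print(is_blocked_in_omega(identity), is_blocked_in_omega(bit_reversal))
```

In the bit-reversal case, sources 0 and 4 both need link 000 after the first stage, so the network cannot carry that permutation in one pass.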

30 ECE 8813a (30) Crossbar Network [Figure: 16-port crossbar, inputs and outputs numbered 0 to 15] © T.M. Pinkston, J. Duato, with major contributions by J. Filch

31 ECE 8813a (31) Non-Blocking Clos Network [Figure: 16-port non-blocking Clos network, inputs and outputs numbered 0 to 15] © T.M. Pinkston, J. Duato, with major contributions by J. Filch

32 ECE 8813a (32) Clos Network Properties: general 3-stage non-blocking network (originally conceived for telephone networks); recursive decomposition (produces the Benes network with 2x2 switches)

33 ECE 8813a (33) Clos Network: Recursive Decomposition. 16-port, 5-stage Clos network [Figure: inputs and outputs numbered 0 to 15] © T.M. Pinkston, J. Duato, with major contributions by J. Filch

34 ECE 8813a (34) Clos Network: Recursive Decomposition. 16-port, 7-stage Clos network = Benes topology [Figure: inputs and outputs numbered 0 to 15] © T.M. Pinkston, J. Duato, with major contributions by J. Filch
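
The progression from 3 to 5 to 7 stages for 16 ports can be reproduced by counting how often the middle switches can be replaced by a smaller 3-stage network. The sketch assumes 2 x 2 switches in the outer stages, so each decomposition halves the port count of the middle switches; the function name is hypothetical.

```python
def clos_decomposition_stages(ports):
    """Stage counts seen while recursively decomposing the middle switches of a
    3-stage Clos network with 2x2 outer switches (ports must be a power of two)."""
    stages = 3
    middle_ports = ports // 2   # each middle switch is (ports/2) x (ports/2)
    history = [stages]
    while middle_ports > 2:
        stages += 2             # a middle switch becomes 3 stages: net gain of 2
        middle_ports //= 2
        history.append(stages)
    return history

print(clos_decomposition_stages(16))  # [3, 5, 7]
```

The final value matches the usual Benes stage count of 2 log_2 N - 1.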

35 ECE 8813a (35) Path Diversity: alternative paths from 0 to 1. 16-port, 7-stage Clos network = Benes topology [Figure: inputs and outputs numbered 0 to 15] © T.M. Pinkston, J. Duato, with major contributions by J. Filch

36 ECE 8813a (36) Path Diversity: alternative paths from 4 to 0. 16-port, 7-stage Clos network = Benes topology [Figure: inputs and outputs numbered 0 to 15] © T.M. Pinkston, J. Duato, with major contributions by J. Filch

37 ECE 8813a (37) Path Diversity: contention-free paths 0 to 1 and 4 to 1. 16-port, 7-stage Clos network = Benes topology [Figure: inputs and outputs numbered 0 to 15] © T.M. Pinkston, J. Duato, with major contributions by J. Filch

38 ECE 8813a (38) Bidirectional MINs

39 ECE 8813a (39) Routing in Bidirectional MINs: networks are multi-path; routing takes place in two steps: route to an intermediate node, then route to the destination (multiple intermediate nodes can be selected; the path from the intermediate node to the destination is unique)
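
The two-step routing can be quantified under one common model: a butterfly-style bidirectional MIN built from k x k switches, where source and destination addresses are written in base k and the message must climb to the stage given by their most significant differing digit. Under that assumption (not stated on the slide), k^t turnaround switches are available when the digits first differ at position t. Function names are illustrative.

```python
def first_difference(src, dst, k, n):
    """Position of the most significant base-k digit in which src and dst
    differ, or -1 when src == dst. Addresses have n digits."""
    for position in range(n - 1, -1, -1):
        if (src // k ** position) % k != (dst // k ** position) % k:
            return position
    return -1

def turnaround_choices(src, dst, k, n):
    """Number of candidate intermediate (turnaround) switches, assuming a
    butterfly-style bidirectional MIN."""
    t = first_difference(src, dst, k, n)
    if t < 0:
        raise ValueError("source and destination are the same node")
    return k ** t

print(first_difference(0, 4, 2, 3), turnaround_choices(0, 4, 2, 3))
print(first_difference(0, 1, 2, 3), turnaround_choices(0, 1, 2, 3))
```

Once a turnaround switch is chosen, the downward path is fixed by the destination digits, which is the uniqueness noted on the slide.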

40 ECE 8813a (40) Moving to Fat Trees: nodes at tree leaves; switches at tree vertices; total link bandwidth is constant across all tree levels, with full bisection bandwidth; equivalent to folded Benes topology; preferred topology in many system area networks. Folded Clos = Folded Benes = Fat tree network. [Figure label: Network Bisection] © T.M. Pinkston, J. Duato, with major contributions by J. Filch

41 ECE 8813a (41) Fat Trees: Another View. Equivalent to the preceding multistage implementation; common topology in many supercomputer installations. [Figure labels: Forward, Backward]

42 ECE 8813a (42) Generalized MINs Generalized switch radix Routing and mathematics uniform across switch radix values

43 ECE 8813a (43) Hybrid Networks 2D Hypermesh Cluster based 2D Mesh

44 ECE 8813a (44) A Cost Model: crossbar costs (switch cost: N^2; link cost: 2N); multistage interconnection networks (MINs) interconnect N input/output ports using k x k switches (log_k N switch stages, each with N/k switches; (N/k) log_k N switches in total). Example: compute the relative switch and link costs of interconnecting 4096 nodes © T.M. Pinkston, J. Duato, with major contributions by J. Filch

45 ECE 8813a (45) Example
Example: compute the relative switch and link costs, N = 4096
cost(crossbar) switches = 4096^2
cost(crossbar) links = 8192
relative_cost(2 × 2) switches = 4096^2 / (2^2 × 4096/2 × log_2 4096) = 170
relative_cost(4 × 4) switches = 4096^2 / (4^2 × 4096/4 × log_4 4096) = 170
relative_cost(16 × 16) switches = 4096^2 / (16^2 × 4096/16 × log_16 4096) = 85
relative_cost(2 × 2) links = 8192 / (4096 × (log_2 4096 + 1)) = 2/13 = 0.1538
relative_cost(4 × 4) links = 8192 / (4096 × (log_4 4096 + 1)) = 2/7 = 0.2857
relative_cost(16 × 16) links = 8192 / (4096 × (log_16 4096 + 1)) = 2/4 = 0.5
© T.M. Pinkston, J. Duato, with major contributions by J. Filch
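
The arithmetic in this example can be checked with a short script. It follows the cost model from the preceding slide (a k x k switch costs k^2, a MIN has log_k N stages of N/k switches and N links on each side of every stage); the function name and return format are illustrative.

```python
def crossbar_relative_to_min(n_ports, k):
    """Return (relative switch cost, relative link cost) of a crossbar
    compared with a MIN built from k x k switches."""
    # Count stages with integer arithmetic to avoid floating-point logarithms.
    stages, size = 0, 1
    while size < n_ports:
        size *= k
        stages += 1
    if size != n_ports:
        raise ValueError("n_ports must be a power of k")

    crossbar_switch_cost = n_ports ** 2
    crossbar_link_cost = 2 * n_ports
    min_switch_cost = k ** 2 * (n_ports // k) * stages
    min_link_cost = n_ports * (stages + 1)
    return (crossbar_switch_cost / min_switch_cost,
            crossbar_link_cost / min_link_cost)

for k in (2, 4, 16):
    switch_ratio, link_ratio = crossbar_relative_to_min(4096, k)
    print(k, int(switch_ratio), round(link_ratio, 4))
```

The slide's switch ratios are the truncated values of 170.67 and 85.33.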

46 ECE 8813a (46) Example (cont.) [Figure: relative switch cost and relative link cost for various values of k and N (crossbar relative to a MIN)] © T.M. Pinkston, J. Duato, with major contributions by J. Filch

47 ECE 8813a (47) Comparison of Direct and Indirect Networks N = 16, k = 4 fat tree-like MIN Indirect networks have end nodes connected at network periphery © T.M. Pinkston, J. Duato, with major contributions by J. Filch

48 ECE 8813a (48) Comparison of Direct and Indirect Networks N = 8, k = 4 2D torus Direct networks have end nodes connected in the network area/volume © T.M. Pinkston, J. Duato, with major contributions by J. Filch

49 ECE 8813a (49) Comparison of Direct and Indirect Networks N = 8, k = 4 2D torus Direct networks have end nodes connected in the network area/volume © T.M. Pinkston, J. Duato, with major contributions by J. Filch

50 ECE 8813a (50) Comparison of Direct and Indirect Networks N = 16, k = 4 2D torus Direct networks have end nodes connected in the network area/volume © T.M. Pinkston, J. Duato, with major contributions by J. Filch

51 ECE 8813a (51) Comparison of Direct and Indirect Networks [Figure: 64-node system with 8-port switches, b = 4; 32-node system with 8-port switches] Bristling can be used to reduce direct network switch and link costs: “b” end nodes connect to each switch, where b is the bristling factor; allows larger systems to be built from fewer switches and links; requires larger switch degree; for N = 32 and k = 8, fewer switches and links than a fat tree © T.M. Pinkston, J. Duato, with major contributions by J. Filch

52 ECE 8813a (52) Comparison of Direct and Indirect Networks [Figure labels: Switches, End Nodes] Distance scaling problems may be exacerbated in on-chip MINs © T.M. Pinkston, J. Duato, with major contributions by J. Filch

53 ECE 8813a (53) Comparison of Direct and Indirect Networks
Blocking is reduced by maximizing dimensions (switch degree). This can increase bisection bandwidth, but: additional dimensions may increase wire length (must observe 3D packaging constraints); flow control issues arise (buffer size increases with link length); pin-out constraints limit the number of dimensions achievable.
Performance and cost of several network topologies for 64 nodes. Values are given in terms of bidirectional links & ports. Hop count includes a switch and its output link (in the table, end node links are not counted for the bus topology).
Evaluation category | Bus | Ring | 2D mesh | 2D torus | Hypercube | Fat tree | Fully connected
Perf.: BW bisection in # links | 1 | 2 | 8 | 16 | 32 | 32 | 1024
Perf.: Max (ave.) hop count | 1 (1) | 32 (16) | 14 (7) | 8 (4) | 6 (3) | 11 (9) | 1 (1)
Cost: I/O ports per switch | NA | 3 | 5 | 5 | 7 | 4 | 64
Cost: Number of switches | NA | 64 | 64 | 64 | 64 | 192 | 64
Cost: Number of net. links | 1 | 64 | 112 | 128 | 192 | 320 | 2016
Cost: Total number of links | 1 | 128 | 176 | 192 | 256 | 384 | 2080
© T.M. Pinkston, J. Duato, with major contributions by J. Filch
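
Several rows of the 64-node comparison follow from simple closed forms. The sketch below recomputes the bisection, maximum hop count, and link counts for the ring, 2D mesh, 2D torus, hypercube, and fully connected networks, assuming square 2D layouts, one switch per end node, and one link from each end node to its switch. The bus and fat tree columns are left out because they depend on design details not given here; the names are illustrative.

```python
import math

def topology_metrics(nodes=64):
    """Closed-form metrics for some direct topologies (bidirectional links)."""
    side = math.isqrt(nodes)        # 2D networks are assumed to be side x side
    dims = nodes.bit_length() - 1   # hypercube dimension, nodes == 2**dims
    rows = {
        "ring": {"bisection": 2, "max_hops": nodes // 2, "net_links": nodes},
        "2d_mesh": {"bisection": side, "max_hops": 2 * (side - 1),
                    "net_links": 2 * side * (side - 1)},
        "2d_torus": {"bisection": 2 * side, "max_hops": 2 * (side // 2),
                     "net_links": 2 * nodes},
        "hypercube": {"bisection": nodes // 2, "max_hops": dims,
                      "net_links": nodes * dims // 2},
        "fully_connected": {"bisection": (nodes // 2) ** 2, "max_hops": 1,
                            "net_links": nodes * (nodes - 1) // 2},
    }
    for row in rows.values():
        # One extra link connects every end node to its switch.
        row["total_links"] = row["net_links"] + nodes
    return rows

for name, row in topology_metrics().items():
    print(name, row)
```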

54 ECE 8813a (54) Commercial Machines
Company | System [Network] Name | Max. number of nodes [x # CPUs] | Basic network topology | Injection [Recept’n] node BW in MBytes/s | # of data bits per link per direction | Raw network link BW per direction in MBytes/s | Raw network bisection BW (bidir) in GBytes/s
Intel | ASCI Red Paragon | 4,510 [x 2] | 2-D mesh 64 x 64 | 400 [400] | 16 bits | 400 | 51.2
IBM | ASCI White SP Power3 [Colony] | 512 [x 16] | BMIN w/8-port bidirect. switches (fat-tree or Omega) | 500 [500] | 8 bits (+1 bit of control) | 500 | 256
Intel | Thunder Itanium2 Tiger4 [QsNet II] | 1,024 [x 4] | fat tree w/8-port bidirectional switches | 928 [928] | 8 bits (+2 control for 4b/5b enc) | 1,333 | 1,365
Cray | XT3 [SeaStar] | 30,508 [x 1] | 3-D torus 40 x 32 x 24 | 3,200 [3,200] | 12 bits | 3,800 | 5,836.8
Cray | X1E | 1,024 [x 1] | 4-way bristled 2-D torus (~ 23 x 11) with express links | 1,600 [1,600] | 16 bits | 1,600 | 51.2
IBM | ASC Purple pSeries 575 [Federation] | >1,280 [x 8] | BMIN w/8-port bidirect. switches (fat-tree or Omega) | 2,000 [2,000] | 8 bits (+2 bits of control) | 2,000 | 2,560
IBM | Blue Gene/L eServer Sol. [Torus Net] | 65,536 [x 2] | 3-D torus 32 x 32 x 64 | 612.5 [1,050] | 1 bit (bit serial) | 175 | 358.4
© T.M. Pinkston, J. Duato, with major contributions by J. Filch

55 ECE 8813a (55) A Unified View of Direct and Indirect Networks: switch designs in both cases are coalescing (a generic network may have 0, 1, or more compute nodes per switch); switches implement programmable routing functions; differences are primarily an issue of topology (imagine the use of source routed messages); deadlock avoidance

56 ECE 8813a (56) Summary and Research Directions: use of hybrid interconnection networks (best way to utilize existing pin-out?); engineering considerations rapidly prune the space of candidate topologies; routing + switching + topology = network. On to routing...

