1 Lecture 23: Interconnection Networks Topics: communication latency, centralized and decentralized switches (Appendix E)

Slides:



Advertisements
Similar presentations
Comparison Of Network On Chip Topologies Ahmet Salih BÜYÜKKAYHAN Fall.
Advertisements

Shantanu Dutt Univ. of Illinois at Chicago
Flattened Butterfly Topology for On-Chip Networks John Kim, James Balfour, and William J. Dally Presented by Jun Pang.
1 Lecture 12: Interconnection Networks Topics: dimension/arity, routing, deadlock, flow control.
1 Interconnection Networks Direct Indirect Shared Memory Distributed Memory (Message passing)
CSCI 8150 Advanced Computer Architecture Hwang, Chapter 2 Program and Network Properties 2.4 System Interconnect Architectures.
Interconnection Networks 1 Interconnection Networks (Chapter 6) References: [1,Wilkenson and Allyn, Ch. 1] [2, Akl, Chapter 2] [3, Quinn, Chapter 2-3]
1 CSE 591-S04 (lect 14) Interconnection Networks (notes by Ken Ryu of Arizona State) l Measure –How quickly it can deliver how much of what’s needed to.
NUMA Mult. CSE 471 Aut 011 Interconnection Networks for Multiprocessors Buses have limitations for scalability: –Physical (number of devices that can be.
Interconnection Network PRAM Model is too simple Physically, PEs communicate through the network (either buses or switching networks) Cost depends on network.
Communication operations Efficient Parallel Algorithms COMP308.
1 Lecture 24: Interconnection Networks Topics: topologies, routing, deadlocks, flow control Final exam reminders:  Plan well – attempt every question.
1 Tuesday, September 26, 2006 Wisdom consists of knowing when to avoid perfection. -Horowitz.
1 Lecture 24: Interconnection Networks Topics: communication latency, centralized and decentralized switches (Sections 8.1 – 8.5)
Interconnection Network Topologies
Examples Broadcasting and Gossiping. Broadcast on ring (Safe and Forward)
Interconnection Network Topology Design Trade-offs
1 Lecture 24: Interconnection Networks Topics: topologies, routing, deadlocks, flow control.
1 Lecture 25: Interconnection Networks Topics: communication latency, centralized and decentralized switches, routing, deadlocks (Appendix E) Review session,
1 Static Interconnection Networks CEG 4131 Computer Architecture III Miodrag Bolic.
ECE669 L16: Interconnection Topology March 30, 2004 ECE 669 Parallel Computer Architecture Lecture 16 Interconnection Topology.
Interconnect Network Topologies
Interconnection Networks. Applications of Interconnection Nets Interconnection networks are used everywhere! ◦ Supercomputers – connecting the processors.
1 Lecture 23: Interconnection Networks Topics: Router microarchitecture, topologies Final exam next Tuesday: same rules as the first midterm Next semester:
Computer Science Department
Interconnect Networks
Network Topologies Topology – how nodes are connected – where there is a wire between 2 nodes. Routing – the path a message takes to get from one node.
PPC Spring Interconnection Networks1 CSCI-4320/6360: Parallel Programming & Computing (PPC) Interconnection Networks Prof. Chris Carothers Computer.
CSE Advanced Computer Architecture Week-11 April 1, 2004 engr.smu.edu/~rewini/8383.
Dynamic Interconnect Lecture 5. COEN Multistage Network--Omega Network Motivation: simulate crossbar network but with fewer links Components: –N.
Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Parallel Programming in C with MPI and OpenMP Michael J. Quinn.
1 Dynamic Interconnection Networks Miodrag Bolic.
Multiprocessor Interconnection Networks Todd C. Mowry CS 740 November 3, 2000 Topics Network design issues Network Topology.
Lecture 3 Innerconnection Networks for Parallel Computers
Anshul Kumar, CSE IITD CSL718 : Multiprocessors Interconnection Mechanisms Performance Models 20 th April, 2006.
Computer Science and Engineering Parallel and Distributed Processing CSE 8380 January Session 4.
1 Lecture 13: LRC & Interconnection Networks Topics: LRC implementation, interconnection characteristics.
InterConnection Network Topologies to Minimize graph diameter: Low Diameter Regular graphs and Physical Wire Length Constrained networks Nilesh Choudhury.
Anshul Kumar, CSE IITD ECE729 : Advanced Computer Architecture Lecture 27, 28: Interconnection Mechanisms In Multiprocessors 29 th, 31 st March, 2010.
Birds Eye View of Interconnection Networks
1 Interconnection Networks. 2 Interconnection Networks Interconnection Network (for SIMD/MIMD) can be used for internal connections among: Processors,
Interconnect Networks Basics. Generic parallel/distributed system architecture On-chip interconnects (manycore processor) Off-chip interconnects (clusters.
Super computers Parallel Processing
Topology How the components are connected. Properties Diameter Nodal degree Bisection bandwidth A good topology: small diameter, small nodal degree, large.
1 Lecture 24: Interconnection Networks Topics: communication latency, centralized and decentralized switches, routing, deadlocks (Appendix F)
Spring EE 437 Lillevik 437s06-l22 University of Portland School of Engineering Advanced Computer Architecture Lecture 22 Distributed computer Interconnection.
1 Lecture 14: Interconnection Networks Topics: dimension vs. arity, deadlock.
Effective bandwidth with link pipelining Pipeline the flight and transmission of packets over the links Overlap the sending overhead with the transport.
COMP8330/7330/7336 Advanced Parallel and Distributed Computing Tree-Based Networks Cache Coherence Dr. Xiao Qin Auburn University
Interconnection Networks Communications Among Processors.
1 Lecture 22: Interconnection Networks Topics: Routing, deadlock, flow control, virtual channels.
Parallel Architecture
Interconnect Networks
CS 704 Advanced Computer Architecture
Auburn University COMP8330/7330/7336 Advanced Parallel and Distributed Computing Interconnection Networks (Part 2) Dr.
Lecture 23: Interconnection Networks
Interconnection topologies
Refer example 2.4on page 64 ACA(Kai Hwang) And refer another ppt attached for static scheduling example.
Lecture 14: Interconnection Networks
Indirect Networks or Dynamic Networks
Interconnection Network Design Lecture 14
Mesh-Connected Illiac Networks
Communication operations
Lecture: Interconnection Networks
High Performance Computing & Bioinformatics Part 2 Dr. Imad Mahgoub
Lecture: Networks Topics: TM wrap-up, networks.
Lecture: Interconnection Networks
CS 6290 Many-core & Interconnect
Topology What are the consequences of using fewer than n2 connections?
Presentation transcript:

1 Lecture 23: Interconnection Networks Topics: communication latency, centralized and decentralized switches (Appendix E)

2 Topologies Internet topologies are not very regular – they grew incrementally Supercomputers have regular interconnect topologies and trade off cost for high bandwidth Nodes can be connected with  centralized switch: all nodes have input and output wires going to a centralized chip that internally handles all routing  decentralized switch: each node is connected to a switch that routes data to one of a few neighbors

3 Centralized Crossbar Switch P1 P2 P3 P4 P5 P6 P7 P0 Crossbar switch

4 Centralized Crossbar Switch P1 P2 P3 P4 P5 P6 P7 P0

5 Crossbar Properties Assuming each node has one input and one output, a crossbar can provide maximum bandwidth: N messages can be sent as long as there are N unique sources and N unique destinations Maximum overhead: WN 2 internal switches, where W is data width and N is number of nodes To reduce overhead, use smaller switches as building blocks – trade off overhead for lower effective bandwidth

6 Switch with Omega Network P1 P2 P3 P4 P5 P6 P7 P

7 Omega Network Properties The switch complexity is now O(N log N) Contention increases: P0  P5 and P1  P7 cannot happen concurrently (this was possible in a crossbar) To deal with contention, can increase the number of levels (redundant paths) – by mirroring the network, we can route from P0 to P5 via N intermediate nodes, while increasing complexity by a factor of 2

8 Tree Network Complexity is O(N) Can yield low latencies when communicating with neighbors Can build a fat tree by having multiple incoming and outgoing links P0P3P2P1P4P7P6P5

9 Bisection Bandwidth Split N nodes into two groups of N/2 nodes such that the bandwidth between these two groups is minimum: that is the bisection bandwidth Why is it relevant: if traffic is completely random, the probability of a message going across the two halves is ½ – if all nodes send a message, the bisection bandwidth will have to be N/2 The concept of bisection bandwidth confirms that the tree network is not suited for random traffic patterns, but for localized traffic patterns

10 Distributed Switches: Ring Each node is connected to a 3x3 switch that routes messages between the node and its two neighbors Effectively a repeated bus: multiple messages in transit Disadvantage: bisection bandwidth of 2 and N/2 hops on average

11 Distributed Switch Options Performance can be increased by throwing more hardware at the problem: fully-connected switches: every switch is connected to every other switch: N 2 wiring complexity, N 2 /4 bisection bandwidth Most commercial designs adopt a point between the two extremes (ring and fully-connected):  Grid: each node connects with its N, E, W, S neighbors  Torus: connections wrap around  Hypercube: links between nodes whose binary names differ in a single bit

12 Topology Examples Grid Hypercube Torus CriteriaBusRing2Dtorus6-cubeFully connected Performance Bisection bandwidth Cost Ports/switch Total links

13 Topology Examples Grid Hypercube Torus CriteriaBusRing2Dtorus6-cubeFully connected Performance Bisection bandwidth Cost Ports/switch Total links

14 k-ary d-cube Consider a k-ary d-cube: a d-dimension array with k elements in each dimension, there are links between elements that differ in one dimension by 1 (mod k) Number of nodes N = k d Number of switches : Switch degree : Number of links : Pins per node : Avg. routing distance: Diameter : Bisection bandwidth : Switch complexity : Should we minimize or maximize dimension?

15 k-ary d-Cube Consider a k-ary d-cube: a d-dimension array with k elements in each dimension, there are links between elements that differ in one dimension by 1 (mod k) Number of nodes N = k d Number of switches : Switch degree : Number of links : Pins per node : Avg. routing distance: Diameter : Bisection bandwidth : Switch complexity : N 2d + 1 Nd 2wd d(k-1)/2 d(k-1) 2wk d-1 Should we minimize or maximize dimension? (2d + 1) 2 (with no wraparound)

16 Title Bullet