Anshul Kumar, CSE IITD. CSL718: Multiprocessors. Interconnection Mechanisms: Performance Models. 20th April, 2006.


Slide 2: Connecting Processors and Memories
Shared Buses
Interconnection Networks
–Static Networks
–Dynamic Networks
[Figure: processors (P) and memories (M) joined by an Interconnection Network, and by a Global Interconnection Network.]

Slide 3: Shared Bus
Each processor sees this picture: processing, then bus access.
prob of a processor using the bus = ρ
prob of a processor not using the bus = 1 – ρ
prob of none of the n processors using the bus = (1 – ρ)^n
prob of at least one processor using the bus = 1 – (1 – ρ)^n
achieved BW on a relative scale = 1 – (1 – ρ)^n
required BW = nρ
available BW = 1
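The shared-bus expressions on this slide can be evaluated directly. The sketch below assumes the per-processor probability of using the bus is written ρ (`rho` in the code; the slide's own symbol may differ) and that processors behave independently. The function name `bus_bandwidth` is hypothetical.

```python
def bus_bandwidth(n: int, rho: float) -> dict:
    """Relative bandwidth figures for n processors sharing one bus.

    rho is the probability that a given processor is using the bus
    (assumed independent across processors).
    """
    if n < 0 or not 0.0 <= rho <= 1.0:
        raise ValueError("need n >= 0 and 0 <= rho <= 1")
    # Probability that at least one of the n processors uses the bus.
    achieved = 1.0 - (1.0 - rho) ** n
    return {
        "required": n * rho,   # demand offered by all processors
        "available": 1.0,      # the bus serves at most one request at a time
        "achieved": achieved,  # fraction of time the bus is busy
    }


if __name__ == "__main__":
    print(bus_bandwidth(4, 0.25))
```

With four processors at ρ = 0.25 the required bandwidth equals the available bandwidth, yet the achieved value stays below 1 because requests collide.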

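Slide 4: Effect of re-submitted requests
[Figure: two-state transition diagram with states A (prob = q_A) and W (prob = q_W). Transition labels: A to W with ρ(1 – P_A); A to A with 1 – ρ + ρP_A; W to W with 1 – P_A; W to A with P_A.]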
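A minimal sketch of the steady state implied by this slide, under one reading of the lost diagram: a processor in state A issues a request with probability ρ and moves to state W if the request is not accepted (probability 1 – P_A); a waiting processor re-submits every cycle and returns to A with probability P_A. Both this reading and the function name `steady_state` are assumptions, and P_A is treated here as a given constant rather than derived from the bus model.

```python
def steady_state(rho: float, p_a: float) -> tuple:
    """Return (q_A, q_W) for the two-state chain A <-> W.

    rho : probability that a processor in state A issues a request
    p_a : probability that a submitted request is accepted
    """
    leave_a = rho * (1.0 - p_a)  # A -> W: request issued and rejected
    leave_w = p_a                # W -> A: re-submitted request accepted
    if leave_a + leave_w == 0.0:
        raise ValueError("chain never changes state; steady state undefined")
    # Balance equation: q_A * leave_a == q_W * leave_w, with q_A + q_W == 1.
    q_a = leave_w / (leave_a + leave_w)
    return q_a, 1.0 - q_a


if __name__ == "__main__":
    print(steady_state(0.5, 0.5))
```

Re-submission raises the effective request rate seen by the bus, since a processor in W requests with probability 1 instead of ρ.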

Slide 7: Waiting time

Slide 8: Switched Networks
BUS
–Shared media
–Lower cost
–Lower throughput
–Scalability poor
Switched Network
–Switched paths
–Higher cost
–Higher throughput
–Scalability better

Slide 9: Interconnection Networks
Topology: who is connected to whom
Direct / Indirect: where is switching done
Static / Dynamic: when is switching done
Circuit switching / packet switching: how are connections established
Store & forward / wormhole routing: how is the path determined
Centralized / distributed: how is switching controlled
Synchronous / asynchronous: mode of operation

Slide 10: Direct and Indirect Networks
[Figure: DIRECT network, in which each node (P, M, S) contains its own switch and nodes are joined by links; INDIRECT network, in which nodes (P, M) are joined by links to a separate SWITCH.]

Slide 11: Static and Dynamic Networks
Static Networks
–fixed point-to-point connections
–usually direct
–each node pair may not have a direct connection
–routing through nodes
Dynamic Networks
–connections established as per need
–usually indirect
–path can be established between any pair of nodes
–routing through switches

Slide 12: Static Network Topologies
Linear, Star, 2D-Mesh, Tree: non-uniform connectivity

Slide 13: Static Network Topologies, contd.
Ring, Fully Connected, Torus: uniform connectivity

Slide 14: Illiac IV Mesh Network and Chordal Ring
neighbors of node r: (r ± 1) mod 9 and (r ± 3) mod 9
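The neighbor rule on this slide can be checked with a few lines of code. This sketch reads the rule as (r ± 1) mod 9 and (r ± 3) mod 9, which is an assumption about the operator lost in extraction; the function name and the `n` and `chord` parameters are hypothetical generalizations of the 9-node case.

```python
def chordal_ring_neighbors(r: int, n: int = 9, chord: int = 3) -> list:
    """Neighbors of node r in an n-node ring with chords of the given length.

    With n=9 and chord=3 this is the 3 x 3 Illiac-style mesh viewed as a
    chordal ring: each node links to r +/- 1 and r +/- 3, modulo 9.
    """
    if not 0 <= r < n:
        raise ValueError("node index out of range")
    # Python's % always returns a non-negative result for a positive modulus.
    return sorted({(r + d) % n for d in (1, -1, chord, -chord)})


if __name__ == "__main__":
    for node in range(9):
        print(node, chordal_ring_neighbors(node))
```

Every node ends up with the same four-neighbor pattern, which matches the uniform node degree expected of this topology.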

Slide 15: Fat Tree Network

Slide 16: Dynamic Networks
k × k crossbar switch: building block for multi-stage dynamic networks
2 × 2 switch: simplest crossbar
Settings of a 2 × 2 switch: straight, exchange, upper broadcast, lower broadcast
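The four settings of the 2 × 2 switch can be modelled as a small lookup table. This is an illustrative sketch only: the port numbering (0 for upper, 1 for lower) and the names `SWITCH_STATES` and `switch_2x2` are assumptions, not taken from the slides.

```python
# For each setting: (source of upper output, source of lower output),
# where 0 means the upper input and 1 means the lower input.
SWITCH_STATES = {
    "straight": (0, 1),         # upper -> upper, lower -> lower
    "exchange": (1, 0),         # upper -> lower, lower -> upper
    "upper broadcast": (0, 0),  # upper input copied to both outputs
    "lower broadcast": (1, 1),  # lower input copied to both outputs
}


def switch_2x2(state: str, upper, lower) -> tuple:
    """Return (upper_output, lower_output) for the given switch setting."""
    inputs = (upper, lower)
    src_upper, src_lower = SWITCH_STATES[state]
    return inputs[src_upper], inputs[src_lower]


if __name__ == "__main__":
    for name in SWITCH_STATES:
        print(name, switch_2x2(name, "a", "b"))
```

A multi-stage network such as the baseline or Benes network is built by wiring stages of these switches together and choosing one setting per switch.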

Slide 17: Baseline Network
Blocking can occur.

Slide 18: Benes Network
Non-blocking.

Slide 19: Switching Mechanism
Circuit Switching (connection-oriented communication)
–A circuit is established between the source and the destination
Packet Switching (connectionless communication)
–Information is divided into packets and each packet is sent independently from node to node

Slide 20: Routing in Networks
[Figure: timing diagram of the incoming and outgoing message at a node, each consisting of a header followed by payload/data, comparing store & forward routing with wormhole routing along a time axis.]

Slide 21: Routing in presence of congestion
Wormhole routing
–When the message header is blocked, many links get blocked with the message
Solution: cut-through routing
–When the message header is blocked, the tail is allowed to move, compressing the message into a single node

Slide 22: Routing Options
Deterministic routing: the same path is always followed
Adaptive routing: the best path is selected to minimize congestion
Source based routing: the message specifies the path to the destination
Destination based routing: the message specifies only the destination address

Slide 23: Some Performance Parameters
[Figure: sender and receiver timelines showing overhead, time of flight, transmission time, transport latency and total latency.]
Tx time = bytes / BW
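A small sketch of how the quantities named on this slide are commonly combined. The decomposition used here (transport latency = time of flight + transmission time; total latency = sender overhead + transport latency + receiver overhead) is the usual textbook convention and is an assumption about the lost figure. All parameter names are hypothetical.

```python
def latency(sender_overhead: float, time_of_flight: float,
            message_bytes: float, bandwidth: float,
            receiver_overhead: float) -> tuple:
    """Return (transport_latency, total_latency) in seconds.

    bandwidth is in bytes per second; all times are in seconds.
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    tx_time = message_bytes / bandwidth    # Tx time = bytes / BW
    transport = time_of_flight + tx_time   # time spent in the network
    total = sender_overhead + transport + receiver_overhead
    return transport, total


if __name__ == "__main__":
    # 1000-byte message, 1 GB/s link, 1 us overhead at each end, 0.5 us flight.
    print(latency(1e-6, 0.5e-6, 1000, 1e9, 1e-6))
```

For short messages the two overhead terms dominate, which is why raising link bandwidth alone often does little for small transfers.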

Slide 24: Other Parameters
Throughput ≤ Bandwidth (no credit for header)
Bisection bandwidth = BW across a bisection
Node degree
Network diameter
Cost
Fault tolerance

Slide 25: Multidimensional Grid/Mesh
Size = k × k × … × k (n times) = k^n
Diameter = (k – 1) × n without end-around connections
Diameter = k × n / 2 with end-around connections (k-ary n-cube)
For a (binary) hypercube: k = 2
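The size and diameter expressions on this slide are easy to tabulate. The sketch below uses integer division for the end-around case, so an odd k gives floor(k/2) hops per dimension; that rounding is an assumption, since the slide states k × n / 2 without saying how odd k is handled. The function name is hypothetical.

```python
def grid_properties(k: int, n: int, end_around: bool) -> tuple:
    """Return (size, diameter) of an n-dimensional grid with k nodes per dimension.

    end_around=False : plain mesh, diameter (k - 1) * n
    end_around=True  : k-ary n-cube, diameter (k // 2) * n
    """
    if k < 1 or n < 1:
        raise ValueError("k and n must be at least 1")
    size = k ** n
    per_dimension = (k // 2) if end_around else (k - 1)
    return size, per_dimension * n


if __name__ == "__main__":
    print(grid_properties(4, 2, end_around=False))  # 4 x 4 mesh
    print(grid_properties(4, 2, end_around=True))   # 4 x 4 torus
    print(grid_properties(2, 3, end_around=True))   # 3-dimensional hypercube
```

For k = 2 both formulas give a diameter of n, consistent with the hypercube being the special case noted on the slide.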

Slide 26: Grid/Mesh Performance - 1
[Figure: only the label k_d survives.]

Slide 27: Grid/Mesh Performance - 2

Slide 28: Grid/Mesh Performance - 3
k-ary n-cube

Slide 29: Switch Performance
k × m crossbar switch

Slide 30: Switch Performance, contd.

Slide 31: Switch Performance, contd.

Slide 32: Effect of re-submitted requests

Slide 33: Effect of buffering
There are two possibilities:
–Buffering before switching (k buffers, one at each input port)
–Buffering after switching (m buffers, one at each output port)

Slide 34: Switch with input buffers
In steady state, the rate of messages at the input and at the output of each queue is the same: r per cycle.
Service time includes delays due to conflicts, calculated as earlier. It has an exponential distribution (recall the analysis for a shared bus).
The M/M/1 open queue model can be used to calculate the queuing delay. Details are omitted.
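The slide omits the details, so the following is only a sketch of the standard M/M/1 result for mean time spent waiting in the queue, T_w = ρ T_s / (1 – ρ), where T_s is the mean service time and ρ = r × T_s is the utilization. Applying it per input buffer, and the names `mm1_waiting_time`, `arrival_rate` and `service_time`, are assumptions.

```python
def mm1_waiting_time(arrival_rate: float, service_time: float) -> float:
    """Mean time a message waits in an M/M/1 queue before service starts.

    arrival_rate : messages per cycle arriving at the queue (r)
    service_time : mean service time in cycles, including conflict delays
    """
    rho = arrival_rate * service_time  # utilization of the server
    if not 0.0 <= rho < 1.0:
        raise ValueError("queue is unstable unless 0 <= rho < 1")
    return rho * service_time / (1.0 - rho)


if __name__ == "__main__":
    for r in (0.1, 0.5, 0.9):
        print(r, mm1_waiting_time(r, 1.0))
```

The delay grows without bound as the utilization approaches 1, which is the usual argument for keeping the offered load well below the service capacity.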

Slide 35: Switch with output buffers
Here we assume that all the messages destined for the same output are queued in the same buffer, in some order. That is, there are no rejections and no re-submissions.
For each queue:
Messages arriving per service cycle = ρ = [expression missing in source]
Prob of a request coming from one of the k sources = p = [expression missing in source]
Apply the M_B/D/1 model to find the queuing delay T_w.
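The slide's own expressions are not preserved, so this is a hedged sketch of the usual M_B/D/1 result: with binomial arrivals from k independent sources, each sending to this output with probability p per service cycle, the occupancy is ρ = k × p and the mean wait is T_w = (ρ – p) / (2(1 – ρ)) service cycles. The relation ρ = k × p and the function name are assumptions, not taken from the slides.

```python
def mbd1_waiting_time(k: int, p: float, service_time: float = 1.0) -> float:
    """Mean queuing delay for binomial arrivals and deterministic service.

    k : number of sources that can send to this output buffer
    p : probability that one source sends a message here in a service cycle
    """
    rho = k * p  # mean number of messages arriving per service cycle
    if k < 1 or not 0.0 <= p <= 1.0 or rho >= 1.0:
        raise ValueError("need k >= 1, 0 <= p <= 1 and k * p < 1")
    return (rho - p) / (2.0 * (1.0 - rho)) * service_time


if __name__ == "__main__":
    print(mbd1_waiting_time(4, 0.2))
    print(mbd1_waiting_time(1, 0.5))
```

With a single source the delay is zero, since at most one message can arrive per cycle; as k grows with ρ held fixed, the expression approaches the M/D/1 value ρ / (2(1 – ρ)).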

Slide 36: References
D. Sima, T. Fountain, P. Kacsuk, "Advanced Computer Architectures: A Design Space Approach", Addison Wesley.
K. Hwang, "Advanced Computer Architecture: Parallelism, Scalability, Programmability", McGraw Hill, 1993.