1 Message passing architectures and routing CEG 4131 Computer Architecture III Miodrag Bolic Material for these slides is taken from the book: W. Dally,

Slides:



Advertisements
Similar presentations
Presentation of Designing Efficient Irregular Networks for Heterogeneous Systems-on-Chip by Christian Neeb and Norbert Wehn and Workload Driven Synthesis.
Advertisements

Routing in a Parallel Computer. A network of processors is represented by graph G=(V,E), where |V| = N. Each processor has unique ID between 1 and N.
Flattened Butterfly Topology for On-Chip Networks John Kim, James Balfour, and William J. Dally Presented by Jun Pang.
REAL-TIME COMMUNICATION ANALYSIS FOR NOCS WITH WORMHOLE SWITCHING Presented by Sina Gholamian, 1 09/11/2011.
1 Message passing architectures and routing CEG 4131 Computer Architecture III Miodrag Bolic Material for these slides is taken from the book: W. Dally,
Flattened Butterfly: A Cost-Efficient Topology for High-Radix Networks ______________________________ John Kim, William J. Dally &Dennis Abts Presented.
What is Flow Control ? Flow Control determines how a network resources, such as channel bandwidth, buffer capacity and control state are allocated to packet.
1 Chapter 9 Computer Networks. 2 Chapter Topics OSI network layers Network Topology Media access control Addressing and routing Network hardware Network.
CSCI 8150 Advanced Computer Architecture
Communication operations Efficient Parallel Algorithms COMP308.
Predictive Load Balancing Reconfigurable Computing Group.
Rotary Router : An Efficient Architecture for CMP Interconnection Networks Pablo Abad, Valentín Puente, Pablo Prieto, and Jose Angel Gregorio University.
Interconnection Networks
Architecture and Routing for NoC-based FPGA Israel Cidon* *joint work with Roman Gindin and Idit Keidar.
Issues in System-Level Direct Networks Jason D. Bakos.
1 Indirect Adaptive Routing on Large Scale Interconnection Networks Nan Jiang, William J. Dally Computer System Laboratory Stanford University John Kim.
Storage area network and System area network (SAN)
Dragonfly Topology and Routing
Performance and Power Efficient On-Chip Communication Using Adaptive Virtual Point-to-Point Connections M. Modarressi, H. Sarbazi-Azad, and A. Tavakkol.
Connecting LANs, Backbone Networks, and Virtual LANs
Switching, routing, and flow control in interconnection networks.
Lecture 1, 1Spring 2003, COM1337/3501Computer Communication Networks Rajmohan Rajaraman COM1337/3501 Textbook: Computer Networks: A Systems Approach, L.
High Performance Embedded Computing © 2007 Elsevier Lecture 16: Interconnection Networks Embedded Computing Systems Mikko Lipasti, adapted from M. Schulte.
1 The Turn Model for Adaptive Routing. 2 Summary Introduction to Direct Networks. Deadlocks in Wormhole Routing. System Model. Partially Adaptive Routing.
Chapter 2 The Infrastructure. Copyright © 2003, Addison Wesley Understand the structure & elements As a business student, it is important that you understand.
Interconnect Networks
1 Lecture 7: Interconnection Network Part I: Basic Definitions Part II: Message Passing Multicomputers.
Data and Computer Communications
QoS Support in High-Speed, Wormhole Routing Networks Mario Gerla, B. Kannan, Bruce Kwan, Prasasth Palanti,Simon Walton.
Author : Jing Lin, Xiaola Lin, Liang Tang Publish Journal of parallel and Distributed Computing MAKING-A-STOP: A NEW BUFFERLESS ROUTING ALGORITHM FOR ON-CHIP.
Computer Networks Performance Metrics. Performance Metrics Outline Generic Performance Metrics Network performance Measures Components of Hop and End-to-End.
Course Wrap-Up Miodrag Bolic CEG4136. What was covered Interconnection network topologies and performance Shared-memory architectures Message passing.
1 Dynamic Interconnection Networks Miodrag Bolic.
Deadlock CEG 4131 Computer Architecture III Miodrag Bolic.
Sami Al-wakeel 1 Data Transmission and Computer Networks The Switching Networks.
Network-on-Chip Introduction Axel Jantsch / Ingo Sander
Anshul Kumar, CSE IITD CSL718 : Multiprocessors Interconnection Mechanisms Performance Models 20 th April, 2006.
CS 8501 Networks-on-Chip (NoCs) Lukasz Szafaryn 15 FEB 10.
Anshul Kumar, CSE IITD ECE729 : Advanced Computer Architecture Lecture 27, 28: Interconnection Mechanisms In Multiprocessors 29 th, 31 st March, 2010.
Interconnect simulation. Different levels for Evaluating an architecture Numerical models – Mathematic formulations to obtain performance characteristics.
Interconnect simulation. Different levels for Evaluating an architecture Numerical models – Mathematic formulations to obtain performance characteristics.
McGraw-Hill©The McGraw-Hill Companies, Inc., 2004 Connecting Devices CORPORATE INSTITUTE OF SCIENCE & TECHNOLOGY, BHOPAL Department of Electronics and.
Lecture 3 Applications of TDM ( T & E Lines ) & Statistical TDM.
Interconnect Networks Basics. Generic parallel/distributed system architecture On-chip interconnects (manycore processor) Off-chip interconnects (clusters.
CS440 Computer Networks 1 Packet Switching Neil Tang 10/6/2008.
Virtual-Channel Flow Control William J. Dally
1 Lecture 24: Interconnection Networks Topics: communication latency, centralized and decentralized switches, routing, deadlocks (Appendix F)
1 Switching and Forwarding Sections Connecting More Than Two Hosts Multi-access link: Ethernet, wireless –Single physical link, shared by multiple.
Effective bandwidth with link pipelining Pipeline the flight and transmission of packets over the links Overlap the sending overhead with the transport.
Univ. of TehranIntroduction to Computer Network1 An Introduction to Computer Networks University of Tehran Dept. of EE and Computer Engineering By: Dr.
Review of Useful Definitions Statistical multiplexing is a method of sharing a link among transmissions. When computers use store-and-forward packet switching,
COMP8330/7330/7336 Advanced Parallel and Distributed Computing Communication Costs in Parallel Machines Dr. Xiao Qin Auburn University
Network Layer COMPUTER NETWORKS Networking Standards (Network LAYER)
Chapter 3 Part 3 Switching and Bridging
How to Train your Dragonfly
EE 122: Lecture 19 (Asynchronous Transfer Mode - ATM)
Lecture 23: Interconnection Networks
ECE 544 Protocol Design Project 2016
Azeddien M. Sllame, Amani Hasan Abdelkader
Chapter 3 Part 3 Switching and Bridging
Interconnection Networks: Routing
Communication operations
Storage area network and System area network (SAN)
CEG 4131 Computer Architecture III Miodrag Bolic
EE 122: Lecture 7 Ion Stoica September 18, 2001.
EE382C Lecture 6 Adaptive Routing 4/14/11 What is tornado traffic?
Chapter 3 Part 3 Switching and Bridging
Switching, routing, and flow control in interconnection networks
Switching Chapter 2 Slides Prepared By: -
Multiprocessors and Multi-computers
Presentation transcript:

1 Message passing architectures and routing CEG 4131 Computer Architecture III Miodrag Bolic Material for these slides is taken from the book: W. Dally, B. Towles, Principles and Practices of Interconnection Networks, Morgan Kaufmann, 2004

2 Definitions [1] A network channel c=(x,y) is characterized by –width w c : the number of parallel signals it contains, –frequency f c : the rate at which bits are transported at each signal –latency t c is the time required for a bit to travel from x to y. A bandwidth of a channel is W= w c * f c. The throughput Θ of a network is the data rate in bits per second that network accepts per input port. Under a particular traffic pattern, the channel that carries the largest fraction of the traffic determines the maximum channel load γ. Load on the channel can be equal or smaller than channel bandwidth. Θ=W/γ

3 Taxonomy of Routing Algorithms [1] Deterministic: The simplest algorithm - for each source, destination pair, there is a single path. This routing algorithm usually achieves poor performance because it fails to use alternative routes, and concentrates traffic on only one set of channels. Oblivious: So named because it ignores the state of the network when determining a path. Unlike deterministic, it considers a set of paths from a source to a destination, and chooses between them. Adaptive: The routing algorithm changes based on the state of the network.

4 Routing algorithms [1] Greedy: Always send the packet in the shortest direction around the ring. For example, always route from 0 to 3 in the clockwise direction and from 0 to 5 in the counterclockwise direction. If the distance is the same in both directions, pick a direction randomly. Uniform random: Randomly pick a direction for each packet, with equal probability of picking either direction. Weighted random: Randomly pick a direction for each packet, but weight the short direction with probability 1 - Δ /8 and the long direction with Δ/8, where Δ is the (minimum) distance between the source and destination. Adaptive: Send the packet in the direction for which the local channel has the lowest load. We may approximate load by either measuring the length of the queue serving this channel or recording how many packets it has transmitted over the last T slots.

5 Example [1] Consider a tornado traffic pattern in which each node i sends a packet to i + 3 mod 8. Which algorithm gives the best worst-case throughput?

6

7 Explanation [1] With the greedy routing algorithm, all of the traffic routes in the clockwise direction around the ring, leaving all of the counterclockwise channels idle and loading the clockwise channels with 3 units of traffic, that is, γ = 3, which gives every terminal a throughput of Θ = W/3. With random routing, the counterclockwise links become the bottleneck with a load of γ = 5/2, since half of the traffic traverses 5 links in the counterclockwise direction. This gives a throughput of 2W/5. Weighting the random decision sends 5/8 of the traffic over 3 links and 3/8 of the traffic over 5 links for a load of γ = 15/8 in both directions giving a throughput of 8W/15. Adaptive routing, with some assumptions on how the adaptivity is implemented, will match this perfect load balance in the steady state, giving the same throughput as weighted random routing.

8 Message Formats [2] Message: logical unit for internode communication Packet: basic unit containing destination address for routing Packets have sequencing # for reassembly Flits: flow control digits of packets Store-and-forward: packets Wormhole routing: flits

9 Packets and Flits [2] Header flits contain routing information and sequence number Flit length affected by network size Packet length determined by routing scheme and network implementation Lengths also dependent on channel b/w, router design, network traffic, etc.

10 Message Format [2]

11 Latency Analysis [2] L=packet length W=channel b/w (bits/s) D=distanceF=flit length T SF =(D + 1)L/W T WH =L/W + D*F/W Store-and-forward: controlled by s/w Wormhole: controlled by h/w

12 From [3]

13 Implementation of a simple network [1] Butterfly network

14 Performance requirements [1] Input ports64 Output ports64 Peak bandwidth0.25GBytes/s Average bandwidth0.25GBytes/s Message latency100ns Message size4-64 bytes Traffic patternrandom Quality of servicedropping acceptance Reliabilitydropping acceptance

15

16 Flow control [1] Packet format for our network Type encoding for our network

17 Router [1]

18 Allocator [1]

19 References 1. W. Dally, B. Towles, Principles and Practices of Interconnection Networks, Morgan Kaufmann, K. 2.Slides are from the course Advanced Computer Architecture by Dr. Anu Bourgeois, Department of Computer Science at Georgia State UniversityAdvanced Computer Architecture 3. Hwang, Advanced Computer Architecture Parallelism, Scalability, Programmability, McGraw-Hill 1993.