1 The Turn Model for Adaptive Routing. 2 Summary Introduction to Direct Networks. Deadlocks in Wormhole Routing. System Model. Partially Adaptive Routing.

Presentation transcript:

1 The Turn Model for Adaptive Routing

2 Summary Introduction to Direct Networks. Deadlocks in Wormhole Routing. System Model. Partially Adaptive Routing Algorithms in 2D Meshes. p-cube Routing in Hypercubes. Analysis. Conclusion and Future Work.

3 What is a Direct Network? Each node has a point-to-point, or direct, connection to some number of other nodes. Offers massive parallelism and scalability. Popular architecture for constructing massively parallel computers. Nodes communicate by passing messages.

4 Generic Architecture

5 Evaluation of Direct Network. Communication latency = start-up latency + network latency + blocking time. Start-up latency: the time to handle the packet at the source and destination nodes. Network latency: the time between a packet leaving the source and arriving at the destination. Blocking time: all possible delays encountered during the lifetime of a packet. Communication latency depends on the type of switching technique → Wormhole routing

6 Switching Techniques: Circuit Switching, Packet Switching, Virtual Cut-Through, Wormhole Switching

04/21/10 Packet Routing There are two basic approaches to routing packets, based on what a switch does with a packet as its flits begin to arrive. 1) Store-and-forward 2) Cut-through: virtual cut-through, wormhole

8 Wormhole Switching A packet is divided into a number of flits. The header flit of a packet governs the route. The remaining flits follow in a pipeline fashion. Thus a message resembles a worm burrowing through the network.

9 Transmission A packet is transmitted from a node as flits (a flit is the smallest unit on which flow control can be performed). There are two kinds of flits: header flits and data flits. The header flit tries to acquire the next channel while the data flits are transmitted through the channels already obtained. A channel is released only when the last flit of the message has passed through it. Flits of two messages cannot be interleaved.

10 Wormhole Routing Based on the cut-through concept. Relatively independent of path length. Overcomes: the buffer size problem of store-and-forward and virtual cut-through; the channel reservation and release problem of circuit switching. Two positive effects: in the absence of network contention, network latency is relatively insensitive to path length; large packet buffers at each intermediate node are obviated.
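The claim that wormhole latency is relatively insensitive to path length can be illustrated with a simplified, contention-free latency model. This is a sketch only: the formulas are the textbook approximations commonly used for store-and-forward and wormhole switching, not something stated on the slides, and the parameter names (`packet_flits`, `flit_size`, `bandwidth`, `hops`) are assumptions made for illustration.

```python
def store_and_forward_latency(packet_flits, bandwidth, hops):
    """Contention-free store-and-forward model (an assumption, not from the slides):
    the whole packet is received and retransmitted at every node on the path."""
    return (packet_flits / bandwidth) * (hops + 1)


def wormhole_latency(packet_flits, flit_size, bandwidth, hops):
    """Contention-free wormhole model (an assumption, not from the slides):
    only the header pays the per-hop cost; the body is pipelined behind it."""
    return (flit_size / bandwidth) * hops + packet_flits / bandwidth


if __name__ == "__main__":
    # Hypothetical packet of 100 flits on 20 flits/microsecond channels.
    for hops in (2, 10, 30):
        sf = store_and_forward_latency(100, 20, hops)
        wh = wormhole_latency(100, 1, 20, hops)
        print(f"hops={hops:2d}  store-and-forward={sf:6.2f} us  wormhole={wh:5.2f} us")
```

In this model the store-and-forward term grows with the product of packet length and distance, while the wormhole term adds only one flit time per hop.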

11 Wormhole Routing Advantages: reduced buffering, low latency. (Figure: a wormhole packet as a sequence of flits, led by a flit carrying the routing info.)

12 Characteristics of Routing Algorithms A good routing algorithm: reduces the network latency; is easily implemented in hardware; achieves high network throughput.

13 Deadlock Deadlock can arise because packets are allowed to hold a resource (a channel) while waiting for another resource. Remedy: a deadlock-free routing algorithm.

14 Deadlocks in Wormhole Switching The header flit contains all the routing information required to move the data flits across the network. If the header flit cannot move any further, the resulting congestion causes a chain of blocked packets in the network, which can lead to deadlock.

15 The Turn Model Basis: analyze the directions in which packets can turn in the network; determine the cycles that such turns can form; prohibit just enough turns to break all cycles. Resulting routing algorithms are: deadlock and livelock free; minimal or non-minimal; highly adaptive.

16 The Turn Model (contd.) Deadlock free. Livelock free: livelock occurs when the routing of a packet never leads it to its destination. Adaptive: determines the path from A to B based on the network load. Minimal: restricts packets to shortest paths. Non-minimal: although minimal routing may initially sound more promising, non-minimal routing provides more choices.

17 The Turn Model (contd…) Classify channels according to the direction in which they route packets. Identify the turns that occur between one direction and another, omitting 0-degree and 180-degree turns. Identify the simple cycles these turns can form. Prohibit one turn in each cycle. In the case of k-ary n-cubes, incorporate as many turns as possible that involve wraparound channels. Add 180-degree and 0-degree turns if there are multiple channels in the same direction.

18 The Turn Model (contd.) Simple Illustration: - The possible turns and simple cycles in a two-dimensional mesh. - The four turns allowed by the xy routing algorithm. - Six turns that complete the cycles and allow deadlock.
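The effect of prohibiting turns in a 2D mesh can be checked mechanically. The following is a minimal sketch, not the presenters' method: it builds a channel dependency graph for a small mesh, where an edge means a packet holding one channel may request the next under the allowed turns, and tests it for cycles with a topological sort. The function names, the 4x4 mesh size and the direction labels are assumptions chosen for illustration.

```python
from itertools import product

# Unit vectors for the four channel directions (x grows east, y grows north).
DIRS = {"E": (1, 0), "W": (-1, 0), "N": (0, 1), "S": (0, -1)}
CW_TURNS = [("E", "S"), ("S", "W"), ("W", "N"), ("N", "E")]   # clockwise cycle
CCW_TURNS = [("E", "N"), ("N", "W"), ("W", "S"), ("S", "E")]  # counterclockwise cycle


def has_cycle(size, prohibited):
    """True if the channel dependency graph of a size x size mesh contains a cycle."""
    def inside(x, y):
        return 0 <= x < size and 0 <= y < size

    # A channel (x, y, d) is the link leaving node (x, y) in direction d.
    channels = [(x, y, d) for x in range(size) for y in range(size)
                for d, (dx, dy) in DIRS.items() if inside(x + dx, y + dy)]
    # Going straight is always allowed; 180-degree turns are never allowed.
    ninety = (set(CW_TURNS) | set(CCW_TURNS)) - set(prohibited)
    allowed = {(d, d) for d in DIRS} | ninety

    succ, indeg = {}, {c: 0 for c in channels}
    for (x, y, d) in channels:
        nx, ny = x + DIRS[d][0], y + DIRS[d][1]
        succ[(x, y, d)] = [(nx, ny, d2) for d2, (dx, dy) in DIRS.items()
                           if (d, d2) in allowed and inside(nx + dx, ny + dy)]
        for nxt in succ[(x, y, d)]:
            indeg[nxt] += 1

    # Kahn's algorithm: channels that can never be removed lie on a cycle.
    ready = [c for c in channels if indeg[c] == 0]
    removed = 0
    while ready:
        c = ready.pop()
        removed += 1
        for nxt in succ[c]:
            indeg[nxt] -= 1
            if indeg[nxt] == 0:
                ready.append(nxt)
    return removed < len(channels)


def count_safe_pairs(size=4):
    """Count the ways of prohibiting one turn per cycle that leave no dependency cycle."""
    return sum(not has_cycle(size, [cw, ccw])
               for cw, ccw in product(CW_TURNS, CCW_TURNS))


if __name__ == "__main__":
    print("no turns prohibited:", has_cycle(4, []))
    print("xy routing:", has_cycle(4, [("N", "E"), ("N", "W"), ("S", "E"), ("S", "W")]))
    print("west-first:", has_cycle(4, [("N", "W"), ("S", "W")]))
    print("safe one-per-cycle choices:", count_safe_pairs())
```

Prohibiting one turn from each cycle is necessary but not always sufficient: when the two prohibited turns are reverses of each other, the remaining six turns can still close a cycle, which is the situation the third bullet of the slide describes.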

19 Turn Model (contd.) West-First Routing Algorithm Prohibited turns are the two to the west. Route a packet first west if necessary, and then adaptively south, east, and north. Both minimal and non-minimal paths are shown. Dally and Seitz proved that a routing algorithm is deadlock free if the channels in the interconnection network can be numbered so that the algorithm routes every packet along channels with strictly decreasing numbers.
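The minimal form of west-first routing can be sketched as a function that lists the output directions a packet may take at its current node. The coordinate convention (x grows east, y grows north) and the name `west_first_minimal` are assumptions for illustration; a real router would also need a selection policy to pick among the returned choices.

```python
def west_first_minimal(cur, dst):
    """Return the productive directions allowed by minimal west-first routing.

    cur and dst are (x, y) node coordinates; x grows east and y grows north
    (a convention assumed here, not taken from the slides).
    """
    cx, cy = cur
    dx, dy = dst
    if dx < cx:
        # All westward hops come first, with no adaptivity, because turning
        # west later would require one of the two prohibited turns.
        return ["W"]
    choices = []
    if dx > cx:
        choices.append("E")
    if dy > cy:
        choices.append("N")
    if dy < cy:
        choices.append("S")
    return choices  # an empty list means the packet has arrived


if __name__ == "__main__":
    print(west_first_minimal((3, 1), (0, 2)))  # destination to the west
    print(west_first_minimal((0, 0), (2, 2)))  # destination to the north-east
```

The algorithm is fully adaptive only when the destination lies to the east of, or in the same column as, the source; a destination to the west leaves a single choice until the westward hops are done.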

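20 Deadlock Free Routing: West First Algorithm (figure: mesh with source S and destination D)

21 Deadlock Free Routing: West First Algorithm (figure: mesh with source S and destination D)

22 Deadlock Free Routing: West First Algorithm (figure: mesh with source S and destination D)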

23 Turn Model (contd.) North-Last Routing Algorithm

24 Deadlock Free Routing: North Last Algorithm (figure: mesh with source S and destination D)

25 Turn Model (contd.) Negative-First Routing Algorithm Once a packet has traveled in a +ve direction, it will never turn to a -ve direction. (Figure: the four directions +x, -x, +y, -y.)
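A minimal negative-first router can be sketched for a mesh of any dimension: all hops in negative directions are taken first, adaptively, and only then the hops in positive directions. The function name and the `(dimension, sign)` encoding of a hop are assumptions made for this illustration.

```python
def negative_first_minimal(cur, dst):
    """Return the productive hops allowed by minimal negative-first routing.

    cur and dst are coordinate tuples of equal length. Each hop is encoded as
    (dimension, sign), with sign -1 or +1 (an encoding assumed here).
    """
    negative = [(i, -1) for i, (c, d) in enumerate(zip(cur, dst)) if d < c]
    if negative:
        # While any negative hop is still needed, positive hops are withheld,
        # since a later positive-to-negative turn is prohibited.
        return negative
    return [(i, +1) for i, (c, d) in enumerate(zip(cur, dst)) if d > c]


if __name__ == "__main__":
    print(negative_first_minimal((3, 3), (1, 0)))  # two negative choices
    print(negative_first_minimal((3, 0), (1, 2)))  # must go -x before +y
    print(negative_first_minimal((0, 0), (2, 2)))  # two positive choices
```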

26 Turn Model (contd.) n-dimensional meshes: Prohibit n(n-1) 90-degree turns to prevent deadlock. One half of all possible 180-degree turns must be prohibited. Resulting algorithms: all-but-one-negative-first, all-but-one-positive-last, negative-first. k-ary n-cubes: Allows use of the wraparound channels. Assigns the wraparound channels a number. Negative-first algorithm: classify each wraparound channel according to the direction in which it routes packets, then apply the algorithm.
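The n(n-1) figure can be checked by enumeration. The sketch below counts all 90-degree turns in an n-dimensional mesh and the subset that negative-first prohibits (turns from a positive direction to a negative one). The function name and the `(dimension, sign)` encoding are assumptions for illustration.

```python
def count_turns(n):
    """Return (total 90-degree turns, turns prohibited by negative-first) for n dimensions."""
    directions = [(dim, sign) for dim in range(n) for sign in (+1, -1)]
    # A 90-degree turn changes dimension; same-dimension pairs are 0 or 180 degrees.
    turns = [(a, b) for a in directions for b in directions if a[0] != b[0]]
    prohibited = [(a, b) for a, b in turns if a[1] == +1 and b[1] == -1]
    return len(turns), len(prohibited)


if __name__ == "__main__":
    for n in (2, 3, 4):
        total, prohibited = count_turns(n)
        print(f"n={n}: {total} turns, {prohibited} prohibited, n(n-1)={n * (n - 1)}")
```

The total is 4n(n-1), so the prohibited turns make up a quarter of all 90-degree turns.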

27 p-cube Routing in Hypercubes Hypercubes are a special case of both n-dimensional meshes and k-ary n-cubes. S = binary address of the source node for a packet. C = binary address of the node the header flit currently occupies. D = binary address of the destination node. Two phases: Phase 1: route the packet along any dimension i for which ci=1 and di=0 (minimal). Phase 2: when no such dimension remains, route the packet along any dimension i for which ci=0 and di=1 (minimal).
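The two phases map directly onto bit operations on the addresses. The following is a minimal sketch, assuming addresses are held as Python integers with bit i standing for dimension i; the function names are hypothetical.

```python
def p_cube_minimal_choices(c, d, n):
    """Return the dimensions the header may take next under minimal p-cube routing.

    c: address of the current node, d: address of the destination,
    n: number of dimensions. Bit i of an address is its coordinate in dimension i.
    """
    mask = (1 << n) - 1
    phase1 = c & ~d & mask          # bits where ci = 1 and di = 0
    phase2 = ~c & d & mask          # bits where ci = 0 and di = 1
    r = phase1 if phase1 else phase2
    return [i for i in range(n) if (r >> i) & 1]


def hop(c, dim):
    """Move along one dimension by flipping the corresponding address bit."""
    return c ^ (1 << dim)


if __name__ == "__main__":
    # Hypothetical 3-cube example: from 011 to 100.
    c, d = 0b011, 0b100
    while c != d:
        options = p_cube_minimal_choices(c, d, 3)
        print(f"at {c:03b}: choices {options}")
        c = hop(c, options[0])      # always take the lowest dimension here
```

Choosing the first option is only a placeholder; an adaptive router would pick among the returned dimensions according to channel availability.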

28 The minimal p-cube …. Example : 10-cube S= D= h0=3 and h1=3 so h=6 36 possible shortest paths. Phase 1: C= = C9 C8 C7 C6 C5 C4 C3 C2 C1 C0 C= ;( ci=1 and di=0) ‘D= ^ R = ( 3 choices) dimension taken d2 C= ‘D= R= (2 Choices) dim. taken d9 C= ‘D= R= (1 choice) dim d6 C= ‘D= R=

29 The minimal p-cube …. Example : 10-cube S= D= h0=3 and h1=3 so h=6 36 possible shortest paths. Phase 2: C= D= ‘C= D= R= Choices = 3 Dim=d5 C= ( 0 to 1) ‘C= D= R= choices= 2 Dim=d0 C= ‘C= D= R= Choice=1 Dim =d3 C= = D
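The figure of 36 shortest paths follows from the two phases: the h1 = 3 dimensions of phase 1 can be taken in 3! orders and the h0 = 3 dimensions of phase 2 in 3! orders. The sketch below checks this by enumeration. The 10-bit addresses used are hypothetical stand-ins chosen to give h0 = h1 = 3; they are not the addresses from the slides, and the function names are assumptions.

```python
from math import factorial


def p_cube_choices(c, d, n):
    """Dimensions allowed next by minimal p-cube routing (phase 1, else phase 2)."""
    mask = (1 << n) - 1
    phase1 = c & ~d & mask
    r = phase1 if phase1 else (~c & d & mask)
    return [i for i in range(n) if (r >> i) & 1]


def count_paths(c, d, n):
    """Count the distinct shortest paths minimal p-cube routing can take from c to d."""
    if c == d:
        return 1
    return sum(count_paths(c ^ (1 << i), d, n) for i in p_cube_choices(c, d, n))


if __name__ == "__main__":
    # Hypothetical addresses: three bits must go 1 -> 0 and three must go 0 -> 1.
    s, d = 0b0000000111, 0b0000111000
    paths = count_paths(s, d, 10)
    fully_adaptive = factorial(6)   # a fully adaptive minimal router allows h! orders
    print(paths, fully_adaptive, paths / fully_adaptive)
```

The ratio of the two counts is one way to express the degree of adaptiveness relative to a fully adaptive algorithm.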

30 The nonminimal p-cube algo. It is desirable because of its increased adaptiveness. The first phase can route the packet along any dimension i for which ci=1 and di=1. p: last hop was in the +ve direction. Expect more choices.

31 The nonminimal p-cube algo (0,0,0) (1,0,0) (0,0,1) (0,1,1) (1,1,1) (1,0,1) (1,1,0) The meaning of the algorithm's steps are with Dr. Pfeiffer’s help (example): 2- if last hop from +ve. Take the +ve only. If S=011, D=111 One choice 011→ else if last hop was –ve and if destination not any more –ve go –ve or +ve. S=011, D=111 Choice 1 : 011→ 111 Choice 2: 011 → 001 Choice 3: 011 → else going -ve (0,1,0)

32 Simulation Experiments Comparison of the partially adaptive routing algorithms with nonadaptive routing algorithms. Simulation of a 16x16 mesh and a binary 8-cube for three different traffic patterns. Each of these networks contains 256 nodes. Channel bandwidth is 20 flits/microsec. For uniform traffic in the mesh and the hypercube, the nonadaptive routing algorithms have lower latencies at high throughputs than the partially adaptive algorithms.

33 Simulation Experiments (contd.) At low throughputs, the algorithms perform about the same. For the non-uniform traffic patterns, the partially adaptive routing algorithms have lower latencies at high throughputs. The negative-first algorithm has a higher degree of adaptiveness than the others, whether judged by Hamming distance or by the ratio S(algo.)/S(fully adaptive).

34 Conclusion The turn model produces deadlock free, livelock free, minimal or non-minimal and maximally adaptive algorithms. These algorithms perform better than nonadaptive algorithms for nonuniform patterns of message traffic.

35 A Peek into the Future To investigate the effects of different input and output selection policies on network performance. To illustrate the application of the turn model to networks that include extra physical or virtual channels. To apply the turn model to other topologies, such as hexagonal, octagonal and cube-connected cycle networks. To identify realistic workload distributions.

36 References The Turn Model for adaptive routing Wormhole Routing in Parallel Computers A processor architecture for multiprocessing multiprocessor