Parallel Routing
Bruce, Chiu-Wing Sham

Overview
Background
Routing in parallel computers
Routing in hypercube network
– Bit-fixing routing algorithm
– Randomized routing algorithm

Parallel Computer Architectures
Parallel computers consist of multiple processing elements connected by a specific interconnection topology
Examples:
– linear array
– hypercube
– mesh
– fat tree

Interconnection Topologies

Routing in Parallel Computers
Parallel computers are modeled by directed graphs
All communication between processors (nodes) occurs in synchronous steps
Each link can carry at most one unit message (packet) in one step
During a step, a node can send at most one packet to each of its neighbors
Each node is uniquely identified by a number between 1 and N

Permutation Routing Problem
A network of N nodes, {1, …, N}
Each node i initially holds one packet $v_i$, which must be routed to its destination node d(i)
The destinations d(i), for $1 \le i \le N$, form a permutation of {1, …, N}, i.e., every node is the destination of exactly one packet

Oblivious Routing Algorithm
Properties:
– A route is specified for each node i and its destination node d(i)
– The route between node i and node d(i) depends on i and d(i) only, not on the destinations of the other packets

Oblivious Routing Algorithm
Theorem 1:
– For any deterministic oblivious permutation routing algorithm on a network of N nodes each of degree d, there is an instance of permutation routing requiring $\Omega(\sqrt{N/d})$ steps
Proof:
– Paper: C. Kaklamanis, D. Krizanc, T. Tsantilas, "Tight Bounds for Oblivious Routing in the Hypercube", Proc. of the ACM Symposium on Parallel Algorithms and Architectures, 1990

Hypercube Topology

Hypercube Network
An n-dimensional hypercube network:
– Number of nodes: $N = 2^n$
– Degree: n
– The node i with address $(i_1, i_2, \dots, i_n) \in \{0, 1\}^n$ and the node j with address $(j_1, j_2, \dots, j_n) \in \{0, 1\}^n$ are connected if the Hamming distance between $(i_1, i_2, \dots, i_n)$ and $(j_1, j_2, \dots, j_n)$ is 1
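The adjacency rule above can be sketched in Python. This is only an illustration under stated assumptions: nodes are numbered 0 to $2^n - 1$ (rather than 1 to N) so that the address is the binary form of the node number, and the function names are hypothetical.

```python
def hamming_distance(a, b):
    """Number of bit positions in which the addresses a and b differ."""
    return bin(a ^ b).count("1")


def neighbors(node, n):
    """Neighbours of `node` in an n-dimensional hypercube.

    Assumption: nodes are numbered 0 .. 2**n - 1, so flipping bit k of the
    node number gives the neighbour across dimension k.
    """
    return [node ^ (1 << k) for k in range(n)]


# Every node has exactly n neighbours, each at Hamming distance 1.
print(neighbors(0b000, 3))
```

Flipping a single bit with XOR is what makes the degree equal to n: there is exactly one neighbour per dimension.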

Bit-Fixing Routing Algorithm
Algorithm:
– Given a destination address d(i) and the current (intermediate) node $\sigma(i)$ of the packet
– Compare the bits of d(i) with those of $\sigma(i)$ from left to right
– Identify the first bit position at which these two addresses differ
– Route the packet to the neighbor n(i) such that $\sigma(i)$ and n(i) differ only in this bit position

Bit-Fixing Routing Algorithm
Example:
– Source: (0, 0, 0, 0, 0, 0)
– Destination: (1, 0, 1, 0, 1, 1)
– (0, 0, 0, 0, 0, 0) → (1, 0, 0, 0, 0, 0) → (1, 0, 1, 0, 0, 0) → (1, 0, 1, 0, 1, 0) → (1, 0, 1, 0, 1, 1)
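The bit-fixing steps can be sketched in a few lines of Python. This is a minimal illustration, assuming addresses are given as tuples of bits read from left to right; the function name is hypothetical.

```python
def bit_fixing_route(source, dest):
    """Return the list of nodes visited when routing from source to dest.

    Addresses are tuples of 0/1 bits. Bits are compared from left to right
    and each differing bit is corrected in turn, so every hop moves to a
    neighbour that differs in exactly one position.
    """
    if len(source) != len(dest):
        raise ValueError("addresses must have the same length")
    current = list(source)
    route = [tuple(current)]
    for position, wanted in enumerate(dest):
        if current[position] != wanted:
            current[position] = wanted      # cross the edge in this dimension
            route.append(tuple(current))
    return route


for node in bit_fixing_route((0, 0, 0, 0, 0, 0), (1, 0, 1, 0, 1, 1)):
    print(node)
```

The number of hops equals the Hamming distance between source and destination, so no route is longer than n edges.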

Bit-Fixing Routing Algorithm
Corollary 1:
– On an n-dimensional hypercube, there is an instance (e.g. the transpose permutation) of permutation routing requiring $\Omega(\sqrt{2^n/n})$ steps for the bit-fixing routing algorithm
– This is Theorem 1 with $N = 2^n$ and $d = n$

Bit-Fixing Routing Algorithm
Proof:
– Let (i.j) be the address of a node, where i and j are binary strings of length n/2 each and "." denotes string concatenation
– The packet stored at node (i.j) is routed to the destination node (j.i) (the transpose permutation); consider only the sources with j = 0

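Bit-Fixing Routing Algorithm
Proof:
– Each such packet is routed (i.0) → (0.i)
– If i is odd, the packet must pass through node (1.0), where 1 stands for the string 0…01, and leave it on the edge to (0.0)
– Number of such source nodes = $2^{n/2}/2$
– Only one packet can be routed on the same edge at a time
– Lower bound = $2^{n/2}/2$ steps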
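The counting argument can be checked by brute force for small n. The sketch below assumes nodes are numbered 0 to $2^n - 1$ with the leftmost address bit as the most significant bit, so that node (i.j) is the integer `(i << n/2) | j`; the function names are hypothetical.

```python
from collections import Counter


def bit_fixing_edges(src, dst, n):
    """Directed edges used by bit-fixing from src to dst (leftmost bit first)."""
    edges, cur = [], src
    for k in reversed(range(n)):            # bit n-1 is the leftmost bit
        mask = 1 << k
        if (cur ^ dst) & mask:
            edges.append((cur, cur ^ mask))
            cur ^= mask
    return edges


def transpose_edge_load(n):
    """Number of transpose-permutation routes crossing each directed edge."""
    half = n // 2
    low = (1 << half) - 1
    load = Counter()
    for node in range(1 << n):
        i, j = node >> half, node & low     # node = (i.j)
        load.update(bit_fixing_edges(node, (j << half) | i, n))
    return load


n = 6
load = transpose_edge_load(n)
# Edge from node (1.0) to node (0.0): used by every source (i.0) with i odd.
print(load[(1 << (n // 2), 0)])
```

Because one edge carries only one packet per step, the load on this single edge is already a lower bound on the number of steps.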

Randomized Routing Algorithm
For i = 1 to N
– Route packet $v_i$ by executing the following two phases independently of all the other packets
Phase 1: choose a random intermediate destination $t_i$ from {1, …, N}, and route $v_i$ from i to $t_i$ using the bit-fixing algorithm
Phase 2: route $v_i$ from $t_i$ to its final destination d(i) using the bit-fixing algorithm
Queuing: FIFO (packets waiting for the same edge are delayed)
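The two-phase scheme can be simulated with one FIFO queue per directed edge. The sketch below is an illustration under several assumptions that are not in the slides: nodes are numbered 0 to $2^n - 1$, Phase 2 starts only after every packet has finished Phase 1, and ties between packets arriving at a queue in the same step are broken arbitrarily. All function names are hypothetical.

```python
import random
from collections import deque


def bit_fixing_path(src, dst, n):
    """Directed edges used by bit-fixing from src to dst (leftmost bit first)."""
    path, cur = [], src
    for k in reversed(range(n)):
        mask = 1 << k
        if (cur ^ dst) & mask:
            path.append((cur, cur ^ mask))
            cur ^= mask
    return path


def route_phase(starts, targets, n):
    """Simulate one phase in synchronous steps; return the steps needed."""
    paths = [bit_fixing_path(s, t, n) for s, t in zip(starts, targets)]
    done = [0] * len(paths)                 # edges already traversed per packet
    queues = {}                             # directed edge -> FIFO of packet ids
    for p, path in enumerate(paths):
        if path:
            queues.setdefault(path[0], deque()).append(p)
    steps = 0
    while queues:
        steps += 1
        # Each edge forwards at most one packet per step, in FIFO order.
        moved = [q.popleft() for q in queues.values()]
        queues = {e: q for e, q in queues.items() if q}
        for p in moved:
            done[p] += 1
            if done[p] < len(paths[p]):
                queues.setdefault(paths[p][done[p]], deque()).append(p)
    return steps


def two_phase_route(dest, n, rng):
    """Route via random intermediate nodes; return (phase 1, phase 2) steps."""
    size = 1 << n
    middle = [rng.randrange(size) for _ in range(size)]
    return route_phase(range(size), middle, n), route_phase(middle, dest, n)


n, half = 6, 3
transpose = [((v & 0b111) << half) | (v >> half) for v in range(1 << n)]
print(route_phase(range(1 << n), transpose, n))       # direct bit-fixing
print(two_phase_route(transpose, n, random.Random(0)))
```

The random intermediate destinations are what make the scheme independent of the input permutation: each phase routes to or from uniformly random nodes, whatever d(i) is.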

Randomized Routing Algorithm
Lemma 1:
– If the bit-fixing algorithm is used to route a packet $v_i$ from i to $t_i$ and a packet $v_j$ from j to $t_j$, then their routes do not rejoin after they separate

Randomized Routing Algorithm
Proof (lemma 1):
– Assume k is the node at which the two paths separate and l is the node at which they rejoin
– Under the bit-fixing scheme, the route taken by $v_i$ and by $v_j$ from k to l depends only on the bit representations of k and l
– So $v_i$ and $v_j$ must follow the same route from k to l
– This contradicts the assumption that the paths separate at k
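Lemma 1 can be checked exhaustively on a small hypercube. The sketch below assumes nodes numbered 0 to $2^n - 1$ and restates the lemma in terms of edges: along either route, the edges shared with the other route must form one contiguous run. The function names are hypothetical.

```python
def bit_fixing_path(src, dst, n):
    """Directed edges used by bit-fixing from src to dst (leftmost bit first)."""
    path, cur = [], src
    for k in reversed(range(n)):
        mask = 1 << k
        if (cur ^ dst) & mask:
            path.append((cur, cur ^ mask))
            cur ^= mask
    return path


def shares_contiguously(path_a, path_b):
    """True if the edges of path_a that also lie on path_b form a single run."""
    common = set(path_b)
    flags = [edge in common for edge in path_a]
    if True not in flags:
        return True                         # the routes never meet on an edge
    first = flags.index(True)
    last = len(flags) - 1 - flags[::-1].index(True)
    return all(flags[first:last + 1])       # no gap between first and last


n = 4
paths = [bit_fixing_path(s, t, n) for s in range(1 << n) for t in range(1 << n)]
print(all(shares_contiguously(a, b) for a in paths for b in paths))
```

A gap in the shared run would mean the two routes separated and later rejoined, which is exactly what the lemma rules out.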

Randomized Routing Algorithm
Let the route of packet $v_i$ follow the sequence of edges $p_i = (e_1, e_2, \dots, e_k)$
Let S be the set of packets (other than $v_i$) whose routes pass through at least one of $\{e_1, e_2, \dots, e_k\}$
Lemma 2:
– The delay incurred by $v_i$ is at most |S|

Randomized Routing Algorithm
Proof (lemma 2):
– Define the lag of a packet w: if w is ready to follow edge $e_j$ at time t, its lag is $l = t - j$
– If the lag of $v_i$ increases from l to l + 1, some packet in S with lag l must be in front of $v_i$, using the edge that $v_i$ is waiting for

Randomized Routing Algorithm
Proof (lemma 2):
– Let $t_j$ be the last time step at which any packet in S has lag l
– Some packet w is then ready to follow the edge $e_j$, where $l = t_j - j$, and it must leave $p_i$ at $t_j + 1$

Randomized Routing Algorithm
Proof (lemma 2):
– If the lag of $v_i$ reaches l + 1, some packet in S leaves $p_i$ with lag l
– By lemma 1, the routes of different packets do not rejoin after they separate, so a packet that has left $p_i$ never returns to it
– Each member of S whose route intersects $p_i$ is therefore charged for at most one unit of delay of $v_i$

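Randomized Routing Algorithm
Define a random variable $H_{ij}$ as:
$$H_{ij} = \begin{cases} 1 & \text{if } p_i \text{ and } p_j \text{ share at least one edge} \\ 0 & \text{otherwise} \end{cases}$$
Let $\mathrm{delay}_i$ be the total delay incurred by $v_i$; then, by Lemma 2:
$$\mathrm{delay}_i \le \sum_{j \ne i} H_{ij}$$
From linearity of expectation:
$$E[\mathrm{delay}_i] \le \sum_{j \ne i} E[H_{ij}]$$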

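Randomized Routing Algorithm
For an edge e of the hypercube, let the random variable T(e) be the number of routes that pass through e.
If $p_i = (e_1, \dots, e_k)$, then:
$$\sum_{j \ne i} H_{ij} \le \sum_{l=1}^{k} T(e_l)$$
We have:
$$E\Big[\sum_{j \ne i} H_{ij}\Big] \le \sum_{l=1}^{k} E[T(e_l)]$$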

Randomized Routing Algorithm
All edges in the hypercube are symmetric
– $E[T(e_l)] = E[T(e_m)]$ for any two edges $e_l$ and $e_m$
– Total number of (directed) edges: $Nn$
– The expected length of each route is $n/2$
– The expected total length of all routes is $Nn/2$
– $E[T(e)] = 1/2$ for all edges
We have:
$$E\Big[\sum_{j \ne i} H_{ij}\Big] \le \frac{k}{2} \le \frac{n}{2}$$
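The value $E[T(e)] = 1/2$ can be confirmed exactly for a small hypercube by enumerating every (source, intermediate destination) pair. The sketch assumes nodes numbered 0 to $2^n - 1$ and one packet per source with a uniformly random intermediate destination; the function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction


def bit_fixing_path(src, dst, n):
    """Directed edges used by bit-fixing from src to dst (leftmost bit first)."""
    path, cur = [], src
    for k in reversed(range(n)):
        mask = 1 << k
        if (cur ^ dst) & mask:
            path.append((cur, cur ^ mask))
            cur ^= mask
    return path


def expected_edge_load(n):
    """Exact E[T(e)] for every directed edge used in Phase 1."""
    size = 1 << n
    count = Counter()
    for src in range(size):
        for mid in range(size):
            count.update(bit_fixing_path(src, mid, n))
    # Each source picks `mid` uniformly, so each (src, mid) pair has
    # probability 1/size; summing over sources gives the expectation.
    return {edge: Fraction(c, size) for edge, c in count.items()}


loads = expected_edge_load(4)
print(len(loads), set(loads.values()))
```

Using `Fraction` keeps the result exact, so the check does not depend on floating-point rounding.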

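Randomized Routing Algorithm
Theorem 2 (Chernoff bound):
– Let $X_1, X_2, \dots, X_n$ be independent Poisson trials such that, for $1 \le i \le n$, $\Pr[X_i = 1] = p_i$, where $0 < p_i < 1$
– $X = \sum_{i=1}^{n} X_i$
– $\mu = E[X] = \sum_{i=1}^{n} p_i$
– Then, for any $\delta > 0$:
$$\Pr[X > (1+\delta)\mu] < \left[\frac{e^{\delta}}{(1+\delta)^{1+\delta}}\right]^{\mu}$$

Randomized Routing Algorithm
The $H_{ij}$ are independent Poisson trials, because the routes are chosen independently
We have:
$$E\Big[\sum_{j \ne i} H_{ij}\Big] \le \frac{n}{2}$$
By using the simplified form of the bound, valid for $\delta > 2e - 1$:
$$\Pr[X > (1+\delta)\mu] < 2^{-(1+\delta)\mu}$$
Put $\delta = 11$ and take $n/2$ as an upper bound on $\mu$, so that $(1+\delta)\,n/2 = 6n$:
$$\Pr[\mathrm{delay}_i > 6n] < 2^{-6n}$$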

Randomized Routing Algorithm
Theorem 3:
– With probability at least $1 - 2^{-5n}$, every packet $v_i$ reaches $t_i$ in $7n$ or fewer steps
Proof:
– Since the total number of packets is $2^n$, the probability that any of them has a delay exceeding $6n$ is less than $2^n \cdot 2^{-6n} = 2^{-5n}$
– Each packet requires at most $n$ additional steps to traverse its route from the source to its intermediate destination, giving at most $6n + n = 7n$ steps

Randomized Routing Algorithm
Theorem 4:
– Every packet reaches its destination in $14n$ or fewer steps with probability larger than $1 - 1/N$
Proof:
– Phase 2 of Valiant's scheme is analysed in the same way as Phase 1
– Failure probability $\le 2 \cdot 2^{-5n} < 2^{-n} = 1/N$

Conclusion
A deterministic oblivious routing algorithm may perform very poorly on some specific instances
A randomized routing algorithm gives satisfactory results on all instances with high probability