Copyright 2004 Koren & Krishna ECE655/Koren Part.8.1 UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering Fault Tolerant Computing ECE.


Copyright 2004 Koren & Krishna ECE655/Koren Part.8.1 UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering Fault Tolerant Computing ECE 655 Part 8 Networks - 3

Part.8.2 Hypercube Networks
* $H_n$ - an n-dimensional hypercube network - $2^n$ nodes
* A 0-dimensional hypercube $H_0$ - a single node
* $H_n$ is constructed by connecting the corresponding nodes of two $H_{n-1}$ networks
* The edges added to connect corresponding nodes are called dimension-(n-1) edges
[Figure: $H_1$, two nodes joined by a dimension-0 edge]
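The recursive construction above can be sketched in a few lines of Python. This is only an illustration: the function name `build_hypercube` and the `(u, v, dimension)` edge tuples are assumptions made for the example, and nodes are labelled with integers whose binary form is the node name.

```python
def build_hypercube(n):
    """Return (nodes, edges) of H_n; each edge is a tuple (u, v, dimension)."""
    if n == 0:
        return [0], []                      # H_0 is a single node with no edges
    nodes, edges = build_hypercube(n - 1)   # first copy of H_{n-1}
    offset = 1 << (n - 1)                   # second copy has bit n-1 set
    copy_nodes = [u + offset for u in nodes]
    copy_edges = [(u + offset, v + offset, k) for u, v, k in edges]
    # Connect corresponding nodes of the two copies with dimension-(n-1) edges.
    link_edges = [(u, u + offset, n - 1) for u in nodes]
    return nodes + copy_nodes, edges + copy_edges + link_edges


nodes, edges = build_hypercube(4)
print(len(nodes), len(edges))  # 16 nodes and 32 edges for H_4
```

Every edge produced this way joins two labels that differ in exactly one bit, which is the property the routing slides rely on.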

Part.8.3 Hypercube - Examples
[Figure: hypercubes $H_1$, $H_2$, $H_3$ and $H_4$, with dimension-0, dimension-1, dimension-2 and dimension-3 edges marked]

Part.8.4 Routing in Hypercubes
* Specific numbering to simplify routing
* Numbers are expressed in binary - if nodes i and j are connected by a dimension-k edge, the names of i and j differ in only the k-th bit position
* Example - nodes 0000 and 0010 differ in only the $2^1$ bit position - connected by a dimension-1 edge
* Example - a packet needs to travel from node 14=1110 to node 2=0010 in an $H_4$ network
* Possible routings -
  * 1110 -> 0110 (dimension 3) -> 0010 (dimension 2)
  * 1110 -> 1010 (dimension 2) -> 0010 (dimension 3)
[Figure: $H_4$ with dimension-0 through dimension-3 edges marked]

Part.8.5 Routing - General
* In general - the distance between source and destination is the number of bits in which their addresses differ
* Going from X to Y can be accomplished by traveling once along each dimension in which they differ
* $X = x_{n-1} \dots x_0$ ; $Y = y_{n-1} \dots y_0$
* Define $z_i = x_i \oplus y_i$ - $\oplus$ is the exclusive-or operator
* Packet must traverse an edge in every dimension i for which $z_i = 1$
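As a minimal sketch of the rule above, the following Python computes z = X xor Y, lists the dimensions with a 1 bit, and flips them one at a time. The function names and the `descending` flag are assumptions made for the example; any order of the listed dimensions gives a shortest route in a fault-free hypercube.

```python
def dimensions_to_traverse(x, y, n):
    """Dimensions i (0 .. n-1) in which the addresses x and y differ."""
    z = x ^ y                                   # bitwise exclusive-or of the addresses
    return [i for i in range(n) if (z >> i) & 1]


def route(x, y, n, descending=True):
    """Nodes visited when the differing dimensions are crossed one by one."""
    dims = sorted(dimensions_to_traverse(x, y, n), reverse=descending)
    path = [x]
    for i in dims:
        path.append(path[-1] ^ (1 << i))        # cross the dimension-i edge
    return path


# The H_4 example: node 14 = 1110 to node 2 = 0010.
print(dimensions_to_traverse(0b1110, 0b0010, 4))
print([format(v, "04b") for v in route(0b1110, 0b0010, 4)])
print([format(v, "04b") for v in route(0b1110, 0b0010, 4, descending=False)])
```

The two calls to `route` reproduce the two possible routings of the 1110 to 0010 example, one crossing dimension 3 first and the other crossing dimension 2 first.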

Part.8.6 Fault Tolerance in Hypercubes
* $H_n$ (for $n \ge 2$) can tolerate link failures - multiple paths from any source to any destination
* Node failures can disrupt the operation
* One way to tolerate them is to increase the number of communication ports of each node from n to n+1 and connect these extra ports through additional links to one or more spare nodes
* Example - two spare nodes - each a spare for the $2^{n-1}$ nodes of an $H_{n-1}$ sub-cube
* Spare nodes may require $2^{n-1}$ ports - this can be reduced by using several crossbar switches whose outputs are connected to the corresponding spare node
* Number of ports of the spare node is reduced to n+1 - same as for all other nodes

Part.8.7 An $H_4$ Hypercube with Two Spare Nodes
[Figure: $H_4$ hypercube with two spare nodes, each marked S]

Part.8.8 Different Method of Fault-Tolerance
* Duplicating the processor in a few selected nodes
* Each additional processor - spare also for any of the processors in the neighboring nodes
* Example - nodes 0, 7, 8, 15 in $H_4$ - modified to duplex nodes
* Every node now has a spare at a distance no larger than 1
* Replacing a faulty processor by a spare results in an additional communication delay
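The claim that duplexing nodes 0, 7, 8 and 15 puts a spare within distance 1 of every node of $H_4$ can be checked directly. The helper name `every_node_has_nearby_spare` is an assumption made for this sketch; distance is taken as the number of differing address bits.

```python
def every_node_has_nearby_spare(n, duplex_nodes, max_distance=1):
    """True if each node of H_n is within max_distance of some duplex node."""
    def distance(a, b):
        return bin(a ^ b).count("1")    # number of differing address bits

    return all(
        any(distance(node, s) <= max_distance for s in duplex_nodes)
        for node in range(1 << n)
    )


print(every_node_has_nearby_spare(4, [0, 7, 8, 15]))  # True
print(every_node_has_nearby_spare(4, [0, 15]))        # False: 0011 is 2 hops from both
```

Each duplex node covers itself and its four neighbors, so four of them are the minimum that could cover all 16 nodes of $H_4$ (four covered only 5 x 3 = 15 at most with three).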

Part.8.9 Routing in Injured Hypercubes
* Routing algorithm must be modified to route around the faulty nodes or links
* Basic idea - list the dimensions along which the packet must travel, and traverse them one by one
* As edges are traversed, they are crossed off the list
* If, due to a link or a node failure, the desired link is not available - another edge in the list, if any, is chosen for traversal
* If the packet arrives at some node to find all dimensions on its list down - it backtracks to the previous node and tries again

Part.8.10 Formal Routing Algorithm - Notations
* TD - list of dimensions that the message has traveled on, in order of traversal; $TD^R$ - the same list in reversed order
* $\bigoplus_{i=1}^{k}$ - exclusive-or operation carried out k times, sequentially
* Example - $\bigoplus_{i=1}^{3} a_i$ means $(a_1 \oplus a_2) \oplus a_3$
* D - destination, S - source, $d = D \oplus S$ ($\oplus$ - bitwise exclusive-or operation on corresponding bits of D and S)
* SC(A) - set of nodes visited if we travel on each of the dimensions listed in set A
* Example - at node 0010, SC(1,3) = {0000, 1000}

Part.8.11 Notations - Cont.
* $e_i^n$ - n-bit vector consisting of a 1 in the i-th bit position and 0 everywhere else
* Example - $e_2^3 = 100$
* Packets are assumed to consist of
  * (I) d; $d = D \oplus S$
  * (II) Message being transmitted (the "payload")
  * (III) List of dimensions taken so far - TD
* $\odot$ - append operation
* $TD \odot x$ - append x to the list TD
* transmit(j) - send packet $(d \oplus e_j^n, \text{message}, TD \odot j)$ along the j-th-dimensional link from the present node
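The notation can be mirrored in a short Python sketch. All names here (`unit_vector`, `sc`, `transmit`) and the packet layout as a tuple `(d, message, td)` are assumptions made for illustration. For the SC example, the starting node is taken to be 0010, which is the only start from which dimensions 1 and then 3 visit 0000 and 1000.

```python
def unit_vector(i, n):
    """n-bit vector with a 1 in bit position i and 0 everywhere else."""
    if not 0 <= i < n:
        raise ValueError("bit position out of range")
    return 1 << i


def sc(start, dims):
    """Nodes visited when travelling from start along each dimension in dims, in order."""
    visited, node = [], start
    for j in dims:
        node ^= 1 << j
        visited.append(node)
    return visited


def transmit(packet, j, n):
    """Packet as it leaves on the dimension-j link: bit j of d cleared or set, j appended to TD."""
    d, message, td = packet
    return (d ^ unit_vector(j, n), message, td + (j,))


print(format(unit_vector(2, 3), "03b"))                 # 100
print([format(v, "04b") for v in sc(0b0010, (1, 3))])   # ['0000', '1000']

packet = (0b111, "payload", ())      # d = D xor S for S = 000, D = 111
d, _, td = transmit(packet, 0, 3)
print(format(d, "03b"), td)          # 110 (0,)
```

Because d is updated with an exclusive-or on every hop, the packet has reached its destination exactly when d becomes all zeros.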

Part.8.12 Routing Algorithm for Injured Hypercubes

Part.8.13 Example - $H_3$
* $H_3$ with faulty node 011
* Node 000 wants to send a packet to 111
* At 000, d=111 - sends the message out on dimension-0, to node 001
* At 001, d=110 and TD=(0) - attempts dimension-1 edge - impossible
* Bit 2 of d is also 1 - checks and finds that the dimension-2 edge to 101 is available - message is sent to 101 and then to 111
* Exercise - What if both 011 and 101 are down?
[Figure: $H_3$ with dimension-0, dimension-1 and dimension-2 edges marked]
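The behavior in this example can be imitated with a simplified depth-first search. This is a sketch under stated assumptions, not the formal algorithm of the course: it tries the dimensions where d has a 1 first (lowest dimension first), then the remaining dimensions as detours, never re-enters a node it has already visited, and records a backtrack hop whenever it retreats. The function name and the `faulty_nodes` / `faulty_links` parameters are hypothetical.

```python
def route_with_backtracking(src, dst, n, faulty_nodes=frozenset(), faulty_links=frozenset()):
    """Return the sequence of nodes the packet occupies (backtracks included), or None."""
    visited = {src}
    trace = [src]

    def usable(a, b):
        # A hop is usable if the neighbor is healthy and the link itself is not faulty.
        return b not in faulty_nodes and frozenset((a, b)) not in faulty_links

    def dfs(node):
        if node == dst:
            return True
        d = node ^ dst
        preferred = [j for j in range(n) if (d >> j) & 1]       # dimensions still to cross
        detours = [j for j in range(n) if not (d >> j) & 1]     # used only if preferred fail
        for j in preferred + detours:
            nxt = node ^ (1 << j)
            if nxt in visited or not usable(node, nxt):
                continue
            visited.add(nxt)
            trace.append(nxt)
            if dfs(nxt):
                return True
            trace.append(node)      # dead end: packet goes back to this node
        return False

    return trace if dfs(src) else None


def show(path):
    return None if path is None else [format(v, "03b") for v in path]


print(show(route_with_backtracking(0b000, 0b111, 3, faulty_nodes={0b011})))
print(show(route_with_backtracking(0b000, 0b111, 3, faulty_nodes={0b011, 0b101})))
```

With only 011 faulty the sketch follows 000, 001, 101, 111, as in the example. With both 011 and 101 faulty it reaches 001, finds both remaining dimensions blocked, returns to 000, and then gets through via 010 and 110.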

Part.8.14 Reliability of Point-to-Point Networks
* Not necessarily a regular structure - often more than one path between any two nodes
* Terminal Reliability - the probability that there exists an operational path between two specific nodes, given the probabilities of link failures
* Example - calculating the terminal reliability for the source-sink pair $N_1$ - $N_4$

Part.8.15 Terminal Reliability - Example I
* Three paths from $N_1$ to $N_4$
  * $P_1 = \{x_{1,2}, x_{2,4}\}$
  * $P_2 = \{x_{1,3}, x_{3,4}\}$
  * $P_3 = \{x_{1,2}, x_{2,3}, x_{3,4}\}$
* $p_{i,j}$ ($q_{i,j}$) - probability that link $x_{i,j}$ is good (faulty)
* Nodes are assumed fault-free - if not, their failure probability is incorporated into outgoing links
* Set of paths must be modified to an equivalent set of mutually exclusive events - otherwise some events will be counted more than once
* Mutually exclusive events - (I) $P_1$ up ; (II) $P_2$ up and $P_1$ down ; (III) $P_3$ up and both $P_1$ and $P_2$ down
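The decomposition into mutually exclusive events can be checked numerically. The sketch below assumes the example network has the five links $x_{1,2}$, $x_{1,3}$, $x_{2,3}$, $x_{2,4}$, $x_{3,4}$ and the three paths $P_1 = \{x_{1,2}, x_{2,4}\}$, $P_2 = \{x_{1,3}, x_{3,4}\}$, $P_3 = \{x_{1,2}, x_{2,3}, x_{3,4}\}$, and that links fail independently. The link probabilities are made-up values, and link $x_{i,j}$ is written as the string "ij".

```python
from itertools import product

LINKS = ["12", "13", "23", "24", "34"]
PATHS = [{"12", "24"}, {"13", "34"}, {"12", "23", "34"}]   # assumed P_1, P_2, P_3


def reliability_by_enumeration(p):
    """Sum the probability of every link state in which at least one path is up."""
    total = 0.0
    for state in product([True, False], repeat=len(LINKS)):
        up = {link for link, ok in zip(LINKS, state) if ok}
        if any(path <= up for path in PATHS):
            prob = 1.0
            for link, ok in zip(LINKS, state):
                prob *= p[link] if ok else 1.0 - p[link]
            total += prob
    return total


def reliability_by_disjoint_events(p):
    """(I) P1 up + (II) P2 up, P1 down + (III) P3 up, P1 and P2 down."""
    q = {link: 1.0 - value for link, value in p.items()}
    term1 = p["12"] * p["24"]
    term2 = p["13"] * p["34"] * (1.0 - p["12"] * p["24"])
    # With x12 and x34 up, P1 fails only if x24 fails and P2 only if x13 fails.
    term3 = p["12"] * p["23"] * p["34"] * q["24"] * q["13"]
    return term1 + term2 + term3


p = {"12": 0.9, "13": 0.8, "23": 0.95, "24": 0.85, "34": 0.7}   # illustrative values
print(reliability_by_enumeration(p))
print(reliability_by_disjoint_events(p))
```

The two printed values agree up to floating-point rounding, which is what mutual exclusivity of the three events guarantees.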

Part.8.16 Calculating Terminal Reliability
* Terminal reliability of a network with m paths $P_1, \dots, P_m$ from source to sink
* $E_i$ ($\bar{E}_i$) - event in which path $P_i$ is operational (faulty)
* $R = P(\text{Operational Path Exists}) = P\left(\bigcup_{i=1}^{m} E_i\right)$
* Set of events can be decomposed into mutually exclusive events -
  * O. P. Exists $= E_1 \cup (E_2 \cap \bar{E}_1) \cup (E_3 \cap \bar{E}_1 \cap \bar{E}_2) \cup \dots \cup (E_m \cap \bar{E}_1 \cap \dots \cap \bar{E}_{m-1})$
* $R = P(E_1) + P(E_2 \cap \bar{E}_1) + P(E_3 \cap \bar{E}_1 \cap \bar{E}_2) + \dots + P(E_m \cap \bar{E}_1 \cap \dots \cap \bar{E}_{m-1})$

Part.8.17 Terminal Reliability - Cont.
* The last expression can be rewritten using conditional probabilities
* $R = P(E_1) + P(E_2)\,P(\bar{E}_1 \mid E_2) + P(E_3)\,P(\bar{E}_1 \cap \bar{E}_2 \mid E_3) + \dots + P(E_m)\,P(\bar{E}_1 \cap \dots \cap \bar{E}_{m-1} \mid E_m)$
* The problem is calculating the probabilities $P(\bar{E}_1 \cap \dots \cap \bar{E}_{i-1} \mid E_i)$
* To identify the links which must fail so that $E_i$ occurs but not $E_1, \dots, E_{i-1}$, conditional sets are used
* $S_{j|i} = P_j - P_i = \{ x \mid x \in P_j \text{ and } x \notin P_i \}$
* Identifying disjoint events in the general case is not always straightforward

Part.8.18 Terminal Reliability - Example II
* This six-node network has 9 links - 6 uni-directional and 3 bi-directional
* Paths are ordered from shortest to longest
[Figure: the six-node network and the list of all paths from $N_1$ to $N_6$]

Part.8.19 Calculating Terminal Reliability for Example II
* The first term of the reliability equation is $P(E_1) = p_{1,3}\, p_{3,5}\, p_{5,6}$
* To calculate the second term of the reliability equation - the conditional set is used
* $S_{1|2} = P_1 - P_2 = \{x_{1,3}, x_{3,5}\}$
* At least one of the links in this set must fail so that $P_1$ is faulty (while $P_2$ is operational)
* The second term of the reliability equation - $p_{1,2}\, p_{2,5}\, p_{5,6}\, (1 - p_{1,3}\, p_{3,5})$

Part.8.20 Example II - Cont.
* For calculating other terms in the sum - intersection of several conditional sets must be considered
* Calculating the fourth term - expression for $P_4$ - the conditional sets are: $S_{1|4} = \{x_{5,6}\}$; $S_{2|4} = \{x_{1,2}, x_{2,5}, x_{5,6}\}$; $S_{3|4} = \{x_{1,2}, x_{2,4}\}$
* $S_{1|4}$ is included in $S_{2|4}$ - if $S_{1|4}$ is faulty, $S_{2|4}$ is faulty - $S_{2|4}$ can be ignored
* The fourth term in the reliability equation -
  * $p_{1,3}\, p_{3,5}\, p_{4,5}\, p_{4,6}\, (1 - p_{5,6})(1 - p_{1,2}\, p_{2,4})$
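Conditional sets are plain set differences, and the "can be ignored" rule drops any set that strictly contains another. The sketch below assumes the first four paths of the example are $P_1 = \{x_{1,3}, x_{3,5}, x_{5,6}\}$, $P_2 = \{x_{1,2}, x_{2,5}, x_{5,6}\}$, $P_3 = \{x_{1,2}, x_{2,4}, x_{4,6}\}$ and $P_4 = \{x_{1,3}, x_{3,5}, x_{4,5}, x_{4,6}\}$, as read from the terms of the reliability equation. The function names are hypothetical, and link $x_{i,j}$ is written as the string "ij".

```python
P1 = frozenset({"13", "35", "56"})
P2 = frozenset({"12", "25", "56"})
P3 = frozenset({"12", "24", "46"})
P4 = frozenset({"13", "35", "45", "46"})   # assumed path set, see the text above


def conditional_sets(earlier_paths, path):
    """Links of each earlier path P_j that are not in P_i (set difference P_j - P_i)."""
    return [pj - path for pj in earlier_paths]


def drop_absorbed(sets):
    """Remove every set that strictly contains another set in the list.

    If some link of the smaller set has failed, the larger set has a failed
    link too, so the larger set adds no extra constraint.
    """
    return [s for s in sets if not any(t < s for t in sets)]


sets_for_p4 = conditional_sets([P1, P2, P3], P4)
for j, s in enumerate(sets_for_p4, start=1):
    print(f"S_{j}|4 =", sorted(s))
print("kept:", [sorted(s) for s in drop_absorbed(sets_for_p4)])
```

The two sets that remain, $\{x_{5,6}\}$ and $\{x_{1,2}, x_{2,4}\}$, share no link, so their "at least one link failed" probabilities simply multiply, giving the two bracketed factors of the fourth term.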

Part.8.21 Example II - Cont.
* Calculating the third term - $S_{1|3} = \{x_{1,3}, x_{3,5}, x_{5,6}\}$ ; $S_{2|3} = \{x_{2,5}, x_{5,6}\}$
* The two conditional sets are not disjoint
* The event in which both $S_{1|3}$ and $S_{2|3}$ are faulty needs to be divided into disjoint events:
  * (I) $x_{5,6}$ is faulty
  * (II) $x_{5,6}$ is operational and both $x_{1,3}$ and $x_{2,5}$ are faulty
  * (III) Both $x_{5,6}$ and $x_{1,3}$ are up, and both $x_{3,5}$ and $x_{2,5}$ are faulty
* Resulting expression for the third term
  * $p_{1,2}\, p_{2,4}\, p_{4,6}\, (q_{5,6} + p_{5,6}\, q_{1,3}\, q_{2,5} + p_{5,6}\, p_{1,3}\, q_{3,5}\, q_{2,5})$
* Remaining terms - calculated similarly
* Terminal reliability is the sum of all thirteen terms
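The closed forms for the third and fourth terms can be cross-checked by brute force. The sketch assumes independent link failures and the path sets $P_1 = \{x_{1,3}, x_{3,5}, x_{5,6}\}$, $P_2 = \{x_{1,2}, x_{2,5}, x_{5,6}\}$, $P_3 = \{x_{1,2}, x_{2,4}, x_{4,6}\}$, $P_4 = \{x_{1,3}, x_{3,5}, x_{4,5}, x_{4,6}\}$. The link probabilities are arbitrary illustrative values, and `term_by_enumeration` is a hypothetical helper.

```python
from itertools import product

P1 = {"13", "35", "56"}
P2 = {"12", "25", "56"}
P3 = {"12", "24", "46"}
P4 = {"13", "35", "45", "46"}


def term_by_enumeration(paths, p):
    """P(last path is up and every earlier path is down), by listing all link states."""
    *earlier, last = paths
    links = sorted(set().union(*paths))
    total = 0.0
    for state in product([True, False], repeat=len(links)):
        up = {link for link, ok in zip(links, state) if ok}
        if last <= up and not any(e <= up for e in earlier):
            prob = 1.0
            for link, ok in zip(links, state):
                prob *= p[link] if ok else 1.0 - p[link]
            total += prob
    return total


p = {"12": 0.9, "13": 0.8, "24": 0.7, "25": 0.95,
     "35": 0.85, "45": 0.6, "46": 0.75, "56": 0.65}     # made-up values
q = {link: 1.0 - value for link, value in p.items()}

third = p["12"] * p["24"] * p["46"] * (
    q["56"] + p["56"] * q["13"] * q["25"] + p["56"] * p["13"] * q["35"] * q["25"]
)
fourth = p["13"] * p["35"] * p["45"] * p["46"] * (1 - p["56"]) * (1 - p["12"] * p["24"])

print(third, term_by_enumeration([P1, P2, P3], p))
print(fourth, term_by_enumeration([P1, P2, P3, P4], p))
```

Each printed pair agrees up to rounding. Enumeration grows as 2 to the number of links, so it is useful only as a check on small examples; the conditional-set method is what keeps the calculation tractable by hand.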