A General Method For Maximizing the Error-Detecting Ability of Distributed Algorithms Martina Schollmeyer and Bruce McMillin IEEE Transactions on parallel.

A General Method For Maximizing the Error-Detecting Ability of Distributed Algorithms Martina Schollmeyer and Bruce McMillin IEEE Transactions on parallel and distributed System, VOL. 8, NO. 2, February 1997 元智大學資訊工程所陳桂慧 1999.04.21

Outline Terminology for MPS Topology Coloring algorithm Maximal fault index (MFI) for common topologies Conclusion

Terminology for MPS Topology The communication environment (CE) of a processor Pi is the set of processors from which Pi will receive information during the execution a program. CE(Pi) is a subset of {P1,P2,…Pn} A fault group of a processor, FG(Pi), of fault tolerance, is the collection of faulty processors in CE(Pi), to guarantee error detection for all errors caused by these faults, required that A collection of processors that must be nonfaulty to guarantee detection of all errors induced by a set of faulty processor P is called the nonfault group of P, NFG(P)

Coloring Algorithm Coloring algorithm - to color the graph, indicating faultiness or nonfaultiness of components, when determine the NFG of an individually faulty processor. –describe how the coloring is done for one fault in each CE( =1) –extend the algorithm for >1 to multicoloring, where each vertex has a chromaticity of, to obtain the NFGs. – this algorithm is used to obtain a possible distribution of component failures for the whole MPS.

for i:=1 to n// n is the total # of processors color Pi faulty; color non-faulty; //NFG({pi}) save NFG({Pi}); reset colors; end for

An algorithm to determine the NFG of a set P of faulty processor To determine a permissible fault distribution for the entire network.

1. Select 1 to be faulty => 2,3 nonfaulty 2a. Arbitrarily node 5 is chosen to be faulty =>4,6,7 nonfaulty 2b.Arbitrarily node 3 is chosen to be faulty =>all nonfaulty This provides a total of three faulty processors, 1, 3, and 5 with at most two faulty component in each CE.

A fault matrix (FM) of an MPS gives, for all sets of faulty processor P, all processors that must be nonfaulty if the elements in P are faulty. A FM corresponds to a collection of NFGs for a specific.

MFI for Meshes - Square Pattern For an arbitrary n*m torus-connected mesh, with =1 MFI = div(m,2) * div(m,2)

The number of resource nodes in k- ary n-cube for perfect 1-adjacency as X=k^n/(2n+1), k>2 A torus-connected mesh is a k-ary 2- cube, if we can guarantee that we have only k*k meshes X = k^2/5 MFI for Meshes - Star Pattern

MFI for Binary Hypercubes

Conclusion The Maximal fault index was introduced to demonstrate how a maximal number of simultaneous component failures can be tolerated by an error detecting algorithm, based on specific distributions of the faults within the interconnection network. The assessment of an error-detecting algorithm based on the concept of its minimal and maximal fault index can be used for safety critical system, especially with respect to the process-to-processor mapping that can be obtained from it.

A General Method For Maximizing the Error-Detecting Ability of Distributed Algorithms Martina Schollmeyer and Bruce McMillin IEEE Transactions on parallel.

Similar presentations

Presentation on theme: "A General Method For Maximizing the Error-Detecting Ability of Distributed Algorithms Martina Schollmeyer and Bruce McMillin IEEE Transactions on parallel."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

A General Method For Maximizing the Error-Detecting Ability of Distributed Algorithms Martina Schollmeyer and Bruce McMillin IEEE Transactions on parallel.

Similar presentations

Presentation on theme: "A General Method For Maximizing the Error-Detecting Ability of Distributed Algorithms Martina Schollmeyer and Bruce McMillin IEEE Transactions on parallel."— Presentation transcript:

Similar presentations

About project

Feedback