Fault-tolerant Adaptive Divisible Load Scheduling Xuan Lin, Sumanth J. V. Acknowledgment: a few slides on DLT are from Thomas Robertazzi's presentation.
What is a Divisible Load? A computational and networkable load that is arbitrarily partitionable (divisible) among processors and links. There are no precedence relations between subtasks. Communication cost between the head node and the processors must be considered.
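The classic DLT result for this model is that the optimal split makes all processors finish at the same instant. The sketch below computes those load fractions for a single-level star (head node plus workers) with sequential distribution, using Robertazzi's standard notation; the function name and example parameters are illustrative, not from the slides.

```python
# Classic DLT split for a star network with sequential load distribution.
#   w[i] - inverse compute speed of worker i
#   z[i] - inverse link speed to worker i
#   Tcp  - time to process one unit of load at unit speed
#   Tcm  - time to transmit one unit of load at unit speed
# Equal-finish-time principle gives the recursion
#   a[i] * w[i] * Tcp = a[i+1] * (z[i+1]*Tcm + w[i+1]*Tcp)

def optimal_fractions(w, z, Tcp=1.0, Tcm=1.0):
    n = len(w)
    ratios = [1.0]  # a[i] relative to a[0]
    for i in range(n - 1):
        k = (w[i] * Tcp) / (z[i + 1] * Tcm + w[i + 1] * Tcp)
        ratios.append(ratios[-1] * k)
    total = sum(ratios)
    return [r / total for r in ratios]  # fractions sum to 1

# Faster workers (smaller w) receive larger fractions:
fracs = optimal_fractions(w=[1.0, 2.0, 4.0], z=[0.1, 0.1, 0.1])
```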
Issues when applying the theory How do we determine the parameters at run time? The parameters may change during the computation. Solution: an adaptive strategy.
Condor Grid Environment Existing Condor lab pool at UNL. The processing capability of available nodes can vary significantly over time (e.g., anti-virus scans, OS updates); short-term variations can be ignored. Network dynamics can be quite significant. The number of processors is dynamic.
Adapting DLT to Condor DLT assumes that the execution time for a fixed data set is constant for a given processor. In practice, the predicted execution time can differ significantly from the actual execution time.
Adaptive Divisible Load Scheduling [D. Ghose et al. 2005] Two phases: a probing phase and an optimal load distribution phase. Probing and Delayed Distribution (PDD).
Probing and Delayed Distribution (PDD) The total workload is divided into p equal pieces. The first piece is used for probing: it is further divided into n equal pieces, and each processor is assigned one piece. The second phase does not start until feedback has been received from all processors. When the second phase starts, all system parameters are known, so DLT can guide the optimal distribution.
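The PDD flow described above can be sketched as follows. The helpers `send_probe`, `wait_for_all_feedback`, and `dlt_distribute` are hypothetical stand-ins for the real communication and DLT layers, not names from the paper.

```python
# Sketch of Probing and Delayed Distribution (PDD): split the load into
# p pieces, probe with the first one, wait for ALL feedback, then
# distribute the remainder optimally using the measured parameters.

def pdd_schedule(total_load, p, workers, send_probe,
                 wait_for_all_feedback, dlt_distribute):
    piece = total_load / p
    probe = piece / len(workers)             # first piece split equally
    for w in workers:
        send_probe(w, probe)                 # phase 1: probing
    params = wait_for_all_feedback(workers)  # blocks until every reply
    remaining = total_load - piece
    # phase 2: optimal DLT distribution guided by measured parameters
    dlt_distribute(remaining, workers, params)
```

The blocking `wait_for_all_feedback` call is exactly the weakness the next slide points out: one slow probe reply stalls the whole second phase.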
Limitations of PDD Most current work assumes a cluster computing environment: node failure is ignored, and dynamic change in the number of processors is ignored. Once parameter estimation is completed, a static environment is assumed, so it is not truly adaptive. If one or several processors return their feedback significantly more slowly than the others, the system suffers substantial idle time in the probing phase.
Our Algorithm The algorithm runs in two phases and maintains three groups of nodes. I1 - nodes that have sent back feedback and receive an optimal distribution. I2 - nodes that have sent back feedback but are not given an optimal distribution this round; they may receive one in a future round. I3 - nodes that have not sent back feedback yet.
Our Algorithm – Probing Phase Initially, I1 and I2 are empty and I3 contains all available processors. The total workload is divided into p equal pieces. Step 1: one piece is further divided into n equal pieces, one sent to each processor. Step 2: when distribution is completed, check whether any feedback has arrived. If not, go to Step 1.
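The probing loop above can be sketched as below; `send_piece` and `collect_feedback` are hypothetical stand-ins for the messaging layer, with `collect_feedback` assumed non-blocking.

```python
# Sketch of the probing phase: keep sending equal probe pieces to the
# nodes in I3 until at least one node reports back.

def probing_phase(total_load, p, I3, send_piece, collect_feedback):
    piece = total_load / p
    rounds = 0
    feedback = []
    while not feedback:                # Step 2: loop until any feedback
        for node in I3:                # Step 1: equal probe to each node
            send_piece(node, piece / len(I3))
        rounds += 1
        feedback = collect_feedback()  # non-blocking check
    return feedback, rounds
```

Unlike PDD, this loop exits as soon as the first feedback arrives, rather than waiting for all nodes.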
Our Algorithm – Optimal Distribution Phase Step 3: suppose we receive k new pieces of feedback. If this is the first feedback received, simply add these processors to I1; otherwise, go to Step 4. Step 4: from the feedback, calculate the speed of the processors and the network, and calculate the available time of these processors (they may not be available right now, since several probing pieces may have been sent to them during the probing phase).
Our Algorithm – Optimal Distribution Phase Step 5: if a processor's available time is smaller than the current maximum available time (defined below), add it to group I1; otherwise, add it to group I2. Step 6: let K be the current size of group I1. Update these processors' parameters (CPU speed and link speed), calculate their available times, record the maximum as the current maximum available time, and perform the optimal distribution to these K processors. Repeat this step.
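Steps 5 and 6 amount to a grouping rule on estimated available times, sketched below. The function and field names are illustrative assumptions, not from the paper.

```python
# Sketch of Steps 5-6: nodes whose estimated available time does not
# exceed the current maximum join I1 (and receive an optimal share);
# later-available nodes wait in I2 for a future round.

def regroup(feedback_nodes, available_time, max_available, I1, I2):
    """available_time maps node -> estimated time it becomes free."""
    for node in feedback_nodes:
        if available_time[node] <= max_available:
            I1.append(node)
        else:
            I2.append(node)
    if I1:  # Step 6: the maximum over I1 becomes the new threshold
        max_available = max(available_time[n] for n in I1)
    return max_available
```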
Our Algorithm Scheduling Points are defined as the times at which we finish the distribution of the current round. Accept new nodes: at each Scheduling Point, check whether new processors have become available; if so, send probing pieces to them and add them to I3. Fault tolerance: at each Scheduling Point, check whether any processors have timed out; if so, delete those nodes.
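The per-scheduling-point maintenance above can be sketched as follows. All helper names (`discover_new_nodes`, `send_probe`, the `last_heard` map, the timeout value) are illustrative assumptions about how the checks might be implemented.

```python
import time

# Sketch of the work done at each Scheduling Point: newly available
# nodes get a probe and join I3; nodes that have not responded within
# `timeout` seconds are dropped (fault tolerance).

def scheduling_point(I3, discover_new_nodes, send_probe,
                     last_heard, timeout, now=None):
    now = time.time() if now is None else now
    for node in discover_new_nodes():        # accept new nodes
        send_probe(node)
        I3.append(node)
        last_heard[node] = now
    # fault tolerance: delete nodes that have timed out
    dead = [n for n in I3 if now - last_heard[n] > timeout]
    for n in dead:
        I3.remove(n)
    return dead
```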
Simulation Initial configuration: total workload = 1000, p = 100, and 8 nodes initially.
Conclusion If some nodes are significantly slower than others, our algorithm performs better. If the probing information is inaccurate, our algorithm performs better. If, over the long term, the average network and processor speeds are stable, a single-round algorithm will beat multi-round algorithms. Our algorithm can adapt to newly available processors, and it is fault-tolerant.
Future work More accurate distribution in the second round. More evaluation to characterize the relation between performance and the parameters. A mechanism to decide whether to re-accept processors that were discarded earlier.