CmpE 516: Fault Tolerant Scheduling in Multiprocessor Systems
Betül Demiröz, 5 May 2005


1 CmpE 516: Fault Tolerant Scheduling in Multiprocessor Systems
Betül Demiröz, 5 May 2005

2 Outline
- General concepts about tasks and scheduling
- Real time systems
- Fault tolerant scheduling
- Basic approaches used in fault tolerant scheduling
- Algorithms and their execution details

3 Task
- Deadline: the time by which the task should be finished
- Preemptive: tasks can be stopped during execution and restarted
- Nonpreemptive: tasks cannot be interrupted during execution or restarted

4 Task Properties
- Periodic
- Aperiodic
  - activated only when certain events occur
  - arrival times are not known
  - scheduled dynamically
- Dependent
- Independent

5 Task Scheduling
Distribution of tasks to the processors according to a given policy.
Major goals of task scheduling:
- distribute the system load
- reduce total execution time

6 Static & Dynamic Scheduling
- Static Scheduling
  - compile time scheduling
  - an accurate weight estimation is needed
  - schedules of all tasks are predetermined
- Dynamic Scheduling
  - scheduling at run time
  - uses actual values of execution times of processes and communication times

7 Real Time Systems
- Hard Real Time
  - correctness depends on both the logical results and the time at which the results are produced
  - missing a deadline may be catastrophic
  - mission-critical or life-critical applications
  - fault tolerance is extremely important
- Soft Real Time

8 Processors In The System
- Uniprocessor: there is a single processor
- Multiprocessor: there are n processors in the system
  - can be identical (homogeneous)
  - can have different properties (heterogeneous)

9 Hard Real Time Systems
Use multiprocessors.
- Advantages
  - more reliable: one processor failure does not cause the whole system to fail
    (unless no fault-tolerant capability is provided, in which case a single processor failure can still bring down the whole system)
  - more computational power
- Disadvantage
  - the probability of a processor failure is higher

10 Fault Tolerant System
- The system should produce correct results even in the presence of faults
- Important for most real time applications
- Tasks can have deadlines, and should be finished before the deadline
- Fault tolerance is required in hard real time systems

11 Error Detection in Fault Tolerant Scheduling
- Fail-Signal: notify other processors of a detected fault
- Alarms or watchdogs: detection of timing failures
- Signatures: detection of HW/SW faults
- Acceptance Tests: test results for HW/SW faults

12 Fault Tolerance In Multiprocessor Systems
- Multiple copies of tasks scheduled on different processors
- Aim: the task completes before its deadline

13 Fault Tolerant Scheduling In Multiprocessor Systems (Cont.)
- Multiple copies of tasks are scheduled to different processors
- One or more copies can run to ensure task completion before deadline
  - PB (Primary/Backup Approach)
  - TMR (Triple Modular Redundancy)
- Error checking is done by comparing results
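The "comparing results" step of TMR can be illustrated with a small majority voter. This is only a sketch: the function name `majority_vote` and the use of plain Python values as task results are assumptions for illustration, not something the slides specify.

```python
from collections import Counter

def majority_vote(results):
    """Return the value produced by a strict majority of the copies.

    Returns None when no value has a strict majority, i.e. a fault is
    detected but cannot be masked by voting.
    """
    value, count = Counter(results).most_common(1)[0]
    return value if count > len(results) // 2 else None

# Three copies of the same task, one of them faulty: the bad result is outvoted.
masked = majority_vote([42, 42, 17])

# All three copies disagree: the voter can only report the disagreement.
unmasked = majority_vote([1, 2, 3])
```

With three copies, one faulty result is masked; in the primary/backup approach only two results exist, so a comparison can detect a mismatch but not decide which copy is right.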

14 PB (Primary/Backup Approach)
- If incorrect results are generated by the primary processor, the backup processor is activated
- Small HW resource requirements
- Tasks are nonpreemptive, aperiodic, real-time

15 An Algorithm For Real Time Fault Tolerant Scheduling in Multiprocessor Systems
- N periodic tasks are scheduled on a number of processors
- For each task i, there is a primary copy P_i and a backup copy B_i
- If the primary copy fails, the backup copy is activated
- Enough time is needed to execute the backup copies
- Static scheduling of tasks

16 Scheduling Requirements
- Each task is executed by one processor at a time
- All tasks should meet their deadlines
- Maximize the number of processor failures to be tolerated
- P_i and B_i are each assigned to exactly one processor, and the two processors are different
- Tasks are preemptive
- The number of processors used should be minimized

17 Scheduling Algorithm
- Primary tasks are arranged in order of decreasing computation times
- Primary copies are scheduled (m processors are used)
  - assign each copy to existing processors
- Primary schedule is duplicated for the backup copies (m processors are used)
- Any pair of primary and backup copies should not overlap
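The steps on this slide can be sketched roughly as follows. The slide does not say how a primary copy is matched to a processor, so the least-loaded placement rule used here is an assumption, as are the function names and the convention that backup processors are numbered m..2m-1. The sketch also ignores periods and deadlines and is not guaranteed to reproduce the distribution shown in the transcript's example.

```python
def schedule_primaries(comp_times, m):
    """Assign primary copies to m processors, longest task first.

    comp_times maps a task name to its computation time.
    Returns a dict mapping task name -> processor index in 0..m-1.
    The least-loaded placement rule is an assumption, not from the slides.
    """
    loads = [0] * m
    assignment = {}
    # Step 1: arrange tasks in order of decreasing computation time.
    for task in sorted(comp_times, key=comp_times.get, reverse=True):
        # Step 2 (assumed rule): place the copy on the least-loaded processor.
        proc = loads.index(min(loads))
        assignment[task] = proc
        loads[proc] += comp_times[task]
    return assignment

def duplicate_for_backups(primary, m):
    """Step 3: mirror the primary schedule onto a second group of m processors."""
    return {task: proc + m for task, proc in primary.items()}

# Hypothetical task set, chosen only to exercise the functions.
primary = schedule_primaries({"A": 5, "B": 3, "C": 2}, m=2)
backup = duplicate_for_backups(primary, m=2)

# Step 4: no task may have its primary and backup on the same processor.
conflict_free = all(primary[t] != backup[t] for t in primary)
```

Keeping the backups on a disjoint group of processors trivially satisfies the "different processors" requirement, at the cost of using 2m processors in total.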

18 An Example Distribution
S = {T_1, T_2, T_3, T_4, T_5}
C = {5, 4, 4, 3, 2}
T_1 -> P_1
T_2 -> P_2
T_3 -> P_1
T_4 -> P_2
T_5 -> P_2
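As a quick check of this distribution, the per-processor load can be totalled. The sketch assumes that the i-th entry of C is the computation time of T_i and that P_1 and P_2 here denote processors; both readings are inferred from the slide rather than stated on it.

```python
# Computation times, assuming C lists them in the same order as S.
comp = {"T1": 5, "T2": 4, "T3": 4, "T4": 3, "T5": 2}

# Task-to-processor assignment as given on the slide.
assign = {"T1": "P1", "T2": "P2", "T3": "P1", "T4": "P2", "T5": "P2"}

# Sum the computation time placed on each processor.
loads = {}
for task, proc in assign.items():
    loads[proc] = loads.get(proc, 0) + comp[task]

print(loads)  # {'P1': 9, 'P2': 9}
```

Under that reading, both processors carry the same total computation time: 5 + 4 on P_1 and 4 + 3 + 2 on P_2.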

19 Example (Cont.)

20 Another Algorithm
- The two copies of a task are allowed to start execution at different times
- Improves schedulability of tasks
- N identical processors and a scheduling processor are used
- Dynamic scheduling

21 System Model
- A task is scheduled if the previously scheduled tasks and the newly arrived task all meet their deadlines
- Otherwise the task is rejected, because its deadline cannot be guaranteed in the presence of a fault

22 Techniques Used
- Backup copies are activated only when a fault occurs on the processor executing the primary copy
- Backup Overloading: overlapping multiple slots for backups
- Backup De-allocation: release the slot for a backup copy when its primary copy is completed successfully
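The two techniques can be sketched with a small slot model. The slide does not state when two backup slots may overlap; the rule used below (their primaries must run on different processors, assuming at most one processor fails at a time) is the condition commonly given for backup overloading and is an assumption here. The `BackupSlot` class and all function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class BackupSlot:
    task: str
    primary_proc: int  # processor running the primary copy
    backup_proc: int   # processor on which the backup slot is reserved
    start: int
    end: int           # exclusive

def overlaps(a, b):
    """True if the two backup slots share a processor and intersect in time."""
    return a.backup_proc == b.backup_proc and a.start < b.end and b.start < a.end

def may_overload(a, b):
    """Assumed rule: overlapping is safe if the primaries are on different
    processors, because a single fault can then activate at most one backup."""
    return a.primary_proc != b.primary_proc

def deallocate(slots, finished_task):
    """Backup de-allocation: drop the slot of a task whose primary succeeded."""
    return [s for s in slots if s.task != finished_task]

b1 = BackupSlot("T1", primary_proc=0, backup_proc=2, start=10, end=14)
b2 = BackupSlot("T2", primary_proc=1, backup_proc=2, start=12, end=16)
b3 = BackupSlot("T3", primary_proc=0, backup_proc=2, start=13, end=15)

ok_pair = overlaps(b1, b2) and may_overload(b1, b2)   # primaries on 0 and 1
bad_pair = overlaps(b1, b3) and may_overload(b1, b3)  # both primaries on 0
remaining = deallocate([b1, b2], "T1")                # T1's primary finished
```

Overloading saves reserved processor time, and de-allocation returns a backup slot to the pool as soon as it is known not to be needed.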

23 5 May 2005 23 Backup Overloading

24 5 May 2005 24 Backup Deallocation

25 Proposed Technique
- The primary copy and backup copy are scheduled and executed in parallel
- The backup copy is divided into
  - a preceding part, executed together with the primary copy (redundant part)
  - a remaining part, executed after the primary copy is completed (backup part)
- Backup overloading and backup deallocation are used

26 Proposed Technique (Cont.)

27 Scheduling Algorithm
- Schedule primary copy: try to find a free slot between arrival time and deadline time
- Schedule backup copy: schedule both redundant and backup parts
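The "find a free slot between arrival time and deadline" step for the primary copy might look like the sketch below for a single processor. The representation of a processor's schedule as a list of half-open busy intervals, and the name `find_free_slot`, are assumptions made for illustration; the slides do not describe the data structures used.

```python
def find_free_slot(busy, arrival, deadline, duration):
    """Return the earliest start time s such that arrival <= s,
    s + duration <= deadline, and [s, s + duration) avoids every busy
    interval; return None if no such slot exists.

    busy is a list of (start, end) pairs with end exclusive.
    """
    start = arrival
    for b_start, b_end in sorted(busy):
        if start + duration <= b_start:
            break  # the task fits in the gap before this busy interval
        # Otherwise the earliest candidate start moves past this interval.
        start = max(start, b_end)
    return start if start + duration <= deadline else None

busy = [(0, 3), (5, 8)]
fits_in_gap = find_free_slot(busy, arrival=1, deadline=12, duration=2)
fits_at_end = find_free_slot(busy, arrival=1, deadline=12, duration=3)
rejected = find_free_slot(busy, arrival=1, deadline=10, duration=3)
```

A scheduler following the slide would try this on each candidate processor and reject the task when every attempt returns None.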

28 System Overview

29 Experiments
- Basic parameters used in experiments
  - system load
  - number of processors and tasks used
  - computation time
  - window size
- Analysing results
  - rejection rate

30 5 May 2005 30 Experimental Results

31 Thank You. Any questions?

