
1 Fault-Tolerant Dynamic Task Graph Scheduling
Mehmet Can Kurt, The Ohio State University
Sriram Krishnamoorthy, Pacific Northwest National Laboratory
Kunal Agrawal, Washington University in St. Louis
Gagan Agrawal, The Ohio State University

2 Motivation
- Significant transformation in hardware: many-core processors/coprocessors (Intel Xeon Phi)
- Trend towards asynchronous execution: task graph models (Cilk, TBB, CnC)
- Resilience important now more than ever: soft errors, decreasing MTBF
Takeaway: fault tolerance for the task graph model

3 Background: Task Graph Execution
- Representation as a DAG: vertices (tasks), edges (dependences)
- Main scheduling rule: a task executes only after all of its predecessors have completed
- Improved scalability: asynchronous execution, load balance via work stealing
[Figure: example task graph with nodes A, B, C, D, E]
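
To make the scheduling rule concrete, here is a minimal, illustrative C++ sketch (not NABBIT's API; the graph shape and all names are assumptions pieced together from the slides' A-E example): a task graph stored as adjacency lists and executed Kahn-style, so a task becomes ready exactly when all of its predecessors have completed.

```cpp
#include <cstdio>
#include <map>
#include <string>
#include <vector>

struct TaskGraph {
    // edges[t] lists the successors of task t; preds counts inbound edges.
    std::map<std::string, std::vector<std::string>> edges;
    std::map<std::string, int> preds;

    void add_edge(const std::string& from, const std::string& to) {
        edges[from].push_back(to);
        ++preds[to];
        preds[from] += 0;  // ensure 'from' is registered even with no preds
    }
};

int main() {
    // An illustrative A-E graph (edge set assumed, not taken from the paper).
    TaskGraph g;
    g.add_edge("A", "B");
    g.add_edge("A", "C");
    g.add_edge("B", "C");
    g.add_edge("B", "D");
    g.add_edge("C", "D");
    g.add_edge("D", "E");

    // Kahn-style execution: a task runs once all its predecessors finished.
    std::vector<std::string> ready;
    for (auto& [t, n] : g.preds)
        if (n == 0) ready.push_back(t);
    while (!ready.empty()) {
        std::string t = ready.back();
        ready.pop_back();
        std::printf("execute %s\n", t.c_str());
        for (auto& s : g.edges[t])
            if (--g.preds[s] == 0) ready.push_back(s);
    }
}
```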

4 Failure Model
- Task graph scheduling in the presence of detectable soft errors
- Recover corrupted data blocks and task descriptors
- Assumptions:
  1. an error detector exists (ECC, symptom-based detectors, application-level assertions); recovery starts upon observation of an error
  2. the logic for task graph creation, provided through user functions, is resilient
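
The paper assumes such a detector exists rather than prescribing one. As one illustrative possibility (an application-level assertion; the DataBlock type and the choice of FNV-1a here are my assumptions), a consumer could verify a checksum before reading a block and throw on mismatch, which is the "observation" that triggers recovery:

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

struct DataBlock {
    std::vector<uint64_t> words;
    uint64_t checksum = 0;  // set by the producing task after writing

    uint64_t compute_checksum() const {
        uint64_t h = 14695981039346656037ULL;          // FNV-1a offset basis
        for (uint64_t w : words) { h ^= w; h *= 1099511628211ULL; }
        return h;
    }

    // "Recovery upon observation": a consumer calls this before reading.
    void verify() const {
        if (compute_checksum() != checksum)
            throw std::runtime_error("soft error detected in data block");
    }
};
```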

5 Recovery Challenges
- D fails right after its computation: re-compute D (only once), restart B and C
- Further complications if data blocks are reused: C overwrites E, so E must be re-computed (only once)
- Goal: minimum effect on normal scheduling
[Figure: task graph A-E with task states Waiting, Completed, Executing, Failed]

6 Fault-Tolerant Scheduling
- Developed on NABBIT*: a task graph scheduler using work stealing, augmented with additional routines; optimality properties maintained
- Recovery from an arbitrary number of soft failures: no redundant execution or checkpoint/restart, selective task re-execution, negligible overheads for a small constant number of faults
* IPDPS'10

7 Scheduling Without Failures
- Traverse predecessors: if A.status is "Computed", decrement C.join; if B.status is "Visited", enqueue C to B.notifyArray
- Successors are enqueued in notifyArray
- Compute the task when join reaches 0
- Notify successors in notifyArray
C's Task Descriptor (value progression as A computes, then C computes):
  join (number of outstanding predecessors): 2 -> 1 -> 0
  notifyArray (successors to notify): { } -> {D}
  status (execution status at the moment): Visited -> Computed
  db (pointer to output): null -> data
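
A simplified C++ reading of this protocol follows. Field names follow the slide; the control flow is a single-threaded sketch, so the concurrency control of the real work-stealing scheduler is omitted, and compute() stands in for the user's task body.

```cpp
#include <atomic>
#include <string>
#include <vector>

enum class Status { Visited, Computed };

struct TaskDesc {
    std::string name;
    std::atomic<int> join{0};            // outstanding predecessors; set to preds.size() at creation
    Status status = Status::Visited;     // execution status at the moment
    std::vector<TaskDesc*> notifyArray;  // successors to notify
    void* db = nullptr;                  // pointer to the output data block
    std::vector<TaskDesc*> preds;

    void compute() { /* user task body: produce the output, set db */ }

    void run() {
        compute();
        status = Status::Computed;
        for (TaskDesc* s : notifyArray)          // notify successors
            if (--s->join == 0) s->run();
    }

    // Traverse predecessors, as on the slide: a "Computed" predecessor
    // decrements join; a "Visited" one records this task for later notification.
    void traverse() {
        for (TaskDesc* p : preds) {
            if (p->status == Status::Computed) {
                if (--join == 0) run();
            } else {
                p->notifyArray.push_back(this);  // enqueue into p's notifyArray
            }
        }
    }
};
```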

8 Fault-Tolerant Scheduling: Properties
- Non-collective and selective recovery: re-execute only the impacted portion of the task graph, without interfering with other threads
[Figure: task graph A-E distributed across thread 1, thread 2, thread 3]

9 Fault-Tolerant Scheduling: Recovery
- Failures can be handled at any stage of execution: each stage is enclosed in a try-catch block
- No recovery for failures that are not observed
- Failure observed during traversal (predecessor failure): recover B
- Failure observed during computation (self failure): recover C
- Failure observed during notification (successor failure): recover E
[Figure: three scenarios on A-B-C and C-E subgraphs, labeled Predecessor Failure, Self Failure, Successor Failure]
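
A schematic of how the three observation points might be wrapped, continuing the TaskDesc sketch above. The control flow is simplified to a straight line, and recover_failed_predecessor, recover_self, recover_failed_successor, and notify_successors are hypothetical placeholders, not names from the paper.

```cpp
#include <stdexcept>

// Hypothetical recovery hooks (placeholders for illustration only):
void recover_failed_predecessor(TaskDesc& t);
void recover_self(TaskDesc& t);
void recover_failed_successor(TaskDesc& t);
void notify_successors(TaskDesc& t);

// Each stage is enclosed in try-catch; a detected soft error surfaces as an
// exception at whichever stage observes it.
void execute_task(TaskDesc& t) {
    try { t.traverse(); }                       // may observe a failed predecessor
    catch (const std::runtime_error&) { recover_failed_predecessor(t); }

    try { t.compute(); }                        // the task itself may fail
    catch (const std::runtime_error&) { recover_self(t); }

    try { notify_successors(t); }               // may observe a failed successor
    catch (const std::runtime_error&) { recover_failed_successor(t); }
}
```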

10 Fault-Tolerant Scheduling: Recovery
- Meta-data of a failed task is correctly recovered
- Treat the failed task as new (no backup & restore): replace the failed task descriptor with a fresh one (B replaced by B')
- The recovering task traverses its predecessors, computes, and notifies
[Figure: B's stale descriptor (join: 0, notifyArray: {C,D}, status: Visited, db: null) vs. B''s fresh descriptor (join: 1, notifyArray: {}, status: Visited, db: null)]
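
A sketch of "treat the failed task as new", continuing the TaskDesc sketch: allocate a fresh descriptor (B') instead of backing up and restoring B's old state. publish() is a hypothetical helper standing in for swapping the new descriptor into the scheduler's bookkeeping.

```cpp
void publish(TaskDesc* fresh);  // hypothetical: install B' in place of B

TaskDesc* replace_failed(TaskDesc* failed) {
    TaskDesc* fresh = new TaskDesc;
    fresh->name  = failed->name;                            // B' stands in for B
    fresh->preds = failed->preds;                           // graph-creation logic is resilient
    fresh->join  = static_cast<int>(fresh->preds.size());   // e.g. 1 for B'
    fresh->status = Status::Visited;
    fresh->db = nullptr;                                    // output will be recomputed
    // notifyArray starts empty; Guarantee 2 (below) re-populates it.
    publish(fresh);
    fresh->traverse();  // the recovering task traverses, computes, notifies
    return fresh;
}
```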

11 Fault-Tolerant Scheduling: Key Guarantees
Guarantee 1: the join of a task descriptor is decremented exactly once per predecessor.
- Problem: if B recovers and notifies D again, D executes prematurely (D.join is 1: already notified by B, still waiting for C)
- Solution: keep track of notifications
[Figure: task graph A-E]
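
One way to track notifications, sketched against the TaskDesc above (the NotifyOnce record and its placement are assumptions for illustration): remember which predecessors have already been counted, so a recovered predecessor's repeated notification cannot decrement join a second time.

```cpp
#include <mutex>
#include <set>

struct NotifyOnce {
    std::set<TaskDesc*> seen;  // predecessors whose notification was counted
    std::mutex m;

    bool first_notification(TaskDesc* pred) {
        std::lock_guard<std::mutex> lock(m);
        return seen.insert(pred).second;  // false for a duplicate notification
    }
};

// Only the first notification per predecessor counts; B's re-notification of
// D after recovery is ignored, so D cannot execute prematurely.
void on_notify(TaskDesc& succ, NotifyOnce& tracker, TaskDesc* pred) {
    if (tracker.first_notification(pred) && --succ.join == 0)
        succ.run();
}
```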

12 Fault-Tolerant Scheduling: Key Guarantees
Guarantee 2: every task waiting on a predecessor is notified.
- Problem: execution hangs if enqueued tasks are never notified (B's notifyArray {C,D} is lost on failure, leaving C at join 1 and D at join 2 forever)
- Solution: re-construct the notifyArray (B' starts with {} and rebuilds it to {C,D})
[Figure: B replaced by B'; notifyArray states before and after reconstruction]
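
One plausible reconstruction, sketched here only for illustration (the slide states that the notifyArray is re-constructed but not how, so the mechanism below is an assumption): walk the task's static successor list and re-enqueue every successor that is still waiting, so none of them hangs.

```cpp
void rebuild_notify_array(TaskDesc& fresh,
                          const std::vector<TaskDesc*>& successors) {
    for (TaskDesc* s : successors) {
        if (s->status != Status::Computed)   // e.g. C (join 1) and D (join 2)
            fresh.notifyArray.push_back(s);  // {} grows back to {C, D}
    }
}
```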

13 Fault-Tolerant Scheduling: Key Guarantees
Guarantee 3: each failure is recovered at most once.
- Problem: both C and D observe the failure, triggering a separate recovery by each observer
- Solution: keep track of initiated recoveries
[Figure: task graph A-E]
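
A compare-and-swap on a per-task flag (an assumed field, not shown on the slide) is one way to keep track of initiated recoveries: when both C and D observe B's failure, exactly one of them initiates the recovery.

```cpp
#include <atomic>

void on_failure_observed(TaskDesc* failed,
                         std::atomic<bool>& recovery_started) {
    bool expected = false;
    if (recovery_started.compare_exchange_strong(expected, true)) {
        replace_failed(failed);  // the winning observer recovers B
    }
    // The losing observer does nothing: the failure is recovered at most once.
}
```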

14 Fault-Tolerant Scheduling: Key Guarantees
Guarantee 4: overwritten data blocks are distinguished and handled correctly.
- Did D start overwriting C's data block?
  - if not, only re-compute D
  - otherwise, re-compute C, B and A as well
- Solution: treat overwritten data blocks as failed
[Figure: chain of tasks A, B, C, D reusing one data block; versions v=0, v=1, v=2]
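
A version-tagged data block is one way to make that distinction; the sketch below is an illustrative reading of the slide's v=0/v=1/v=2 labels, not the paper's data structure. If the version has moved past the one a consumer depends on, the producer started overwriting it, so the block is treated exactly like a failed one and the chain is re-executed.

```cpp
#include <atomic>

struct VersionedBlock {
    std::atomic<int> version{0};  // v=0, v=1, v=2, ... per producing task

    void begin_overwrite(int new_version) {  // producer D calls this first
        version.store(new_version);
    }
};

bool must_recompute_producer(const VersionedBlock& b, int expected_version) {
    // The version moved on: D started overwriting C's output, so treat the
    // block as failed and re-compute C (and transitively B and A).
    return b.version.load() != expected_version;
}
```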

15 Theoretical Analysis of Performance
- NABBIT is provably time efficient: asymptotically optimal running time
- This optimality property is maintained for normal (failure-free) execution
- Optimal fault-tolerant execution: no additional increase to critical-path length; cost depends on the number of failures
- Experiments support the analysis
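
For orientation, a hedged LaTeX sketch of the kind of bound these claims refer to: the classic randomized work-stealing result, plus an illustrative reading of the fault-tolerant claim (the re-executed-work term W_f is my notation, not quoted from the paper).

```latex
% T_1: total work, T_\infty: critical-path length, P: number of processors.
% Classic randomized work-stealing bound (failure-free execution):
T_P = O\!\left(\frac{T_1}{P} + T_\infty\right)
% Illustrative reading of the fault-tolerant claim: re-executed work W_f
% enlarges only the work term, leaving the critical-path term unchanged:
T_P^{\mathrm{ft}} = O\!\left(\frac{T_1 + W_f}{P} + T_\infty\right)
```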

16 Experiments
- Platform: four 12-core AMD Opteron 2.3 GHz processors with 256 GB memory; only 44 of the 48 cores are used
- Reported numbers are the arithmetic mean (with standard deviation) of 10 runs
- Benchmarks: LCS, Smith-Waterman (SW), Floyd-Warshall (FW), LU and Cholesky

17 Overheads without Failures
- Results for LCS and SW
[Figure: overhead charts for LCS and SW; annotated values 40, 39, 37, 36]

18 Overheads without Failures
- Results for FW: 10-15% overhead at 44 cores
[Figure: overhead chart for FW; annotated values 42, 36]

19 Overheads with Failures
Failure-injection parameters:
- Amount of work lost: a constant number of tasks (512), or a percentage of total work (2%, 5%)
- Failure time: before compute or after compute
- Task type: tasks which produce a data block's 0th version (v=0), last version (v=last), or a random version (v=rand)

20 Overheads with Failures (512 re-executions)
- Overheads are negligible in the "before compute" scenario and small in the "after compute" scenario
- No overhead with 1, 8 and 64 task re-executions
[Figure: overhead chart]

21 Overheads with Failures (2% and 5%)
- Overheads are proportional to the amount of work lost
[Figure: overhead chart; annotated points at 3.6% and 8.2%]

22 Scalability Analysis (5% task re-executions, varying number of cores)
- Re-execution chains can lead to a lack of concurrency
- Overheads do not exceed 6.5% in most cases
[Figure: scalability chart; worst annotated overhead 6.5%]

23 Conclusion
- A fault-tolerant dynamic task graph scheduler with non-collective and selective recovery
- Optimality properties still hold
- Recovery overheads: negligible for a small number of failures, proportional to the work lost for larger ones

24 Thank you! Questions?

25 Benchmarks

26 After Notify Scenario

