Department of Biomedical Informatics Multi-Objective Scheduling of Streaming Workflows Naga Vydyanathan 1, Umit V. Catalyurek 2,3, Tahsin Kurc 2, P. Sadayappan.

Presentation transcript:

1 Department of Biomedical Informatics. Multi-Objective Scheduling of Streaming Workflows. Naga Vydyanathan (1), Umit V. Catalyurek (2,3), Tahsin Kurc (2), P. Sadayappan (1) and Joel Saltz (1,2). Affiliations: (1) Dept. of Computer Science & Engineering, (2) Dept. of Biomedical Informatics, (3) Dept. of Electrical & Computer Engineering, The Ohio State University. 2nd “Scheduling in Aussois” Workshop, May 18-21, 2008. Bi-criteria Scheduling of Streaming Workflows.

2 Department of Biomedical Informatics, Umit Catalyurek, "Scheduling of Streaming Workflows", 2nd Scheduling in Aussois Workshop, May 21, 2008. Current and Emerging Applications: satellite data processing, DCE-MRI analysis, high energy physics, quantum chemistry, image processing, multimedia, video surveillance, Montage.

3 Challenges: Complex and diverse processing structures. [Figure: data analysis applications divided into bag-of-tasks applications and workflows (non-streaming and streaming), with diagrams of the bag-of-tasks model, non-streaming workflows, and streaming workflows. Legend: task, file, sequential or parallel task.]

4 Challenges: Complex and diverse processing structures; varied parallelism. [Figure: bag-of-tasks application with sequential tasks and files on processors P1 to P4, illustrating task-parallelism.]

5 Challenges: Complex and diverse processing structures; varied parallelism. Bag-of-tasks applications: task-parallelism.

6 Challenges: Complex and diverse processing structures; varied parallelism. [Figure: non-streaming workflow of sequential or parallel tasks on processors P1 to P4, illustrating data-parallelism and task-parallelism.]

7 Challenges: Complex and diverse processing structures; varied parallelism. Bag-of-tasks: task-parallelism. Non-streaming workflows: task- and data-parallelism.

8 Challenges: Complex and diverse processing structures; varied parallelism. [Figure: streaming workflow of sequential or parallel tasks on processors P1 to P4, illustrating data-parallelism, pipelined-parallelism and task-parallelism.]

9 Challenges: Complex and diverse processing structures; varied parallelism. Bag-of-tasks: task-parallelism. Non-streaming workflows: task- and data-parallelism. Streaming workflows: task-, data- and pipelined-parallelism.

10 Challenges: Different performance criteria. Bag-of-tasks: batch execution time [CCGrid’05, HCW’05, JSSPP’06, HPDC’06]. Non-streaming workflows: makespan [ICPP’05, HCW’06, ICPP’06, Cluster’06]. Streaming workflows: latency, throughput [SC’02, EuroPar’07, ICPP’08]. Significant communication/data transfer overheads.

11 Scheduling Streaming Workflows. [Figure: data analysis applications divided into bag-of-tasks applications and workflows (non-streaming and streaming).]

12 Scheduling Streaming Workflows. Image processing, multimedia, and computer vision applications often act on a stream of input data. Scheduling challenges: (1) multiple performance criteria: latency (time to process one data item) and throughput (aggregate rate of processing); (2) multiple forms of parallelism: pipelined parallelism, task parallelism, and data parallelism.
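As a rough illustration of the two criteria, consider the simplest possible case: a linear pipeline with one task per processor and communication costs ignored. These simplifications, the function name `pipeline_metrics`, and the stage times are assumptions made for this sketch and are not taken from the talk. Under them, latency is the sum of the stage times and throughput is limited by the slowest stage.

```python
def pipeline_metrics(stage_times):
    """Latency and throughput of an idealized linear pipeline.

    Assumes one task per processor, no replication and zero
    communication cost (illustrative simplifications only).
    """
    latency = sum(stage_times)          # one item passes through every stage
    throughput = 1 / max(stage_times)   # the slowest stage bounds the rate
    return latency, throughput


# Hypothetical stage times, in arbitrary time units.
latency, throughput = pipeline_metrics([8, 10, 4])
print(latency, throughput)
```

Real streaming workflows are DAGs with non-trivial data transfers, so both quantities also depend on how tasks are clustered, replicated and mapped.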

13 An Example Pipelined Schedule. [Figure: Gantt chart (time vs. processors P1 to P6) of a pipelined schedule for a workflow with tasks T1 to T4, with task instances labeled Ti(k). Annotations: pipelined parallelism (T1(2) and T2(1)), task parallelism (T2(1) and T3(1)), data parallelism (T3(1) and T3(2)). Throughput = 0.1, Latency = 37.]

14 Optimizing Latency while Meeting Throughput Constraints. Given: a workflow DAG with runtime and data volume estimates; a collection of homogeneous processors; a throughput constraint. Goal: generate a schedule that meets the throughput constraint while minimizing workflow latency. This requires leveraging pipelined, task and data parallelism in a coordinated manner.

15 Pipelined Scheduling Heuristic: a three-phase approach. Phase 1, satisfying the throughput requirement: assumes an unbounded number of processors; employs clustering, replication and duplication to meet the throughput requirement. Phase 2, limiting the number of processors used: merges task clusters to reduce the number of processors used until a feasible schedule is obtained; preference is given to decisions that minimize latency. Phase 3, minimizing the workflow latency: minimizes communication costs along the critical path by duplication and clustering.

16 Task Clustering

17 Task Replication. Replication is used to improve computation throughput and to improve communication throughput. [Figure: example workflow with tasks T1 to T4 and replicated tasks; Throughput = 0.1.]

18 Task Duplication. [Figure: sample application DAG; (a) schedule without duplication; (b) schedule with duplication.]

19 Duplication-based Scheduling of Streaming Workflows. In the context of streaming workflows, duplication can be used to avoid bottleneck data transfers without compromising task parallelism, and to minimize workflow latency. [Figure: example workflow with duplicated task T1’; let T = 0.1 and P = 4.]

20 Duplication-based Scheduling of Streaming Workflows. However, duplication can require more processors due to redundant computation (this depends on the weight of the duplicated task and on the throughput constraint), and it needs extra communication to broadcast input data to the duplicates. It may increase latency too! Ancestors are therefore duplicated selectively. Duplication is done only if: there are available processors; it proves beneficial in terms of latency; and it does not involve expensive communications that violate the throughput requirement.

21 Estimating Throughput and Latency. Execution model: realistic k-port communication model; communication and computation overlap. Throughput estimate: $\min(\mathrm{CompRate}, \mathrm{CommRate})$. Computation rate estimate: $\mathrm{CompRate} = \min_i \#\mathrm{Procs}(C_i) / \mathrm{exec\_time}(C_i)$ over all clusters $C_i$. Communication rate estimate: greedy priority-based scheduling of communication to channels and ports, then $\mathrm{CommRate} = \min_j \#\mathrm{ParallelTransfers}(tr_j) / \mathrm{min\_cycle\_time}(tr_j)$ over all transfers $tr_j$. Latency estimate: takes into account both communication and computation dependencies.
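The throughput estimate above can be sketched in a few lines of Python. This is only an illustration of the min-of-rates formulas: the `Cluster` and `Transfer` types, their field names and the sample numbers are hypothetical, and the greedy priority-based scheduling that would produce the number of parallel transfers and the minimum cycle time is not modeled here (those values are taken as inputs).

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    procs: int          # processors assigned to the cluster, #Procs(C_i)
    exec_time: float    # time to process one data item, exec_time(C_i)


@dataclass
class Transfer:
    parallel_transfers: int   # #ParallelTransfers(tr_j), assumed given
    min_cycle_time: float     # min_cycle_time(tr_j), assumed given


def comp_rate(clusters):
    """Computation rate: the slowest cluster bounds the rate."""
    return min(c.procs / c.exec_time for c in clusters)


def comm_rate(transfers):
    """Communication rate; unbounded if there are no transfers."""
    return min(
        (t.parallel_transfers / t.min_cycle_time for t in transfers),
        default=float("inf"),
    )


def throughput_estimate(clusters, transfers):
    return min(comp_rate(clusters), comm_rate(transfers))


# Hypothetical inputs, in arbitrary time units.
clusters = [Cluster(2, 18.0), Cluster(1, 9.0), Cluster(1, 10.0)]
transfers = [Transfer(1, 5.0), Transfer(2, 8.0)]
estimate = throughput_estimate(clusters, transfers)
print(estimate)
```

With these made-up inputs the computation rate is the binding term, so the estimate equals the rate of the slowest cluster.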

22 An Example. P = 4, throughput constraint T = 0.1. Satisfying the throughput: nr(T1) = 0.8, nr(T2) = 1, nr(T3) = 0.4, nr(T4) = 0.5, nr(T5) = 0.4, nr(T6) = 0.2. Expensive communications: e(T1,T2), e(T3,T4), e(T3,T5). Cluster T1 and T2; duplicate T3. Limiting the number of processors: P_used = 5. Two options to reduce P_used: merging T1, T2 and T6, or merging T3’, T5 and T6. Merge T3’, T5 and T6, since this reduces latency. Minimizing latency: nothing to be done! [Figure: workflow DAG with tasks T1 to T6 and duplicate T3’.]
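The processor counts in this example can be checked with a short calculation. The sketch below rests on assumptions that the slide does not state explicitly: that nr(Ti) is the throughput constraint multiplied by the task's execution time (which implies weights of 8, 10, 4, 5, 4 and 2 for T1 to T6), that a cluster needs the ceiling of T times its total weight in processors, and that after duplication T3 is grouped with T4 and T3’ with T5. The helper names are hypothetical.

```python
from fractions import Fraction
from math import ceil

T = Fraction(1, 10)  # throughput constraint, kept exact to avoid float error

# Task weights inferred from the nr values (assumption: nr = T * weight).
weights = {"T1": 8, "T2": 10, "T3": 4, "T4": 5, "T5": 4, "T6": 2, "T3'": 4}


def nr(task):
    """Fractional number of processors a single task needs to sustain T."""
    return T * weights[task]


def procs_needed(cluster):
    """Whole processors a cluster needs to sustain T (assumed model)."""
    return ceil(T * sum(weights[t] for t in cluster))


def total_procs(clusters):
    return sum(procs_needed(c) for c in clusters)


# Assumed clustering after the throughput phase.
after_phase1 = [("T1", "T2"), ("T3", "T4"), ("T3'", "T5"), ("T6",)]
# The two merge options named on the slide.
option_a = [("T1", "T2", "T6"), ("T3", "T4"), ("T3'", "T5")]
option_b = [("T1", "T2"), ("T3", "T4"), ("T3'", "T5", "T6")]

print(total_procs(after_phase1), total_procs(option_a), total_procs(option_b))
```

Under these assumptions both merge options fit on the four available processors, which is consistent with the slide choosing between them on latency rather than on processor count.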

23 The Pipelined Schedule. [Figure: Gantt chart (time vs. processors P1 to P4) of the resulting pipelined schedule for tasks T1 to T6 and duplicate T3’ over successive data items. Throughput = 0.1, Latency = 28.]

24 Performance on Synthetic Benchmarks. As the communication-to-computation ratio (CCR) is increased, there are more instances where FCP and EXPERT do not meet the throughput requirement. The proposed approach always meets the throughput requirement and produces lower latencies. [Figure panels: (a) CCR = 0.1, (b) CCR = 1, (c) CCR = 10.]

25 Benefit of Task Duplication. As the throughput constraint is relaxed, greater benefit is observed (more processors are available for duplication). For a negligible throughput constraint, clustering doesn’t have much adverse impact on latency. [Figure panels: (a) CCR = 1, (b) CCR = 10.]

26 Performance on Applications. MPEG frames are processed in order of arrival, so there is no replication. The throughput constraint is assumed to be the reciprocal of the weight of the largest task. The proposed approach yields latency similar to FCP, but has lower resource utilization. The proposed approach generates lower latency than EXPERT. [Figure: performance of MPEG video compression on 32 processors, (a) latency ratio and (b) utilization ratio; labels: Divisions, MOS, FCP, EXPERT.]

27 Throughput Optimization under Latency Constraint. Relation between throughput and latency: monotonically increasing. Binary search algorithm on the inverse problem, where L is the required latency: if L >= L_max, output T_max; if L_min < L < L_max, do a binary search (T = T_max/2, ...). However, as we use heuristics, the monotonic relation is not guaranteed; we use look-ahead techniques to avoid local optima. [Figure: throughput vs. latency curve from (L_min, ~0) to (L_max, T_max).]
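The binary search on the inverse problem might look like the following sketch. It assumes a strictly monotonic throughput-latency relation, so it omits the look-ahead step the slide mentions for the non-monotonic case. The callback `latency_of`, which stands in for running the latency-minimizing scheduler under a given throughput constraint, the tolerance parameter, and the choice to return `None` below L_min are all assumptions of this example.

```python
def max_throughput_under_latency(latency_of, latency_bound,
                                 t_max, l_min, l_max, tol=1e-6):
    """Largest throughput whose schedule latency stays within latency_bound.

    latency_of(t) is a hypothetical stand-in for the scheduler: it returns
    the latency of the schedule built for throughput constraint t, and is
    assumed to be non-decreasing in t.
    """
    if latency_bound >= l_max:
        return t_max            # even the maximum throughput is feasible
    if latency_bound < l_min:
        return None             # no schedule can meet the bound (assumption)

    lo, hi = 0.0, t_max         # lo is feasible, hi is infeasible
    while hi - lo > tol:
        mid = (lo + hi) / 2     # first probe is t_max / 2
        if latency_of(mid) <= latency_bound:
            lo = mid            # feasible: try a larger throughput
        else:
            hi = mid            # infeasible: back off
    return lo


# Toy monotonic model used only to exercise the search.
def toy_latency(t):
    return 10 + 100 * t


best = max_throughput_under_latency(toy_latency, 20, t_max=0.2,
                                    l_min=10, l_max=30)
print(best)
```

Because the real schedulers are heuristics, a probe may report a latency that breaks monotonicity; the sketch would then converge to a local answer, which is the situation the look-ahead techniques on the slide are meant to handle.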

28 Throughput Optimization under Latency Constraint. The proposed approach generates schedules with larger throughputs that meet the latency constraints, and it meets latency constraints even when other schemes fail. [Figure panels: (a) CCR = 0.1, (b) CCR = 1.]

29 Related Work.
Bag-of-Tasks applications:
H. Casanova, D. Zagorodnov, F. Berman, and A. Legrand. Heuristics for scheduling parameter sweep applications in grid environments. HCW’00.
A. Giersch, Y. Robert, and F. Vivien. Scheduling tasks sharing files on heterogeneous master-slave platforms. Journal of Systems Architecture.
K. Kaya and C. Aykanat. Iterative-improvement-based heuristics for adaptive scheduling of tasks sharing files on heterogeneous master-slave environments. IEEE TPDS.
Non-streaming workflows:
S. Ramaswamy, S. Sapatnekar, and P. Banerjee. A framework for exploiting task and data parallelism on distributed memory multicomputers. IEEE TPDS.
A. Radulescu and A. van Gemund. A low-cost approach towards mixed task and data parallel scheduling. ICPP.
A. Radulescu, C. Nicolescu, A. J. C. van Gemund, and P. Jonker. CPR: Mixed task and data parallel scheduling for distributed systems. IPDPS.
K. Aida and H. Casanova. Scheduling mixed-parallel applications with advance reservations. HPDC.
Streaming workflows:
F. Guirado, A. Ripoll, C. Roig, and E. Luque. Optimizing latency under throughput requirements for streaming applications on cluster execution. Cluster Computing.
M. Spencer, R. Ferreira, M. Beynon, T. Kurc, U. Catalyurek, A. Sussman, and J. Saltz. Executing multiple pipelined data analysis operations in the grid. SC, 2002.
A. Benoit and Y. Robert. Mapping pipeline skeletons onto heterogeneous platforms. Technical Report LIP RR.
A. Benoit and Y. Robert. Complexity results for throughput and latency optimization of replicated and data-parallel workflows. Technical Report LIP RR.
A. Benoit, H. Kosch, V. Rehn-Sonigo, and Y. Robert. Optimizing latency and reliability of pipeline workflow applications. Technical Report LIP RR, 2008.

30 Conclusions & Future Work. Streaming workflows: coordinated use of task-, data- and pipelined-parallelism; multiple performance objectives (latency and throughput). The proposed approach consistently meets throughput requirements, produces lower-latency schedules using fewer resources, and produces larger-throughput schedules while meeting latency requirements. Future work: scheduling for multi-core clusters; deeper memory hierarchies; power-aware approaches; fault-tolerant approaches.

31 Thanks. Questions? Contact Info: Umit Catalyurek, OSU Dept. of Biomedical Informatics:

