Opportune Job Shredding: An Efficient Approach for Scheduling Parameter Sweep Applications Rohan Kurian, Pavan Balaji, P. Sadayappan The Ohio State University.

1 Opportune Job Shredding: An Efficient Approach for Scheduling Parameter Sweep Applications Rohan Kurian, Pavan Balaji, P. Sadayappan The Ohio State University

2 Parameter Sweep Applications: An important class of applications; Set of independent tasks; MCell Application (3D simulations for sub-cellular architecture/physiology); GTOMO (Parallel Tomography) Application (multiple view-point simulation); Systems exist for scheduling on the Grid; Cluster-based scheduling?

3 Application Level Schedulers: Manage the scheduling of applications; Break the application into appropriate chunks; APST (AppLeS Parameter Sweep Template); NIMROD; Greedy approach to schedule PSA chunks

4 Presentation Roadmap: Job Scheduling in Clusters; Multi-Site Job Scheduling; PSA Scheduling Strategies; Multi-Site Scheduling of PSAs; Performance Evaluation; Conclusions

5 Job Scheduling in Clusters: Mapping arriving jobs to available resources; Multiple schemes for scheduling (First Come First Serve (FCFS), Conservative Scheduling, Aggressive or EASY Scheduling); Fair-Share constraints (a user cannot have more than 'N' queued jobs); Submitting the multiple chunks of a PSA job: violation of Fair-Share constraints; Combine chunks to form a single parallel job

6 Formation of PSAs in Clusters [Figure: Small Independent Tasks; Parallel Parameter Sweep Application]

7 Presentation Roadmap: Job Scheduling in Clusters; Multi-Site Job Scheduling; PSA Scheduling Strategies; Multi-Site Scheduling of PSAs; Performance Evaluation; Conclusions

8 Multi-Site Job Scheduling: Multiple Simultaneous Requests (job submitted to multiple sites; started on the earliest cluster); Existing schemes have limitations (heterogeneous clusters; different scheduling schemes)
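
The multiple-simultaneous-requests idea on this slide can be illustrated with a small sketch. It assumes each site reports an estimated start time for the job; the function name `earliest_site` and the site labels are hypothetical and do not come from the presentation. The job runs at the site that can start it first, and the requests at the other sites are withdrawn.

```python
from typing import Dict, List, Tuple


def earliest_site(start_times: Dict[str, float]) -> Tuple[str, List[str]]:
    """Pick the site that can start the job first.

    start_times maps a site name to that site's estimated start time
    (an assumed input; the slides do not say how it is obtained).
    Returns the winning site and the sites whose requests are cancelled.
    """
    winner = min(start_times, key=start_times.get)
    cancelled = sorted(site for site in start_times if site != winner)
    return winner, cancelled


# Hypothetical estimates from three sites.
winner, cancelled = earliest_site({"site1": 30.0, "site2": 12.0, "site3": 45.0})
```

Comparing raw start times ignores differences in processor speed and local scheduling policy between sites, which is the kind of limitation the slide attributes to existing schemes.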

9 Multiple-simultaneous-requests [Figure: Jobs; Site 1, Site 2, and Site 3, each with a Meta Scheduler and a Local Scheduler]

10 Presentation Roadmap: Job Scheduling in Clusters; Multi-Site Job Scheduling; PSA Scheduling Strategies; Multi-Site Scheduling of PSAs; Performance Evaluation; Conclusions

11 PSA Scheduling Strategies: Flooding-based Job Shredding (submit all chunks in the PSA at once; greedy approach; improves user and system metrics; doesn't ensure fairness to Non-PSA jobs); Opportune Job Shredding (uses an additional Application-Level Scheduler; monitors the current schedule of the system; if no normal backfill is possible, allow PSA jobs to shred and backfill)
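
The Opportune rule on this slide (shred a PSA only when no normal job can backfill) can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes that each PSA task needs one processor, that task runtimes are known, and that a backfill hole is described by a number of idle processors and a time window. The backfill test is also simplified compared with full EASY backfilling. All names (`Job`, `PSA`, `can_backfill`, `opportune_shred`) are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Job:
    procs: int      # processors requested
    runtime: float  # estimated runtime


@dataclass
class PSA:
    remaining_tasks: int  # independent tasks not yet submitted
    task_runtime: float   # assumed: each task runs on one processor


def can_backfill(job: Job, idle_procs: int, window: float) -> bool:
    # Simplified test: the job must fit in the hole in both dimensions.
    return job.procs <= idle_procs and job.runtime <= window


def opportune_shred(queue: List[Job], psa: PSA,
                    idle_procs: int, window: float) -> Optional[Job]:
    """Return a chunk carved from the PSA, or None if nothing is shredded."""
    # Normal jobs get the hole first: shred only if none of them can backfill.
    if any(can_backfill(job, idle_procs, window) for job in queue):
        return None
    if idle_procs == 0 or psa.remaining_tasks == 0 or psa.task_runtime > window:
        return None
    # Carve off as many one-processor tasks as the hole can hold.
    chunk = min(idle_procs, psa.remaining_tasks)
    psa.remaining_tasks -= chunk
    return Job(procs=chunk, runtime=psa.task_runtime)


# Hypothetical hole: 4 idle processors for 20 time units; the only queued
# normal job needs 8 processors, so the PSA is allowed to shred.
psa = PSA(remaining_tasks=10, task_runtime=5.0)
chunk = opportune_shred([Job(procs=8, runtime=100.0)], psa, idle_procs=4, window=20.0)
```

Giving normal jobs first claim on every hole is what separates this from the flooding strategy, which submits all chunks at once and so competes with Non-PSA jobs for backfill opportunities.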

12 Presentation Roadmap: Job Scheduling in Clusters; Multi-Site Job Scheduling; PSA Scheduling Strategies; Multi-Site Scheduling of PSAs; Performance Evaluation; Conclusions

13 Multi-Site Scheduling for PSAs: Two-level Application-Level Schedulers; No constraints on sites (allowed to have different speeds; allowed to have different scheduling policies); Similar to "Multiple Simultaneous Requests"; Simultaneous requests only for PSAs

14 Multi-Site Scheduling for PSAs [Figure: Meta Application-Level Scheduler; Site 1, Site 2, and Site 3, each with an App-Level Scheduler, a Job Queue, and a Local Scheduler]

15 Presentation Roadmap: Job Scheduling in Clusters; Multi-Site Job Scheduling; PSA Scheduling Strategies; Multi-Site Scheduling of PSAs; Performance Evaluation; Conclusions

16 Performance Metrics: Response Time = Completion Time − Submit Time; Slowdown = Response Time / Runtime; Loss of Capacity (LOC): ΔLOC = min{Σ(waiting jobs' procs), idle procs}; ΔT = time for which this state lasts; LOC = Σ(ΔLOC × ΔT)
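
The three metrics on this slide can be computed as in the sketch below. It reads the Loss of Capacity formula as a sum over scheduler states: in each state, the smaller of (total processors requested by waiting jobs) and (idle processors), multiplied by how long that state lasts. The `SchedulerState` record and the function names are hypothetical, and no normalization by machine size or total time is applied because the slide shows none.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SchedulerState:
    waiting_job_procs: List[int]  # processors requested by each waiting job
    idle_procs: int               # processors idle during this state
    duration: float               # time for which this state lasts


def response_time(submit_time: float, completion_time: float) -> float:
    return completion_time - submit_time


def slowdown(submit_time: float, completion_time: float, runtime: float) -> float:
    return response_time(submit_time, completion_time) / runtime


def loss_of_capacity(states: List[SchedulerState]) -> float:
    # Capacity is lost only when processors sit idle while jobs are waiting.
    return sum(
        min(sum(s.waiting_job_procs), s.idle_procs) * s.duration
        for s in states
    )


# Hypothetical trace of three scheduler states.
states = [
    SchedulerState(waiting_job_procs=[4, 8], idle_procs=6, duration=10.0),
    SchedulerState(waiting_job_procs=[], idle_procs=16, duration=5.0),
    SchedulerState(waiting_job_procs=[2], idle_procs=10, duration=3.0),
]
loc = loss_of_capacity(states)  # 6*10 + 0*5 + 2*3 = 66.0
```

The second state contributes nothing even though 16 processors are idle, because no job is waiting to use them.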

17 Evaluation Scheme: Simulation-based approach; CTC trace from Feitelson's archive; EASY backfilling used; For multi-site evaluation: CTC traces from 3 different months, processing speeds in the ratio 2:1:3

18 Flooding Based Job Shredding: Up to 60% improvement for PSA Jobs; Up to 90% worse performance for Non-PSA Jobs

19 Flooding: Job Category wise breakup: Narrow Short Non-PSA jobs suffer most; Loss of back-filling opportunities is the main reason

20 Flooding: Loss of Capacity: Up to 75% improvement in the Loss of Capacity

21 Opportune Job Shredding: Up to 70% improvement for PSA Jobs; Less than 2% worsening in performance for Non-PSA Jobs

22 Opportune: Job Category wise breakup: No category of Non-PSA jobs suffers more than 7%

23 Opportune: Loss of Capacity: Up to 12% improvement in the Loss of Capacity

24 Opportune (Multi-Site): Up to 95% improvement for PSA Jobs; No significant loss of performance for Non-PSA jobs

25 Opportune (Multi-Site): Response Time: Up to 75% improvement for PSA Jobs; No significant loss of performance for Non-PSA jobs

26 Opportune (Multi-Site): Slowdown: Up to 95% improvement for PSA Jobs; No significant loss of performance for Non-PSA jobs

27 Opportune (Multi-Site): Loss of Capacity: Up to 45% improvement in the Loss of Capacity

28 Concluding Remarks: Opportune Job Shredding; Efficient scheduling of PSAs; Single-Site and Multi-Site versions; Significant improvement for PSA jobs; Ensures that Non-PSA jobs are not affected; Plan to integrate this with production schedulers

29 Thank You!

