1 Workshop on HPC in India
Grid Middleware for High Performance Computing
Sathish Vadhiyar
Grid Applications Research Lab (GARL), Supercomputer Education and Research Centre (SERC), Indian Institute of Science (IISc), Bangalore - 560012
ATIP 1st Workshop on HPC in India @ SC-09

2 Grid Applications Research Lab
• Grid and Parallel Computing with primary focus on
  • developing grid applications,
  • building strategies for checkpointing, migration, rescheduling, and fault-tolerance for parallel applications on grid systems, and
  • performance modeling of parallel applications on grids

3 Motivation
• Developing solutions for deployment and use of large-scale scientific applications on grids
• Will result in exploration of large-sized problems and long-running applications

4 Grid Applications: Climate Modeling
• Enable efficient executions of long-running climate modeling simulations on grid systems with the objective of solving climate science problems
• Community Climate System Model (CCSM) – a multi-component global general circulation model
• Analyzed the benefits of executing different components with checkpointing and rescheduling in different batch systems of a grid with a novel execution model
[Figure: CCSM]

5 Grid Applications: Climate Modeling – General Idea (IJHPCA, FGCS)
• Job submission to a batch system incurs queue waiting time
• Waiting time depends on processor requirements
• Idea: decompose a job into small subjobs with small processor requirements and submit the subjobs to multiple batch systems of a grid
• Efficiency depends on effective system utilization using checkpointing, migration and rescheduling
• Leads to 55% average increase in throughput
[Figure: Novel Execution Model]

6 Grid Applications: DNA Sequence Evolutions (JPDC, escience 2009)
• Predictions of future sequences in an evolutionary tree are important for drug discovery, pharmaceutical research and disease control
• An ancestor sequence can transform into a progeny sequence in different ways
• Formulated as a search-space exploration problem and used computational grids to explore the huge space of possible mutations
• Used popular mutations to predict future evolutionary paths
• Performed predictions for HIV sequences and other protein sequences
• 40% better than random methods
[Figure: Master-Worker Architecture for Analyzing Mutations]
[Figure: 40% Better Predictions]
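The master-worker exploration of candidate mutations described on this slide can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the four-letter alphabet, the `popularity` table of observed substitution counts, the function names, and the use of a local thread pool in place of grid workers. It is not the architecture from the slides, only the general pattern of a master enumerating candidates and workers scoring them.

```python
from concurrent.futures import ThreadPoolExecutor

ALPHABET = "ACGT"  # assumed alphabet; protein sequences would use amino acids


def single_point_mutations(seq):
    """Master side: enumerate every sequence one substitution away from seq."""
    for i, old in enumerate(seq):
        for new in ALPHABET:
            if new != old:
                yield i, old, new, seq[:i] + new + seq[i + 1:]


def score(candidate, popularity):
    """Worker side: score a candidate by how often its substitution was seen.

    `popularity` is a hypothetical table mapping (old, new) to a count.
    """
    _, old, new, _ = candidate
    return popularity.get((old, new), 0)


def predict_next(seq, popularity, workers=4, top_k=3):
    """Farm candidates out to workers and return the top_k highest-scoring ones."""
    candidates = list(single_point_mutations(seq))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(lambda c: score(c, popularity), candidates))
    # sorted() is stable, so ties keep their enumeration order.
    ranked = sorted(zip(scores, candidates), key=lambda sc: -sc[0])
    return [cand[3] for _, cand in ranked[:top_k]]


if __name__ == "__main__":
    counts = {("A", "G"): 5, ("C", "T"): 3}
    print(predict_next("AC", counts, top_k=2))
```

On a real grid the pool would be replaced by remote workers, and the master would also have to handle worker failures and uneven task sizes, which this sketch ignores.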

7 Rescheduling
• It is necessary to adapt application execution to grid resource and application dynamics
• SRS – a checkpointing library for malleable applications
  • Can allow processor reconfiguration between migrations
  • Supports different data distributions, storage infrastructure, active migration and fault tolerance
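Processor reconfiguration between migrations implies that checkpointed data written by one number of processes must be re-read by a different number. The sketch below shows the bookkeeping for a one-dimensional block distribution only. The function names are hypothetical and this is not the SRS API; it merely illustrates which index ranges would have to move from each old rank to each new rank.

```python
def block_bounds(n, p, rank):
    """Half-open index range [start, end) owned by `rank` when n elements
    are block-distributed over p processes (earlier ranks get the remainder)."""
    base, extra = divmod(n, p)
    start = rank * base + min(rank, extra)
    end = start + base + (1 if rank < extra else 0)
    return start, end


def redistribution_plan(n, old_p, new_p):
    """List (src_rank, dst_rank, start, end) for every overlapping range."""
    plan = []
    for src in range(old_p):
        s0, s1 = block_bounds(n, old_p, src)
        for dst in range(new_p):
            d0, d1 = block_bounds(n, new_p, dst)
            lo, hi = max(s0, d0), min(s1, d1)
            if lo < hi:  # the two blocks overlap, so data must be transferred
                plan.append((src, dst, lo, hi))
    return plan


if __name__ == "__main__":
    # 10 elements checkpointed on 2 processes, restarted on 3.
    for entry in redistribution_plan(10, 2, 3):
        print(entry)
```

Block-cyclic or multi-dimensional distributions need a more involved mapping, but the idea of intersecting old and new ownership ranges stays the same.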

8 Rescheduling Strategies
• Given a parallel application consisting of multiple phases and given a set of resources, the problem is to derive a rescheduling plan
  • Where to execute the different phases and when to migrate/reschedule
• To find {I_1, I_2, ..., I_Lopt} such that [objective expression not preserved in this transcript] is minimized, where L_opt is the number of intervals, t_i is the predicted execution time of each interval, and rcost is the rescheduling cost
• Developed 3 novel algorithms for deriving a rescheduling plan
  • Incremental algorithm, division heuristic and genetic algorithm
[Figure: Application Phases. Phases 1 to N on clusters 1, 2, 3, grouped into Interval 1 (t_1), Interval 2 (t_2), Interval 3 (t_3), ..., Interval i (t_i)]
[Figure: Division heuristic]
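To make the rescheduling-plan problem concrete, here is a small sketch under a simplified cost model that is an assumption, not the objective from the slide: the total cost is taken to be the sum of predicted phase times plus a fixed `rcost` for every migration between clusters. It uses a plain dynamic program over phases and clusters, which is not one of the three algorithms named above (incremental, division heuristic, genetic). The input table `phase_times` and the function name are hypothetical.

```python
def plan_rescheduling(phase_times, rcost):
    """Choose a cluster for every phase, minimizing time plus migration cost.

    phase_times[c][p] is a hypothetical predicted time of phase p on cluster c.
    Returns (total_cost, assignment), where assignment[p] is the cluster for p.
    """
    n_clusters = len(phase_times)
    n_phases = len(phase_times[0])

    def move(src, dst):
        return 0 if src == dst else rcost

    # best[c]: minimal cost of the phases seen so far, ending on cluster c.
    best = [phase_times[c][0] for c in range(n_clusters)]
    back = []  # back[p - 1][c]: cluster used for phase p - 1 on the best path
    for p in range(1, n_phases):
        new_best, choice = [], []
        for c in range(n_clusters):
            prev = min(range(n_clusters), key=lambda k: best[k] + move(k, c))
            new_best.append(best[prev] + move(prev, c) + phase_times[c][p])
            choice.append(prev)
        best = new_best
        back.append(choice)

    # Trace the cheapest path backwards to recover the plan.
    c = min(range(n_clusters), key=lambda k: best[k])
    total = best[c]
    assignment = [c]
    for choice in reversed(back):
        c = choice[c]
        assignment.append(c)
    assignment.reverse()
    return total, assignment


if __name__ == "__main__":
    times = [[1, 10, 1],   # cluster 0
             [5, 2, 5]]    # cluster 1
    print(plan_rescheduling(times, rcost=1))
```

Consecutive phases assigned to the same cluster form one interval in the slide's terminology. With a high `rcost` the plan collapses to a single interval on one cluster; with a low `rcost` it migrates whenever another cluster is faster for the next phase.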

9 Rescheduling Strategies
• Performed experiments with five large-scale multi-phase parallel applications
  • Molecular dynamics, n-body simulations, astrophysical gas dynamics, crack propagation, electromagnetics

Rescheduling Method    Time (hours)
Incremental            6.8
Division               6.58
Genetic                5.97
Single Schedule        68.77

Huge Benefits due to Rescheduling

10 Performance Modeling (JPDC, CPE)
• It is imperative to automatically derive "knowledge" (performance characteristics) of applications
  • Can be used for effective mapping of applications to resources
• Built techniques for automatically deriving performance model functions for predicting execution costs of parallel applications on grids
  • First effort to deal with load changes during application executions
  • Less than 30% modeling errors – best reported for non-dedicated systems
• Have also developed novel scheduling algorithms that use the model functions
  • Generates 80% better schedules than existing approaches
[Figure: Performance Model Accuracy for Parallel QR]
[Figure: Scheduling Results, by Scheduling Method. Box Elimination (BE) [red bars] 50-80% more efficient!]
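As a rough illustration of what deriving a performance model function can mean, the sketch below fits a deliberately simple model, t = a * work / procs + b, to timing samples by ordinary least squares and then uses it for prediction. The model form, the sample layout, and the function names are assumptions for illustration; the slides do not specify the actual model functions, and this sketch ignores load changes during execution entirely.

```python
def fit_model(samples):
    """Fit t = a * work / procs + b by ordinary least squares.

    samples: list of (work, procs, seconds) tuples from hypothetical runs.
    Needs at least two samples with different work/procs ratios.
    """
    xs = [work / procs for work, procs, _ in samples]
    ts = [seconds for _, _, seconds in samples]
    n = len(samples)
    mean_x, mean_t = sum(xs) / n, sum(ts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxt = sum((x - mean_x) * (t - mean_t) for x, t in zip(xs, ts))
    a = sxt / sxx          # cost per unit of work per process
    b = mean_t - a * mean_x  # fixed overhead
    return a, b


def predict(model, work, procs):
    """Predicted execution time for a given problem size and process count."""
    a, b = model
    return a * work / procs + b


if __name__ == "__main__":
    runs = [(100, 1, 203.0), (100, 2, 103.0), (100, 4, 53.0)]
    model = fit_model(runs)
    print(model, predict(model, 200, 4))
```

A scheduler could evaluate `predict` for each candidate resource set and pick the cheapest, which is the sense in which model functions support mapping applications to resources.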

11 Grid Middleware
• Created a grid middleware for parallel multi-phase applications with rescheduling capabilities
• Have successfully run multi-phase applications on a grid consisting of multiple batch and interactive clusters in two geographically distributed sites
• Also created a grid middleware for multi-component applications for coordinating the executions of the components on the different systems
[Figure: Grid Middleware for Multi-Phase Applications]
[Figure: Grid Middleware for Multi-Component Applications]

12 Other Research
• Checkpointing Interval Selection
  • For efficient execution in the presence of failures
  • A Markov Model consisting of 3 kinds of states for performance prediction
  • Extensive simulations with 9-year real supercomputer failure traces on 8 parallel systems, 3 rescheduling policies, and 3 parallel applications
  • Our model's checkpointing intervals lead to a high amount of useful work by the applications in the presence of failures
• Compiler-aided checkpointing instrumentation
  • A source-to-source precompiler for automatic insertion of checkpointing calls
  • Performs live-variable analysis for determining data and wrappers for finding data sizes
  • Can handle parallel applications with block-distribution (molecular dynamics)
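The live-variable analysis mentioned for compiler-aided checkpointing determines which variables still matter after a checkpoint location, so only those need saving. The sketch below shows the standard backward pass for straight-line code only. The statement representation as (defs, uses) pairs and the function name are assumptions for illustration; a real precompiler would also handle branches and loops, which require iterating to a fixed point over a control-flow graph.

```python
def live_at_checkpoint(statements, checkpoint_index, live_out=frozenset()):
    """Variables live just before statements[checkpoint_index].

    statements: list of (defs, uses) pairs, one per statement, in program order.
    live_out:   variables assumed live after the last statement.
    Only straight-line code is handled; no branches or loops.
    """
    live = set(live_out)
    # Walk backwards from the end of the program to the checkpoint location.
    for defs, uses in reversed(statements[checkpoint_index:]):
        live -= set(defs)  # a definition kills the earlier value
        live |= set(uses)  # a use makes the variable live before this point
    return live


if __name__ == "__main__":
    program = [
        ({"a"}, set()),          # a = read_input()
        ({"b"}, {"a"}),          # b = a * 2
        ({"tmp"}, {"b"}),        # tmp = b + 1
        ({"c"}, {"tmp", "a"}),   # c = tmp * a
        (set(), {"c"}),          # print(c)
    ]
    # A checkpoint placed before statement 3 must save tmp and a, but not b.
    print(sorted(live_at_checkpoint(program, 3)))
```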

13 Summary
• Primary endeavor to aid scientific advancement in different domain areas using grid systems
• Grid research in two different application areas that resulted in significant application benefits using grids
• Contributed novel scheduling and rescheduling algorithms, performance modeling strategies and robust grid middleware for use by the scientific community

14 Areas of Collaborations
• Scalability of large-scale and petascale applications
• Fault tolerance in high performance systems
• Setting up Indo-US grids
• Grid middleware collaborations

