1
Software Tools for Dynamic Resource Management
Irina V. Shoshmina, Dmitry Yu. Malashonok, Sergay Yu. Romanov
Institute of High-Performance Computing and Information Systems
www.csa.ru
{irena,mal,serrom}@csa.ru
2
State of the art
Resources:
- CONVEX(es)
- Parsytec CC/16
- Parsytec CCid
- Parsytec Power Mouse System
- SPP1600
- SGI OCTANE Workstations
- SunUltra 450
- Paritet (Intel cluster)
www.csa.ru/CSA
Scientific problems:
- hydroaerodynamics
- plasma
- nuclear physics
- medicine
- biology
- chemistry
- astronomy
3
Difficulties
- shortage of resources for the scientific problems being solved
- unsatisfactory management of tasks (the majority of tasks are parallel)
4
Shortage of resources
Integrate the computational resources of several scientific centres
Advantages of integration
- wider access to, and more active use of, computational resources
- closer integration of the scientific community
- a broader range of scientific and technical problems that can be solved
5
Management of tasks
Optimisation of task distribution across computational nodes
Tools
- Codine
- Sun Grid Engine
- PBS
- Condor
Disadvantages of the tools
- weak support for migration of parallel tasks
- unsatisfactory load balancing
- dependence on particular versions of PVM and MPI
6
Main goals of the project
- more efficient use of computing resources
- better quality of service for users
Main tasks
- migration of parallel tasks
- optimisation of distributed resource management
- integration of the resources of several scientific centres
7
Dynamite
Software developed by the University of Amsterdam in the Esprit project 23499
Dynamite advantages
- migration and checkpointing of PVM tasks
- automatic workload balancing of PVM tasks (on a cluster of workstations)
- migration of dynamically linked tasks
- migration of communication end points
- reallocation of tasks
8
Dynamite disadvantages
- dependence on the PVM version
- no migration of MPI tasks
- no satisfactory monitoring system
- no advanced scheduling system
- no modules for global distribution
9
Main steps of the project
- Migration of MPI and PVM tasks
- Checkpointing of parallel tasks
- Monitoring
- Resource management
- Support for additional architectures
10
Two-level system
- Global level
- Local level
11
Main problems of migration
- migration of PVM tasks
- migration of MPI tasks
- independence from particular versions and implementations of PVM and MPI
- support for additional architectures
Migration of PVM and MPI tasks
- files
- sockets
- kernel-supported threads, etc.
12
Checkpointing of parallel tasks
- trace the progress of parallel tasks
- migrate parallel tasks at two levels
  - migration of a single process of a parallel task (local level)
  - migration of a parallel task as a whole (global level)
- handle exceptional situations
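The checkpoint-and-resume idea behind the list above can be illustrated with a minimal sketch. It is an application-level toy and not the mechanism used by Dynamite, which checkpoints whole processes transparently. The function names, the pickle file format, and the `stop_after` parameter are all assumptions made for this example.

```python
import os
import pickle
import tempfile


def save_checkpoint(path, state):
    """Write the state to a temporary file, then rename it over the old checkpoint."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        pickle.dump(state, f)
    # os.replace swaps the file in one step, so a crash never leaves a half-written checkpoint.
    os.replace(tmp, path)


def load_checkpoint(path, default):
    """Return the saved state, or the default if no checkpoint exists yet."""
    try:
        with open(path, "rb") as f:
            return pickle.load(f)
    except FileNotFoundError:
        return default


def run(path, n_steps, stop_after=None):
    """Sum 0..n_steps-1, checkpointing after every step.

    stop_after (hypothetical) simulates an interruption after that many steps;
    an interrupted run returns None.
    """
    state = load_checkpoint(path, {"step": 0, "total": 0})
    done = 0
    while state["step"] < n_steps:
        if stop_after is not None and done >= stop_after:
            return None
        state["total"] += state["step"]
        state["step"] += 1
        save_checkpoint(path, state)
        done += 1
    return state["total"]
```

A second call to `run` with the same checkpoint path, possibly on another node that can see the file, continues from the last saved step instead of starting over.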
13
Checkpointing of parallel tasks
- Global level
- Local level
14
Monitoring
Parameters of
- computational resources (processor load, memory, network)
- tasks and queues
- users
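One possible shape for the monitored resource parameters is sketched below. The slides do not define a data model, so the record type, its field names and units, and the per-node averaging are assumptions for illustration only.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class NodeSample:
    """One monitoring sample for a computational node (hypothetical fields)."""
    node: str
    cpu_load: float   # fraction of processor capacity in use, 0..1
    mem_used: float   # fraction of memory in use, 0..1
    net_mbps: float   # network traffic in Mbit/s


def summarize(samples):
    """Average all samples of each node into a single NodeSample per node."""
    by_node = {}
    for s in samples:
        by_node.setdefault(s.node, []).append(s)
    return {
        node: NodeSample(
            node,
            mean(s.cpu_load for s in group),
            mean(s.mem_used for s in group),
            mean(s.net_mbps for s in group),
        )
        for node, group in by_node.items()
    }
```

Records for tasks, queues, and users could follow the same pattern, with a scheduler reading the summaries instead of raw samples.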
15
Resource management
- distribution of tasks and queues at the current moment
- long-term scheduling
- dynamic load balancing at the global and local levels
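The slides do not name a balancing algorithm. As a stand-in, the sketch below uses a simple greedy rule: take tasks from most to least expensive and give each one to the node that currently has the lowest total load. The function name, the task cost estimates, and the node names are assumptions.

```python
import heapq


def assign_tasks(task_costs, nodes):
    """Greedy placement: the most expensive task goes to the least loaded node.

    task_costs maps a task name to an estimated cost; nodes is a list of node names.
    Returns a dict mapping each node to the list of tasks placed on it.
    """
    heap = [(0.0, n) for n in nodes]  # (current load, node name)
    heapq.heapify(heap)
    placement = {n: [] for n in nodes}
    # Sort by descending cost; ties are broken by task name for a stable result.
    for task, cost in sorted(task_costs.items(), key=lambda kv: (-kv[1], kv[0])):
        load, node = heapq.heappop(heap)
        placement[node].append(task)
        heapq.heappush(heap, (load + cost, node))
    return placement
```

The same rule could in principle be applied at both levels, with clusters as the "nodes" at the global level and individual machines at the local level. A dynamic balancer would additionally rerun the placement as monitored loads change and migrate tasks accordingly.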
16
Integration with Globus
[Diagram labels: Global environment, local level, Globus, local level]