Resource Manager for Grid with global job queue and with planning based on local schedules V.N.Kovalenko, E.I.Kovalenko, D.A.Koryagin, E.Z.Ljubimskii,


1 Resource Manager for Grid with global job queue and with planning based on local schedules
V.N.Kovalenko, E.I.Kovalenko, D.A.Koryagin, E.Z.Ljubimskii, A.V.Orlov, E.V.Huhlaev {kvn,kei,koryagin,ljubimsk,ao,huh}@keldysh.ru
Keldysh Institute of Applied Mathematics, Russian Academy of Sciences

2 Job submission in the Globus system. Job submission by means of a Broker.

3 Resource Brokers
- GRID Resource Broker (GRB) – HPC lab, University of Lecce, Italy and CACR, California Institute of Technology. http://sara.unile.It/grb/
- EZ-Grid – Department of Computer Science, University of Houston. http://www.cs.uh.edu/~ezgrid/
- MetaDispatcher – Keldysh Institute of Applied Mathematics, Moscow

4 Job submission in the Globus system. Job submission by means of a Broker.

5 Architecture of MetaDispatcher

6 Problem of scheduling
The scheduling problem is solved over two sets: 1) the set of jobs and 2) the set of computing elements.
Scheduling results:
- The dispatch time for each job
- The place where the job should be directed and executed

7 Two management levels, local and global, each having its own objects: job, queue, and management system (Local Resource Monitor (LRM) at the local level, MetaDispatcher at the global level).
[Diagram: global level with MetaDispatcher, its config. file, and the global queue of jobs; local level with LRMs and their local queues.]

8 Question 1: In What Order Should the Global Jobs Be Served?
- The order in which the scheduler serves the job queue should differ from FIFO.
- The user should have management facilities for placing a job at any position in the global queue.
To achieve that:
- A limited budget is allocated to each user.
- Within the budget limits, the user prices his jobs.
- A function GP evaluates the global priority of the job: GP = GP(price, required resources, run time)
[Diagram: a new job entering the global job queue.]
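
The slide does not define GP or the budget accounting, so the following is only a minimal sketch under stated assumptions: GP is taken to be the price offered per requested CPU-hour, "required resources" is reduced to a CPU count, and the names `Job`, `global_priority` and `GlobalQueue` are hypothetical. It shows how pricing within a budget can replace FIFO order in the global queue.

```python
import heapq
from dataclasses import dataclass
from itertools import count


@dataclass
class Job:
    name: str
    owner: str
    price: float     # amount the user assigns to the job from the budget
    cpus: int        # required resources, simplified to a CPU count
    run_time: float  # requested run time, in hours


def global_priority(job: Job) -> float:
    """Hypothetical GP: price offered per requested CPU-hour."""
    return job.price / (job.cpus * job.run_time)


class GlobalQueue:
    """Global queue served in decreasing GP order instead of FIFO."""

    def __init__(self, budgets: dict[str, float]):
        self.budgets = dict(budgets)  # remaining budget per user
        self._heap = []
        self._seq = count()           # FIFO tie-breaker for equal priorities

    def submit(self, job: Job) -> None:
        # A user may price a job only within the remaining budget.
        if job.price > self.budgets.get(job.owner, 0.0):
            raise ValueError(f"{job.owner} cannot afford {job.name}")
        self.budgets[job.owner] -= job.price
        # heapq is a min-heap, so the priority is negated.
        heapq.heappush(self._heap, (-global_priority(job), next(self._seq), job))

    def pop(self) -> Job:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


queue = GlobalQueue({"alice": 100.0, "bob": 50.0})
queue.submit(Job("a1", "alice", price=10.0, cpus=4, run_time=2.0))  # GP = 1.25
queue.submit(Job("b1", "bob", price=40.0, cpus=4, run_time=2.0))    # GP = 5.0
queue.submit(Job("a2", "alice", price=30.0, cpus=2, run_time=1.0))  # GP = 15.0
served = [queue.pop().name for _ in range(len(queue))]
print(served)  # ['a2', 'b1', 'a1']
```

The job submitted last is served first because its owner priced it highest relative to what it requests; any other monotone GP could be substituted without changing the queue logic.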

9 Question 2: When to Forward a Job to a Target Computing Element?
If the destination of a job is determined at the moment it enters the global queue, and the job is immediately routed to a local queue, it may be delayed there by the arrival of local jobs. At the same time, resources of other computing elements may become free and sit idle.
The conclusion: it is more reasonable to keep global jobs in the global queue as long as possible, ideally up to the moment of start.
[Diagram: a new job and the jobs already waiting in the queues.]
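
The argument of this slide can be illustrated with a toy comparison. The computing element names, the times and both helper functions below are hypothetical; the sketch only assumes that each computing element can report the time at which it could start the job.

```python
def early_binding(estimates_at_submit: dict[str, float],
                  actual_free: dict[str, float]) -> tuple[str, float]:
    """Pick the target when the job enters the global queue.

    The job is then tied to that computing element and starts only when
    that element actually becomes free.
    """
    target = min(estimates_at_submit, key=estimates_at_submit.get)
    return target, actual_free[target]


def late_binding(actual_free: dict[str, float]) -> tuple[str, float]:
    """Keep the job in the global queue and pick the target at start time."""
    target = min(actual_free, key=actual_free.get)
    return target, actual_free[target]


# Estimated start times when the global job is submitted.
estimates = {"ce1": 2.0, "ce2": 4.0}
# A local job later arrives at ce1 and pushes its free time back.
actual = {"ce1": 9.0, "ce2": 4.0}

print(early_binding(estimates, actual))  # ('ce1', 9.0)
print(late_binding(actual))              # ('ce2', 4.0)
```

With early binding the job waits behind the local arrival on ce1 while ce2 sits idle; with late binding it starts on ce2 as soon as that element is free.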

10 Question 3: To Which Computing Elements Should a Job Be Passed?
The scheduling model of a computing installation: a set of resources.
Resource description:
- Static attributes: OS type, CPU time, memory volume
- Dynamic attributes: free/busy, resource amount
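
One way to read this resource model is as a two-step filter: static attributes decide whether a job could ever run on a resource, and dynamic attributes decide whether it can run there now. The field names and the `Request` type below are illustrative assumptions, not the attribute set used by MetaDispatcher.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    os_type: str    # static attribute
    memory_mb: int  # static attribute
    cpus_total: int # static attribute (the resource amount when fully free)
    cpus_free: int  # dynamic attribute (changes as jobs start and finish)


@dataclass
class Request:
    os_type: str
    memory_mb: int
    cpus: int


def statically_suitable(res: Resource, req: Request) -> bool:
    """True if the job could run on this resource once enough of it is free."""
    return (res.os_type == req.os_type
            and res.memory_mb >= req.memory_mb
            and res.cpus_total >= req.cpus)


def candidates(resources: list[Resource], req: Request) -> list[Resource]:
    return [r for r in resources if statically_suitable(r, req)]


def free_now(resources: list[Resource], req: Request) -> list[Resource]:
    return [r for r in candidates(resources, req) if r.cpus_free >= req.cpus]


pool = [
    Resource("ce1", "linux", memory_mb=2048, cpus_total=16, cpus_free=0),
    Resource("ce2", "linux", memory_mb=1024, cpus_total=8, cpus_free=8),
    Resource("ce3", "solaris", memory_mb=4096, cpus_total=32, cpus_free=32),
]
req = Request(os_type="linux", memory_mb=1024, cpus=8)
print([r.name for r in candidates(pool, req)])  # ['ce1', 'ce2']
print([r.name for r in free_now(pool, req)])    # ['ce2']
```

Here ce1 is a valid candidate that is currently busy, which is the case the next slide addresses with release times.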

11 Resource Release Time
Busy resources have an additional attribute: the release time, estimated from the request of the running job. Knowing the release time, the scheduler can plan the future usage of a busy resource.
However, the scheduler must have a guarantee that the planned global job will really start and will not be left waiting in a local queue.
[Diagram: resource vs. time, showing a running job.]
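
A minimal sketch of planning with release times, assuming a single pool of identical CPUs and that a running job releases its CPUs exactly at start time plus requested run time. `RunningJob`, `release_time` and `earliest_start` are hypothetical names. The estimate ignores local jobs that may arrive later, which is why the slide asks for an additional guarantee.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunningJob:
    start: float
    requested_run_time: float
    cpus: int


def release_time(job: RunningJob) -> float:
    """Release time estimated from the request of the running job."""
    return job.start + job.requested_run_time


def earliest_start(total_cpus: int, running: list[RunningJob],
                   needed: int, now: float) -> Optional[float]:
    """Earliest time at which `needed` CPUs are expected to be free."""
    if needed > total_cpus:
        return None  # the job can never fit on this computing element
    free = total_cpus - sum(j.cpus for j in running)
    if free >= needed:
        return now
    # Walk through the running jobs in order of their estimated release.
    for job in sorted(running, key=release_time):
        free += job.cpus
        if free >= needed:
            return max(now, release_time(job))
    return None


running = [RunningJob(start=0.0, requested_run_time=10.0, cpus=4),
           RunningJob(start=0.0, requested_run_time=5.0, cpus=2)]
print(earliest_start(8, running, needed=2, now=1.0))  # 1.0
print(earliest_start(8, running, needed=4, now=1.0))  # 5.0
print(earliest_start(8, running, needed=8, now=1.0))  # 10.0
```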

12 Question 4: How Should the Interaction of the Global Scheduler and the Local Resource Monitor Be Organized?
Autonomy of a computing element: each computing element of the Grid belongs to a certain owner, who can restrict access for external jobs completely or partly.
If two jobs, local and global, ask for free resources, which one should be preferred?
If global and local jobs make demands for the same resources, their priorities are compared. For this purpose each computing element i defines a function LPi() that calculates the local priority of a global job. This function depends on the job's price, consumable resources and run time: LPi = LPi(price, consumable resources, run time)

13 Question 4: How Should the Interaction of the Global Scheduler and the Local Resource Monitor Be Organized?
The global scheduler should distribute its jobs so that global jobs do not delay the start of any more "expensive" local jobs.
[Diagram: resource vs. time with a running job; job G in the global queue with priority P_G = LP(job G); job L in the local queue with priority P_L; the case P_G < P_L.]
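
The rule on this slide can be sketched as an admission test. The sketch assumes one pool of identical CPUs, treats running jobs as slots with infinite priority (they are never displaced), and lets a global job ignore local jobs whose priority is not higher than its own P_G; how ties are handled is an assumption. `Slot`, `peak_usage` and `may_start` are hypothetical names.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    priority: float  # local priority; math.inf for an already running job
    cpus: int
    start: float
    end: float


def peak_usage(slots: list[Slot], start: float, end: float) -> int:
    """Maximum number of CPUs occupied at any instant in [start, end)."""
    # Usage can only rise at `start` itself or where a slot begins.
    points = {start} | {s.start for s in slots if start < s.start < end}
    return max(sum(s.cpus for s in slots if s.start <= t < s.end)
               for t in points)


def may_start(total_cpus: int, schedule: list[Slot], p_g: float,
              cpus: int, start: float, end: float) -> bool:
    """True if a global job with local priority p_g fits into [start, end)
    without delaying any local job that has a higher priority."""
    protected = [s for s in schedule if s.priority > p_g]
    return peak_usage(protected, start, end) + cpus <= total_cpus


schedule = [Slot(math.inf, cpus=2, start=0.0, end=5.0),  # running job
            Slot(10.0, cpus=4, start=5.0, end=8.0)]      # planned local job

print(may_start(4, schedule, p_g=5.0, cpus=2, start=0.0, end=5.0))   # True
print(may_start(4, schedule, p_g=5.0, cpus=2, start=0.0, end=6.0))   # False
print(may_start(4, schedule, p_g=20.0, cpus=2, start=0.0, end=6.0))  # True
```

The second call is rejected because the global job would still hold CPUs when the more expensive local job is due to start; the third is accepted because P_G exceeds P_L.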

14 Schedule
The local schedule is the plan of resource occupation by local jobs for some period of time in the future.
Local schedule: for each local job, {priority, assigned resources, occupation and release time}
[Diagram: resource vs. future time, showing a running job followed by jobs labeled priority1, priority2, priority4, priority3.]
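
As an illustration of the record {priority, assigned resources, occupation and release time}, the sketch below builds a local schedule from a local queue. The placement policy is an assumption made for the example (strict priority order on one pool of identical CPUs, with no job starting earlier than a higher-priority one); the actual strategy is whatever the local resource monitor is configured to use. `Entry`, `fits`, `find_start` and `build_schedule` are hypothetical names.

```python
import math
from dataclasses import dataclass


@dataclass
class Entry:
    job: str
    priority: float
    cpus: int     # assigned resources
    start: float  # occupation time
    end: float    # release time


def fits(total: int, schedule: list[Entry], cpus: int,
         start: float, end: float) -> bool:
    """True if `cpus` CPUs are free during the whole interval [start, end)."""
    points = {start} | {e.start for e in schedule if start < e.start < end}
    peak = max(sum(e.cpus for e in schedule if e.start <= t < e.end)
               for t in points)
    return peak + cpus <= total


def find_start(total: int, schedule: list[Entry], cpus: int,
               run_time: float, earliest: float) -> float:
    if cpus > total:
        raise ValueError("job requests more CPUs than the installation has")
    # Capacity only frees up at `earliest` or when some entry ends.
    times = sorted({earliest} | {e.end for e in schedule if e.end > earliest})
    return next(t for t in times if fits(total, schedule, cpus, t, t + run_time))


def build_schedule(total: int, running: list[Entry],
                   queue: list[tuple[str, float, int, float]],
                   now: float = 0.0) -> list[Entry]:
    """queue items are (job, priority, cpus, requested run time)."""
    schedule = list(running)
    earliest = now
    for job, prio, cpus, run_time in sorted(queue, key=lambda q: -q[1]):
        start = find_start(total, schedule, cpus, run_time, earliest)
        schedule.append(Entry(job, prio, cpus, start, start + run_time))
        earliest = start  # lower-priority jobs may not overtake this one
    return schedule


running = [Entry("r", math.inf, cpus=2, start=0.0, end=5.0)]
queue = [("a", 1.0, 2, 3.0), ("b", 3.0, 4, 2.0), ("c", 2.0, 2, 4.0)]
plan = build_schedule(4, running, queue)
print([(e.job, e.start, e.end) for e in plan])
# [('r', 0.0, 5.0), ('b', 5.0, 7.0), ('c', 7.0, 11.0), ('a', 7.0, 10.0)]
```

Under this policy job a waits until 7.0 even though two CPUs are idle from 0.0 to 5.0; filling such gaps is what a backfill algorithm would address.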

15 The local schedule is drawn up by special agents of the global scheduler. These agents, running on each computing installation, build the schedule in exact conformity with the scheduling strategy and configuration parameters of the local monitor. The current state of all local schedules is delivered to the information base of the global scheduler, which thus has information about the usage plan of all resources of the virtual organization. On the basis of this aggregate schedule, the scheduler can plan the allocation of global jobs to resources.

16 Program architecture of scheduling
[Diagram: the global queue of jobs, the Scheduler and its Data Base; on each computing element, an Agent alongside the LRM and its Queue.]

17
- The global scheduler, implementing a certain scheduling strategy, makes up the global schedule.
- The information base resides next to the scheduler and stores the aggregate schedule. For data management, a distributed system like Spitfire of the DataGrid project, with a relational database as its core, is considered.
- The local agents of the scheduler work on each computing element. Interacting with the local resource monitor, the agent builds the local schedule of its computing element and transfers updates to the global scheduler. The proposed implementation is based on the Maui scheduler.

18 Future directions:
- Backfill algorithm implementation at the global level to avoid blocking of jobs.
- Advance resource reservation for distributed multiprocessor jobs.
- Economic model of the virtual organization as applied to scheduling.

