Job Submission in the DataGrid Workload Management System

Presentation transcript:

1 Job Submission in the DataGrid Workload Management System
Massimo Sgaravatto, INFN Padova - DataGrid WP1

2 Job submission example
[Diagram: the user submits from the UI to the Network Server on the RB node; on that node the Workload Manager, Job Controller (CondorG), Information Service and RLS cooperate; CE and SE characteristics & status are published in the Information Service; the job ultimately runs on a Computing Element, with data on a Storage Element.]

3 Job submission
edg-job-submit myjob.jdl

The Job Description Language (JDL) is used to specify job characteristics and requirements. The UI allows users to access the functionalities of the WMS. Job status: submitted.

myjob.jdl:
JobType = "Normal";
Executable = "$(CMS)/exe/sum.exe";
InputSandbox = {"/home/user/WP1testC", "/home/file*", "/home/user/DATA/*"};
OutputSandbox = {"sim.err", "test.out", "sim.log"};
Requirements = other.GlueHostOperatingSystemName == "linux" &&
               other.GlueHostOperatingSystemRelease == "Red Hat 6.2" &&
               other.GlueCEPolicyMaxWallClockTime > 10000;
Rank = other.GlueCEStateFreeCPUs;

4 Job submission
NS: network daemon on the RB node, responsible for accepting incoming requests. The Input Sandbox files are transferred from the UI to the RB storage. Job status: submitted, waiting.

5 Job submission
WM: responsible for taking the appropriate actions to satisfy the request. Job status: submitted, waiting.

6 Job submission
The Workload Manager asks the Match-Maker/Broker: where must this job be executed? Job status: submitted, waiting.

7 Job submission
Matchmaker: responsible for finding the "best" CE where to submit a job. Job status: submitted, waiting.

8 Job submission
The Matchmaker queries the RLS (where are the needed data, i.e. on which SEs?) and the Information Service (what is the status of the Grid?). Job status: submitted, waiting.

9 Job submission
The Match-Maker/Broker returns the CE choice to the Workload Manager. Job status: submitted, waiting.

10 Job submission
JA: responsible for the final "touches" to the job before performing submission (e.g. creation of the wrapper script). Job status: submitted, waiting.

11 Job submission
JC: responsible for the actual job management operations (done via CondorG). Job status: submitted, waiting, ready.

12 Job submission
The job and its Input Sandbox files are transferred to the Computing Element. Job status: submitted, waiting, ready, scheduled.

13 Job submission
The job runs on the Computing Element, reading its Input Sandbox and performing "Grid enabled" data transfers/accesses to the Storage Element. Job status: submitted, waiting, ready, scheduled, running.

14 Job submission
The Output Sandbox files are transferred back from the Computing Element to the RB storage. Job status: submitted, waiting, ready, scheduled, running, done.

15 Job submission
edg-job-get-output <dg-job-id>
The Output Sandbox is transferred from the RB storage to the UI. Job status: submitted, waiting, ready, scheduled, running, done.

16 Job submission
The Output Sandbox files reach the user; the job is then cleared. Job status: submitted, waiting, ready, scheduled, running, done, cleared.

17 Job monitoring
edg-job-status <dg-job-id>
edg-job-get-logging-info <dg-job-id>

LB (Logging & Bookkeeping): receives and stores job events, and processes them to obtain the corresponding job status.
LM (Log Monitor): parses the CondorG log file (where CondorG logs info about the jobs it manages) and notifies the LB.
[Diagram: the RB-node services (Network Server, Workload Manager, Job Controller/CondorG, Log Monitor) log job events to the LB, which the UI queries; the Computing Element also reports through this flow.]
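To make the LM's input concrete: CondorG records job events in a user log in the classic Condor format, roughly as in the illustrative excerpt below. The job ids, hosts and timestamps are made up; the three-digit codes (000 submitted, 001 executing, 005 terminated) are the standard Condor user-log event codes.

000 (101.000.000) 03/15 11:22:33 Job submitted from host: <192.0.2.1:32772>
...
001 (101.000.000) 03/15 11:25:10 Job executing on host: <192.0.2.7:32784>
...
005 (101.000.000) 03/15 12:40:02 Job terminated.
...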

18 Job Submission
Job submission to the Globus resource (CE) is done via CondorG, which relies on the Globus GRAM mechanisms.
Why CondorG?
- Reliable job submission system:
  - A two-phase commit protocol is used by CondorG for job management operations.
  - Persistency: CondorG keeps a persistent (crash-proof) queue of jobs.
  - Logging system: CondorG logs all the relevant events concerning the managed jobs (e.g. job started its execution, job execution completed). The log file is then parsed by the Log Monitor, which triggers appropriate actions on certain events.
- Need for interoperability with the US Grid projects, of which CondorG is an important component (CondorG is included in the VDT).
- Increased openness of the CondorG framework: Condor is going open source.
(A sketch of a classic CondorG submit description follows.)
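For concreteness, a CondorG submission to a Globus gatekeeper was described with a Condor submit file along these lines. This is a minimal sketch in the classic "globus" universe syntax; the gatekeeper contact string and the file names are made-up examples:

# Hypothetical CondorG submit description for a Globus CE.
universe        = globus
globusscheduler = ce.example.org/jobmanager-pbs   # made-up gatekeeper contact
executable      = sum.exe
output          = test.out
error           = sim.err
log             = job.log                         # the user log parsed by the Log Monitor
queue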

19 Job submission
A job wrapper script (created by the Job Adapter) is submitted to the CE. It performs:
- Download of the input sandbox files from the RB node
- Execution of the user's job
- Upload of the output sandbox files produced by the job to the RB node
- Logging of some LB events (job running, job done)
- Setting of some environment variables
(A shell sketch of such a wrapper follows.)
Sandbox transfers:
- Sandbox files are transferred between the RB node and the WN via gridftp (issued from the WN), so outbound IP connectivity is needed for the WN.
- The new GT2 (exploited in the new CondorG) offers the possibility to stage files at submission time and then retrieve the output files (in the past this was possible only for the executable, stdin, stdout and stderr). It is still to be evaluated whether this is suitable for sandbox transfers.
  - Such transfers happen between the RB node and the gatekeeper node, so no outbound IP connectivity would be needed on the WN for sandbox transfers.
  - Outbound IP connectivity is still needed for interactive jobs and for accessing remote SEs.
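A minimal shell sketch of the wrapper steps listed above, assuming globus-url-copy is used for the gridftp transfers; the RB host name, the sandbox paths and the LB-logging helper are hypothetical:

#!/bin/sh
# Hypothetical job wrapper, mirroring the steps above.
# rb.example.org and the /sandbox/... paths are made-up placeholders.
export EDG_WL_JOBID="$1"   # assumption: the job id is passed in by the submission layer

# 1. Download the input sandbox from the RB node via gridftp.
globus-url-copy gsiftp://rb.example.org/sandbox/$EDG_WL_JOBID/input/WP1testC \
    file://$PWD/WP1testC

# 2. Log the "job running" LB event (edg-wl-logev is a hypothetical helper).
# edg-wl-logev "$EDG_WL_JOBID" Running

# 3. Execute the user's job (hypothetical binary name).
./sum.exe > test.out 2> sim.err

# 4. Upload the produced output sandbox back to the RB node.
globus-url-copy file://$PWD/test.out \
    gsiftp://rb.example.org/sandbox/$EDG_WL_JOBID/output/test.out

# 5. Log the "job done" LB event.
# edg-wl-logev "$EDG_WL_JOBID" Done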

20 Job types
Possibility to submit:
- Normal (sequential) jobs: the only ones supported in release 1
- Interactive jobs: see the next slides
- Checkpointable jobs
- Parallel (MPI) jobs
- Partitionable jobs (not in release 2.0):
  - The original job is partitioned into sub-jobs which can be executed in parallel.
  - At the end, each sub-job must save a final state, which is then retrieved by a job aggregator responsible for collecting the results of the sub-jobs and producing the overall output.
  - Uses the job checkpointing and DAGMan mechanisms.

21 Interactive jobs
edg-job-submit jobint.jdl

jobint.jdl:
[
JobType = "Interactive";
Executable = "int-prg.exe";
StdOutput = "Outfile";
InputSandbox = "/home/user/int-prg.exe";
OutputSandbox = "Outfile";
Requirements = other.GlueHostOperatingSystemName == "linux" &&
               other.GlueHostOperatingSystemRelease == "RH 6.2";
]

[Diagram: the usual submission flow (UI, Network Server, Workload Manager, Job Controller/CondorG, Gatekeeper, LRMS, WNs on the Computing Element), plus new flows for interactivity: a Job Shadow runs on the submission machine, its shadow host and shadow port are passed via the RSL and the LB, and a Console Agent / Pillow Process on the WN connects back to it.]
The job's standard streams (stdin, stdout and stderr) are bridged onto the UI machine by integrating the Condor Bypass software.

22 Job checkpointing
Checkpointing: saving the job state from time to time.
- Useful to prevent data loss due to unexpected failures, and to allow job preemption.
Approach: provide users with a "trivial" logical job checkpointing service:
- The user can save from time to time the state of the job (defined by the application).
- A job can be restarted from an intermediate (i.e. previously saved) job state.
- Also exploited in the job partitioning framework.
This is different from "classical" checkpointing (i.e. saving all the information related to a process: the process's data and stack segments, open files, etc.):
- Classical checkpointing is very difficult to apply (e.g. problems saving the state of open network connections).
- It is not necessary for the DataGrid reference applications, which are sequential processing cases where the state of the application is represented by a small amount of information defined by the application itself.

23 Job checkpointing example
Example of an application (e.g. a HEP Monte Carlo simulation):

int main() {
  for (int i = event; i < EVMAX; i++) {
    /* process event i */
  }
  ...
  exit(0);
}

24 Job checkpointing example
User code must be easily instrumented in order to exploit the checkpointing framework:

#include "checkpointing.h"

int main() {
  // Retrieve the last saved state of this job.
  JobState state(JobState::job);
  event = state.getIntValue("first_event");
  PFN_of_file_on_SE = state.getStringValue("filename");
  ...
  var_n = state.getBoolValue("var_n");
  /* copy file_on_SE locally */
  for (int i = event; i < EVMAX; i++) {
    /* process event i */
    ...
    // Record the point reached and commit the state.
    state.saveValue("first_event", i + 1);
    /* save intermediate file on a SE */
    state.saveValue("filename", PFN_of_file_on_SE);
    state.saveValue("var_n", value_n);
    state.saveState();
  }
  exit(0);
}

25 Job checkpointing example
(Same code as slide 24.) The user defines what constitutes a state:
- Defined as <var, value> pairs.
- Must be "enough" to restart a computation from a previously saved state.

26 Job checkpointing example
(Same code as slide 24.) The user can save from time to time the state of the job, via the saveValue()/saveState() calls inside the loop.

27 Job checkpointing example
(Same code as slide 24.) Retrieval of the last saved state at startup, via the JobState constructor and the getIntValue()/getStringValue()/getBoolValue() calls: the job can restart from that point.

28 Job checkpointing scenarios
Scenario 1:
- A job is submitted to a CE; while it runs, it saves its state from time to time.
- The job fails due to a Grid problem (e.g. a CE problem).
- The job is resubmitted by the WMS, possibly to a different CE.
- The job restarts its computation from the last saved state, so there is no need to restart from the beginning: the computation done up to that moment is not lost.
Scenario 2:
- The job fails, but the failure is not detected by the Grid middleware.
- The user can retrieve a saved state for the job (typically the last one): edg-job-get-chkpt ...
- The user resubmits the job, specifying that it must start from a specific (the retrieved) initial state: edg-job-submit -chkpt ...
(A usage sketch follows.)
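A minimal usage sketch of scenario 2; only the command names and the -chkpt option appear on the slide, while the -o option, the state file name and the job id placeholder are assumptions:

# Retrieve the last saved state of the (failed) job.
edg-job-get-chkpt -o state.chkpt <dg-job-id>

# Resubmit, requesting that the job start from the retrieved state.
edg-job-submit -chkpt state.chkpt myjob.jdl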

29 Parallel jobs
Possibility to submit MPI jobs:
- The MPICH implementation is supported.
- Only parallel jobs inside a single CE can be submitted.
Submission of parallel jobs is very similar to that of normal jobs: it is just necessary to specify the number (n) of requested CPUs in the JDL (see the sketch below).
Matchmaking:
- The CE chosen by the RB has to have the MPICH software installed and at least n total CPUs.
- If there are two or more CEs satisfying all the requirements, the one with the highest number of free CPUs is chosen.
Implementation:
- The NodeNumber RSL attribute is exploited to request the needed nodes.
- The job wrapper performs: mpirun -np #NodeNumber executable
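A minimal JDL sketch for such a parallel job, assuming MPI jobs are flagged with JobType = "MPICH" and the CPU count is given by a NodeNumber attribute; the executable name, sandbox contents and Rank expression shown are illustrative:

[
JobType    = "MPICH";       // assumption: JDL job type for MPICH jobs
NodeNumber = 4;             // the number n of requested CPUs
Executable = "mpi-prg.exe"; // hypothetical MPI binary
InputSandbox  = {"/home/user/mpi-prg.exe"};
OutputSandbox = {"mpi.out"};
StdOutput  = "mpi.out";
Rank = other.GlueCEStateFreeCPUs;  // prefer the CE with the most free CPUs
]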

30 Conclusions
Job submission in the WMS relies on the CondorG system, which in turn relies on the Globus GRAM mechanisms: a reliable job submission environment.
In the new release, besides "normal" jobs it is also possible to submit:
- Interactive jobs
- Checkpointable jobs
- Parallel (MPI) jobs
More info on the WP1 Web Site, in particular the WMS user and admin guides and the JDL docs.

