Advanced services in gLite
Gergely Sipos and Peter Kacsuk, MTA SZTAKI

Grid Computing School, 10-12 July 2006, Rio de Janeiro

Outline
- Advanced job types
  - Interactive jobs
  - Checkpointing jobs
  - MPI jobs
- Workflows
  - Condor DAGMan
  - gLite workflow

Normal job
We have talked about normal jobs so far:
- a sequential program
- takes input
- performs computation
- writes output
- the user gets the output after the execution
Other options:
- Interactive jobs
- Logical checkpointing jobs
- MPI jobs
- Workflows

Interactive job (I)
- An interactive job is a job whose standard streams are forwarded to the submitting client.
- The user has to set the JDL JobType attribute to "interactive".
- When an interactive job is submitted, the edg-job-submit command
  - starts a Grid console shadow process in the background that listens on a port assigned by the operating system (the port can be forced through the ListenerPort attribute in the JDL);
  - opens a new window to which the incoming job streams are forwarded.
- The DISPLAY environment variable has to be set correctly, because an X window is opened.
- The user can specify the --nogui option, which makes the command provide a simple, non-graphical interaction with the running job.
- It is not necessary to specify the OutputSandbox attribute in the JDL, because the output is sent to the interactive window.

Interactive jobs (II)
- Specified by setting JobType = "Interactive" in the JDL.
- When an interactive job is executed, a window for the stdin, stdout and stderr streams is opened:
  - stdin can be sent to the job;
  - the stderr and stdout of the job are available while it is running.
- A window for the standard streams of a previously submitted interactive job can be started with the edg-job-attach command.

Logical checkpointing job
- A checkpointing job is a job that can be decomposed into several steps.
- In every step the job state can be saved in the LB and retrieved later in case of failures.
- The job state is a set of name-value pairs defined by the user.
- The job can start running from a previously saved state instead of from the beginning again.
- The user has to set the JDL JobType attribute to "checkpointable".

Logical checkpointing job (cont'd)
- When a checkpointable job is submitted and starts from the beginning, the user simply runs the edg-job-submit command. The job phases are described by the JobSteps attribute, which holds either
  - the number of steps, e.g. JobSteps = 2; or
  - a list of labels, e.g. JobSteps = {"january", "february"};
- The latest job state can be obtained with the edg-job-get-chkpt command.
- A specific job state can be obtained with the edg-job-get-chkpt --cs command.
- When a checkpointable job has to start from an intermediate job state, the user runs the edg-job-submit command with the --chkpt option, passing a valid job state file, that is, a file in which the state of a previously submitted job was saved.

Job checkpointing example
Example of an application (e.g. a HEP Monte Carlo simulation):

    int main ()
    {
        …
        for (int i = event; i < EVMAX; i++) {
            ;   // per-event work (elided in the source)
        }
        ...
        exit(0);
    }

Job checkpointing example (instrumented)

    #include "checkpointing.h"

    int main ()
    {
        // retrieve the last saved state
        JobState state(JobState::job);
        event = state.getIntValue("first_event");
        PFN_of_file_on_SE = state.getStringValue("filename");
        ….
        var_n = state.getBoolValue("var_n");
        ;
        …
        for (int i = event; i < EVMAX; i++) {
            ;
            ...
            // record the values needed to resume after this event
            state.saveValue("first_event", i+1);
            ;
            state.saveValue("filename", PFN_of_file_on_SE);
            ...
            state.saveValue("var_n", value_n);
            // persist the state
            state.saveState();
        }
        …
        exit(0);
    }

Notes on the example:
- User code must be easily instrumented in order to exploit the checkpointing framework.
- The user defines what a state is. It is defined as name-value pairs and must be "enough" to restart a computation from a previously saved state.
- The user can save the state of the job from time to time (the saveValue and saveState calls).
- The last saved state is retrieved at start-up (the get...Value calls), and the job can restart from that point.
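The save-and-resume pattern of the example can be tried without a grid. The following is a minimal sketch, not the gLite API: `LocalJobState` and `runEvents` are hypothetical names, the state is kept in a local text file rather than in the LB, and `getIntValue` takes an extra fallback argument that the real `JobState` class shown above does not have.

```cpp
#include <cassert>
#include <fstream>
#include <map>
#include <string>

// Hypothetical stand-in for the JobState class of the slides: a set of
// name-value pairs persisted to a local file instead of the LB.
class LocalJobState {
public:
    explicit LocalJobState(const std::string& path) : path_(path) {
        std::ifstream in(path_);
        std::string name, value;
        // One "name value" pair per line; a missing or empty file means a fresh start.
        while (in >> name >> value) {
            pairs_[name] = value;
        }
    }

    // Returns the saved integer, or `fallback` if the name was never saved.
    int getIntValue(const std::string& name, int fallback) const {
        auto it = pairs_.find(name);
        return it == pairs_.end() ? fallback : std::stoi(it->second);
    }

    void saveValue(const std::string& name, int value) {
        pairs_[name] = std::to_string(value);
    }

    // Writes all pairs back to the file; this is the checkpoint.
    void saveState() const {
        std::ofstream out(path_, std::ios::trunc);
        for (const auto& pair : pairs_) {
            out << pair.first << ' ' << pair.second << '\n';
        }
    }

private:
    std::string path_;
    std::map<std::string, std::string> pairs_;
};

// Processes events from the saved "first_event" up to (not including)
// stop_before, checkpointing after each event. Returns how many events
// were processed in this run.
int runEvents(LocalJobState& state, int stop_before) {
    assert(stop_before >= 0);
    int processed = 0;
    for (int i = state.getIntValue("first_event", 0); i < stop_before; ++i) {
        // ... per-event work would go here ...
        ++processed;
        state.saveValue("first_event", i + 1);
        state.saveState();
    }
    return processed;
}
```

If a run is interrupted after event 2, constructing a new `LocalJobState` on the same file and calling `runEvents` again continues at event 3 rather than at event 0.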

Other (most relevant) UI commands
- edg-job-attach
  - Starts an interactive session for a previously submitted interactive job
  - Starts a listener process on the UI machine
- edg-job-get-chkpt
  - Allows the user to retrieve one or more checkpoint states saved by a previously submitted job

MPI job
- There are many libraries supporting parallel jobs, but we decided to support MPICH.
- An MPI job runs in parallel on several processors.
- The user has to set the JDL JobType attribute to "MPICH" and specify the NodeNumber attribute, which is the required number of CPUs.
- When an MPI job is submitted, the UI adds
  - to the Requirements attribute:
    - Member("MpiCH", other.GlueHostApplicationSoftwareRunTimeEnvironment) (the MPICH runtime environment must be installed on the CE)
    - other.GlueCEInfoTotalCPUs >= NodeNumber (the number of CPUs must be at least equal to the required number of nodes)
  - to the Rank attribute:
    - other.GlueCEStateFreeCPUs (the CE with the largest number of free CPUs is chosen)
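The effect of the added Requirements and Rank expressions can be illustrated with a small selection routine. This is only a sketch of the matching logic described above, not the WMS matchmaker: the `ComputingElement` struct, `selectCE` and the CE names used with it are hypothetical, and the runtime tag is passed in as a parameter instead of being hard-coded.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified view of the attributes a CE publishes.
struct ComputingElement {
    std::string id;
    std::vector<std::string> runtimeEnv;  // GlueHostApplicationSoftwareRunTimeEnvironment
    int totalCPUs;                        // GlueCEInfoTotalCPUs
    int freeCPUs;                         // GlueCEStateFreeCPUs
};

// Requirements: the runtime tag is published and there are enough CPUs.
bool satisfiesRequirements(const ComputingElement& ce,
                           const std::string& runtimeTag,
                           int nodeNumber) {
    bool hasTag = std::find(ce.runtimeEnv.begin(), ce.runtimeEnv.end(),
                            runtimeTag) != ce.runtimeEnv.end();
    return hasTag && ce.totalCPUs >= nodeNumber;
}

// Rank: among the matching CEs, pick the one with the most free CPUs.
// Returns an empty string if no CE matches.
std::string selectCE(const std::vector<ComputingElement>& ces,
                     const std::string& runtimeTag,
                     int nodeNumber) {
    assert(nodeNumber > 0);
    const ComputingElement* best = nullptr;
    for (const auto& ce : ces) {
        if (!satisfiesRequirements(ce, runtimeTag, nodeNumber)) {
            continue;
        }
        if (best == nullptr || ce.freeCPUs > best->freeCPUs) {
            best = &ce;
        }
    }
    return best != nullptr ? best->id : std::string();
}
```

Note that a CE with many free CPUs is still skipped if it does not publish the runtime tag, because Requirements are evaluated before Rank.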

MPI job: JDL example

    [
      JobType = "MPICH";
      NodeNumber = 2;
      Executable = "MPItest.sh";
      Arguments = "cpi 2";
      InputSandbox = {"MPItest.sh", "cpi"};
      OutputSandbox = "executable.out";
      Requirements = other.GlueCEInfoLRMSType == "PBS" || other.GlueCEInfoLRMSType == "LSF";
    ]

- The NodeNumber entry is the number of CPUs (parallel processes) requested for the MPI job.
- The MPItest.sh script only works if PBS or LSF is the local job manager.

MPI job: MPItest.sh
Snapshot of MPItest.sh:

    # $HOST_NODEFILE contains names of hosts allocated for MPI job
    for i in `cat $HOST_NODEFILE` ; do
      echo "Mirroring via SSH to $i"
      # creates the working directories on all the nodes allocated for parallel execution
      ssh $i mkdir -p `pwd`
      # copies the needed files on all the nodes allocated for parallel execution
      /usr/bin/scp -rp ./* $i:`pwd`
      # sets the permissions of the files
      ssh $i chmod 755 `pwd`/$EXE
      ssh $i ls -alR `pwd`
    done
    # execute the parallel job with mpirun
    mpirun -np $CPU_NEEDED -machinefile $HOST_NODEFILE `pwd`/$EXE > executable.out

Important: you need shared keys between worker nodes
- Avoids sharing of home directories
- Enforced in GILDA
- NOT enforced in LCG2 …
- The VO needs to negotiate on a site by site basis

Condor DAGMan
- Directed Acyclic Graph Manager
- DAGMan allows you to specify the dependencies between your Condor jobs, so it can manage them automatically for you (e.g., "Don't run job B until job A has completed successfully").

What is a DAG?
- A DAG is the data structure used by DAGMan to represent these dependencies.
- Each job is a "node" in the DAG.
- Each node can have any number of "parent" or "children" nodes, as long as there are no loops!
[Figure: example DAG with nodes Job A, Job B, Job C and Job D]

Defining a Condor DAG
A DAG is defined by a .dag file, listing each of its nodes and their dependencies:

    # diamond.dag
    Job A a.sub
    Job B b.sub
    Job C c.sub
    Job D d.sub
    Parent A Child B C
    Parent B C Child D

Each node will run the Condor job specified by its accompanying Condor submit file.
[Figure: the diamond DAG with nodes Job A, Job B, Job C and Job D]
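To see what the Parent/Child lines imply for execution order, the dependencies can be fed to a topological sort. The sketch below uses Kahn's algorithm, a standard technique that is swapped in here for illustration; it is not taken from DAGMan's implementation, and `submissionOrder` and the `Edge` alias are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

using Edge = std::pair<std::string, std::string>;  // parent -> child

// Returns an order in which every node appears after all of its parents
// (Kahn's algorithm). Returns an empty vector if the graph has a loop.
std::vector<std::string> submissionOrder(const std::vector<std::string>& nodes,
                                         const std::vector<Edge>& edges) {
    std::map<std::string, int> pendingParents;
    std::map<std::string, std::vector<std::string>> children;
    for (const auto& node : nodes) {
        pendingParents[node] = 0;
    }
    for (const auto& edge : edges) {
        // Both endpoints of a dependency must be declared nodes.
        assert(pendingParents.count(edge.first) == 1);
        assert(pendingParents.count(edge.second) == 1);
        children[edge.first].push_back(edge.second);
        ++pendingParents[edge.second];
    }

    // Nodes without parents can run immediately.
    std::vector<std::string> ready;
    for (const auto& node : nodes) {
        if (pendingParents[node] == 0) {
            ready.push_back(node);
        }
    }

    std::vector<std::string> order;
    for (std::size_t i = 0; i < ready.size(); ++i) {
        const std::string node = ready[i];  // copy: `ready` grows below
        order.push_back(node);
        for (const auto& child : children[node]) {
            if (--pendingParents[child] == 0) {
                ready.push_back(child);
            }
        }
    }

    if (order.size() != nodes.size()) {
        order.clear();  // some nodes never became ready, so there is a loop
    }
    return order;
}
```

For the diamond DAG, with nodes A, B, C, D and edges A-B, A-C, B-D, C-D, this yields A first and D last, with B and C in between.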

Submitting a Condor DAG
To start your DAG, just run condor_submit_dag with your .dag file, and Condor will start a personal DAGMan daemon to begin running your jobs:

    % condor_submit_dag diamond.dag

condor_submit_dag submits a Scheduler Universe job with DAGMan as the executable. Thus the DAGMan daemon itself runs as a Condor job, so you don't have to baby-sit it.

Running a Condor DAG
DAGMan acts as a "meta-scheduler", managing the submission of your jobs to Condor based on the DAG dependencies.
[Figure: DAGMan, the .dag file and the Condor job queue]

Running a Condor DAG (cont'd)
DAGMan holds and submits jobs to the Condor queue at the appropriate times.
[Figure: DAGMan and the Condor job queue]

Running a Condor DAG (cont'd)
In case of a job failure, DAGMan continues until it can no longer make progress, and then creates a "rescue" file with the current state of the DAG.
[Figure: DAGMan, the Condor job queue with a failed job marked X, and the rescue file]

Recovering a Condor DAG
Once the failed job is ready to be re-run, the rescue file can be used to restore the prior state of the DAG.
[Figure: DAGMan, the rescue file and the Condor job queue]

Recovering a Condor DAG (cont'd)
Once that job completes, DAGMan will continue the DAG as if the failure never happened.
[Figure: DAGMan and the Condor job queue]
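The failure and recovery behaviour described on the preceding slides can be mimicked with a toy scheduler. This is a sketch under simplifying assumptions, not DAGMan itself: jobs "run" instantly, the set of completed nodes stands in for the rescue file, and `runUntilStuck` and its `failing` parameter are hypothetical.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

using Edge = std::pair<std::string, std::string>;  // parent -> child

// Runs every node whose parents have all completed, until no further
// progress is possible. Nodes listed in `failing` fail when they are run.
// `done` holds nodes completed in an earlier run (the "rescue" state);
// the returned set is the updated list of completed nodes.
std::set<std::string> runUntilStuck(const std::vector<std::string>& nodes,
                                    const std::vector<Edge>& edges,
                                    std::set<std::string> done,
                                    const std::set<std::string>& failing) {
    for (const auto& node : failing) {
        assert(done.count(node) == 0);  // a completed node is not re-run
    }
    std::set<std::string> failed;
    bool progress = true;
    while (progress) {
        progress = false;
        for (const auto& node : nodes) {
            if (done.count(node) != 0 || failed.count(node) != 0) {
                continue;
            }
            bool parentsDone = true;
            for (const auto& edge : edges) {
                if (edge.second == node && done.count(edge.first) == 0) {
                    parentsDone = false;
                }
            }
            if (!parentsDone) {
                continue;  // blocked by an unfinished or failed parent
            }
            if (failing.count(node) != 0) {
                failed.insert(node);
            } else {
                done.insert(node);
            }
            progress = true;
        }
    }
    return done;
}
```

With the diamond DAG and job C failing, the first run completes A and B and leaves D blocked. Passing that result back in as `done`, with nothing failing, completes C and D without re-running A or B.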

Finishing a Condor DAG
Once the DAG is complete, the DAGMan job itself is finished, and exits.
[Figure: DAGMan and the Condor job queue]

Additional DAGMan features
DAGMan provides other handy features for job management:
- nodes can have PRE and POST scripts
- failed nodes can be automatically retried a configurable number of times

DAG job in EGEE
- The DAG job is a Directed Acyclic Graph job.
- In the JDL the user has to set JobType = "dag", the nodes attribute (containing the description of the nodes), and the dependencies attribute.
- NOTE:
  - A plug-in has been implemented to map an EGEE DAG submission to a Condor DAG submission.
  - Some improvements have been applied to the ClassAd API to better address WMS needs.

DAG job in EGEE: example

    nodes = {
      cmkin1 = [ file = "bckg_01.jdl"; ],
      cmkin2 = [ file = "bckg_02.jdl"; ],
      ……
      cmkinN = [ file = "bckg_0N.jdl"; ]
    };
    dependencies = {
      {cmkin1, cmkin2},
      {cmkin2, cmkin3},
      {cmkin2, cmkin5},
      {{cmkin4, cmkin5}, cmkinN}
    }

[Figure: DAG with nodes cmkin1, cmkin2, cmkin3, cmkin4, cmkin5 and cmkinN]
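An entry such as {{cmkin4, cmkin5}, cmkinN} names a set of parents for one child, so each entry of the dependencies attribute can stand for several parent-child edges. The sketch below expands such entries into plain edges, assuming the first element lists the parents and the second the children. The `Dependency` type and `expandDependencies` are hypothetical helpers, not part of the JDL or ClassAd libraries.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

using NodeSet = std::vector<std::string>;
using Dependency = std::pair<NodeSet, NodeSet>;    // {parents, children}
using Edge = std::pair<std::string, std::string>;  // parent -> child

// Expands each {parents, children} entry into one edge per
// (parent, child) combination, preserving the order of the entries.
std::vector<Edge> expandDependencies(const std::vector<Dependency>& deps) {
    std::vector<Edge> edges;
    for (const auto& dep : deps) {
        assert(!dep.first.empty() && !dep.second.empty());
        for (const auto& parent : dep.first) {
            for (const auto& child : dep.second) {
                edges.emplace_back(parent, child);
            }
        }
    }
    return edges;
}
```

Applied to the four entries above, this produces five edges: cmkin1 to cmkin2, cmkin2 to cmkin3, cmkin2 to cmkin5, and both cmkin4 and cmkin5 to cmkinN.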
