Presentation is loading. Please wait.

Presentation is loading. Please wait.

FP6−2004−Infrastructures−6-SSA-026409 www.eu-eela.org E-infrastructure shared between Europe and Latin America Special Jobs Matias Zabaljauregui UNLP.

Similar presentations


Presentation on theme: "FP6−2004−Infrastructures−6-SSA-026409 www.eu-eela.org E-infrastructure shared between Europe and Latin America Special Jobs Matias Zabaljauregui UNLP."— Presentation transcript:

1 FP6−2004−Infrastructures−6-SSA-026409 www.eu-eela.org E-infrastructure shared between Europe and Latin America Special Jobs Matias Zabaljauregui UNLP

2 E-infrastructure shared between Europe and Latin America 2 9th EELA TUTORIAL (USERS AND SYSTEM ADMINISTRATORS) Outline MPI jobs on gLite DAG Job Collection Parametric jobs

3 E-infrastructure shared between Europe and Latin America 3 9th EELA TUTORIAL (USERS AND SYSTEM ADMINISTRATORS) MPI Overview Execution of parallel jobs is an essential issue for modern informatics and applications. Most used library for parallel jobs support is MPI (Message Passing Interface) At the state of the art, parallel jobs can run inside single Computing Elements (CE) only; –several projects are involved into studies concerning the possibility of executing parallel jobs on Worker Nodes (WNs) belonging to different CEs.

4 E-infrastructure shared between Europe and Latin America 4 9th EELA TUTORIAL (USERS AND SYSTEM ADMINISTRATORS) Requirements & settings In order to guarantee that MPI job can run, the following requirements MUST BE satisfied: –the MPICH software must be installed and placed in the PATH environment variable on each WNs of the CE. –Some MPI’s applications require a shared filesystem among the WNs to run.  The variable VO_ _SW_DIR will contain the name of a directory in case of SHARED filesystem.  The variable VO_ _SW_DIR will contain “.” if there is NO SHARED filesystem.

5 E-infrastructure shared between Europe and Latin America 5 9th EELA TUTORIAL (USERS AND SYSTEM ADMINISTRATORS) From the user’s point of view, jobs to be run as MPI are specified setting the JDL JobType attribute to MPICH and specifying the NodeNumber attribute as well. E.g.: … JobType = “MPICH”; NodeNumber = 4; … This attribute defines the required number of CPUs needed for the application

6 E-infrastructure shared between Europe and Latin America 6 9th EELA TUTORIAL (USERS AND SYSTEM ADMINISTRATORS) When the previous two attributes are included in a JDL, the User Interface (UI) automatically adds the following expression: to the JDL Requirements expression in order to find out the best resource where the job can be executed. (other.GlueCEInfoTotalCPUs >= NodeNumber) && Member (“MPICH”,other.GlueHostApplicationSoftwareRunTimeEnvironment)

7 E-infrastructure shared between Europe and Latin America 7 9th EELA TUTORIAL (USERS AND SYSTEM ADMINISTRATORS) Create the file “mpi-test.jdl” inside $HOME/gLite/Other and put this contents inside the file: [ Type = "Job"; JobType = "MPICH"; Executable = “cpi"; NodeNumber = 2; StdOutput = “cpi.out"; StdError = “cpi.err"; InputSandbox = {"cpi"}; OutputSandbox = {“cpi.err",“cpi.out"}; RetryCount = 0; ] MPI exercise

8 E-infrastructure shared between Europe and Latin America 8 9th EELA TUTORIAL (USERS AND SYSTEM ADMINISTRATORS) MPI submission [santiago05@santiago05 Other]$ glite-job-submit -o id mpi-test.jdl Selected Virtual Organisation name (from proxy certificate extension): gilda Connecting to host santiago01.reuna.cl, port 7772 Logging to host santiago01.reuna.cl, port 9002 ======================= glite-job-submit Success ========================== The job has been successfully submitted to the Network Server. Use glite-job-status command to check job current status. Your job identifier is: - https://santiago01.reuna.cl:9000/dM6PkX3RmEetvzxDcCzVWg The job identifier has been saved in the following file: /home/santiago05/gLite/Other/id ======================================================================

9 E-infrastructure shared between Europe and Latin America 9 9th EELA TUTORIAL (USERS AND SYSTEM ADMINISTRATORS) MPI status and output Query the status of the job using the following command: [santiago05@santiago05 Other]$ glite-job-status -i id ……………………………………………. When the status of the job is “DONE”, you can retrieve output with the following command: [santiago05@santiago05 Other]$ glite-job-output -i id ……………………………………………

10 E-infrastructure shared between Europe and Latin America 10 9th EELA TUTORIAL (USERS AND SYSTEM ADMINISTRATORS) LCG-2 User Guide Manuals Series –https://edms.cern.ch/file/454439/LCG-2-UserGuide.pdf http://oscinfo.osc.edu/training/ http://www.netlib.org/mpi/index.html http://www-unix.mcs.anl.gov/mpi/learning.html http://www.ncsa.uiuc.edu/ MPI on the web…

11 E-infrastructure shared between Europe and Latin America 11 9th EELA TUTORIAL (USERS AND SYSTEM ADMINISTRATORS) Workload Manager Proxy

12 E-infrastructure shared between Europe and Latin America 12 9th EELA TUTORIAL (USERS AND SYSTEM ADMINISTRATORS) WMProxy overview WMProxy (Workload Manager Proxy) –is a new service providing access to the gLite Workload Management System (WMS) functionality through a simple Web Services based interface. –has been designed to efficiently handle a large number of requests for job submission and control to the WMS –the service interface addresses the Web Services and SOA architecture standards, in particular adhering to WS-I

13 E-infrastructure shared between Europe and Latin America 13 9th EELA TUTORIAL (USERS AND SYSTEM ADMINISTRATORS) WMProxy : submission & monitoring In order to submit jobs with WMProxy, it’s mandatory to delegate credentials: The submission/monitoring commands are slightly different, but most of the “old” options are supported glite-wms-job-delegate-proxy -d del_ID glite-wms-job-submit -d del_ID collection.jdl glite-wms-job-status jobID glite-wms-job-output jobID

14 E-infrastructure shared between Europe and Latin America 14 9th EELA TUTORIAL (USERS AND SYSTEM ADMINISTRATORS) DAG job A DAG job is a set of jobs where input, output, or execution of one or more jobs can depend on other jobs Dependencies are represented through Directed Acyclic Graphs, where the nodes are jobs, and the edges identify the dependencies nodeA nodeBnodeC NodeF nodeD

15 E-infrastructure shared between Europe and Latin America 15 9th EELA TUTORIAL (USERS AND SYSTEM ADMINISTRATORS) JDL structure

16 E-infrastructure shared between Europe and Latin America 16 9th EELA TUTORIAL (USERS AND SYSTEM ADMINISTRATORS) Attribute: Nodes

17 E-infrastructure shared between Europe and Latin America 17 9th EELA TUTORIAL (USERS AND SYSTEM ADMINISTRATORS) Attribute: Dependencies

18 E-infrastructure shared between Europe and Latin America 18 9th EELA TUTORIAL (USERS AND SYSTEM ADMINISTRATORS) DAG jdl [ type = "dag"; max_nodes_running = 4; nodes = [ nodeA = [ file ="nodes/nodeA.jdl" ; ]; nodeB = [ file ="nodes/nodeB.jdl" ; ]; nodeC = [ file ="nodes/nodeC.jdl" ; ]; nodeD = [ file ="nodes/nodeD.jdl"; ]; dependencies = { {nodeA, nodeB}, {nodeA, nodeC}, { {nodeB,nodeC}, nodeD } } ]; ] Node description could also be done here, instead of using separate files

19 E-infrastructure shared between Europe and Latin America 19 9th EELA TUTORIAL (USERS AND SYSTEM ADMINISTRATORS) Job Collection A job collection is a set of independent jobs that user wants to submit and monitor via a single request Jobs of a collection are submitted as DAG nodes without dependencies JDL is a list of classad, which describes the subjobs [ Type = "collection"; VirtualOrganisation = “gilda"; nodes = { [ ], … }; ]

20 E-infrastructure shared between Europe and Latin America 20 9th EELA TUTORIAL (USERS AND SYSTEM ADMINISTRATORS) ‘Scattered’ Input Sandboxes Input Sandbox can contain –file paths on the UI machine (i.e. the usual way) –URI pointing to files on a remote gridFTP/HTTPS server InputSandbox = { "gsiftp://neo.datamat.it:2811/var/prg/sim.exe", "https://ghemon.cnaf.infn.it:8443/data/idat_1", "file:///home/pacio/myconf“ }; A base URI to be applied to all sandbox files can also be specified InputSandboxBaseURI = "gsiftp://matrix.datamat.it:2811/var"; Only local files ( file:// ) are uploaded to the WMS node File pointed by URIs are directly downloaded on the WN by the JobWrapper just before the job is started

21 E-infrastructure shared between Europe and Latin America 21 9th EELA TUTORIAL (USERS AND SYSTEM ADMINISTRATORS) ‘Scattered’ Output Sandboxes JDL has been enriched with new attributes for specifying the destinations for the files listed in the OutputSandbox attribute list OutputSandbox = {"jobOutput", "run1/event1", "jobError"}; OutputSandboxDestURI = { "gsiftp://matrix.datamat.it/var/jobOutput", "https://grid003.ct.infn.it:8443/home/cms/event1", "gsiftp://matrix.datamat.it/var/jobError"}; A base URI to be applied to all sandbox files can also be specified OutputSandboxBaseDestURI = "gsiftp://neo.datamat.it/home/run1/"; Files are copied when the job has completed execution by the JobWrapper to the specified destination without transiting on the WMS node

22 E-infrastructure shared between Europe and Latin America 22 9th EELA TUTORIAL (USERS AND SYSTEM ADMINISTRATORS) Job collection example [ type = "collection"; InputSandbox = {"date.sh"}; RetryCount = 0; nodes = { [ file ="jobs/job1.jdl" ; ], [ Executable = "/bin/sh"; Arguments = "date.sh"; Stdoutput = "date.out"; StdError = "date.err"; OutputSandbox ={"date.out", "date.err"}; ] ], [ file ="jobs/job3.jdl" ; ] }; ] All nodes will share this Input Sandbox

23 E-infrastructure shared between Europe and Latin America 23 9th EELA TUTORIAL (USERS AND SYSTEM ADMINISTRATORS) Parametric Job A parametric job is a job where one or more of its attributes are parameterized Values of attributes vary according to a parameter Job monitoring / managing is always done through an unique jobID, as if the job was single (see submission of collection [ JobType = "Parametric"; Executable = "/bin/sh"; Arguments = "md5.sh input_PARAM_.txt"; InputSandbox = {"md5.sh", "input_PARAM_.txt"}; StdOutput = "out_PARAM_.txt"; StdError = "err_PARAM_.txt"; Parameters = 4; ParameterStart = 1; ParameterStep = 1; OutputSandbox = {"out_PARAM_.txt", "err_PARAM_.txt"}; ]

24 E-infrastructure shared between Europe and Latin America 24 9th EELA TUTORIAL (USERS AND SYSTEM ADMINISTRATORS) Parametric job / 2 Parameter can be also a list of string InputSandbox (if present) has to be coherent with parameters [ui-test] /home/giorgio/param > cat param2.jdl [ JobType = "Parametric"; Executable = “/bin/cat"; Arguments = “input_PARAM_.txt”; InputSandbox = "input_PARAM_.txt"; StdOutput = "myoutput_PARAM_.txt"; StdError = "myerror_PARAM_.txt"; Parameters = {earth,moon,mars}; OutputSandbox = {“myoutput_PARAM_.txt”}; ] [ui-test] /home/giorgio/param > ls inputEARTH.txt inputMARS.txt inputMOON.txt param2.jdl

25 E-infrastructure shared between Europe and Latin America 25 9th EELA TUTORIAL (USERS AND SYSTEM ADMINISTRATORS) References JDL attributes specification for WM proxy – https://edms.cern.ch/document/590869/1 https://edms.cern.ch/document/590869/1 WMProxy quickstart – http://egee-jra1-wm.mi.infn.it/egee-jra1- wm/wmproxy_client_quickstart.shtml http://egee-jra1-wm.mi.infn.it/egee-jra1- wm/wmproxy_client_quickstart.shtml WMS user guides – https://edms.cern.ch/document/572489/1 https://edms.cern.ch/document/572489/1


Download ppt "FP6−2004−Infrastructures−6-SSA-026409 www.eu-eela.org E-infrastructure shared between Europe and Latin America Special Jobs Matias Zabaljauregui UNLP."

Similar presentations


Ads by Google