Job Submission The European DataGrid Project Team


EDG Job Submission Tutorial - n° 2
Contents
- Job Submission to the EDG Testbed
    - Job Preparation
    - The EDG Workload Management System (WMS)
    - Job Description Language (JDL)
    - Job Submission & Monitoring
- A simple program example: the job lifecycle

EDG Job Submission Tutorial - n° 3
Job Preparation: Let's think the way the Grid thinks!
- Job data requirements (input/output data)
- Requirements and preferences of the computing system
- Software dependencies
- Which EDG tools are required
- How to use them

EDG Job Submission Tutorial - n° 4
The EDG WMS
- The user interacts with the Grid via a Workload Management System (WMS).
- The goal of the WMS is distributed scheduling and resource management in a Grid environment.
- What does it allow Grid users to do?
    - Submit their jobs
    - Execute them
    - Get information about their status
    - Retrieve their output
- The WMS tries to optimize the usage of resources.

EDG Job Submission Tutorial - n° 5
WMS Components
- The WMS is currently composed of the following parts:
    1. User Interface (UI): the user's access point to the Grid
    2. Resource Broker (RB): the broker of Grid resources, which performs the matchmaking
    3. Job Submission System (JSS): provides a reliable submission system
    4. Information Index (II): a specialized Globus GIIS (LDAP server) used by the Resource Broker as a filter on the Information Service (IS) to select resources
    5. Logging and Bookkeeping services (LB): store job information that users can query

EDG Job Submission Tutorial - n° 6
Job Description Language (JDL) 1/5
- Based upon Condor's CLASSified ADvertisement language (ClassAd)
- ClassAd is a fully extensible language
- A ClassAd is constructed with the classad construction operator []
    - It is a sequence of attributes separated by semicolons.
    - An attribute is a pair (key, value), where the value can be a Boolean, an integer, a list of strings, and so on:
          <attribute> = <value>;
- So the JDL lets the user define a set of attributes that the WMS takes into account when making its scheduling decision.
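
The attribute syntax on this slide can be illustrated with a small Python sketch that renders a dictionary as a bracketed, semicolon-separated record. This is not part of the EDG software: the names Expr, render and to_classad are hypothetical, and the rendering rules (quoted strings, braces for lists, unquoted expressions) are assumptions based on the JDL examples in this tutorial.

```python
class Expr(str):
    """Marks a value as a raw ClassAd expression, emitted without quotes."""


def render(value):
    """Render one Python value in ClassAd-style syntax (assumed rules)."""
    if isinstance(value, Expr):
        return str(value)
    if isinstance(value, bool):  # checked before int, since bool is an int subclass
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, (list, tuple)):
        return "{" + ", ".join(render(item) for item in value) + "}"
    return '"' + str(value) + '"'


def to_classad(attributes):
    """Render a dict as a record of the form [ key = value; key = value; ]."""
    body = " ".join(f"{key} = {render(value)};" for key, value in attributes.items())
    return "[ " + body + " ]"


job = {
    "Executable": "/bin/echo",
    "Arguments": "Hello World",
    "OutputSandbox": ["Message.txt", "stderr.log"],
    "Requirements": Expr("other.FreeCpus >= 4"),
}
print(to_classad(job))
```

The JDL files shown later in the tutorial list the attributes without the enclosing brackets; the brackets are kept here only to mirror the construction operator mentioned on the slide.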

EDG Job Submission Tutorial - n° 7
Job Description Language (JDL) 2/5
- The supported attributes are grouped in two categories: resource attributes (for computing resources and for data and storage resources) and job attributes.
    - Computing resource attributes
        - used by the user to build the expressions of the Requirements and Rank attributes
        - taken into account automatically by the RB when carrying out the matchmaking algorithm
        - have to be prefixed with "other."
    - Data and storage resource attributes
        - input data to process, the SE where output data is to be stored, and the protocols spoken by the application when accessing SEs
    - Job attributes
        - define the job itself
        - provided by the user when editing the job description file
        - split up into mandatory attributes and mandatory attributes with a default value, which the UI inserts before submitting the job

EDG Job Submission Tutorial - n° 8
Job Description Language (JDL): relevant attributes 3/5
- Mandatory for every JDL file:
    1. Executable (contains the command name)
- Mandatory for JDL files dealing with Data Management:
    2. ReplicaCatalog (contains the Replica Catalog identifier)
    3. DataAccessProtocol (contains the protocol, or the list of protocols, that the application can use to access InputData on a given SE)
- If InputData contains at least one PFN and no LFNs, only DataAccessProtocol is mandatory.
- If InputData contains at least one LFN, both ReplicaCatalog and DataAccessProtocol are mandatory.
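
The rules on this slide can be written down as a short check. The sketch below is only an illustration in Python, not the validation performed by the UI: the function name missing_mandatory is hypothetical, the job description is modelled as a plain dictionary, and the "PF:" prefix for physical file names is an assumption (only the "LF:" prefix appears in this tutorial's examples).

```python
def missing_mandatory(job):
    """Return the mandatory attributes that are absent from a job description."""
    required = ["Executable"]  # mandatory for every JDL file

    input_data = job.get("InputData", [])
    if isinstance(input_data, str):
        input_data = [input_data]

    has_lfn = any(item.startswith("LF:") for item in input_data)
    has_pfn = any(item.startswith("PF:") for item in input_data)  # assumed prefix

    if has_lfn:
        # At least one LFN: both data-management attributes are needed.
        required += ["ReplicaCatalog", "DataAccessProtocol"]
    elif has_pfn:
        # PFNs only: the Replica Catalog is not consulted.
        required += ["DataAccessProtocol"]

    return [name for name in required if name not in job]


print(missing_mandatory({"Executable": "gridTest", "InputData": "LF:testbed"}))
```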

EDG Job Submission Tutorial - n° 9
Job Description Language (JDL): relevant attributes 4/5
- Mandatory attributes with a default value, for every JDL file:
    1. Rank (contains a ClassAd floating-point expression). The default value is -other.EstimatedTraversalTime.
    2. Requirements (contains a ClassAd Boolean expression). The default value is other.Active.
  The default values of these attributes are set in the User Interface configuration file.
- Special characters are allowed in the Arguments attribute as long as they are enclosed between \' and \':
      Arguments = "\'+%s\'";
      Arguments = "*";
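
How the defaults interact with user-supplied values can be sketched as a dictionary merge. This is an illustration only: UI_DEFAULTS and apply_ui_defaults are hypothetical names, and the real UI reads the defaults from its configuration file rather than from a hard-coded table.

```python
# Default expressions quoted on the slide above, kept as plain strings here.
UI_DEFAULTS = {
    "Rank": "-other.EstimatedTraversalTime",
    "Requirements": "other.Active",
}


def apply_ui_defaults(job, defaults=UI_DEFAULTS):
    """Return a copy of the job in which missing attributes take their default."""
    completed = dict(defaults)
    completed.update(job)  # values given by the user win over the defaults
    return completed


job = apply_ui_defaults({"Executable": "/bin/echo", "Rank": "other.FreeCPUs"})
print(job["Rank"], "|", job["Requirements"])
```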

EDG Job Submission Tutorial - n° 10
Job Description Language (JDL): other attributes 5/5
- Others:
    - OutputSE (contains the Uniform Resource Identifier of the SE)
      The RB uses it to choose a CE that is compatible with the job and is close to the SE.
          OutputSE = "testbed002.cern.ch";
    - InputData (refers to data used as input by the job; these data are published in the Replica Catalog and stored in the SEs)
    - InputSandbox (list of files on the UI local disk needed by the job for running)
      The listed files are staged from the UI to the remote CE.
    - OutputSandbox (list of files, generated by the job, which have to be retrieved)
          StdError = "stderror.log";
          StdOutput = "stdoutput.log";
          OutputSandbox = {"stderror.log", "stdoutput.log", ".BrokerInfo"};

EDG Job Submission Tutorial - n° 11
Example JDL File
    Executable = "gridTest";
    InputData = "LF:testbed ";
    ReplicaCatalog = "ldap://sunlab2g.cnaf.infn.it:2010/ \
        rc=WP2 INFN Test, dc=infn, dc=it";
    DataAccessProtocol = "gridftp";
    StdError = "stderr.log";
    StdOutput = "stdout.log";
    OutputSandbox = {"stderr.log", "stdout.log"};
    InputSandbox = {"home/joda/test/gridTest"};
    Rank = other.MaxCpuTime;
    Requirements = other.Architecture == "INTEL" && \
        other.OpSys == "LINUX" && other.FreeCpus >= 4;

EDG Job Submission Tutorial - n° 12
WMS UI Commands
- dg-job-submit: submits a job
- dg-job-list-match: lists resources matching a job description
- dg-job-cancel: cancels a given job
- dg-job-status: displays the status of the job (submitted, waiting, ready, scheduled, running, chkpt, done, outputready, aborted, cleared)
- dg-job-get-output: returns the job output to the user
- dg-job-get-logging-info: displays logging information about submitted jobs
- dg-job-id-info: a utility for the user to display job info in a formatted style

EDG Job Submission Tutorial - n° 13
Example of UI Command Options
- dg-job-submit -r -n -c -o (each option takes an argument)
    -r  the job is submitted by the RB directly to the computing element identified by the argument
    -n  a message containing basic information about the job (status and identification) is sent to the address given as argument when the job enters one of the following statuses: DONE or ABORTED, READY, RUNNING
    -c  the UI uses the configuration file given as argument instead of the standard configuration file
    -o  the generated dg_jobId is written to the file given as argument
- dg-job-status -i (or dg_jobId)
    -i  the bookkeeping information about the dg_jobIds contained in the file given as argument is displayed

EDG Job Submission Tutorial - n° 14
A Job Submission Example (diagram)
Components shown: UI with the JDL file, Logging & Book-keeping (LB), Resource Broker (RB), Job Submission Service (JSS), Storage Element (SE), Compute Element (CE), Information Service (IS), Replica Catalogue (RC).

EDG Job Submission Tutorial - n° 15
A Job Submission Example (same diagram)
Shown: Job Submit Event, Input Sandbox. Job status: submitted.

EDG Job Submission Tutorial - n° 16
A Job Submission Example (same diagram)
Job status: submitted, waiting.

EDG Job Submission Tutorial - n° 17
A Job Submission Example (same diagram)
Job status: submitted, waiting, ready.

EDG Job Submission Tutorial - n° 18
A Job Submission Example (same diagram)
Shown: BrokerInfo. Job status: submitted, waiting, ready, scheduled.

EDG Job Submission Tutorial - n° 19
A Job Submission Example (same diagram)
Shown: Input Sandbox. Job status: submitted, waiting, ready, scheduled, running.

EDG Job Submission Tutorial - n° 20
A Job Submission Example (same diagram)
Shown: Job Status. Job status: submitted, waiting, ready, scheduled, running.

EDG Job Submission Tutorial - n° 21
A Job Submission Example (same diagram)
Shown: Job Status. Job status: submitted, waiting, ready, scheduled, running, done.

EDG Job Submission Tutorial - n° 22
A Job Submission Example (same diagram)
Shown: Output Sandbox. Job status: submitted, waiting, ready, scheduled, running, done, outputready.

EDG Job Submission Tutorial - n° 23
A Job Submission Example (same diagram)
Shown: Output Sandbox. Job status: submitted, waiting, ready, scheduled, running, done, outputready, cleared.

EDG Job Submission Tutorial - n° 24
Possible Job States
- SUBMITTED
- WAITING
- READY
- SCHEDULED
- RUNNING
- DONE (ok)
- DONE (failed)
- DONE (cancelled)
- OUTPUTREADY
- CLEARED
- ABORTED
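
The lifecycle walked through in the job submission example can be modelled as a small state machine. The Python sketch below is an assumption-laden illustration: the forward path follows the order of statuses shown on the preceding slides, the three DONE variants are collapsed into a single DONE state, and the edges into ABORTED are a guess, since the transcript does not say from which states a job can abort.

```python
# Allowed transitions; the ABORTED edges are an assumption, not taken from the slides.
TRANSITIONS = {
    "SUBMITTED": {"WAITING", "ABORTED"},
    "WAITING": {"READY", "ABORTED"},
    "READY": {"SCHEDULED", "ABORTED"},
    "SCHEDULED": {"RUNNING", "ABORTED"},
    "RUNNING": {"DONE", "ABORTED"},
    "DONE": {"OUTPUTREADY"},
    "OUTPUTREADY": {"CLEARED"},
    "CLEARED": set(),
    "ABORTED": set(),
}


def is_valid_history(states):
    """Check that a sequence of states starts at SUBMITTED and uses only known edges."""
    if not states or states[0] != "SUBMITTED":
        return False
    return all(nxt in TRANSITIONS.get(cur, set())
               for cur, nxt in zip(states, states[1:]))


happy_path = ["SUBMITTED", "WAITING", "READY", "SCHEDULED",
              "RUNNING", "DONE", "OUTPUTREADY", "CLEARED"]
print(is_valid_history(happy_path))
```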

EDG Job Submission Tutorial - n° 26
dg-job-submit myjob.jdl

Myjob.jdl:
    Executable = "$(CMS)/exe/sum.exe";
    InputData = "LF:testbed ";
    ReplicaCatalog = "ldap://sunlab2g.cnaf.infn.it:2010/rc=WP2 INFN Test Replica Catalog,dc=sunlab2g, dc=cnaf, dc=infn, dc=it";
    DataAccessProtocol = "gridftp";
    InputSandbox = {"/home/user/WP1testC", "/home/file*", "/home/user/DATA/*"};
    OutputSandbox = {"sim.err", "test.out", "sim.log"};
    Requirements = other.Architecture == "INTEL" && other.OpSys == "LINUX Red Hat 6.2";
    Rank = other.FreeCPUs;

EDG Job Submission Tutorial - n° 35
WMS Match Making 1/4
- The RB is the core component of the WMS.
- It has to find the most suitable CE on which the job will be executed.
- It interacts with the Data Management service and the Information Service.
    - They supply the RB with all the information required to resolve the matches.
- The CE chosen by the RB matches the job requirements (e.g. runtime environment, data access requirements, and so on).

EDG Job Submission Tutorial - n° 36
WMS Match Making 2/4
- The RB has to deal with three possible scenarios.
    1. Scenario: Direct Job Submission
        - The job is scheduled on a given CE (specified in the dg-job-submit command via the -r option).
        - The RB doesn't perform any matchmaking algorithm.

EDG Job Submission Tutorial - n° 37
WMS Match Making 3/4
    2. Scenario: Job Submission without data-access requirements
        - Neither the CE nor input data are specified.
        - The RB starts the matchmaking algorithm, which consists of two phases:
            - Requirements check (the RB contacts the IS to create a set of suitable CEs)
            - Rank computation (the RB acquires information about the quality of the suitable CEs just found)
        - If more than one CE satisfies the job requirements, the RB chooses the CE with the best rank.
        - If the user doesn't specify any rank value, by default the RB considers the resources with the lowest estimated traversal time.
        - If all CEs have the same rank value, the RB chooses the first CE in the list.
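
The two phases of this scenario can be sketched in a few lines of Python. This is a toy model, not the RB's implementation: CEs are plain dictionaries with made-up identifiers and values, the Requirements and Rank expressions are passed in as Python callables, and "best rank" is assumed to mean the highest value, which is consistent with the default rank being the negated estimated traversal time.

```python
def match(ces, requirements, rank):
    """Two-phase matchmaking sketch: filter by requirements, then pick the best rank."""
    # Phase 1: requirements check, building the set of suitable CEs.
    suitable = [ce for ce in ces if requirements(ce)]
    if not suitable:
        return None
    # Phase 2: rank computation. max() returns the first maximal item,
    # so CEs with equal rank resolve to the first one in the list.
    return max(suitable, key=rank)


def default_rank(ce):
    """Mirrors the default rank: lowest estimated traversal time wins."""
    return -ce["EstimatedTraversalTime"]


# Hypothetical CE records, for illustration only.
ces = [
    {"id": "ce-a", "OpSys": "LINUX", "FreeCpus": 2, "EstimatedTraversalTime": 50},
    {"id": "ce-b", "OpSys": "LINUX", "FreeCpus": 8, "EstimatedTraversalTime": 300},
    {"id": "ce-c", "OpSys": "LINUX", "FreeCpus": 6, "EstimatedTraversalTime": 120},
]

chosen = match(ces, lambda ce: ce["OpSys"] == "LINUX" and ce["FreeCpus"] >= 4, default_rank)
print(chosen["id"])
```

Here ce-a has the shortest traversal time but fails the requirements check, so the choice falls to ce-c.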

EDG Job Submission Tutorial - n° 38
WMS Match Making 4/4
    3. Scenario: Job Submission with data-access requirements
        - The CE is not specified in the JDL.
        - The RB interacts with the Data Management service to find the most suitable CE, also taking into account the SEs where the input data sets are physically stored and where the output data sets should be staged on completion of job execution.
        - The RB strategy consists of submitting jobs close to the data.
        - The two main phases of the matchmaking algorithm remain unchanged:
            - Requirements check
            - Rank computation
        - What changes with respect to the second scenario? Now the RB executes the two phases for each class of CEs that satisfy the data-access requirements (i.e. that are close to the data).
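
A possible way to picture the data-aware variant is to restrict the candidate CEs to those close to the relevant SEs before running the same two phases. The Python sketch below rests on several assumptions that are not in the slides: "closeness" is modelled as a mapping from a CE identifier to the set of SE host names near it, a CE qualifies if it is close to at least one SE holding the input data (and to the output SE, when one is given), and all host names and values are invented.

```python
def match_with_data(ces, close_ses, input_ses, requirements, rank, output_se=None):
    """Matchmaking restricted to CEs that are close to the job's data."""
    def close_to_data(ce):
        nearby = close_ses.get(ce["id"], set())
        if not (nearby & input_ses):       # no SE with the input data nearby
            return False
        return output_se is None or output_se in nearby

    # Same two phases as before, applied only to the CEs close to the data.
    suitable = [ce for ce in ces if close_to_data(ce) and requirements(ce)]
    return max(suitable, key=rank) if suitable else None


# Hypothetical testbed layout, for illustration only.
ces = [
    {"id": "ce1.example.org", "FreeCpus": 16},
    {"id": "ce2.example.org", "FreeCpus": 4},
    {"id": "ce3.example.org", "FreeCpus": 8},
]
close_ses = {
    "ce1.example.org": {"se9.example.org"},
    "ce2.example.org": {"se1.example.org"},
    "ce3.example.org": {"se1.example.org", "se2.example.org"},
}

chosen = match_with_data(
    ces, close_ses,
    input_ses={"se1.example.org"},          # SEs that hold the input data
    requirements=lambda ce: ce["FreeCpus"] >= 4,
    rank=lambda ce: ce["FreeCpus"],
)
print(chosen["id"])
```

The CE with the most free CPUs (ce1) is skipped because it is not close to the data, which is the behaviour the slide describes as submitting jobs close to the data.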

EDG Job Submission Tutorial - n° 39
Example of Job Submission Sequence
- The user logs in on the UI.
- The user issues a grid-proxy-init and enters the certificate's password, getting a valid Globus proxy.
- The user sets up his or her JDL file.
- Example of a Hello World JDL file:
      Executable = "/bin/echo";
      Arguments = "Hello World";
      StdOutput = "Message.txt";
      StdError = "stderr.log";
      OutputSandbox = {"Message.txt", "stderr.log"};

EDG Job Submission Tutorial - n° 40
Example of Job Submission Sequence Cont'd
- The user issues
      dg-job-submit HelloWorld.jdl
  and gets back from the system a unique Job Identifier (JobId).
- The user issues
      dg-job-status JobId
  to get logging information about the current status of the job.
- When the "OutputReady" status is reached, the user can issue
      dg-job-get-output JobId
  and the system returns the name of the temporary directory where the job output can be found on the UI machine.
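
The submit, poll and retrieve sequence above could be scripted along the following lines. This Python sketch does not call the EDG tools: it only builds the command lines (using the -r, -o and --dir options that appear elsewhere in this tutorial) and polls through a caller-supplied function. The helper names are hypothetical, and the status strings other than "OutputReady" are assumptions about how dg-job-status spells them.

```python
import time


def submit_cmd(jdl_file, resource=None, output_file=None):
    """Build a dg-job-submit command line; the JDL file goes last."""
    cmd = ["dg-job-submit"]
    if resource:
        cmd += ["-r", resource]
    if output_file:
        cmd += ["-o", output_file]
    return cmd + [jdl_file]


def get_output_cmd(job_id, directory=None):
    """Build a dg-job-get-output command line; the job id goes last."""
    cmd = ["dg-job-get-output"]
    if directory:
        cmd += ["--dir", directory]
    return cmd + [job_id]


def wait_for_output(job_id, get_status, interval=30, max_polls=120, sleep=time.sleep):
    """Poll get_status(job_id) until the job reaches a final status."""
    for _ in range(max_polls):
        status = get_status(job_id)
        if status in ("OutputReady", "Aborted"):  # "Aborted" spelling is assumed
            return status
        sleep(interval)
    return None  # gave up before the job finished


# Dry run with a fake status source instead of a real dg-job-status call.
fake_statuses = iter(["Scheduled", "Running", "OutputReady"])
final = wait_for_output("job-1", lambda _id: next(fake_statuses), sleep=lambda s: None)
print(submit_cmd("HelloWorld.jdl"), final, get_output_cmd("job-1", "result"))
```

In a real script, get_status would run dg-job-status and parse the Status line of its output; that parsing is left out because the exact output format is only partly visible in this transcript.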

EDG Job Submission Tutorial - n° 41
Job Submission Example
    EliJDL]$ dg-job-submit HelloWorld.jdl
    Connecting to host lxshare0381.cern.ch, port 7771
    Logging to host lxshare0381.cern.ch, port
    **************************************************************************
    JOB SUBMIT OUTCOME
    The job has been successfully submitted to the Resource Broker.
    Use dg-job-status command to check job current status.
    Your job identifier (dg_jobId) is:
    **************************************************************************
Slide callout: JobId

EDG Job Submission Tutorial - n° 42
Job Submission Example Cont'd
    EliJDL]$ dg-job-status re0381.cern.ch:7771
    Retrieving Information from LB server
    Please wait: this operation could take some seconds.
    BOOKKEEPING INFORMATION:
    Printing status info for the Job :
    dg_JobId =
    Status = OutputReady
    Last Update Time (UTC) = Wed Aug 21 12:19:
    Job Destination = testbed008.cnaf.infn.it:2119/jobmanager-pbs-short
    Status Reason = terminated
    Job Owner = /C=IT/O=INFN/OU=Personal Certificate/L=CNAF/CN=Mario
    Status Enter Time (UTC) = Wed Aug 21 12:19:

EDG Job Submission Tutorial - n° 43
Job Submission Example Cont'd
    EliJDL]$ dg-job-get-output --dir result re0381.cern.ch:7771
    **************************************************************************
    JOB GET OUTPUT OUTCOME
    Output sandbox files for the job:
    -
    have been successfully retrieved and stored in the directory:
    /shift/lxshare072d/data01/UIhome/reale/EliJDL/result/
    **************************************************************************
    EliJDL]$ more result/ /Message.txt
    Hello World
    EliJDL]$ more result/ /stderr.log

EDG Job Submission Tutorial - n° 44
Common Error Messages 1/2
- The UI commands accept arguments as input. If the user makes a mistake on the command line, the following messages can appear:
    - Argument * is not allowed
      (the argument is not known)
    - Argument * must be specified at the end of the command
      (both the jobId and the JDL file name must be put at the end of the command line)
    - Argument * is missing for the "--output" option
      (the user forgot to add the parameter required by the option)
    - Argument "-all" cannot be specified with argument "--input"
      (some arguments are mutually exclusive)
    - CEId format is: ; /jobmanager-. The provided CEID: " has a wrong format.
      (the user has misspelled the CE identifier after --resource)
- While the RB API is being called, the following can happen:
    - Resource Broker "grid013g.cnaf.infn.it:7771" not available
      (a connection cannot be opened with the RB specified in the UI configuration file)
    - Unable to get LB address from RB "grid013g.cnaf.infn.it"
      (the function get_lb_contact returned an error)

EDG Job Submission Tutorial - n° 45
Common Error Messages 2/2
- While the UI commands are checking the JDL file, the following errors may occur:
    - Mandatory Attribute default error in the configuration file "/opt/edg/etc/UI_ConfigENV.cfg"
      (there aren't any default values)
    - Mandatory Attribute missing in JDL file "Executable"
      (Executable is one of the mandatory attributes)
    - Multiple "InputSandbox" attribute found in JDL file
      (the InputSandbox attribute appears more than once)
    - Wrong function call for list attribute *. Function usage is: "Member/IsMember(List, Value)"
      (e.g. in the Requirements attribute the function Member/IsMember is used with a wrong syntax)
- Proxy (this refers to the Grid security proxy and not to a proxy machine)
    - If the user specifies a duration for the proxy, using the -h option of dg-job-submit, a possible message is:
      Proxy certificate will expire in less then X hours. Creating a new X-hours-duration certificate
      (this makes sure that at least the required proxy validity is granted)

EDG Job Submission Tutorial - n° 46
WMS Proxy Renewal
- Why?
    - To avoid a job failing because it outlived the validity of the initial proxy
- The WMS supports an automatic proxy renewal mechanism, as long as the user credentials are handled by a proxy server:
    1. Create a proxy using grid-proxy-init.
    2. Register this proxy with the MyProxy server using
           myproxy-init -s <server> [-t <cred> -c <proxy>]
       where
           server is the server address (e.g. lxshare0375.cern.ch)
           cred is the number of hours the proxy should be valid on the server
           proxy is the number of hours renewed proxies should be valid
    3. Short-term proxies can then be used to start jobs, created with the grid-proxy-init -hours command.
    4. The proxy is renewed automatically by the WMS, without user intervention, for the whole lifetime of the job.

EDG Job Submission Tutorial - n° 47
Further Information
- The EDG User's Guide
- WMS and JDL
- ClassAd