Presentation is loading. Please wait.

Presentation is loading. Please wait.

EGEE-III INFSO-RI-222667 Enabling Grids for E-sciencE www.eu-egee.org EGEE and gLite are registered trademarks gLite job submission Fokke Dijkstra Donald.

Similar presentations


Presentation on theme: "EGEE-III INFSO-RI-222667 Enabling Grids for E-sciencE www.eu-egee.org EGEE and gLite are registered trademarks gLite job submission Fokke Dijkstra Donald."— Presentation transcript:

1 EGEE-III INFSO-RI-222667 Enabling Grids for E-sciencE www.eu-egee.org EGEE and gLite are registered trademarks gLite job submission Fokke Dijkstra Donald Smits Centre for Information Technology, University of Groningen Utrecht, Grid Tutorial 2008

2 Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 2 Introduction ?

3 Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 3 Components in the EGEE Grid Workload Management System (WMS) Storage Element (SE) Computing Element (CE) Information System (BDII) LCG File Catalog (LFC) JDL User Interface (UI)

4 Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 4 Workload Management System Tasks of the WMS –Find the best resource for your tasks (jobs) –Submit jobs to compute resources –Logging and book keeping –Delegated Grid credential management

5 Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 5 Job preparation You need to provide –A complete (enough) job description  What program?  What data?  Any requirements on OS, installed software, ?? –Possibly a program  You’re submitting in unknown territory!  Program portably!  Don’t rely on hard-coded paths or special locations  The program you send may not even be in $HOME! –Perhaps some input data –Perhaps instructions on what to do with the output

6 Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 6 Here is a minimal job description (call it hello.jdl) We specified –The program to run and its arguments –Directed the standard error and output streams to files –Told it what to do with the output Executable = “/bin/echo”; Arguments = “Goedemiddag”; StdError = “stderr.log”; StdOutput = “stdout.log”; OutputSandbox = {“stderr.log”, “stdout.log”}; How to Write a Job Description

7 Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 7 User issues a voms-proxy-init –enters his certificate’s password –Receives a valid Globus proxy User issues a: glite-wms-job-submit -a mytest.jdl and gets back from the system a unique Job Identifier (JobId) User issues a: glite-wms-job-status JobId to get logging information about the current status of his Job When the “Done” status is reached, the user can issue a glite-wms-job-output JobId and the system returns the name of the temporary directory where the job output can be found on the UI machine. Job Submission Example

8 Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 8 $ voms-proxy-init --voms tutor Cannot find file or dir: /admins/fokke/.glite/vomses Enter GRID pass phrase: Your identity: /O=dutchgrid/O=users/O=rug/OU=rc/CN=Fokke Dijkstra Creating temporary proxy........................................... Done Contacting voms.grid.sara.nl:30007 [/O=dutchgrid/O=hosts/OU=sara.nl/CN=voms.grid.sara.nl] "tutor" Done Creating proxy................................................ Done Your proxy is valid until Wed Nov 5 23:11:27 2008 $ glite-wms-job-submit -a hello.jdl Connecting to the service https://wms.grid.sara.nl:7443/glite_wms_wmproxy_server ====================== glite-wms-job-submit Success ====================== The job has been successfully submitted to the WMProxy Your job identifier is: https://wms.grid.sara.nl:9000/V7pw7lTR4MeFMVAz12larQ ========================================================================== JobId Submitting it

9 Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 9 UI JDL Workload Management System (WMS) Storage Element (SE) Computing Element (CE) Information System (IS) Job Status submitted LCG File Catalog (LFC) User Interface (UI) waiting Job + Input sandbox A Job Submission Example readyscheduled Job + Input sandbox

10 Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 10 $ glite-wms-job-status https://wms.grid.sara.nl:9000/V7pw7lTR4MeFMVAz12larQ ************************************************************* BOOKKEEPING INFORMATION: Status info for the Job : https://wms.grid.sara.nl:9000/V7pw7lTR4MeFMVAz12larQ Current Status: Scheduled Status Reason: Job successfully submitted to Globus Destination: ce.grid.rug.nl:2119/jobmanager-pbs-long Submitted: Wed Nov 5 11:12:15 2008 CET ************************************************************* Checking the status

11 Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 11 Check status using browser

12 Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 12 UI JDL Workload Management System (WMS) Storage Element (SE) Computing Element (CE) Information System (IS) LCG File Catalog (LFC) Job Status submitted waiting User Interface (UI) readyscheduledrunning Output Sandbox done A Job Submission Example

13 Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 13 $ glite-wms-job-output https://wms.grid.sara.nl:9000/V7pw7lTR4MeFMVAz12larQ Connecting to the service https://wms.grid.sara.nl:7443/glite_wms_wmproxy_server ============================================================================== JOB GET OUTPUT OUTCOME Output sandbox files for the job: https://wms.grid.sara.nl:9000/V7pw7lTR4MeFMVAz12larQ have been successfully retrieved and stored in the directory: /tmp/jobOutput/fokke_V7pw7lTR4MeFMVAz12larQ =============================================================================== $ cat /tmp/jobOutput/fokke_V7pw7lTR4MeFMVAz12larQ Goedemiddag Getting the Output

14 Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 14 UI JDL Workload Management System (WMS) Storage Element (SE) Computing Element (CE) Information System (IS) LCG File Catalog (LFC) Output Sandbox cleared submitted waiting ready scheduled running done Job Status User Interface (UI) A Job Submission Example

15 Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 15 Job Description Language Job Description Language based on Classified Advertisement language Lines: –Attribute = expression; –Can be multiple lines, semicolon is separator –”for strings” –# and // for comments –No blanks after ; !!

16 Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 16 The supported attributes are grouped in two categories: –Job Define the job itself –Resources  Taken into account by the WMS for carrying out the matchmaking algorithm  Computing Resource (Attributes) Used to build expressions of Requirements and/or Rank attributes by the user Have to be prefixed with “other.”  Data and Storage resources (Attributes) Input data to process, SE where to store output data, protocols spoken by application when accessing SEs Types of Attributes

17 Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 17 Executable (mandatory) –The command name Arguments (optional) –Job command line arguments StdInput, StdOutput, StdErr (optional) –Standard input/output/error of the job Environment (optional) –List of environment settings InputSandbox (optional) –List of files on the UI local disk needed by the job for running –The listed files are staged from the UI to the remote CE –Wildcards allowed –Unique filenames required OutputSandbox (optional) –List of files, generated by the job, which have to be retrieved Job Definition Attributes

18 Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 18 Requirements –Job requirements on computing resources –Specified using attributes of resources published in the Information System –other.GlueCEStateStatus == "Production" always included (the resource has to be in the Production grid) –Useful requirements:  Wallclock time and specific sites: Requirements = other.GlueCEPolicyMaxWallClockTime > 720 && RegExp(“nikhef.nl", other.GlueCEUniqueID);  Specific tag published: Requirements = Member("VO-ncf-gromacs- 3.3.2",other.GlueHostApplicationSoftwareRunTimeEnvironment); –Logical expressions:  &&: and  ||: or  !: not Resource Attributes

19 Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 19 InputData (optional) –Refers to data used as input by the job: these data are published in the Replica Catalog and stored in the SEs) –GUIDs and/or LFNs –Job must be sent to CE that has the data nearby DataAccessProtocol (mandatory if InputData specified) –The protocol or the list of protocols which the application is able to speak with for accessing InputData on a given SE Data Attributes

20 Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 20 The WMS has to find the best suitable CE where the job will be executed It interacts with Data Management service and Information System The CE chosen has to match the job requirements If 2 or more CEs satisfy all the requirements, the one with the best Rank is chosen –Specified using attributes of resources published in the Information Service –If not specified, default value is used: Rank = -other.GlueCEStateEstimatedResponseTime; quickest response time WMS match making and ranking

21 Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 21 Executable = “gridTest”; StdError = “stderr.log”; StdOutput = “stdout.log”; InputSandbox = {“/home/joda/test/gridTest”}; OutputSandbox = {“stderr.log”, “stdout.log”}; InputData = “lfn:/grid/tutor/testbed0-00019”; DataAccessProtocol = “gridftp”; Requirements = other.Architecture==“INTEL” && \ other.OpSys==“CentOS” && other.FreeCpus >=4; Rank = “other.GlueHostBenchmarkSF00”; Example JDL File

22 Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 22 glite-wms-job-submit [-a] [-d ] [-o ] -o the generated jobId is written in the  Useful for other commands, e.g.: glite-wms-job-status –i (or jobId) -i the status information about edg_jobId contained in the are displayed -a use automatic delegation -d use an existing delegated proxy at the WMS e.g. one generated using: glite-wms-job-delegate-proxy –d Job Submission

23 Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 23 glite-wms-job-list-match Lists resources matching a job description Performs the matchmaking without submitting the job glite-wms-job-cancel Cancels a given job glite-wms-job-status Displays the status of the job glite-wms-job-output Returns the job-output (the OutputSandbox files) to the user glite-wms-job-logging-info Displays logging information about submitted jobs (all the events “pushed” by the various components of the WMS) Very useful for debug purposes Other WMS UI Commands

24 Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 24 Why? –To avoid job failure because it outlived the validity of the initial proxy –To prevent long term proxies from lying around –Use a safe system for storing long term proxies WMS support automatic proxy renewal mechanism as long as the user credentials are handled by a proxy server. 1.Create a proxy using voms-proxy-init --voms 2.Register this proxy with the MyProxy server using myproxy-init -s [-t -c ] -d -n server is the server address (e.g. px.matrix.sara.nl) cred is the number of hours the proxy should be valid on the server proxy is the number of hours renewed proxies should be valid 3.The Proxy is automatic renewed by WMS without user intervention for all the job life Proxy Renewal

25 Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 25 Advanced Job types: Job Collection Set of independent jobs –Collect jobs in single directory: glite-wms-job-submit -a --collection –Advanced collection using global set of attributes [ Type = "Collection"; InputSandbox = {"myjob.exe", "fileA"}; OutputSandboxBaseDestURI = "gsiftp://lxb0707.cern.ch/data/doe"; DefaultNodeShallowRetryCount = 5; Nodes = { [ Executable = "myjob.exe"; InputSandbox = {root.InputSandbox, "fileB"}; OutputSandbox = {"myoutput1.txt"}; Requirements = other.GlueCEPolicyMaxWallClockTime > 1440; ], [ NodeName = "mysubjob"; Executable = "myjob.exe"; OutputSandbox = {"myoutput2.txt"}; ShallowRetryCount = 3; ], [ File = "/home/doe/test.jdl"; ] } ]

26 Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 26 Advanced Job types: Parametric Identical jobs, except the value of a parameter Parameters –List of items –Number  ParameterStart and ParameterStep necessary –_PARAM_ replaced by parameter value [ JobType = "Parametric"; Executable = "myjob.exe"; StdInput = "input_PARAM_.txt"; StdOutput = "output_PARAM_.txt"; StdError = "error_PARAM_.txt"; Parameters = 100; ParameterStart = 1; ParameterStep = 1; InputSandbox = {"myjob.exe", "input_PARAM_.txt“}; OutputSandbox = {"output_PARAM_.txt", "error_PARAM_.txt"}; ]

27 Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 27 Advanced job types: MPI For programs using the MPI parallel library JobType=”MPICH” NodeNumber = –Request n cores on the remote cluster –Scheduling is determined at remote site Submit script that starts up your program using MPI –You can use mpi-start for this Scheduling SMP nodes not yet possible

28 Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 28 Other Advanced Job types Direct Acyclic Graph –Graph shows dependencies between jobs Interactive –Opens graphical window that connects to job nodeA nodeBnodeC mynode nodeD

29 Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 29 Pilot jobs Send agent to a site first –Will fetch workload from central service –Both single and multi user frameworks exist Advantages –Hides problematic sites from the user –Probably less overhead per workload –Makes central scheduling possible Disadvantage –Less efficient scheduling at site –Security concerns with multiple users

30 Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 30 How to get your program on the Grid? Send it with job –Binary package –Compile it on the fly Use software manager accounts –Write permission on special shared directory –Publish tags in information system Use preinstalled packages –Only possible for special collaborations  Example: VL-e software stack Advantages – You will be able to run anywhere Disadvantages –Portability extremely important, otherwise jobs will fail –Overhead Advantages –Software can be validated –Central management takes burden from users Disadvantages –Sites have to support this –Lot of work for software manager Advantages –Very easy for users Disadvantages –Sites have to support it –Lot of work for software packager

31 Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 31 Pointers to advanced topics Can be found in gLite user guide: http://glite.web.cern.ch/glite/documentation/ http://glite.web.cern.ch/glite/documentation/ Advanced sandbox play –Gridftp instead of local files  No space on WMS needed Brokerinfo –Information about local environment (CE, SEs, etc.) Job perusal –Peek at the output while your job is running Automatic retries –RetryCount –ShallowRetryCount


Download ppt "EGEE-III INFSO-RI-222667 Enabling Grids for E-sciencE www.eu-egee.org EGEE and gLite are registered trademarks gLite job submission Fokke Dijkstra Donald."

Similar presentations


Ads by Google