EGEE-II INFSO-RI-031688 Enabling Grids for E-sciencE www.eu-egee.org www.glite.org Architecture of the WMS Yaodong Cheng CC-IHEP, Chinese Academy of Sciences.


1 EGEE-II INFSO-RI-031688 Enabling Grids for E-sciencE www.eu-egee.org www.glite.org Architecture of the WMS. Yaodong Cheng, CC-IHEP, Chinese Academy of Sciences, chyd@ihep.ac.cn. The 6th Joint Training of OMII-Europe & CNGrid, Hong Kong, 10-11 January 2008

2 Outline
• gLite WMS overview
• Workload architecture and components
  – Workload components
    ▪ CE
  – Job states
  – WMProxy
  – Job Description Language
• References

3 gLite components
[Figure: the "User interface", Workload Management System, Information Service, Replica Catalogue, Logging & Book-keeping, Storage Element, Computing Element, and Authorization & Authentication, connected by arrows labelled Job Submit Event, Job Query, Job Status, Input "sandbox" + Broker Info, Input "sandbox", Output "sandbox", Publish SE & CE info, and DataSets info.]

4 WMS functionality
• The Workload Management System (WMS) is the gLite component responsible for managing users' jobs:
  – submission
  – scheduling
  – execution
  – status monitoring
  – output retrieval
• Its core component is the Workload Manager (WM).
• The WM handles the job management requests coming from the WMS clients.
  – A submission request hands responsibility for the job over to the WM.
    ▪ The WM dispatches the job to an appropriate Computing Element for execution, taking into account the requirements and preferences expressed in the job description (JDL file).
• The choice of the best matching resource is the outcome of the so-called match-making process.

5 Current WMS architecture
gLite provides modules for the following workload-related components:
• UI (all major gLite client components to the system)
• WMS node (scheduling on the grid, match-making, complete job management)
  – gLite WMS
  – LCG RB
• LB server (logging and bookkeeping)
• Computing Element (access point to a pool of resources)
  – gLite CE
  – LCG CE
• Worker Node (the actual execution host in a given cluster)
• Torque/Maui LRMS (local scheduler and job management for the CE and the WNs; LSF is also interfaced)
• The StorageIndex interface (to data catalogs)
• MyProxy server (user proxy renewal)
• VOMS (security: authentication / authorization)
• BD-II (LCG)

6 gLite WMS
[Figure: deployment diagram of the gLite WMS, based on Condor-C. Nodes shown: UI, WMS, CE, LB, and worker nodes WN 1 to WN 4. Labels shown: WMS client UI, R-GMA/BDII clients, I/O clients, LFC clients, Globus clients; Network Server, WMProxy, Workload Manager (WM), Local Logger, Job Controller, Condor-C master, schedd, collector, negotiator, launcher, advertiser; Globus gatekeeper, Condor-C master, schedd, BLAHPD, LRMS (PBS server and scheduler, LSF server), GridFTP, Proxy renewd, Log Monitor, one PBS mom per worker node; logd, interlogger, Bookkeeping server; SEIndex, BD-II, CEMon, VOMS, LFC File Catalog.]

7 LCG RB
[Figure: deployment diagram of the LCG RB, based on Condor-G and Globus GRAM. Nodes shown: UI, WMS, CE, LB, and worker nodes WN 1 to WN 4. Labels shown: WMS client UI, R-GMA/BDII clients, LFC clients, Globus clients; Network Server, Resource Broker, WM, Local Logger, Job Controller, Condor-G, Globus GRAM; Globus gatekeeper, Globus GRAM, Globus JobManager (fork, pbs, lsf), LRMS (PBS server and scheduler, LSF server), GridFTP, Proxy renewd, Log Monitor, one PBS mom per worker node; logd, interlogger, Bookkeeping server; File Catalogs, BD-II, VOMS, LFC File Catalog.]

8 The Architecture of WMS

9 Job management requests (submission, cancellation) are expressed via a Job Description Language (JDL).

10 WMS's Architecture
Keeps submission requests: if no matching resource is available, requests are kept for a while, waiting to be dispatched.

11 WMS's Architecture
Repository of resource information, updated via notifications and/or active polling of the sources. It provides the matchmaker with the information needed to decide on the best resources for a request.

12 WMS's Architecture
Finds an appropriate CE or resource for a job request according to the information from the ISM, taking into account job preferences, resource status, and policies on resources.
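
The match-making step on this slide can be pictured as a filter followed by a ranking. The Python sketch below is only an illustration under stated assumptions, not the WMS implementation: the `Resource` fields, the `match` function and the `ce-*.example.org` names are hypothetical, and the real matchmaker evaluates ClassAd expressions against the ISM rather than Python lambdas.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Resource:
    """Hypothetical, simplified view of one CE record cached in the ISM."""
    ce_id: str
    status: str
    free_slots: int
    software: List[str] = field(default_factory=list)


def match(resources, requirements, rank):
    """Keep the resources that satisfy `requirements`, best `rank` first."""
    candidates = [r for r in resources if requirements(r)]
    return sorted(candidates, key=rank, reverse=True)


# Made-up ISM content, for illustration only.
ism = [
    Resource("ce-a.example.org", "Production", 4, ["ALICE-3.07.01"]),
    Resource("ce-b.example.org", "Production", 12, ["ALICE-3.07.01"]),
    Resource("ce-c.example.org", "Draining", 50, ["ALICE-3.07.01"]),
    Resource("ce-d.example.org", "Production", 30, []),
]

matched = match(
    ism,
    # Requirement: CE in production and advertising the needed software tag.
    requirements=lambda r: r.status == "Production"
    and "ALICE-3.07.01" in r.software,
    # Rank (assumed preference): more free slots is better.
    rank=lambda r: r.free_slots,
)
best = matched[0].ce_id if matched else None
print([r.ce_id for r in matched], best)
```

Separating the boolean requirement from the numeric rank mirrors the two roles the slide mentions: hard constraints decide which resources are eligible, preferences decide their order.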

13 WMS's Architecture
Performs the actual job submission and monitoring. Normally this is Condor.

14 WMS's Architecture
The Computing Element is the place where your jobs run.

15 Access points to workload: NS and WMProxy
• The Network Server (NS) is a generic network daemon providing support for the job control functionality. It is responsible for accepting incoming requests from the WMS-UI (e.g. job submission, job removal), which, if valid, are then passed to the Workload Manager.
• The Workload Manager Proxy (WMProxy) is a service providing access to WMS functionality through a Web Services based interface. Besides being the natural replacement of the NS in the move to the SOA approach for the WMS architecture, it provides additional features such as bulk submission and support for shared and compressed sandboxes for compound jobs.

16 WMS components
• Network Server, NS (old service)
  – A generic daemon that accepts requests from the User Interface and verifies the user's credentials
• Workload Manager Proxy, WMProxy (new service)
  – Provides access to WMS functionality through a Web Services based interface
  – Each job submitted to a WMProxy service is given the delegated credentials of the user who submitted it
  – These credentials can then be used to perform operations requiring interactions with other services
  – WMProxy advantages:
    ▪ web service, SOAP
    ▪ job collections, DAG jobs, shared and compressed sandboxes
  – WMProxy caveats:
    ▪ needs delegated credentials
    ▪ delegate once, submit many

17 WMS components (cont.)
• Workload Manager (WM)
  – Is responsible for:
    ▪ calling the Matchmaker to find the resource that best matches the job requirements
    ▪ interacting with the Information System and the File Catalog
    ▪ calculating the ranking of all the matched resources
• Information Supermarket (ISM)
  – Basically consists of a repository of resource information that is available in read-only mode to the matchmaking engine
• Job Adapter
  – Is responsible for:
    ▪ making the final touches to the JDL expression for a job, before it is passed to CondorC for the actual submission
    ▪ creating the job wrapper script that sets up the appropriate execution environment on the CE worker node, including the transfer of the input and output sandboxes

18 WMS components (cont.)
• Job Controller (JC)
  – Is responsible for:
    ▪ converting the ClassAd job description into a Condor submit file
    ▪ handing the job over to CondorC
• Condor
  – Is responsible for:
    ▪ performing the actual job management operations: job submission, removal
• DAG Manager (DAGMan)
  – A meta-scheduler whose purpose is to navigate the graph (DAG), determine dependencies, and follow the execution of the corresponding jobs
• Log Monitor
  – Is responsible for:
    ▪ watching the Condor log file
    ▪ intercepting interesting events concerning active jobs, i.e. events affecting the job state machine
    ▪ triggering appropriate actions

19 Task Queue and scheduling policies
• Task Queue
  – Makes it possible to keep track of requests if no resources are immediately available
  – Non-matching requests are retried periodically (eager scheduling)
  – Or they wait for notification of available resources (lazy scheduling)
• Eager scheduling ("push" model)
  – A job is bound to a resource as soon as possible. Once the decision has been taken, the job is handed over to the selected resource for execution.
• Lazy scheduling ("pull" model)
  – The job is held by the WM until a resource becomes available. When this happens, the resource is matched against the submitted jobs.
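
The role of the Task Queue under both policies can be sketched in a few lines of Python. This is a toy model, not WMS code: the `TaskQueue` class, the `first_fit` matcher and the resource dictionaries are hypothetical, and resource capacity is deliberately not decremented after a match. The same `retry` method serves both policies: under eager scheduling it would be called periodically, under lazy scheduling when a resource announces itself.

```python
from collections import deque


class TaskQueue:
    """Holds requests that could not yet be matched to a resource."""

    def __init__(self, matcher):
        self.matcher = matcher  # callable(job, resources) -> name or None
        self.pending = deque()

    def submit(self, job, resources):
        """Try to bind the job immediately; queue it if nothing matches."""
        target = self.matcher(job, resources)
        if target is None:
            self.pending.append(job)
        return target

    def retry(self, resources):
        """Re-run match-making once for every job currently pending."""
        dispatched = []
        for _ in range(len(self.pending)):
            job = self.pending.popleft()
            target = self.matcher(job, resources)
            if target is None:
                self.pending.append(job)  # still no match: keep waiting
            else:
                dispatched.append((job, target))
        return dispatched


def first_fit(job, resources):
    """Hypothetical matcher: first resource with enough free slots."""
    for name, free_slots in resources.items():
        if free_slots >= job["slots"]:
            return name
    return None


queue = TaskQueue(first_fit)
r1 = queue.submit({"id": "job1", "slots": 2}, {"ce-a": 4})
r2 = queue.submit({"id": "job2", "slots": 8}, {"ce-a": 4})  # queued
later = queue.retry({"ce-a": 4, "ce-b": 16})  # ce-b has appeared
print(r1, r2, later)
```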

20 gLite WMS server architecture
[Figure: request flow from the UI through NS, WM and JC to Condor-C and Condor on the CE, annotated as follows.]
• WMS configuration file: $GLITE_LOCATION/etc/glite_wms.conf
• ismdump.fl file: available resources (CEMon contact, BD-II contact)
• NS
  – accepts/instantiates connections to/from the UI
  – checks user authorization
  – forwards requests to the Workload Manager
• WM
  – accepts requests from the NS
  – performs match-making
  – submits ClassAd-based job requests
  – handles submission and job management via the Job Controller
• JC
  – starts all job-related Condor daemons
  – hands the job over to Condor

21 WMS -> CE
• A Computing Element is built on a homogeneous farm of computing nodes (called Worker Nodes).
• There are also many components inside a CE, such as the gatekeeper and the globus-jobmanager.

22 Gatekeeper
Grants access to the CE and maps the grid user to a local user ID.

23 Batch System
• A cluster of compute nodes controlled by a head node
• Handles the job execution
• Examples: Torque (OpenPBS), PBS

24 A typical case of a gLite-enabled grid
• Many CEs in a gLite-enabled grid
• A few WMSs coordinating the CEs and brokering jobs to the proper CEs

25 Computing Element components
• Gatekeeper
  – Grants access to the CE: authenticates users and maps them to local accounts
  – Forks the globus-jobmanager
• globus-jobmanager
  – Forks Condor-C (on the CE) to help submit jobs to batch systems
• BLAHPD (Batch Local ASCII Helper Protocol Daemon)
  – Offers a single interface for Condor-C (on the CE) to submit jobs to different batch systems
  – BLAHPD commands are used by Condor-C (on the CE) to submit jobs to the batch system
• Batch System
  – Handles the job execution on the available local worker nodes
  – The batch system consists of:
    ▪ the Torque resource manager (formerly known as OpenPBS)
    ▪ the Maui job scheduler
  – A cluster MUST be homogeneous.
• Worker nodes
  – The hosts executing the jobs
  – Also responsible for downloading and uploading the jobs' data from or to the WMS or SE

26 The current gLite CE
[Figure: the gLite CE. Components shown: Gatekeeper, LCAS, LCMAPS, WSS, CEMon, Condor-C, Blahpd, and the local batch system (LSF, PBS/Torque, Condor). Arrows labelled: Notifications, Launch Condor-C, Submit job.]

27 WMS - user interaction: JDL file
Write your Job Description File (JDL file, ClassAds):
    Type = "Job";
    JobType = "Normal";
    Executable = "/bin/bash";
    Arguments = "mySimulationShellScritp.sh";
    StdInput = "stdin";
    StdOutput = "stdout";
    StdError = "stderr";
    InputSandbox = {"mySimulationShellScritp.sh", "stdin", "data-card-1.file", "data-card-2.file"};
    OutputSandbox = {"stderr", "stdout", "outputfile1.data", "histos.zebra"};
    Environment = {"JOB_LOG_FILE=/tmp/myJob.log"};
    Requirements = Member("EGEE-preprod-1.2.4-1.2", other.GlueHostApplicationSoftwareRunTimeEnvironment);

28 How to deal with your job
• Create a proxy:
  – voms-proxy-init --voms egtest
• Make sure you see available CEs matching it:
  – edg-job-list-match myVeryFirstJob.jdl
  – glite-job-list-match myVeryFirstJob.jdl
• Submit your JDL to the WMS / Network Server:
  – edg-job-submit myVeryFirstJob.jdl
  – glite-job-submit myVeryFirstJob.jdl
• Query the job status:
  – edg-job-status https://lxb1419.cern.ch:9000/4wFmeFBaplXJvNXPe6UPXA
  – glite-job-status https://lxb1419.cern.ch:9000/4wFmeFBaplXJvNXPe6UPXA
• Get the job's output:
  – edg-job-get-output https://lxb1419.cern.ch:9000/4wFmeFBaplXJvNXPe6UPXA
  – glite-job-output https://lxb1419.cern.ch:9000/4wFmeFBaplXJvNXPe6UPXA

29 What happens when we submit a job to the gLite WMS server
• A JDL file is defined on the UI.
• A proxy file is created by the user from his or her certificate using VOMS (voms-proxy-init).
• The JDL is submitted to the WMS (NS).
  – A job wrapper is created on the UI and transferred to the WMS node, together with the user's proxy.
  – The Network Server checks authorization and then forwards the job to the Workload Manager on the same machine (WMS).

30 A user submits a job...
• The WM performs the match-making, matching the available resources stored in the ISM against the ClassAds describing the requirements of the job.
• Hand-off to Condor-C: all major required Condor processes are started by the Job Controller.
  – The corresponding user's Condor schedd has to be started on the target CE.
  – This is done using the Globus gatekeeper and the fork jobmanager running on the matching CE.
• The ball is now in Condor's court: Condor-to-Condor job management (Condor-C).

31 What happens when we submit a job to the LCG RB
• A JDL file is defined on the UI.
• A proxy file is created by the user from his or her certificate using VOMS (voms-proxy-init).
• The JDL is submitted to the WMS (NS).
  – A job wrapper is created on the UI and transferred to the WMS node, together with the user's proxy.
  – The Network Server checks authorization and then forwards the job to the Resource Broker.

32 A user submits a job...
• The RB performs the match-making, matching the available resources stored in the Information System (BDII) against the ClassAds describing the requirements of the job.
  – There is no local cache of the IS on the LCG RB.
• Hand-off to Condor-G:
  – Globus GRAM is used to handle the job and the proxy.
  – Through the Globus gatekeeper, the Globus jobmanager (usually PBS or LSF) is accessed on the destination CE.

33 Job State Machine

34 Job State Machine
Submitted: the job has been entered by the user at the User Interface but not yet transferred to the Network Server for processing.

35 Job State Machine
Waiting: the job has been accepted by the NS and is waiting for Workload Manager processing or is being processed by the WM Helper modules.

36 Job State Machine
Ready: the job has been processed by the WM but not yet transferred to the CE (local batch system queue).

37 Job State Machine
Scheduled: the job is waiting in the queue on the CE.

38 Job State Machine
Running: the job is running.

39 Job State Machine
Done: the job has exited or is considered to be in a terminal state by CondorC (e.g., submission to the CE has failed in an unrecoverable way).

40 Job State Machine
Aborted: job processing was aborted by the WMS (waiting in the WM queue or on the CE for too long, expiration of user credentials).

41 Job State Machine
Cancelled: the job has been successfully cancelled on user request.

42 Job State Machine
Cleared: the output sandbox was transferred to the user or removed due to a timeout.

43 Possible job states
Flag         Meaning
SUBMITTED    submission logged in the LB
WAIT         job match-making for resources
READY        job being sent to the executing CE
SCHEDULED    job scheduled in the CE queue manager
RUNNING      job executing on a WN of the selected CE queue
DONE         job terminated without grid errors
CLEARED      job output retrieved
ABORT        job aborted by middleware, check reason
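
The job states listed in these slides lend themselves to a small state-machine sketch. The Python below is an illustration under simplifying assumptions, not the Logging & Bookkeeping implementation: it assumes a single linear "happy path", treats Cleared, Aborted and Cancelled as final, and allows abort or cancel from any state before Done. The real service may permit other transitions (for example on resubmission), so the transition rules here are hypothetical.

```python
from enum import Enum


class JobState(Enum):
    SUBMITTED = "Submitted"
    WAITING = "Waiting"
    READY = "Ready"
    SCHEDULED = "Scheduled"
    RUNNING = "Running"
    DONE = "Done"
    CLEARED = "Cleared"
    ABORTED = "Aborted"
    CANCELLED = "Cancelled"


# Assumed linear path of a job that completes and whose output is retrieved.
HAPPY_PATH = [
    JobState.SUBMITTED, JobState.WAITING, JobState.READY,
    JobState.SCHEDULED, JobState.RUNNING, JobState.DONE, JobState.CLEARED,
]
# Assumed final states: nothing follows them in this simplified model.
TERMINAL = {JobState.CLEARED, JobState.ABORTED, JobState.CANCELLED}


def can_transition(current, new):
    """Return True if `new` may follow `current` in this simplified model."""
    if current in TERMINAL:
        return False
    if new in (JobState.ABORTED, JobState.CANCELLED):
        # Assumption: abort/cancel is possible only before the job is Done.
        return current is not JobState.DONE
    position = HAPPY_PATH.index(current)
    return position + 1 < len(HAPPY_PATH) and HAPPY_PATH[position + 1] is new


print(can_transition(JobState.RUNNING, JobState.DONE))
print(can_transition(JobState.SUBMITTED, JobState.RUNNING))
```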

44 WMS client tools
• The most relevant commands to interact with the WMS (NS):
  – edg-job-submit
  – edg-job-list-match
  – edg-job-status
  – edg-job-get-output
  – edg-job-cancel
• In gLite 3.0:
  – glite-job-submit
  – glite-job-list-match
  – glite-job-status
  – glite-job-output
  – glite-job-cancel

45 gLite WMProxy
WMProxy (Workload Manager Proxy):
– is a new service providing access to the gLite Workload Management System (WMS) functionality through a simple Web Services based interface
– has been designed to handle a large number of job submission requests
  ▪ gLite 1.5 => ~180 secs for 500 jobs
  ▪ the short-term goal is ~60 secs for 1000 jobs
– provides additional features such as bulk submission and support for shared and compressed sandboxes for compound jobs
– is the natural replacement of the NS in the move to the SOA approach

46 WMProxy: new request types
• Support for the new types relies strongly on newly developed JDL converters and on the DAG submission support
  – all JDL conversions are performed on the server
  – a single submission for several jobs
• All new request types can be monitored and controlled through a single handle (the request id)
  – each sub-job can, however, be followed up and controlled independently through its own id
• "Smarter" WMS client commands/API
  – allow submission of DAGs, collections and parametric jobs exploiting the concept of a "shared sandbox"
  – allow automatic generation and submission of collections and DAGs from sets of JDL files located in user-specified directories on the UI

47 WMProxy CLI commands
The commands to interact with the WMProxy service are:
– glite-wms-job-submit
– glite-wms-job-list-match
– glite-wms-job-cancel
– glite-wms-job-output

48 WMProxy: submitting a collection of jobs
• Place all the JDLs to be submitted in a directory (for example ./Collect)
• voms-proxy-init --voms gilda
• glite-wms-job-delegate-proxy -d DelegString
• glite-wms-job-submit -d DelegString -o myJIDs --collection ./Collect
• glite-wms-job-status -i myJIDs
• glite-wms-job-output -i myJIDs
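
The sequence on this slide can be captured as data before anything is run. The Python sketch below only builds the argument lists; it does not contact any grid service. The helper name `collection_workflow` is hypothetical, and the submit step is written as `glite-wms-job-submit` on the assumption that the WMProxy client is the one intended for a `--collection` submission.

```python
def collection_workflow(vo, delegation_id, jdl_dir, jobid_file):
    """Return the command lines for a delegate-once, submit-many session."""
    return [
        # Create a VOMS proxy for the given virtual organisation.
        ["voms-proxy-init", "--voms", vo],
        # Delegate the proxy once under a reusable identifier.
        ["glite-wms-job-delegate-proxy", "-d", delegation_id],
        # Submit every JDL found in jdl_dir as one collection.
        ["glite-wms-job-submit", "-d", delegation_id,
         "-o", jobid_file, "--collection", jdl_dir],
        # Query status and fetch output using the saved job identifiers.
        ["glite-wms-job-status", "-i", jobid_file],
        ["glite-wms-job-output", "-i", jobid_file],
    ]


steps = collection_workflow("gilda", "DelegString", "./Collect", "myJIDs")
for argv in steps:
    print(" ".join(argv))
```

On a UI machine where the gLite clients are installed, each list could be passed to `subprocess.run(argv, check=True)`; keeping the commands as lists avoids shell quoting problems.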

49 Job Description Language

50 Job Description Language
The JDL is used in gLite to specify the job's characteristics and constraints, which are used during the match-making process to select the best resources that satisfy the job's requirements.

51 Job Description Language (cont.)
• The JDL syntax consists of statements like:
    Attribute = value;
• Comments must be preceded by a sharp character (#) or follow the C++ syntax.
• WARNING: The JDL is sensitive to blank characters and tabs. No blank characters or tabs should follow the semicolon at the end of a line.
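
The warning about blanks after the semicolon is easy to check mechanically before submitting. The following Python sketch is a hypothetical helper, not part of any gLite tool: it reports the lines of a JDL text on which spaces or tabs follow the closing semicolon.

```python
def lines_with_trailing_blanks(jdl_text):
    """Return 1-based numbers of lines where blanks or tabs follow the ';'."""
    bad = []
    for number, line in enumerate(jdl_text.splitlines(), start=1):
        stripped = line.rstrip(" \t")
        # Flag only statement lines: they end in ';' once blanks are removed.
        if stripped.endswith(";") and stripped != line:
            bad.append(number)
    return bad


sample = (
    'Executable = "test.sh";\n'
    'StdOutput = "std.out"; \n'   # trailing space
    '# a comment\n'
    'StdError = "std.err";\t\n'   # trailing tab
)
problems = lines_with_trailing_blanks(sample)
print(problems)
```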

52 Job Description Language (cont.)
• In a JDL, some attributes are mandatory while others are optional.
• An "essential" JDL is the following:
    Executable = "test.sh";
    StdOutput = "std.out";
    StdError = "std.err";
    InputSandbox = {"test.sh"};
    OutputSandbox = {"std.out", "std.err"};
• If needed, arguments to the executable can be passed:
    Arguments = "Hello World!";
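
When many similar jobs are needed, the "essential" JDL above can be generated rather than typed. The Python sketch below is a minimal illustration with hypothetical helper names (`jdl_value`, `render_jdl`). It assumes every value is either a plain string or a list of strings; it performs no escaping and does not handle expression-valued attributes such as Requirements.

```python
def jdl_value(value):
    """Render a string as "..." and a list or tuple as {"...", "..."}."""
    if isinstance(value, (list, tuple)):
        return "{" + ", ".join(jdl_value(item) for item in value) + "}"
    return '"' + str(value) + '"'


def render_jdl(attributes):
    """Render a dict as one 'Attribute = value;' statement per line."""
    return "\n".join(
        f"{name} = {jdl_value(value)};" for name, value in attributes.items()
    )


job = {
    "Executable": "test.sh",
    "StdOutput": "std.out",
    "StdError": "std.err",
    "InputSandbox": ["test.sh"],
    "OutputSandbox": ["std.out", "std.err"],
}
text = render_jdl(job)
print(text)
```

Joining the statements with a bare newline also guarantees that nothing follows each semicolon.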

53 Job Description Language (cont.)
• If the argument contains quoted strings, the quotes must be escaped with a backslash, e.g.
    Arguments = "\"Hello World!\" 10";
• Special characters such as &, |, >, < are only allowed if specified inside a quoted string or preceded by a triple backslash, e.g.
    Arguments = "-f file1\\\&file2";
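
The quote-escaping rule can be applied programmatically. The helper below is a hypothetical Python sketch that covers only the first rule on this slide, embedded double quotes; it does not attempt the triple-backslash handling of characters such as & or |.

```python
def quote_jdl_string(raw):
    """Wrap `raw` in double quotes, escaping embedded double quotes."""
    return '"' + raw.replace('"', '\\"') + '"'


# Arguments whose first token is itself a quoted string.
line = "Arguments = " + quote_jdl_string('"Hello World!" 10') + ";"
print(line)
```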

54 Workload Manager Service
The JDL allows the description of the following request types supported by the WMS:
– Job: a simple application
– DAG: a directed acyclic graph of dependent jobs (with WMProxy)
– Collection: a set of independent jobs (with WMProxy)

55 JDL: Relevant Attributes
JobType (optional)
– Normal (simple, sequential job), Interactive, MPICH, Checkpointable, Partitionable, Parametric
– Or a combination of them: Checkpointable, Interactive; Checkpointable, MPI
– E.g. JobType = "Normal";

56 JDL: Relevant Attributes (cont.)
Executable (mandatory)
– This is a string representing the executable/command name.
– The user can specify an executable which is already on the remote CE:
    Executable = "/opt/EGEODE/GCT/egeode.sh";
– The user can provide a local executable name, which will be staged from the UI to the WN:
    Executable = "egeode.sh";
    InputSandbox = {"/home/larocca/egeode/egeode.sh"};

57 JDL: Relevant Attributes (cont.)
Arguments (optional)
– This is a string containing all the job command line arguments.
– E.g. if your executable sum has to be started as:
    $ sum N1 N2 -out result.out
  then:
    Executable = "sum";
    Arguments = "N1 N2 -out result.out";

58 JDL: Relevant Attributes (cont.)
Environment (optional)
– List of environment settings needed by the job to run properly
– E.g. Environment = {"JAVA_HOME=/usr/java/j2sdk1.4.2_08"};
InputSandbox (optional)
– List of files on the UI local disk needed by the job to run properly
– The listed files are automatically staged to the remote resource
– E.g. InputSandbox = {"myscript.sh", "/tmp/cc.sh"};

59 JDL: Relevant Attributes (cont.)
OutputSandbox (optional)
– List of files, generated by the job, which have to be retrieved from the CE
– E.g. OutputSandbox = {"std.out", "std.err", "image.png"};

60 JDL: Relevant Attributes (cont.)
Requirements (optional)
– Job requirements on computing resources
– Specified using attributes of resources published in the Information Service
– If not specified, the default value defined in the UI configuration file is used
– Default: Requirements = other.GlueCEStateStatus == "Production";
– Examples:
    Requirements = other.GlueCEUniqueID == "adc006.cern.ch:2119/jobmanager-pbs-infinite";
    Requirements = Member("ALICE-3.07.01", other.GlueHostApplicationSoftwareRunTimeEnvironment);

61 References
• JDL Attributes
  – http://server11.infn.it/workload-grid/docs/DataGrid-01-TEN-0142-0_2.pdf
  – https://edms.cern.ch/document/590869/1
  – http://egee-jra1-wm.mi.infn.it/egee-jra1-wm/api_doc/wms_jdl/index.html
• LCG-2 User Guide Manual Series
  – https://edms.cern.ch/file/454439/LCG-2-UserGuide.html

62 References
• gLite 3.0 User Guide
  – https://edms.cern.ch/file/722398/1.1/gLite-3-UserGuide.pdf
• GLUE Schema
  – http://infnforge.cnaf.infn.it/glueinfomodel/
• JDL attributes specification for WMProxy
  – https://edms.cern.ch/document/590869/1
• WMProxy quickstart
  – http://egee-jra1-wm.mi.infn.it/egee-jra1-wm/wmproxy_client_quickstart.shtml
• WMS user guides
  – https://edms.cern.ch/document/572489/1
• EGEE: www.eu-egee.org
• gLite: http://www.glite.org/
• LCG: http://lcg.web.cern.ch/LCG/
• Open Grid Forum: http://www.gridforum.org/
• Globus Alliance: http://www.globus.org/
• VDT: http://www.cs.wisc.edu/vdt/

63 Questions…

