Introduction to Computing Element HsiKai Wang Academia Sinica Grid Computing Center, Taiwan.


1 Introduction to Computing Element
HsiKai Wang, Academia Sinica Grid Computing Center, Taiwan

2 Outline
- Introduction
- WMS - Workload Management System
- CREAM - Computing Resource Execution And Management
- Example
  - Simple case for WMS
  - Simple case for CREAM

3 Overview of gLite Middleware
[Diagram: gLite service groups]
- Access: CLI, API
- Security Services: Authentication, Authorization, Auditing
- Information & Monitoring Services: Information & Monitoring, Network Monitoring, Service Discovery
- Data Services: Metadata Catalog, File & Replica Catalog, Storage Element, Data Movement
- Job Mgmt. Services: Computing Element, Workload Management, Accounting, Job Provenance, Package Manager

4 How to work

5 Compute Element
[Diagram: submission paths from the User Interface to the Computing Element]
- Clients: Condor-G, Globus client, gLite WMS (Workload Manager with ICE), user CREAM or BES client
- Computing Element: LCG-CE (GT2/4 + add-ons), Condor-C, CREAM, CEMon, BLAH, GIP, Batch System
- Other services: EGEE authZ, InfoSys, Accounting
- Legend: in production, existing prototype, gLite component, non-gLite component, User / Resource

6 Outline
- Introduction
- WMS - Workload Management System
- CREAM - Computing Resource Execution And Management
- Example
  - Simple case for WMS
  - Simple case for CREAM

7 Workload Management System
Ref: gLite-3.2 User Guide
- The purpose of the Workload Management System (WMS) is:
  - to accept user jobs
  - to assign them to the most appropriate Computing Element
  - to record their status
  - to retrieve their output
- The WMS used to be called the Resource Broker (RB).
- The service is called gLite-WMS.

8 Job Workflow in gLite-WMS
[Diagram: job workflow between the components below. Labels at the UI: JDL, Input Sandbox, Output Sandbox]
- UI: User Interface
- WMS: Workload Management System
- IS: Information System
- File catalog
- SE: Storage Element
- CE: Computing Element
- WN: Worker Node

9 [Diagram: WMS architecture]
- Components: UI, WMProxy, Workload Manager, Job Controller (CondorG), LFC, Information Service, Computing Element, Storage Element, RB storage
- Labels: CE characteristics & status, SE characteristics & status
- Command: glite-wms-job-submit myjob.jdl
- Shown in transit: Job, Input Sandbox files
- Job states so far: submitted
- WMProxy is responsible for accepting incoming requests.

10 [Diagram: WMS architecture, same components as slide 9]
- Shown in the WMS: Job
- Job states so far: submitted, waiting
- Workload Manager (WM): responsible for taking the appropriate actions to satisfy the request.

11 [Diagram: WMS architecture, same components as slide 9, plus the Matchmaker/Broker]
- Job states so far: submitted, waiting
- Question posed: where must this job be executed?
- Matchmaker: responsible for finding the "best" CE to which to submit a job.

12 [Diagram: WMS architecture, same components as slide 9, plus the Matchmaker/Broker and the Information supermarket]
- Job states so far: submitted, waiting
- Information supermarket: responsible for the resource information made available to the Matchmaker.

13 [Diagram: WMS architecture, same components as slide 9, plus the Matchmaker/Broker and the Information supermarket]
- Job states so far: submitted, waiting
- Result of matchmaking: CE choice
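The matchmaking step described on slides 11 to 13 can be sketched as a filter followed by a ranking. This is only an illustration of the idea, not the WMS implementation: the `CEInfo` record, the `match_ce` function, the example CE identifiers and the "most free CPUs" ranking are all assumptions made for this sketch. In a real job, the filter and the ordering come from the JDL `Requirements` and `Rank` expressions.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class CEInfo:
    """Hypothetical cached view of one Computing Element."""
    ce_id: str
    free_cpus: int
    waiting_jobs: int


def match_ce(
    ces: Iterable[CEInfo],
    requirement: Callable[[CEInfo], bool],
    rank: Callable[[CEInfo], float],
) -> Optional[CEInfo]:
    """Return the highest-ranked CE that satisfies the requirement, or None."""
    candidates = [ce for ce in ces if requirement(ce)]
    return max(candidates, key=rank) if candidates else None


# Made-up resource information, standing in for the Information supermarket.
CES = [
    CEInfo("ce-a.example.org:8443/cream-pbs-demo", free_cpus=10, waiting_jobs=4),
    CEInfo("ce-b.example.org:8443/cream-pbs-demo", free_cpus=48, waiting_jobs=0),
    CEInfo("ce-c.example.org:8443/cream-pbs-demo", free_cpus=0, waiting_jobs=9),
]

# Requirement: at least one free CPU. Rank: prefer the CE with the most free CPUs.
best = match_ce(
    CES,
    requirement=lambda ce: ce.free_cpus > 0,
    rank=lambda ce: ce.free_cpus,
)
```

If no CE satisfies the requirement, `match_ce` returns `None`, which roughly corresponds to a job that stays in the waiting state because nothing matches.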

14 [Diagram: WMS architecture, same components as slide 9, plus the Task Queue]
- Labels: CE characteristics & status, SE characteristics & status
- Shown in the WMS: Job
- Job states so far: submitted, waiting, ready
- Job Controller (JC): responsible for the actual job management operations (done via CondorG).

15 [Diagram: WMS architecture, same components as slide 9, plus the Task Queue]
- Labels: CE characteristics & status, SE characteristics & status
- Shown in transit: Job, Input Sandbox files
- Job states so far: submitted, waiting, ready, scheduled

16 [Diagram: WMS architecture, same components as slide 9, plus the Task Queue]
- Shown at the Computing Element: Job, Input Sandbox
- Job states so far: submitted, waiting, ready, scheduled, running
- "Grid enabled" data transfers/accesses

17 [Diagram: WMS architecture, same components as slide 9, plus the Task Queue]
- Shown in transit: Output Sandbox files
- Job states so far: submitted, waiting, ready, scheduled, running, done

18 [Diagram: WMS architecture with UI, NS, Workload Manager, Job Controller (CondorG), LFC, Information Service, Computing Element, Storage Element, RB storage, Task Queue]
- Command: glite-wms-job-output
- Shown in transit: Output Sandbox files
- Job states so far: submitted, waiting, ready, scheduled, running, done, cleared

19 [Diagram: Logging & Bookkeeping]
- Components: UI, Logging & Bookkeeping (LB), LB proxy, WMProxy, Workload Manager, Job Controller (CondorG), Computing Element
- Labels: Log of job events, Job status
- Commands: glite-wms-job-status, glite-wms-job-logging-info
- LB: receives and stores job events; processes the corresponding job status.
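The LB role on slide 19 (store job events, derive the job status) can be illustrated with a toy in-memory event log. This is a deliberately simplified sketch: the `EventLog` class and its methods are hypothetical names, and the status here is simply the most recent recorded state, whereas the real LB service processes richer event records.

```python
from collections import defaultdict
from typing import Dict, List, Optional


class EventLog:
    """Toy stand-in for Logging & Bookkeeping: stores events per job."""

    def __init__(self) -> None:
        self._events: Dict[str, List[str]] = defaultdict(list)

    def record(self, job_id: str, state: str) -> None:
        """Append one state-change event for the given job."""
        self._events[job_id].append(state)

    def status(self, job_id: str) -> Optional[str]:
        """Current status: the last recorded state, or None if the job is unknown."""
        events = self._events.get(job_id)
        return events[-1] if events else None

    def history(self, job_id: str) -> List[str]:
        """All recorded states in order (a copy, so callers cannot mutate the log)."""
        return list(self._events.get(job_id, []))


log = EventLog()
for state in ("SUBMITTED", "WAITING", "READY"):
    log.record("job-1", state)
```

In this sketch, `status` plays the part of glite-wms-job-status and `history` the part of glite-wms-job-logging-info.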

20 Job state machine

21 gLite-WMS Job States
Ref: gLite-3.2 User Guide
Status      Description
SUBMITTED   submission logged in the LB
WAITING     job matchmaking for resources
READY       job being sent to the executing CE
SCHEDULED   job scheduled in the CE queue manager
RUNNING     job executing on a WN of the selected CE queue
DONE        job terminated without grid errors
CLEARED     job output retrieved
ABORTED     job aborted by the middleware; check the reason
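The order in which a successful job passes through these states, as walked through on slides 9 to 18, can be written down as a small lookup. This sketch covers only that linear path plus ABORTED as an end state; resubmission and cancellation are not modelled, and the names `HAPPY_PATH`, `next_state` and `is_final` are made up for illustration.

```python
from typing import Optional

# Linear path of a job that runs to completion and has its output retrieved.
HAPPY_PATH = [
    "SUBMITTED",
    "WAITING",
    "READY",
    "SCHEDULED",
    "RUNNING",
    "DONE",
    "CLEARED",
]

# States after which nothing further happens in this simplified model.
FINAL_STATES = {"CLEARED", "ABORTED"}


def next_state(state: str) -> Optional[str]:
    """Return the following state on the happy path, or None for a final state."""
    if state in FINAL_STATES:
        return None
    if state not in HAPPY_PATH:
        raise ValueError(f"unknown job state: {state!r}")
    return HAPPY_PATH[HAPPY_PATH.index(state) + 1]


def is_final(state: str) -> bool:
    """True once the job has been cleared or aborted."""
    return state in FINAL_STATES
```

A status-polling loop could, for example, stop as soon as `is_final` returns true, or as soon as the state is DONE and the output can be fetched.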

22 Outline
- Introduction
- WMS - Workload Management System
- CREAM - Computing Resource Execution And Management
- Example
  - Simple case for WMS
  - Simple case for CREAM

23 Computing Resource Execution And Management
- CREAM: Web Service Computing Element
  - The CREAM WSDL allows custom user interfaces to be defined
  - A C++ CLI allows direct submission
- Lightweight
- Fast notification of job status changes
  - via CEMon
- Improved security
  - no "fork-scheduler"
- Will support bulk jobs on the CE
  - optimization of staging of input sandboxes for jobs with shared files
- ICE: Interface to CREAM Environment
  - being integrated in WMS for submissions to CREAM

24 Job State Machine
Ref: gLite-3.2 User Guide

25 CREAM Job States
Status          Description
REGISTERED      the job has been registered but it has not been started yet.
PENDING         the job has been started, but it has still to be submitted to the LRMS abstraction layer module (i.e. BLAH).
IDLE            the job is idle in the Local Resource Management System (LRMS).
RUNNING         the job wrapper, which "encompasses" the user job, is running in the LRMS.
REALLY-RUNNING  the actual user job (the one specified as Executable in the job JDL) is running in the LRMS.
HELD            the job is held (suspended) in the LRMS.
CANCELLED       the job has been cancelled.
DONE-OK         the job has successfully been executed.
DONE-FAILED     the job has been executed, but some errors occurred.
ABORTED         errors occurred during the "management" of the job, e.g. the submission to the LRMS abstraction layer software (BLAH) failed.
UNKNOWN         the job is in an unknown status.
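When polling a CREAM job from a script, it helps to know which of these states mean "keep waiting" and which mean "stop". The grouping below is a minimal sketch based on the descriptions in the table; the function name `classify` and the three labels it returns are choices made for this example.

```python
# States in which the job will not change any further.
TERMINAL = frozenset({"DONE-OK", "DONE-FAILED", "CANCELLED", "ABORTED"})

# States in which the job is still on its way through the CE.
ACTIVE = frozenset(
    {"REGISTERED", "PENDING", "IDLE", "RUNNING", "REALLY-RUNNING", "HELD"}
)


def classify(status: str) -> str:
    """Map a CREAM status string to 'terminal', 'active' or 'unknown'."""
    normalized = status.strip().upper()
    if normalized in TERMINAL:
        return "terminal"
    if normalized in ACTIVE:
        return "active"
    # Covers the UNKNOWN status itself and any string not listed in the table.
    return "unknown"
```

Only `DONE-OK` indicates success; a caller would normally inspect the other terminal states before trying to fetch any output.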

26 Job Control Commands
Ref: gLite-3.2 User Guide
Delegate proxy
  gLite WMS:   glite-wms-job-delegate-proxy -d delegID
  gLite CREAM: glite-ce-job-delegate-proxy -e endpoint -d delegID
Submit
  gLite WMS:   glite-wms-job-submit [-d delegID] [-a] [-o joblist] jdlfile
  gLite CREAM: glite-ce-job-submit [-d delegID] [-a] [-o joblist] -r ceIDs jdlfile
Status
  gLite WMS:   glite-wms-job-status -i joblist | jobIDs
  gLite CREAM: glite-ce-job-status -i joblist | jobIDs
Logging
  gLite WMS:   glite-wms-job-logging-info -i joblist | jobIDs
Output
  gLite WMS:   glite-wms-job-output [--dir outdir] -i joblist | jobIDs
Cancel
  gLite WMS:   glite-wms-job-cancel -i joblist | jobID
  gLite CREAM: glite-ce-job-cancel -i joblist | jobID
Compatible resources
  gLite WMS:   glite-wms-job-list-match [-d delegID] [-a] jdlfile
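The two submit commands differ mainly in the `-r` option, which names the target CE for a direct CREAM submission. The helper below is a small sketch that assembles the argument list for either case using only the options shown in the table. The function name is hypothetical, and treating `-d` and `-a` as alternatives (an existing delegation ID versus automatic delegation) is an assumption of this example.

```python
from typing import List, Optional


def build_submit_command(
    jdl_file: str,
    ce_id: Optional[str] = None,
    deleg_id: Optional[str] = None,
    joblist: Optional[str] = None,
) -> List[str]:
    """Build an argument list for glite-wms-job-submit or glite-ce-job-submit.

    A ce_id selects direct CREAM submission; without it the job goes to the WMS.
    """
    cmd = ["glite-ce-job-submit" if ce_id else "glite-wms-job-submit"]
    # Use the given delegation ID, otherwise ask for automatic delegation.
    cmd += ["-d", deleg_id] if deleg_id else ["-a"]
    if joblist:
        cmd += ["-o", joblist]
    if ce_id:
        cmd += ["-r", ce_id]
    cmd.append(jdl_file)
    return cmd
```

The returned list can be passed to `subprocess.run` on a configured User Interface machine; building a list rather than a single string avoids shell quoting problems.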

27 Outline
- Introduction
- WMS - Workload Management System
- CREAM - Computing Resource Execution And Management
- Example
  - Simple case for WMS
  - Simple case for CREAM

28 Job Description Language for WMS
Ref: gLite-3.2 User Guide
[hkw00@ui03 wms]$ ls
checkHost.sh  Host_wms.jdl
[hkw00@ui03 wms]$ cat Host_wms.jdl
JobType = "Normal";
CPUNumber = 1;
Executable = "checkHost.sh";
StdOutput = "std.out";
StdError = "std.err";
InputSandbox = {"checkHost.sh"};
OutputSandbox = {"std.out", "std.err", "Host.log"};
RetryCount = 5;
Requirements = other.GlueCEUniqueID == "as-ce01.euasiagrid.org:8443/cream-pbs-euasia";
[hkw00@ui03 wms]$ cat checkHost.sh
#!/bin/sh
echo "HOST: `hostname`" >> Host.log
printenv >> Host.log
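A JDL file like the one above is a list of `Name = value;` attributes, where strings are quoted, integers are bare, lists use braces, and expressions such as `Requirements` are written unquoted. The sketch below generates that text from a Python mapping. The `Expr` marker class and the function names are invented for this example, and the code does not escape quotes inside strings, so it is only suitable for simple values.

```python
from typing import Mapping, Sequence, Union


class Expr(str):
    """Marks a string as a raw JDL expression that must not be quoted."""


Value = Union[int, str, Sequence[str]]


def render_value(value: Value) -> str:
    """Render one attribute value in JDL syntax."""
    if isinstance(value, Expr):
        return str(value)            # expression: emitted as-is
    if isinstance(value, str):
        return f'"{value}"'          # plain string: quoted
    if isinstance(value, int):
        return str(value)            # integer: bare
    # Anything else is treated as a list of strings.
    return "{" + ", ".join(f'"{item}"' for item in value) + "}"


def render_jdl(attrs: Mapping[str, Value]) -> str:
    """Render a mapping as JDL text, one 'Name = value;' line per attribute."""
    return "\n".join(
        f"{name} = {render_value(value)};" for name, value in attrs.items()
    )


jdl_text = render_jdl(
    {
        "JobType": "Normal",
        "Executable": "checkHost.sh",
        "InputSandbox": ["checkHost.sh"],
        "RetryCount": 5,
        "Requirements": Expr(
            'other.GlueCEUniqueID == "as-ce01.euasiagrid.org:8443/cream-pbs-euasia"'
        ),
    }
)
```

Generating the file this way guarantees plain ASCII double quotes, which matters because JDL pasted from slides or word processors often picks up typographic quotes that the parser rejects.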

29 Example for WMS
Ref: gLite-3.2 User Guide
[hkw00@ui03 wms]$ glite-wms-job-submit -a Host_wms.jdl
====================== glite-wms-job-submit Success ======================
The job has been successfully submitted to the WMProxy
Your job identifier is:
https://lb04.grid.sinica.edu.tw:9000/FtH87_dKEfp-7qE54xk0Sw
==========================================================================
[hkw00@ui03 wms]$ glite-wms-job-status https://lb04.grid.sinica.edu.tw:9000/FtH87_dKEfp-7qE54xk0Sw
======================= glite-wms-job-status Success =====================
BOOKKEEPING INFORMATION:
Status info for the Job : https://lb04.grid.sinica.edu.tw:9000/FtH87_dKEfp-7qE54xk0Sw
Current Status:     Scheduled
Status Reason:      unavailable
Destination:        as-ce01.euasiagrid.org:8443/cream-pbs-euasia
Submitted:          Sat Feb 2 16:35:10 2013 UTC
==========================================================================
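Scripts that chain submission and status checks need the job identifier from the submit output. A minimal sketch, assuming the output keeps the "Your job identifier is:" wording shown above; the function name and the regular expression are choices made for this example.

```python
import re
from typing import Optional

# The identifier is the https URL that follows the announcement line.
JOB_ID_RE = re.compile(r"Your job identifier is:\s*(https://\S+)")


def extract_job_id(submit_output: str) -> Optional[str]:
    """Return the job identifier found in glite-wms-job-submit output, or None."""
    match = JOB_ID_RE.search(submit_output)
    return match.group(1) if match else None
```

The `-o joblist` option listed among the job control commands offers an alternative: the command writes the identifier to a file, which later commands read with `-i joblist`.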

30 Example for WMS
Ref: gLite-3.2 User Guide
[hkw00@ui03 wms]$ glite-wms-job-output --dir . https://lb04.grid.sinica.edu.tw:9000/FtH87_dKEfp-7qE54xk0Sw
================================================================================
JOB GET OUTPUT OUTCOME
Output sandbox files for the job:
https://lb04.grid.sinica.edu.tw:9000/FtH87_dKEfp-7qE54xk0Sw
have been successfully retrieved and stored in the directory:
/home/hkw00/HAII/ce/wms/hkw00_FtH87_dKEfp-7qE54xk0Sw
================================================================================
[hkw00@ui03 wms]$ ls /home/hkw00/HAII/ce/wms/hkw00_FtH87_dKEfp-7qE54xk0Sw
Host.log  std.err  std.out

31 Outline
- Introduction
- WMS - Workload Management System
- CREAM - Computing Resource Execution And Management
- Example
  - Simple case for WMS
  - Simple case for CREAM

32 Job Description Language for CREAM
Ref: gLite-3.2 User Guide
[hkw00@ui03 cream]$ ls
checkHost.sh  Host_cream.jdl
[hkw00@ui03 cream]$ cat Host_cream.jdl
JobType = "Normal";
CPUNumber = 1;
Executable = "checkHost.sh";
StdOutput = "std.out";
StdError = "std.err";
InputSandBox = {"/home/hkw00/HAII/ce/cream/checkHost.sh"};
OutputSandBox = {"Host.log"};
OutputSandboxDestURI = {"gsiftp://as-ds01.euasiagrid.org/dpm/euasiagrid.org/home/euasia/hkw00/"};
RetryCount = 5;
[hkw00@ui03 cream]$ cat checkHost.sh
#!/bin/sh
echo "HOST: `hostname`" >> Host.log
printenv >> Host.log

33 Example CREAM
Ref: gLite-3.2 User Guide
[hkw00@ui03 cream]$ lcg-infosites --vo euasia ce
#   CPU    Free Total Jobs      Running Waiting ComputingElement
----------------------------------------------------------------
    256     155          0            0       0 as-ce01.euasiagrid.org:8443/cream-pbs-euasia
     64      48          0            0       0 ce-qamar.utmgrid.utm.my:8443/cream-pbs-euasia
     56      50          0            0       0 ce.utmgrid.utm.my:8443/cream-pbs-euasia
(...)
[hkw00@ui03 cream]$ glite-ce-job-submit -r as-ce01.euasiagrid.org:8443/cream-pbs-euasia -a Host_cream.jdl
https://as-ce01.euasiagrid.org:8443/CREAM378465856
[hkw00@ui03 cream]$ glite-ce-job-status https://as-ce01.euasiagrid.org:8443/CREAM378465856
****** JobID=[https://as-ce01.euasiagrid.org:8443/CREAM378465856]
Status = [DONE-OK]
ExitCode = [0]
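With direct CREAM submission the user picks the CE by hand, typically from a listing like the one above. The sketch below parses that listing and selects the CE with the most free CPUs. It assumes each data row consists of five integers followed by the CE identifier, as in this example output; the `CEUsage` record and both function names are made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class CEUsage:
    """One row of the listing: CE identifier plus its five counters."""
    ce_id: str
    cpus: int
    free: int
    total_jobs: int
    running: int
    waiting: int


def parse_infosites(output: str) -> List[CEUsage]:
    """Parse data rows; header, separator and '(...)' lines are skipped."""
    rows = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) != 6 or not all(f.isdigit() for f in fields[:5]):
            continue
        cpus, free, total_jobs, running, waiting = (int(f) for f in fields[:5])
        rows.append(CEUsage(fields[5], cpus, free, total_jobs, running, waiting))
    return rows


def most_free(rows: List[CEUsage]) -> Optional[CEUsage]:
    """Return the CE with the most free CPUs, or None for an empty listing."""
    return max(rows, key=lambda row: row.free) if rows else None
```

The `ce_id` of the selected row is the value that goes after `-r` in glite-ce-job-submit.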

34 Example CREAM
Ref: gLite-3.2 User Guide
[hkw00@ui03 cream]$ lcg-ls srm://as-ds01.euasiagrid.org/dpm/euasiagrid.org/home/euasia/hkw00/
/dpm/euasiagrid.org/home/euasia/hkw00//Host.log
(...)
[hkw00@ui03 cream]$ lcg-cp srm://as-ds01.euasiagrid.org/dpm/euasiagrid.org/home/euasia/hkw00/Host.log file:`pwd`/Host.log
[hkw00@ui03 cream]$ ls
checkHost.sh  Host_cream.jdl  Host.log

