EGEE-II INFSO-RI-031688 Enabling Grids for E-sciencE www.eu-egee.org Workload management in gLite 3.x - MPI P. Nenkova, IPP-BAS, Sofia, Bulgaria Some of.

Slides:



Advertisements
Similar presentations
EGEE is a project funded by the European Union under contract IST EGEE Tutorial Turin, January Hands on Job Services.
Advertisements

INFSO-RI Enabling Grids for E-sciencE Workload Management System and Job Description Language.
The Grid Constantinos Kourouyiannis Ξ Architecture Group.
Development of test suites for the certification of EGEE-II Grid middleware Task 2: The development of testing procedures focused on special details of.
E-infrastructure shared between Europe and Latin America 12th EELA Tutorial for Users and System Administrators Architecture of the gLite.
INFSO-RI Enabling Grids for E-sciencE EGEE Middleware The Resource Broker EGEE project members.
1 Architecture of the gLite WMS Esther Montes Prado CIEMAT 10th EELA Tutorial Madrid,
IST E-infrastructure shared between Europe and Latin America Architecture of the gLite WMS Alexandre Duarte CERN Fifth EELA.
E-infrastructure shared between Europe and Latin America Architecture of the WMS Manuel Rubio del Solar CETA-CIEMAT EELA Tutorial, Mérida,
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Job Submission Fokke Dijkstra RuG/SARA Grid.
Special Jobs Claudio Cherubino INFN - Catania. 2 MPI jobs on gLite DAG Job Collection Parametric jobs Outline.
EGEE-II INFSO-RI Enabling Grids for E-sciencE International Summer School on Grid Computing 2006 gLite Information System and Workload.
EGEE-II INFSO-RI Enabling Grids for E-sciencE Architecture of the WMS Yaodong Cheng CC-IHEP, Chinese Academy of Sciences.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Supporting MPI applications on the EGEE Grid.
FP6−2004−Infrastructures−6-SSA E-infrastructure shared between Europe and Latin America Special Jobs Matias Zabaljauregui UNLP.
The EPIKH Project (Exchange Programme to advance e-Infrastructure Know-How) WMPROXY API Python & C++ Diego Scardaci
INFSO-RI Enabling Grids for E-sciencE The Workload Management System: an overview Giuseppe La Rocca INFN – Catania ICTP/INFM-Democritos.
Enabling Grids for E-sciencE EGEE-II INFSO-RI BG induction to GRID Computing and EGEE project – Sofia, 2006 Practical: Porting applications.
The gLite API – PART I Giuseppe LA ROCCA INFN Catania ACGRID-II School 2-14 November 2009 Kuala Lumpur - Malaysia.
INFSO-RI Enabling Grids for E-sciencE GILDA Praticals GILDA Tutors INFN Catania ICTP/INFM-Democritos Workshop on Porting Scientific.
:: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :: GridKA School 2009 MPI on Grids 1 MPI On Grids September 3 rd, GridKA School 2009.
Enabling Grids for E-sciencE Workload Management System on gLite middleware Matthieu Reichstadt CNRS/IN2P3 ACGRID School, Hanoi (Vietnam)
Nadia LAJILI User Interface User Interface 4 Février 2002.
INFSO-RI Enabling Grids for E-sciencE Workload Management System Mike Mineter
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks gLite job submission Fokke Dijkstra Donald.
Enabling Grids for E-sciencE EGEE-II INFSO-RI Practical: Porting applications to the GILDA grid Slides from Vladimir Dimitrov, IPP-BAS.
Enabling Grids for E-sciencE EGEE-II INFSO-RI Introduction to Grid Computing, EGEE and Bulgarian Grid Initiatives Plovdiv, 2006.
EGEE-II INFSO-RI Enabling Grids for E-sciencE Using gLite API Vladimir Dimitrov IPP-BAS “gLite middleware Application Developers.
INFSO-RI Enabling Grids for E-sciencE The gLite Workload Management System Elisabetta Molinari (INFN-Milan) on behalf of the JRA1.
EGEE-III INFSO-RI Enabling Grids for E-sciencE Feb. 06, Introduction to High Performance and Grid Computing Faculty of Sciences,
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Job Submission Fokke Dijkstra RuG/SARA Grid.
Jan 31, 2006 SEE-GRID Nis Training Session Hands-on V: Standard Grid Usage Dušan Vudragović SCL and ATLAS group Institute of Physics, Belgrade.
Job Management DIRAC Project. Overview  DIRAC JDL  DIRAC Commands  Tutorial Exercises  What do you have learned? KEK 10/2012DIRAC Tutorial.
INFSO-RI Enabling Grids for E-sciencE Workflow Management in Giuseppe La Rocca INFN – Catania ICTP/INFM-Democritos Workshop on Porting.
FP7-INFRA Enabling Grids for E-sciencE EGEE Induction Grid training for users, Institute of Physics Belgrade, Serbia Sep. 19, 2008.
INFSO-RI Enabling Grids for E-sciencE Claudio Cherubino, INFN Catania Grid Tutorial for users Merida, April 2006 Special jobs.
Enabling Grids for E-sciencE Workload Management System on gLite middleware - commands Matthieu Reichstadt CNRS/IN2P3 ACGRID School, Hanoi.
EGEE-0 / LCG-2 middleware Practical.
INFSO-RI Enabling Grids for E-sciencE Job Workflows with gLite Emidio Giorgio INFN NA4 Generic Applications Meeting 10 January 2006.
INFSO-RI Enabling Grids for E-sciencE GILDA and GENIUS Guy Warner NeSC Training Team An induction to EGEE for GOSC and the NGS NeSC,
FP6−2004−Infrastructures−6-SSA E-infrastructure shared between Europe and Latin America Alexandre Duarte CERN IT-GD-OPS UFCG LSD 1st EELA Grid School.
Workload Management System Jason Shih WLCG T2 Asia Workshop Dec 2, 2006: TIFR.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks WMPROXY usage Álvaro Fernández IFIC (CSIC)
EGEE-II INFSO-RI Enabling Grids for E-sciencE Command Line Grid Programming Spiros Spirou Greek Application Support Team NCSR “Demokritos”
INFSO-RI Enabling Grids for E-sciencE EGEE is a project funded by the European Union under contract IST Job sandboxes.
INFSO-RI Enabling Grids for E-sciencE Job Description Language (JDL) Giuseppe La Rocca INFN First gLite tutorial on GILDA Catania,
INFSO-RI Enabling Grids for E-sciencE GILDA Praticals Giuseppe La Rocca INFN – Catania gLite Tutorial at the EGEE User Forum CERN.
Development of test suites for the certification of EGEE-II Grid middleware Task 2: The development of testing procedures focused on special details of.
Enabling Grids for E-sciencE EGEE-II INFSO-RI Porting an application to the EGEE Grid & Data management for Application Rachel Chen.
EGEE-II INFSO-RI Enabling Grids for E-sciencE Practical using WMProxy advanced job submission.
EGEE 3 rd conference - Athens – 20/04/2005 CREAM JDL vs JSDL Massimo Sgaravatto INFN - Padova.
LCG2 Tutorial Viet Tran Institute of Informatics Slovakia.
EGEE-II INFSO-RI Enabling Grids for E-sciencE Overview of gLite, the EGEE middleware Mike Mineter Training Outreach Education National.
The EPIKH Project (Exchange Programme to advance e-Infrastructure Know-How) Advanced Job Riccardo Rotondo
Introduction to Computing Element HsiKai Wang Academia Sinica Grid Computing Center, Taiwan.
Introduction to Job Description Language (JDL) Alessandro Costa INAF Catania Corso di Calcolo Parallelo Grid Computing Catania - ITALY September.
Enabling Grids for E-sciencE Work Load Management & Simple Job Submission Practical Shu-Ting Liao APROC, ASGC EGEE Tutorial.
FESR Trinacria Grid Virtual Laboratory Practical using WMProxy advanced job submission Emidio Giorgio INFN Catania.
Practical using C++ WMProxy API advanced job submission
Workload Management System on gLite middleware
The gLite Workload Management System
EGEE tutorial, Job Description Language - more control over your Job Assaf Gottlieb Tel-Aviv University EGEE is a project.
Alexandre Duarte CERN Fifth EELA Tutorial Santiago, 06/09-07/09,2006
Workload Management System
gLite Advanced Job Management
The gLite Workload Management System
gLite Job Management Christos Theodosiou
GENIUS Grid portal Hands on
Job Description Language (JDL)
Job Submission M. Jouvin (LAL-Orsay)
Presentation transcript:

EGEE-II INFSO-RI Enabling Grids for E-sciencE Workload management in gLite 3.x - MPI P. Nenkova, IPP-BAS, Sofia, Bulgaria Some of materials are used from presentations of Mike Mineter, Training Outreach and Education, University of Edinburgh, UK

Enabling Grids for E-sciencE EGEE-II INFSO-RI Workload Management System Helps the user accessing computing resources –resource brokering –management of input and output –management of complex workflows Support for MPI job even if the file system is not shared between CE and Worker Nodes (WN) – easy JDL extensions Web Service interface via WMProxy

Enabling Grids for E-sciencE EGEE-II INFSO-RI gLite components ReplicaCatalogue Logging & Book-keepingStorageElementComputingElement InformationService Job Status Author. &Authen. Job Submit Event Job Query Job Status Input “sandbox” + Broker Info Output “sandbox” Publish SE & CE info “User interface” Workload Management System DataSets info Input “sandbox”

Enabling Grids for E-sciencE EGEE-II INFSO-RI WMS functionality The Workload Management SystemThe Workload Management System (WMS) is the gLite component responsible for the management of user’s jobs : their –submission –scheduling –execution –status monitoring –output retrieval Its core component is the Workload Manager (WM) The WM handles the requests for job management coming from the WMS clients –The submission request hands over the responsibility of the job to the WM.  WM will dispatch the job to an appropriate Computing Element for execution taking into account requirements and the preferences expressed in the job description (JDL file) match-makingThe choice of the best matching resource to be used is the outcome of the so called match-making process.

Enabling Grids for E-sciencE EGEE-II INFSO-RI gLite WMProxy WMProxy (Workload Manager Proxy) –is a new service providing access to the gLite Workload Management System (WMS) functionality through a simple Web Services based interface. –has been designed to handle a large number of requests for job submission  gLite 1.5 => ~180 secs for 500 jobs  goal is to get in the short term to ~60 secs for 1000 jobs –it provides additional features such as bulk submission and the support for shared and compressed sandboxes for compound jobs. –It’s the natural replacement of the NS in the passage to the SOA approach.

Enabling Grids for E-sciencE EGEE-II INFSO-RI WMS: role of WMProxy CEs Local Workstation UI UI (user interface) has preinstalled client software Workload Manager Client on the UI communicates with the “WM Proxy” On UI run: glite-wms-…commands WMProxy acts on your behalf in using the WM – it needs a “delegated proxy” – hence “- a” option on commands WMProxy

Enabling Grids for E-sciencE EGEE-II INFSO-RI Complex Workflows Direct Acyclic Graph (DAG) is a set of jobs where the input, output, or execution of one or more jobs depends on one or more other jobs A Collection is a group of jobs with no dependencies –basically a collection of JDL’s A Parametric job is a job having one or more attributes in the JDL that vary their values according to parameters Using compound jobs it is possible to have one shot submission of a (possibly very large, up to thousands) group of jobs –Submission time reduction  Single call to WMProxy server  Single Authentication and Authorization process  Sharing of files between jobs –Availability of both a single Job Id to manage the group as a whole and an Id for each single job in the group nodeE nodeC nodeA nodeD nodeB

Enabling Grids for E-sciencE EGEE-II INFSO-RI WMProxy: users’ view glite-wms-job-submit will supersede glite-job-submit (which is superseding edg-job-submit) Its support for compound jobs will simplify application software –WMProxy manages sub-jobs –Shared Input and Output “sandboxes” MUST establish proxy delegation before this can be used!

Enabling Grids for E-sciencE EGEE-II INFSO-RI Submitting jobs to lcg-RBs Jobs run in batch mode on traditional gLite grids. Steps in running a job on a gLite grid with a lcg-RB: 1.Create a text file in “Job Description Language” 2.Optional check: list the compute elements that match your requirements (“edg-job-list-match myfile.jdl” command) 3.Submit the job ~ “edg-job-submit myfile.jdl” Non-blocking - Each job is given an id. 4.Occasionally check the status of your job (“edg-job- status” command) 5.When “Done” retrieve output (“edg-job-get-output” command) 6.Or just cancel the job (“edg-job-cancel” command)

Enabling Grids for E-sciencE EGEE-II INFSO-RI Submitting jobs to a gLite WMS Jobs run in batch mode on traditional gLite grids. Steps in running a job on a gLite grid with WMS: 1.Create a text file in “Job Description Language” 2.Optional check: list the compute elements that match your requirements (“glite-wms-job-list-match myfile.jdl” command) 3.Submit the job ~ “glite-wms-job-submit myfile.jdl” Non-blocking - Each job is given an id. 4.Occasionally check the status of your job (“glite-wms- job-status” command) 5.When “Done” retrieve output (“glite-wms-job-output” command) 6.Or just cancel the job (“glite-wms-job-cancel” command)

Enabling Grids for E-sciencE EGEE-II INFSO-RI Job management example

Enabling Grids for E-sciencE EGEE-II INFSO-RI WMproxy: submitting a collection of jobs Place all JLDs to be submitted in a directory ( for example./Collect) voms-proxy-init --voms gilda glite-wms-job-delegate-proxy –d DelegString glite-job-submit –d DelegString –o myJIDs --collection./Collect glite-wms-job-status -i myJIDs glite-wms-job-output –i myJIDs

Enabling Grids for E-sciencE EGEE-II INFSO-RI WMS - user interaction: JDL file Write your Job Description File (JDL file) (classAds ) Type = “Job”; JobType = “Normal”; Executable = “/bin/bash”; Arguments = “mySimulationShellScritp.sh”; StdInput = “stdin”; StdOutput = “stdout”; StdError = “stderr”; InputSandbox = {“mySimulationShellScritp.sh”,“stdin”,“data-card- 1.file”,”data-card-2.file”}; OutputSandbox = {“stderr”, “stdout”,“outputfile1.data”,”histos.zebra”}; Environment = {“JOB_LOG_FILE=/tmp/myJob.log”}; Requirements = Member(“EGEE-preprod ”,other.GlueHostApplicationSoftwareRunTimeEnvironment);

Enabling Grids for E-sciencE EGEE-II INFSO-RI Workload Manager Service The JDL allows the description of the following request types supported by the WMS: –Job: a simple application –DAG: a direct acyclic graph of dependent jobs  With WMSProxy –Collection: a set of independent jobs  With WMSProxy Type = "Collection" E.g. Type = "Collection"

Enabling Grids for E-sciencE EGEE-II INFSO-RI Some JDL-file attributes  Executable – sets the name of the executable file;  Arguments – command line arguments of the program;  StdOutput, StdError - files for storing the standard output and error messages output;  InputSandbox – set of input files needed by the program, including the executable;  OutputSandbox – set of output files which will be written during the execution, including standard output and standard error output; these are sent from the CE to the WMS for you to retrieve  ShallowRetryCount – in case of grid error, retry job this many times (“Shallow”: before job is running)

Enabling Grids for E-sciencE EGEE-II INFSO-RI JDL: Relevant Attributes JobTypeJobType (optional) oNormal (simple, sequential job), Interactive, MPICH, Checkpointable, Partitionable, Parametric oOr combination of them Checkpointable, Interactive Checkpointable, MPI JobType = “Normal”; E.g. JobType = “Normal”;

Enabling Grids for E-sciencE EGEE-II INFSO-RI JDL: Relevant Attributes (cont.) ExecutableExecutable (mandatory) oThis is a string representing the executable/command name. oThe user can specify an executable which is already on the remote CE Executable = {“/opt/EGEODE/GCT/egeode.sh”}; o Executable = {“/opt/EGEODE/GCT/egeode.sh”}; oThe user can provide a local executable name, which will be staged from the UI to the WN. Executable = {“egeode.sh”}; o Executable = {“egeode.sh”}; o InputSandbox = {“/home/larocca/egeode/egeode.sh”};

Enabling Grids for E-sciencE EGEE-II INFSO-RI JDL: Relevant Attributes (cont.) ArgumentsArguments (optional) This is a string containing all the job command line arguments. E.g.: If your executable sum has to be started as: $ sum N1 N2 –out result.out Executable = “sum”; Executable = “sum”; Arguments = “N1 N2 –out result.out”; Arguments = “N1 N2 –out result.out”;

Enabling Grids for E-sciencE EGEE-II INFSO-RI JDL: Relevant Attributes (cont.) EnvironmentEnvironment (optional) oList of environment settings needed by the job to run properly Environment = {“JAVA_HOME=/usr/java/j2sdk1.4.2_08”}; E.g. Environment = {“JAVA_HOME=/usr/java/j2sdk1.4.2_08”}; InputSandboxInputSandbox (optional) oList of files on the UI local disk needed by the job for proper running oThe listed files will be automatically staged to the remote resource InputSandbox ={“myscript.sh”,”/tmp/cc.sh”}; E.g. InputSandbox ={“myscript.sh”,”/tmp/cc.sh”}; OutputSandbox OutputSandbox (optional) oList of files, generated by the job, which have to be retrieved from the CE OutputSandbox ={ “std.out”,”std.err”, “image.png”}; –E.g. OutputSandbox ={ “std.out”,”std.err”, “image.png”};

Enabling Grids for E-sciencE EGEE-II INFSO-RI JDL: Relevant Attributes (cont.) RequirementsRequirements (optional) oJob requirements on computing resources oSpecified using attributes of resources published in the Information Service oIf not specified, default value defined in UI configuration file is considered Requirements = other.GlueCEStateStatus == "Production“; Default. Requirements = other.GlueCEStateStatus == "Production“; Requirements=other.GlueCEUniqueID == “adc006.cern.ch:2119/jobmanager-pbs-infinite” –Requirements=Member(“ALICE ”, other.GlueHostApplicationSoftwareRunTimeEnvironment); –Requirements = Member("MPICH", other.GlueHostApplicationSoftwareRunTimeEnvironment);

Enabling Grids for E-sciencE EGEE-II INFSO-RI MPI Applications in Grid Environment The current release of the EGEE middleware (gLite 3.0.1) provides the MPICH job type to support MPI applications. Users wishing to submit MPI jobs set the JobType variable to “MPICH” and set the variable “NodeNumber” to indicate how many nodes they require: JobType = "MPICH"; NodeNumber = 8; Executable = "mpi-start-wrapper.sh"; Arguments = "mpi-test MPICH"; InputSandbox = {"mpi-start-wrapper.sh","mpi-hooks.sh","mpi- test.c"}; Requirements = Member("MPICH", other.GlueHostApplicationSoftwareRunTimeEnvironment); Although it is not required to compile the MPI locally, it is highly recommended. Many compilation options are specific to the software installed or hardware installed on a site.

Enabling Grids for E-sciencE EGEE-II INFSO-RI MPI-start –MPI-start, a script developed within the int.eu.grid project provides a layer of abstraction between the user and the various different implementations of MPI. –Detects the configuration of a site at runtime and runs the user’s executable using the requested version of MPI Hide differences of MPI implementations Hide differences between scheduler implementation Support for different schedulers Support for different MPI implementations Support for simple file distribution MPI-Start Abstraction Layer

Enabling Grids for E-sciencE EGEE-II INFSO-RI Wrapper script (mpi-start) In order to hide some of the complexity, wrapper script for submitting MPI job is used. Requires the user to define a "wrapper script" and a set of "hooks“: # Setup for mpi-start. export I2G_MPI_APP=$MY_EXECUTABLE export I2G_MPI_TYPE=$MPI_FLAVOUR export I2G_MPI_PRE_RUN_HOOK=mpi-hooks.sh export I2G_MPI_POST_RUN_HOOK=mpi-hooks.sh # Invoke mpi-start. $I2G_MPI_START –"pre-hook" can be used to compile the executable itself or download data. –"post-hook" can be used to analyze results or to save the results on the grid

Enabling Grids for E-sciencE EGEE-II INFSO-RI MPI job submission JobType = "MPICH"; NodeNumber = 8; Executable = "mpi-start-wrapper.sh"; Arguments = "mpi-test OPENMPI"; InputSandbox = {"mpi-start-wrapper.sh","mpi-hooks.sh","mpi- test.c"}; Requirements = Member("OPENMPI", other.GlueHostApplicationSoftwareRunTimeEnvironment); # Setup for mpi-start. export I2G_MPI_APP=$MY_EXECUTABLE # ($1) export I2G_MPI_TYPE=$MPI_FLAVOUR # ($2) export I2G_MPI_PRE_RUN_HOOK=mpi-hooks.sh export I2G_MPI_POST_RUN_HOOK=mpi-hooks.sh # Invoke mpi-start. $I2G_MPI_START JDL wrapper pre_run_hook () { mpicc -o ${I2G_MPI_APP} ${I2G_MPI_APP}.c } hooks

Enabling Grids for E-sciencE EGEE-II INFSO-RI More information gLite 3.0 User Guide – GLUE Schema – JDL attributes specification for WM proxy – WMProxy quickstart – wm/wmproxy_client_quickstart.shtmlhttp://egee-jra1-wm.mi.infn.it/egee-jra1- wm/wmproxy_client_quickstart.shtml WMS user guides –