I2G CrossBroker Enol Fernández UAB


I2G CrossBroker Enol Fernández UAB Dublin MPI Course, 10-11 September 2007

Introduction
CrossBroker performs automatic scheduling in Grid environments: resource discovery, resource selection, and job execution. It handles job types not treated by gLite:
Parallel jobs (MPI): run on more than one resource, in a coordinated fashion.
Interactive jobs: the user interacts with the application during its execution.

Architecture
[Diagram: CrossBroker internals — Scheduling Agent, Resource Searcher, Application Launcher, Condor-G and DAGMan — connected to the Information Index, the Migrating Desktop and the Replica Manager, and submitting to EGEE/Globus CEs and their WNs.]
(Speaker note: comments on each module on the next slide!)

Architecture
Scheduling Agent: receives each job and keeps it in a persistent queue; contacts the Resource Searcher and gets a list of available resources; selects resources and passes them to the Application Launcher.
Resource Searcher: given a job description (JobAd), performs the matchmaking between job needs and available resources. It uses the Condor ClassAd library, originally designed to match a single job with a single resource; a set-matching extension was developed to support matching a single job to a group of resources.
Application Launcher: provides a reliable submission service for parallel applications on the Grid; responsible for file staging at the remote site (executable and input/output files); uses the services of Condor-G.
(Speaker note: not much of interest here, the previous slide is probably enough, but this gives some explanation of each module.)
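The set-matching idea can be sketched as follows. This is an illustrative Python model, not the actual Condor ClassAd API: it groups Computing Elements (CEs) until their combined free CPUs cover the job's request, returning the smallest groups first (the field names and the min-based group rank are assumptions).

```python
# Hypothetical sketch of CrossBroker-style "set matching": combine CEs so
# their free CPUs cover a parallel job's request, smallest groups first.
from itertools import combinations

def set_match(ces, cpus_needed):
    """Return candidate CE groups covering cpus_needed, smallest groups first."""
    matches = []
    for size in range(1, len(ces) + 1):
        for group in combinations(ces, size):
            if sum(ce["free_cpus"] for ce in group) >= cpus_needed:
                rank = min(ce["rank"] for ce in group)  # weakest member bounds the group
                matches.append((size, -rank, [ce["name"] for ce in group]))
        if matches:  # stop at the smallest group size that suffices
            break
    return [names for _, _, names in sorted(matches)]

ces = [
    {"name": "aocegrid.uab.es", "free_cpus": 10, "rank": 2000},
    {"name": "zeus.cyf-kr.edu.pl", "free_cpus": 2, "rank": 1500},
    {"name": "bee001.ific.uv.es", "free_cpus": 3, "rank": 1500},
]
print(set_match(ces, 5))  # a single site with 10 free CPUs is enough
```

For 5 CPUs a one-CE group wins outright; only when no single site can host the whole job does the matcher fall back to multi-site groups, which mirrors the cross-site selection shown later.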

Parallel Job Support
Supported parallel job types: Open MPI, PACX-MPI, MPICH-P4, MPICH-G2, Plain (just the machines).
Takes site capabilities into account.
Ability to define starter scripts/processes to start the parallel job; mpi-start is configured automatically and used by default.

Parallel Job Support: changes in JDL
JobType: Normal (sequential jobs, just one CPU) or Parallel (more than one CPU).
SubJobType: openmpi, pacx-mpi, mpich, mpich-g2, plain.
JobStarter (if not defined, mpi-start is used).
JobStarterArguments.
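As a hypothetical fragment showing the starter override (the script name and argument are made up; the full-job example on the next slide relies on the mpi-start default instead):

```
JobType             = "Parallel";
SubJobType          = "openmpi";
NodeNumber          = 4;
JobStarter          = "my-starter.sh";   // assumed name, replaces mpi-start
JobStarterArguments = "--verbose";
```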

Parallel Job Support: JDL example of a PACX-MPI job (used later in the hands-on):

Type = "Job";
VirtualOrganisation = "imain";
JobType = "Parallel";
SubJobType = "pacx-mpi";
NodeNumber = 5;
Executable = "test-app";
Arguments = "-v";
InputSandbox = {"test-app", "inputfile"};
OutputSandbox = {"std.out", "std.err"};
StdErr = "std.err";
StdOutput = "std.out";
Rank = other.GlueHostBenchmarkSI00;
Requirements = other.GlueCEStateStatus == "Production";

MPI Across Sites
CrossBroker searches for and selects sets of resources for the jobs. There is no guarantee that all tasks of the same job will start at the same time.
1st choice: select only sites with free resources, so the job runs immediately. Unfortunately, free resources are not always available.
2nd choice: allocate a resource temporarily and wait until all other tasks show up. The resource is time-shared with a backfilling policy to avoid idleness.
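The two-step policy above can be sketched like this. This is an illustrative model, not CrossBroker code (the group representation and the deficit-based tie-break are assumptions): prefer a group where every CE has enough free CPUs so the job starts at once; otherwise fall back to time-sharing the least-loaded group and backfilling it.

```python
# Sketch (assumptions, not CrossBroker internals) of the two-choice policy:
# run immediately on fully free groups, else timeshare + backfill.
def choose_allocation(groups):
    """groups: list of CE groups; each CE is (name, free_cpus, needed_cpus)."""
    for group in groups:
        if all(free >= needed for _, free, needed in group):
            return ("run_now", group)  # 1st choice: start immediately
    # 2nd choice: reserve the group with the smallest CPU deficit and
    # backfill its idle slots while the remaining tasks show up
    best = min(groups, key=lambda g: sum(max(0, n - f) for _, f, n in g))
    return ("timeshare_backfill", best)

busy = [[("ce1", 0, 4)], [("ce2", 2, 4), ("ce3", 1, 1)]]
print(choose_allocation(busy))  # no group is fully free -> timeshare
```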

MPI Across Sites: resource-selection example for PACX-MPI
[Diagram: five CEs and the Resource Searcher's candidate groups]
CE1 = zeus.cyf-kr.edu.pl (FreeCPUs = 2, AverageSI = 2000)
CE2 = aocegrid.uab.es (FreeCPUs = 10, Disk = 100, AverageSI = 4000)
CE3 = bee001.ific.uv.es (FreeCPUs = 3, AverageSI = 1000)
CE4 = xgrid.icm.edu.pl (FreeCPUs = 6, Disk = 100, AverageSI = 1000) — not MPI-enabled
CE5 = lngrid02.lip.pt (FreeCPUs = 2, Disk = 100, AverageSI = 1000)
Resource Searcher output:
[Group with 1 CE, Rank = 2000] aocegrid.uab.es:2119/jobmanager-pbs-workq (freeCPUs = 10)
[Group with 2 CEs, Rank = 1500] zeus.cyf-kr.edu.pl:2119/jobmanager-pbs-workq (freeCPUs = 2) + bee001.ific.uv.es:2119/jobmanager-pbs-workq (freeCPUs = 3)
[Group with 2 CEs, Rank = 1000] lngrid02.lip.pt:2119/jobmanager-pbs-workq (freeCPUs = 2) + bee001.ific.uv.es:2119/jobmanager-pbs-workq (freeCPUs = 3)
(Speaker note: of the 5 sites, 1 is not MPI-enabled. The user asks for 5 CPUs, so the possibilities shown in the animation are: first, one site for the whole application; second, two sites that sum to 5 CPUs; third, another two sites that also sum to 5 CPUs. The broker will always try to execute on the lowest number of sites, to avoid the high-latency links between them.)

Time Sharing
[Diagram: an MPI job arrives at the CrossBroker's Scheduling Agent and is submitted through Condor-G to the Grid resource's LRMS.]
(Speaker note: these slides are a schema of what goes on with the glidein. It is not of much interest to users, since it happens transparently and without any notice to them. They are quite easy to follow: the MPI job request arrives at the broker, which selects a resource. After starting the Application Launcher (for PACX-MPI this includes the StartupServer), and using Condor-G, the CrossBroker submits a job to the resource. This job is in fact an agent (glidein) that divides the WN into two virtual machines controlled from the CrossBroker. One of the VMs is used for the MPI task, which waits suspended until all the other parts of the job are ready. In the meantime, the CrossBroker can do backfilling and use the other VM to execute other jobs. When all the MPI tasks are ready, the MPI job is woken up and continues its execution.)

Time Sharing (animation frames)
1. The MPI job arrives at the Scheduling Agent, which hands it to the Application Launcher and Condor-G.
2. Condor-G submits an agent (glidein) through the LRMS; the agent splits the worker node into two virtual machines, VM1 and VM2.
3. The MPI task starts in VM1 and waits, suspended, for the rest of its tasks.
4. While the MPI task waits, the CrossBroker backfills VM2 with another job.
5. All tasks ready! The MPI task in VM1 is resumed and the job runs.
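The glidein's two-VM scheme can be modelled with a toy sketch. This is an assumption-laden illustration, not the real agent: VM1 holds the suspended MPI task, VM2 backfills batch jobs until every task of the cross-site job has arrived.

```python
# Toy model (not the real glidein) of the two-virtual-machine time sharing:
# VM1 holds the suspended MPI task, VM2 runs backfilled batch jobs.
class WorkerNodeAgent:
    def __init__(self):
        self.vm1 = None   # reserved for the waiting MPI task
        self.vm2 = []     # backfilled batch jobs

    def hold_mpi_task(self, task):
        self.vm1 = {"task": task, "state": "suspended"}

    def backfill(self, batch_job):
        # Idle cycles are usable only while the MPI task is still waiting.
        if self.vm1 and self.vm1["state"] == "suspended":
            self.vm2.append(batch_job)
            return True
        return False

    def all_tasks_ready(self):
        self.vm1["state"] = "running"  # wake the MPI task
        self.vm2.clear()               # backfilled work is vacated

agent = WorkerNodeAgent()
agent.hold_mpi_task("mpi-rank-3")
agent.backfill("batch-1")
agent.all_tasks_ready()
print(agent.vm1["state"])  # -> running
```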

Interactive Job Support
Scheduling priority: interactive jobs are sent to sites with available machines; if no machines are available, time sharing is used.
Support for interactivity in all kinds of jobs: sequential and all the MPI flavours.
CrossBroker injects interactive agents that enable communication between the user and the job. This is transparent to the user, with full integration with glogin & gvid.
Dublin MPI Course, 10-11 September 2007

Interactive Job Support: changes in JDL
Interactive: true/false. Indicates that the job is interactive and the broker should treat it with higher priority.
InteractiveAgent, InteractiveAgentArguments: these attributes specify the command (and its arguments) used to communicate with the user.

Interactive Job Support: JDL example of a fusion application:

Type = "Job";
VirtualOrganisation = "imain";
JobType = "Parallel";
SubJobType = "openmpi";
NodeNumber = 11;
Interactive = TRUE;
InteractiveAgent = "glogin";
InteractiveAgentArguments = "-r -p 195.168.105.65:23433";
Executable = "test-app";
InputSandbox = {"test-app", "inputfile"};
OutputSandbox = {"std.out", "std.err"};
StdErr = "std.err";
StdOutput = "std.out";
Rank = other.GlueHostBenchmarkSI00;
Requirements = other.GlueCEStateStatus == "Production";

Time Sharing
[Diagram: the same scheme as for MPI, but with an interactive job and a batch job sharing the worker node's two VMs.]
(Speaker note: the same thing as for MPI, but with interactive and batch jobs; this is just the final part of the scenario. The batch job can be suspended, or have its priority lowered, under the broker's control.)

Time Sharing
[Diagram: the interactive job runs in one VM of the already-running agent while a batch job occupies the other.]
Startup-time reduction: only one layer is involved.

Other Features
Intelligent job retrial: temporarily disables submission to failing sites.
Fast notification of job status: better interaction with the application.
gLite interoperability: accepts jobs from gLite's UI and is able to submit jobs to gLite resources (LCG-CE and gLite CE).