Workload Management Status DIRAC User Meeting Marseille, 29-31 Oct 2012.

 Recent updates
 Coming next
 And then?

Marseille - 30/12/2012

Parametric Jobs

 A parametric job is a set of jobs submitted together that differ only in the value of a single parameter.
 The parameter values are defined by the JDL attribute "Parameters", which can take:
 A list of strings or numbers.
 An integer or float giving the number of values to generate; in this case the JDL attributes ParameterStart and ParameterStep/ParameterFactor must also be defined to build the sequence of values:
 P_0 = ParameterStart
 P_i = P_{i-1} * ParameterFactor + ParameterStep
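The recurrence above can be sketched in a few lines of Python (an illustrative sketch, not DIRAC code; the function name is made up):

```python
def parameter_sequence(n, start=0.0, step=0.0, factor=1.0):
    """Generate the n parameter values defined by the JDL attributes
    Parameters (n), ParameterStart, ParameterStep and ParameterFactor:
    P_0 = start, P_i = P_{i-1} * factor + step."""
    values = []
    p = start
    for _ in range(n):
        values.append(p)
        p = p * factor + step
    return values

# ParameterFactor = 1 gives an arithmetic sequence, e.g.
# parameter_sequence(5, start=0.0, step=0.02) -> approximately
# 0.0, 0.02, 0.04, 0.06, 0.08 (up to float rounding).
# ParameterFactor > 1 gives a geometric-style sequence.
```

With ParameterFactor = 2 and ParameterStep = 1, for instance, the values grow as 1, 3, 7, 15, …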

Parametric Jobs

 Parameter value:
 At job submission time the parameter value for each job is determined; the %s placeholder stands for it in any JDL attribute.
 Parameter number:
 The index of the job within the set; it is represented in the JDL by the %n placeholder.

Parametric Job - JDL

Executable = "testParametricJob.sh";
JobName = "Parametric_%n";
Arguments = "%s";
Parameters = 20;
ParameterStart = 0;
ParameterStep = 0.02;
ParameterFactor = 1;
StdOutput = "StdOut_%n";
StdError = "StdErr_%n";
InputSandbox = {"testJob.sh"};
OutputSandbox = {"StdOut_%n", "StdErr_%n"};

 %s is replaced by the Parameter value for each job.
 %n is replaced by the Parameter Number value for each job.
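A minimal sketch of how the %s / %n placeholders expand for one job of the set (illustrative only, not the actual DIRAC implementation; expand_job is a made-up helper):

```python
def expand_job(jdl_template, index, value):
    """Produce the JDL attributes of one job of the parametric set by
    replacing %n (parameter number) and %s (parameter value)."""
    return {
        k: v.replace("%n", str(index)).replace("%s", str(value))
        if isinstance(v, str) else v
        for k, v in jdl_template.items()
    }

template = {
    "Executable": "testParametricJob.sh",
    "JobName": "Parametric_%n",
    "Arguments": "%s",
    "StdOutput": "StdOut_%n",
}

# Job number 3 of the set, with parameter value 0.06:
job3 = expand_job(template, 3, 0.06)
# job3["JobName"] == "Parametric_3", job3["Arguments"] == "0.06"
```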

Parametric Job - JDL (ACGRID School, Hanoi, 25/10/2011)

(Screenshots: adding parameters in the portal; the resulting Parameters in the JDL.)

Optimizers

 From Agents to Executors: a change in how new jobs are processed in the system.
 Agents:
 Poll the DB, take an action, update the DB.
 This is repeated by each of the 5-6 Optimizers needed to prepare a new job for execution.
 Executors:
 Register a capability with a central server.
 The central server distributes the actions among the available executors and, at the end, updates the DB.
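The push model can be contrasted with the polling one in a toy sketch (hypothetical names, not DIRAC's actual classes): executors declare a capability to a central dispatcher, which pushes each matching task to them instead of every optimizer polling the DB.

```python
class Dispatcher:
    """Toy stand-in for the central server (OptimizationMind):
    executors register a capability, and the dispatcher pushes
    each task to the executor that declared it."""

    def __init__(self):
        self.executors = {}

    def register(self, capability, executor):
        # an executor announces what kind of task it can handle
        self.executors[capability] = executor

    def dispatch(self, task):
        # push the task to whichever executor declared this capability
        return self.executors[task["needs"]](task)


def job_path_executor(task):
    # a trivial "optimizer" step: mark the job as processed
    task["status"] = "optimized"
    return task


dispatcher = Dispatcher()
dispatcher.register("JobPath", job_path_executor)
result = dispatcher.dispatch({"needs": "JobPath", "status": "received"})
# result["status"] == "optimized"
```

The point of the design is that the DB is touched only by the central side, once per step, rather than polled by every optimizer.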

Optimizers (II)

 The preparation of jobs becomes "data driven": it is triggered directly by the user's submission.
 Advantages:
 Fewer queries to the DB backend.
 Reduced response time (no polling wait involved).
 A single point of access to the DB, so DB access can be optimized.
 Additionally, the OptimizationMind (the central server) has been prepared to support parallel processing of jobs by the Optimizers (executors).

On the CERN Integration Setup


On the Belle II Setup


Observations

 Using short parametric jobs, a "single box" installation can support sustained rates of over 8 jobs/s across the whole chain:
 Submission
 Optimizing
 Matching
 Executing
 The JobStateUpdate service and the DB become the limiting factor.

Integration of multi-clouds

 Back in 2010, the integration of Amazon EC2 with gLite Grid resources using DIRAC was demonstrated on a Belle MC campaign.
 Clouds are becoming more common and can be a convenient solution in some cases.
 There is an increasing number of cloud solutions and flavors.
 VMDIRAC defines a paradigm for a flexible integration of this type of (CPU) resources.

Coming next

 Scheduling to Storage Elements rather than to Sites.
 Pending on the reorganization of the Resources section of the Configuration (also important for multi-VO installations).
 A simplified version of Workflows:
 The current DIRAC workflow objects are too complex for 99% of the use cases.
 Needs to be integrated with Parametric Jobs.

And beyond that …

 Refurbishing of the WMS backend:
 State machine
 Handling collections of jobs
 Separating "active" and "halted" jobs
 JDLs
 Job parameters
 Improved response time

Summary

 The WMS is one of the central pieces of DIRAC functionality.
 Recent improvements have addressed performance issues.
 Next steps will focus on improving usability.
 And then back to performance.