A proposal for standardizing the working environment for a LCG/EGEE job David Bouvet - Grid Computing team - CCIN2P3 HEPIX Karlsruhe 13/05/2005.

Motivation

Problem raised some months ago by Jeff Templon:
– D0 jobs encountered problems at Lyon because sites use different environment variables to address scratch/temp disk space.

A standard is defined for:
– Environment variables: « IEEE Std 1003.1, 2004, POSIX Part 1: Base definitions, Amendment 8 »
  among which: HOME, PATH, PWD, SHELL, TMPDIR, USER
– Batch Environment Services: « IEEE Std 1003.1, 2004, POSIX Part 2: Shell and Utilities, Amendment 1 »
  PBS_ENVIRONMENT, PBS_JOBID, PBS_JOBNAME, PBS_QUEUE
  PBS_O_HOME, PBS_O_HOST, PBS_O_LOGNAME, PBS_O_PATH, PBS_O_QUEUE, PBS_O_SHELL, PBS_O_WORKDIR
  (these variables are not directly used by the jobs)

There is, however, no standard definition of environment variables for grid batch jobs.
→ Proposal for LCG/EGEE sites: a common definition of a minimal set of environment variables for grid batch jobs.

Current status across several batch systems used on the grid

Environment variables for grid batch jobs have been checked on several LCG/EGEE sites (among which all the LCG T1s).
Conditions of test: ATLAS VO, short queue.

  Batch system   CEs distribution   # CEs checked
  BQS                   3                 2
  CONDOR                4                 3
  TORQUE               72                11
  PBS                  36                13
  LSF                   5                 4
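A check of this kind can be reproduced with a short probe job. The sketch below is illustrative and not part of the talk; the variable list is taken from the survey slides, and the script simply reports whether each variable is defined on the worker node it runs on.

```shell
#!/bin/sh
# Illustrative sketch (not from the talk): a trivial probe job that reports
# whether each surveyed environment variable is defined on the worker node,
# so outputs can be compared across sites and batch systems.
probe_env() {
    for var in HOME PATH PWD SHELL TMPDIR USER \
               GLOBUS_LOCATION GLOBUS_PATH X509_USER_PROXY \
               EDG_TMP LCG_TMP EDG_WL_JOBID; do
        # POSIX ${name-default} expansion distinguishes unset variables
        eval "value=\${$var-__unset__}"
        if [ "$value" = "__unset__" ]; then
            echo "$var: NOT DEFINED"
        else
            echo "$var: $value"
        fi
    done
}
probe_env
```

Submitting such a script through the normal job submission chain to each CE, as was done here for the ATLAS short queues, yields one report per site that can be compared directly.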

Current status: POSIX variables

(✓: defined; ✗: not defined on some sites)
[Table: POSIX basic variables HOME, PATH, PWD, SHELL, TMPDIR, USER, plus POSIX batch variables, checked per batch system: BQS, CONDOR, TORQUE, PBS, LSF]

→ not all these variables are defined on the various batch systems

Current status (cont.)

(✓: defined; ✗: not defined on some sites)
[Table: Globus variables GLOBUS_LOCATION, GLOBUS_PATH, GLOBUS_TCP_PORT_RANGE, X509_USER_PROXY, and MYPROXY_SERVER (useful for proxy renewal), checked per batch system: BQS, CONDOR, TORQUE, PBS, LSF]

→ even for Globus, not all the sites define the same set of environment variables

Current status: LCG environment variables (middleware related)
(list from the LCG Users Guide)

  Variable               Definition
  EDG_LOCATION           Base of the installed EDG software
  LCG_LOCATION           Base of the installed LCG software
  EDG_WL_JOBID           Job ID (for a running job) on a WN
  EDG_WL_LOCATION        Base of the EDG WMS software
  EDG_WL_PATH            Path for EDG WMS commands
  EDG_WL_RB_BROKERINFO   Location of the .BrokerInfo file on a WN
  LCG_GFAL_INFOSYS       Location of the BDII for lcg-utils and GFAL
  LCG_CATALOG_TYPE       Type of file catalog used (edg or lfc) for lcg-utils and GFAL
  LFC_HOST               Location of the LFC catalog (only for catalog type lfc)

Availability again varies across the batch systems (BQS, CONDOR, TORQUE, PBS, LSF).

Current status: LCG environment variables (job related)
(list from the LCG Users Guide)

  Variable              Definition
  EDG_TMP               Temp directory
  LCG_TMP               Temp directory
  VO_<VO>_DEFAULT_SE    Default SE defined for a CE on a WN
  VO_<VO>_SW_DIR        Base directory of the VO's software on a WN

→ possible uniformization to the POSIX name TMPDIR?

Current status: gLite environment variables

gLite environment variables on the WN (in config files and scripts), from the gLite installation guide:
– GLITE_LOCATION: /opt/glite
– GLITE_LOCATION_VAR: /var/glite
– GLITE_LOCATION_LOG: /var/log/glite
– GLITE_LOCATION_TMP: /tmp/glite

→ GLITE_LOCATION_TMP: yet another tmp directory to clean!

Proposal for standardization

  Variable type     Definition                                                                          Name
  POSIX             Home directory of job user on WN                                                    HOME
  POSIX             Temp directory                                                                      TMPDIR (currently LCG_TMP, EDG_TMP, GLITE_LOCATION_TMP)
  POSIX                                                                                                 PWD, SHELL, PATH
  Grid batch jobs   Job working directory on WN                                                         GRID_WORKDIR
  Grid batch jobs   Site name on which the job runs (same as siteName in Information Provider)          GRID_SITENAME
  Grid batch jobs   WN hostname on which the job runs                                                   GRID_HOSTNAME
  Grid batch jobs   CE and queue names on which the job runs (same as GlueCEUniqueID in Information Provider)   GRID_CEID
  Grid batch jobs   Job ID in the local batch system                                                    GRID_LOCAL_JOBID
  Grid batch jobs   Job ID on the grid                                                                  GRID_GLOBAL_JOBID (currently EDG_WL_JOBID)
  Grid batch jobs   DN of the user's certificate                                                        GRID_USERID

Proposal for standardization (cont.)

Use POSIX variables where they already exist:
– TMPDIR: POSIX variable which can replace LCG_TMP, EDG_TMP and GLITE_LOCATION_TMP
– HOME: MPI jobs need a home directory
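As a sketch of what this uniformization could look like during a transition period, a site's job wrapper could fall back through the existing middleware-specific temp variables to populate the POSIX TMPDIR. This is a hypothetical shim, not part of the proposal:

```shell
#!/bin/sh
# Hypothetical transition shim (an assumption, not in the proposal):
# if TMPDIR is not already set, derive it from the middleware-specific
# temp variables, defaulting to /tmp as a last resort.
: "${TMPDIR:=${LCG_TMP:-${EDG_TMP:-${GLITE_LOCATION_TMP:-/tmp}}}}"
export TMPDIR

# Jobs then address scratch space through TMPDIR only,
# regardless of which middleware originally provided it.
scratch="$TMPDIR/job_scratch.$$"
mkdir -p "$scratch"
echo "using scratch area: $scratch"
rm -rf "$scratch"
```

The same shim works unchanged on a site that already exports a POSIX-conformant TMPDIR, which is what makes it usable as a migration step.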

Proposal for standardization (cont.)

Minimal set of environment variables (not related to middleware). The naming convention must be independent of any grid middleware name, for grid job portability:
– GRID_WORKDIR: work directory specific to the job (Unix mode 700), e.g.: /scratch/atlas ccwl0092
– GRID_SITENAME: to know on which site the job runs (same as siteName in the Information System), e.g.: IN2P3-CC
– GRID_HOSTNAME: could be useful to know the WN hostname for problem tracking (and parallel jobs?), e.g.: ccwl0006.in2p3.fr
– GRID_CEID: CE and queue names on which the job runs (same as GlueCEUniqueID in the Information System), e.g.: heplnx201.pp.rl.ac.uk:2119/jobmanager-torque-short
– GRID_LOCAL_JOBID: job ID in the local batch system; useful for problem tracking (and parallel jobs?), e.g.: lcg
– GRID_GLOBAL_JOBID: job ID on the grid, same as EDG_WL_JOBID for LCG
– GRID_USERID: DN of the user's certificate (already exists on some sites), e.g.: /O=GRID-FR/C=FR/O=CNRS/OU=CC-LYON/CN=David

Proposal for standardization (cont.)

Once a set of variables and a naming convention are agreed upon, this standard should be implemented on all LCG/EGEE CEs.
Based on today's discussion, a document will be distributed to site administrators and applications.
A possible deadline for discussion and the beginning of deployment: end of June.

Proposal for standardization (discussion)

The same table as before, with an "Agreement on" column to be filled in during the discussion:

  Variable type     Definition                                                                          Name                                                       Agreement on
  POSIX             Home directory of job user on WN                                                    HOME
  POSIX             Temp directory                                                                      TMPDIR (currently LCG_TMP, EDG_TMP, GLITE_LOCATION_TMP)
  Grid batch jobs   Job working directory on WN                                                         GRID_WORKDIR
  Grid batch jobs   Site name on which the job runs (same as siteName in Information Provider)          GRID_SITENAME
  Grid batch jobs   WN hostname on which the job runs                                                   GRID_HOSTNAME
  Grid batch jobs   CE and queue names on which the job runs (same as GlueCEUniqueID in Information Provider)   GRID_CEID
  Grid batch jobs   Job ID in the local batch system                                                    GRID_LOCAL_JOBID
  Grid batch jobs   Job ID on the grid                                                                  GRID_GLOBAL_JOBID (currently EDG_WL_JOBID)
  Grid batch jobs   DN of the user's certificate                                                        GRID_USERID