BaBar UK Grid Work. Tim Adye, Rutherford Appleton Laboratory. BaBar Collaboration Meeting, SLAC, 11th December 2002.
Talk Plan: RAL Tier A; BaBar Job Submission [Janusz Martyniak]; Metadata [Alessandra Forti]; BaBar UK Grid Facilities [Marc Kelly].
BaBar Batch CPU Use at RAL
BaBar Batch Users at RAL (running at least one non-trivial job each week)
BaBar Job Submission. Janusz Martyniak, Imperial College.
Submitter Requirements: A job should be submitted only to a site that holds the data required to run it. The job should run in the standard BaBar environment (srtpath, PARENT link, etc.). A user should be able to pass private environment variables with the job. Output should be sent back and/or stored on a Storage Element (SE) and registered with the Replica Catalogue (RC).
Job Submitter Steps: (1) Data preparation (skimData). (2) User TCL file expansion (dump). (3) JDL file creation, based on the index delivered by skimData and on user options. (4) Submission to the GRID.
Data Preparation: A modified version of the skimData program has been developed by Dave Smith and Alessandra Forti. It returns the data matching given criteria and creates a file, the index. The index groups the data files into buckets; each bucket is defined by the list of sites that hold its data.
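The bucketing idea can be illustrated with a minimal Python sketch. This is not the modified skimData code, which the talk does not show; the function name `group_into_buckets` and the input mapping are hypothetical, and the site and file names are borrowed from the index example in this talk.

```python
from collections import defaultdict


def group_into_buckets(file_sites):
    """Group data files by the exact set of sites that hold them.

    file_sites maps a data file identifier to the sites holding it
    (hypothetical input; skimData's real interface is not shown).
    """
    buckets = defaultdict(list)
    for name, sites in file_sites.items():
        # A frozenset makes the site list order-independent and hashable.
        buckets[frozenset(sites)].append(name)
    # Sort sites, files and buckets so the result is deterministic.
    return sorted((sorted(sites), sorted(names)) for sites, names in buckets.items())


holdings = {
    "00001": ["BABAR-RAL", "BABAR-IC"],
    "00002": ["BABAR-IC", "BABAR-RAL"],
    "00003": ["BABAR-RAL", "BABAR-IC"],
    "00004": ["BABAR-MAN"],
}
buckets = group_into_buckets(holdings)
```

Files held by the same set of sites land in the same bucket regardless of the order in which the sites are listed.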
Index Example:
sites: BABAR-RAL,BABAR-IC 00001 00002 00003
sites: BABAR-MAN 00004
Files are named indexname.number.tcl. The example above would result in 4 JDL files and 4 jobs submitted.
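As a rough sketch of how such an index could be read, the Python below splits the text on whitespace, so it does not depend on the exact line layout of the real index file (which this transcript does not preserve). The functions `parse_index` and `data_file_names` are hypothetical, and the sketch assumes the site list after `sites:` contains no spaces.

```python
def parse_index(text):
    """Return a list of (sites, file_numbers) buckets from index text."""
    buckets = []
    tokens = text.split()
    i = 0
    while i < len(tokens):
        if tokens[i] == "sites:":
            if i + 1 >= len(tokens):
                raise ValueError("'sites:' without a site list")
            # Assumed format: comma-separated site names, no spaces.
            buckets.append((tokens[i + 1].split(","), []))
            i += 2
        else:
            if not buckets:
                raise ValueError("file number before any 'sites:' entry")
            buckets[-1][1].append(tokens[i])
            i += 1
    return buckets


def data_file_names(index_name, buckets):
    """Build the indexname.number.tcl file names, one per job."""
    return [f"{index_name}.{number}.tcl" for _, numbers in buckets for number in numbers]


example = "sites: BABAR-RAL,BABAR-IC 00001 00002 00003 sites: BABAR-MAN 00004"
parsed = parse_index(example)
names = data_file_names("indexname", parsed)
```

For the example index this gives two buckets and four data file names, matching the four JDL files and four jobs mentioned on the slide.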
User TCL File Expansion: Requires release 12.2.2 [Asoka De Silva]. Reads the user TCL file (e.g. kangaFilterMicro), sourcing all files referenced by it. The result is a single TCL file. The data file list should not be sourced.
JDL File Creation: The JDL file creation process analyses the index and the user-supplied options and creates a set of JDL files to be submitted to the GRID. The options include: the index filename; the user's top TCL file (or expanded TCL file); the executable; environment variables to be passed to the GRID.
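A minimal sketch of what generating one JDL file per data file might look like follows. The function `make_jdl` and all file names are hypothetical. The `Requirements` expression assumes that the site names from the index are published as runtime environment tags; the talk does not say which expression the submitter actually uses to restrict jobs to the listed sites.

```python
def quote(value):
    """Quote a string for use inside a JDL (ClassAd) expression."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def make_jdl(wrapper, tcl_file, sites, environment):
    """Return JDL text for one job (illustrative layout only)."""
    # Assumption: a site is selected by matching its name against the
    # runtime environment tags advertised by the computing element.
    requirements = " || ".join(
        f"Member(other.RunTimeEnvironment, {quote(site)})" for site in sites
    )
    env = ", ".join(
        quote(f"{name}={value}") for name, value in sorted(environment.items())
    )
    sandbox_in = ", ".join(quote(f) for f in (wrapper, tcl_file))
    sandbox_out = ", ".join(quote(f) for f in ("std.out", "std.err"))
    lines = [
        f"Executable = {quote(wrapper)};",
        f"Arguments = {quote(tcl_file)};",
        'StdOutput = "std.out";',
        'StdError = "std.err";',
        f"InputSandbox = {{{sandbox_in}}};",
        f"OutputSandbox = {{{sandbox_out}}};",
        f"Environment = {{{env}}};",
        f"Requirements = {requirements};",
    ]
    return "\n".join(lines) + "\n"


jdl = make_jdl(
    "wrapper.sh",
    "indexname.00001.tcl",
    ["BABAR-RAL", "BABAR-IC"],
    {"MY_VAR": "1"},
)
```

Calling `make_jdl` once per data file in the index yields the set of JDL files; because the same `environment` dictionary is passed each time, the variables are identical across all of them.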
Submission to GRID: The index file defines the data file to be used by each job. The data filename is inserted into the expanded TCL file and sourced. A wrapper around the user-defined executable is created and sent to the GRID. The wrapper defines the environment on the GRID but relies on the existing BaBar setup. Standard output and standard error are put into the output sandbox, as are any additional user-created files (e.g. n-tuples).
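The two per-job artefacts described here, a TCL file that sources the data file and a wrapper script, could be produced along the following lines. This is only a sketch: the placeholder marker `# @DATA_FILE@`, the function names, and the convention of passing the TCL file as the executable's argument are all assumptions, and the site's existing BaBar setup is deliberately not modelled.

```python
import shlex

# Hypothetical placeholder marking where the data file should be sourced.
DATA_MARKER = "# @DATA_FILE@"


def make_job_tcl(expanded_tcl, data_file):
    """Replace the placeholder line with a 'source' of the data file."""
    if DATA_MARKER not in expanded_tcl:
        raise ValueError("expanded TCL has no data file placeholder")
    return expanded_tcl.replace(DATA_MARKER, f"source {data_file}")


def make_wrapper(executable, tcl_file, environment):
    """Return a shell wrapper that exports the user's variables and
    then runs the executable on the per-job TCL file."""
    lines = ["#!/bin/sh"]
    for name, value in sorted(environment.items()):
        lines.append(f"export {name}={shlex.quote(value)}")
    lines.append(f"exec {shlex.quote(executable)} {shlex.quote(tcl_file)}")
    return "\n".join(lines) + "\n"


job_tcl = make_job_tcl("set x 1\n# @DATA_FILE@\nev begin\n", "indexname.00001.tcl")
wrapper = make_wrapper("MyAnalysisApp", "job.tcl", {"MY_VAR": "a b"})
```

Quoting the values with `shlex.quote` keeps variables containing spaces or shell metacharacters intact when the wrapper runs on the remote site.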
To Do: A manual is (almost) ready. The job submitter requires Linux 2.4 in order to run the TCL expansion. Since there is no official EDG UI under 2.4, the expansion and the actual submission are currently done in two steps, but it is planned to combine them later. In which form should the code (pure Python) be released: rpm, Python distribution tools, or CVS?
Conclusions: The first dumb version of the submitter exists. The command-line scripts automatically create a set of JDL files based on the index created by skimData. This ensures that a job will be submitted only to a site that actually holds the required data. Load balancing is done by the Resource Broker. Environment variables may be transferred to the GRID, but they are the same for all JDL files. The Python modules will be distributed as rpm or tar files and installed as third-party modules (Python Distutils).
Metadata etc. Alessandra Forti, Manchester.
RLS (Replica Location Service): RLS is the EDG/Globus replica location service. It will replace the LDAP-based replica catalogue. It is a distributed system based on MySQL (a relational database). It consists of two levels of information: the LRC (Local Replica Catalog) holds local replica, or PFN (Physical File Name), information; the RLI (Replica Location Index) contains pointers to the different LRCs for each LFN (Logical File Name). The RLS doesn't contain any metadata. Can we use this with skimData, instead of remote lookups at each potential site?
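The two-level lookup can be pictured with a toy in-memory model. This is not the RLS API, and the real service is backed by MySQL; the class names, methods, and example paths below are hypothetical and only illustrate how an index-level query could answer "which sites hold this LFN?" without contacting every site.

```python
class LocalReplicaCatalog:
    """Toy LRC: maps each LFN to the PFNs held at one site."""

    def __init__(self, site):
        self.site = site
        self._pfns = {}

    def add(self, lfn, pfn):
        self._pfns.setdefault(lfn, []).append(pfn)

    def lookup(self, lfn):
        return list(self._pfns.get(lfn, []))


class ReplicaLocationIndex:
    """Toy RLI: maps each LFN to the LRCs that know about it."""

    def __init__(self):
        self._lrcs = {}

    def register(self, lrc, lfn, pfn):
        """Record a replica in the LRC and point the index at that LRC."""
        lrc.add(lfn, pfn)
        known = self._lrcs.setdefault(lfn, [])
        if lrc not in known:
            known.append(lrc)

    def sites_holding(self, lfn):
        """Index-level query: which sites hold this LFN?"""
        return sorted(lrc.site for lrc in self._lrcs.get(lfn, []))

    def replicas(self, lfn):
        """Follow the pointers to each LRC to get the physical names."""
        return {lrc.site: lrc.lookup(lfn) for lrc in self._lrcs.get(lfn, [])}


rli = ReplicaLocationIndex()
ral = LocalReplicaCatalog("BABAR-RAL")
man = LocalReplicaCatalog("BABAR-MAN")
rli.register(ral, "lfn:file-00001", "/data/ral/file-00001")
rli.register(man, "lfn:file-00001", "/data/man/file-00001")
rli.register(man, "lfn:file-00004", "/data/man/file-00004")
```

A query such as `rli.sites_holding("lfn:file-00004")` is the kind of lookup that could, in principle, replace the remote lookups at each potential site raised in the question above.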
Other Work: Alessandra is maintaining the LDAP-based Replica Catalogue (RC). It is used from INFN, IN2P3 and SLAC for testing. It works in EDG 1.2 and will be upgraded today or tomorrow for EDG 1.3. Can now submit jobs using Globus, gsiklog, and AFS: works since yesterday! Also plan to install R-GMA in Manchester for testing outside the testbed.
UK Grid Facilities. Marc Kelly, Bristol.
Central Facilities: The "Testbed" resource broker at Imperial College is being upgraded to 1.4. The UK Testbench at RAL is also being upgraded to 1.4. The RAL Tier A front-end will follow.
UK Farms: Still have a 1.2 Resource Broker for BaBar usage. This machine claims to have the following registered: gf18.hep.man.ac.uk, gm04.hep.ph.ic.ac.uk, gw32.hep.ph.ic.ac.uk, bfa.hep.ph.ic.ac.uk, farm020.hep.phy.cam.ac.uk, ce.hep.phy.cam.ac.uk, bbr-gate01.slac.stanford.edu, bbr-gate02.slac.stanford.edu, bbr-gate03.slac.stanford.edu. What about the other BaBar UK farms?
UK Farms (cont): Many of the UK farms are upgrading or planning to upgrade, but have been following a moving target: 1.3.3, then 1.3.4, now 1.4.0. We need to decide which version to move to and stick to it.