F. Fassi, S. Cabrera, R. Vives, S. González de la Hoz, Á. Fernández, J. Sánchez, L. March, J. Salt, A. Lamas IFIC-CSIC-UV, Valencia, Spain Third EELA conference, 4 December 2007, Catania, Italy


1 F. Fassi, S. Cabrera, R. Vives, S. González de la Hoz, Á. Fernández, J. Sánchez, L. March, J. Salt, A. Lamas IFIC-CSIC-UV, Valencia, Spain Third EELA conference, 4 December 2007, Catania, Italy

2 Outline:  ATLAS Grid computing and facilities  Distributed Analysis model in ATLAS  GANGA overview  IFIC Tier-2 infrastructure  Resources and services  Data transfers  Job priority  A demonstration of using GANGA  Conclusion

3 ATLAS Grid computing  ATLAS computing is based on a hierarchical architecture of Tiers: CERN (Tier-0), Tier-1s and Tier-2s  ATLAS computing operates uniformly on a heterogeneous environment built from three grid infrastructures  These grids have different middleware, replica catalogs and job submission tools

4 ATLAS facilities  Event Filter Farm at CERN: located near the experiment; assembles data into a stream to the Tier-0  Tier-0 at CERN: derives first-pass calibrations within 24 hours; reconstructs the data, keeping up with data taking  Tier-1s distributed worldwide (10 centers): reprocessing of the full data with improved calibrations 2 months after data taking; managed tape access (RAW, ESD); disk access (AOD, fraction of ESD)  Tier-2s distributed worldwide (40+ centers): MC simulation producing ESD and AOD; user physics analysis; disk store (AOD)  CERN Analysis Facility: primary purpose is calibration; limited access to ESD and RAW  Tier-3s distributed worldwide: physics analysis

5 Distributed Analysis model in ATLAS The Distributed Analysis model follows the ATLAS computing model:  Data for analysis will be available, distributed across all Tier-1 and Tier-2 centers  Tier-2s are open for analysis jobs  The computing model foresees 50 % of grid resources to be allocated for analysis  User jobs are sent to the data  large input datasets (100 GB up to several TB)  Results must be made available to the user (N-tuples or similar)  Data is accompanied by metadata and bookkeeping in catalogs

6 Distributed Analysis model in ATLAS  The ATLAS strategy is based on making use of all available resources  The solution must deal with the challenge of heterogeneous grid infrastructures  NorduGrid: a backend for ARC submission is integrated  OSG/Panda: a backend for Panda was recently integrated  The GANGA front-end supports all ATLAS Grid flavors

7 GANGA overview

8 The idea behind GANGA The naive approach to submitting jobs to the Grid assumes the following steps:  Prepare the "Job Description Language" (JDL) file for job configuration  Find the suitable Athena software application  Locate the datasets on the different storage elements  Handle job splitting, monitoring and book-keeping GANGA combines several components, providing a front-end client for interacting with grid infrastructures:  It is a user-friendly job definition and management tool  It allows simple switching between testing on a local batch system and large-scale data processing on distributed Grid resources (see the sketch below)
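A minimal sketch of this switching, assuming a standard interactive GANGA session; the executable and its arguments are placeholders:

    # Inside an interactive GANGA session (Python syntax)
    j = Job()                                 # job built from default building blocks
    j.application = Executable()              # generic executable application
    j.application.exe = 'echo'                # placeholder command
    j.application.args = ['hello from GANGA']
    j.backend = Local()                       # first test on the local machine
    j.submit()

    # The same job definition, redirected to the Grid by swapping the backend
    j2 = j.copy()
    j2.backend = LCG()
    j2.submit()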

9 GANGA features  GANGA is based on a simple, but flexible, job abstraction  A job is constructed from a set of building blocks, not all of which are required for each job (see the sketch below)  Support for several applications:  Generic Executable  ATLAS Athena software  ROOT  Support for several back-ends:  LCG/gLite Resource Broker  OSG/Panda  NorduGrid/ARC middleware  Batch (LSF, PBS, etc.)
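As an illustration of the building-block abstraction, a sketch based on the GangaAtlas plugin; the job-option file and dataset name are hypothetical, and attribute details may differ between GANGA versions:

    j = Job()
    j.application = Athena()                          # ATLAS Athena application block
    j.application.option_file = ['MyAnalysis_jobOptions.py']   # hypothetical job options
    j.application.prepare()                           # pack the user analysis code for shipping

    j.inputdata = DQ2Dataset()                        # input located via the DDM/DQ2 catalogs
    j.inputdata.dataset = 'example.dataset.AOD.v1'    # hypothetical dataset name

    j.splitter = AthenaSplitterJob()                  # split the work into sub-jobs
    j.splitter.numsubjobs = 20

    j.outputdata = DQ2OutputDataset()                 # register the outputs back into DQ2
    j.backend = LCG()                                 # alternatives: Panda(), NG(), LSF(), PBS()
    j.submit()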

10 IFIC Tier-2 infrastructure

11 Resources and services Equipment (see Santiago González de la Hoz's talk):  CPU  132 kSI2k  Disk  34 TB  Tape  tape robot of 140 TB Services:  2 SRM interfaces, 2 CE, 4 UI, 1 BDII, 1 RB  1 PROXY, 1 MON, 2 GridFTP, 2 QUATTOR  QUATTOR: installs and configures the resources Network:  Connectivity from the site to the network is about 1 Gbps  The facility serves the dual purpose of producing simulated data and supporting data analysis

12 Data Transfers (I) Data management is a crucial aspect of Distributed Analysis  It is managed by the DDM system, known as DQ2  Data is being distributed to Tier-1 and Tier-2 centers for analysis  through several exercises organized by the ATLAS collaboration  IFIC is participating in this effort with the aim to:  have datasets available at the IFIC site for analysis  test the functionality and performance of the data transfer mechanisms  IFIC's contribution to the data transfer activities is the following:  SC4 (Service Challenge 4; October 2006)  Functional Tests (August 2007)  M4 Cosmics Run: August 23 – September 3  M5 Cosmics Run: scheduled for October 16-23

13 Data Transfers (II)  The datasets exported to IFIC are stored in the Lustre-based Storage Element  They are made available in a distributed manner through:  registration in the local LFC catalog  archiving throughout the whole Grid via the DDM central catalog  In addition, information on the stored datasets is provided on the IFIC web page: http://ific.uv.es/atlas-t2-es/ific/main.html

14 Job Priority  Analysis jobs run in parallel with the production jobs  A mechanism is needed to steer the resource consumption of the ATLAS community  Job Priority  Objective: allow the enforcement of job priorities based on VOMS groups/roles, using the Priorities Working Group schema  Development and deployment done at IFIC: define local mappings for groups/roles and fair share (FS), as sketched below  atlas: /atlas  50 % for all ATLAS VO users  atlb: /atlas/Role=production  50 % for ATLAS production activity  atlc: /atlas/Role=software  no FS (but higher priority, only 1 job at a time)  atld: /atlas/Role=lcgadmin  no FS (but higher priority, only 1 job at a time)
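A sketch of what such a setup could look like on a gLite CE with a Maui scheduler; the group names follow the slide, but the file contents below are illustrative, not the exact IFIC configuration:

    # VOMS group/role to local group mapping (illustrative groupmapfile entries)
    "/atlas/Role=lcgadmin"   atld
    "/atlas/Role=software"   atlc
    "/atlas/Role=production" atlb
    "/atlas"                 atlas

    # Maui fair-share and priority settings (illustrative)
    GROUPCFG[atlas] FSTARGET=50                 # 50% for generic ATLAS VO users
    GROUPCFG[atlb]  FSTARGET=50                 # 50% for ATLAS production
    GROUPCFG[atlc]  PRIORITY=1000 MAXJOB=1      # software role: higher priority, 1 job at a time
    GROUPCFG[atld]  PRIORITY=1000 MAXJOB=1      # lcgadmin role: higher priority, 1 job at a time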

15 A demonstration of using GANGA

16 Introduction Objective:  Test the IFIC Tier-2 analysis facility  Produce Top N-tuples from a large ttbar dataset (6M events, ~217 GB)  Benefit: enables Top analysis studies Requirements:  Fast and easy large-scale production  grid environment  Runs everywhere  use ATLAS resources  Easy user interface  hide the grid infrastructure Our setup (see the sketch below):  GANGA version 4.4.2  Athena 12.0.6, TopView-00-12-13-03  EventView Group Area 12.0.6.8
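A sketch of how such a job could be configured in a GANGA 4.4 session for this exercise, following the same building blocks as above; the job-option file, dataset name and splitting numbers are placeholders rather than the exact ones used:

    j = Job()
    j.name = 'TopView_ttbar_ntuples'
    j.application = Athena()
    j.application.atlas_release = '12.0.6'
    j.application.option_file = ['TopView_jobOptions.py']   # placeholder for the TopView job options
    j.application.prepare()

    j.inputdata = DQ2Dataset()
    j.inputdata.dataset = 'example.ttbar.recon.AOD'         # placeholder for the 6M-event ttbar dataset
    j.splitter = AthenaSplitterJob()
    j.splitter.numsubjobs = 120                             # roughly 20 of the 2383 input files per sub-job
    j.outputdata = DQ2OutputDataset()                       # N-tuples registered back into DQ2
    j.backend = LCG()
    j.submit()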

17 Observations and issues Before sending jobs to the Grid, some operations had to be done:  Find out where the dataset is complete (this dataset has 2383 files)  Make sure that the selected site is a good one  Jobs are then sent to the selected sites (good ones holding a complete replica) General issues:  In general, jobs can fail even on good sites  resubmission is needed until they succeed (see the sketch below)  GANGA submission failures due to missing CE-SE correspondence  Often jobs fail because the site on which they are executed is not properly configured  Slow submission of sub-jobs using the LCG RB  gLite WMS bulk submission needed  At some sites jobs end up in a long queue  job priorities missing  currently no solution  kill and resubmit the jobs
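A sketch of the kind of resubmission loop used from the GANGA prompt; the job id is hypothetical, and whether individual sub-jobs can be resubmitted depends on the backend:

    # Resubmit the failed sub-jobs of a previously submitted job
    j = jobs(42)                      # hypothetical job id
    for sj in j.subjobs:
        if sj.status == 'failed':
            sj.resubmit()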

18 Performance General:  Jobs at IFIC finished within 1-2 hours  fast execution: 1 h to run over 1M events  Some jobs also ran successfully at other sites (Lyon, FZK)  Very high efficiency when running at sites where the dataset is available and there are no site configuration problems

19 Results  Some of the recombined output N-tuples were analyzed with the ROOT framework to reconstruct the top-quark mass from the hadronic decay (see the sketch below)
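A sketch of this last step using PyROOT; the tree and branch names are hypothetical and do not reflect the actual TopView N-tuple layout:

    # Chain the output N-tuples and histogram the hadronic top-quark mass
    import ROOT

    chain = ROOT.TChain('TopTree')                    # hypothetical tree name
    chain.Add('output/ntuple.*.root')                 # N-tuples retrieved from the Grid jobs

    h_mtop = ROOT.TH1F('h_mtop', 'Hadronic top mass;m_{jjb} [GeV];Events', 100, 0.0, 500.0)
    chain.Draw('mtop_hadronic >> h_mtop')             # hypothetical branch name

    c = ROOT.TCanvas('c_mtop', 'Top mass')
    h_mtop.Draw()
    c.SaveAs('top_mass.png')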

20 Conclusion  Experience in configuring and deploying the IFIC site has been shown  Lessons learned from using GANGA:  GANGA is lightweight  an easy grid job submission tool  GANGA does a great job in configuring jobs, splitting them, and scheduling input and output files  Distributed Analysis with GANGA depends strongly on the data distribution and on the quality of the site configuration  Speed of submission was a major issue with the LCG RB  need for gLite WMS deployment  bulk submission feature

