Discussion on data transfer options for Tier 3
Tier 3(g,w) meeting at ANL ASC – May 19, 2009
Marco Mambelli – University of Chicago



Resources (besides your own)
- OSG (Grid client and servers, e.g. SE)
  - Tier 3 liaison
  - Site operation
  - Software tools
- ATLAS (DQ2, compatibility tests, support)
  - WLCG-client
  - DQ2 Clients developers
  - DDM operation
- University?
  - hardware, cluster management, facilities
- Tier3 (suggest solutions, support)
  - ATLAS Tier3 Support

Current data solutions for Tier 3
- DQ2 Clients (a usage sketch follows this list)
  - wide range of platforms supported
  - easy to install/set up (almost a one-liner)
  - only as reliable as the underlying file transfer
  - slow: uses unprivileged paths (firewall, low bandwidth, low priority, …)
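For concreteness, a minimal sketch of what the end-user workflow looks like with DQ2 Clients, shelling out to dq2-get from Python. It assumes the DQ2Clients environment is already set up and a valid grid proxy exists; the dataset name is only a placeholder.

    # Minimal sketch: fetch an ATLAS dataset with the DQ2 end-user client.
    # Assumes the DQ2Clients environment is set up and a grid proxy exists;
    # the dataset name below is a placeholder.
    import subprocess

    dataset = "user09.SomeUser.example.dataset"  # placeholder dataset name

    # dq2-get downloads the dataset files into the current directory;
    # a non-zero exit code signals that some files failed to transfer.
    rc = subprocess.call(["dq2-get", dataset])
    if rc != 0:
        print("dq2-get reported failures for " + dataset)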

Possible data solutions for Tier 3
- Improving DQ2 Clients
- Build a layer on top of DQ2 Clients
- Install DQ2 Site Services or part of it

Improving DQ2 Clients
- use FTS for file transfer (a command-line sketch follows this list)
  - still a data sink (invisible to the Grid, no consistency problem)
  - allows queuing
  - can use privileged paths
  - may require FTS channel setups
  - requires an SE (server installed at the Tier 3)
  - development by the DQ2 Client developers
- Storage Element
  - requires a server
  - tested path for data transfer
  - data will be copied there
  - provided by OSG (easy install, possibly a VM?)
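To illustrate the FTS idea (the DQ2 Client development itself does not exist yet), a hedged sketch of driving FTS directly with the gLite command-line clients. The FTS service URL and the SURLs are placeholders, and exact options and output may differ between FTS versions.

    # Sketch: submit a transfer to FTS with the gLite CLI tools
    # (glite-transfer-submit / glite-transfer-status). The service URL
    # and SURLs are placeholders; adapt them to your FTS server and SEs.
    import subprocess

    FTS = "https://fts.example.org:8443/glite-data-transfer-fts/services/FileTransfer"
    SRC = "srm://source-se.example.org/pnfs/atlas/somefile.root"   # placeholder SURL
    DST = "srm://tier3-se.example.edu/atlas/dq2/somefile.root"     # placeholder SURL

    # Submit the job; the command prints the transfer request ID.
    job_id = subprocess.check_output(
        ["glite-transfer-submit", "-s", FTS, SRC, DST]).decode().strip()
    print("submitted FTS job " + job_id)

    # Check progress later (states include Active, Done, Failed, ...).
    subprocess.call(["glite-transfer-status", "-s", FTS, job_id])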

Build a layer on top of DQ2 Clients
- manage retries (a retry-layer sketch follows this list)
  - multiple attempts
  - try different ‘copy tools’
- still a data sink (invisible to the Grid, no consistency problem)
- no server required
- unprivileged path
- no FTS setup, not distinguishable from other clients
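A minimal sketch of what such a layer could look like, wrapping dq2-get in a retry loop. The alternative copy command in the list is purely a hypothetical placeholder for whatever ‘copy tool’ a site would plug in.

    # Sketch of a retry layer on top of the DQ2 end-user client: try a
    # download several times, optionally switching copy command between
    # attempts. The commented-out command is a hypothetical placeholder.
    import subprocess
    import time

    RETRIES = 3
    RETRY_DELAY = 60  # seconds between passes

    def fetch_dataset(dataset):
        copy_commands = [
            ["dq2-get", dataset],            # standard end-user client
            # ["site-copy-tool", dataset],   # hypothetical alternative tool
        ]
        for attempt in range(1, RETRIES + 1):
            for cmd in copy_commands:
                print("attempt %d: %s" % (attempt, " ".join(cmd)))
                if subprocess.call(cmd) == 0:
                    return True
            time.sleep(RETRY_DELAY)
        return False

    if __name__ == "__main__":
        ok = fetch_dataset("user09.SomeUser.example.dataset")  # placeholder
        print("transfer complete" if ok else "transfer failed after retries")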

Install DQ2 Site Services or part of it
- Full Site Service installation (Tier 2, Alden’s VM)
- Hosted Site Service or DQ2 End Point (non-US ATLAS clouds)
- Something in between
  - installation can be made simple
  - consistency problem
- ‘Data Sink’ Site Service
  - removes the consistency requirement
  - requires DQ2 development

Separate elements of a DQ2 (Site) server
- TiersOfATLAS (ToA)
- Storage Element
  - [managed] disk space
  - transfer server
- DQ2 Endpoint
  - entry in ToA
  - FTS channels
- DQ2 Site Services
  - transfer agents
  - transfer DB
- LHC File Catalog (LFC)
[Diagram: a site SE (dCache, xrootd, DRM, …) with file transfer (SRM) and LFC; a site server with transfer agent, subscription agent and transfer DB; the ToA topology map; some components run remotely, some at the site]

Storage Element
- Disk pools
  - storage system: DRM, dCache, GPFS, Xrootd
  - uniform space (file retrieval and naming)
  - file system abstraction
- External access and file transfer (a GridFTP sketch follows this list)
  - GridFTP (GSI, multiprotocol)
  - RFT, FTS (reliable)
- SRM (Storage Resource Management)
  - space management
  - transfer negotiation and queuing
  - pinning
- File catalog (RLS, RLI, LFC)
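As an example of the external-access layer, a minimal sketch of pulling one file off a storage element over GridFTP with globus-url-copy. It assumes a Globus/VDT client installation and a valid grid proxy; the host name and paths are placeholders.

    # Minimal sketch: copy one file from an SE over GridFTP.
    # Assumes globus-url-copy (Globus/VDT client) and a grid proxy;
    # the host name and paths are placeholders.
    import subprocess

    src = "gsiftp://se.example.edu/atlas/datafiles/AOD.example.root"  # placeholder source
    dst = "file:///tmp/AOD.example.root"                              # local target

    subprocess.call(["globus-url-copy", src, dst])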

END – Extra follows

ATLAS Distributed Data Management (DQ2)
[Diagram: at a site, the DQ2 dataset catalogs (which files are in a dataset, where datasets are) feed DQ2 “Queued Transfers”; DQ2 Subscription Agents drive a File Transfer Service between remote and local storage, and a Local File Catalog maps LFN -> PFN; components are marked as provided by DQ2 (ATLAS), by the Grid, or by ATLAS]

Distributed Data Management (DDM) capabilities
- Different setups/components are presented
- Incremental setups (each one requires the previous)
- Resulting configurations (adding the components):
  - c0ddm: no local managed storage, relying on an external SE
  - c1ddm: SE only (your ATLAS-visible files are somewhere else)
  - c2ddm: DQ2 endpoint and SE (Site Services and LFC outsourced) (UofC Tier3)
  - c3ddm: DQ2 Site Services + endpoint + SE (LFC outsourced)
  - c4ddm: LFC + DQ2 Site Services + endpoint + SE (US Tier2)
- For each configuration, the difficulty of adding and maintaining the additional component is evaluated
- Requirements, functionalities and responsibilities are listed

No local Storage Element
Difficulty rating: easy to install (user application), many supported platforms, simple to use, documented, strong VDT and ATLAS support.
- Copy to the local disks
  - any local path (also fragmented, user space, …)
- DQ2Clients (end-user tools)
  - dq2-get/dq2-put
  - limited automatic retry
  - no automatic replication (conditions)
  - synchronous (no queuing)
  - immediate
  - unprivileged data paths
  - chaotic
- strong support (VDT, ATLAS)
- requires only intermittent outbound connectivity

No local Storage Element – FTS
Difficulty rating: easy to install (user application), many supported platforms, simple to use, documented, strong VDT and ATLAS support.
- Copy to the local disks
  - any local path (also fragmented, user space, …)
- DQ2Clients (end-user tools – development required)
  - dq2-get/dq2-put
  - limited automatic retry
  - no automatic replication (conditions)
  - synchronous (no queuing)
  - immediate
  - privileged data paths
  - organized
- strong support (VDT, ATLAS)
- requires an OSG SE and only intermittent outbound connectivity

OSG Storage Element
Difficulty rating: could be very easy, depending on the file system and SE chosen. Some file servers are difficult to tune/maintain. You may have them already. OSG/ATLAS do not add load (compared to local use only).
- Big variety of Storage Elements (SRM/unmanaged)
  - GridFTP server
  - dCache installation with SRM and space tokens
  - (dCache, BeStMan-Xrootd, BeStMan-FS, GridFTP)
- Help from OSG (and US ATLAS)
  - software package
  - support (tickets and mailing lists)
- You have to do the installation and keep it functional
  - especially if you want a DQ2 End Point or more
- You end up with a standard file server

DQ2 End Point
Difficulty rating: excluding the SE, more bureaucratic than technical. Outside support would help.
- Your storage element is visible in DQ2
- Can use DQ2 subscriptions to transfer data
- You need the support of a Site (Tier2) with
  - DQ2 Site Services
  - LFC
- Register your SE in TiersOfATLAS
- Set up and maintain an FTS channel (support and effort from BNL)
- Functional and available SE
  - (may have to support SRM or space tokens)
- Reliable data management
  - your files are known on the Grid through DQ2
  - designate a data manager

DQ2 “sink” End Point (subscription-only DQ2)
Difficulty rating: unknown development effort; strain on the system (Tier1/2, network).
- A storage element able to receive subscriptions (in ToA)
  - one- or two-level staging
- The SE is not visible in DQ2 as a source of data
  - no LFC or registrations in the central catalog
- Can use DQ2 subscriptions to transfer data
- No data management responsibilities
- Does not exist
  - unknown development effort
- Different scale in the number of endpoints (load?)
  - unsustainable strain on Tier1/2 (SE, network)
  - complicated network topology (FTS paths)
  - huge number of subscriptions

DQ2 Site Services
Difficulty rating: could require effort; the software is tested more on LCG and there have been some problems in the past. Optimistic?
- You move your own data
  - DQ2 agents are running at your Site
  - you do not rely on another Site being up (for file transfer)
- Install and configure Site Services
- Maintain reliable Site Services
- You need a Storage Element and a DQ2 End Point
- Still need FTS (support and effort from the T1)
- You need an LFC
  - from a Tier2/Tier1 or your own

LHC File Catalog
Difficulty rating: difficult. LFC is relatively new, has had some load problems, and consistency must be maintained.
- ATLAS is migrating from LRC to LFC
  - complex transition problem
- You have your own local catalog
  - no need for a network connection to know which files are at your Site
- Generally it makes sense if you also have an SE, a DQ2 End Point and DQ2 Site Services
- Independent from another Site (Tier2)
- Still need FTS (support and effort from the T1)
- You have to install, configure and maintain the LFC (a query sketch follows this list)
  - database back-end, LFC server program
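A small sketch of what querying an LFC looks like from a client node, assuming the LFC client tools are installed and a grid proxy exists. The host name and catalog path are placeholders.

    # Sketch: list a directory in an LFC. Assumes the LFC client tools
    # (lfc-ls) and a grid proxy; host and path are placeholders.
    import os
    import subprocess

    os.environ["LFC_HOST"] = "lfc.example.edu"        # placeholder LFC server
    subprocess.call(["lfc-ls", "-l", "/grid/atlas"])  # list a catalog directory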

More classification examples
- US ATLAS (OSG) Tier2, WISC (Tier3)
  - c4ddm, c2je
- LCG Tier2
  - c3ddm (LFC hosted at the supporting Tier1), c2je (gLite)
- UofC Tier3
  - c2ddm, no CE (local PBS queue)
  - shares some disk (OSG_APP, OSG_GRID) with the Tier2
- UIUC, satellite clusters (UC_ATLAS_Tier2, IU_OSG, OU_ATLAS_SWT2)
  - c0ddm, c2je

Documentation
- DDM
- TiersOfATLAS
- LFC
- Clients (HowTo)

A Tier3 example
- Install an OSG Storage Element on top of the data storage and set up a DQ2 End Point (c3ddm)
  - big datasets are easily moved with subscriptions
  - little effort for the Tier3
  - the supporting Tier2 and Tier1 can handle most of the effort (and they should handle it easily once the procedure is streamlined)
- Install, if desired, an OSG Computing Element (c2je)
  - especially if it has more than 10 computers (cores)
  - the effort is still little
  - can receive ATLAS releases
  - can use its resources with Panda/pAthena (and can contribute its idle CPU)

CE elements interacting with Panda
- Gatekeeper
  - receives pilots (authenticate, receive files, schedule)
- Worker Nodes
  - execute pilots and jobs
  - e.g. installation jobs
- Shared file systems
  - host ATLAS releases (APP)
  - support execution
- (SE and LFC)
  - receive/provide files from/to Panda Mover
[Diagram: the CE with gatekeeper, worker nodes, APP and scratch areas; the SE with file transfer (SRM) and LFC, partly remote and partly at the site]

Job execution capabilities
- To interact with ATLAS, a workstation is recommended where you can run the ATLAS software (Athena, ROOT, …), Grid client packages (WLCG-Client, gLiteUI, …) and the DDM client (DQ2Clients)
- To execute your own (or someone else’s) pAthena jobs you need a system capable of running Panda pilots (a submission sketch follows this list)
  - c1je: Grid CE
  - c2je: Grid CE + additional Panda support (Panda site)
- In the following pages I’m referring to an OSG CE; LCG/gLite has a similar solution for providing a CE
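For reference, a hedged sketch of a pAthena submission from such a workstation. It assumes the PanDA client and an Athena release are set up and a grid proxy exists; the dataset names and job options file are placeholders.

    # Sketch: submit an analysis job with pathena. Assumes the PanDA client
    # and the Athena environment are set up; the names below are placeholders.
    import subprocess

    subprocess.call([
        "pathena",
        "--inDS",  "data09.SomePhysicsStream.merge.AOD.r123",  # placeholder input dataset
        "--outDS", "user09.SomeUser.myanalysis.v1",            # placeholder output dataset
        "MyAnalysis/MyJobOptions.py",                          # placeholder job options
    ])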

OSG Computing Element
Difficulty rating: the software is well documented with a large user base; almost easy.
- Help from OSG (and US ATLAS)
  - software package
  - support (tickets and mailing lists)
- You have to do the installation (and keep it functional)
- If you don’t offer your availability to Panda you have no obligation or responsibility
- You can receive ATLAS releases (Grid installation)
- You are part of OSG
  - can use the grid to reach your cluster and run on it
  - can be in Panda/pAthena and in ATLAS accounting

Panda support
Difficulty rating: depending on your configuration it may be difficult to support the ATLAS requirements (especially if you are behind a firewall).
- You can submit pAthena jobs to your own resources
  - until GlExec is deployed you cannot discriminate between Panda users
  - limit external use by not advertising
- Satisfy the other requirements of ATLAS jobs
  - access to external HTTP repositories
  - access to external DBs
  - run the ATLAS software
- You need an OSG Computing Element and a c4ddm (or the support of one)
- The old Panda pilot (no forwarded proxy) could also run on local queues, but this requires extra support

ATLAS space tokens – how data is organized at sites (SEs)

    token name             storage type   content                           used
    ATLASDATATAPE          T1D0           RAW data, ESD, AOD from re-proc   XX
    ATLASDATADISK          T0D1           ESD, AOD from data                XXX
    ATLASMCTAPE            T1D0           HITS from G4, AOD from ATLFAST    X
    ATLASMCDISK            T0D1           AOD from MC                       XXX
    ATLASPRODDISK          T0D1           buffer for in- and export         X
    ATLASGROUPDISK         T0D1           DPD                               XXX
    ATLASUSERDISK          T0D1           User Data                         XX *)
    ATLASLOCALGROUPDISK    T0D1           Local User