Storage Interfaces and Access: Interim report
Wahid Bhimji, University of Edinburgh
On behalf of the WG: Brian Bockelman, Philippe Charpentier, Simone Campana, Dirk Duellmann, Michel Jouvin, Oliver Keeble, Markus Schulz, and other participants (see meeting agendas/minutes).

Storage Interfaces WG: Background and Mandate
- Little of the current storage management interface (SRM) is actually used. This imposes performance overheads on experiments, maintenance work on developers, and restricts sites' technology choices.
- Building on the Storage/Data TEG, clarify the minimal functionality of a WLCG storage management interface for disk-only systems.
- Evaluate alternative interfaces as they emerge, call for tests where interesting, and recommend those shown to be interoperable, scalable and supportable.
- Help ensure that these alternatives can be supported by FTS and lcg_utils to allow interoperability.
- Meetings coincide with GDBs, with extra meetings on demand; contributions from developers, sites and experiments.
- The group is not designing a replacement interface for SRM; rather, it brings together and coordinates the activities already under way.

Functionality required
Experiments' data management plans:
- CMS: no "blockers" for non-SRM usage; Nebraska runs an SRM-free site and provides example ways of doing things that others can reuse.
- ATLAS: some issues, some of which will be resolved in the next generation of data management (Rucio). Open to trying controlled non-SRM sites using common solutions/interfaces.
- LHCb: some concerns, but requirements not very different from ATLAS.
Sites' perspective:
- CERN: the BeStMan SRM does not scale for them.
- RAL: the choice of future disk-only technology ideally would not be constrained by the available SRM options.
- Other sites see advantages…
Middleware and tool development: e.g. FTS3 and gFal2 support non-SRM interfaces (some testing by the VOs still to be done); see the sketch below.
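As a rough illustration of what "non-SRM usage" looks like from the client side, the sketch below copies a file between two storage endpoints through the gfal2 Python bindings without any SRM endpoint involved. The hostnames and paths are hypothetical, and this is only a minimal example of the client-side abstraction, not any experiment's actual tooling.

    # Minimal sketch: a copy through gfal2 talking directly to non-SRM protocols.
    # Hostnames and paths below are hypothetical.
    import gfal2

    ctx = gfal2.creat_context()

    params = ctx.transfer_parameters()
    params.overwrite = True          # replace the destination if it already exists
    params.checksum_check = True     # verify source/destination checksums after the copy

    src = "gsiftp://se01.example.org/dpm/example.org/home/atlas/file.root"
    dst = "root://se02.example.org//atlas/datadisk/file.root"

    ctx.filecopy(params, src, dst)   # raises gfal2.GError on failure
    print("copy done, ADLER32 =", ctx.checksum(dst, "ADLER32"))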

Areas still requiring development
(Needed by | Issue | Solution proposed at the Annecy pre-GDB)
- ATLAS / LHCb / Sites | Reporting of space used in space tokens; protection of space | JSON publishing, currently used in some places by ATLAS, is a temporary measure. WebDAV quotas?
- ATLAS / LHCb | Targeting uploads to a space token | Could just use the namespace, but certain SEs would need to change the way they report space to reflect that (or use e.g. http).
- ATLAS / LHCb | Deletion | gFal2 will help by providing an abstraction layer.
- LHCb (ATLAS) | SURL -> TURL | Requires a redirecting protocol, with SURL = TURL for sites that want no SRM.
- ATLAS / LHCb | Service query: checksum check etc. | Some service check is needed, as is some "srm-ls" equivalent; gFal2 will help.
- All? | Redirecting protocol on different storage | Performant gridFTP redirection is already available on dCache and coming soon on DPM (xrootd/http are also available, and FTS3 will/does support these).
See also the tables in the backup slides.
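To make the "JSON publishing" stopgap for space reporting concrete, a site-published usage report might look roughly like the sketch below. The field names, token names and numbers are invented for illustration; the format ATLAS actually consumes may differ.

    # Hypothetical space-token usage report, dumped as JSON for the SE's web server to publish.
    # Field names and values are illustrative only.
    import json

    report = {
        "storageservice": "srm.example.org",
        "spacetokens": [
            {"name": "ATLASDATADISK",    "total_bytes": 900 * 10**12, "used_bytes": 750 * 10**12},
            {"name": "ATLASSCRATCHDISK", "total_bytes": 100 * 10**12, "used_bytes": 40 * 10**12},
        ],
    }

    with open("space-usage.json", "w") as f:
        json.dump(report, f, indent=2)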

Developments and remaining actions
Developments since the start of the WG:
- FTS3 is now in production, and its xrootd and http interfaces are almost ready for VO testing.
- gFal2 is well developed and will be supported to satisfy the ongoing needs described in the previous table.
- CMS now requires xrootd at all sites; ATLAS will soon need both xrootd (for the federation) and WebDAV (for Rucio - initially for renaming, but more is envisaged).
- These interfaces are now prevalent (after the EMI upgrades etc.). Note that this currently means more complexity, with new and legacy interfaces in use side by side.
Actions:
- ATLAS will test their required functionality in gFal2 for deletion and service discovery (see the sketch below), and will use a few (more) non-SRM demonstrator sites.
- For using the namespace instead of space tokens (space usage, uploading, quotas…): iterate on a more concrete description of the needs (for the VOs (LHCb/ATLAS) but also for sites) alongside proposals/demonstrators.
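The following is a rough illustration of the kinds of client-side operations involved (deletion, a Rucio-style rename over WebDAV, and checksum/metadata queries) using the gfal2 Python bindings. The endpoints and paths are hypothetical, error handling is omitted, and this is a sketch of the approach rather than any experiment's production code.

    # Sketch: deletion, rename and "service query"-style operations through gfal2,
    # talking directly to non-SRM endpoints. Hostnames and paths are hypothetical.
    import gfal2

    ctx = gfal2.creat_context()

    # Deletion via WebDAV (what an ATLAS deletion-service plugin might do)
    ctx.unlink("davs://se.example.org:443/dpm/example.org/home/atlas/old_file.root")

    # Rucio-style rename over WebDAV
    ctx.rename("davs://se.example.org:443/dpm/example.org/home/atlas/rucio/tmp/file.root",
               "davs://se.example.org:443/dpm/example.org/home/atlas/rucio/mc12/file.root")

    # Checksum and metadata query, replacing srm-ls style service queries
    url = "root://se.example.org//atlas/datadisk/file.root"
    st = ctx.stat(url)                      # size, mtime, ...
    adler = ctx.checksum(url, "ADLER32")    # checksum computed/returned by the storage
    print(st.st_size, adler)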

Conclusions
- Activity on storage interfaces is progressing well; the transition from SRM is happening.
- The remaining issues are tractable via a combination of rules (SURL = TURL), redirecting protocols, and a move to client-side rather than server-side abstraction (gFal2).
- Doable, but not done: since this is not anyone's primary focus it will not be an overnight transition, so there is a continuing need for the group to stay engaged, monitor progress and ensure interoperability.
- There are also a number of wider issues, such as the data access interface (see the following slides) and cloud storage, which had TEG recommendations and which this group is discussing (though they were not in the original mandate).

Data access protocols: WLCG direction

Reminder: TEG Recommendation(s)
[LAN] Protocol support and evolution:
- Both remote I/O (direct reading from local storage) and streaming of the file (copy to the WN) should be supported in the short/medium term. However, the trend towards remote I/O should be encouraged by both experiments and storage solution providers, and should be accompanied by an increase in the resilience of protocols.
- LHC experiments are able to support all protocols supported by ROOT and expect to be able to continue to do so in the future. This support should be maintained, but the current direction of travel towards fewer protocols (in particular the focus on file://, xrootd and http) is encouraged.
- Specifically, both the work on current implementations of file:// access through NFS 4.1 and the testing of ROOT performance with direct access via http should be continued. A small remote-I/O example follows.
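For illustration, from the analysis code's point of view remote I/O is just a matter of the URL scheme passed to ROOT's TFile::Open. The host and paths below are hypothetical, and the relevant xrootd/http plugins must be present in the ROOT build; this is a sketch, not a prescription of how any experiment configures access.

    # Sketch: the same ROOT code can read a file locally, over xrootd, or over
    # http(s), depending only on the URL. Host and paths are hypothetical.
    import ROOT

    for url in (
        "file:///scratch/user/ntuple.root",                    # copy-to-scratch / NFS 4.1 mount
        "root://se.example.org//atlas/datadisk/ntuple.root",   # xrootd remote I/O
        "https://se.example.org:443/dpm/example.org/home/atlas/ntuple.root",  # http(s) remote I/O
    ):
        f = ROOT.TFile.Open(url)
        if f and not f.IsZombie():
            print(url, "->", [k.GetName() for k in f.GetListOfKeys()])
            f.Close()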

Where are we now
- Server side: the http and xrootd options are now more mature.
- Experiment side: still the same zoo, with even more animals: rfio, file and dcap are now joined by http, xrootd, S3… each with copy-to-scratch, direct-access and federation flavours.
- Given that xrootd is now established (e.g. on DPM), rfio is the most obvious candidate for retirement, but it is still very widely used in production.
- For the WAN, gridFTP is already the standard; here things are also diverging (as foreseen), with FTS3 supporting xrootd and http (see the sketch below).
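As a rough sketch of what "FTS3 supporting xrootd and http" means for a user, a transfer between non-gridFTP endpoints can be submitted through the FTS3 REST Python bindings roughly as below. The server name, hosts and paths are hypothetical, and the exact call signatures may differ between FTS3 releases.

    # Sketch: submitting an xrootd -> http(s) transfer to FTS3 via its REST "easy" bindings.
    # Endpoint, hosts and paths are hypothetical; API details may vary by FTS3 version.
    import fts3.rest.client.easy as fts3

    context = fts3.Context("https://fts3.example.org:8446")

    transfer = fts3.new_transfer(
        "root://source-se.example.org//atlas/datadisk/file.root",
        "davs://dest-se.example.org:443/dpm/example.org/home/atlas/file.root",
    )
    job = fts3.new_job([transfer], verify_checksum=True, overwrite=True)

    job_id = fts3.submit(context, job)
    print("submitted job", job_id)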

Escaping the Zoo?
Do we want to escape? Yes: there are support and performance issues, and the variety makes for random user failures at different sites.
Some solutions:
- Forced (gradual) retirement of rfio, supported by the WG. CMS is now requiring this; ATLAS has a phased transition (xrootd now used in production at Taiwan, Edinburgh and Oxford; see also backup slides). The initial retirement will be monitored by this WG, then by WLCG ops or the GDB.
- The concept of a "core" protocol required at all sites (even if another is used for performance)? This is currently, de facto, xrootd. However, systems should be flexible enough to allow transitions.

The End
The following are extra slides…

Table of used functions from the TEG
- Somewhat simplified, with the functions only relevant for Archive/T1 removed.
- Still probably hard to read (!), but a couple of observations: not that much is needed - e.g. space management is only querying, and not even that for CMS.

Brief functionality table - focussing on areas where there are issues
(Function | used by ATLAS / CMS / LHCb | existing alternative or issue (to SRM))
- Transfer: 3rd party (FTS) | YES / YES / YES | Using just gridFTP in EOS (ATLAS) and at Nebraska (CMS); what about on other SEs?
- Transfer: job in/out (LAN) | YES / YES / YES | ATLAS and CMS use LAN protocols directly.
- Negotiate a transport protocol | NO / NO / YES | LHCb use lcg-getturls.
- Transfer: direct download | YES / NO / NO | ATLAS use SRM via lcg-cp; alternative plugins in Rucio.
- Namespace: manipulation / deletion | YES / YES / YES | ATLAS: deletion would need a plugin for an alternative.
- Space query | YES / NO / YES? | Development required.
- Space upload | YES / NO / YES? | Minor development required.

Rfio -> xrootd on DPM
- All DPM sites use(d) rfio for local file access.
- Perceived performance issues, so it is mainly used in copy mode (for ATLAS).
- Sometimes there are issues with the client libraries (e.g. the EMI 32-bit ones, linking against libshift.so; CMS saw many issues).
- The DPM developers want to move away from rfio, and the WLCG "Storage Interfaces" group supported the decision to "retire" it. CMS requested/required this in the WLCG ops meeting.

Testing…
- xrootd has been tested in the UK and at ASGC for almost a year. Initially there were some issues, but everything has been stable on the DPM server side for a long while now.
- Some issues seen in ATLAS FAX tests were N2N/FAX specific and shouldn't affect local xrootd operations.
- HammerCloud tests so far haven't shown any problems, though it is not clear whether there are performance gains.
- Experiment configuration: production experience will sort these out more quickly. Switched for ATLAS at Oxford and Edinburgh last week (for both production and analysis), adding to ANALY_TAIWAN.

A few debatables
- Direct access or copy-to-scratch: using copy at first, but it would be good for direct access to work well. Should we push http copy at some sites?
- As further steps, xrdcp can also be used for federation access, for stage-out (LFC registration is taken care of elsewhere; what about space tokens?), and for FTS transfers (soon in FTS3, again with a possible space-token issue). A stage-out sketch follows.
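To illustrate the stage-out step being debated, the sketch below copies a job's output from the worker node straight to an xrootd endpoint with checksum validation. It uses the gfal2 bindings rather than xrdcp only to stay consistent with the earlier examples; the destination host and path are hypothetical, and space-token targeting is deliberately left out since it is exactly the open question above.

    # Sketch: staging out a job's output file from the worker node to an xrootd
    # endpoint, with checksum validation. Destination host/path are hypothetical.
    import gfal2

    ctx = gfal2.creat_context()

    params = ctx.transfer_parameters()
    params.create_parent = True       # make missing destination directories
    params.checksum_check = True      # compare source and destination checksums
    params.overwrite = False          # stage-out should never silently replace data

    src = "file:///scratch/job1234/output.root"
    dst = "root://se.example.org//atlas/scratchdisk/user/output.root"

    ctx.filecopy(params, src, dst)
    print("staged out, ADLER32 =", ctx.checksum(dst, "ADLER32"))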