Grid Technology – CERN IT Department, CH-1211 Geneva 23, Switzerland – www.cern.ch/it – DBCF GT
Standard Interfaces to Grid Storage: DPM and LFC Update
Ricardo Rocha, Alejandro Alvarez (on behalf of the LCGDM team)
EMI INFSO-RI

Grid Technology – Main Goals
– Provide a lightweight, grid-aware storage solution
– Simplify life of users and administrators
– Improve the feature set and performance
– Use standard protocols
– Use standard building blocks
– Allow easy integration with new tools and systems

Grid Technology – Architecture (Reminder)
Diagram: the client talks to the head node and to the disk nodes; the protocol daemons shown are NS, DPM, GridFTP, NFS, HTTP/DAV and XROOT on the head node, and NFS, HTTP/DAV, XROOT, GridFTP and RFIO on the disk nodes.
– Separation between metadata and data access: direct data access to/from the disk nodes
– Strong authentication / authorization
– Multiple access protocols
A client-side sketch of the redirection flow follows.
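Because metadata and data paths are separate, a plain HTTP client is simply redirected by the head node to the disk node holding a replica. A minimal sketch of that flow in Python, assuming hypothetical host names, file path and proxy certificate locations (this is not part of DPM itself):

    import requests

    cert = ("/tmp/x509up_u1000", "/tmp/x509up_u1000")   # grid proxy used as client certificate/key
    capath = "/etc/grid-security/certificates"          # typical CA directory on a grid UI
    url = "https://dpmhead.example.org/dpm/example.org/home/myvo/file.root"

    # Ask the head node (metadata step) without following redirects...
    r = requests.get(url, allow_redirects=False, cert=cert, verify=capath)
    print(r.status_code, r.headers.get("Location"))     # e.g. 302 plus a disk-node URL

    # ...then fetch the data directly from the disk node it points to.
    if r.is_redirect:
        data = requests.get(r.headers["Location"], cert=cert, verify=capath).content
        print(len(data), "bytes read directly from a disk node")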

Grid Technology – Deployment & Usage
DPM is the most widely deployed grid storage system:
– Over 200 sites in 50 regions
– Over 300 VOs
– ~36 PB (10 sites with > 1 PB)
LFC enjoys wide deployment too:
– 58 instances at 48 sites
– Over 300 VOs

Grid Technology – Software Availability
DPM and LFC are available via gLite and EMI:
– But gLite releases have stopped
We are now Fedora compliant for all components, available in multiple repositories:
– EMI1, EMI2 and UMD
– Fedora / EPEL
– The latest production release is available there
Some packages will never make it into Fedora:
– YAIM (Puppet? See later…)
– Oracle backend

Grid Technology System Evaluation

Grid Technology – System Evaluation
~1.5 years ago we performed a full system evaluation:
– Using PerfSuite, our testing framework
– (the results presented later were obtained with this framework too)
It showed the system had serious bottlenecks:
– Performance
– Code maintenance (and complexity)
– Extensibility

Grid Technology – Dependency on NS/DPM daemons
All calls to the system had to go via the daemons:
– Not only user / client calls
– Also the case for our new frontends (HTTP/DAV, NFS, XROOT, …)
– The daemons were a bottleneck, and did not scale well
Short term fix (available since 1.8.2):
– Improve TCP listening queue settings to prevent timeouts
– Increase the number of threads in the daemon pools (previously statically defined to a rather low value)
Medium term (available since 1.8.4, with DMLite):
– Refactor the daemon code into a library
A generic sketch of the short term tuning follows.
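For illustration only, a generic (non-DPM) sketch of the two knobs involved: a larger listen() backlog so that bursts of connections are queued instead of timing out, and a threading server rather than a small, statically sized worker pool. On Linux the effective backlog is additionally capped by the net.core.somaxconn sysctl.

    import socketserver

    class EchoHandler(socketserver.BaseRequestHandler):
        def handle(self):
            data = self.request.recv(4096)   # read one small request
            if data:
                self.request.sendall(data)   # and echo it back

    class TunedServer(socketserver.ThreadingTCPServer):
        request_queue_size = 512             # listen() backlog; the old value was small and fixed
        daemon_threads = True                # one worker thread per connection
        allow_reuse_address = True

    if __name__ == "__main__":
        TunedServer(("127.0.0.1", 9000), EchoHandler).serve_forever()   # arbitrary test port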

Grid Technology – GET asynchronous performance
DPM used to mandate asynchronous GET calls:
– Introduces significant client latency
– Useful when some preparation of the replica is needed
– But this wasn't really our case (disk only)
Fix (available with 1.8.3):
– Allow synchronous GET requests

Grid Technology – Database Access
No DB connection pooling, no bind variables:
– DB connections were linked to daemon pool threads
– DB connections would be kept for the whole life of the client
Quicker fix:
– Add DB connection pooling to the old daemons
– Good numbers, but needs extensive testing…
Medium term fix (already available for HTTP/DAV):
– DMLite, which includes connection pooling
– Among many other things…
A sketch of pooling with bind variables follows.
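As a rough sketch of what pooling plus bind variables look like, using the mysql-connector-python client; the credentials, table and column names below are placeholders, not the real DPM schema:

    import mysql.connector.pooling

    pool = mysql.connector.pooling.MySQLConnectionPool(
        pool_name="ns_pool",
        pool_size=8,                                   # shared by all worker threads
        host="localhost", user="dpm", password="secret", database="cns_db",
    )

    def stat(path):
        cnx = pool.get_connection()                    # borrow a pooled connection
        try:
            cur = cnx.cursor()
            # %s is a bind variable: values are passed separately from the statement text
            cur.execute("SELECT filesize, filemode FROM file_metadata WHERE name = %s", (path,))
            return cur.fetchone()
        finally:
            cnx.close()                                # returns the connection to the pool

    print(stat("/dpm/example.org/home/myvo/file.root"))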

Grid Technology – Dependency on the SRM
SRM imposes significant latency for data access:
– It has its use cases, but is a killer for regular file access
– For data access it is only required for protocols that do not support redirection (file name to replica translation)
Fix (all available from 1.8.4):
– Keep SRM for space management only (usage, reports, …)
– Add support for protocols natively supporting redirection: HTTP/DAV, NFS 4.1/pNFS, XROOT – and promote them widely…
– Investigating GridFTP redirection support (it seems possible!)

Grid Technology – Other recent activities…
ATLAS LFC consolidation effort:
– There was an instance at each T1 center
– They have all been merged into a single instance at CERN
– A similar effort was done at BNL for the US sites
HTTP based federations:
– Lots of work in this area too…
– But there will be a separate forum event on this topic
Puppet and Nagios for easy system administration:
– Already in production for DPM
– Working on new manifests for DMLite based setups
– And adding LFC manifests, to be tested at CERN

Grid Technology Future Proof with DMLite

Grid Technology – Future Proof with DMLite
DMLite is our new plugin based library.
It meets the goals resulting from the system evaluation:
– Refactoring of the existing code
– Single library used by all frontends
– Extensible, open to external contributions
– Easy integration of standard building blocks: Apache2, HDFS, S3, …

Grid Technology – DMLite: Interfaces
Plugins implement one or more interfaces:
– Depending on the functionality they provide
Plugins are stacked, and called LIFO:
– You can load multiple plugins for the same functionality
APIs in C/C++/Python, plugins in C++ (Python soon)
Interface domains: Namespace (Catalog, INode), Pool (PoolManager, PoolDriver, PoolHandler), I/O (IODriver, IOHandler), User (UserGroupDb)
A toy illustration of the stacking follows.
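To make the stacking idea concrete, a toy sketch in plain Python (not the DMLite API) of two catalog plugins, where the plugin loaded on top answers first and falls back to the one below it:

    class CatalogPlugin:
        """Hypothetical catalog interface: resolve a logical path to replicas."""
        def __init__(self, fallback=None):
            self.fallback = fallback                 # next plugin down the stack
        def get_replicas(self, path):
            raise NotImplementedError

    class MySQLCatalog(CatalogPlugin):
        def get_replicas(self, path):
            # stand-in for a real name-server database lookup
            return ["https://disk01.example.org/data" + path]

    class CacheCatalog(CatalogPlugin):
        def __init__(self, fallback):
            super().__init__(fallback)
            self._cache = {}
        def get_replicas(self, path):
            if path not in self._cache:              # miss: delegate down the stack
                self._cache[path] = self.fallback.get_replicas(path)
            return self._cache[path]

    # Plugins stacked LIFO: the cache sits on top of the database-backed catalog.
    stack = CacheCatalog(MySQLCatalog())
    print(stack.get_replicas("/dpm/example.org/home/myvo/file.root"))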

Grid Technology – DMLite Plugin: Legacy
Interacts directly with the DPNS/DPM/LFC daemons:
– Simply redirects calls using the existing NS and DPM APIs
– Provides full backward compatibility, both for namespace and pool/filesystem management

Grid Technology – DMLite Plugin: MySQL
Refactoring of the MySQL backend:
– Properly using bind variables and connection pooling
– Huge performance improvements
– Namespace traversal comes from the built-in Catalog
– A proper stack setup provides fallback to the Legacy plugin

Grid Technology – DMLite Plugin: Oracle
Refactoring of the Oracle backend; what applies to the MySQL one applies here:
– Better performance with bind variables and pooling
– Namespace traversal comes from the built-in Catalog
– A proper stack setup provides fallback to the Legacy plugin

Grid Technology – DMLite Plugin: Memcache
Memory cache for namespace requests:
– Reduced load on the database
– Much improved response times
– Horizontal scalability
Can be put on top of any other Catalog implementation (sketched below).
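A minimal sketch of the same idea against an actual memcached instance, using the python-memcached client; the key scheme and the slow_stat() backend are invented for the example:

    import json
    import memcache

    mc = memcache.Client(["127.0.0.1:11211"])        # assumed local memcached instance

    def slow_stat(path):
        # stand-in for a database-backed namespace lookup
        return {"path": path, "size": 1024, "mode": 0o644}

    def cached_stat(path, ttl=60):
        key = "dpm:stat:" + path.replace(" ", "_")   # memcached keys must not contain spaces
        hit = mc.get(key)
        if hit is not None:
            return json.loads(hit)                   # served from memory, no database access
        entry = slow_stat(path)
        mc.set(key, json.dumps(entry), time=ttl)     # keep it for one minute
        return entry

    print(cached_stat("/dpm/example.org/home/myvo/file.root"))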

Grid Technology – DMLite Plugin: Hadoop/HDFS
The first new pool type:
– An HDFS pool can coexist with legacy pools, in the same namespace, transparently to the frontends
– All the HDFS goodies come for free (automatic data replication, …)
– A Catalog interface is coming soon, exposing the HDFS namespace directly to the frontends

Grid Technology – DMLite Plugin: S3
The second new pool type:
– Again, it can coexist with legacy pools, HDFS, … in the same namespace, transparently to the frontends
– The main goal is to provide additional, temporary storage: high load periods, user analysis before big conferences, …
– Evaluated against Amazon, now looking at Huawei and OpenStack
A pre-signed URL sketch follows.
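As an illustration of the standard building block being reused here, an S3 endpoint can hand out short-lived pre-signed URLs which clients then fetch directly; a boto3 sketch with a made-up endpoint, bucket and credentials (not DMLite code):

    import boto3

    s3 = boto3.client(
        "s3",
        endpoint_url="https://s3.example.org",       # Amazon, Huawei or OpenStack S3-compatible API
        aws_access_key_id="AKIAEXAMPLE",
        aws_secret_access_key="secretexample",
    )

    url = s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": "dpm-spillover", "Key": "myvo/user/file.root"},
        ExpiresIn=3600,                              # valid for one hour
    )
    print(url)                                       # a frontend could redirect the client here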

Grid Technology – DMLite Plugin: VFS
The third new pool type (currently in development):
– Exposes any mountable filesystem, either as an additional pool in an existing namespace or by exposing that namespace directly
– Think Lustre, GPFS, …

Grid Technology – DMLite Plugins: Even more…
Librarian:
– Replica failover and retrial
– Used by the HTTP/DAV frontend for a Global Access Service
Profiler:
– Boosted logging capabilities
– For every single call, logs response times per plugin
HTTP based federations
ATLAS Distributed Data Management (DDM):
– The first external plugin, currently under development
– Will expose the central catalogs via standard protocols
Writing plugins is very easy… (a toy profiler wrapper is sketched below)
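To give a flavour of how little code such a wrapper needs, a toy profiler (not the actual DMLite Profiler plugin) that times every call of a hypothetical catalog class:

    import functools
    import logging
    import time

    logging.basicConfig(level=logging.INFO)
    log = logging.getLogger("profiler")

    def profiled(cls):
        """Wrap every public method of a plugin class with a timer."""
        for name, method in list(vars(cls).items()):
            if callable(method) and not name.startswith("_"):
                def make_wrapper(n, m):
                    @functools.wraps(m)
                    def wrapper(self, *args, **kwargs):
                        start = time.perf_counter()
                        try:
                            return m(self, *args, **kwargs)
                        finally:
                            log.info("%s.%s took %.3f ms", cls.__name__, n,
                                     1000 * (time.perf_counter() - start))
                    return wrapper
                setattr(cls, name, make_wrapper(name, method))
        return cls

    @profiled
    class ToyCatalog:
        def stat(self, path):
            time.sleep(0.01)                         # pretend to hit the database
            return {"path": path, "size": 1024}

    ToyCatalog().stat("/dpm/example.org/home/myvo/file.root")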

Grid Technology – DMLite Plugins: Development

Grid Technology DMLite: Demo Time

Grid Technology Standard Frontends

Grid Technology – Standard Frontends
Standards based access to DPM is already available:
– HTTP/DAV and NFS 4.1 / pNFS
– But also XROOT (useful in the HEP context)
Lots of recent work to make them performant:
– Many details were already presented
– But we needed numbers to show they are a viable alternative
We now have those numbers!

Grid Technology – Frontends: HTTP / DAV
Frontend based on Apache2 + mod_dav, already in production:
– Working with PES on deployment in front of the CERN LFC too
Can be used both for get/put style access (= GridFTP) and for direct access:
– Some extras for full GridFTP equivalence: multiple streams with Range/Content-Range; third party copies using WebDAV COPY + GridSite delegation
– Random I/O: possible to do vector reads and other optimizations
– Metalink support (failover, retrial)
It is already DMLite based.
A client-side sketch follows.
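From the client side this all works with plain HTTP tooling. A sketch with the Python requests library, using hypothetical URLs and proxy paths; credential delegation for the third-party copy is left out:

    import requests

    cert = ("/tmp/x509up_u1000", "/tmp/x509up_u1000")    # grid proxy as client certificate/key
    capath = "/etc/grid-security/certificates"
    src = "https://dpmhead.example.org/dpm/example.org/home/myvo/file.root"

    # Partial (random) read: fetch only the first megabyte via a Range request.
    r = requests.get(src, headers={"Range": "bytes=0-1048575"}, cert=cert, verify=capath)
    print(r.status_code, len(r.content))                 # expect 206 Partial Content

    # Third-party copy: WebDAV COPY asks the source server to push to a destination.
    dst = "https://otherse.example.org/dpm/example.org/home/myvo/copy.root"
    c = requests.request("COPY", src, headers={"Destination": dst}, cert=cert, verify=capath)
    print(c.status_code)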

Grid Technology – Frontends: NFS 4.1 / pNFS
Direct access to the data with a standard NFS client.
Available with DPM (read only):
– Write support enabled early next year
– Not yet based on DMLite
Implemented as a plugin to the Ganesha server.
Only Kerberos authentication for now:
– Issue with client X509 support in Linux (the server side is ready though)
– We're investigating how to add this
A DMLite based version is in development.
A trivial client-side example follows.
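Once the namespace is mounted, no grid-specific client is needed at all; a trivial example assuming a hypothetical mount point under /mnt/dpm:

    import os

    path = "/mnt/dpm/example.org/home/myvo/file.root"    # file visible through the NFS mount
    with open(path, "rb") as f:
        f.seek(1024)                                     # random access, just like a local file
        chunk = f.read(4096)
    print(os.path.getsize(path), len(chunk))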

Grid Technology – Frontends: XROOTD
Not really a standard, but widely used in HEP.
Initial implementation in 2006:
– No multi-VO support, limited authorization, performance issues
New version 3.1 (a rewrite) is now available:
– Multi VO support
– Federation aware (already in use in the ATLAS FAX federation)
– Strong auth/authz with X509, but the ALICE token is still available
Based on the standard XROOTD server:
– Plugins for XrdOss, XrdCmsClient and XrdAccAuthorize
Soon also based on DMLite (version 3.2).
A client-side example follows.
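Any standard xrootd client works against this frontend; for example, PyROOT can open a file by its root:// URL (host and path below are hypothetical):

    import ROOT

    url = "root://dpmhead.example.org//dpm/example.org/home/myvo/file.root"
    f = ROOT.TFile.Open(url)          # the frontend redirects to a disk node holding a replica
    if f and not f.IsZombie():
        f.ls()                        # list the objects stored in the file
        f.Close()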

Grid Technology – Frontends: Random I/O Performance
HTTP/DAV vs XROOTD vs RFIO (soon adding NFS 4.1 / pNFS to the comparison).
LAN test: table of Protocol, N. Reads (500 and 1000), Read Size and Read Time for HTTP, XROOT and RFIO.

Grid Technology – Frontends: Random I/O Performance
HTTP/DAV vs XROOTD vs RFIO (soon adding NFS 4.1 / pNFS to the comparison).
LAN test, 5000 reads: table of Protocol, Max. Vector (8, 16, 24, 32, 64), Read Size and Read Time for HTTP and XROOT.

Grid Technology – Frontends: Random I/O Performance
HTTP/DAV vs XROOTD vs RFIO (soon adding NFS 4.1 / pNFS to the comparison).
WAN test: table of Protocol, N. Reads (500 and 1000), Read Size and Read Time for HTTP, XROOT and RFIO.

Grid Technology – Frontends: HammerCloud
We've also run a set of HammerCloud tests:
– Using ATLAS analysis jobs
– These are only a few of all the metrics we have
Configurations compared: Remote HTTP, Remote HTTP (TTreeCache), Staging HTTP, Remote XROOT, Remote XROOT (TTreeCache), Staging XROOT, Staging GridFTP. Metrics: Events, Athena(s), Event Rate(s), Job Efficiency.

Grid Technology – Performance: Showdown
Big thanks to ShuTing and ASGC for doing a lot of the testing and providing the infrastructure.
First recommendation is to phase out RFIO:
– No more development effort on it from our side
HTTP vs XROOTD:
– Performance is equivalent, so it is up to sites/users to decide
– But we like standards… there's a lot to gain with them
Staging vs Direct Access:
– Staging is not ideal… it requires lots of extra space on the WN
– Direct Access is performant if used with the ROOT TTreeCache (sketched below)
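For reference, enabling the TTreeCache from PyROOT looks roughly like this; the URL and tree name are made up:

    import ROOT

    f = ROOT.TFile.Open("root://dpmhead.example.org//dpm/example.org/home/myvo/data.root")
    tree = f.Get("events")                   # hypothetical TTree name
    tree.SetCacheSize(30 * 1024 * 1024)      # 30 MB TTreeCache
    tree.AddBranchToCache("*", True)         # cache all branches
    for i in range(tree.GetEntries()):
        tree.GetEntry(i)                     # reads are grouped into large, sorted requests
    print(f.GetBytesRead(), "bytes read over the network")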

Grid Technology – Summary & Outlook
DPM and LFC are in very good shape:
– Even more lightweight, much easier code maintenance
– Open, extensible to new technologies and contributions
DMLite is our new core library.
Standards, standards, …:
– Protocols and building blocks
– Deployment and monitoring
– Reduced maintenance, free clients, community help
DPM Community Workshops:
– Paris, December 2012
– Taiwan, 17 March 2013 (ISGC workshop)
A DPM Collaboration is being set up.