Scalla/xrootd
Andrew Hanushevsky
SLAC National Accelerator Laboratory, Stanford University
19-May-09, ANL Tier3(g,w) Meeting

Outline
- File servers: NFS & xrootd
- How xrootd manages files
- Multiple file servers (i.e., clustering)
  - Considerations and pitfalls
- Getting to xrootd-hosted file data
- Native monitoring
- Conclusions

File Server Types
[Diagram: an application on a client machine reads data files on a server machine; in one case through the Linux NFS client and Linux NFS server, in the other through the xroot client and xroot server.]
Alternatively put, xrootd is nothing more than an application-level file server & client using another protocol.

Why Not Just Use NFS?
- NFS V2 & V3 are inadequate
  - Scaling problems with large batch farms
  - Unwieldy when more than one server is needed
- NFS V4?
  - Relatively new; the standard is still evolving, mostly in the area of new features
  - Multiple-server clustering & stress stability are still being vetted
  - Performance appears similar to NFS V3
- So, let's explore multiple-server support in xrootd

xrootd & Multiple File Servers I
[Diagram: the user runs "xrdcp root://R//foo /tmp". The xroot client opens /foo at redirector R; R asks data servers A and B "Who has /foo?" (1); server B answers "I do!" (2); R tells the client "Try B" (3); the client then opens /foo directly on server B (4).]
The xrootd system does all of these steps automatically, without application (user) intervention!
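For example, using the redirector named in the configuration file on the next slide, a copy looks like the line below (the file path and destination are illustrative); the command is the same no matter which data server actually holds the file:

# path and destination are illustrative; only the redirector's name is needed
xrdcp root://redirector.slac.stanford.edu//atlas/foo /tmp/foo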

Corresponding Configuration File

# General section that applies to all servers
#
all.export /atlas
if redirector.slac.stanford.edu
   all.role manager
else
   all.role server
fi
all.manager redirector.slac.stanford.edu 3121

# Cluster management specific configuration
#
cms.allow *.slac.stanford.edu

# xrootd specific configuration
#
xrootd.fslib /opt/xrootd/prod/lib/libXrdOfs.so
xrootd.port 1094
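Every node in the cluster reads this same file; the if/else/fi block selects the manager role on the redirector host and the server role everywhere else. A minimal sketch of putting it to work (the install and log paths are assumptions, not from the slides) is to start both the xrootd data server and the cmsd cluster-management daemon on each machine against the same configuration file:

# on every node, redirector and data servers alike; paths are illustrative
xrootd -l /var/log/xrootd/xrootd.log -c /opt/xrootd/etc/xrootd.cf &
cmsd   -l /var/log/xrootd/cmsd.log   -c /opt/xrootd/etc/xrootd.cf &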

File Discovery Considerations I
- The redirector does not have a catalog of files
  - It always asks each server, and
  - Caches the answers in memory for a "while"
    - So it won't ask again when asked about a past lookup
- Allows real-time configuration changes
  - Clients never see the disruption
- Does have some side effects
  - The lookup takes less than a millisecond when files exist
  - Much longer when a requested file does not exist!

xrootd & Multiple File Servers II
[Diagram: the user runs "xrdcp root://R//foo /tmp" for a file no server holds. The xroot client opens /foo at redirector R; R asks servers A and B "Who has /foo?" (1); neither responds ("Nope!") (2); the file is deemed not to exist if there is no response after 5 seconds.]

File Discovery Considerations II
- The system is optimized for the "file exists" case!
  - There is a penalty for going after missing files
- Aren't new files, by definition, missing?
  - Yes, but that involves writing data!
  - The system is optimized for reading data
  - So, creating a new file will suffer a 5-second delay
- The delay can be minimized by using the xprep command
  - Primes the redirector's file memory cache ahead of time (see the sketch below)
- Can files appear to be missing any other way?
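A hedged sketch of such a priming step, reusing the redirector from the configuration example (the file path is illustrative, and the exact option spelling should be checked against the xprep reference documentation):

# tell the redirector ahead of time that this new file is about to be written
# (illustrative path; verify xprep options against the reference manual)
xprep -w redirector.slac.stanford.edu:1094 /atlas/newfile.root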

Missing File vs. Missing Server
- In xrootd, files exist only to the extent that servers exist
  - The redirector cushions this effect for 10 minutes
    - The time is configurable, but…
  - Afterwards, the redirector cannot tell the difference
- This allows partially dead server clusters to continue
  - Jobs hunting for "missing" files will eventually die
  - But jobs cannot rely on files actually being missing
    - xrootd cannot give a definitive answer to "no server holds file x"
- This requires additional care during file creation
  - The issue will be mitigated in the next release
    - Files that persist only when successfully closed

Getting to xrootd-hosted data
- Via the root framework
  - Automatic when files are named root://....
  - Manually, use the TXNetFile() object
    - Note: an identical TFile() object will not work with xrootd!
- xrdcp
  - The native copy command
- SRM (optional add-on)
  - srmcp, gridFTP
- FUSE
  - Linux only: xrootd as a mounted file system
- POSIX preload library
  - Allows POSIX-compliant applications to use xrootd
(The root framework, xrdcp, and POSIX preload paths are sketched below.)
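The command-line sketches below illustrate three of these access paths; the hostname, file paths, and preload library location are assumptions for illustration only:

# xrdcp: the native copy command (hostname and paths illustrative)
xrdcp root://redirector.slac.stanford.edu//atlas/data/file.root /tmp/file.root

# root framework: a root:// name is routed to the xroot client automatically
root [0] TFile *f = TFile::Open("root://redirector.slac.stanford.edu//atlas/data/file.root")

# POSIX preload library: an ordinary POSIX tool reading xrootd-hosted data
# (library path is an assumption based on the install prefix in the config example)
LD_PRELOAD=/opt/xrootd/prod/lib/libXrdPosixPreload.so \
  cat root://redirector.slac.stanford.edu//atlas/data/notes.txt > /tmp/notes.txt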

The Flip Side of Things
- File management is largely transparent
  - Engineered to be turned on and pretty much forgotten
- But what if you just need to know
  - Usage statistics
  - Who's using what
  - Specific data access patterns
  - The big picture: a multi-site view

Xrootd Monitoring Approach
- Minimal impact on client requests
- Robustness against multimode failure
- Precision & specificity of collected data
- Real-time scalability
- Use UDP datagrams
  - Data servers are insulated from monitoring, but packets can get lost
- Highly encode the data stream
- Outsource stream serialization
- Use variable time buckets
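Monitoring is switched on with the xrootd.monitor directive in the same configuration file. The line below is only a sketch of the style of directive; the collector host:port and the exact option set are assumptions, so consult the xrootd configuration reference for the authoritative syntax:

# stream encoded monitoring records over UDP to a collector (host:port illustrative)
xrootd.monitor all flush 30s window 5s dest files info user monhost.slac.stanford.edu:9930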

Monitored Data Flow
- Start Session: sessionId, user, PId, client, server, timestamp
- Open File: sessionId, fileId, file path, timestamp
- Bulk I/O: sessionId, fileId, file offset, number of bytes
- Close File: sessionId, fileId, bytes read, bytes written
- Application Data: sessionId, appdata
- End Session: sessionId, duration, server restart time
- Staging: stageId, user, PId, client, file path, timestamp, size, duration, server
The real-time data (RTdata) stream is configurable.

Single Site Monitoring
[Screenshot of the single-site monitoring display.]

Multi-Site Monitoring
[Screenshot of the multi-site monitoring display.]

Basic Views
[Screenshots of the basic monitoring views: users, unique files, jobs, all files.]

Detailed Views
[Screenshot of a detailed view: the Top Performers Table.]

Per User Views
[Screenshot of a per-user view: User Information.]

What's Missing
- Integration with common tools
  - Nagios, Ganglia, MonALISA, etc.
- Better packaging
  - Simple install
- Better documentation
- Working on a proposal to address these issues

The Good Part I
- Xrootd is simple and easy to administer
  - E.g.: the BNL/STAR 400-node cluster takes about 0.5 of a grad student to run
  - No 3rd-party software required (i.e., self-contained)
    - Not true when SRM support is needed
  - Single configuration file, independent of cluster size
- Handles heavy, unpredictable loads
  - E.g., >3,000 connections & >10,000 open files
  - Ideal for batch farms where jobs can start in waves
- Resilient and forgiving
  - Configuration changes can be done in real time
  - Ad hoc addition and removal of servers or files

The Good Part II
- Ultra-low overhead
  - Xrootd memory footprint < 50MB
    - For a mostly read-only configuration on SLC4 or later
  - Opens a wide range of deployment options
- High-performance LAN/WAN I/O
  - CPU-overlapped I/O buffering and I/O pipelining
  - Well integrated into the root framework
- Makes WAN random I/O a realistic option
  - Parallel streams and optional multiple data sources
  - Torrent-style WAN data transfer

The Good Part III
- Wide range of clustering options
  - Can cluster geographically distributed clusters
  - Clusters can be overlaid
    - Can run multiple xrootd versions using production data
- SRM V2 support
  - Optional add-on using LBNL BeStMan
- Can be mounted as a file system
  - FUSE (SLC4 or later)
  - Not suitable for high-performance I/O
- Extensive monitoring facilities

The Not So Good
- Not a general all-purpose solution
  - Engineered primarily for data analysis
- Not a true full-fledged file system
  - Non-transactional file namespace operations
    - Create, remove, rename, etc.
    - Create is mitigated in the next release via ephemeral files
- SRM support is not natively integrated
  - Yes, it's a 3rd-party package
- Too much reference-like documentation
  - More tutorials would help

Conclusion
- Xrootd is a lightweight data access system
  - Suitable for resource-constrained environments
    - Human as well as hardware
  - Rugged enough to scale to large installations
    - CERN analysis & reconstruction farms
- Readily available
  - Distributed as part of the OSG VDT
  - Also part of the CERN root distribution
- Visit the web site for more information

Acknowledgements
- Software contributors
  - Alice: Derek Feichtinger
  - CERN: Fabrizio Furano, Andreas Peters
  - Fermi/GLAST: Tony Johnson (Java)
  - Root: Gerri Ganis, Bertrand Bellenet, Fons Rademakers
  - SLAC: Tofigh Azemoon, Jacek Becla, Andrew Hanushevsky, Wilko Kroeger
  - LBNL: Alex Sim, Junmin Gu, Vijaya Natarajan (BeStMan team)
- Operational collaborators
  - BNL, CERN, FZK, IN2P3, RAL, SLAC, UVIC, UTA
- Partial funding
  - US Department of Energy, Contract DE-AC02-76SF00515 with Stanford University