XROOTD news – Status and strategic directions
ALICE-GridKa operations meeting, 03 July 2009
CERN IT Department, CH-1211 Genève 23, Switzerland – www.cern.ch/it

The historical problem: data access
Physics experiments rely on rare events and statistics
– Huge amounts of data are needed to get a significant number of events
– The typical data store can reach 5-10 PB… now
Millions of files, thousands of concurrent clients
– Each one opening many files (many in ALICE, up to 1000 in GLAST/Fermi)
– Each one keeping many files open
– The transaction rate is very high: O(10^3) file opens/sec per cluster is not uncommon (average, not peak)
– Traffic sources: local GRID site, local batch system, WAN
Need scalable, high-performance data access
– Need to squeeze every byte/s from the hardware
– No imposed limits on performance, size or connectivity
– Do we like clusters made of 5000 servers? We MUST be able to do it.
– Need a way to avoid worker-node under-utilization

File serving with Xrootd
Very open platform for file serving
– Can be used in many, many ways, even crazy ones
– An example? Xrootd-over-NFS-over-XFS data serving. BTW, do your best to avoid this kind of thing!
– In general, the really best option is local disks plus redundant cheap servers (if you want), with some form of data redundancy/MSS
– Additional complexity can impact performance and robustness
Xrootd [Scalla] is only one, always up-to-date: http://savannah.cern.ch/projects/xrootd
– Many sites historically set up everything manually from a CVS snapshot, or wrote new plugins to accommodate their requirements; careful manual config (e.g. BNL-STAR)
– Many others rely on a standardized setup (e.g. the ALICE sites)
– Others take it from the ROOT bundle (e.g. for PROOF), which also comes from a CVS snapshot; again, careful and sometimes very delicate manual config

Most famous basic features
No weird configuration requirements
– Setup complexity scales with the complexity of the requirements. No strange SW dependencies.
Highly customizable
Fault tolerance
High, scalable transaction rate
– Open many files per second. Double the system and double the rate.
– NO databases for filesystem-like functions! Would you put one in front of your laptop's file system? How long would the boot take?
– No known limitations in size and total global throughput for the repository
Very low CPU usage on servers
Happy with many clients per server
– Thousands. But check their bandwidth consumption vs the disk/net performance!
WAN friendly (client + protocol + server)
– Enables efficient remote POSIX-like direct data access through the WAN
WAN friendly (server clusters)
– Can set up huge WAN-wide repositories by aggregating remote clusters, or by making them cooperate

Basic working principle
[Diagram: clients talk to a small two-level cluster of xrootd/cmsd pairs; P2P-like; can hold up to 64 servers]

Simple LAN clusters
[Diagram: Simple cluster – up to 64 data servers, 1-2 manager redirectors. Advanced cluster – up to 4096 (3 levels) or 262K (4 levels) data servers. Everything can have hot spares.]
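To make the cluster structure concrete, here is a minimal sketch of the shared xrootd/cmsd configuration for such a two-level cluster; the hostname and exported path are placeholders, and an ALICE site would normally let xrd-installer generate this rather than write it by hand:

    # Shared by the redirector and all data servers (sketch, placeholder names)
    all.export /data/xrd                      # namespace exported by the cluster
    all.manager redirector.example.org 3121   # where the data servers' cmsd subscribes

    if redirector.example.org
        all.role manager                      # this host only redirects, it holds no data
    else
        all.role server                       # every other host is a data server
    fi

Adding a data server is then just a matter of starting xrootd and cmsd with the same file on a new machine; the cmsd subscription is what builds the 64-server cells sketched above.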

Tech "strategic direction"
To provide tools able to deploy a "unified worldwide storage"
– Valid also locally at the site
– Fast, scalable and efficient
– WAN friendly
– Fault tolerance as per the xrootd protocol + client
– Extended fault tolerance by means of a limited 'self healing' of an SE's content
– Completely self-contained, no external systems
– Based uniquely on base features of the xrootd platform
ALICE is moving in this direction
– Refining history…

Storage and metadata
Here, "Storage" means just "Storage", i.e. you can get the files you put
– Related to the HEP computing models, which typically need a "metadata DB"
To "know" which files are supposed to exist
– But not necessarily their location, which is handled by the storage system
– It does not go out of sync with reality
To associate HEP-related metadata to the files and do offline queries

The ALICE way with XROOTD
Pure Xrootd + the ALICE strong authz plugin. No difference between T1 and T2 (only size and QoS)
WAN-wide globalized deployment, very efficient direct data access
– Tier-0: CASTOR+Xrd serving data normally
– Tier-0: a pure Xrootd cluster serving conditions data to ALL the GRID jobs via WAN
– "Old" DPM+Xrootd in some Tier-2s
[Diagram: an Xrootd site (GSI) and any other Xrootd site (CERN) form a globalized cluster under the ALICE global redirector (cmsd + xrootd). Local clients work normally at each site. Missing a file? Ask the global redirector, get redirected to the right collaborating cluster, and fetch it. Immediately. A smart client could point directly at the global redirector. A Virtual Mass Storage System, built on data globalization.]
More details and complete info in "Scalla/Xrootd WAN globalization tools: where we are", CHEP09
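As an illustration of how a site joins the global namespace, the sketch below shows the kind of directive involved on the site's local redirector; the meta-manager host is the one named in the Q&A slide later in this talk, the port and exact syntax are indicative only, and ALICE sites get all of this from the standard xrd-installer setup:

    # On the site's local redirector only (sketch)
    all.role manager
    all.manager meta alice-gr01 1213    # subscribe this cluster to the ALICE global redirector
    all.export /                        # make the site namespace visible to the global level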

Grid Storage collaboration
Set of features in the latest ALICE xrootd SEs
Very generic features, used in a plain way
– Worldwide storage aggregation
– "Hole fixing": the Virtual Mass Storage System
– A new entry: the eXtreme Copy (alpha)

Status
Good! The latest releases of the xrootd bundle are very stable; 1.6b is to be released now
– No trace of issues with the global cooperation, nor with data access
– 1.6b fixes a bug in the ALICE token lib: a crash which can be triggered only with the 'future' tools. It does not hurt the current ALICE usage. So we are at least one step ahead.

Worldwide Storage
Recipe:
– Take all the ALICE xrootd SEs
– Make sure that they export a correct namespace, i.e. URLs must not contain site-specific parts
– Aggregate them into a unique worldwide metacluster
… and this metacluster appears as a unique storage
Is it so easy? What can I do then?
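A hypothetical illustration of what "no site-specific parts" means (the hostnames are made up, the prefixes are the ones discussed on the next slide): the same file must be reachable under the same VO-wide path at every SE, with only the host changing.

    # Good: identical path everywhere, only the host differs
    root://se.site-a.example:1094//alice/cern.ch/some/production/file.root
    root://se.site-b.example:1094//alice/cern.ch/some/production/file.root

    # Bad: the path leaks local choices, so the file cannot be globalized
    root://se.site-c.example:1094//something/sitename/diskname/alice/cern.ch/some/production/file.root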

Worldwide Storage
For the sysadmin it's quite easy
– If needed, he just has to set one parameter correctly: LOCALPATHPFX. See the latest "Setup tutorial".
Why?
– Historically, each site was given a "VO" prefix, which turned out to be a site-dependent mistake
– Because local choices must remain local, and a shared service MUST ignore local choices
– All the sites hosting the SAME file had a different path prefix for it, not VO-related but site-related (e.g. /alice/cern.ch vs /something/sitename/diskname/)
– Many file entries in the ALICE DB contain disk names, local directory names and other bad things. What if that disk/machine/local user is changed?
– If present, that local prefix must be specified in the local config, so that it can be made locally optional by the xrootd daemon
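A hedged sketch of what this looks like in the xrd-installer system.cnf; the value is a placeholder following the pattern criticized above, and the exact variable spelling should be taken from the Setup tutorial:

    # system.cnf fragment (sketch)
    # Site-local prefix that is NOT part of the VO-wide path.
    # With this set, the daemon treats the prefix as optional, so
    # //alice/cern.ch/... and //something/sitename/diskname/alice/cern.ch/...
    # resolve to the same file.
    LOCALPATHPFX=/something/sitename/diskname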

Worldwide Storage
Does it cover the whole ALICE repository by now?
– Unfortunately not yet
"Old" DPM setups do not support this
– There are chances for the upcoming version
dCache-based sites do not run xrootd
– The xrootd door emulates only the basic data access protocol
– It does not support the xrootd clustering, hence it cannot participate
– So, what can we do until it grows up?

Worldwide Storage
The global redirector 'sees' a fraction of the ALICE repository
– Quite a large one, but not complete
If a site has a 'hole' in its repository, this is a good place to fetch the missing file from
– The absence of a requested file triggers a "staging" from the global redirector
Born experimental, it is turning out to be a big lifesaver
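One way to picture the "virtual MSS": a data server treats the global redirector as its tape backend, so a local miss triggers a copy-in from the rest of the world. Below is only a hedged sketch using the generic oss staging hook with a placeholder script; the ALICE bundle ships its own ready-made mechanism, so treat this as an illustration of the idea rather than the actual setup:

    # xrootd.cf fragment on a data server (sketch)
    oss.stagecmd /opt/xrd/bin/fetch_from_global.sh   # run when a requested file is missing locally

    # fetch_from_global.sh would, roughly, do:
    #   xrdcp root://<global redirector>/<requested path> <local path of the requested file>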

ALICE globalization status
As said, technically very good
– More sites joining
– All the new sites are automatically in the game
– Some older sites have upgraded
– The new CASTOR is able to join
ALICE SEs should:
– Expose a coherent namespace (i.e. a globalizable path)
– Open access for reading
– Secured access for writing

A real example
Remember: ALICE always creates 3 replicas of production-related files
During spring 2009 the ISS SE had serious hardware troubles
– A relatively small but very good site
– They lost a number of files, and it was very difficult to tell which ones; hence the ALICE DB was out of sync with it
When the SE was updated and put back in production
– The users did not notice anything, no jobs crashed
– >1600 missing files were instantly fetched as they were accessed, fixing the repository during the first production afternoon
– The success rate was ~99.9%, the average throughput from external sites was ~5-6 MB/s
– We must remember that the global redirector cannot see all the sites; the few failures were just files in SEs not seen by the global redirector

The eXtreme Copy
Let's suppose that I have to get a (big) file, and that there are several replicas in different sites
Big question: where to fetch it from?
– The closest one? How can I tell if it's the closest? Closest to what? Will it be faster as well?
– The best connected one? It can still be overloaded at that moment, or the label can be misleading
– Whatever I choose, the situation can change over time
Instead, I always want the maximum efficiency

The eXtreme Copy
So let's be Torrent-like
– And fetch individual chunks from everywhere, i.e. from ALL the existing replicas of the file, adapting the speed and the load
– This can boost the performance of copies and adapt it to varying conditions
– But the ALICE computing model does not need to do many copies

The eXtreme Copy
[Diagram: the copy program (xrdcp -x) wants to get 'myfile' from the repository; it asks the globalized cluster (the ALICE global redirector, cmsd + xrootd) to locate 'myfile', then pulls chunks from Xrootd site A and any other Xrootd site B holding replicas.]
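A hedged usage sketch of the command named on the slide; the file path and destination are placeholders, and since the feature was alpha at the time the exact invocation may differ:

    # Pull 'myfile' in eXtreme-copy mode: chunks are fetched in parallel
    # from all the replicas that the ALICE global redirector can locate.
    xrdcp -x root://alice-gr01//alice/cern.ch/some/path/myfile /tmp/myfile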

Main recent fixes
Replica problem
– The ALICE xrootd storage is globalized; this unique worldwide metacluster was refusing to create replicas ('File already exists')
The hung xrdcp problem
– Rarely, xrdcp would hang (or become extremely slow)
– This also fixes a problem inside (Ali)ROOT when dealing with errors while reading data
Stat problem
– Servers were interacting too much with the global redirector, slowing down a bit the creation of new files
Drop the token OFS lib
– No more authz inside the OFS lib; it now uses the default one
– Authz is now performed by data servers, not redirectors
– Authz is now in an XrdAcc plugin by A. Peters (the same as CASTOR)
– Redirectors now perform as they should (CPU consumption cut by 5-10x)
Better parameters for server and partition balancing (space and load)
Better handling and reporting of server load
– Aids the load balancing among servers

ALICE URL format
The ALICE internal URL format is now 'as it should have been':
root://host[:port]//my/path/to/the/file/as/exported
In practice, AliEn asks the storage for AliEn PFNs (i.e. GUIDs), not LFNs anymore
This triggered a hard debugging tour in the central services (thanks Pablo/Andreas)
– As far as I have understood, only the dCache-based sites needed a central fix: for some reason the standard ROOT URL format was not working
– Yes, now the central services send a non-standard one to those sites
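A purely hypothetical illustration of the change (host, port, path structure and GUID below are invented): AliEn now hands the SE a PFN built from the GUID, instead of an LFN-style path.

    # New: PFN/GUID, matching root://host[:port]//path as exported
    root://voalice-se.example:1094//02/34567/0a1b2c3d-4e5f-6789-abcd-ef0123456789

    # Old: LFN-style path, no longer sent to the storage
    root://voalice-se.example:1094//alice/cern.ch/some/analysis/file.root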

Beware!
A changed option means you can no longer copy an old system.cnf into a new setup
– OFSLIB does not exist anymore; it has been replaced by LIBACC
– Copy the values by hand into the new file (10 simple values, not a big deal)
– Remember to switch the DEBUG mode OFF when you are satisfied
For the rest, the instructions are the same, and the AliEn Howto Wiki is up to date
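A hedged sketch of the change in system.cnf; the placeholder on the right-hand side stands for whatever value the current Setup tutorial prescribes:

    # Old system.cnf -- this variable no longer exists, do not copy it over:
    #   OFSLIB=<token-authorization OFS library>

    # New system.cnf -- the authorization plugin is loaded via LIBACC instead:
    LIBACC=<path to the XrdAcc token-authorization plugin>

    # ...and remember to switch the debug mode OFF once you are satisfied.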

Upgrade!
Due to the importance of the fixes and new features, updating is a very good idea now.
And thanks again to the sysadmins of the test sites (ISS::File, Kolkata::SE, CERN::SE, Subatech::SE, Legnaro::SE)

Overview
Relatively pleasant overall situation
– New, respectable SEs keep popping up regularly
– Small sites also make a great contribution to the storage, easy to spot in the SE monitoring
The path to evolution is very much alive
– All the new SEs are in the global domain
– New xrootd-side developments are making things smoother
– This means "the possibility to do something really good"
Very good responses from site admins

Recent Q&A
Frequent question
– Q: "Hey, I installed everything, but I get this warning about alice-gr01:1213"
– A: Very good! Login to the global redirector is NOT open to every SE; by asking, you did the right thing. What I usually do is allow the login for that (new) redirector, and the warning disappears. Note that locally the new SE is operational anyway.

Recent Q&A
The xrd-installer automated setup is giving very good results
– However, it still needs a bit of creativity
– The partitions to aggregate must be specified by the sysadmin
In case there are multiple partitions, e.g. /disk1 + /disk2
– It's a very BAD idea to put every data file in the root directory of /disk1 or /disk2
– It works, but it's messy
– There is also the xrootd namespace (LOCALROOT) to deal with
There is a wiser recipe which makes life easier

Subdirectories are easier!
Don't put ALL your data in / or /diskX!!! This causes massive headaches.
Create subdirectories instead, with meaningful names
– /disk1/xrddata and /disk2/xrddata will contain the data (OSSCACHE)
– /disk1/xrdnamespace will contain the xrootd namespace
Doing this, a backup/restore becomes trivial to perform, and you know what you have
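A hedged sketch of the resulting layout and of the matching system.cnf fragment; the variable names are the ones mentioned on these slides, but the exact value syntax should be checked against the setup tutorial:

    # Meaningful subdirectories instead of dumping files into /diskX
    mkdir -p /disk1/xrddata /disk2/xrddata    # data partitions (OSSCACHE)
    mkdir -p /disk1/xrdnamespace              # xrootd namespace (LOCALROOT)

    # system.cnf fragment (sketch)
    OSSCACHE="/disk1/xrddata /disk2/xrddata"
    LOCALROOT="/disk1/xrdnamespace"

With the namespace and the data kept in well-named subdirectories, a backup or restore is just a matter of copying those trees.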

Thank you
Questions?