Xrootd setup
An introduction/tutorial for ALICE sysadmins
F. Furano (CERN IT-DM)

Outline
– Intro: basic notions, make sure we know what we want
– Setup of an xrootd cluster
– Configuration for ALICE usage
– A few words on data migrations

Introduction
We must know what we want, and what the possibilities are.

The historical problem: data access
Physics experiments rely on rare events and statistics:
– A huge amount of data is needed to get a significant number of events.
– The typical data store can reach 5-10 PB… now.
– Millions of files, thousands of concurrent clients.
The transaction rate is very high:
– O(10^3) file opens/sec per cluster is not uncommon (average, not peak).
– Traffic sources: local GRID site, local batch system, WAN.
– Up to O(10^3) clients per server!
If these needs are not met, the outcome is crashes, instability, workarounds, and the “need” for crazy things.
What we want is scalable, high-performance direct data access:
– No imposed limits on performance, size or connectivity.
– Higher performance, support for WAN direct data access.
– Avoids worker-node under-utilization.
– No need for inefficient local copies when they are not needed. Do we fetch entire websites to browse one page?

Main requirement
Data access has to work reliably at the desired scale. This also means:
– It has to work; simpler is better.
– It must not waste resources, which includes the cluster's dimensioning: a very tricky topic for any kind of system.
With xrootd you are allowed to focus almost exclusively on the hardware (disks + network): a 2-4 core CPU per server is usually fine these days.

Basic working principle
[Diagram: a small two-level cluster, P2P-like. Each node runs an xrootd/cmsd pair; clients contact the top-level pair, which can federate up to 64 data servers.]
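To make the two roles concrete, here is a hedged, hand-written sketch of the raw xrootd/cmsd directives behind such a cluster (the ALICE installer described later generates the equivalent from system.cnf; host names and paths are invented):

    # redirector (manager) node
    all.role manager
    all.manager redirector.mysite.org 3121
    xrd.port 1094
    all.export /alice

    # each data server node
    all.role server
    all.manager redirector.mysite.org 3121
    oss.localroot /mydisk1/data/xrdnamespace
    all.export /alice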

Simple LAN clusters
[Diagram, left: a simple cluster — up to 64 data servers, 1-2 manager redirectors. Right: an advanced cluster — up to 4096 data servers with 3 levels, or 262K with 4 levels. Everything can have hot spares.]

xrootd plugin architecture
[Diagram: the protocol driver (XRD) hosts a protocol plugin (xrootd, 1 of n); below it sit the file system layer (ofs, sfs, alice, etc.), authentication (gsi, krb5, etc.), authorization (name based), clustering (cmsd), lfn2pfn prefix encoding, and the storage system back-end (oss, drm/srm, etc.).]

Pure xrootd flavors
Standardized all-in-one setup:
– Supports up to 64 servers.
– All ALICE xrootd-based sites use this, except NIHAM.
Manual setup:
– Always possible, but needs deeper knowledge.
– Used for PROOF and by other experiments (BaBar, Fermi [GLAST], some US ATLAS sites, …).
Integrations/bundles (not treated here):
– DPM, CASTOR: described in their own documentation.
– GPFS+StoRM: nothing special is really needed from the xrootd point of view.

The ALICE way with XROOTD
– Pure xrootd + the ALICE strong authorization plugin. No difference between T1 and T2 (only size and QoS).
– WAN-wide globalized deployment, very efficient direct data access.
– CASTOR at Tier-0 serving data; pure xrootd serving conditions data to the GRID jobs.
– “Old” DPM+xrootd in several Tier-2s.
[Diagram: a globalized cluster built on data globalization — the Virtual Mass Storage System. Local clients work normally at each site. If a site (e.g. GSI) is missing a file, it asks the ALICE global redirector, gets redirected to the right collaborating cluster (e.g. CERN), and fetches it. Immediately. A smart client could point to the global redirector directly.]
More details and complete info in “Scalla/Xrootd WAN globalization tools: where we are”, CHEP09.

LFNs and PFNs in ALICE
Not so trivial… here is the path:
– An AliEn LFN is fed to the AliEn file catalogue, giving an AliEn PFN (in two flavors). The AliEn PFN is what is asked of the storage.
– The AliEn PFN is given to xrootd as an xrootd LFN.
– The xrootd name translation is applied: if the site has a local AliEn PFN prefix (DEPRECATED), it is added (the ALICE GLOBAL name translation); then the regular xrootd translation adds a prefix, i.e. the mountpoint name.
– The result is the xrootd PFN. When you migrate or do maintenance, these are what you deal with!

The name translation
The xrootd name translation is very simple: it adds a prefix in front of everything.
– We call it “localroot”. In practice, it is the name of the directory where the files are stored.
– An empty prefix is not admitted; admins are supposed to choose safe places.
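As a sketch, the raw directive behind “localroot” is oss.localroot; with the example value used later in this tutorial:

    # every incoming LFN gets this prefix prepended before touching the disk
    oss.localroot /mydisk1/data/xrdnamespace
    # e.g. the LFN /alice/cern.ch/data/file.root is stored as
    #   /mydisk1/data/xrdnamespace/alice/cern.ch/data/file.root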

The ALICE global name translation
A complex thing which translates into a super-simple rule:
– Look in the ALICE LDAP for your SE.
– If there is a prefix for the data local to your cluster, then your cluster, as is, cannot be OK for the global redirector. Copy the prefix; we will call it the “local path prefix” and put it in the xrootd config.
– If xrootd sees an LFN which does not have the “local path prefix”, it will internally add it.
This is because every cluster has to expose a “global namespace” (so, without exposing weird local prefixes to other sites), but the old stuff still has to work. Stay tuned, more info later.

The ALICE global name translation
[Diagram illustrating the ALICE global name translation.]

The ALICE global name translation
That prefix was a relic of the past, when the concepts of “local”, “physical” and “name translation” were not very clear.
– New ALICE SEs must have it empty, because they export a coherent namespace to the central services.
– In other words: an AliEn PFN must look the same at every site. The purpose of this additional tricky translation (LOCALPATHPFX) is to accommodate this also for older sites.
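A worked example may help (all paths invented for illustration): suppose an old SE used the local AliEn prefix /old/local/prefix. Setting LOCALPATHPFX to that value lets the site answer to the clean global names as well:

    LOCALPATHPFX = /old/local/prefix          # assumed value, from LDAP
    LOCALROOT    = /mydisk1/data/xrdnamespace

    global LFN requested:  /alice/sim/file.root
    after LOCALPATHPFX:    /old/local/prefix/alice/sim/file.root
    after LOCALROOT:       /mydisk1/data/xrdnamespace/old/local/prefix/alice/sim/file.root

Old-style requests that already carry /old/local/prefix are left untouched, so both generations of PFNs keep working.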

The cache file system
A lot of people seem to choke in front of this. In xrootd terminology we call it “oss.cache”.
Purpose: aggregate several mountpoints in a data server.
– They MUST be exported as one unique thing; the rest of the computing MUST NOT deal with local choices, e.g. the names of the disks/directories in a machine. This is information which MUST remain local.
Common errors:
– Putting data files in /.
– Putting data and namespace (i.e. localroot) together. It works, but it is messy and difficult to manage when you need it.

The cache file system
A unique directory contains the “namespace”: a directory tree which contains symlinks.
– Meaningful name: /mydisk1/data/xrdnamespace. This is the xrootd internal prefix (localroot).
The other N directories contain the data files, slightly renamed (not our business).
– Meaningful names: /mydisk1/data/xrddata, /mydisk2/data/xrddata.
Example:
/mydisk1/data/xrdnamespace/my/little/file.txt → /mydisk1/data/xrddata/my%little%file.txt
… and the picture should be clearer now.
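On a data server this is easy to inspect with plain UNIX tools; a sketch using the example paths above:

    # the namespace tree is made of symlinks...
    $ ls -l /mydisk1/data/xrdnamespace/my/little/file.txt
    lrwxrwxrwx ... file.txt -> /mydisk1/data/xrddata/my%little%file.txt

    # ...while the data directories hold the renamed files themselves
    $ du -sh /mydisk1/data/xrddata /mydisk2/data/xrddata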

Setup
A view on the procedure and the options; rules of thumb where possible.

Root or xrootd user?
It depends on the site requirements; internally xrootd ALWAYS runs as a normal user.
– If you need an RPM (e.g. because you have Quattor), you need root access; the configuration does not need to reflect this.
There are the usual pros and cons. Typically:
– Normal user setup: easier to maintain for small installations; less downtime needed for operations.
– Root setup: very nice for RPM-based automated setups.

A note on file descriptors
The purpose of a server is to handle clients, otherwise it is useless. Hence the machine has to be configured as a server, which typically means:
– A good TCP configuration. The SLC defaults are very bad; ask Costin Grigoras for details/moral support.
– All the OS resources are available: check the maximum overall number of file descriptors (per machine, not per user).
– The maximum number of file descriptors per user must be >65000. In the AliEn HOWTOs there is a recipe for that. Xrootd will not start if this is not met.
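A hedged sketch of the usual recipe (file names and values are the common ones; check the AliEn HOWTO for the authoritative version):

    # per-user limit, e.g. in /etc/security/limits.conf (user name is an example)
    aliprod  soft  nofile  65500
    aliprod  hard  nofile  65500

    # machine-wide limit, e.g. in /etc/sysctl.conf
    fs.file-max = 200000

    # verify as the xrootd user before starting the daemons
    $ ulimit -n
    65500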

Normal setup
From the Wiki (and simplified):
– Get the script:
wget http://project-arda-dev.web.cern.ch/project-arda-dev/xrootd/tarballs/installbox/xrd-installer
– Run it (the prefix is the installation directory of your choice):
./xrd-installer --install --prefix <install-dir>
– Check the result: FAILED means a bad thing.

Making an RPM
From the Wiki page:
– Go into a work subdirectory:
mkdir work
cd work
– Get the script:
wget http://project-arda-dev.web.cern.ch/project-arda-dev/xrootd/tarballs/installbox/xrd-rpmer
chmod 755 ./xrd-rpmer
– Run it as the root user (and wait):
sudo ./xrd-rpmer
– Check that there were no compilation failures.
The RPM will be placed in the usual directory (/usr/src/redhat/RPMS/). Depending on the Linux distribution the location could be different; there is no way to know it in advance.

Configuration files
Two files, but only one needs modifications:
– The default authz.cnf is already configured for the ALICE security model.
– system.cnf contains the configuration: redirector name, data partition names, etc.

Important config options
– MANAGERHOST: the redirector's name.
– SERVERONREDIRECTOR: is this machine both a server and a redirector?
– XRDUSER: the name of the user which runs the daemons (e.g. xrootd, aliprod).
– OFSLIB: the authorization plugin library to be used in xrootd.
– LOCALROOT: the local (relative to the mounted disks) place where all the data (or the namespace) is put/kept by the xrootd server.
– OSSCACHE: probably your server has more than one disk to use to store data; here you list the mountpoints where the data is stored.
– LOCALPATHPFX: the prefix of the namespace which has to be made “facultative” (not needed by everybody).
– METAMGRHOST, METAMGRPORT: host and port number of the meta-manager (global redirector).
– VMSS_SOURCE: where this cluster tries to fetch files from when they are absent.
A sample system.cnf sketch follows this list.
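To make the variables concrete, here is a hedged sketch of a system.cnf; every value is an invented placeholder to adapt, not a recommendation:

    # system.cnf -- example values only
    MANAGERHOST=redirector.mysite.org     # output of `hostname -f` on the redirector
    SERVERONREDIRECTOR=0                  # 1 only for 1-2 machine clusters
    XRDUSER=aliprod                       # system user running the daemons
    OFSLIB=TokenAuthzOfs                  # ALICE strong authorization
    LOCALROOT=/mydisk1/data/xrdnamespace
    OSSCACHE="oss.cache public /mydisk1/data/xrddata\noss.cache public /mydisk2/data/xrddata"
    LOCALPATHPFX=                         # empty for new SEs; old prefix from LDAP otherwise
    METAMGRHOST=globalredirector.cern.ch  # hypothetical global redirector host
    METAMGRPORT=1213
    VMSS_SOURCE=root://globalredirector.cern.ch:1094//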

Config options: MANAGERHOST
The name of the machine hosting the redirector. Execute on the redirector machine:
hostname -f
and copy/paste the result.
– The redirector and all the data servers will understand their role from this information, automagically.
– So the content must be the same on all the machines of the cluster.

Config options: SERVERONREDIRECTOR
Is your cluster small (1-2 machines)? If so, you can use one machine to host BOTH the redirector services and the data server services; if you like this, put “1”.
– You need decent hardware to do that; reconsider this when you add new servers.
Otherwise put “0”.

Config options: XRDUSER
The daemons will be run under a user's credentials; typically everybody uses “xrootd” or “aliprod”.
– Write here the name of this system user.
– MAKE SURE IT EXISTS, and MAKE SURE IT HAS A HOME DIRECTORY.

Config options: METAMGRHOST, METAMGRPORT and VMSS_SOURCE
METAMGRHOST, METAMGRPORT:
– The address of the global redirector; through this, all the data is accessible in the global domain.
VMSS_SOURCE:
– The URL prefix which points to the global redirector; through this, the cluster auto-feeds itself from the global domain in case of trouble.

Config options: OFSLIB
This selects the way the ALICE strong authorization works:
– Specify TokenAuthzOfs to enable the authorization. Mandatory for ALICE: writing is only allowed via the authorization of the central services.
– Specify Ofs to make your little tests out of production, but the SE will not work without the strong auth.

Config options: LOCALROOT
The place where the xrootd daemon puts the written data; something like /mydisk1/xrdfiles/.
– If you are aggregating multiple partitions (OSSCACHE option) then you will put something like /mydisk1/data/xrdnamespace.
– This is going to be your most important directory: keep it separated from the others!

Config options: LOCALPATHPFX
The “old style” local prefix to the data.
– Take it from LDAP (see the next slide).
– If you are in doubt, please ask.

Config options: LOCALPATHPFX
[Screenshot: the SE's local path prefix as shown in the ALICE LDAP.]

Config options: OSSCACHE
Here you list (with the xrootd syntax) the config lines which set up the cache file system, separated by “\n”.
E.g. if your machine can use /data/disk1 and /data/disk2 as raw storage (HINT: don't put data in the root dir of a mountpoint; it works but it's uncomfortable), then put:
“oss.cache public /data/disk1/xrddata\noss.cache public /data/disk2/xrddata”
(expanded below)
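Expanded, that single variable becomes two directives in the generated xrootd config, one per mountpoint:

    oss.cache public /data/disk1/xrddata
    oss.cache public /data/disk2/xrddata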

Setup: start and stop
– xrd.sh: check the status of the daemons.
– xrd.sh -c: check the status of the daemons and start the ones which are not currently running (and make sure they are checked every 5 minutes); also do the log rotation. See the cron sketch below.
– xrd.sh -f: force a restart of all daemons.
– xrd.sh -k: stop all daemons.
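The 5-minute check is normally wired into cron; a minimal sketch, assuming xrd.sh lives wherever the installer put it:

    # crontab of the XRDUSER account; <path-to> is a placeholder
    */5 * * * *  <path-to>/xrd.sh -c > /dev/null 2>&1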

I want to migrate/clean up
Too many different situations, e.g. mismatches in the number of servers, in the partition sizes, in the partition numbers, … Hence a single recipe is difficult to invent; better to understand how it works and be prepared.
BUT… the files are there, and they are “in clear”:
– Easy with one partition per server: just do it with the normal UNIX tools.
– With OSSCACHE we also need to migrate/recreate the filesystem part made of symlinks. Every case is different (I never saw the same thing twice); in the worst cases a small shell script does the job.
Here we show two relevant cases.

The ALICE::CERN::SE July trick
All of the ALICE conditions data is in ALICE::CERN::SE:
– 5 machines, pure xrootd; all the jobs access it, from everywhere.
– There was the need to refurbish that old cluster (2 years old) with completely new hardware.
– A VERY critical and VERY stable service: stop it and every job stops.
The idea: use the same pieces of the vMSS in a particular way, in order to phase out an old SE by moving its content to the new one.
– It can be many TBs; rsync cannot sync 3 servers into 5, nor fix the organization of the files.
How to do it (sketched after this slide):
– Point VMSS_SOURCE to the old cluster.
– Point the load to the new cluster.
Advantages:
– Files are spread evenly → load balancing is effective.
– The most used files are typically fetched first.
– No service downtime; the jobs did not stop. Server downtime of 10-20 min… just to be sure.
– The client-side fault tolerance made the jobs retry with no trouble.
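In configuration terms the trick is essentially one line on the new cluster; a sketch with invented host names:

    # system.cnf of the NEW cluster
    MANAGERHOST=new-redirector.cern.ch
    # auto-feed missing files from the OLD cluster instead of the global domain:
    VMSS_SOURCE=root://old-redirector.cern.ch:1094//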

The ALICE::CERN::SE July trick
[Diagram: the new SE (starting empty), the old CERN::ALICE::SE (full), and the ALICE global redirector. The load from the Grid Shuttle and the other clients goes to the new SE, which fetches missing files from the old one.]

The CNAF09 migration
Move all the data (17 TB):
– From the physical dir: /storage/gpfs_alice/alice
– To the physical dir: /storage/gpfs_alice/storm
– No downtime wanted (not really needed, but a very nice exercise).
First we note (from the paths) that there is no cache file system at CNAF:
– They use GPFS as a backend store (+ StoRM as the SRM implementation).
– All the files are stored in one directory tree, which has to be moved to a different mount point.
Credits: F. Noferini (CNAF)

The CNAF09 migration
Scenario #1: downtime.
– Announce downtime; stop all the services.
– Start a monster copy with rsync; check the destination.
– Update the xrootd config: fix LOCALROOT to the new path.
– Restart the services; end of downtime.
Depending on the data volume, it could take even days/weeks (17 TB is not much; it could have been 1700 or 17000).

The CNAF09 migration
Scenario #2: NO downtime, full service during the migration, and the new files go in the new place.
– Add a new r/w server to the cluster, on the new mountpoint.
– Make the old server r/o (needs a cfg tweak): all the new files will go to the new one, while the old ones are still accessible.
– Start the monster rsync; check the destination.
– Switch the old server off: the clients' fault tolerance will save the jobs by jumping to the new server. No saving action needed.
– Done!
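The “monster rsync” itself is ordinary; a hedged sketch with the CNAF paths from the previous slide (the flags are an assumption, not from the slides):

    # copy preserving symlinks, hard links, owners and times; safe to re-run
    rsync -aH /storage/gpfs_alice/alice/ /storage/gpfs_alice/storm/
    # near the end, a dry-run shows what is still out of sync
    rsync -aHn /storage/gpfs_alice/alice/ /storage/gpfs_alice/storm/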

Conclusion
I realize that this might not be enough: it is impossible to cover all the use cases, even the simple ones. We could try to write a book… would you buy it? :-D
Most of the complexity is hidden, but it is still there for advanced scenarios. Basic principles:
– Know how it works; develop your own ideas.
– Get in touch with the ALICE Task Force for hints/assistance/moral support.
– The documentation for geeks is public: quite advanced but very complete. More normal persons will prefer the gentler material.

Thank you
Questions?