
1 Architecture of the gLite Data Management System
Andrea Cortellese (andrea.cortellese@ct.infn.it), INFN Catania
The EPIKH Project (Exchange Programme to advance e-Infrastructure Know-How), www.epikh.eu
Algiers, EUMED/Epikh Application Porting Tutorial, 2010/07/04
Institute of High Energy Physics (IHEP), 6th - 17th September 2010

2 Outline
- Challenges of data management in a Grid infrastructure
- Initial definitions
- Types of Storage Elements
- File naming conventions
- File catalogue
- Practical exercises (hands-on)
Be prepared for a bunch of acronyms!
Beijing, Advanced Tutorial, 06.09.2010 - 17.09.2010

3 Challenges
- Heterogeneity: data are stored on different storage systems using different access technologies.
- Distribution: data are stored in different locations (in most cases there is no shared file system or common namespace), and data need to be moved between locations.
- Data description: data are stored as files, which need to be described and located according to their content.
Services that address these challenges: Storage Resource Manager interface, File Catalogue, File Transfer Service, Metadata Service.

4 Getting started
- The Storage Element (SE) is the service that allows users and applications (programs) to store and retrieve data (files).
- The Data Management System (DMS) provides services for the location, access and transfer of files.
- Users do not need to know a file's location, just its logical name.
- Files can be replicated or transferred to several locations (SEs) as needed.
- Files are shared within a VO.
- Files are write-once, read-many: they cannot be changed unless removed or replaced.

5 Getting started
Files located in the Storage Elements (SEs):
- Are mostly write-once, read-many.
- Are accessible by users and applications from "anywhere" in the Grid.
- Can have several replicas stored at different sites.
- Cannot be changed unless removed or replaced.
Storage Elements (SEs):
- Provide storage space for files.
- Provide a transfer protocol (GSIFTP), i.e. a GSI-based FTP server.
- Provide an interface for the management of disk and tape storage resources: the Storage Resource Manager (SRM).

6 Types of Storage Elements
dCache
- Consists of a server and one or more pool nodes.
- Centralized administration: single point of access to the SE.
- Files are presented in the disk pools under a single virtual filesystem tree.
- Uses the GSI dCache Access Protocol (gsidcap).
StoRM
- Best suited to large storage (> or >> 100 TB).
- Takes full advantage of parallel filesystems (GPFS, Lustre).
- SRM v2.2 interface.
CERN Advanced STORage manager (CASTOR)
- Files are migrated from a disk buffer frontend to tape mass storage.
- Uses the insecure Remote File I/O protocol (RFIO).
Disk Pool Manager (DPM)
- Used for fairly small SEs (max 10 TB of total space) with disk-based storage only.
- Uses the secure RFIO protocol.

7 Storage Resource Manager (SRM)
Diagram callout: "You as a user need to know all the systems!!!"
SRM: "I talk to them on your behalf. I will even allocate space for your files. And I will use transfer protocols to send your files there."
Storage Elements in the diagram: SE CASTOR, SE DPM, SE dCache, SE StoRM.
The SRM is a single interface that takes care of local storage interaction and provides a Grid interface to the outside world.

8 A practical example (1)
She is working on a job which needs to:
- read Monte Carlo simulations from SiteA (StoRM)
- read experiment data from SiteB (dCache)
- read environmental data from SiteC (DPM)
- write output to home SiteD (DPM)

9 File Naming conventions (1)
Grid Unique IDentifier (GUID)
- Every file has a GUID: a non-human-readable unique identifier, e.g.:
  guid:38ed3f60-c402-11d7-a6b0-f53ee5a37e1d
- Note: all replicas of a file share the same GUID.
Logical File Name (LFN)
- An alias that can be used to refer to a file, e.g.:
  lfn:/grid/gilda/users/mario/myfile.dat
Diagram: Logical File Names 1 to N all point to one GUID.

10 File Naming conventions (2)
Storage URL (SURL) or Physical File Name (PFN)
- The location of an actual file on a storage system, e.g.:
  srm://aliserv6.ct.infn.it/dpm/home/gilda/project1/test.dat
- Note: used by the system to find where the replica is physically stored.
Transport URL (TURL)
- A complete URI with the information needed to access a file in a SE (including the access protocol), e.g.:
  rfio://lxshare0209.cern.ch//data/alice/ntuples.dat
Diagram: Logical File Names 1 to N point to one GUID, which points to the physical files SURL 1 to SURL N, each of which is accessed through a TURL.
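The SURL and TURL examples above share the shape scheme://host/path. As a rough illustration only, the sketch below splits the two example strings into their parts with POSIX parameter expansion. The helper names url_scheme, url_host and url_path are made up for this example and are not part of lcg-utils or gLite.

```shell
#!/bin/sh
# Hypothetical helpers (not part of lcg-utils): split a SURL or TURL of the
# form scheme://host/path into its parts using POSIX parameter expansion.

url_scheme() {
    # Everything before the first "://"
    printf '%s\n' "${1%%://*}"
}

url_host() {
    rest="${1#*://}"              # drop "scheme://"
    printf '%s\n' "${rest%%/*}"   # keep the text up to the first "/"
}

url_path() {
    rest="${1#*://}"              # drop "scheme://"
    printf '/%s\n' "${rest#*/}"   # text after the host, with a leading "/"
}

# The two example strings from the slide.
surl="srm://aliserv6.ct.infn.it/dpm/home/gilda/project1/test.dat"
turl="rfio://lxshare0209.cern.ch//data/alice/ntuples.dat"

echo "SURL: scheme=$(url_scheme "$surl") host=$(url_host "$surl") path=$(url_path "$surl")"
echo "TURL: scheme=$(url_scheme "$turl") host=$(url_host "$turl") path=$(url_path "$turl")"
```

The scheme is what distinguishes the two: srm for the SURL handled by the SRM, and an access protocol such as rfio for the TURL.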

11 SRM interactions
Actors in the diagram: Client, SRM, SE.
1. The client asks the SRM for the file, providing a SURL.
2. The SRM asks the Storage Element to provide the file.
3. The Storage Element notifies the availability of the file and its location.
4. The SRM returns a TURL (Transport URL), i.e. the location from where the file can be accessed.
5. The client interacts with the storage using the protocol specified in the TURL.

12 Needles in a haystack
- How do I keep track of all the files I have on the Grid?
- Even if I remember all the LFNs of my files, what about someone else's files?
- How does the Grid keep track of the mapping between LFN(s), GUID and SURL(s)?
Answer: the File Catalogue.
LFC = LCG File Catalogue
- LCG = LHC Computing Grid
- LHC = Large Hadron Collider

13 File Catalogue
- It is the service that maintains the mappings between LFN(s), GUID and SURL(s).
- It keeps track of the location of copies (replicas) of files.
- It consists of a single catalogue, where the LFN is the main key.
- It looks like a "top-level" directory in the Grid.
- For each supported VO, a separate subdirectory exists under the "/grid" directory.
- All members of a given VO have read-write permissions in that directory.
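To make the LFN to GUID to SURL mapping concrete, here is a toy model that keeps two small tables in shell variables. It is only a sketch and not how the LFC is implemented: the function names guid_of and replicas_of and the example.org hosts are invented for illustration, while the GUID and LFNs reuse examples from the slides.

```shell
#!/bin/sh
# Toy stand-in for a file catalogue, NOT the real LFC.
# Hosts under example.org are hypothetical.

# LFN -> GUID (several LFNs may point to the same GUID)
lfn_table='/grid/dteam/dir1/dir2/file1.root 38ed3f60-c402-11d7-a6b0-f53ee5a37e1d
/grid/dteam/mydir/mylink 38ed3f60-c402-11d7-a6b0-f53ee5a37e1d'

# GUID -> SURL (one line per replica)
replica_table='38ed3f60-c402-11d7-a6b0-f53ee5a37e1d srm://se-a.example.org/dteam/file1.root
38ed3f60-c402-11d7-a6b0-f53ee5a37e1d srm://se-b.example.org/dteam/file1.root'

# Print the GUID registered for an LFN (prints nothing if unknown).
guid_of() {
    printf '%s\n' "$lfn_table" | awk -v lfn="$1" '$1 == lfn { print $2; exit }'
}

# Print every replica SURL for an LFN, resolving LFN -> GUID -> SURLs.
# Returns non-zero when the LFN is not in the table.
replicas_of() {
    guid=$(guid_of "$1")
    [ -n "$guid" ] || return 1
    printf '%s\n' "$replica_table" | awk -v g="$guid" '$1 == g { print $2 }'
}

# Both the main LFN and its alias resolve to the same two replicas.
replicas_of /grid/dteam/mydir/mylink
```

Conceptually this is the lookup that listing the replicas of an LFN relies on: the user supplies only the logical name, and the catalogue supplies the physical locations.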

14 The LFC Service
Diagram elements: User Interface, File Catalogue, SE A, SE B, SE C.
Example LFN: lfn:/grid/gilda/tcaland/mpi.txt

15 The LFC Service
Diagram of a catalogue entry:
- LFN: /grid/dteam/dir1/dir2/file1.root
- Symlink: /grid/dteam/mydir/mylink
- GUID: 38ed3f60-c402-11d7-a6b0…
- Replicas: srm://host.example.com/foo/bar (on host.example.com)
- Other labels: LCF key, SURLs, User Metadata, System Metadata
Further LFNs can be added as symlinks to the main LFN.

16 Job submission: example 1
Diagram elements: User Interface, WMS, CE, Worker Nodes.
Small files: InputSandbox / OutputSandbox

17 Data Management: example 2
Diagram elements: User Interface, WMS, CE, Worker Nodes, LFC, SE.

18 LFC commands
These commands interact with the catalogue only.
lfc-chmod       Change access mode of the LFC file/directory
lfc-chown       Change owner and group of the LFC file/directory
lfc-delcomment  Delete the comment associated with the file/directory
lfc-getacl      Get file/directory access control lists
lfc-ln          Make a symbolic link to a file/directory
lfc-ls          List file/directory entries in a directory
lfc-mkdir       Create a directory
lfc-rename      Rename a file/directory
lfc-rm          Remove a file/directory
lfc-setacl      Set file/directory access control lists
lfc-setcomment  Add/replace a comment

19 lcg-utils commands
lcg-cp   Copies a grid file to a local destination
lcg-cr   Copies a file to a SE and registers the file in the catalogue
lcg-del  Deletes one file
lcg-rep  Replicates a file between SEs and registers the replica
lcg-gt   Gets the TURL for a given SURL and transfer protocol
lcg-sd   Sets file status to "Done" for a given SURL in an SRM request
These commands copy files to/from/between SEs and keep the SEs and the catalogue up to date.
The RPM containing these tools (lcg_util) is installed on the WNs and UIs.

20 Environment Variables
Make sure to use the correct BDII and LFC.
- BDII: LCG_GFAL_INFOSYS
  export LCG_GFAL_INFOSYS=wms-01.eumedgrid.eu:2170
- LFC: LFC_HOST
  export LFC_HOST=lfc.eela.ufrj.br

21 Let's practice!
Reference: https://grid.ct.infn.it/twiki/bin/view/GILDA/DataManagement

22 Environment Variables
Pointing to the right BDII:
echo $LCG_GFAL_INFOSYS
export LCG_GFAL_INFOSYS=wms-01.eumedgrid.eu:2170
Pointing to the right LFC:
echo $LFC_HOST
export LFC_HOST=gridsrv3-4.dir.garr.it
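Before running any of the data management commands it can help to confirm that both variables are actually set. The following is a minimal sketch of such a check; missing_vars is a hypothetical helper written for this example, and the exported values are simply the ones shown on the slide.

```shell
#!/bin/sh
# Hypothetical pre-flight check: print the name of every listed variable
# that is unset or empty. Not part of any gLite tool.

missing_vars() {
    for name in "$@"; do
        eval "value=\${$name:-}"          # indirect lookup of $name
        [ -n "$value" ] || printf '%s\n' "$name"
    done
}

# Example values taken from the slide; use the ones for your own site.
export LCG_GFAL_INFOSYS=wms-01.eumedgrid.eu:2170
export LFC_HOST=gridsrv3-4.dir.garr.it

missing=$(missing_vars LCG_GFAL_INFOSYS LFC_HOST)
if [ -n "$missing" ]; then
    echo "Please export: $missing"
else
    echo "BDII: $LCG_GFAL_INFOSYS  LFC: $LFC_HOST"
fi
```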

23 Before starting…
Make sure you have created a proxy:
voms-proxy-info -all
voms-proxy-init --voms eumed

24 LFC: Listing files and directories
lfc-ls -l /grid/eumed/
lfc-mkdir /grid/eumed/fgSillyTests
Remember that the LFC has a directory tree structure:
- /grid/VO/DIRECTORY, where /grid/VO is the LFC namespace and DIRECTORY is defined by the user.
You can set the LFC_HOME variable to use relative paths:
export LFC_HOME=/grid/eumed/
lfc-ls -l fgSillyTests
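The effect of LFC_HOME can be pictured as simple prefixing of relative paths. The sketch below assumes exactly that behaviour and nothing more; resolve_lfc_path is a hypothetical helper for illustration, not an LFC command.

```shell
#!/bin/sh
# Sketch of how a relative catalogue path is resolved against LFC_HOME.
# resolve_lfc_path is a made-up helper, not part of the LFC client tools.

resolve_lfc_path() {
    case "$1" in
        /*) printf '%s\n' "$1" ;;                     # absolute: unchanged
        *)  printf '%s/%s\n' "${LFC_HOME%/}" "$1" ;;  # relative: prefix LFC_HOME
    esac
}

export LFC_HOME=/grid/eumed/
resolve_lfc_path fgSillyTests
resolve_lfc_path /grid/eumed/yourname
```

The trailing slash of LFC_HOME is stripped before joining, so the value from the slide does not produce a doubled slash.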

25 LFC: creating a directory
Create your own personal directory inside /grid/eumed/:
lfc-mkdir /grid/eumed/yourname
You can check the creation by typing:
lfc-ls /grid/eumed

26 Downloading a file
First of all, let's download a file from a SE to start "playing" with it.
Basic usage:
lcg-cp --vo VO LFN LOCAL_FILE
Try it:
lcg-cp --vo eumed lfn:/grid/eumed/fgSillyTests/test_thirst.jpg file://$HOME/test_thirst.jpg

27 Where was it downloaded from?
List the replicas of the file:
lcg-lr --vo eumed lfn:/grid/eumed/fgSillyTests/test_thirst.jpg
This command returns the SURLs of all replicas.
A file can be stored on multiple SEs so that a job can download it from the closest SE while it is running.

28 Copying and registering a file 1/2
lcg-cr copies a file to a SE and registers the file in the catalogue.
Usage:
lcg-cr --vo VO -l LFN -d SE LOCAL_FILE
- Use the lcg-info or lcg-infosites commands to find the available SEs.
- The command returns the GUID of your file.
- Make sure you have a directory in the LFC (/grid/eumed/yourname/).

29 Copying and registering a file 2/2
lcg-infosites --vo eumed se

Avail Space(Kb)  Used Space(Kb)  Type  SEs
----------------------------------------------------------
730164346        1954190213      n.a   gridsrm.ts.infn.it
80910000000      n.a             n.a   torik1.ulakbim.gov.tr
60814375         6470898         n.a   iceage-se-01.ct.infn.it
690051609        5161065384      n.a   se01.isabella.grnet.gr
14652280930      n.a             n.a   prod-se-02.ct.infn.it
2000381018       n.a             n.a   storm-01.roma3.infn.it
7951919455       48673105        n.a   se1.cnrst.magrid.ma

lcg-cr --vo eumed -l lfn:/grid/eumed/yourname/yourfile.txt -d torik1.ulakbim.gov.tr file://$HOME/test_beware.jpg
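One way to pick a destination SE from such a listing is to take the one reporting the most available space. The sketch below does this with awk over the sample listing from the slide, stored in a variable so that it runs without Grid access. The function name largest_se is made up, and choosing by free space is only one possible criterion.

```shell
#!/bin/sh
# Sketch: choose the SE with the most available space from lcg-infosites
# output. The listing is the sample from the slide, not live data.

infosites_output='Avail Space(Kb)  Used Space(Kb)  Type  SEs
----------------------------------------------------------
730164346        1954190213      n.a   gridsrm.ts.infn.it
80910000000      n.a             n.a   torik1.ulakbim.gov.tr
60814375         6470898         n.a   iceage-se-01.ct.infn.it
690051609        5161065384      n.a   se01.isabella.grnet.gr
14652280930      n.a             n.a   prod-se-02.ct.infn.it
2000381018       n.a             n.a   storm-01.roma3.infn.it
7951919455       48673105        n.a   se1.cnrst.magrid.ma'

# Keep rows whose first column is a number, remember the largest value,
# and print the SE name (last column) of that row.
largest_se() {
    awk '($1 ~ /^[0-9]+$/) && (($1 + 0) > max) { max = $1 + 0; se = $NF }
         END { if (se != "") print se }'
}

printf '%s\n' "$infosites_output" | largest_se
```

With live data the same function could read from a pipe, for example lcg-infosites --vo eumed se | largest_se, assuming the output keeps the column layout shown on the slide.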

30 Replicate a file between SEs
Basic usage:
lcg-rep --vo VO -d SE LFN
Try it:
lcg-rep --vo eumed -d prod-se-02.ct.infn.it lfn:/grid/eumed/yourname/yourfile.txt

31 Listing the replicas
Use the same lcg-lr command as before:
lcg-lr --vo eumed lfn:/grid/eumed/yourname/yourfile.txt
The command returns the SURLs of all replicas.
A file can be stored on multiple SEs so that a job can download it from the closest SE while it is running.

32 Adding metadata information
This is the only user-defined metadata that can be associated with catalogue entries.
Basic usage:
lfc-setcomment PATH "Your comments"
Try it:
lfc-setcomment /grid/eumed/yourname/yourfile.txt "Beware of these two guys"

33 Listing with comments
Try it:
lfc-ls --comment /grid/eumed/yourname/yourfile.txt

34 Downloading a file
Basic usage:
lcg-cp --vo VO LFN LOCAL_FILE
Try it:
lcg-cp --vo eumed lfn:/grid/eumed/yourname/yourfile.txt file://$HOME/theTestPicture.jpg

35 Deleting a file
Basic usage:
lcg-del -a --vo VO LFN
When used with the '-a' switch, the command deletes all replicas and removes the entry from the catalogue.
Try it:
lcg-del -a --vo eumed lfn:/grid/eumed/yourname/yourfile.txt

36 Removing an LFC directory
Basic usage:
lfc-rm -r DIRECTORY
Try it:
lfc-rm -r /grid/eumed/yourname
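The hands-on steps from creating a directory to removing it again can be collected into one script. The version below is a dry-run sketch: by default it only prints each command, so it can be read and checked without touching the Grid. The run and workflow helpers and the local file name are assumptions made for this example; the commands, flags and SE names are the ones used on the slides.

```shell
#!/bin/sh
# Dry-run recap of the hands-on sequence. Commands are only printed unless
# DRY_RUN=0 is set in the environment. The local file name is hypothetical.

DRY_RUN=${DRY_RUN:-1}
VO=eumed
LFN=lfn:/grid/eumed/yourname/yourfile.txt
LOCAL=file://${HOME:-/tmp}/yourfile.txt

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

workflow() {
    run lfc-mkdir /grid/eumed/yourname                             # personal directory
    run lcg-cr --vo "$VO" -l "$LFN" -d torik1.ulakbim.gov.tr "$LOCAL"  # copy and register
    run lcg-rep --vo "$VO" -d prod-se-02.ct.infn.it "$LFN"         # second replica
    run lcg-lr --vo "$VO" "$LFN"                                   # list replicas
    run lcg-del -a --vo "$VO" "$LFN"                               # delete all replicas
    run lfc-rm -r /grid/eumed/yourname                             # remove the directory
}

workflow
```

Running it with DRY_RUN=0 would execute the commands for real, which requires a valid proxy and the environment variables described earlier.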

37 Special Note: Get the file TURL
Basic usage:
lcg-gt SURL PROTOCOL
Try it:
lcg-gt SURL gsiftp

38 Thank you for your kind attention! Any questions?

