
The EPIKH Project (Exchange Programme to advance e-Infrastructure Know-How) gLite Data Management Maha Metawei


1 www.epikh.eu
The EPIKH Project (Exchange Programme to advance e-Infrastructure Know-How)
gLite Data Management
Maha Metawei (maha_metawei@eri.sci.eg), Electronic Research Institute (ERI)
Joint EPIKH/EUMEDGRID-Support Event in Cairo, Egypt, 25.10.2010

2 Outline
- Challenges of data management in a Grid infrastructure
- Initial definitions
- Types of Storage Elements
- File naming conventions
- File catalogue
- Practical exercises (hands-on)
Be prepared for a bunch of acronyms!

3 Challenges
Heterogeneity
- Data are stored on different storage systems using different access technologies.
Distribution
- Data are stored in different locations (in most cases there is no shared file system or common namespace).
- Data need to be moved between different locations.
Data description
- Data are stored as files, so they need to be described and located according to their content.
Services that address these challenges: Storage Resource Manager (SRM) interface, File Catalogue, File Transfer Service, Metadata Service.

4 Getting started
- The Storage Element (SE) is the service that allows users and applications (programs) to store and retrieve data (files).
- The DMS provides services for locating, accessing and transferring files.
- Users do not need to know where a file is located, just its logical name.
- Files can be replicated or transferred to several locations (SEs) as needed.
- Files are shared within a VO.
- Files are write-once, read-many: they cannot be modified in place, only removed or replaced.

5 Getting started
Files located in the Storage Elements (SEs)...
- Are mostly write-once, read-many.
- Are accessible by users and applications from "anywhere" in the Grid.
- Can have several replicas, kept at different sites.
- Cannot be modified in place, only removed or replaced.
Storage Elements (SEs)...
- Provide storage space for files.
- Provide a transfer protocol (GSIFTP), i.e. a GSI-based FTP server.
- Provide an interface for the management of disk and tape storage resources: the Storage Resource Manager (SRM).

6 Types of Storage Elements
dCache
- Consists of a server and one or more pool nodes.
- Centralized administration: single point of access to the SE.
- Files in the disk pools are presented under a single virtual filesystem tree.
- Uses the GSI dCache Access Protocol (gsidcap).
StoRM
- Best suited to large storage (> or >> 100 TB).
- Takes full advantage of parallel filesystems (GPFS, Lustre).
- SRM v2.2 interface.
CERN Advanced STORage manager (CASTOR)
- Files are migrated from a disk buffer front end to tape mass storage.
- Uses the insecure Remote File I/O protocol (RFIO).
Disk Pool Manager (DPM)
- Used for fairly small SEs (max 10 TB of total space) with disk-based storage only.
- Uses the secure RFIO protocol.

7 Storage Resource Manager (SRM)
Without the SRM: "You as a user need to know all the systems!"
With the SRM:
- "I talk to them on your behalf."
- "I will even allocate space for your files."
- "And I will use transfer protocols to send your files there."
Storage Elements behind the SRM in the picture: CASTOR, DPM, dCache, StoRM.
The SRM is a single interface that takes care of local storage interaction and provides a Grid interface to the outside world.

8 A practical example (1)
A user is working on a job that needs to:
- read Monte Carlo simulations from siteA (StoRM)
- read experiment data from siteB (dCache)
- read environmental data from siteC (DPM)
- write output to the home siteD (DPM)

9 File Naming conventions (1)
Grid Unique IDentifier (GUID)
- Every file has a GUID: a non-human-readable unique identifier, e.g.:
  guid:38ed3f60-c402-11d7-a6b0-f53ee5a37e1d
- Note: all replicas of a file share the same GUID.
Logical File Name (LFN)
- An alias that can be used to refer to a file, e.g.:
  lfn:/grid/gilda/users/mario/myfile.dat
Diagram: Logical File Name 1 ... Logical File Name N all map to one GUID.

10 File Naming conventions (2)
Storage URL (SURL) or Physical File Name (PFN)
- The location of an actual file on a storage system, e.g.:
  srm://aliserv6.ct.infn.it/dpm/home/gilda/project1/test.dat
- Note: used by the system to find where the replica is physically stored.
Transport URL (TURL)
- Complete URI with the information needed to access a file in an SE, including the access protocol, e.g.:
  rfio://lxshare0209.cern.ch//data/alice/ntuples.dat
Diagram: Logical File Name 1 ... Logical File Name N map to one GUID; the GUID maps to Physical File SURL 1 ... SURL N; each SURL resolves to a TURL.
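The four kinds of name can be told apart by their prefix. The sketch below is only an illustration of the conventions on these two slides: `name_kind`, `surl_host` and `surl_path` are hypothetical helper names (not gLite commands), and treating every other `scheme://` string as a TURL is a simplifying assumption.

```shell
#!/bin/sh
# Classify a grid file name by its prefix (hypothetical helper, not part of gLite).
name_kind() {
    case "$1" in
        guid:*)  echo "GUID" ;;
        lfn:*)   echo "LFN" ;;
        srm://*) echo "SURL" ;;
        *://*)   echo "TURL" ;;     # assumption: any other scheme:// is a TURL
        *)       echo "unknown" ;;
    esac
}

# Print the Storage Element host name contained in a SURL.
surl_host() {
    rest=${1#srm://}       # drop the scheme
    echo "${rest%%/*}"     # keep everything before the first slash
}

# Print the path of the file on that Storage Element.
surl_path() {
    rest=${1#srm://}
    echo "/${rest#*/}"     # keep everything after the host
}
```

For example, `surl_host srm://aliserv6.ct.infn.it/dpm/home/gilda/project1/test.dat` prints the SE host name, which is the value passed to `-d` when choosing a destination SE.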

11 SRM interactions
Actors: Client, SRM, Storage Element (SE).
1. The client asks the SRM for the file, providing a SURL.
2. The SRM asks the Storage Element to provide the file.
3. The Storage Element notifies the SRM that the file is available and where it is.
4. The SRM returns a TURL (Transport URL), i.e. the location from which the file can be accessed.
5. The client interacts with the storage using the protocol specified in the TURL.
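Step 5 depends on the protocol named at the front of the TURL. As a rough sketch of that last step, the hypothetical helpers below (`turl_protocol` and `access_hint` are illustrative names, not gLite tools) extract the scheme from a TURL and map it to the protocols mentioned in this presentation; the descriptive strings are taken from the slides.

```shell
#!/bin/sh
# Print the access protocol (URI scheme) of a TURL; fail if there is none.
turl_protocol() {
    case "$1" in
        *://*) echo "${1%%://*}" ;;
        *)     echo "unknown"; return 1 ;;
    esac
}

# Describe how a client would access the file, for the protocols named in the slides.
access_hint() {
    proto=$(turl_protocol "$1") || { echo "not a TURL"; return 1; }
    case "$proto" in
        gsiftp)  echo "GSI-based FTP transfer" ;;
        gsidcap) echo "GSI dCache Access Protocol" ;;
        rfio)    echo "Remote File I/O" ;;
        *)       echo "protocol not covered in these slides" ;;
    esac
}
```

On a real User Interface the TURL itself would come from `lcg-gt`, which takes a SURL and a transfer protocol.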

12 Needles in a haystack
- How do I keep track of all the files I have on the Grid?
- Even if I remember all the LFNs of my files, what about someone else's files?
- How does the Grid keep track of the mapping between LFN(s), GUID and SURL(s)?
Answer: the File Catalogue
- LFC = LCG File Catalogue
- LCG = LHC Computing Grid
- LHC = Large Hadron Collider

13 File Catalogue
- It is the service that maintains the mappings between LFN(s), GUID and SURL(s).
- It keeps track of the location of the copies (replicas) of files.
- It is a single catalogue in which the LFN is the main key.
- It looks like a "top-level" directory in the Grid.
- For each supported VO, a separate subdirectory exists under the "/grid" directory.
- All members of a given VO have read-write permissions in that VO's directory.
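Because every VO has its own subdirectory under "/grid", the VO that owns a catalogue entry can be read straight from the LFN. A minimal sketch, assuming the `/grid/<vo>/...` layout described above; `lfn_vo` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the VO name embedded in an LFN such as lfn:/grid/gilda/tutorials/...
# Accepts the path with or without the "lfn:" prefix; fails for paths outside /grid.
lfn_vo() {
    path=${1#lfn:}
    case "$path" in
        /grid/?*)
            rest=${path#/grid/}    # e.g. gilda/tutorials/cairo01/text_file.txt
            echo "${rest%%/*}"     # first path component is the VO
            ;;
        *)
            return 1
            ;;
    esac
}
```

With the examples used in this presentation, `lfn_vo lfn:/grid/gilda/tcaland/mpi.txt` prints `gilda` and `lfn_vo /grid/dteam/dir1/dir2/file1.root` prints `dteam`.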

14 The LFC Service
Diagram: User Interface, File Catalogue, and Storage Elements SE A, SE B and SE C. Example LFN: lfn:/grid/gilda/tcaland/mpi.txt

15 The LFC Service
Structure of a catalogue entry:
- LFN (LFC key): /grid/dteam/dir1/dir2/file1.root
- Symlink: /grid/dteam/mydir/mylink. Further LFNs can be added as symlinks to the main LFN.
- GUID: 38ed3f60-c402-11d7-a6b0…
- Replicas (SURLs): srm://host.example.com/foo/bar, stored on host.example.com
- System metadata
- User metadata
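The mapping the catalogue maintains can be pictured as a small table with one row per replica. The sketch below is a toy model only, not how the LFC stores data: `catalogue`, `guid_of` and `replicas_of` are hypothetical names, and the rows reuse the LFN, GUID and SURLs from the hands-on session of this presentation.

```shell
#!/bin/sh
# Toy catalogue: one line per replica, with the columns "LFN GUID SURL".
catalogue() {
    cat <<'EOF'
lfn:/grid/gilda/tutorials/cairo01/text_file.txt guid:5beed2e7-142e-4bec-910e-f0538a7f36c7 srm://aliserv6.ct.infn.it/dpm/ct.infn.it/home/gilda/generated/2010-10-23/fileb1b3da27-5679-4f85-bb47-ec10e8259bdc
lfn:/grid/gilda/tutorials/cairo01/text_file.txt guid:5beed2e7-142e-4bec-910e-f0538a7f36c7 srm://gilda-02.pd.infn.it/dpm/pd.infn.it/home/gilda/generated/2010-10-23/file2b4c7871-de9b-46b4-a0e8-41e854024093
EOF
}

# Print the GUID registered for an LFN (all replicas share it).
guid_of() {
    catalogue | awk -v lfn="$1" '$1 == lfn { print $2; exit }'
}

# Print every SURL registered for an LFN or a GUID, one per line.
replicas_of() {
    catalogue | awk -v key="$1" '$1 == key || $2 == key { print $3 }'
}
```

In this model `replicas_of` plays the role that `lcg-lr` plays against the real catalogue, and `guid_of` the role of `lcg-lg`.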

16 Job submission – example 1
Components: User Interface, WMS, CE, Worker Nodes.
Small files travel with the job: InputSandbox / OutputSandbox.

17 Data Management – example 2
Components: User Interface, WMS, CE, Worker Nodes, LFC, SE.

18 LFC Commands
Summary of the LFC Catalog commands:
lfc-chmod        Change access mode of the LFC file/directory
lfc-chown        Change owner and group of the LFC file/directory
lfc-delcomment   Delete the comment associated with the file/directory
lfc-getacl       Get file/directory access control lists
lfc-ln           Make a symbolic link to a file/directory
lfc-ls           List file/directory entries in a directory
lfc-mkdir        Create a directory
lfc-rename       Rename a file/directory
lfc-rm           Remove a file/directory
lfc-setacl       Set file/directory access control lists
lfc-setcomment   Add/replace a comment

19 lcg utils commands
Replica Management
lcg-cp    Copies a grid file to a local destination
lcg-cr    Copies a file to a SE and registers the file in the catalog
lcg-del   Deletes one file
lcg-rep   Replication between SEs and registration of the replica
lcg-gt    Gets the TURL for a given SURL and transfer protocol
lcg-sd    Sets file status to "Done" for a given SURL in a SRM request
File Catalog Interaction
lcg-aa    Adds an alias in LFC for a given GUID
lcg-ra    Removes an alias in LFC for a given GUID
lcg-rf    Registers in LFC a file placed in a SE
lcg-uf    Unregisters in LFC a file placed in a SE
lcg-la    Lists the aliases for a given SURL, GUID or LFN
lcg-lg    Gets the GUID for a given LFN or SURL
lcg-lr    Lists the replicas for a given GUID, SURL or LFN

20 Hands on
Log in to cairoxx@server2.eun.eg (password: cairoschool), set the catalogue environment, create a VOMS proxy and list the VO directory:
    [cairo01@server2 ~]$ export LCG_CATALOG_TYPE=lfc
    [cairo01@server2 ~]$ export LFC_HOST=lfc-gilda.ct.infn.it
    [cairo01@server2 ~]$ voms-proxy-init --voms gilda
    [cairo01@server2 ~]$ lfc-ls /grid/gilda/
    LFCApiJava
    MAGIC
    MrBayes
    NOAH
    aula_grid
    aula_grid_11
    balasko
    cdg
    clermont
    corsogrid
    emidio
    generated
    greifswald

21 Hands on
Create a directory on the LFC (replace xx with your number, from 01 to 30):
    [cairo01@server2 ~]$ lfc-mkdir /grid/gilda/tutorials/cairoxx
    [cairo01@server2 ~]$ lfc-ls -l /grid/gilda/tutorials/ | grep cairoxx
    drwxrwxr-x 0 1425 104 0 Oct 23 19:59 cairo01
Create a dummy file:
    [cairo01@server2 ~]$ echo "Put something here" > text_file.txt
Store this file on the aliserv6.ct.infn.it Storage Element and register it in the LFC:
    [cairo01@server2 ~]$ lcg-cr --vo gilda file:/home/cairo01/text_file.txt -d aliserv6.ct.infn.it -l lfn:/grid/gilda/tutorials/cairo01/text_file.txt
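Since every participant substitutes a personal number for xx, it can help to build the directory name once and reuse it. A small sketch, assuming the 01 to 30 range given on the slide; `tutorial_dir` is a hypothetical helper, not part of the tutorial setup.

```shell
#!/bin/sh
# Print the per-user LFC directory for a two-digit participant number (01..30).
# Rejects anything else so a typo does not create a stray directory.
tutorial_dir() {
    case "$1" in
        0[1-9]|[12][0-9]|30)
            echo "/grid/gilda/tutorials/cairo$1"
            ;;
        *)
            echo "expected a number from 01 to 30" >&2
            return 1
            ;;
    esac
}
```

It could then be used as `lfc-mkdir "$(tutorial_dir 07)"`, and the same value reused after `lfn:` in the `-l` argument of `lcg-cr`.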

22 Hands on
The output of lcg-cr is the GUID of the new file and should look like:
    guid:5beed2e7-142e-4bec-910e-f0538a7f36c7
Get the file SURL:
    [cairo01@server2 ~]$ lcg-lr --vo gilda lfn:/grid/gilda/tutorials/cairo01/text_file.txt
    srm://aliserv6.ct.infn.it/dpm/ct.infn.it/home/gilda/generated/2010-10-23/fileb1b3da27-5679-4f85-bb47-ec10e8259bdc
Replicate the file on another SE:
    [cairo01@server2 ~]$ lcg-rep --vo gilda -d gilda-02.pd.infn.it lfn:/grid/gilda/tutorials/cairo01/text_file.txt
Now list the replicas again; there should be two SURLs for this file:
    [cairo01@server2 ~]$ lcg-lr --vo gilda lfn:/grid/gilda/tutorials/cairo01/text_file.txt

23 Hands on
Output should look like:
    srm://aliserv6.ct.infn.it/dpm/ct.infn.it/home/gilda/generated/2010-10-23/fileb1b3da27-5679-4f85-bb47-ec10e8259bdc
    srm://gilda-02.pd.infn.it/dpm/pd.infn.it/home/gilda/generated/2010-10-23/file2b4c7871-de9b-46b4-a0e8-41e854024093
Create a symbolic link in the catalogue:
    [cairo01@server2 ~]$ lfc-ln -s $HOME/text_file.txt /grid/gilda/tutorials/cairo01/text_file_symlink.txt
Then list the directory:
    [cairo01@server2 ~]$ lfc-ls -l /grid/gilda/tutorials/cairo01
    -rw-rw-r-- 1 1425 104 19 Oct 23 20:33 text_file.txt
    lrwxrwxrwx 1 1425 104 0 Oct 23 21:00 text_file_symlink.txt -> /home/cairo01/text_file.txt
The link target is stored exactly as given, so this link points at a local path. To create an alias for the catalogue entry itself, give the LFN path /grid/gilda/tutorials/cairo01/text_file.txt as the target.
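To see at a glance which Storage Elements hold a replica, the SURL list can be reduced to host names. This is a sketch that assumes the input is one SURL per line, as in the lcg-lr output above; `replica_hosts` is a hypothetical helper name.

```shell
#!/bin/sh
# Read SURLs on standard input and print the SE host name of each one.
# Lines that do not start with srm:// are ignored; a port suffix, if any, is dropped.
replica_hosts() {
    sed -n 's|^srm://\([^/:]*\).*|\1|p'
}
```

For example, `lcg-lr --vo gilda lfn:/grid/gilda/tutorials/cairo01/text_file.txt | replica_hosts` would print one line per SE holding the file.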

24 Hands on
Download the file from the SE to the UI:
    [cairo01@server2 ~]$ lcg-cp --vo gilda lfn:/grid/gilda/tutorials/cairo01/text_file.txt file://$HOME/text_file_copy.txt
View your file:
    [cairo01@server2 ~]$ cat text_file_copy.txt
Remove all replicas of the file from the SEs, together with its file catalogue entry:
    [cairo01@server2 ~]$ lcg-del -a --vo gilda lfn:/grid/gilda/tutorials/cairo01/text_file.txt
List your working directory to check that the file is gone:
    [cairo01@server2 ~]$ lfc-ls /grid/gilda/tutorials/cairo01
Delete your working directory:
    [cairo01@server2 ~]$ lfc-rm -r /grid/gilda/tutorials/cairo01
Make sure it is deleted:
    [cairo01@server2 ~]$ lfc-ls /grid/gilda/tutorials | grep cairo01
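Before deleting the grid copy, it is worth confirming that the downloaded file is identical to the one that was uploaded. A minimal sketch using the standard `cmp` utility; `check_copy` is a hypothetical helper name, and the file names in the usage note are the ones from this exercise.

```shell
#!/bin/sh
# Compare two local files byte by byte and report the result.
# Returns 0 when they are identical, 1 otherwise.
check_copy() {
    if cmp -s "$1" "$2"; then
        echo "match"
    else
        echo "differ"
        return 1
    fi
}
```

Run from the home directory as `check_copy text_file.txt text_file_copy.txt`, after the `lcg-cp` step and before `lcg-del`.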

25 References
- gLite documentation homepage: http://glite.web.cern.ch/glite/documentation/default.asp
- DM subsystem documentation: http://egee-jra1-dm.web.cern.ch/egee-jra1-dm/doc.htm
- LFC and DPM documentation: https://twiki.cern.ch/twiki/bin/view/LCG/DataManagementDocumentation
- gLite Data Management Tutorial: https://grid.ct.infn.it/twiki/bin/view/GILDA/DataManagement#Create_directory


