Presentation is loading. Please wait.

Presentation is loading. Please wait.

EGEE-II INFSO-RI-031688 Enabling Grids for E-sciencE www.eu-egee.org EGEE middleware: gLite Data Management EGEE Tutorial 23rd APAN Meeting, Manila Jan.

Similar presentations


Presentation on theme: "EGEE-II INFSO-RI-031688 Enabling Grids for E-sciencE www.eu-egee.org EGEE middleware: gLite Data Management EGEE Tutorial 23rd APAN Meeting, Manila Jan."— Presentation transcript:

1 EGEE-II INFSO-RI-031688 Enabling Grids for E-sciencE www.eu-egee.org EGEE middleware: gLite Data Management EGEE Tutorial 23rd APAN Meeting, Manila Jan 22, 2007

2 Enabling Grids for E-sciencE EGEE-II INFSO-RI-031688 2 Agenda gLite Data Management –Introduction –Examples –Name Convention –Storage Elements –LCG File Catalog Data Management Practical FTS Overview FTS Practical

3 Enabling Grids for E-sciencE EGEE-II INFSO-RI-031688 3 Data Management System (DMS) Provides file manipulation services for users and other Grid services. DMS enables the location, access and transfer of data –User do not need to know data location, just the logical name –Data is accessed through standard interfaces –Data can be replicated or transferred to several locations as needed –Data is shared within a VO

4 Enabling Grids for E-sciencE EGEE-II INFSO-RI-031688 4 Scope of data services in gLite Simply, DMS provides all operation that all of us are used to performing  Uploading /downloading files  Creating file /directories  Renaming file /directories  Deleting file /directories  Moving file /directories  Listing directories  Creating symbolic links Note: Files are write-once, read-many –Files cannot be changed unless remove or replaced –No intention of providing a global file management system

5 Enabling Grids for E-sciencE EGEE-II INFSO-RI-031688 5 Data Issues and Grid Solutions Resource centers need meet growing demand for storage – Storage Element capable to manage multiple disk pools  Disk Pool Manager (DPM), dCache, CASTOR Data is stored on different storage systems technologies –Common interface required to hide underlying complexity  Storage Resource Manager (SRM) – storage management protocol  GridFTP – secure file transfer Data is stored at different locations with separate namespace –File catalogue to provide uniform view of Grid data  LCG File Catalog (LFC) Applications need to access Grid data management services –Data management API  GFAL

6 Enabling Grids for E-sciencE EGEE-II INFSO-RI-031688 6 Data management example ResourceBrokerStorageElementComputingElement DataSets info Input “sandbox” Input “sandbox” + Broker Info Output “sandbox” “User interface” LCG FileCatalogue (LFC) StorageElement File replicated onto 2 SEs

7 Enabling Grids for E-sciencE EGEE-II INFSO-RI-031688 7 Data management exampleStorageElement1 “User interface” LCG FileCatalogue (LFC) Storage Element 2 File replicated onto 2 SEs “Myfile.dat” Myfile.dat File_on_se1 File_on_se2 guid

8 Enabling Grids for E-sciencE EGEE-II INFSO-RI-031688 8 Data management exampleStorageElement1 “User interface” LCG FileCatalogue (LFC) StorageElement2 “Myfile.dat” Myfile.dat “Logical filename” File_on_se1 (“SURL”: site URL) File_on_se2 (“SURL”: site URL) “GUID” Global Unique Identifier

9 Enabling Grids for E-sciencE EGEE-II INFSO-RI-031688 9 Name conventions Logical File Name (LFN) –An alias created by a user to refer to some item of data, e.g. “lfn:/grid/cms/20030203/run2/track1” Globally Unique Identifier (GUID) –A non-human-readable unique identifier for an item of data, e.g. “guid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6” Storage URL (SURL) or Physical File Name (PFN) –The location of an actual piece of data on a storage system, e.g. “srm://pcrd24.cern.ch/flatfiles/cms/output10_1” (SRM) “sfn://lxshare0209.cern.ch/data/alice/ntuples.dat” (Classic SE) Transport URL (TURL) –Temporary locator of a replica + access protocol: understood by a SE, e.g. “rfio://lxshare0209.cern.ch//data/alice/ntuples.dat”

10 Enabling Grids for E-sciencE EGEE-II INFSO-RI-031688 10 Storage Element Provides –Storage space for files –SRM Interface –Transfer protocol (gsiFTP) ~ GSI based FTP server –POSIX-like file access  Accessed via Grid File Access Layer (GFAL) API interface To read parts of files too big to copy Example is Disk Pool Manager (DPM) –Scalable management for independent disk pools for sites –Easy to install, configure and manage –Volatile and Permanent –Secure remote and local transfer protocols  GridFTP, secure RFIO

11 Enabling Grids for E-sciencE EGEE-II INFSO-RI-031688 11 LFC Service LFC = LCG File Catalogue –LCG = LHC Compute Grid –LHC = Large Hadron Collider Provides –Mapping between LFN, GUID and SURL –Transactions, Sessions, Bulk queries –Hierarchical namespace, symbolic links –System metadata –single string user metadata All members of a given VO have read-write permissions in their directory Commands look like UNIX with “lfc-” in front (often)

12 Enabling Grids for E-sciencE EGEE-II INFSO-RI-031688 12 LFC Continued Users primarily access and manage files through “logical filenames” Mapping by the “LFC” catalogue server Defined by the userLFC Namespace LFC has a directory tree structure /grid/ /

13 Enabling Grids for E-sciencE EGEE-II INFSO-RI-031688 13 Two sets of commands lfc commands –Use LFC commands to interact with the catalogue only  To create catalogue directory  List files –Used by you and by lcg-utils lcg-utils –Couples catalogue operations with file management  Keeps SEs and catalogue in step! –copy files to/from/between SEs –Replicated

14 Enabling Grids for E-sciencE EGEE-II INFSO-RI-031688 14 LFC Catalog commands Add/replace a commentlfc-setcomment Set file/directory access control listslfc-setacl Remove a file/directorylfc-rm Rename a file/directorylfc-rename Create a directorylfc-mkdir List file/directory entries in a directorylfc-ls Make a symbolic link to a file/directorylfc-ln Get file/directory access control listslfc-getacl Delete the comment associated with the file/directorylfc-delcomment Change owner and group of the LFC file-directorylfc-chown Change access mode of the LFC file/directorylfc-chmod Summary of the LFC Catalog commands

15 Enabling Grids for E-sciencE EGEE-II INFSO-RI-031688 15 Summary of lcg-utils commands Replica Management lcg-cpCopies a grid file to a local destination lcg-crCopies a file to a SE and registers the file in the catalog lcg-delDelete one file lcg-repReplication between SEs and registration of the replica lcg-gtGets the TURL for a given SURL and transfer protocol lcg-sdSets file status to “Done” for a given SURL in a SRM request

16 Enabling Grids for E-sciencE EGEE-II INFSO-RI-031688 16 We are about to… List directory Upload a file to an SE and register a logical name (lfn) in the catalog Create a duplicate in another SE List the replicas Create a second logical file name for a file Download a file from an SE to the UI Please go to the web page for this practical STOP BEFORE THE “FILE TRANSFER” EXAMPLES PLEASE!

17 Enabling Grids for E-sciencE EGEE-II INFSO-RI-031688 17 File Transfer Service FTS is a low level data movement service Why is it needed? –Improves reliability for transfers –Provides asynchronous file transfer  schedule transfers when resources are available –Provides control of transfer properties (channel concept) No catalogue interactions yet   users have to handle SURL

18 Enabling Grids for E-sciencE EGEE-II INFSO-RI-031688 18 FTS Concepts Transfer Job –A set of source/destination pairs specifying files to transfer –Submitted to FTS for processing Channel –A job is assigned to a channel after submission –Represents a point-to-point network link –Catch all channels are possible: any-to-me, me-to-any –Similar to a queue where you can specify  VO share for the queue  Number of concurrent file transfer  Number of concurrent streams (gridFTP)

19 Enabling Grids for E-sciencE EGEE-II INFSO-RI-031688 19 FTS architecture All components are decoupled from each other –Each interacts only with the database Experiments interact via web-service –User: FileTransfer –Admin: ChannelManagement VO agents assigns jobs to channels Channel agents manages assigned file transfers Monitoring and statistics can be collected via the DB

20 Enabling Grids for E-sciencE EGEE-II INFSO-RI-031688 20 Summary of fts client commands FTS client glite-transfer-submitSubmit a transfer job : needs at least source and destination SURL glite-transfer-statusGiven one or more job ID, query about their status glite-transfer-cancelDelete the transfer with the give Job ID glite-transfer-listQuery about status of all user’s jobs; support options for query restrictions glite-transfer- channel-list Show all available channel; detailed info only if user has admin privileges

21 Enabling Grids for E-sciencE EGEE-II INFSO-RI-031688 21 FTS Practical Create long lived proxy for FTS Locate a file (SURL) to transfer Submit a job to FTS Check FTS job status


Download ppt "EGEE-II INFSO-RI-031688 Enabling Grids for E-sciencE www.eu-egee.org EGEE middleware: gLite Data Management EGEE Tutorial 23rd APAN Meeting, Manila Jan."

Similar presentations


Ads by Google