Data Management in Release 2

Data Management in Release 2: The User's and Administrator's Points of View
Peter.Kunszt@cern.ch
EDG Conference, Barcelona, 12-15 May 2003

Outline
  Data Management Tools
  APIs
  Services
  Admin tools
For users and for admins

Overview: Data Management for the User
Users want to
  access their files transparently
  find the locations of their files
  query and browse the catalogs
  group files into collections
  make data available at many sites
  delete files and data

Overview: Data Management for the Administrator
Administrators want to
  carry out installation and configuration
  carry out validation and functionality testing
  detect inconsistencies and reproduce problems
  clean up erroneous data from the catalogs
  clean up data stores
  subscribe to data to be replicated automatically (R2.1)

Naming: Everything is a URI
All file names are valid URIs, i.e. they have a scheme and a scheme-specific part, separated by a colon. URLs are also valid URIs.
URI jargon:
  [scheme:]scheme-specific-part[#fragment]
  [scheme:][//authority][path][?query][#fragment]
User's view: Logical File Name (LFN). Scheme = lfn
  lfn:some_string
  Restrictions: being a valid URI excludes certain characters such as space, $, :, % etc. These may be escaped using % (%20 is the space character).
Catalog view: GUID. Scheme = guid
  guid:73e16e74-26b0-11d7-b1e0-c5c68d88236a
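
The naming rules above can be tried out with a few lines of Python. This is only a sketch using the standard library: the helper names make_lfn and split_uri are hypothetical and are not part of the EDG tools, and the escaping shown is generic URI percent-encoding rather than anything the Replica Manager is known to do for you.

```python
from urllib.parse import quote, unquote


def make_lfn(name):
    """Build an lfn: URI from an arbitrary string (hypothetical helper)."""
    # safe="" percent-escapes everything except letters, digits and "-._~",
    # so space, "$", ":" and "%" are all escaped.
    return "lfn:" + quote(name, safe="")


def split_uri(uri):
    """Split a URI into (scheme, scheme-specific part) at the first colon."""
    scheme, colon, rest = uri.partition(":")
    if not colon or not scheme:
        raise ValueError("no scheme in %r" % uri)
    return scheme, rest


lfn = make_lfn("my test$file.dat")
print(lfn)                          # lfn:my%20test%24file.dat
print(split_uri(lfn))               # ('lfn', 'my%20test%24file.dat')
print(unquote(split_uri(lfn)[1]))   # my test$file.dat
```

Splitting at the first colon only is deliberate: the scheme-specific part may itself contain further colons once a different scheme (for example srm) is used.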

Naming (Cont.)
Storage Resource Manager view: Site File Name (SFN) and Site URL (SURL). Scheme = srm
  srm://srmhost.domain/path/to/data/file.dat
This is called a SURL. The SFN is just the path part of this URI, i.e. path/to/data/file.dat in this case.
RLS stores SURL-GUID mappings
RMC stores GUID-LFN mappings
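
How the two catalogs fit together can be illustrated with plain dictionaries. The sketch below is an assumption-laden toy model: RMC and RLS here are in-memory stand-ins, not the real services or their APIs, the pairing of this LFN with this GUID is invented for illustration, and resolve_lfn and sfn_of are hypothetical helper names.

```python
from urllib.parse import urlsplit

# Hypothetical in-memory stand-ins for the two catalogs (not the real services).
RMC = {  # LFN -> GUID
    "lfn:pkunszt-test1": "guid:73e16e74-26b0-11d7-b1e0-c5c68d88236a",
}
RLS = {  # GUID -> list of SURLs (one per replica)
    "guid:73e16e74-26b0-11d7-b1e0-c5c68d88236a": [
        "srm://srmhost.domain/path/to/data/file.dat",
    ],
}


def resolve_lfn(lfn):
    """Follow LFN -> GUID (RMC) and then GUID -> SURLs (RLS)."""
    guid = RMC[lfn]
    return list(RLS.get(guid, []))


def sfn_of(surl):
    """Return the SFN, i.e. the path part of a SURL without the leading slash."""
    parts = urlsplit(surl)
    if parts.scheme != "srm" or not parts.netloc:
        raise ValueError("not a SURL: %r" % surl)
    return parts.path.lstrip("/")


for surl in resolve_lfn("lfn:pkunszt-test1"):
    print(surl, "->", sfn_of(surl))
```

The GUID sits in the middle of both mappings, which is why a file can carry a human-readable name on one side and several physical replicas on the other.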

User's Command Line Interface
The EDG Replica Manager is the user's interface to all Data Management functions. It is installed on the UI and WN as
  $EDG_LOCATION/bin/edg-replica-manager
with edg-rm as a short-hand. The command syntax is similar to cvs:
  edg-rm [options] command [command-options]

Options for all commands
General options for all commands are:
  -h, --help       print a help screen on all commands, or help on a single command if one is given
  -v, --verbose    print informative messages while executing
  -i, --insecure   connect insecurely to the services
  --vo=VO          set the VO to 'VO'. Necessary for insecure operations; until we integrate with VOMS, this option is currently mandatory.
  --config=file    load configuration from 'file' instead of the default config file at $EDG_LOCATION/etc/edg-replica-manager/edg-replica-manager.conf

Bringing a file into the Grid
Make a file available in the Grid:
  copyAndRegisterFile sourceURL [command-options]
Copy the file given by sourceURL and register it in the RLS. The command returns the GUID of the registered file. Valid source URL schemes are file, gsiftp, http and ftp. For a local file you have to specify the full file path using the file scheme.
Short form: cr, creg
Options
  -d destination        specify the host of the destination SE, or the full destination SURL
  -l logical-filename   specify a logical file name
  -p protocol           specify the protocol to be used (currently only gsiftp)
  -n nstreams           specify the number of streams to be used (default: 8)
Example
  edg-rm --vo=wpsix cr file:/home/pkunszt/test.dat -l lfn:pkunszt-test1
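
When such calls are scripted, it can help to assemble and check the command line before running it. The following Python sketch only builds the argument list from the options listed on this slide. It does not execute edg-rm, and the helper name cr_command is hypothetical.

```python
import shlex

VALID_SOURCE_SCHEMES = ("file", "gsiftp", "http", "ftp")


def cr_command(vo, source_url, lfn=None, destination=None, nstreams=None):
    """Assemble an argument list for copyAndRegisterFile (hypothetical helper)."""
    scheme = source_url.partition(":")[0]
    if scheme not in VALID_SOURCE_SCHEMES:
        raise ValueError("unsupported source scheme: %r" % scheme)
    argv = ["edg-rm", "--vo=" + vo, "copyAndRegisterFile", source_url]
    if destination:
        argv += ["-d", destination]    # SE host or full destination SURL
    if lfn:
        argv += ["-l", lfn]            # logical file name to register
    if nstreams is not None:
        argv += ["-n", str(nstreams)]  # number of parallel streams
    return argv


argv = cr_command("wpsix", "file:/home/pkunszt/test.dat", lfn="lfn:pkunszt-test1")
print(" ".join(shlex.quote(a) for a in argv))
```

Keeping the command as a list rather than a single string avoids quoting problems if it is later handed to subprocess.run on a machine where edg-rm is installed.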

Bringing a file into the Grid 2.
If a file is already available through an SRM but is not registered yet, it can be registered without the need to copy it using
  registerFile surl [command-options]
Register the file given by the SURL in the RLS. The command returns the GUID of the registered file.
Short form: rf
Options
  -l logical-filename   specify a logical file name
Example
  edg-rm --vo=wpsix rf srm://lxshare0408.cern.ch/flatfiles/wpsix/test1.dat -l lfn:pkunszt-test1

Listing files available on the Grid
  listReplicas lfn-or-guid-or-surl
    list all SURLs that correspond to the given LFN, GUID or SURL
List closest file:
  listBestFile lfn-or-guid
    list the SURL that is 'closest' in terms of network cost
Getting the GUID:
  listGUID lfn-or-surl
    retrieve the GUID corresponding to the given LFN or SURL

Getting a file from the Grid
To get a local copy of a file that is 'on the Grid' use the command
  copyFile sourceURL destURL [command-options]
Copy a file. The source may be any kind of file: an LFN, GUID, SURL or other (http, ftp, gsiftp, file). The destination can only be a non-grid file, i.e. of the scheme file or gsiftp. The command is to a large extent identical to globus-url-copy, but it can also take LFNs and GUIDs as source files.
Short form: cp
Options
  -f, --force           overwrite the destination if it exists
  -p, --protocol prot   use the given protocol. Currently only gsiftp is supported.
  -n, --nstreams n      set the number of parallel streams. Defaults to 8.
Example
  edg-rm --vo=wpsix cp lfn:peter-test1 file:/home/pkunszt/test.dat

Copying files in the Grid
To create a new replica at a given site use
  replicateFile sourceURL [command-options]
Copy a file to a grid site. Unlike copyFile, the destination may only be a SURL or SE host. The source has to be an LFN, GUID or SURL.
Short form: rep
Options
  -d, --destination dest   destination SE host or SURL. If it is omitted, the default SE is used as destination.
  -p, --protocol prot      use the given protocol. Currently only gsiftp is supported.
  -n, --nstreams n         set the number of parallel streams. Defaults to 8.
Example
  edg-rm --vo=wpsix rep lfn:peter-test1 -d lxshare0408.cern.ch
The command getBestFile is identical, but its source has to be a GUID or LFN. The best source to copy from is determined through the optimizer.

Removing and unregistering
To completely remove a file from the Grid, use
  deleteFile sourceURL [command-options]
Delete a file. If the source is an LFN or GUID, the destination SE host (from where to delete the file) has to be given. Alternatively, the SURL can be specified directly. For GUIDs the --all flag may be set, in which case all replicas of the given GUID are removed.
Short form: del
Options
  -d, --destination dest   destination SE host
  --all                    remove all replicas (only valid with GUIDs)
Example
  edg-rm --vo=wpsix del lfn:peter-test1 -d lxshare0408.cern.ch
Similarly, files may be unregistered (the copy will not be physically removed) using unregisterFile.
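
The rules for deleteFile (SE host needed for an LFN or GUID, --all only with GUIDs, SURL usable on its own) are easy to get wrong in a script. The Python sketch below encodes them as checks while building the argument list. It is an interpretation of this slide, not of the tool's actual validation: in particular, it assumes that --all needs no -d option, and delete_command is a hypothetical helper name.

```python
def delete_command(vo, source, se_host=None, all_replicas=False):
    """Assemble an argument list for deleteFile (hypothetical helper)."""
    scheme = source.partition(":")[0]
    if scheme not in ("lfn", "guid", "srm"):
        raise ValueError("source must be an LFN, GUID or SURL")
    argv = ["edg-rm", "--vo=" + vo, "deleteFile", source]
    if all_replicas:
        # Assumption: --all removes every replica, so no SE host is passed.
        if scheme != "guid":
            raise ValueError("--all is only valid with GUIDs")
        argv.append("--all")
    elif scheme in ("lfn", "guid"):
        if not se_host:
            raise ValueError("an SE host (-d) is required for an LFN or GUID")
        argv += ["-d", se_host]
    # A SURL already names the replica, so nothing else is added for srm.
    return argv


print(delete_command("wpsix", "lfn:peter-test1", se_host="lxshare0408.cern.ch"))
```

Rejecting bad combinations before the call is cheaper than discovering them after a replica has been removed from the wrong site.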

Miscellaneous Commands
Getting the server and client version:
  getVersion
Printing the information that is used from the info providers:
  printInfo
Listing the contents of a directory on an SRM:
  listDirectory

Environment Variables: Logging behavior
EDG_RM_LOGPROPERTIES
  By default logging is disabled. The default logging properties are read from $EDG_LOCATION/etc/edg-replica-manager/log4j.properties
  To change this as a user, copy this file and edit it. Line of interest:
    log4j.logger.org.edg.data=OFF
  Set this to one of FATAL, INFO, WARN or DEBUG, which are the available logging levels. DEBUG gives the most verbose information.
  Set EDG_RM_LOGPROPERTIES to your local copy.
EDG_RM_LOGFILE
  By default, the log file goes to /tmp/edg-replica-manager-username.log where username is the Unix user that the command is run under. Set EDG_RM_LOGFILE to change this.
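
The copy-and-edit procedure for the logging properties can be automated. The Python sketch below rewrites the line of interest in the text of a properties file and prepares an environment with the two variables set. The function names are hypothetical, and the list of accepted levels is taken from this slide (plus OFF) rather than from the log4j documentation.

```python
import os
import re

LEVELS = ("OFF", "FATAL", "INFO", "WARN", "DEBUG")
_LOGGER_LINE = re.compile(r"^log4j\.logger\.org\.edg\.data\s*=.*$", re.MULTILINE)


def set_log_level(properties_text, level):
    """Return the properties text with the org.edg.data logger set to level."""
    if level not in LEVELS:
        raise ValueError("unknown level: %r" % level)
    new_text, count = _LOGGER_LINE.subn(
        "log4j.logger.org.edg.data=" + level, properties_text)
    if count == 0:
        raise ValueError("logger line not found")
    return new_text


def logging_env(properties_path, logfile=None, base=None):
    """Return a copy of the environment pointing at a local properties file."""
    env = dict(os.environ if base is None else base)
    env["EDG_RM_LOGPROPERTIES"] = properties_path
    if logfile:
        env["EDG_RM_LOGFILE"] = logfile
    return env


sample = "log4j.logger.org.edg.data=OFF\n"
print(set_log_level(sample, "DEBUG"), end="")
```

In practice the text would be read from a copy of $EDG_LOCATION/etc/edg-replica-manager/log4j.properties, written back to a file in the user's own directory, and the resulting environment passed to whatever launches edg-rm.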