Integrating SRB with the GIGGLE framework

Presentation transcript:

Integrating SRB with the GIGGLE framework
Simon Metson, Owen Maroney, Tim Barrass {s.metson,o.maroney,tim.barrass}@bristol.ac.uk
ACAT ’03, KEK, Japan
Hello, my name is Simon Metson. My colleagues and I are members of the Particle Physics group at Bristol in the UK. I’m here today to talk about a project that we have been developing. We’ve been giving some thought to how the SDSC’s Storage Resource Broker (SRB) might interoperate with the European DataGrid (EDG). We’ve identified three components to this system, and today I’m going to talk about the first: the integration of the SRB with the GIGGLE framework.
24/02/2019

The GIGGLE Framework
Chervenak et al, http://www. globus
A generic distributed data discovery system, proposed by Chervenak et al, with a high level of redundancy and failover. It is based on a hierarchy of services, the Local Replica Catalogue (LRC) and the Replica Location Index (RLI), to avoid a single point of failure.
The SE registers a PFN:GUID mapping in the LRC.
The LRC pushes GUID:LRC mappings onto the RLI.
The RLI forwards queries to the relevant LRC.
The LRC returns the PFN.
EDG/LCG and Globus are both developing implementations.
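The register, push, forward and return steps on this slide can be illustrated with a toy in-memory model. This is only a sketch: the class and method names below are hypothetical, the hostnames are placeholders, and nothing here reflects the real EDG or Globus RLS interfaces.

```python
from collections import defaultdict


class LocalReplicaCatalogue:
    """Toy LRC: holds GUID -> PFN mappings for one site (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.pfns_by_guid = defaultdict(set)

    def register(self, guid, pfn):
        # A storage element registers a PFN:GUID mapping.
        self.pfns_by_guid[guid].add(pfn)

    def lookup(self, guid):
        # Return the PFNs this catalogue knows for the GUID.
        return sorted(self.pfns_by_guid.get(guid, ()))


class ReplicaLocationIndex:
    """Toy RLI: knows only which LRCs hold a GUID, not the PFNs themselves."""

    def __init__(self):
        self.lrcs_by_guid = defaultdict(list)

    def receive_update(self, lrc):
        # The LRC pushes its GUID:LRC mappings onto the index.
        for guid in lrc.pfns_by_guid:
            if lrc not in self.lrcs_by_guid[guid]:
                self.lrcs_by_guid[guid].append(lrc)

    def query(self, guid):
        # Forward the query to every LRC that advertised this GUID.
        pfns = []
        for lrc in self.lrcs_by_guid.get(guid, []):
            pfns.extend(lrc.lookup(guid))
        return pfns


site_a = LocalReplicaCatalogue("site-a")
site_b = LocalReplicaCatalogue("site-b")
site_a.register("guid-1", "srm://se.site-a.example/data/f1")
site_b.register("guid-1", "srm://se.site-b.example/data/f1")

rli = ReplicaLocationIndex()
rli.receive_update(site_a)
rli.receive_update(site_b)
print(rli.query("guid-1"))
```

Because the index stores only GUID:LRC pairs, losing one LRC removes that site's replicas from the answer without affecting lookups served by the others, which is the redundancy argument made on the slide.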

Storage Resource Broker
SDSC, http://www.npaci.edu/DICE/SRB
An integrated data distribution tool produced by SDSC. It presents a single abstracted file space composed of distributed resources: tape, disk and compound resources. It is about 5 years old, so a maturing product, and is used by many large projects (including CMS for its recent Monte Carlo pre-production).
[Diagram: MCat, SRB]

The Aim
Could files stored in SRB be accessed by Grid tools? Interoperability is one of the key benefits of grid middleware! The intention is to extend (not replace or modify) the tools available to the user. This is an active collaboration between members of CMS, BaBar and the SDSC SRB Group.
Scenarios:
Data discovery: locating SRB files using RLS.
Job submission: publishing resource information and sending jobs to farms with ‘close’ SRB servers.
File replication: consistent copying of files to and from SRB using grid tools.

Data Discovery
The Grid uses the GIGGLE framework (RLI, LRC) to publish data management information. SRB uses the MCat database. We create an LRC interface to MCat: the GMCat.
[Diagram: RLI, LRC, GMCat, MCat, SRB]

Implementation Plan …
Proof of concept prototype complete.
[Diagram: GMCAT pulls from MCAT and pushes to the LRC]
The prototype syncs the LRC with the MCAT contents and tests the namespace mapping:
MCAT: /zone/user.domain/my/path/myFile
LRC: srm://hostname/a/path/theFile
It is not performant: it compares the whole MCat database using “Scommands”. New functionality in the next SRB release is expected to improve this. Time stamping reduces the number of files to check, and GUIDs will be generated automatically (attaching GUID metadata is slow).
We held an extended workshop with SDSC, who took on board our ideas.
SRB files are located by dataname and replica number. The dataname is a Unix-like path; the replica number enumerates replicas. An LRC SURL has a scheme, hostname, path and file; the scheme will be srb. To make the mapping we need to extract the hostname from SRB and create the path from the dataname and repl_enum. We then generate the SURL and GUID, store the GUID in MCAT for consistency, and push the SURL:GUID mapping onto the LRC when synching.
Next: go through the mapping in detail in the following slides.
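The effect of time stamping on the sync can be sketched as follows. This is a hypothetical illustration, not the prototype's code: the entry fields (`surl`, `guid`, `modified`) and the dict standing in for the LRC are assumptions, and a real implementation would apply the timestamp filter in the MCAT query itself instead of looping over every entry.

```python
def sync_lrc(mcat_entries, lrc, last_sync=0.0):
    """Push only entries modified after last_sync into the LRC mapping.

    mcat_entries: iterable of dicts with hypothetical keys
                  "surl", "guid" and "modified" (a numeric timestamp).
    lrc:          dict standing in for the LRC, mapping SURL -> GUID.
    Returns (number of mappings pushed, timestamp to use for the next sync).
    """
    pushed = 0
    newest = last_sync
    for entry in mcat_entries:
        if entry["modified"] <= last_sync:
            continue  # unchanged since the previous sync, nothing to push
        lrc[entry["surl"]] = entry["guid"]
        pushed += 1
        newest = max(newest, entry["modified"])
    return pushed, newest


entries = [
    {"surl": "srb://host.example/zone/a_0", "guid": "guid-a", "modified": 10.0},
    {"surl": "srb://host.example/zone/b_0", "guid": "guid-b", "modified": 20.0},
]
lrc = {}
count, stamp = sync_lrc(entries, lrc)          # first sync pushes everything
count_again, _ = sync_lrc(entries, lrc, stamp)  # second sync finds no changes
print(count, count_again)
```

Keeping the returned timestamp between runs is what lets a later sync skip files that have not changed, instead of comparing the whole catalogue each time.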

The Prototype In Detail
User puts file into SRB with SRB client:
    $ Sput myfile.txt /phy.bris.ac.uk/home/srbadmin.phy.bris.ac.uk/test/myfile.txt
GMCat updates LRC:
    $ ./server_interface.pl
    guid = e4fb1ff3-ae4f-4e38-9b88-3703a15a92d8
File is visible through LRC:
    $ edg-lrc -i mappingsByPfn --vo srbrls -h tuber15.phy.bris.ac.uk \
        "*/phy.bris.ac.uk/home/srbadmin.phy.bris.ac.uk/test/*"
    guid:e4fb1ff3-ae4f-4e38-9b88-3703a15a92d8, srb://tuber13.phy.bris.ac.uk/phy.bris.ac.uk/home/srbadmin.phy.bris.ac.uk/test/myfile.txt_0
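The SURL in the output above appears to be built from the srb scheme, the SRB server hostname, the dataname, and the replica number appended after an underscore. A minimal Python sketch of that mapping follows; the function names are hypothetical, the `_<replica number>` suffix is inferred from the sample output only, and the prototype's own Perl script is not shown in the talk.

```python
import uuid


def make_surl(hostname, dataname, replica_number):
    """Build an srb:// SURL from an SRB dataname and replica number.

    The "<dataname>_<replica number>" form is inferred from the sample
    output on the slide; it is an assumption, not a documented format.
    """
    if not dataname.startswith("/"):
        raise ValueError("dataname must be an absolute, Unix-like path")
    return f"srb://{hostname}{dataname}_{replica_number}"


def make_mapping(hostname, dataname, replica_number, guid=None):
    """Return a (GUID, SURL) pair, generating a random GUID if none is given."""
    if guid is None:
        guid = str(uuid.uuid4())
    return guid, make_surl(hostname, dataname, replica_number)


guid, surl = make_mapping(
    "tuber13.phy.bris.ac.uk",
    "/phy.bris.ac.uk/home/srbadmin.phy.bris.ac.uk/test/myfile.txt",
    0,
)
print(f"guid:{guid}, {surl}")
```

In a real sync the GUID would be read back from MCAT when one is already stored, so that repeated runs publish the same GUID for the same file; the optional `guid` argument stands in for that lookup.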

Implementation Plan…
We are developing a production system prototype. Modifications and further testing are required to ensure correct operation within a production environment. We will include features of future SRB releases as they become available. One possible use is the upcoming CMS Data Challenge: LCG POOL objects stored in SRB space would need to be accessed via an EDG LRC.

Future Implementation
Move to a web service implementation that matches the EDG LRC interface and provides dynamic mapping of the MCAT to the EDG RLS namespace. This means no large database queries and more maintainable code.
No production-ready RLI is currently available, and CMS intend to use a single LRC for DC04. Currently this means we’ll not be able to use the dynamic web service, unless MCat stores other LRC data.

The Complete System
We will need to address information publishing and file replication. The former is straightforward: SRB servers can publish in the same way as an EDG Storage Element. The latter is difficult. Options:
An SRM interface to SRB? Most elegant, but there are difficulties.
A gsiFTP server on SRB? This has the problems that SRM is designed to solve.
The EDG Replica Manager talks to SRB natively? Not as generic a solution as an interface like SRM.

Conclusion
SRB and the EDG/LCG can interoperate. The Data Discovery component is well understood, and useful in isolation. Full interoperation requires some development effort. There has been great interest from BaBar, SDSC, CMS and RAL in various aspects so far. Website in production at http://www.cern.ch/bristol-escience/SRB-RLS-web/
Timings: synching 10,000 new files takes ~45 minutes on a low-spec MCAT box. If no files have been added to a list of 10,000 files, the synch takes 4 minutes.
Anticipated questions: Isn’t this what a federated MCAT does? What is the difference between the EDG and Globus RLS? What is the timescale for completion of Data Discovery, and of the full system? What are the difficulties in implementing an SRM interface? Why is a gsiFTP server bad? Are you intending to replace the EDG SE?