Core SRB Technology for 2005 NCOIC Workshop By Michael Wan And Wayne Schroeder SDSC SDSC/UCSD/NPACI.

1 Core SRB Technology for 2005 NCOIC Workshop By Michael Wan And Wayne Schroeder SDSC SDSC/UCSD/NPACI

2 Outline Basic Concepts behind SRB SRB architecture SRB features SRB Usage Model Wayne: –SRB productization - Installation, Administration, etc –Security and Authentication –Examples and demo

3 Initial Design of SRB Transparency and Uniformity –Data are increasingly distributed –Design Goal – use a single interface and authorization mechanism to access data across: –Multiple hosts –Multiple OS platforms –Multiple resource types (UNIX FS, HPSS, UniTree, DBMS, ...)

4 Initial Design of SRB Global view –Global Logical Name space – Data organization UNIX-like directories (collections) and files (data) Mapping of logical name to physical attributes - host address, physical path. UNIX-like API and utilities –Single Global User Name Space Single sign-on No need for a UNIX account on every system Robust access control
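
The logical-to-physical mapping on this slide can be pictured as a catalog lookup. What follows is a minimal Python sketch, not SRB code: `LogicalCatalog`, `PhysicalLocation`, and the sample hosts and paths are hypothetical names invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhysicalLocation:
    """Physical attributes of a stored object (hypothetical structure)."""
    host: str
    path: str


class LogicalCatalog:
    """Toy stand-in for a metadata catalog: logical name -> physical location."""

    def __init__(self):
        self._entries = {}

    def register(self, logical_path, host, physical_path):
        self._entries[logical_path] = PhysicalLocation(host, physical_path)

    def resolve(self, logical_path):
        # Clients only name the logical path; the catalog supplies host and path.
        return self._entries[logical_path]

    def list_collection(self, collection):
        # A collection behaves like a UNIX directory over logical names.
        prefix = collection.rstrip("/") + "/"
        return sorted(
            name[len(prefix):] for name in self._entries
            if name.startswith(prefix) and "/" not in name[len(prefix):]
        )


catalog = LogicalCatalog()
catalog.register("/home/demo/run1.dat", "hostA.example.org", "/vault/a/0001")
catalog.register("/home/demo/run2.dat", "hostB.example.org", "/archive/b/0002")
```

Two files in the same collection can live on different hosts, and the client never needs to know which.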

5 SRB Architecture Federated middleware system Client/server model –Federation of resource servers with uniform interfaces: client-server and server-server –Each request handler has 2 versions: Local, and Remote (pass off to a server that can handle the request) –All servers use the same software Simplicity – easy to implement, easy to debug –Robust access control: user level (grant access to multiple users), group level, tickets MCAT –Metadata catalog
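
The "local or remote" handler pair can be sketched as below. This is an illustrative Python toy under stated assumptions, not the SRB implementation: the `Server` class, its method names, and the resource names are all hypothetical, and a direct method call stands in for the server-to-server network hop.

```python
class Server:
    """Toy server: every request handler has a local and a remote version."""

    def __init__(self, name, local_resources):
        self.name = name
        self.local_resources = set(local_resources)
        self.peers = []

    def handle_read(self, resource, path):
        if resource in self.local_resources:
            return self._read_local(resource, path)
        return self._read_remote(resource, path)

    def _read_local(self, resource, path):
        return f"{self.name} read {path} from {resource}"

    def _read_remote(self, resource, path):
        # Pass the request off to a peer that owns the resource.
        for peer in self.peers:
            if resource in peer.local_resources:
                return peer.handle_read(resource, path)
        raise LookupError(f"no server handles resource {resource!r}")


# All servers run the same code; only their resource lists differ.
server1 = Server("server1", ["unix-fs"])
server2 = Server("server2", ["hpss"])
server1.peers.append(server2)
server2.peers.append(server1)
```

A client connected to `server1` can ask for data on `hpss` and gets the answer produced by `server2`, without addressing `server2` itself.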

6 Federation of Servers [Diagram: MCAT, Server1, Server2, MCAT server]

7 SRB as a Data Grid [Diagram: SRB, MCAT, DB] An SRB Data Grid has an arbitrary number of servers. Complexity is hidden from users.

8 SRB server design Three-layer design –Top layer Interacts with clients and other servers through TCP/IP sockets User authentication Handles function requests – parses requests and invokes handlers in middle and bottom layers.

9 SRB server design (cont1) Middle layer (logical layer) –Most requests pass through here –Input parameters are in their logical representations (logical path name, logical resource name) –Generally, two types of requests: Data access –Queries MCAT, translates from logical to physical representations –Calls functions in the bottom (physical) layer to access data; Metadata access –Interacts with MCAT

10 SRB server design (cont2) –Bottom layer (physical layer) Where all data I/O to/from resources is done Handles three types of resources: File system –Drivers to interface with different FS –FS supported: UNIX, HPSS, ADS, UniTree, gridFTP (to be released); DB large objects; DB tables –Access DB tables (query, insert, …)
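
The split between the logical (middle) layer and the driver-based physical (bottom) layer can be sketched as follows. This is a minimal Python illustration, assuming in-memory stand-ins for real storage. `StorageDriver`, `FileSystemDriver`, `LargeObjectDriver`, and `LogicalLayer` are hypothetical names, not SRB identifiers.

```python
import itertools
from abc import ABC, abstractmethod


class StorageDriver(ABC):
    """Uniform interface each resource type implements (hypothetical)."""

    @abstractmethod
    def write(self, physical_path, data): ...

    @abstractmethod
    def read(self, physical_path): ...


class FileSystemDriver(StorageDriver):
    """Stand-in for a file-system resource, kept in memory here."""

    def __init__(self):
        self._files = {}

    def write(self, physical_path, data):
        self._files[physical_path] = bytes(data)

    def read(self, physical_path):
        return self._files[physical_path]


class LargeObjectDriver(StorageDriver):
    """Stand-in for database large objects, stored as rows."""

    def __init__(self):
        self._rows = []

    def write(self, physical_path, data):
        self._rows.append((physical_path, bytes(data)))

    def read(self, physical_path):
        # The latest row for a path wins.
        for path, data in reversed(self._rows):
            if path == physical_path:
                return data
        raise KeyError(physical_path)


class LogicalLayer:
    """Middle layer: translates logical names, then calls the bottom layer."""

    def __init__(self, drivers):
        self._drivers = drivers            # logical resource name -> driver
        self._catalog = {}                 # logical path -> (resource, physical path)
        self._ids = itertools.count()

    def put(self, logical_path, resource, data):
        physical_path = f"/vault/{next(self._ids):06d}"
        self._drivers[resource].write(physical_path, data)
        self._catalog[logical_path] = (resource, physical_path)

    def get(self, logical_path):
        resource, physical_path = self._catalog[logical_path]
        return self._drivers[resource].read(physical_path)


layer = LogicalLayer({"unixResc": FileSystemDriver(), "dbResc": LargeObjectDriver()})
layer.put("/home/demo/a.txt", "unixResc", b"on a file system")
layer.put("/home/demo/b.txt", "dbResc", b"in a database")
```

Adding a new resource type in this design means writing one more driver class; the logical layer and its callers stay unchanged.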

11 SRB Features - Authentication Supports 2 authentication schemes –Encrypt1 (SDSC) – No plain text password over the net –GSI (Globus) –Wayne will give details

12 Performance Enhancement Parallel I/O –For transferring large files –Uses multiple threads for data transfer and disk I/O –Interfaces with HPSS’s mover protocol for parallel I/O –Parallel third-party transfer for copy and replicate –One-hop data transfer between client and data resource Bulk Operation –Uploading and downloading large numbers of small files –Multiple threads –Bulk registration – 500 files in one call –3-10 times speedup
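
The idea of moving one large file with several threads, each handling its own byte range, can be sketched in Python as below. This is an assumption-laden illustration on local files only: `copy_range` and `parallel_copy` are hypothetical helpers, and no SRB or HPSS protocol is involved.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def copy_range(src, dst, offset, length):
    """Copy one byte range; each thread opens its own file handles."""
    with open(src, "rb") as fin, open(dst, "r+b") as fout:
        fin.seek(offset)
        fout.seek(offset)
        fout.write(fin.read(length))
    return length


def parallel_copy(src, dst, num_threads=4):
    """Copy src to dst using num_threads workers over disjoint ranges."""
    size = os.path.getsize(src)
    # Pre-size the destination so each thread can write its own region.
    with open(dst, "wb") as fout:
        fout.truncate(size)
    chunk = max(1, -(-size // num_threads))  # ceiling division
    ranges = [(off, min(chunk, size - off)) for off in range(0, size, chunk)]
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        copied = pool.map(lambda r: copy_range(src, dst, *r), ranges)
        return sum(copied)
```

Because the ranges are disjoint, the workers need no locking. Whether this is faster than a serial copy depends on the storage and network involved.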

13 Sput – serial mode [Diagram: the Sput client, SRB server1 and SRB server2 (each with an SRB agent), MCAT, and resource R, with steps numbered 1 to 6. Client calls: srbObjCreate, srbObjWrite. MCAT roles: 1. Logical-to-physical mapping 2. Identification of replicas 3. Access and audit control. Arrow labels: peer-to-peer request, server(s) spawning, data transfer.]

14 Parallel mode data transfer – client initiated [Diagram: the Sput -M client, SRB server1 and SRB server2 (each with an SRB agent), MCAT, and resource R, with steps numbered 1 to 8. Client call: srbObjPut. MCAT roles: 1. Logical-to-physical mapping 2. Identification of replicas 3. Access and audit control. Arrow labels: return socket address, port and cookie; connect to server; data transfer.]

15 Performance Enhancement (cont1) Container – –physical grouping of small files –for tape I/O or archival resources –Easy to use, transparent to users
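
A container can be pictured as one large physical object plus an index of where each small file sits inside it. The Python sketch below illustrates that idea only. The `Container` class and its in-memory layout are assumptions, not the SRB container format.

```python
import io


class Container:
    """Toy container: many small files packed into one contiguous blob."""

    def __init__(self):
        self._blob = io.BytesIO()
        self._index = {}  # file name -> (offset, length)

    def add(self, name, data):
        offset = self._blob.seek(0, io.SEEK_END)  # always append at the end
        self._blob.write(data)
        self._index[name] = (offset, len(data))

    def read(self, name):
        # Callers ask by name; the offset bookkeeping stays hidden.
        offset, length = self._index[name]
        self._blob.seek(offset)
        return self._blob.read(length)

    def size(self):
        return self._blob.getbuffer().nbytes


box = Container()
box.add("a.txt", b"hello")
box.add("b.txt", b"world!")
```

An archival resource then sees a single object to write to tape, while users keep addressing the individual files by name.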

16 Data Replication An SRB file can have multiple replicas. Replicas can be stored in different resources. Sls -l mfile –fedsrbbrick8 0 demoResc 3029449 2005-07-29-15.37 % mfile –fedsrbbrick8 1 demoResc1 3029449 2005-07-29-21.28 % mfile Commands that use replicas –Sreplicate – replicate a file to the specified resource –Sbackupsrb – back up a file to the specified resource –SsyncD – synchronize the replicas of a file
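
The bookkeeping behind replica numbers and resources in the listing above can be sketched as follows. This is a hypothetical Python toy: `ReplicaCatalog` and its methods are invented for illustration and do not mirror the Scommands or MCAT schema.

```python
class ReplicaCatalog:
    """Toy catalog tracking which resources hold a copy of each logical file."""

    def __init__(self):
        self._replicas = {}   # logical name -> list of resource names
        self._storage = {}    # (resource, logical name) -> bytes

    def put(self, name, resource, data):
        self._storage[(resource, name)] = bytes(data)
        self._replicas[name] = [resource]

    def replicate(self, name, target_resource):
        source_resource = self._replicas[name][0]
        self._storage[(target_resource, name)] = self._storage[(source_resource, name)]
        self._replicas[name].append(target_resource)
        return len(self._replicas[name]) - 1   # replica number of the new copy

    def listing(self, name):
        # One row per replica: (replica number, resource, size in bytes).
        return [
            (number, resource, len(self._storage[(resource, name)]))
            for number, resource in enumerate(self._replicas[name])
        ]


replicas = ReplicaCatalog()
replicas.put("mfile", "demoResc", b"x" * 10)
new_number = replicas.replicate("mfile", "demoResc1")
```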

17 PhyMove –move SRB files to another resource Move files to another resource without making another replica Normally used by admin to move files around Bulk phyMove – large number of small files Parallel I/O – large files Container – move files into container Heavily used by the BBSRC project for distributed archive. –Files uploaded to local server –Files eventually moved to a central archival resource by admin

18 Performance Enhancement (cont2) Use of checksum –an MCAT metadata attribute associated with a file –Checksum routines are part of the server and client code –For verification and synchronization of data –Built into most data handling utilities: Sput, Sget, Srsync, Schksum
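
Checksum-based verification and synchronization can be sketched in Python as below. MD5 from the standard `hashlib` module is used here purely for illustration. The slides do not say which algorithm the SRB checksum routines use, and `checksum`, `verify`, and `needs_sync` are hypothetical helper names.

```python
import hashlib


def checksum(data, chunk_size=65536):
    """Digest the data in chunks, as a utility would for a large file."""
    digest = hashlib.md5()
    view = memoryview(data)
    for start in range(0, len(view), chunk_size):
        digest.update(view[start:start + chunk_size])
    return digest.hexdigest()


def verify(received, registered_checksum):
    """True if the received bytes match the checksum stored in the catalog."""
    return checksum(received) == registered_checksum


def needs_sync(local, remote_checksum):
    """A copy needs re-synchronizing only when the checksums differ."""
    return not verify(local, remote_checksum)
```

Comparing checksums lets a synchronization tool skip files that are already identical instead of transferring them again.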

19 Metadata in SRB SRB System Metadata Free-form Metadata (User-defined) –Attribute-Value-Unit Triplets… Extensible Schema Metadata –User Defined –Tables integrated into MCAT Core Schema External Database Metadata operations –Metadata Insertion through User Interfaces –Bulk Metadata Insertion –Template based Metadata Extraction –Query Metadata through well defined Interfaces
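
The free-form attribute-value-unit triplets, bulk insertion, and attribute queries can be sketched as follows. This is a minimal Python illustration. `AVU` and `MetadataStore` are hypothetical names, and the sample attributes are invented.

```python
from typing import NamedTuple


class AVU(NamedTuple):
    """One user-defined metadata triplet."""
    attribute: str
    value: str
    unit: str = ""


class MetadataStore:
    def __init__(self):
        self._by_object = {}  # logical name -> list of AVU

    def add(self, name, attribute, value, unit=""):
        self._by_object.setdefault(name, []).append(AVU(attribute, value, unit))

    def bulk_add(self, rows):
        # rows: iterable of (name, attribute, value, unit) tuples
        for name, attribute, value, unit in rows:
            self.add(name, attribute, value, unit)

    def query(self, attribute, value=None):
        """Names of objects with the attribute (and value, if given)."""
        return sorted(
            name for name, avus in self._by_object.items()
            if any(a.attribute == attribute and (value is None or a.value == value)
                   for a in avus)
        )


meta = MetadataStore()
meta.add("/home/demo/scan1", "resolution", "2", "mm")
meta.bulk_add([
    ("/home/demo/scan2", "resolution", "1", "mm"),
    ("/home/demo/scan2", "site", "SDSC", ""),
])
```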

20 SRB Proxy operation Perform operations on the server on behalf of the user –Operate where the data is located –File format conversion, md5 checksum, subsetting and filtering, etc. Two types of proxy operations –Proxy commands: Server forks and execs an executable/script on the server, then pipes the output back to the client –Proxy functions: Functions built into the server; well-defined framework for writing proxy functions
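
A proxy command, where the server runs an executable next to the data and pipes the output back, can be sketched with Python's standard `subprocess` module. The registry `PROXY_COMMANDS`, the function `run_proxy_command`, and the restriction to pre-registered commands are assumptions made for this example, not a description of how SRB does it.

```python
import subprocess
import sys

# Only pre-approved commands may run on the server (hypothetical registry).
PROXY_COMMANDS = {
    "md5": [
        sys.executable, "-c",
        "import hashlib, sys; "
        "print(hashlib.md5(open(sys.argv[1], 'rb').read()).hexdigest())",
    ],
}


def run_proxy_command(name, *args):
    """Run a registered command in a child process and return its stdout."""
    if name not in PROXY_COMMANDS:
        raise PermissionError(f"proxy command {name!r} is not registered")
    completed = subprocess.run(
        PROXY_COMMANDS[name] + list(args),
        capture_output=True, text=True, check=True,
    )
    return completed.stdout
```

Only the short digest travels back to the caller; the file itself is read where it is stored.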

21 HDF5-SRB Model Data flow Client API srbObjRequest(void *obj, int objID) Server API srbObjProcess(void *obj, int objID) 1. packMsg() 2. unpackMsg() 3. H5Obj::op() 4. Access file 5. packMsg() 6. unpackMsg() SRB Server HDF5 Library HDF5 file
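
The six-step pack, unpack, operate, pack, unpack round trip can be sketched in Python as below. The wire format (an 8-byte header followed by JSON) is an assumption chosen for brevity. `pack_msg`, `unpack_msg`, `server_process`, and `client_request` are hypothetical stand-ins, a direct function call replaces the network, and a simple sum replaces the HDF5 operation.

```python
import json
import struct


def pack_msg(obj, obj_id):
    """Serialize an object as: object id, body length, JSON body."""
    body = json.dumps(obj).encode("utf-8")
    return struct.pack("!II", obj_id, len(body)) + body


def unpack_msg(buf):
    """Inverse of pack_msg; returns (object, object id)."""
    obj_id, length = struct.unpack("!II", buf[:8])
    return json.loads(buf[8:8 + length].decode("utf-8")), obj_id


def server_process(request_bytes):
    obj, obj_id = unpack_msg(request_bytes)        # step 2
    obj["result"] = sum(obj["values"])             # steps 3-4, stand-in operation
    return pack_msg(obj, obj_id)                   # step 5


def client_request(obj, obj_id):
    reply = server_process(pack_msg(obj, obj_id))  # step 1, then "send"
    return unpack_msg(reply)[0]                    # step 6
```

The client and server share only the message format, so the operation itself can change on the server without touching client code.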

22 Zone Federation Federation of multiple MCATs –MCAT ZONE defines a federation of SRB resources controlled by a single MCAT Each Zone has full control of its own administrative domain Each Zone can operate entirely independently from other zones. Data and Resource sharing across ZONES –Use storage resources in foreign zones –Share data across zones –Copy data across zones
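
Cross-zone access can be sketched as routing each request to the zone that owns the name. The Python toy below assumes the first component of a logical path names the zone. That convention, along with the `Zone` and `Federation` classes and the sample names, is an assumption made for illustration.

```python
class Zone:
    """One administrative domain with its own catalog (stand-in for one MCAT)."""

    def __init__(self, name):
        self.name = name
        self.catalog = {}  # logical path -> resource name

    def register(self, logical_path, resource):
        self.catalog[logical_path] = resource


class Federation:
    def __init__(self, zones):
        self._zones = {zone.name: zone for zone in zones}

    def zone_for(self, logical_path):
        # Assumption: the first path component names the owning zone.
        zone_name = logical_path.strip("/").split("/", 1)[0]
        return self._zones[zone_name]

    def locate(self, logical_path):
        zone = self.zone_for(logical_path)
        return zone.name, zone.catalog[logical_path]

    def copy_across(self, logical_path, target_zone, target_resource):
        """Register a copy of the object under the target zone's namespace."""
        source_zone = self.zone_for(logical_path)
        if logical_path not in source_zone.catalog:
            raise KeyError(logical_path)
        relative = logical_path.strip("/").split("/", 1)[1]
        new_path = f"/{target_zone}/{relative}"
        self._zones[target_zone].register(new_path, target_resource)
        return new_path


zone_a, zone_b = Zone("zoneA"), Zone("zoneB")
federation = Federation([zone_a, zone_b])
zone_a.register("/zoneA/home/demo/data.bin", "rescA1")
```

Each zone's catalog is modified only through its own `register` call, which mirrors the point that a zone keeps control of its administrative domain.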

23 Peer to peer Federated MCAT Zone [Diagram: three peer MCATs (MCAT1, MCAT2, MCAT3) with servers Server1.1, Server1.2, Server2.1, Server2.2, Server3.1]

24 SRB Client Implementations A set of Basic APIs –Over 160 APIs –Used by all clients to make requests to servers Scommands –UNIX-like command line utilities for UNIX and Windows platforms –Over 60 - Sls, Scp, Sput, Sget …

25 SRB Client Implementations (cont) inQ – Windows GUI browser Jargon – Java SRB client classes –Pure Java implementation mySRB – Web-based GUI –run using a web browser Java Admin Tool –GUI for User and Resource management Matrix – Web service for SRB workflow

26 inQ Windows GUI

27 MySRB – Web Based SRB Interface SRB Browser Advanced Metadata manipulation

28 SRB Usage Model Various Usage models Specific Usages –SLAC’s Babar experiment –UK eScience BBSRC –BIRN

29 SRB Configuration – Peer-to-peer Data Grid [Diagram: four resource servers] Data sharing, no central resource Projects – NARA, BIRN

30 SRB Configuration - Exploding Star [Diagram: one source server and five satellite servers] Data source – physics experiment Projects – Babar, KEK

31 SRB Configuration - Imploding Star [Diagram: five satellite source servers, a central cache server, and a central archival server] Archival Storage Model Projects – UK eScience – BBSRC

32 Peer to peer Federation of MCAT Zone [Diagram: three peer MCATs (MCAT1, MCAT2, MCAT3) with servers Server1.1, Server1.2, Server2.1, Server2.2, Server3.1]

33 Summary of the Babar Project Preproduction evaluation – 2003 –Highlight of Wilko Kroeger’s (SLAC) talk at IEEE 2003 –Title - “Distributing Babar Data using SRB” BaBar Computing resources are geographically distributed: 5 Tier-A centers: GridKA (D), IN2P3 (F), INFN-Padova (I), RAL (UK), SLAC (USA) Data have to be replicated to the Tier-A sites. Number of files is 1M. Size: 100s of TB

34 Babar Preproduction – SRB Usage Allows transparent access to files. –Don’t need to know host or storage medium (disk,tape). Accessing files/collections by attributes. –Find files that were produced at a certain time or site. –Find collections from a particular run period. Preproduction test – 2 weeks of MCAT and file transfer tests

35 Babar Production Update Transferred ~70 TB and 140K files Peak rate ~2 TB/day. Average rate – 1 TB/day Downtime encountered – hardware problem –DB updates Plan to federate SLAC and IN2P3 Zones –IN2P3 picks up some of the load Thanks to Wilko Kroeger (SLAC) and Jean-Yves Nief (IN2P3) for the info

36 UK eScience BBSRC Archival of Biological Data from 16 sites to a central resource Data ingested into local resources Admin uses bulk Sphymove to move data from local resources to a central cache Moves data into containers Replicates containers to cache resource at RAL Replicates containers to ADS archival at RAL Removes cache copies

37 UK eScience BBSRC Develop some software on their own –User interface using Jargon GUI Users not exposed to all SRB functionalities –Request tracker – track data movement after ingestion Status –Project started at beginning of this year –Just done with pilot program using SRB3.2 –Upgrading to 3.3 for production

38 Biomedical Informatics Research Network (BIRN) Major collaboration with SDSC; several of the projects’ Co-Investigators and Co-PIs are at SDSC. SRB provides the ability to transparently share data across remote sites.

39 The BIRN SRB Data Grid

40 The BIRN Data Grid

41 SRB in BIRN [Diagram of the BIRN software stack. Labels: BIRN Toolkit; Mediator; Viewing/Visualization; Queries/Results; Applications; Data Management; File System; MCAT; HPSS; Data Model; Data Access; Data Grid; Computational Grid; Collaboration; NMI; Grid Management; Globus; GridPort; Scheduler; Distributed Resources; Database; SRB]

