
1 Data Access on the TeraGrid (Possibilities & Directions) Dan Fraser, ANL Ann Chervenak, ISI TeraGrid data workshop Jan ’07, San Diego

2 Overview
- Architecture ideas from LCG
  - Trending toward SOA (changeable parts)
  - Possible synergy with TeraGrid Data
- Globus Data Directions
  - GridFTP and potential benefits for TG users
  - Reliable File Transfer (RFT)
  - Replica Location Service (RLS): distributed database that records locations of data copies
  - Data Replication Service (DRS): integrates RFT & RLS to replicate & register files
  - Data Access and Integration Service (DAIS): service to access relational and XML databases

3 Architecture ideas from LCG
[Layered architecture diagram]
- Storage systems: CASTOR (3 T1s), dCache (9 T1s & T2s), DPM (T2s only), MOPS (T2s only); tape
- GridFTP is the underlying transfer mechanism
- Transfer services: RFT (OSG), FTS (EGEE) [SRM-cp, Unicore, Oracle, IBM]: reliability & retries, single point of control for VOs & bulk transfers
- SRM interface (LCG requirement) + POSIX I/O; proposed DMIS interface
- Advanced client tools: RLS, DRS, RFT (policies)
- Experiment tool kits: PhEDEx (CMS), Don Quixote (ATLAS), FTD (ALICE), DIRAC (LHCb): subscribe to datasets, do-it-yourself, meta-scheduler, pick best copy
- Experimental framework: pure science codes

4 PhEDEx (Physics Experiment Data Export) (slide from LCG presentation)
- Large-scale dataset replica management system
- Managed data flow following a transfer topology (Tier-0 → Tier-1 → Tier-2)
- Routed multi-hop transfers; routing agents determine the best route
- Reliable point-to-point transfers based on unreliable Grid transfer tools
- Set of quasi-independent, asynchronous software agents posting messages on a central blackboard
- Nodes subscribe to data allocated from other nodes
- Enables distribution management at the dataset level rather than at the file level
- Implements the experiment's policy on data placement
- Allows prioritization and scheduling
- In production (~3 years)
  - Managing transfers of several TB/day
  - ~100 TB known to PhEDEx, ~200 TB total replicated
  - Running at CERN, 7 Tier-1s, 10 Tier-2s

5 What can we learn?
- Different communities have different needs and use their own data-specific tools
  - One monolithic file system could be nice but is not necessary to get work done (yet)
- Common central tools help everyone (catalogues, metadata, replica management, easy reliable file access, workflows [scheduling])
- Trend toward isolating "data specialized" code
- Common interfaces allow teams to play nicely together
- GSI is a big win, eventually (hidden by portals)
- ... details to be filled in by people in this room

6 Overview
- Architecture ideas from the LCG
  - Trending toward SOA (changeable parts)
  - Possible synergy with TeraGrid Data
- Globus Data Directions
  - GridFTP and potential benefits for TG users
  - Reliable File Transfer (RFT)
  - Replica Location Service (RLS): distributed database that records locations of data copies
  - Data Replication Service (DRS): integrates RFT & RLS to replicate & register files
  - Data Access and Integration Service (DAIS): service to access relational and XML databases

7 What is GridFTP
[Architecture diagram: clients connect to the GridFTP server, which sits in front of I/O and file systems]
- Client interfaces: globus-url-copy, C library, RFT (3rd party)
- GridFTP server: separate control and data channels; striping
- XIO drivers: TCP, UDT (UDP), parallel streams, GSI
- Data Storage Interfaces (DSI): POSIX, SRB

8 Extensible IO (XIO) system
- Provides a framework that implements a read/write/open/close abstraction
- Drivers are written that implement the functionality (file, TCP, UDP, GSI, etc.)
- Different functionality is achieved by building protocol stacks
- GridFTP drivers will allow 3rd-party applications to easily access files stored under a GridFTP server
- Other drivers could be written to allow access to other data stores
- Changing drivers requires minimal change to the application code
- Ported GridFTP to use UDT in less than a day
  - AFTER the UDT driver was written

9 Parallelism vs Striping

10 Memory to Memory Striping Performance

11 Why people use GridFTP
- Security (GSI, and now SSH)
- Performance using parallel streams
- Performance using striping (and parallel streams)
- Partial file transfer
- Third-party control (reliable & restartable)
- Data extensibility
- Protocol extensibility
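
To make the parallelism, striping, and third-party items above concrete, here is a minimal sketch of typical globus-url-copy invocations. The host names and paths are made up, and exact option behavior can vary between Globus Toolkit releases, so treat this as illustrative rather than prescriptive.

# Third-party transfer between two GridFTP servers, using 4 parallel
# TCP streams and an enlarged TCP buffer (hypothetical hosts/paths):
globus-url-copy -vb -p 4 -tcp-bs 1048576 \
    gsiftp://gridftp.site-a.example.org/data/run042.dat \
    gsiftp://gridftp.site-b.example.org/scratch/run042.dat

# Striped transfer between striped (multi-node) server deployments:
globus-url-copy -vb -stripe \
    gsiftp://gridftp.site-a.example.org/data/bigfile.dat \
    gsiftp://gridftp.site-b.example.org/scratch/bigfile.dat

Because both URLs point at remote servers, the client only drives the control channels; the data flows directly between the two sites (third-party transfer).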

12 Top GridFTP Myths
- Hard to install
- Requires all of Globus
- Requires GSI

13 Moving Forward
[Same architecture diagram as slide 7, with planned additions: HPSS, small-file optimization, virtual deployment, dynamic registration, SSH]
- Client interfaces: globus-url-copy, C library, RFT (3rd party)
- GridFTP server: separate control and data channels; striping
- XIO drivers: TCP, UDT (UDP), parallel streams, GSI
- Data Storage Interfaces (DSI): POSIX, SRB

14 Future GridFTP Directions
- Client/server side
  - Lots of Small Files optimization (beta): transfer a sequence of small files as if they were one file
  - Dynamic mover registration infrastructure (GFork)
    - Enhance reliability, especially during striping
    - Dynamically scale to meet ever-changing transfer demands
    - Enable users to configure fast transfers
  - Dynamic deployment via virtual machines (InfiniBand)
  - Managed Object Placement Service (MOPS)
- XIO side
  - Enable transfers using SSH (beta) (sketch below)
- DSI side
  - HPSS (beta)
Help us prioritize for TeraGrid
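
For the SSH item above (marked beta), a minimal sketch of what GridFTP over SSH looks like from the client side, assuming a GridFTP build with sshftp support; the user, host, and paths are hypothetical.

# Use SSH instead of GSI for control-channel authentication; the
# sshftp:// scheme makes globus-url-copy start the remote server
# over an ssh connection:
globus-url-copy -vb \
    sshftp://user@gridftp.site-a.example.org/home/user/results.tar \
    file:///tmp/results.tar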

15 Lots Of Small Files (LOSF)
- Pipelining
  - Many transfer requests outstanding at once
  - Client sends the second request before the first completes
  - Latency of the request is hidden in data transfer time
- Cached data channel connections
  - Reuse established data channels (Mode E)
  - No additional TCP or GSI connect overhead
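
A hedged sketch of how these two optimizations are requested from globus-url-copy; the feature was beta at the time of this talk, so the exact flag spellings may differ in the GridFTP release you have, and the hosts and paths are made up.

# Move a directory of many small files: -pp pipelines the per-file
# transfer requests and -fast keeps the mode E data channels cached
# instead of reconnecting (and re-authenticating) for every file:
globus-url-copy -vb -pp -fast -r \
    gsiftp://gridftp.site-a.example.org/data/small_files/ \
    gsiftp://gridftp.site-b.example.org/scratch/small_files/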

16 Fast LOSF
- 1 GB of data partitioned into equal-sized files
- With pipelining, performance does not degrade until file sizes drop to about 100 KB

17 GridFTP Advanced Configurations
- GFork
  - Robust Unix fork/setuid model
  - Allows server state to be maintained across connections
- Dynamic backends
  - Stability in the event of backend failure
  - Growing resource pools for peak demands
- Frontend replication

18 GFork
[Diagram: clients open control-channel connections to the GFork server on the server host; a GridFTP plugin forks GridFTP server instances, which inherit the client links and share state over a state-sharing link]

19 Dynamic Backends
[Diagram: on the frontend host, the GFork server's GridFTP plugin forks a frontend instance that looks up available backends; on each backend host, a registration daemon registers over a control connection and inetd forks backend instances]
- Multiple backends (BEs) register with the plugin
  - The plugin maintains the list of available BEs
- The frontend (FE) instance selects N BEs for use
- If any one BE fails, another can be used
- The BE pool can grow and shrink

20 Reliable File Transfer
- RFT accepts a SOAP description of the desired transfer
- It writes this description to a database
- It then uses the Java GridFTP client library to initiate third-party transfers on behalf of the requestor
- Restart markers are stored in the database to allow restart in the event of an RFT failure
- Supports concurrency, i.e., multiple files in transit at the same time; this gives good performance on many small files

21 Reliable File Transfer
- Comparison with globus-url-copy
  - Supports all the same options (buffer size, etc.)
  - Increased reliability because state is stored in a database
  - Service interface
    - The client can submit the transfer request and then disconnect and go away
    - Think of it as a job scheduler for transfer jobs (sketch below)
- Two ways to check status
  - Subscribe for notifications
  - Poll for status
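
To make the "job scheduler for transfer jobs" idea concrete: the core of an RFT request is simply a list of third-party source/destination URL pairs plus transfer options, which the service persists and then executes. The sketch below shows only the URL-pair portion with made-up hosts and paths; the option header that the real client transfer file carries, and the exact client invocation, depend on the GT release and are omitted here.

# Each line names one transfer: source URL, then destination URL.
gsiftp://gridftp.site-a.example.org/data/file0001.dat gsiftp://gridftp.site-b.example.org/scratch/file0001.dat
gsiftp://gridftp.site-a.example.org/data/file0002.dat gsiftp://gridftp.site-b.example.org/scratch/file0002.dat

The client submits such a description to the RFT service and can then disconnect; the service stores the request and restart markers in its database, drives the GridFTP transfers, and can be polled (or can send notifications) for status.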

22 Globus Replica Management
Current Globus tools:
- Replica Location Service (RLS): provides registration and discovery of data items
- Data Replication Service (DRS): pull-based replication of existing data items using RFT, and registration of the files in RLS
Long-term plan (CEDS): provide flexible, policy-driven replication services
- Maintain a certain level of redundancy for all data items
- Subscribe to data items with certain characteristics and automatically receive copies of new, matching data items
- Keep replicas consistent with one another

23 Replication Scenario: The LIGO Project
- Laser Interferometer Gravitational Wave Observatory
- Data sets first published at Caltech
  - Publication includes specification of metadata attributes
- Data sets may be replicated at up to 10 LIGO sites
  - Sites perform metadata queries to identify desired data
  - Pull copies of data from Caltech or other LIGO sites
- Customized data management system: the Lightweight Data Replicator (LDR) system
  - Uses existing Globus tools (GridFTP, RLS)

24 The Globus Replica Location Service
- A Replica Location Service (RLS) is a distributed registry that records the locations of data copies and allows replica discovery
  - RLS maintains mappings between logical identifiers and target names
  - Must perform and scale well: support hundreds of millions of objects, hundreds of clients
- Example: the LIGO (Laser Interferometer Gravitational Wave Observatory) project
  - RLS servers at 10 sites
  - Maintain associations between 11 million logical file names & 120 million physical file locations
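
A hedged sketch of what these logical-to-physical mappings look like through the globus-rls-cli command-line client; the RLS server address, logical file name, and replica URLs below are invented for illustration.

# Create a logical file name (LFN) and register its first replica (PFN):
globus-rls-cli create H-R1-9000.gwf \
    gsiftp://gridftp.caltech.example.org/data/H-R1-9000.gwf \
    rls://rls.caltech.example.org

# Register a second physical copy of the same LFN:
globus-rls-cli add H-R1-9000.gwf \
    gsiftp://gridftp.uwm.example.org/data/H-R1-9000.gwf \
    rls://rls.caltech.example.org

# Ask the Local Replica Catalog where copies of the LFN live:
globus-rls-cli query lrc lfn H-R1-9000.gwf rls://rls.caltech.example.org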

25 RLS Framework
[Diagram: Local Replica Catalogs feeding Replica Location Indexes]
- Local Replica Catalogs (LRCs) contain consistent information about logical-to-target mappings
- Replica Location Index (RLI) nodes aggregate information about one or more LRCs
- LRCs use soft-state update mechanisms to inform RLIs about their state: relaxed consistency of the index
- Optional compression of state updates reduces communication, CPU, and storage overheads

26 RLS Status
- Stable component
  - Greatly improved performance and scalability in the last 2 years
  - No major changes to existing RLS functionality or interfaces
  - New interface: WS-RF-compatible web services interface (WS-RLS)
- Major difficulty for users has been installation and configuration of open source relational database backends
- New features
  - Support for an embedded database backend (SQLite)
  - Easier configuration of relational database backends
  - Pure Java client for RLS (available approx. March 2007)
- Planned features
  - Dynamic deployment of RLS services
  - Better support for RLS configuration management in VOs
  - Finer-grained authorization support for users

27 Motivation for Data Replication Service
- Data-intensive applications need higher-level data management services that integrate lower-level Grid functionality
  - Efficient data transfer (GridFTP, RFT)
  - Replica registration and discovery (RLS)
  - Eventually validation of replicas, consistency management, etc.
- Goal is to generalize the custom data management systems developed by several application communities
- Eventually plan to provide a suite of general, configurable, higher-level data management services
- The Globus Data Replication Service (DRS) is the first of these services

28 The Data Replication Service
- Included in the GT 4.0.2 release
- Design based on the publication component of the LIGO Lightweight Data Replicator system
  - Developed by Scott Koranda
- The client specifies (via the DRS interface) which files are required at the local site
- DRS uses:
  - The Globus Delegation Service to delegate proxy credentials
  - RLS to discover where replicas exist in the Grid
  - A selection algorithm to choose among available source replicas (provides a callout; default is random selection)
  - The Reliable File Transfer (RFT) service to copy data to the site, via the GridFTP data transport protocol
  - RLS to register new replicas

29 DRS Functionality
[Diagram: a client drives a Replicator resource in DRS, which uses Transfer resources in RFT, the RLS index and catalogs, and GridFTP servers]
- Delegate credential via the Delegation Service
- Create a Replicator resource via DRS
- Discover replicas of the desired files in RLS; select among replicas
- Transfer data to the local site with the Reliable File Transfer service, using GridFTP servers
- Register new replicas in RLS catalogs
- Monitor the Replicator resource and trigger events
- Inspect the state of the DRS resource and its Resource Properties
- Destroy the Replicator resource
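
To see what DRS automates, here is a hedged sketch of the equivalent manual sequence using the underlying tools directly (discover in RLS, copy with GridFTP, register in RLS). All hosts and file names are invented, and using globus-url-copy instead of RFT for the copy step is a simplification for illustration, not how DRS itself works.

# 1. Discover where replicas of the logical file exist:
globus-rls-cli query rli lfn H-R1-9000.gwf rls://rls.index.example.org
globus-rls-cli query lrc lfn H-R1-9000.gwf rls://rls.caltech.example.org

# 2. Pull a copy of the chosen source replica to the local site:
globus-url-copy -vb -p 4 \
    gsiftp://gridftp.caltech.example.org/data/H-R1-9000.gwf \
    gsiftp://gridftp.local-site.example.org/data/H-R1-9000.gwf

# 3. Register the new replica in the local RLS catalog:
globus-rls-cli create H-R1-9000.gwf \
    gsiftp://gridftp.local-site.example.org/data/H-R1-9000.gwf \
    rls://rls.local-site.example.org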

30 Next Generation: Data Placement Services
- Center for Enabling Petascale Distributed Science (CEDS)
  - Recently funded by DOE SciDAC-2 as a Center for Enabling Technologies
- Includes:
  - USC Information Sciences Institute
  - Argonne National Laboratory
  - University of Wisconsin-Madison
  - Lawrence Berkeley National Laboratory
  - Fermi National Accelerator Laboratory
- Higher-level, policy-driven placement of data
- End-to-end provisioning of data resources to carry out placement decisions

31 Layered Architecture

32 Higher-Level Data Placement Services
- Decide where to place objects and replicas in the distributed Grid environment
- Policy-driven, based on the needs of the application
- Effectively creates a placement workflow that is passed to the Reliable Distribution Service layer for execution
- Push- or pull-based service that places an explicit list of data items
- Metadata-based placement
  - Decide where data objects are placed based on the results of metadata queries for data with certain attributes
- N copies: maintain N copies of data items
  - The placement service checks existing replicas and creates/deletes replicas to maintain N copies
- Publication/subscription
  - Allows sites or clients to subscribe to topics of interest
  - Data objects are placed as indicated by subscriptions

33 Reliable Distribution Layer
- Responsible for carrying out the distribution or placement "plan" generated by a higher-level service
- Extends the functionality of reliable file transfer services
- Needs to provide feedback to higher-level placement services on the outcome of the placement workflow
- Calls on lower-level services to coordinate (e.g., the GridFTP data transport service)

34 OGSA-DAI in a nutshell
- An extensible framework for data access and integration
- Exposes heterogeneous data resources to a grid through web services
- Interact with data resources
  - Queries and updates
  - Data transformation / compression
  - Data delivery
  - Application-specific functionality
  - Supports relational, XML, and text and binary files
  - Supports various delivery options and transforms
  - Supports secure-conversation (message-level) security using X.509 certificates
- A base for higher-level services
  - Federation, mining, visualisation, ...

35 OGSA-DAI motivation
- Entering an age of data
  - Data explosion
    - CERN LHC will generate 1 GB/s = 10 PB/year
    - Pixar generates ~100 TB per movie
  - Storage getting cheaper
- Data stored in many different ways
  - Relational databases
  - XML databases
  - Text and binary files
- Need ways to facilitate
  - Data discovery
  - Data access
  - Data integration
- Empower e-Business and e-Science
  - The grid is a vehicle for achieving this

36 Data services
[Diagram: a single Data Service exposes three Data Service Resources (SQLOne, XMLOne, FilesOne), each backed by a Data Resource Accessor for, respectively, a relational database, an XML database, and files]

37 Architecture Ideas from LCG
[Repeat of the layered diagram from slide 3: storage systems (CASTOR, dCache, DPM, MOPS; tape), GridFTP as the underlying transfer mechanism, transfer services (RFT on OSG, FTS on EGEE), SRM interface (LCG requirement) + POSIX I/O, advanced client tools (RLS, DRS, RFT, policies), experiment tool kits (PhEDEx, Don Quixote, FTD, DIRAC), and experimental frameworks of pure science codes]

38 Translation to TeraGrid ??
[The LCG layered picture recast for TeraGrid]
- Storage systems: GPFS | HPSS | SRB xfer | pNFS?
- Mechanism: GridFTP, pFTP | GridFTP, ??
- TGCP interface | HPSS interface | (possible use of DMIS?)
- Advanced client tools: RLS, DRS, RFT (policies): single point of control for VOs
- Experimental tool kits: subscribe to datasets, do-it-yourself, meta-scheduler, pick best copy
- Experimental framework: pure science codes: Atmospheric, Astronomy, Medicine, ...

39 For More Information
- GridFTP
- RLS
  - "Performance and Scalability of a Replica Location Service," High Performance Distributed Computing Conference
  - Documentation:
- DRS
  - "Wide Area Data Replication for Scientific Collaborations," Grid Computing (Grid2005)
  - Documentation:

40 Discussion ? (over dinner?)

