Data Bridge Solving diverse data access in scientific applications


1 Data Bridge Solving diverse data access in scientific applications
Zoltán Farkas, Péter Kacsuk, Mark Santcroos, Silvia Olabarriaga, Ákos Balaskó, Krisztián Karóczkai

2 Outline
Problem statement
Data Bridge as independent DCI service:
  Data Bridge concept
  Use cases
  Data Bridge architecture
WS-PGRADE integration:
  Data browsing portlet
  gUSE integration

3 Problem statement
Scientific applications:
  Individual jobs or workflows
  Access data from diverse sources
  Science Gateways can hide the details, but…
Data sources:
  Diverse types: HTTP, FTP, GridFTP, SRM, iRODS, …
  Thus, different APIs are needed to access them
  One possible solution is a service that gives access to the sources through a unified interface
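A minimal sketch of what "a unified interface" can mean in practice, assuming the caller hands over a location string and a handler is picked from its URI scheme. The slides do not say how Data Bridge selects a backend, so `StorageHandler` and `SchemeRouter` are hypothetical names used only for illustration.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical: one call signature, regardless of the protocol behind it.
interface StorageHandler {
    String describe(URI location);
}

// Hypothetical router that maps a URI scheme (ftp, gsiftp, srm, ...) to a handler.
class SchemeRouter {
    private final Map<String, StorageHandler> handlers = new HashMap<>();

    void register(String scheme, StorageHandler handler) {
        handlers.put(scheme.toLowerCase(Locale.ROOT), handler);
    }

    String describe(String location) {
        URI uri = URI.create(location);
        String scheme = uri.getScheme();
        if (scheme == null) {
            throw new IllegalArgumentException("No scheme in: " + location);
        }
        StorageHandler handler = handlers.get(scheme.toLowerCase(Locale.ROOT));
        if (handler == null) {
            throw new IllegalArgumentException("Unsupported scheme: " + scheme);
        }
        return handler.describe(uri);
    }
}
```

The client code only ever calls `describe(String)`; adding a new storage type means registering one more handler rather than changing the callers.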

4 Existing solutions
Name | Supported storages | Access possibilities
OGSA-DAI | Web services, XML databases, file services | Web service
Storage Resource Broker | File systems, Relational Databases | Web, APIs, Command line
iRODS | Disk, Tape, Database, Filesystem with Metadata catalog | Web, WebDAV, Java API, Command line
jSAGA | FTP, GridFTP, SRM, LFC | Java API
Globus Online | FTP, GridFTP | Web interface

5 Data Bridge
Offers a simple service that provides a generic interface on top of the storage services of different DCIs for handling the data stored there.
Depending on the use case, the service offers a way to browse, upload and download data; with multiple server instances it also enables inter-DCI data transfer.

6 Use cases
Use case 1: Browse a single DCI data storage from WS-PGRADE, upload data
Use case 2: Transfer data files between different DCIs
Use case 3: Fetch input data on a DCI worker node from another DCI
Use case 4: Cloud storage usage

7 Use case 1: Storage browsing and data upload
[Diagram: WS-PGRADE (Storage Browsing Portlet) browses and uploads through the Data Bridge (Adaptor Interface, Storage Adaptor), which accesses the Storage]

8 Use case 2: Data transfer using multi-level Data Bridge
[Diagram components: Client (Storage Browsing Portlet or custom application); first Data Bridge (Adaptor Interface, Storage Adaptor1, Data Bridge Adaptor) in front of Storage1; second Data Bridge (Adaptor Interface, Storage Adaptor2) in front of Storage2]

9 Use case 3: Fetch data on a DCI’s worker node from a „foreign” DCI’s storage
Data Bridge usage guidelines:
  First try to fetch the data using native tools
  Only if this fails, use the Data Bridge
[Diagram components: DCI worker node running a Data Bridge Wrapper (Pre-process, Executable, Post-process); Adaptor Interface; Storage Adaptor; Storage]
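The usage guideline above (native tools first, Data Bridge only on failure) can be sketched as a simple fallback. This is an illustration under assumptions: `Fetcher` and `FallbackFetcher` are hypothetical names, and the actual Data Bridge Wrapper may decide differently when a native attempt counts as failed.

```java
import java.io.IOException;

// Hypothetical abstraction: some way to fetch a remote file to a local path.
interface Fetcher {
    void fetch(String remoteUri, String localPath) throws IOException;
}

// Tries the native tool first and falls back to the Data Bridge only if that fails.
class FallbackFetcher implements Fetcher {
    private final Fetcher nativeTool;
    private final Fetcher dataBridge;
    private String lastUsed = "none";

    FallbackFetcher(Fetcher nativeTool, Fetcher dataBridge) {
        this.nativeTool = nativeTool;
        this.dataBridge = dataBridge;
    }

    @Override
    public void fetch(String remoteUri, String localPath) throws IOException {
        try {
            nativeTool.fetch(remoteUri, localPath);
            lastUsed = "native";
        } catch (IOException nativeFailure) {
            try {
                dataBridge.fetch(remoteUri, localPath);
                lastUsed = "data-bridge";
            } catch (IOException bridgeFailure) {
                // Keep the first error visible for debugging.
                bridgeFailure.addSuppressed(nativeFailure);
                throw bridgeFailure;
            }
        }
    }

    // Reports which path served the most recent successful fetch.
    String lastUsed() {
        return lastUsed;
    }
}
```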

10 Use case 3: Get FTP data from PBS
Could be other protocols (e.g. SRM) as well
[Diagram components: PBS worker node running a Data Bridge Wrapper (Pre-process, Executable, Post-process); Adaptor Interface; FTP Adaptor; FTP Server]

11 Use case 4: Cloud Storage access from WS-PGRADE/gUSE
Currently, no S3 support in WS-PGRADE
An S3 Data Bridge adaptor would fix this
[Diagram components: WS-PGRADE/gUSE; DCI worker node running the Job; Data Bridge; Amazon S3]

12 Data Bridge Architecture
[Diagram components: Public Interface; HTTP servlet; Adaptor Manager; Temporary URL queue (URI, URI, URI); Worker Pool (Thread1, Thread2, … Threadn); Adaptor Interface; DCI Adaptor1, DCI Adaptor2, DCI Adaptor3, … DCI Adaptorm; jSAGA]

13 Data Bridge components
Interfaces:
  Public Interface
  Adaptor Interface
Adaptor Manager
Worker Threads
DCI Adaptors

14 Data Bridge components - Interfaces
Public Interface:
  Provides the public interface for external components (Portlets, gUSE, …)
  Web Service interface
Adaptor Interface:
  A Java interface that hides the details of the different adaptors
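To make the idea of an Adaptor Interface concrete, here is a minimal sketch with a toy in-memory implementation. The method set and the names `StorageAdaptor` and `InMemoryAdaptor` are assumptions for illustration; the slides do not list the real interface's methods.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical adaptor contract: callers never see protocol details.
interface StorageAdaptor {
    List<String> list(String directory) throws IOException;
    void mkdir(String directory) throws IOException;
    void delete(String path) throws IOException;
}

// Toy implementation that keeps paths in memory; a real adaptor would talk to FTP, SRM, etc.
class InMemoryAdaptor implements StorageAdaptor {
    private final Set<String> paths = new TreeSet<>();

    InMemoryAdaptor() {
        paths.add("/");
    }

    @Override
    public List<String> list(String directory) throws IOException {
        if (!paths.contains(directory)) {
            throw new IOException("No such directory: " + directory);
        }
        String prefix = directory.endsWith("/") ? directory : directory + "/";
        List<String> children = new ArrayList<>();
        for (String p : paths) {
            // Direct children only: below the prefix, with no further slash.
            if (p.startsWith(prefix) && p.length() > prefix.length()
                    && p.indexOf('/', prefix.length()) < 0) {
                children.add(p);
            }
        }
        return children;
    }

    @Override
    public void mkdir(String directory) throws IOException {
        if (!paths.add(directory)) {
            throw new IOException("Already exists: " + directory);
        }
    }

    @Override
    public void delete(String path) throws IOException {
        if (!paths.remove(path)) {
            throw new IOException("No such path: " + path);
        }
    }
}
```

Code written against `StorageAdaptor` works unchanged when a different implementation is plugged in, which is the point of hiding the adaptors behind one Java interface.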

15 Data Bridge Public Interface
Operations:
  List
  Mkdir
  Delete
  Get
  Put
  Copy
  Move
Entities:
  URI (either a path, a URL or some specific class)
Error reports:
  Common exceptions

16 Data Bridge Public Interface - URI
Represents an element with a given URI (a directory, a file, metadata attributes, …)
Also needs to carry security credentials (if needed)
Attributes:
  Nothing special in the base class
  For gLite, e.g.:
    Path: the full path
    Type: directory or file
    Size: length of the entity (0 for directories)
    Attributes: optional, contains information as returned by the Adaptor Interface's Stat function
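The gLite-style attributes listed above could be modelled as a small immutable value class. This is only a sketch: `BridgeEntry` and `EntryType` are hypothetical names, and security credentials are left out because the slides do not describe how they are carried.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical entry kinds, matching "Type: directory or file".
enum EntryType { DIRECTORY, FILE }

// Hypothetical immutable description of one storage element.
final class BridgeEntry {
    private final String path;
    private final EntryType type;
    private final long size;
    private final Map<String, String> attributes;

    BridgeEntry(String path, EntryType type, long size, Map<String, String> attributes) {
        if (path == null || path.isEmpty()) {
            throw new IllegalArgumentException("path is required");
        }
        if (type == null) {
            throw new IllegalArgumentException("type is required");
        }
        if (size < 0) {
            throw new IllegalArgumentException("size must not be negative");
        }
        this.path = path;
        this.type = type;
        // Directories always report size 0, as stated on the slide.
        this.size = (type == EntryType.DIRECTORY) ? 0L : size;
        // Optional extra information, e.g. whatever a stat-like call returned.
        this.attributes = Collections.unmodifiableMap(new LinkedHashMap<>(attributes));
    }

    String path() { return path; }
    EntryType type() { return type; }
    long size() { return size; }
    Map<String, String> attributes() { return attributes; }
}
```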

17 Data Bridge Public Interface - Get and Put
Two-phase up- and download with the temporary URL queue:
  First, the web service interface is invoked to register the transfer request
  Next, a simple HTTP client may use HTTP GET or POST/PUT to down- or upload the data
  This way, web service invocation („heavyweight” SOAP) is separated from data transfer („lightweight” HTTP)
[Diagram: the Data Bridge architecture from slide 12, with the Public Interface, HTTP servlet and Temporary URL queue involved in the transfer]
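The two phases can be sketched as a registry of pending transfers keyed by a one-time token. The names `TransferRequest` and `TemporaryUrlQueue`, the token format and the base URL are assumptions for illustration, not the actual Data Bridge implementation.

```java
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical record of what the caller asked for in phase one.
final class TransferRequest {
    final String operation;   // "GET" or "PUT"
    final String storagePath; // location on the target storage

    TransferRequest(String operation, String storagePath) {
        this.operation = operation;
        this.storagePath = storagePath;
    }
}

// Hypothetical temporary URL queue shared by the web service and the HTTP servlet.
class TemporaryUrlQueue {
    private final String baseUrl;
    private final Map<String, TransferRequest> pending = new ConcurrentHashMap<>();

    TemporaryUrlQueue(String baseUrl) {
        this.baseUrl = baseUrl;
    }

    // Phase 1: the web service call registers the request and returns a temporary URL.
    String register(TransferRequest request) {
        String token = UUID.randomUUID().toString();
        pending.put(token, request);
        return baseUrl + "/" + token;
    }

    // Phase 2: the servlet resolves the URL once; a second attempt finds nothing.
    Optional<TransferRequest> claim(String temporaryUrl) {
        String token = temporaryUrl.substring(temporaryUrl.lastIndexOf('/') + 1);
        return Optional.ofNullable(pending.remove(token));
    }
}
```

Because the token is removed when claimed, the plain HTTP request needs no SOAP machinery of its own: holding the temporary URL is what ties it to the registered request.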

18 Adaptor Manager and Worker threads
Provided by the JAX-WS web service API
Tasks:
  Manage incoming requests
  Initialize worker threads to perform the requested operation with the help of the different adaptors
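One common way to realise a manager that hands requests to a pool of worker threads is a fixed-size `ExecutorService`. The sketch below assumes that approach; `AdaptorManagerSketch` is a hypothetical name, and the slides do not say which threading mechanism Data Bridge actually uses.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical manager: accepts operations and runs them on a bounded worker pool.
class AdaptorManagerSketch implements AutoCloseable {
    private final ExecutorService workers;

    AdaptorManagerSketch(int workerCount) {
        this.workers = Executors.newFixedThreadPool(workerCount);
    }

    // Each incoming request becomes a task; the Future lets the caller collect the result.
    <T> Future<T> submit(Callable<T> operation) {
        return workers.submit(operation);
    }

    @Override
    public void close() {
        // Stop accepting new work; tasks already submitted are allowed to finish.
        workers.shutdown();
    }
}
```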

19 DCI Adaptors
Implement: Adaptor Interface
Tasks:
  Perform the operations requested by the Worker Threads, that is, the operations invoked through the web service
Types:
  gLite (using jSAGA)
  GridFTP (using jSAGA)
  FTP (using jSAGA)
  Data Bridge: special adaptor to forward requests to other Data Bridges

20 Data Bridge clients
Web Service clients:
  Create your own based on the WSDL (or REST)
Java API:
  Provides a convenient tool to use Data Bridge Public Interface functions
  Data transfer functions should accept InputStream and OutputStream objects as their arguments
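A stream-based client API, as suggested above, lets callers transfer data without loading whole files into memory on their side. The following is a sketch only: `StreamingClient` and `InMemoryStreamingClient` are hypothetical names, and the in-memory store stands in for the remote Data Bridge service.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.HashMap;
import java.util.Map;

// Hypothetical client contract with stream arguments for data transfer.
interface StreamingClient {
    void put(String path, InputStream source) throws IOException;
    void get(String path, OutputStream target) throws IOException;
}

// Stand-in implementation; a real client would send the bytes to the service.
class InMemoryStreamingClient implements StreamingClient {
    private final Map<String, byte[]> store = new HashMap<>();

    // Copies everything from in to out in fixed-size chunks and returns the byte count.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        return total;
    }

    @Override
    public void put(String path, InputStream source) throws IOException {
        ByteArrayOutputStream collected = new ByteArrayOutputStream();
        copy(source, collected);
        store.put(path, collected.toByteArray());
    }

    @Override
    public void get(String path, OutputStream target) throws IOException {
        byte[] data = store.get(path);
        if (data == null) {
            throw new IOException("No such path: " + path);
        }
        copy(new ByteArrayInputStream(data), target);
    }
}
```

With this shape, the same `put` call works for a `FileInputStream`, a network stream or an in-memory buffer.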

21 WS-PGRADE integration
A Data Browsing portlet that eases storage management

22 WS-PGRADE Workflow I/O configuration
During a workflow node's I/O configuration, the user should be able to select files from storages
The interface provided should be the same as the Storage Browsing portlet of the selected storage (only with one panel)

23 Current status, future work
Core Data Bridge (available as a web service) is ready and works with most major protocols (FTP, GridFTP, SRM)
User Interface development has started; the first version will be available as part of WS-PGRADE/gUSE shortly

24 Thank you for your attention!
Questions?

