An Open Standards-based Scalable Heavy Lifting Data Transfer Service for e-Research David Meredith, Peter Turner, Alex Arana, Gerson Galang, David Wallom, Phil Kershaw, Weijing Fang, Ally Hume, Mario Antonioletti, Steve Crouch
Problem
Moving data is a growing problem
Data is increasing in size and is difficult to move about
  – Storage
  – Network
Initiating data transfers across different protocols (data onto/off grids) from a range of clients
  – Remote user: desktop, portal
  – Grid + Web
e.g. copy from a beam-line data resource to my home storage lab
Can't do the transfer through clients – not scalable
Need something lightweight for users
Users/Use Cases
For users from e.g.:
  – Diamond Synchrotron, STFC
  – Australian Synchrotron Facility
Use Cases:
  – Hermes (e.g. Oxford Anatomy Institute of Biology: they do not want to deploy a whole other machine for this, have 100 GB of data, and want a desktop client to do it)
  – NGS Portal
  – Any Commons VFS-style client
  – SAGA client?
High-level Requirements
Properties:
  – Scalable
  – Durable/Reliable
  – Asynchronous
  – Support protocols: ftp/sftp/http/https/gsiftp/SRB/iRODS/SRM
Core requirement: third-party transfer needs to be cross-protocol (e.g. SRB -> gsiftp)
Construct XML that specifies requirements, send to 3rd-party service for asynchronous transfer
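The "construct XML, send to a third-party service" step could look roughly like the sketch below. The element and attribute names (TransferRequest, Copy, Source, Sink, uri) and the example hosts are made up for illustration; they are not taken from OGSA-DMI, JSDL or any DataMINX schema.

```python
import xml.etree.ElementTree as ET


def build_transfer_request(source_uri, sink_uri):
    """Return an XML string describing one source-to-sink copy.

    The element names are hypothetical, not a real schema.
    """
    root = ET.Element("TransferRequest")
    copy = ET.SubElement(root, "Copy")
    ET.SubElement(copy, "Source", uri=source_uri)
    ET.SubElement(copy, "Sink", uri=sink_uri)
    return ET.tostring(root, encoding="unicode")


# Cross-protocol example from the slide: SRB -> gsiftp (hosts are placeholders).
request_xml = build_transfer_request(
    "srb://srb.example.org/home/user/scan.dat",
    "gsiftp://gridftp.example.org/data/scan.dat",
)
print(request_xml)
```

The client only builds and submits this document; the bytes themselves would never pass through it.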
Realisations
Need to discuss at a high level – separate into particular layers
  – Top-level service, scheduling/movement
  – Interfaces to individual data protocols (i.e. through VFS)
Could go to data service providers and ask them to support 3rd-party transfer
  – But the process could take too long
  – The tech is already out there
Would this go into UMD (Unified Middleware Distribution)? They want all projects using EU-funded e-Infrastructure
Current Cross Protocol File Transfer
[Diagram: VFS/SAGA client (e.g. Portal/Hermes); SRB / FTP / SFTP / GSIFTP servers]
Client provides a single interface to different remote file systems (SRB, GsiFtp, Ftp, Sftp).
Between client and servers: file operations (list, upload, download, delete, rename), bit pipe (byte IO stream), authentication tokens (un/pw, x509?).
Auth tokens are held only in memory on one server. Self-contained.
Data is buffered through the client: this does not scale and is synchronous!
Piping bytes via the client is a bottleneck and a single point of failure, and raises concurrency issues.
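To make the bottleneck concrete, here is a minimal sketch of a client-side copy loop. In-memory streams stand in for the remote SRB and GridFTP files (an assumption for illustration; a real client would open them through something like Commons VFS).

```python
import io


def copy_via_client(source, sink, chunk_size=64 * 1024):
    """Pump bytes from source to sink through this process.

    The call blocks until the last chunk is written, and every byte
    crosses the client's memory and network link.
    """
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        sink.write(chunk)
        total += len(chunk)
    return total


source = io.BytesIO(b"x" * 200_000)  # stand-in for a remote SRB file
sink = io.BytesIO()                  # stand-in for a remote GridFTP file
copied = copy_via_client(source, sink)
```

With real remote endpoints the throughput is capped by the client's own connection, and a client crash mid-loop loses the transfer.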
Required / Suggested Architecture
[Diagram: VFS/SAGA client; JMS queue behind WS-I interface; VFS workers; SRB / FTP / SFTP / GSIFTP servers]
Asynchronous, no concurrency issues, no data buffered via client!
Between workers and servers: file operations (list, upload, download, delete, rename), bit pipe (byte IO stream), authentication tokens (un/pw, x509?).
Move file transfers to a different server (farm) to increase bandwidth and concurrency.
Passing auth tokens around in messages (strong security required).
Development / testing.
Work to date
Data transfer currently done via e.g. Hermes client
Commons VFS provides ftp/sftp/HTTP/HTTPS/webdav/gsiftp
Will always need clients via an interface (e.g. Portal, Hermes, VFS client) but have the transfer done via a scalable third-party service
  – Asynchronous, poll for progress
  – Architecture: underlying VFS code exists, deploy it in a service-oriented, scalable manner
Standards-driven?
  – OGSA-DMI
  – JSDL, GridSAM (compute-focused)
DataMINX DTS – Heavy Lifting Data Transfer Service
This is just one possible implementation of this; GridSAM another? Under discussion for the last 4 days
JMS-based scalability for moving data asynchronously/in parallel
  – DTS web service submits to JMS queue
  – DTS worker nodes (VFS clients) pick up JMS transfer messages
  – Can specify in the JMS queue which machines should perform the transfer
Within J2EE environment
Abstractions with target URIs
  – Through shared connection pool per machine
  – One connection to target URI
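The submit-to-queue, workers-pick-up, client-polls pattern can be sketched as below. This is not the DTS code: an in-process queue.Queue and threads stand in for the JMS queue and worker nodes, and the TransferService class, its method names and the status strings are all hypothetical.

```python
import queue
import threading
import uuid


class TransferService:
    """Accepts transfer requests and hands them to background workers."""

    def __init__(self, num_workers=2):
        self._jobs = queue.Queue()  # stand-in for the JMS queue
        self._status = {}
        self._lock = threading.Lock()
        for _ in range(num_workers):
            threading.Thread(target=self._worker, daemon=True).start()

    def submit(self, source_uri, sink_uri):
        """Queue a transfer and return a job id without waiting for the copy."""
        job_id = str(uuid.uuid4())
        with self._lock:
            self._status[job_id] = "QUEUED"
        self._jobs.put((job_id, source_uri, sink_uri))
        return job_id

    def status(self, job_id):
        """Poll the current state of a submitted job."""
        with self._lock:
            return self._status[job_id]

    def wait_all(self):
        """Block until every queued job has been processed (for the demo only)."""
        self._jobs.join()

    def _worker(self):
        while True:
            job_id, source_uri, sink_uri = self._jobs.get()
            with self._lock:
                self._status[job_id] = "RUNNING"
            # A real worker would copy source_uri to sink_uri here.
            with self._lock:
                self._status[job_id] = "DONE"
            self._jobs.task_done()


service = TransferService()
job = service.submit("srb://a.example.org/f", "gsiftp://b.example.org/f")
service.wait_all()
final_status = service.status(job)
```

The client holds only a job id and polls for progress; adding workers raises throughput without changing the client.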
Other Possible Solution Paths
GridSAM does some but not all
gLite File Transfer Service – does this on a large scale
Stork
  – Supports ftp/http/gsiftp/nest/srb/srm/csrm/unitree
  – But not a web service – suitable?
Alan W – Vbrowser – Hermes-esque?
DW: Cloud-based (e.g. Amazon solution?)
AH: Parallelisation in OGSA-DAI for compute, here is parallelisation for data
  – GridSAM's data transfer is not parallelised
  – Could have a job that just moves data, but cannot guarantee network availability on worker nodes, and not architecturally OK
If one web service supports a single protocol, just extend it
Issues
It's a big problem with a big suggested solution – lots of developer work
Need to think about failure use cases
  – Worker node fails
  – JMS gives you isolation from service failure through tested, transaction-based durability
  – Need to discuss and uncover other failure cases
Specs – do they cover all the use cases?
  – JSDL/HPC File Staging Profile, OGSA-DMI?
  – Interfaces limited?
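The "worker node fails" case can be illustrated with a toy model of transactional receive: a message leaves the queue only once the worker acknowledges it, and is put back if the worker fails. This is a simplified in-memory stand-in, not JMS itself (a real broker also persists messages to disk), and DurableQueue and run_worker are hypothetical names.

```python
from collections import deque


class DurableQueue:
    """Toy queue that keeps each message until the receiver acknowledges it."""

    def __init__(self):
        self._pending = deque()
        self._in_flight = {}
        self._next_id = 0

    def send(self, body):
        self._pending.append(body)

    def receive(self):
        """Hand out a message but keep a copy until ack() or rollback()."""
        body = self._pending.popleft()
        self._next_id += 1
        self._in_flight[self._next_id] = body
        return self._next_id, body

    def ack(self, delivery_id):
        del self._in_flight[delivery_id]

    def rollback(self, delivery_id):
        # Return the message to the front of the queue for redelivery.
        self._pending.appendleft(self._in_flight.pop(delivery_id))


def run_worker(q, transfer):
    """Process one message; roll back on failure so another worker can retry."""
    delivery_id, body = q.receive()
    try:
        transfer(body)
    except Exception:
        q.rollback(delivery_id)
        return False
    q.ack(delivery_id)
    return True


attempts = []


def flaky_transfer(body):
    """Fail on the first attempt to simulate a worker losing its connection."""
    attempts.append(body)
    if len(attempts) == 1:
        raise OSError("worker lost connection")


q = DurableQueue()
q.send("srb://a.example.org/f -> gsiftp://b.example.org/f")
first = run_worker(q, flaky_transfer)   # fails, message is redelivered
second = run_worker(q, flaky_transfer)  # succeeds on the retry
```

The sketch covers only a worker that fails cleanly; partial writes at the sink and repeated poison messages are among the other failure cases still to be uncovered.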
Next Steps (Within CW)
Recommend further session (Mario, Steve C, Ally, David M, Peter T, Alex A, Gerson G, David W, Weijian F):
  – Have others critique the design work over last 4 days
  – Possible subdivision for detailed issues
  – High-level requirements discussion
  – Implementation/specification
Go over issues with schema specs, possible ways forward
Possible architectures that can assist the problem now
  – Stork!
Next Steps (Out of CW)
Spec issues:
  – Schedule discussion within OGSA-DMI WG (Mario to organise)
  – HPC File Staging Profile/JSDL WGs (David M/Steve C to organise)
  – DW: attend the OGF PGI sessions – they will be observing & championing necessary changes to JSDL/HPC Profile (Steve C)