Rosa Filgueira – University of Edinburgh; Iraklis Klampanos – University of Edinburgh; Yusuke Tanimura – AIST, Tsukuba; Malcolm Atkinson – University of Edinburgh.


1 Rosa Filgueira – University of Edinburgh; Iraklis Klampanos – University of Edinburgh; Yusuke Tanimura – AIST, Tsukuba; Malcolm Atkinson – University of Edinburgh

2  Introduction ◦ Problem description ◦ Hypothesis ◦ Rock Physics laboratory experiments ◦ Objective ◦ Proposal  Related developments ◦ Data transfer protocols ◦ Data transport systems  FAST ◦ Selecting the best data transfer protocol ◦ Data transfer experiments ◦ Implementation and evaluation  Future work and Questions

3  Large number of rock physics (RP) laboratories ◦ Run many experiments (experimentalists)  Large number of rock physicists ◦ Develop computational codes (code builders)  Sharing experimental data among this community is still in its early days ◦ No facilities to transfer experimental data automatically in real time together with its associated description (metadata)

4  Several tools provide reliable, high-performance data transfer ◦ e.g. Dropbox or Globus Online  None is optimized for the RP requirements

5  The RP community will benefit from a tool that ◦ Transfers data and metadata in near-real time ◦ Provides a repository and DB accessible from a website  For experimentalists ◦ Collection and comparison of experiments from many labs  For code builders ◦ Finding test data for running their models

6  Laboratory rock property measurements ◦ Properties of the rock sample are studied under different conditions  High-pressure vessels apply pore pressures and stresses to a cylindrical rock sample  Until the sample fails, different features (e.g. stress, porosity, temperature) are recorded at several time intervals  In each interval, data is transferred to a local computer (one channel per rock)

7 Figures: Pressure Vessel; UCL RP Laboratory; Rock Samples

8 Initial target: 30 months; deployed under the sea (Mediterranean); 8 rock samples with different features; different time intervals and data sizes

9  Each experiment can record data differently ◦ Events can be written to a new file or appended ◦ Files can be stored in the same directory or not ◦ Intervals for writing data can be short or long ◦ Number of rock samples can be one or several ◦ Duration of an experiment can be short or long  Transferring the data is a data-intensive problem

10  To transfer RP experimental data from one location to another ◦ Automated data transfer until the end of the experiment  Transfer experimental data  Near-real time and non-real time  Synchronization  Incremental (file) and directory ◦ Possible interruptions and failures ◦ Record and transfer the metadata
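The incremental (file) and directory synchronization modes named on this slide can be sketched as follows. This is a minimal illustration, not the FAST implementation: the function names `read_new_bytes` and `new_files` are hypothetical, and the sketch assumes that experiments either append events to an existing file or write each event to a new file in a watched directory.

```python
import os


def read_new_bytes(path, offset):
    """Incremental (file) mode: return the bytes appended since `offset`.

    Returns a tuple (data, new_offset). If the file is now shorter than
    `offset`, it is assumed to have been replaced and is read from the start.
    """
    size = os.path.getsize(path)
    if size < offset:
        offset = 0
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    return data, offset + len(data)


def new_files(directory, seen):
    """Directory mode: return the file names not seen on earlier polls.

    `seen` is a set owned by the caller; it is updated in place.
    """
    current = set(os.listdir(directory))
    fresh = sorted(current - seen)
    seen |= current
    return fresh
```

A caller would poll at the experiment's recording interval, keep one offset per file, and hand only the new bytes or new files to the chosen transfer protocol.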

11  FAST: Flexible Automated Synchronization Transfer ◦ Data and metadata in real time and non-real time ◦ Incremental (file) and directory sync ◦ Selection of the data-transfer protocol ◦ Compatible with all operating systems ◦ Simple to set up and manage ◦ Monitors the transmission, detects errors and recovers from them ◦ Data collected in a repository, metadata in a DB, and a web site for accessing them  Proposal is triggered by our work ◦ EFFORT project ◦ Using data provided by the Creep-2 project
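The "detects errors and recovers from them" behaviour could look roughly like the retry wrapper below. This is a sketch under stated assumptions: the slides do not say how FAST recovers, so the retry count, the linear back-off, and the use of `OSError` as the failure signal are all illustrative choices, and `transfer_with_retry` is a hypothetical name.

```python
import time


def transfer_with_retry(transfer, attempts=3, delay_s=1.0, sleep=time.sleep):
    """Call `transfer()` until it succeeds or `attempts` is exhausted.

    `transfer` is any zero-argument callable that raises OSError on a
    failed transmission (an assumption made for this sketch). The wait
    grows linearly with the attempt number.
    """
    for attempt in range(1, attempts + 1):
        try:
            return transfer()
        except OSError:
            if attempt == attempts:
                raise  # give up and surface the last error
            sleep(delay_s * attempt)
```

Passing `sleep` as a parameter keeps the wrapper easy to test without real delays.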

12  File Transfer Protocol (FTP) ◦ Control and data are unencrypted ◦ Easy to use, lacks security  FTP security extension (FTPS) ◦ Control encrypted (TLS or SSL), but data might not be  Secure Copy (SCP) ◦ SSH for transferring data and authentication (more secure than the previous ones) ◦ File transfer only ◦ Ideal for quick transfer of single files  SSH File Transfer Protocol (SFTP) ◦ Based on SSH-2: best for secure access (packet confirmation) ◦ File transfer, creating and deleting remote directories and files ◦ Directory synchronization  Rsync ◦ Incremental file transfer (delta algorithm) ◦ File and directory synchronization ◦ Can provide encrypted transfer by using SSH ◦ On-the-fly compression option ◦ Ideal for back-ups
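The Rsync options listed on this slide (SSH transport, on-the-fly compression) map onto standard `rsync` command-line flags. The helper below only builds the argument list and does not run anything; `rsync_command` is a hypothetical name and the paths in the usage note are placeholders.

```python
def rsync_command(source, destination, compress=False, use_ssh=True):
    """Build an rsync argument list for a file or directory sync.

    -a      archive mode (recursive, preserves times and permissions)
    -z      compress data during the transfer
    -e ssh  use SSH as the remote shell, giving an encrypted channel
    """
    cmd = ["rsync", "-a"]
    if compress:
        cmd.append("-z")
    if use_ssh:
        cmd += ["-e", "ssh"]
    cmd += [source, destination]
    return cmd
```

For example, `rsync_command("data/", "user@host:/repo/", compress=True)` yields a list that can be passed to `subprocess.run`.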

13  UDP-based Data Transfer (UDT) ◦ UDP-based protocol for data-intensive applications ◦ UDT can transfer data at a higher speed than TCP-based protocols  UDT Enabled Rsync (UDR) ◦ Uses Rsync for the transport mechanism (delta) ◦ Sends data over the UDT protocol ◦ Ideal for large data over long distance

14  GridFTP ◦ High-performance, secure, reliable data transfer over high-bandwidth networks ◦ Many-to-many ◦ Difficult to use  Globus Online ◦ Uses the GridFTP protocol ◦ Automates the management of files:  monitoring performance, retrying files, recovering from failures ◦ Does not support file synchronization  Dropbox ◦ Centralized cloud storage, file and directory synchronization ◦ Rsync-delta protocol ◦ Data stored on Amazon S3 (third party) ◦ One-to-one file transfer  BTSync ◦ Decentralized cloud storage, P2P file synchronization (no third party) ◦ Devices communicate over UDP ◦ Many-to-many file transfers  WinSCP ◦ SFTP and FTP client for Windows

15 Email from Globus Online Support We recently noticed that you are creating many CLI sessions to, each with a single blocking transfer. This is a suboptimal way to use Globus Online and in fact is causing us some resource usage issues.

16  Previous tools ◦ Use different data-transfer protocols ◦ Some automate data synchronization  None of them ◦ Selects the best protocol depending on the requirements ◦ Provides methods for tracking metadata and transferring it  Our work automatically ◦ Selects a protocol among FTPS, SFTP, Rsync, and UDR ◦ Injects a minimum of metadata ◦ GridFTP and P2P discarded: FAST communications are 1-to-1 ◦ FTPS used instead of FTP: minimum security level ◦ SFTP derives from SCP

17 FTPS, SFTP, Rsync and UDR

18  Two machines located in Edinburgh ◦ VLAN Network 100MB/s  Synthetic program to generate events  Data size written to files: 50KB, 500KB, 1MB, 10MB, 100MB, 500MB, 1GB and 10GB.  Measures: transfer rate and elapsed time  Repetition: 10 times
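The measurement procedure on this slide (synthetic data of a given size, 10 repetitions, elapsed time and transfer rate) can be sketched as below. This is an assumed harness, not the authors' program: `generate_file` and `measure` are hypothetical names, the payload is random bytes, and `transfer` stands for whichever protocol client is under test.

```python
import os
import statistics
import time


def generate_file(path, size_bytes, chunk=1024 * 1024):
    """Write `size_bytes` of random data to `path`, one chunk at a time."""
    remaining = size_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(chunk, remaining)
            f.write(os.urandom(n))
            remaining -= n


def measure(transfer, size_bytes, repetitions=10):
    """Run `transfer()` `repetitions` times.

    Returns (mean elapsed seconds, mean transfer rate in bytes per second).
    """
    elapsed = []
    for _ in range(repetitions):
        start = time.perf_counter()
        transfer()
        elapsed.append(time.perf_counter() - start)
    mean_s = statistics.mean(elapsed)
    rate = size_bytes / mean_s if mean_s > 0 else float("inf")
    return mean_s, rate
```

Averaging over repetitions smooths out transient network load, which matters most for the small file sizes where a single transfer takes well under a second.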

19 SFTP fastest for files < 500MB; Rsync fastest for files >= 500MB (** without compression). Elapsed time by file size. Columns: File size | Rsync | UDR | SFTP | FTPS | Rsync-c | UDR-c. Rows: 50KB 00000.1; 500KB 0.; 1MB 0.; 50MB 443471.05; 500MB 39424043781.05; 1GB 7879 82147180; 10GB 814845850101214951712

20  UDR has been specially designed for ◦ Large data transfers over long distances  UDR vs Rsync using two machines ◦ Located in different local networks  University of Edinburgh: 1GbE  AIST, Tsukuba: 10GbE  Generated files: 1MB, 500MB, 1GB, 10GB and 30GB

21 UDR fastest (** without compression). Elapsed time by file size. Columns: File size | Rsync | UDR | Rsync-c | UDR-c. Rows: 1MB 0 0 0 0; 500MB 3652015456; 1GB 7303779120; 10GB 672236430001140; 30GB 1630108075603360


23  Front-end: GUI using Java Swing  Back-end: Decision tree  Data and metadata ◦ Data stored in a remote repository (NAS) ◦ Metadata collected in a remote database (MySQL)  Science gateway (web tool) connected with the repository and database ◦ Searching ◦ Visualizing ◦ Analyzing ◦ Downloading
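The transcript does not include the decision tree itself, so the following is only a sketch of what a back-end selection rule might look like if it were built from the two headline results reported on slides 19 and 21: SFTP fastest below 500MB and Rsync fastest from 500MB upward on the local network, and UDR fastest between distant sites. The function name, the boolean `long_distance` input, and the binary interpretation of 500MB are assumptions.

```python
# 500MB threshold taken from slide 19; using 1024-based megabytes is an assumption.
THRESHOLD_BYTES = 500 * 1024 * 1024


def select_protocol(file_size_bytes, long_distance):
    """Pick a transfer protocol from the two reported experiment outcomes.

    long_distance=True  -> "UDR"   (fastest between Edinburgh and Tsukuba)
    size < 500MB        -> "SFTP"  (fastest on the local network)
    size >= 500MB       -> "Rsync" (fastest on the local network)
    """
    if long_distance:
        return "UDR"
    if file_size_bytes < THRESHOLD_BYTES:
        return "SFTP"
    return "Rsync"
```

FTPS never wins on speed under this rule; a fuller tree would presumably also branch on requirements such as security level or synchronization mode.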


25  FAST has been evaluated: ◦ By using synthetic programs for generating data  Real time and non-real time  For each type of synchronization  Different data sizes, and different types of network locations  Short- and long-term experiments  Stop and restart ◦ For transferring data from a real rock physics experiment  Laboratory: UCL (London) and Edinburgh  Duration: 45 days  Interval: every minute  Rock samples: 1
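The "stop and restart" scenario implies that transfer progress survives a shutdown. One way to do that, shown here purely as an illustration, is to persist per-file offsets to a small JSON checkpoint. The slides do not describe FAST's actual mechanism; `save_state`, `load_state`, and the checkpoint layout are hypothetical.

```python
import json
import os


def save_state(path, offsets):
    """Write the {file name: bytes already transferred} map to `path`.

    The data goes to a temporary file first and is then renamed, so an
    interruption during the write cannot leave a half-written checkpoint.
    """
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(offsets, f)
    os.replace(tmp, path)


def load_state(path):
    """Return the saved offsets, or an empty map if no checkpoint exists."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {}
```

On restart, the transfer loop would call `load_state` and resume each file from its recorded offset instead of resending everything.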


27  Use FAST in the Creep-2 experiment  Implement FAST policies ◦ Data available in the repository for specific users during a reasonable period  Sharing data from many-to-many locations  Decision tree ◦ Automate its generation and maintenance ◦ Keep it up to date by measuring transfers  Use FAST in more rock physics laboratories  Use FAST in other disciplines

28  email:
