Presentation is loading. Please wait.

Presentation is loading. Please wait.

Reliable Data Movement Framework for Distributed Petascale Science Raj Kettimuthu Argonne National Laboratory and The University of Chicago.

Similar presentations


Presentation on theme: "Reliable Data Movement Framework for Distributed Petascale Science Raj Kettimuthu Argonne National Laboratory and The University of Chicago."— Presentation transcript:

1 Reliable Data Movement Framework for Distributed Petascale Science Raj Kettimuthu Argonne National Laboratory and The University of Chicago

2 09/08/2008Portland State University Outline l Introduction l Motivation l Data Transfer Problem l Requirements l Reliable Data Movement Framework l Future Directions

3 09/08/2008Portland State University Today’s Science Environments l Large-scale collaborative science is becoming increasingly common l Distributed community of users to access and analyze large amounts of data Fusion community’s International ITER project

4 09/08/2008Portland State University Simulation Science l In simulation science, the data sources are supercomputer simulations u For eg, DOE-funded climate modeling groups generate large reference simulations at supercomputer centers l Combustion, fusion, computational chemistry, and astrophysics communities have similar requirements for remote and distributed data analysis

5 09/08/2008Portland State University Experimental Science Data sources are facilities such as high energy and nuclear physics experiments and light sources. u For eg, LHC at CERN will produce petabytes of raw data per year for 15 years l DOE light sources can also produce large quantities of data that must be distributed, analyzed, and visualized l The international fusion experiment, ITER

6 09/08/2008Portland State University Science Environments l Raw simulation or observational data is just a starting point for most investigations l Understanding comes from further analysis, reduction, visualization, and exploration l Furthermore the data is a community asset that must be accessible to any member of a distributed collaboration Petascale resourceCompute ClusterScientist’s Desktop

7 09/08/2008Portland State University Network Capabilities l Scientist A wants to transfer 1 Terabyte of data to Scientist B l What is the fastest way to transfer the data? Scientist A in CaliforniaScientist B in New York net Inter

8 09/08/2008Portland State University Network Capabilities l Scientist A wants to transfer 1 Terabyte of data to Scientist B l What is the fastest way to transfer the data? Scientist A in CaliforniaScientist B in New York FedEx net Inter

9 09/08/2008Portland State University Network Capabilities l Until a few years ago, Tri-labs (Los Alamos, Lawrence Livermore and Sandia) transferred data via tapes sent thru fedex l To transfer 100 TB in 24 hours, need a sustained data rate > 9.5 Gbit/s l 10 Gbit/s networks are becoming increasingly common in scientific environments u DOE’s ESNet, UltraScience Net, Science Data Networks and Internet2 have 10Gb/s or higher links u Thanks to the advancement in networking technologies

10 09/08/2008Portland State University ESNET

11 09/08/2008Portland State University End-to-end problem l Now that high-speed networks are available, can we move data at network speeds on the network? l What if the speed of airplanes had increased by the same factor as computers over the last 50 years, namely five orders of magnitude?

12 09/08/2008Portland State University End-to-end problem l Now that high-speed networks are available, can we move data at network speeds on the network? l What if the speed of airplanes had increased by the same factor as computers over the last 50 years, namely five orders of magnitude? We would be able to cross US in less than a second

13 09/08/2008Portland State University End-to-end problem l Now that high-speed networks are available, can we move data at network speeds on the network? l What if the speed of airplanes had increased by the same factor as computers over the last 50 years, namely five orders of magnitude? We would be able to cross US in less than a second Yes. But it would still take two hours to get to downtown

14 09/08/2008Portland State University End-to-end problem l Data movement in distributed science environments is an end-to-end problem l A 10 Gbit/s network link between the source and destination does not guarantee an end-to-end data rate of 10 Gbit/s l Other factors such as storage system, disk, data rate supported by the end node l Deal with failures of various sorts u Firewalls can cause difficulties

15 09/08/2008Portland State University End-to-end data transfer 30 Gb/s Node 1 Node 2 Node 32 Node 1 Node 2 Node 32 1 Gbit/s Efficient and robust wide area data transport requires the management of complex systems at multiple levels. San Diego, CAUrbana, IL

16 09/08/2008Portland State University Requirements l Fast l Easy-to-use l Secure l Reliable l Extensible l Standard l Robust

17 09/08/2008Portland State University GridFTP l High-performance, reliable data transfer protocol optimized for high-bandwidth wide- area networks l Based on FTP protocol - defines extensions for high-performance operation and security l Standardized through Open Grid Forum (OGF) l GridFTP is the OGF recommended data movement protocol

18 09/08/2008Portland State University GridFTP l We (Globus Alliance) supply a reference implementation: u Server u Client tools u Development Libraries l Multiple independent implementations can interoperate u Fermi Lab and U. Virginia have home grown servers that work with ours

19 09/08/2008Portland State University GridFTP l Two channel protocol like FTP l Control Channel u Communication link (TCP) over which commands and responses flow u Low bandwidth; encrypted and integrity protected by default l Data Channel u Communication link(s) over which the actual data of interest flows u High Bandwidth; authenticated by default; encryption and integrity protection optional

20 09/08/2008Portland State University Globus GridFTP Features l GridFTP is Fast u Parallel TCP streams u Non TCP protocol such as UDT u Set optimal TCP buffer sizes u Order of magnitude greater l Cluster-to-cluster data movement u Co-ordinated data movement using multiple computers at each end u Another order of magnitude “Grid-enabled Particle Physics Event Analysis: Experiences Using a 10 Gb, High-latency Network for a High-Energy Physics Application”, FGCS Journal, August 2003

21 09/08/2008Portland State University Cluster-to-Cluster transfers Control node Data node

22 09/08/2008Portland State University Performance l Mem. transfer between Urbana, IL and San Diego, CA

23 09/08/2008Portland State University Performance l Disk transfer between Urbana, IL and San Diego, CA "The Globus Striped GridFTP Framework and Server”, ACM/IEEE conference on Supercomputing (SC'05)

24 09/08/2008Portland State University Security l Often there is need to authenticate clients and control access to the data l Globus GridFTP supports multiple security mechanisms to authenticate and authorize clients u Anonymous access u Username/password u SSH security u Grid Security Infrastructure (GSI)

25 09/08/2008Portland State University Easy-to-use l Simple to install u Configure; make gridftp install; u Installs only gridftp and its dependencies u Binaries available for many platforms l Various clients u Command-line client - globus-url-copy u Client libraries - well-defined API u Graphical User Interface

26 09/08/2008Portland State University GUI Client

27 09/08/2008Portland State University Requirements l Fast l Secure l Reliable l Extensible l Standard l Robust l Easy-to-use

28 09/08/2008Portland State University Modular Data Source or Sink Data Storage Interface Network I/O Module Data Processing Module Well defined interfaces Data Storage Interface (DSI) - POSIX file system - High Performance Storage System (HPSS) - Storage Resource Broker (SRB) net "Globus Data Storage Interface (DSI) - Enabling Easy Access to Grid Datasets”, Data Grids Workshop 2006

29 09/08/2008Portland State University Modular l Network I/O module u Simple Open/Close/Read/Write interface u Well-defined abstraction called drivers u Easy to plug-in external libraries u TCP, UDT, Phoebus l Data processing module u Compression (under development) u Checksum "The Globus eXtensible Input/Output System (XIO): A protocol independent IO system for the Grid”, IEEE IPDPS 2005

30 09/08/2008Portland State University GridFTP in production l Many Scientific communities rely on GridFTP u High Energy Physics - LHC computing Grid u Southern California Earthquake Center (SCEC), Earth Systems Grid (ESG), Relativistic Heavy Ion Collider (RHIC), European Space Agency, BBC use GridFTP for data movement l GridFTP facilitates an average of more than 3 million data transfers every day

31 09/08/2008Portland State University GridFTP Servers Around the World Created by Lydia Prieto ; G. Zarrate; Anda Imanitchi (Florida State University) using MaxMind's GeoIP technology (http://www.maxmind.com/app/ip-locate).http://www.maxmind.com/app/ip-locate

32 09/08/2008Portland State University GridFTP in production 30x speedup over 9688 miles 80 MB/s sustained over 4500 miles 1.5 terabyte moved from University of Wisconsin, Milwaukee to Hannover, Germany at a sustained rate of 80 megabyte/sec One terabyte moved from an Advanced Photon Source tomography beamline to Australia, at a rate 30x faster than standard FTP

33 09/08/2008Portland State University Handling failures l GridFTP server sends restart and performance markers periodically u Default every 5s - configurable l Helpful if there is any failure u No need to transfer the entire file again u Use restart markers and transfer only the missing pieces l GridFTP supports partial file transfers

34 09/08/2008Portland State University Server failure l Command-line client - globus-url-copy - support transfer retries u Use restart markers l Recover from server and connection failures l What if the client fails in the middle of a transfer?

35 09/08/2008Portland State University Globus Reliable File Transfer Service (RFT) l GridFTP client that provides more reliability l GridFTP - on demand transfer service u Not a queuing service l RFT u Queues requests u Orchestrates transfers on client’s behalf u Writes to persistent store u Recovers from GridFTP and RFT service failures

36 09/08/2008Portland State University RFT RFT Service Client SOAP Messages Notifications (Optional) GridFTP Server GridFTP Server CC DC Persistent Store

37 09/08/2008Portland State University RFT RFT Service Client SOAP Messages Notifications (Optional) GridFTP Server CC DC Persistent Store

38 09/08/2008Portland State University RFT RFT Service Client SOAP Messages Notifications (Optional) GridFTP Server GridFTP Server CC DC Persistent Store

39 09/08/2008Portland State University RFT RFT Service Client SOAP Messages Notifications (Optional) GridFTP Server GridFTP Server CC DC Persistent Store

40 09/08/2008Portland State University RFT RFT Service Client SOAP Messages Notifications (Optional) GridFTP Server GridFTP Server CC DC Persistent Store

41 09/08/2008Portland State University RFT Client SOAP Messages Notifications (Optional) GridFTP Server GridFTP Server CC DC Persistent Store

42 09/08/2008Portland State University RFT RFT Service Client SOAP Messages Notifications (Optional) GridFTP Server GridFTP Server CC DC Persistent Store

43 09/08/2008Portland State University RFT RFT Service Client SOAP Messages Notifications (Optional) GridFTP Server GridFTP Server CC DC Persistent Store

44 09/08/2008Portland State University RFT RFT Service Client SOAP Messages Notifications (Optional) GridFTP Server GridFTP Server CC DC Persistent Store

45 09/08/2008Portland State University Requirements l Fast l Secure l Reliable l Extensible l Standard l Robust l Easy-to-use GridFTP

46 09/08/2008Portland State University GridFTP Overlay Network BW sd BW sa BW ab BW bc BW cd If Min(BW sa, BW ab, BW bc, BW cd ) > BW sd, Overlay route yields better performance

47 09/08/2008Portland State University Best effort service l Data movement in distributed environments is on best effort basis l No Quality of Service (QoS) guarantees l Network is shared l Limited disk space u Destination might run out of space in the middle of a transfer l End node, network, disk can fail any time

48 09/08/2008Portland State University Managed Data Movement RFT Service (Co-Scheduling) GridFTP Server GridFTP Server CC DC Persistent Store CPU MemoryBW Resource Limiter CPU MemoryBW Resource Limiter Network Bandwidth Reservation Service Storage System Storage System Storage Reservation Manager Storage Reservation Manager GridFTP Connection Broker GridFTP Connection Broker

49 09/08/2008Portland State University Dynamic Selection of Protocols l Compose protocol stack based on user needs and underlying network capabilities End-point AEnd-point B net Special work End-point AEnd-point B net Inter Compression TCP Infiniband UDP based

50 09/08/2008Portland State University Acknowledgments l John Bresnahan l Mike Link l Ravi Madduri l Martin Feller l Gaurav Khanna l Liu Wantao

51 09/08/2008Portland State University Questions


Download ppt "Reliable Data Movement Framework for Distributed Petascale Science Raj Kettimuthu Argonne National Laboratory and The University of Chicago."

Similar presentations


Ads by Google