Reliable Data Movement Framework for Distributed Science Environments Raj Kettimuthu Argonne National Laboratory and The University of Chicago.


1 Reliable Data Movement Framework for Distributed Science Environments Raj Kettimuthu Argonne National Laboratory and The University of Chicago

2 07/14/2008 PDPTA'08 Outline
- Introduction
- Motivation
- Data Transfer Problem
- Requirements
- Reliable Data Movement Framework
- Future Directions

3 Today’s Science Environments
- The science environment today is very different
- Large-scale collaborative science is becoming increasingly common
- The need for a distributed community of users to access and analyze large amounts of data reliably is a fundamental requirement
- This requirement arises in both simulation and experimental sciences

4 Simulation Science
- In simulation science, the data sources are supercomputer simulations
  - For example, DOE-funded climate modeling groups generate large reference simulations at supercomputer centers
  - Many climate scientists need to extract and analyze subsets of this data in various ways
- Combustion, fusion, computational chemistry, and astrophysics communities have similar requirements for remote and distributed data analysis

5 Experimental Science
- Data sources are facilities such as high energy and nuclear physics experiments and light sources
  - For example, the experimental program based upon the LHC at CERN will produce petabytes of raw data per year for approximately 15 years
  - Thousands of physicists worldwide will participate in the production and analysis of simulated and derived data sets from this raw experimental data
- DOE light sources can also produce large quantities of data that must be distributed, analyzed, and visualized
- The international fusion experiment, ITER

6 Science Environments
- Raw simulation or observational data is just a starting point for most investigations
- Understanding comes from further analysis, reduction, visualization, and exploration
- Analysis must often be performed on a different class of petascale resource, a smaller resource such as a cluster, or even a scientist’s desktop
- Furthermore, the data is a community asset that must be accessible to any member of a distributed collaboration

7 Network Capabilities
- Scientist A is in California
- Scientist B is in New York
- Both are connected through the Internet
- Scientist A wants to transfer 1 Terabyte of data to Scientist B
- What is the fastest way to transfer the data?

8 Network Capabilities
- Until a few years ago, the Tri-labs (Los Alamos, Lawrence Livermore, and Sandia) transferred data on tapes shipped via FedEx
- To transfer 100 TB in 24 hours, a sustained data rate > 9.5 Gbit/s is needed
- 10 Gbit/s networks are becoming increasingly common in scientific environments
  - DOE’s ESnet, UltraScience Net, Science Data Networks, and Internet2 have 10 Gbit/s or higher links
  - Thanks to advances in networking technologies
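The sustained-rate figure on this slide can be checked with a short calculation. The sketch below is only arithmetic; the function name is hypothetical, and the answer depends on whether a terabyte is taken as 10^12 bytes or 2^40 bytes, which the slide does not state. The slide's 9.5 Gbit/s falls between the two results.

```python
def required_gbit_per_s(num_bytes: int, seconds: int) -> float:
    """Sustained rate, in Gbit/s (10**9 bits per second), needed to move num_bytes in seconds."""
    return num_bytes * 8 / seconds / 1e9


DAY = 24 * 60 * 60  # seconds in 24 hours

# Assumption: the slide does not say which definition of "TB" it uses.
decimal_tb = required_gbit_per_s(100 * 10**12, DAY)  # 1 TB = 10**12 bytes
binary_tb = required_gbit_per_s(100 * 2**40, DAY)    # 1 TB = 2**40 bytes

print(f"decimal TB: {decimal_tb:.2f} Gbit/s")
print(f"binary TB:  {binary_tb:.2f} Gbit/s")
```

Either way, the requirement sits at or just below the capacity of a single 10 Gbit/s link, which is why such links have to be driven close to line rate for a transfer of this size to finish in a day.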

9 ESnet

10 End-to-end problem
- Now that high-speed networks are available, can we move data at network speeds on the network?
- What if the speed of airplanes had increased by the same factor as computers over the last 50 years, namely five orders of magnitude?

11 End-to-end problem
- Data movement in distributed science environments is an end-to-end problem
- A 10 Gbit/s network link between the source and destination does not guarantee an end-to-end data rate of 10 Gbit/s
- Other factors matter, such as the storage system, the disks, and the data rate supported by the end nodes
- Failures of various sorts must be dealt with
  - Firewalls can cause difficulties

12 End-to-end data transfer
- Efficient and robust wide-area data transport requires the management of complex systems at multiple levels
- For example, in recent work, we required 32 hosts connected at 1 Gbit/s to drive a 30 Gbit/s connection
- Effective end-to-end data transfers thus demand a systems approach
  - Integrates file systems, computers, network interfaces, and network protocols
  - Encapsulated in easily usable and portable software
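The host count quoted on this slide can be put in perspective with simple arithmetic. This is a sketch with hypothetical function names; the slide does not say how the hosts were arranged or how evenly the load was spread, so the code only shows the lower bound on host count and the average rate each host would have to sustain.

```python
import math


def min_hosts(target_gbit: float, per_host_gbit: float) -> int:
    """Fewest hosts that could fill the link if every host ran at its full link rate."""
    return math.ceil(target_gbit / per_host_gbit)


def average_rate_per_host(target_gbit: float, hosts: int) -> float:
    """Average rate (Gbit/s) each host must sustain to fill the link."""
    return target_gbit / hosts


print(min_hosts(30, 1))               # lower bound with perfectly efficient hosts
print(average_rate_per_host(30, 32))  # average per-host rate with 32 hosts
```

With 32 hosts, each one still has to run at roughly 94% of its 1 Gbit/s link on average, which illustrates why end-node performance matters as much as the wide-area link.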

13 Requirements: Fast, Secure, Reliable, Extensible, Standard, Robust

14 GridFTP
- High-performance, reliable data transfer protocol optimized for high-bandwidth wide-area networks
- Based on the FTP protocol; defines extensions for high-performance operation and security
- Standardized through the Open Grid Forum (OGF)
- GridFTP is the OGF-recommended data movement protocol

15 GridFTP
- We (the Globus Alliance) supply a reference implementation:
  - Server
  - Client tools
  - Development libraries
- Multiple independent implementations can interoperate
  - Fermilab and U. Virginia have home-grown servers that work with ours

16 Requirements: Fast, Secure, Reliable, Extensible, Standard, Robust

17 GridFTP
- Two-channel protocol, like FTP
- Control Channel
  - Communication link (TCP) over which commands and responses flow
  - Low bandwidth; encrypted and integrity protected by default
- Data Channel
  - Communication link(s) over which the actual data of interest flows
  - High bandwidth; authenticated by default; encryption and integrity protection optional

18 Globus GridFTP Features
- GridFTP is fast
  - Parallel TCP streams
  - Non-TCP protocols such as UDT
  - Set TCP buffer sizes
  - Order-of-magnitude improvement
- Cluster-to-cluster data movement
  - Coordinated data movement using multiple computers at each end
  - Another order of magnitude
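Two of the techniques listed on this slide, setting TCP buffer sizes and using parallel streams, can be illustrated with a minimal Python sketch. It is not Globus code. It assumes a hypothetical 10 Gbit/s path with a 60 ms round-trip time, sizes buffers from the bandwidth-delay product, and divides a file into contiguous byte ranges, one per stream. The helper names are invented for illustration, and the contiguous split is a simplification rather than a description of how GridFTP schedules blocks.

```python
from __future__ import annotations

import socket


def bandwidth_delay_product(bits_per_s: int, rtt_ms: int) -> int:
    """Bytes that must be in flight to keep the path full (bandwidth x round-trip time)."""
    return bits_per_s * rtt_ms // 8000  # /1000 for ms, /8 for bits to bytes


def split_ranges(file_size: int, streams: int) -> list[tuple[int, int]]:
    """Divide [0, file_size) into contiguous (offset, length) pieces, one per stream."""
    base, extra = divmod(file_size, streams)
    ranges = []
    offset = 0
    for i in range(streams):
        length = base + (1 if i < extra else 0)  # spread the remainder over the first streams
        ranges.append((offset, length))
        offset += length
    return ranges


def request_buffers(sock: socket.socket, size: int) -> int:
    """Ask the OS for send/receive buffers of `size` bytes; return the send buffer now in effect."""
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, size)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size)
    except OSError:
        pass  # some systems reject sizes above their configured maximum
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)


# Assumed path characteristics (illustrative values, not from the slides).
bdp = bandwidth_delay_product(10_000_000_000, 60)
streams = 4
per_stream_buffer = bdp // streams

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
granted = request_buffers(sock, per_stream_buffer)
sock.close()

print(f"BDP: {bdp} bytes, per-stream buffer requested: {per_stream_buffer} bytes")
print(f"send buffer in effect: {granted} bytes")
print(split_ranges(10, 4))
```

The operating system may cap or adjust the requested buffer size, so the value in effect can differ from the request. Splitting the window across several streams is one reason parallel streams help when a single large buffer is not available.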

19 Cluster-to-Cluster transfers
[Diagram labels: Control node, Data node]

20 Performance
- Memory transfer between Urbana, IL and San Diego, CA

21 Performance
- Disk transfer between Urbana, IL and San Diego, CA

22 Requirements: Fast, Secure, Reliable, Extensible, Standard, Robust

23 Security
- There is often a need to authenticate clients and control access to the data
- Globus GridFTP supports multiple security mechanisms to authenticate and authorize clients
  - Anonymous access
  - Username/password
  - SSH security
  - Grid Security Infrastructure (GSI)

24 Requirements: Fast, Secure, Reliable, Extensible, Standard, Robust

25 Modular
[Diagram labels: Data Source or Sink, Data Storage Interface, Network I/O Module, Data Processing Module, Network]
- Well-defined interfaces
- Data Storage Interface
  - POSIX file system
  - High Performance Storage System (HPSS)
  - Storage Resource Broker (SRB)
  - Freeloader (under development)

26 Modular
- Network I/O module
  - TCP
  - Easy to plug in external libraries
  - UDT
  - Phoebus
- Data processing module
  - Compression (under development)
  - Checksum
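The modular layout on these two slides (a storage interface, a data processing module, and a network I/O module joined by well-defined interfaces) can be sketched in a few lines of Python. This illustrates the idea only and is not the Globus interfaces. Every name here is hypothetical, an in-memory buffer stands in for both storage and network, and SHA-256 is an arbitrary choice because the slides do not name a checksum algorithm.

```python
import hashlib
import io
from typing import BinaryIO, Callable, Iterator


def read_blocks(source: BinaryIO, block_size: int) -> Iterator[bytes]:
    """Storage side: yield fixed-size blocks from any file-like data source."""
    while True:
        block = source.read(block_size)
        if not block:
            return
        yield block


class ChecksumModule:
    """Processing stage: passes blocks through unchanged while hashing them."""

    def __init__(self) -> None:
        self._digest = hashlib.sha256()  # assumed algorithm, for illustration only

    def process(self, block: bytes) -> bytes:
        self._digest.update(block)
        return block

    def hexdigest(self) -> str:
        return self._digest.hexdigest()


def transfer(source: BinaryIO, processor: ChecksumModule,
             send: Callable[[bytes], object], block_size: int = 4) -> int:
    """Move data from storage, through the processing stage, to the network side."""
    sent = 0
    for block in read_blocks(source, block_size):
        send(processor.process(block))
        sent += len(block)
    return sent


payload = b"climate simulation output"
received = bytearray()           # stands in for the network module's send path
checksum = ChecksumModule()
count = transfer(io.BytesIO(payload), checksum, received.extend)

print(count, checksum.hexdigest())
```

Because `transfer` only depends on a file-like source, a `process` method, and a `send` callable, any of the three pieces can be swapped without touching the others, which is the point of the well-defined interfaces.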

27 Requirements: Fast, Secure, Reliable, Extensible, Standard, Robust

28 GridFTP in production
- GridFTP has been around for many years
- Many scientific communities rely on GridFTP
  - The HEP community is basing its entire tiered data movement infrastructure for the LHC Computing Grid on GridFTP
  - Southern California Earthquake Center (SCEC), Laser Interferometer Gravitational-Wave Observatory (LIGO), Earth System Grid (ESG), Relativistic Heavy Ion Collider (RHIC), and Advanced Photon Source use GridFTP for data movement
  - European Space Agency, Disaster Recovery Center in Japan, and British Broadcasting Corporation move large volumes of data using GridFTP
- GridFTP facilitates an average of more than 3 million data transfers every day

29 Requirements: Fast, Secure, Reliable, Extensible, Standard, Robust

30 Handling failures
- GridFTP server sends restart and performance markers periodically
  - Default: every 5 s (configurable)
- Helpful if there is any failure
  - No need to transfer the entire file again
  - Can start from the last restart marker
- GridFTP supports partial file transfers
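The restart idea on this slide can be sketched as follows. The code assumes that each restart marker can be reduced to a half-open byte range `(start, end)` already stored at the destination; that representation and the function names are assumptions for illustration, not the protocol's wire format. Given the ranges seen before a failure, the sketch computes what still needs to be sent.

```python
def merge_ranges(ranges):
    """Coalesce overlapping or adjacent half-open (start, end) byte ranges."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def remaining_ranges(file_size, received):
    """Return the byte ranges of [0, file_size) not covered by `received`."""
    missing = []
    cursor = 0
    for start, end in merge_ranges(received):
        if start > cursor:
            missing.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < file_size:
        missing.append((cursor, file_size))
    return missing


# Ranges acknowledged before the failure (hypothetical values; they may arrive out of order).
markers = [(0, 100), (300, 400), (100, 200)]
print(remaining_ranges(500, markers))  # → [(200, 300), (400, 500)]
```

Only the missing ranges need to be requested again, which is where partial file transfers come in: the client asks for each gap rather than the whole file.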

31 GridFTP clients
- globus-url-copy: commonly used command-line client
- Many people have developed clients independent of the Globus Project
  - UberFTP
- These clients support transfer retries and recover from server failures
- What if the client fails in the middle of a transfer?

32 Globus Reliable File Transfer Service (RFT)
- GridFTP client that provides more reliability
- GridFTP: on-demand transfer service
  - Not a queuing service
- RFT
  - Queues requests
  - Orchestrates transfers on client’s behalf
  - Writes to persistent store
  - Recovers from GridFTP and RFT service failures
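The combination of queuing, a persistent store, and recovery described on this slide can be illustrated with a small Python sketch. It is not RFT's interface or schema. It uses SQLite as a stand-in persistent store, and the class, table, and state names as well as the `example.org` hosts are all hypothetical. The point is that request state survives a service restart, so unfinished transfers can be picked up again.

```python
import os
import sqlite3
import tempfile


class TransferQueue:
    """Toy persistent queue: every state change is committed to disk."""

    def __init__(self, db_path: str) -> None:
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS transfers ("
            "id INTEGER PRIMARY KEY, source TEXT NOT NULL, dest TEXT NOT NULL, "
            "state TEXT NOT NULL DEFAULT 'queued')"
        )
        self.db.commit()

    def submit(self, source: str, dest: str) -> int:
        cur = self.db.execute(
            "INSERT INTO transfers (source, dest) VALUES (?, ?)", (source, dest)
        )
        self.db.commit()
        return cur.lastrowid

    def mark(self, transfer_id: int, state: str) -> None:
        self.db.execute(
            "UPDATE transfers SET state = ? WHERE id = ?", (state, transfer_id)
        )
        self.db.commit()

    def recover(self) -> list:
        """Requeue transfers left 'active' by a crash; return ids still to be done."""
        self.db.execute("UPDATE transfers SET state = 'queued' WHERE state = 'active'")
        self.db.commit()
        rows = self.db.execute(
            "SELECT id FROM transfers WHERE state = 'queued' ORDER BY id"
        ).fetchall()
        return [row[0] for row in rows]

    def close(self) -> None:
        self.db.close()


path = os.path.join(tempfile.mkdtemp(), "rft_sketch.db")

queue = TransferQueue(path)
a = queue.submit("gsiftp://source.example.org/data/a", "gsiftp://dest.example.org/data/a")
b = queue.submit("gsiftp://source.example.org/data/b", "gsiftp://dest.example.org/data/b")
queue.mark(a, "done")
queue.mark(b, "active")
queue.close()  # simulate the service stopping mid-transfer

restarted = TransferQueue(path)
pending = restarted.recover()
restarted.close()
print(pending)
```

After the simulated restart, only the transfer that was in flight is returned for retry; the completed one is not repeated. A fuller version would also store the last restart marker for each transfer so that a retry resumes partway through the file.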

33 RFT
[Diagram labels: RFT Service, Client, SOAP Messages, Notifications (Optional), GridFTP Server, GridFTP Server, CC, DC, Persistent Store]

34 Requirements: Fast, Secure, Reliable, Extensible, Standard, Robust

35 Best effort service
- Data movement in distributed environments is still on a best-effort basis
- No Quality of Service (QoS) guarantees
- Network is shared
- Limited disk space
  - Destination might run out of space in the middle of a transfer
- End node, network, or disk can fail at any time

36 Better than best effort
- Advances in network and storage reservations
  - Internet2 Dynamic Circuits Network
  - ESnet OSCARS
  - DOE-sponsored LambdaStation and TeraPaths
  - Reserve bandwidth on the network
  - Storage Resource Managers (SRM) and NeST allow disk space to be reserved

37 Better than best effort
[Diagram labels: GridFTP Server, System Info Provider, File System, GridFTP Info Provider, CPU, Memory, BW, Resource Limiter, Ad, Control Channel, Ad, NeST Reservation Interface, Bulk Transfer Service, Network Reservation Service, Data Point 1, Data Point 2, …, Data Point n, GridFTP Resource Allocation]

38 Firewall
[Diagram labels: GridFTP Source Server, GridFTP Dest Server, Client, TCP 2811, DATA]

39 Connection Broker
[Diagram labels: GridFTP Source Server, GridFTP Dest Server, Client, TCP 2811, CB, DATA, IP 4-tuple, Temporary hole]

40 Links and contacts
- GridFTP is available in the Globus Toolkit
- Latest version available at http://www.globus.org/toolkit/downloads/4.2.0/
- Documentation available at http://www.globus.org/toolkit/docs/4.2/4.2.0/data/gridftp/index.html
- Simple to install
  - configure; make gridftp install
  - Installs only GridFTP and its dependencies
  - Binaries available for many platforms
- gridftp-user@globus.org, gridftp-dev@globus.org
- kettimut@mcs.anl.gov

41 Questions

