CEOS Working Group on Information Systems and Services: Data Services Task Team Discussions on GRID and GRIDftp. Stuart Doescher, USGS. WGISS-15, May 2003.


1 CEOS Working Group on Information Systems and Services
Data Services Task Team: Discussions on GRID and GRIDftp
Stuart Doescher, USGS
WGISS-15, May 2003, Toulouse, France

2 The Grid Problem
- Flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions, and resources
  (From “The Anatomy of the Grid: Enabling Scalable Virtual Organizations”)
- Enable communities (“virtual organizations”) to share geographically distributed resources as they pursue common goals, assuming the absence of:
  - central location
  - central control
  - omniscience
  - existing trust relationships

3 The Data Grid Problem
“Enable a geographically distributed community [of thousands] to perform sophisticated, computationally intensive analyses on Petabytes of data”
- Sounds like a separate class of problem, but is actually a superset.
- So all work done on “Grid Problems” applies to “DataGrid Problems”. We just need some additional tools.

4 Globus Approach
- Software toolkit addressing key technical areas
  - Offer a modular “bag of technologies”
  - Enable incremental development of grid-enabled tools and applications
  - Define and standardize grid protocols and APIs (Our software development supports this goal.)
- Focus is on inter-domain issues, not clustering
  - Supports collaborative resource use spanning multiple organizations
  - Integrates cleanly with intra-domain services
  - Creates a “collective” service layer

5 Major Data Grid Projects
- Earth System Grid (DOE Office of Science)
  - DG technologies, climate applications
- European Data Grid (EU)
  - DG technologies & deployment in EU
- GriPhyN: Grid Physics Network (NSF ITR)
  - Investigation of “Virtual Data” concept
- Particle Physics Data Grid (DOE Science)
  - DG applications for HENP experiments


7 Basic Data Grid Services
1. GridFTP: Data Transfer and Access
   - Common protocol for data movement
     - Secure, efficient, reliable, flexible, extensible, etc.
     - Grid Forum (Internet) Draft
   - Family of tools supporting this protocol
     - Wu-ftpd, ncftp, Globus Toolkit SDKs, etc.
2. Replica Management Architecture
   - Simple scheme for managing:
     - multiple copies of files
     - collections of files

8 GridFTP: Basic Approach
- FTP is defined by several IETF RFCs
- Start with most commonly used subset
  - Standard FTP: get/put etc., 3rd-party transfer
- Implement standard but often unused features
  - GSS binding, extended directory listing, simple restart
- Extend in various ways, while preserving interoperability with existing servers
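
The third-party transfer mentioned on this slide can be illustrated with the control-channel commands that standard FTP (RFC 959) already provides: the controlling client puts one server into passive mode, then hands the resulting address to the other server with PORT. The sketch below only builds the command strings and opens no connections. The function names and the example address are hypothetical and do not come from the slides.

```python
import re


def parse_pasv_reply(reply: str) -> tuple[str, int]:
    """Extract (host, port) from an RFC 959 '227' passive-mode reply."""
    match = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", reply)
    if match is None or not reply.startswith("227"):
        raise ValueError(f"not a 227 passive-mode reply: {reply!r}")
    h1, h2, h3, h4, p1, p2 = (int(g) for g in match.groups())
    # The port is sent as two bytes: high byte first, then low byte.
    return f"{h1}.{h2}.{h3}.{h4}", p1 * 256 + p2


def port_command(host: str, port: int) -> str:
    """Build the PORT command that points a second server at (host, port)."""
    fields = host.split(".") + [str(port // 256), str(port % 256)]
    return "PORT " + ",".join(fields)


def third_party_plan(pasv_reply: str, filename: str) -> dict[str, list[str]]:
    """Commands the controlling client sends to each server.

    The source server listens (PASV); the destination server is told to
    connect to it (PORT), so data flows server to server.
    """
    host, port = parse_pasv_reply(pasv_reply)
    return {
        "source": ["PASV", f"RETR {filename}"],
        "destination": [port_command(host, port), f"STOR {filename}"],
    }


if __name__ == "__main__":
    # Hypothetical reply from the source server (documentation address range).
    plan = third_party_plan("227 Entering Passive Mode (192,0,2,10,19,137)", "scene.tar")
    for server, commands in plan.items():
        print(server, commands)
```

In practice the client sends PASV to the source first, waits for the 227 reply, and only then issues PORT and STOR to the destination followed by RETR to the source.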

9 Features of GridFTP
- Grid Security Infrastructure and Kerberos support: robust and flexible authentication, integrity, and confidentiality
- Third-party control of data transfer: a user or application at one site initiates, monitors, and controls a data transfer between two other sites
- Parallel data transfer: on wide-area links, use multiple TCP streams in parallel between the same source and destination
- Striped data transfer: use multiple TCP streams to transfer data that is striped or interleaved across multiple servers
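
As a rough illustration of the parallel-transfer idea, the sketch below divides a file into contiguous byte ranges, one per TCP stream. This is a simplification for illustration only: it is not how a GridFTP implementation actually schedules blocks across streams, and the function name is hypothetical.

```python
def split_into_ranges(file_size: int, streams: int) -> list[tuple[int, int]]:
    """Return (offset, length) pairs that cover file_size bytes across streams.

    Ranges are contiguous and differ in length by at most one byte.
    Streams that would receive zero bytes are omitted.
    """
    if file_size < 0 or streams < 1:
        raise ValueError("file_size must be >= 0 and streams must be >= 1")
    base, extra = divmod(file_size, streams)
    ranges = []
    offset = 0
    for i in range(streams):
        # The first `extra` streams each carry one additional byte.
        length = base + (1 if i < extra else 0)
        if length == 0:
            continue
        ranges.append((offset, length))
        offset += length
    return ranges


if __name__ == "__main__":
    for offset, length in split_into_ranges(10, 3):
        print(f"stream reads bytes {offset}..{offset + length - 1}")
```

The same (offset, length) pairs also describe what a partial file transfer would request, since each range is an arbitrary region of the file.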

10 Features of GridFTP (cont.)
- Partial file transfer: standard FTP allows transfer of the remainder of a file starting at an offset. GridFTP supports transfers of arbitrary subsets or regions of a file
- Automatic negotiation of TCP buffer/window sizes: optimal settings for TCP buffer/window sizes can dramatically improve performance
- Support for reliable and restartable data transfer: the FTP standard includes basic features for restart that are not widely implemented. GridFTP exploits these features and extends them.
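
One common rule of thumb for the TCP buffer size mentioned above is the bandwidth-delay product: the number of bytes that must be in flight to keep a link full. The sketch below computes it with integer arithmetic. The link speed and round-trip time in the example are hypothetical values, not figures from the slides, and dividing the buffer evenly across parallel streams is a simplifying assumption.

```python
def bandwidth_delay_product_bytes(bandwidth_bits_per_s: int, rtt_ms: int) -> int:
    """Bytes in flight needed to fill a link: bandwidth * round-trip time."""
    if bandwidth_bits_per_s < 0 or rtt_ms < 0:
        raise ValueError("bandwidth and RTT must be non-negative")
    # bits/s * ms -> divide by 1000 for seconds and by 8 for bytes.
    return bandwidth_bits_per_s * rtt_ms // 8000


def per_stream_buffer_bytes(bandwidth_bits_per_s: int, rtt_ms: int, streams: int) -> int:
    """Buffer per stream if the total is shared evenly (rounded up)."""
    if streams < 1:
        raise ValueError("streams must be >= 1")
    total = bandwidth_delay_product_bytes(bandwidth_bits_per_s, rtt_ms)
    return -(-total // streams)  # ceiling division


if __name__ == "__main__":
    # Hypothetical link: 1 Gb/s with a 100 ms round-trip time.
    print(bandwidth_delay_product_bytes(10**9, 100))
    print(per_stream_buffer_bytes(10**9, 100, 4))
```

The example shows why default buffers of a few tens of kilobytes limit throughput on long, fast links, and why several parallel streams can partly compensate for small per-stream buffers.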

11 GridFTP for Efficient WAN Data Transfer
- Secure authentication
- Parallel transfer gets the job done quickly
- Partial file access gets only the required data
- Up to 2.8 Gb/s using a striped server architecture
[Figure labels: Parallel Transfer: fully utilizes bandwidth of network interface on single nodes. Striped Transfer: fully utilizes bandwidth of Gb+ WAN using multiple nodes. Parallel Filesystem.]

12 Current Data Delivery Process: FTP Based
- Pull: semi-anonymous FTP
  - Product is ready
  - Email sent to user with instructions and password
  - User connects via FTP as “anonymous” with the provided password
  - FTP daemon places the user in the appropriate directory
  - User pulls the data
- Push: routine data flows to high-volume users
  - Account provided on remote system
  - When data is available, it is pushed to the remote system

13 Potential Future Data Delivery: GridFTP Based
- For routine multiple-usage customers
  - Establish “certificate process” with customer
    - Self-signed certificate authority
    - Customer generates private/public key pair
    - Generate user certificate with public key
    - Add user certificate to list of trusted users
  - Customer must install GridFTP client
    - Globus Toolkit data management client bundle
    - Gsincftp
    - Java Commodity Grid Kit for Windows

14 Potential Future Data Delivery: GridFTP Based
- For routine multiple-usage customers
  - Pull
    - Product is ready
    - Email notifies user that data is ready
    - User, using GridFTP with the user certificate for authentication, is granted access and pulls the data
  - Push
    - Account provided on remote system, with host certificate and our user certificate
    - These Grid certificates establish a Virtual Organization between the two parties
    - When data is available, GridFTP is used to push it to the remote system

15 Potential Future Data Delivery: GridFTP Based
- For single-usage customers
  - Process to:
    - Establish “certificate process” with customer
    - Customer must install GridFTP client
  - Currently seems too complex (not worth the effort)
  - Would like to have a simplified method, such as:
    - Email a one-time-use “user certificate”
    - Integration with a browser's built-in GridFTP client

