CEOS Working Group on Information Systems and Services - 1 Data Services Task Team Discussions on GRID and GRIDftp Stuart Doescher, USGS WGISS-15 May 2003.

Presentation transcript:

CEOS Working Group on Information Systems and Services - 1 Data Services Task Team Discussions on GRID and GRIDftp Stuart Doescher, USGS WGISS-15 May 2003 Toulouse, France

CEOS Working Group on Information Systems and Services - 2
The Grid Problem
o Flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions, and resources
  (From “The Anatomy of the Grid: Enabling Scalable Virtual Organizations”)
o Enable communities (“virtual organizations”) to share geographically distributed resources as they pursue common goals, assuming the absence of:
  - central location,
  - central control,
  - omniscience,
  - existing trust relationships.

CEOS Working Group on Information Systems and Services - 3
The Data Grid Problem
“Enable a geographically distributed community [of thousands] to perform sophisticated, computationally intensive analyses on Petabytes of data”
o Sounds like a separate class of problem, but is actually a superset.
o So all work done on “Grid Problems” applies to “DataGrid Problems”. We just need some additional tools.

CEOS Working Group on Information Systems and Services - 4
Globus Approach
o Software toolkit addressing key technical areas
  - Offer a modular “bag of technologies”
  - Enable incremental development of grid-enabled tools and applications
  - Define and standardize grid protocols and APIs (our software development supports this goal)
o Focus is on inter-domain issues, not clustering
  - Supports collaborative resource use spanning multiple organizations
  - Integrates cleanly with intra-domain services
  - Creates a “collective” service layer

CEOS Working Group on Information Systems and Services - 5
Major Data Grid Projects
o Earth System Grid (DOE Office of Science)
  - DG technologies, climate applications
o European Data Grid (EU)
  - DG technologies & deployment in EU
o GriPhyN – Grid Physics Network (NSF ITR)
  - Investigation of “Virtual Data” concept
o Particle Physics Data Grid (DOE Science)
  - DG applications for HENP experiments

CEOS Working Group on Information Systems and Services - 6

CEOS Working Group on Information Systems and Services - 7
Basic Data Grid Services
1. GridFTP: Data Transfer and Access
  - Common protocol for data movement
    - Secure, efficient, reliable, flexible, extensible, etc.
    - Grid Forum (Internet) Draft
  - Family of tools supporting this protocol
    - Wu-ftpd, ncftp, Globus Toolkit SDKs, etc.
2. Replica Management Architecture
  Simple scheme for managing:
  - multiple copies of files
  - collections of files

CEOS Working Group on Information Systems and Services - 8
GridFTP: Basic Approach
o FTP is defined by several IETF RFCs
o Start with most commonly used subset
  - Standard FTP: get/put etc., 3rd-party transfer
o Implement standard but often unused features
  - GSS binding, extended directory listing, simple restart
o Extend in various ways, while preserving interoperability with existing servers

CEOS Working Group on Information Systems and Services - 9
Features of GridFTP
o Grid Security Infrastructure and Kerberos support: robust and flexible authentication, integrity, and confidentiality
o Third-party control of data transfer: a user or application at one site initiates, monitors, and controls a data transfer between two other sites
o Parallel data transfer: on wide-area links, use multiple TCP streams in parallel between the same source and destination
o Striped data transfer: use multiple TCP streams to transfer data that is striped or interleaved across multiple servers
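The difference between parallel and striped transfer can be illustrated with a small sketch. This is only a model of how a file's bytes might be divided, not the GridFTP wire protocol; the function names and the block-size parameter are assumptions introduced for illustration.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # (offset, length) in bytes


def split_for_parallel_streams(file_size: int, streams: int) -> List[Range]:
    """Divide one file into contiguous ranges, one per TCP stream.

    Models parallel transfer: same source and destination, several streams.
    """
    if file_size < 0 or streams < 1:
        raise ValueError("file_size must be >= 0 and streams >= 1")
    base, extra = divmod(file_size, streams)
    ranges = []
    offset = 0
    for i in range(streams):
        # The first `extra` streams carry one additional byte.
        length = base + (1 if i < extra else 0)
        ranges.append((offset, length))
        offset += length
    return ranges


def assign_stripes(file_size: int, block_size: int, servers: int) -> List[List[Range]]:
    """Assign fixed-size blocks to servers in round-robin order.

    Models striped transfer: data interleaved across multiple servers.
    """
    if file_size < 0 or block_size < 1 or servers < 1:
        raise ValueError("invalid size, block size, or server count")
    layout: List[List[Range]] = [[] for _ in range(servers)]
    for index, offset in enumerate(range(0, file_size, block_size)):
        length = min(block_size, file_size - offset)
        layout[index % servers].append((offset, length))
    return layout
```

In this model, parallel transfer keeps each stream on one contiguous region, while striping spreads consecutive blocks over different servers so that several nodes can serve the same file at once.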

CEOS Working Group on Information Systems and Services - 10
Features of GridFTP (cont.)
o Partial file transfer: standard FTP allows transfer of the remainder of a file starting at an offset. GridFTP supports transfers of arbitrary subsets or regions of a file
o Automatic negotiation of TCP buffer/window sizes: optimal settings for TCP buffer/window sizes can dramatically improve performance
o Support for reliable and restartable data transfer: the FTP standard includes basic features for restart that are not widely implemented. GridFTP exploits these features and extends them.
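Partial transfer and restart both come down to bookkeeping over byte ranges. The sketch below is a hypothetical helper (not part of any GridFTP library) that computes which regions of a file still need to be fetched, given the regions already received. Ranges are half-open `(start, end)` pairs, which is a convention chosen here for illustration.

```python
from typing import Iterable, List, Tuple

Span = Tuple[int, int]  # half-open byte range: start inclusive, end exclusive


def missing_ranges(file_size: int, received: Iterable[Span]) -> List[Span]:
    """Return the byte ranges of the file that have not been received yet."""
    gaps: List[Span] = []
    cursor = 0  # first byte not yet known to be covered
    for start, end in sorted(received):
        if cursor >= file_size:
            break
        if start > cursor:
            # Bytes between the cursor and this range were never received.
            gaps.append((cursor, min(start, file_size)))
        cursor = max(cursor, end)
    if cursor < file_size:
        gaps.append((cursor, file_size))
    return gaps
```

With a single received prefix such as `[(0, 40)]`, the result is one trailing gap, which corresponds to the standard FTP case of resuming from an offset. With scattered received regions, the result is a list of arbitrary gaps, which is the case that needs region-based transfer.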

CEOS Working Group on Information Systems and Services - 11
GridFTP for Efficient WAN Data Transfer
o Secure authentication
o Parallel transfer gets job done quickly
o Partial file access gets only required data
o Up to 2.8 Gb/s using a striped server architecture
Parallel Transfer: fully utilizes bandwidth of network interface on single nodes.
Striped Transfer: fully utilizes bandwidth of Gb+ WAN using multiple nodes.
[Diagram label: Parallel Filesystem]
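Two pieces of arithmetic help put these figures in context. A common rule of thumb for TCP tuning is that the buffer should hold roughly one bandwidth-delay product (bandwidth multiplied by round-trip time), and the best-case transfer time is the data size divided by the link rate. The sketch below encodes both; the function names, the 100 ms round-trip time, and the 1 TB data size are illustrative assumptions, and 2.8 Gb/s is the peak rate quoted on the slide, not a guaranteed throughput.

```python
def bandwidth_delay_product_bytes(bandwidth_bits_per_s: int, rtt_ms: int) -> int:
    """Bytes in flight needed to keep a link full (bandwidth x round-trip time)."""
    # bits/s * ms / 1000 gives bits; dividing by 8 gives bytes.
    return bandwidth_bits_per_s * rtt_ms // 8000


def ideal_transfer_seconds(size_bytes: int, bandwidth_bits_per_s: int) -> float:
    """Best-case transfer time, ignoring protocol overhead and disk limits."""
    return size_bytes * 8 / bandwidth_bits_per_s


# Example: a 1 Gb/s path with an assumed 100 ms round-trip time.
buffer_needed = bandwidth_delay_product_bytes(1_000_000_000, 100)

# Example: an assumed 1 TB (10**12 bytes) data set at the quoted 2.8 Gb/s peak.
seconds_at_peak = ideal_transfer_seconds(10**12, 2_800_000_000)
```

Under these assumptions the buffer works out to 12.5 MB, far larger than typical default TCP buffers of the period, which is why buffer negotiation matters on wide-area links.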

CEOS Working Group on Information Systems and Services - 12
Current data delivery process (FTP based)
o Pull: semi-anonymous FTP
  - Product is ready
  - Notice is sent to the user with instructions and a password
  - User connects via FTP as “anonymous” with the provided password
  - FTP daemon positions the user in the appropriate directory
  - User pulls the data
o Push: routine data flows to high-volume users
  - Account is provided on the remote system
  - When data is available, it is pushed to the remote system
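The pull steps above can be sketched with Python's standard `ftplib`. This is a minimal illustration, not the actual USGS procedure: the function name, host name, and file names are hypothetical. Because the slide says the FTP daemon places the user in the right directory after login, the sketch issues no directory change.

```python
from ftplib import FTP


def pull_product(ftp: FTP, password: str, filename: str, local_path: str) -> None:
    """Log in as 'anonymous' with the supplied password and download one file.

    `ftp` is an already-connected ftplib.FTP object (or a compatible stand-in).
    """
    ftp.login(user="anonymous", passwd=password)
    with open(local_path, "wb") as out:
        # RETR streams the file in binary mode; each chunk is written to disk.
        ftp.retrbinary(f"RETR {filename}", out.write)
    ftp.quit()


# Usage (hypothetical host and file names, not taken from the slides):
#   pull_product(FTP("ftp.example.org"), "password-from-notice",
#                "product.tar", "product.tar")
```

Passing the connection in as a parameter keeps the function easy to exercise with a stand-in object, without touching the network.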

CEOS Working Group on Information Systems and Services - 13
Potential future data delivery (GridFTP based)
o For routine, multiple-usage customers
  - Establish a “certificate process” with the customer
    - Self-signed certificate authority
    - Customer generates private/public key pair
    - Generate user certificate with public key
    - Add user certificate to list of trusted users
  - Customer must install a GridFTP client
    - Globus Toolkit data management client bundle
    - Gsincftp
    - Java Commodity Grid Kit for Windows
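The last certificate step, adding a user certificate to a list of trusted users, can be pictured as maintaining a mapping from certificate subject names to local accounts. The slides do not say which mechanism is used; the sketch below assumes a text format modeled on the Globus grid-mapfile convention (a quoted subject name followed by a local user name). The function names and sample entries are hypothetical.

```python
import shlex
from typing import Dict


def parse_trusted_users(text: str) -> Dict[str, str]:
    """Parse lines of the form: "<certificate subject>" <local account>."""
    mapping: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # shlex honors the double quotes around subjects containing spaces.
        parts = shlex.split(line)
        if len(parts) != 2:
            raise ValueError(f"malformed entry: {raw!r}")
        subject, account = parts
        mapping[subject] = account
    return mapping


def is_trusted(subject: str, mapping: Dict[str, str]) -> bool:
    """A user is trusted only if the exact certificate subject is listed."""
    return subject in mapping


# Hypothetical sample entries for illustration only.
SAMPLE = '''
# subject name                          local account
"/O=Example Grid/CN=Jane Customer" jcustomer
"/O=Example Grid/CN=Data Center Push" push01
'''
trusted = parse_trusted_users(SAMPLE)
```

This only models the lookup. Verifying that a presented certificate is genuinely signed by the trusted certificate authority is a separate step handled by the security layer.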

CEOS Working Group on Information Systems and Services - 14
Potential future data delivery (GridFTP based)
o For routine, multiple-usage customers
  - Pull
    - Product is ready; the user is notified that data is ready
    - User authenticates with GridFTP and a user certificate, is granted access, and pulls the data
  - Push
    - Account is provided on the remote system with a host certificate and our user certificate
    - These Grid certificates establish a Virtual Organization between the two parties
    - When data is available, GridFTP is used to push the data to the remote system

CEOS Working Group on Information Systems and Services - 15
Potential future data delivery (GridFTP based)
o For single-usage customers
  - Process to
    - Establish a “certificate process” with the customer
    - Customer must install a GridFTP client
  - Currently seems too complex (not worth the effort)
  - Would like to have a simplified method, such as
    - a one-time-use “user certificate”
    - integration with a GridFTP client built into the browser