DataGrid Middleware: Enabling Big Science on Big Data


One of the most demanding and important challenges we face as we attempt to construct the distributed computing machinery required to support SciDAC goals is the efficient, high-performance, reliable, secure, and policy-aware management of large-scale data movement. This problem is fundamental to application domains as diverse as experimental physics (high energy physics, nuclear physics, light sources), simulation science (climate, computational chemistry, fusion, astrophysics), and large-scale collaboration. In each case, highly distributed user communities require high-speed access to valuable data, whether for visualization or analysis. The quantities of data involved (terabytes to petabytes), the scale of the demand (hundreds or thousands of users, data-intensive analyses, real-time constraints), and the complexity of the infrastructure that must be managed (networks, tertiary storage systems, network caches, computers, visualization systems) make the problem extremely challenging.

The Problem: The general problem we propose to focus on is the reliable, high-performance movement of data between Data Grid components. In tackling this problem, we focus on three key questions:

1. Data Movement: How do we move data efficiently and reliably between two Data Grid components?
2. Data Transfer Management: How do we coordinate end-to-end data transfers so as to meet performance goals and make efficient use of available resources?
3. Collective Data Management: How do we structure, schedule, and manage data movement operations to satisfy higher-level goals?

The eXtensible Input/Output System (XIO): XIO provides an abstraction layer over the familiar Read/Write/Open/Close (RWOC) API. Any stream-based data source can use the same API by writing an appropriate driver. The figure shows how the drivers are arranged in a stack, with the XIO framework handling transitions between drivers; for efficiency, no buffer copies are made. There is exactly one transport driver, and it must be the first driver on the stack: it brings data in and out of the XIO process space. We provide file, TCP, and UDP transport drivers by default. Transform drivers are optional and alter the data before transport; GSI is an example transform driver. We are in the process of producing a MODE E (parallel TCP) driver and a GridFTP driver. The GridFTP driver will allow a third-party application to access files under the control of a GridFTP server using RWOC semantics.

[Figure: the XIO driver stack. Data flows from source to sink buffers through optional transform drivers and a single transport driver within the XIO framework; the transport endpoint could be a file, TCP, UDT, SRB, HDF5, etc., given an appropriate XIO driver, fronted here by a GridFTP server.]

Completely Re-implemented GridFTP Server: There are three primary reasons we replaced the wuftpd-based implementation: (1) ease of maintenance, (2) ease of extensibility, and (3) licensing issues. As shown in the server architecture figure, the new server uses XIO; again, there are no buffer copies, for performance. Using XIO provides much greater extensibility: for instance, if a mass storage vendor writes an XIO driver for their system, it should be relatively easy to produce a GridFTP server to front their mass store. At SC2003 we demonstrated using the new server over a non-TCP protocol via XIO.
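The stacked-driver idea described above can be illustrated with a small sketch. This is not the real XIO interface (which is a C API with driver registration); all class and function names here are hypothetical, and the XOR "cipher" is merely a stand-in for a transform such as GSI wrapping.

```python
# Illustrative sketch of a stacked I/O driver model in the spirit of Globus XIO.
# Names are hypothetical; the real XIO API is a C library.

class TransportDriver:
    """Bottom of the stack: moves bytes in and out (here, an in-memory sink)."""
    def __init__(self):
        self.buffer = bytearray()

    def write(self, data: bytes) -> None:
        self.buffer.extend(data)

    def read(self) -> bytes:
        data, self.buffer = bytes(self.buffer), bytearray()
        return data


class TransformDriver:
    """Optional layer that alters data before it reaches the transport.
    A real example would be GSI wrapping; here we XOR as a stand-in."""
    def __init__(self, below, key: int = 0x5A):
        self.below = below
        self.key = key

    def _xor(self, data: bytes) -> bytes:
        return bytes(b ^ self.key for b in data)

    def write(self, data: bytes) -> None:
        self.below.write(self._xor(data))

    def read(self) -> bytes:
        return self._xor(self.below.read())


def build_stack(*transform_classes):
    """Exactly one transport driver sits first; transforms stack above it."""
    stack = TransportDriver()
    for cls in transform_classes:
        stack = cls(stack)
    return stack


stack = build_stack(TransformDriver)
stack.write(b"striped GridFTP payload")
print(stack.read())  # the payload round-trips intact through transform + transport
```

The application sees only RWOC calls on the top of the stack; swapping TCP for UDT, or a file for a mass-storage system, would mean replacing only the transport driver at the bottom.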
We used UDT, from Bob Grossman’s group at UIC; such aggressive protocols may be appropriate on an engineered, semi-private network. In addition, we achieved 9 Gb/s disk-to-disk using 15 servers with TCP over a 10 GigE link.

The Tools: (figure not reproduced in this transcript)

Future Work: Future areas of emphasis include:
- MODE E (parallel TCP) and GridFTP XIO drivers
- GridFTP servers for sources other than files (HPSS, SRB, etc.)
- Scalability of the Reliable File Transfer (RFT) service (millions of files)
- Release of striping, and more flexible striping schemes
- Windows port of GridFTP
- Collective services that integrate and coordinate these and other services
- Improved scalability and robustness of the Replica Location Service
- Grid service design and implementation of the Replica Location Service

Replica Location Service (RLS): The RLS is a distributed registry service that records the locations of data copies and allows discovery of replicas. It has been released in Globus Toolkit Version 3.0. A performance and scalability study of RLS was conducted in Fall 2003 and the following spring. This study demonstrates that individual RLS servers perform well, scaling up to millions of entries and tens of simultaneous requesting processes. The study also demonstrates that soft-state updates of the distributed RLI index scale well when using incremental updates or Bloom filter compression. In January 2004, we released the Copy and Registration Service, an OGSI Grid service that wraps around the GT3 RLS implementation.
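The Bloom-filter compression mentioned above can be sketched briefly. Instead of shipping its full list of logical file names to the RLI index, a local catalog summarizes them into a fixed-size bit vector; membership tests may yield rare false positives but never false negatives. The parameters and hashing scheme below are assumptions for illustration, not those of the production RLS.

```python
# Sketch of Bloom-filter compression for soft-state index updates
# (illustrative parameters; not the production RLS implementation).
import hashlib

class BloomFilter:
    def __init__(self, size_bits: int = 1024, num_hashes: int = 3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit vector stored as a Python integer

    def _positions(self, key: str):
        # Derive num_hashes independent positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def __contains__(self, key: str) -> bool:
        return all(self.bits >> pos & 1 for pos in self._positions(key))

# A local catalog summarizes its logical file names into one compact filter
# and ships that to the index instead of the full entry list.
lrc_filter = BloomFilter()
for lfn in ["lfn://climate/run42/temp.nc", "lfn://hep/evt-001.root"]:
    lrc_filter.add(lfn)

print("lfn://hep/evt-001.root" in lrc_filter)  # True
print("lfn://absent/file" in lrc_filter)       # almost certainly False
```

The compression win is that the filter's size is fixed regardless of how many entries the catalog holds, which is what lets the soft-state updates scale to millions of entries.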
The service integrates file copy operations with RLS registration operations. Through the Global Grid Forum’s OGSA Replication Service Working Group, we are standardizing Grid service interfaces for Replica Location Services.

NeST: To be effectively utilized, every Grid resource needs a “manager” that can make allocations, manage reservations, load balance, and so on. Such managers are common and very well developed for compute nodes; other resources, such as networks and storage, are not as well served. NeST addresses, at least in part, the storage resource management issue. As shown in the diagram, there are three major components. The dispatcher is the main scheduler and is responsible for controlling the flow of information between the other components. Data movement requests are sent to the transfer manager; all other requests, such as resource management and directory operation requests, are handled by the storage manager.
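The request routing just described can be sketched in a few lines. The class and method names are illustrative only, not the actual NeST interfaces.

```python
# Sketch of NeST-style request routing: the dispatcher forwards data-movement
# requests to the transfer manager and everything else (space reservation,
# directory operations) to the storage manager. Names are hypothetical.

class TransferManager:
    def handle(self, request: dict) -> str:
        return f"transfer: {request['path']}"

class StorageManager:
    def handle(self, request: dict) -> str:
        return f"storage: {request['op']}"

class Dispatcher:
    TRANSFER_OPS = {"get", "put"}  # assumed set of data-movement operations

    def __init__(self):
        self.transfer = TransferManager()
        self.storage = StorageManager()

    def dispatch(self, request: dict) -> str:
        if request["op"] in self.TRANSFER_OPS:
            return self.transfer.handle(request)
        return self.storage.handle(request)

d = Dispatcher()
print(d.dispatch({"op": "put", "path": "/data/run42.dat"}))  # routed to transfer manager
print(d.dispatch({"op": "mkdir", "path": "/data/run43"}))    # routed to storage manager
```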
The dispatcher also periodically consolidates information about resource and data availability in the NeST and can publish this information to a global scheduling system. The storage manager has four main responsibilities: virtualizing and controlling the physical storage of the machine, directly executing non-transfer requests, implementing and enforcing access control, and managing guaranteed storage space in the form of lots.

For Further Information: Ian Foster (ANL, Univ. of Chicago), Carl Kesselman (ISI, Univ. of Southern California), Miron Livny (Univ. of Wisconsin – Madison).

The figure shows the design of the Replica Location Grid Service, which is based on OGSI Service Groups and associates Data Services that are replicas according to the policies of the Replica Location Grid Service.
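The guaranteed-space "lots" mentioned above amount to charging each write against a per-owner reservation. A minimal sketch, under assumed semantics (the real NeST lot interface may differ):

```python
# Sketch of guaranteed storage space managed as "lots": a lot reserves a
# fixed number of bytes for an owner, and writes are charged against it.
# Semantics are assumed for illustration.

class LotFullError(Exception):
    pass

class Lot:
    def __init__(self, owner: str, size_bytes: int):
        self.owner = owner
        self.size = size_bytes
        self.used = 0

    def charge(self, nbytes: int) -> None:
        """Account an incoming write against the reservation."""
        if self.used + nbytes > self.size:
            raise LotFullError(f"lot for {self.owner} exceeded")
        self.used += nbytes

    @property
    def free(self) -> int:
        return self.size - self.used

lot = Lot("climate-group", size_bytes=10 * 1024**2)  # 10 MiB reservation
lot.charge(4 * 1024**2)                              # a 4 MiB write
print(lot.free)  # 6291456 bytes still guaranteed
```

Because the space is reserved up front, a transfer that starts within its lot cannot be starved by other users' writes, which is the point of guaranteed space in a shared storage appliance.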