Globus Presented by: Yayati Kasralikar for CPA 5937.


Motivational Example
[Diagram: resources (R) including cancer-image data mining software, a very large database of cancer images, data pre-processing software, and a high-performance machine.]

What is Grid?
A Grid:
1. Coordinates resources that are not subject to centralized control.
2. Uses standard, open, general-purpose protocols and interfaces.
3. Delivers nontrivial qualities of service.
Let's examine some technologies against this checklist:
–Clusters: centralized control.
–P2P systems (e.g. Gnutella): do not use open and standard protocols.
–Web: no coordinated use of resources.

Why use Grid?
–A biochemist exploits 10,000 computers to screen 100,000 compounds in an hour.
–1,000 physicists worldwide pool resources for peta-op analyses of petabytes of data.
–An insurance company mines data from partner hospitals for fraud detection.
–An application service provider offloads excess load to a compute cycle provider.

Virtual Organization (VO)?
A dynamic set of individuals or institutions sharing resources for problem solving.
[Diagram: resources (R) grouped into three virtual organizations, VO A, VO B, and VO C.]

Grid Characteristics
–Scale and resource selection: particular applications select resources from a very large collection according to criteria such as connectivity, cost, security, and reliability.
–Heterogeneity at multiple levels: heterogeneity ranges from physical devices and system software to scheduling and usage policies.
–Dynamic and unpredictable behavior: the behavior and performance of shared resources vary over time.
–Multiple administrative domains: a challenging security problem.

Globus Initiative
Provide basic infrastructure for Grid computing: protocols, services, APIs, and SDKs.
–Protocols: focus on externals (interactions) rather than internals (resource characteristics) (e.g. GRIP, IP).
–Services: protocol + behavior (e.g. Information).
–APIs and SDKs: help application developers build complex applications (e.g. GSS API, JDBC API, JNDI SDK), with a view to application robustness, correctness, and development and maintenance cost.
Globus Toolkit: a community-based, open-architecture, open-source set of services and software libraries that supports Grids and Grid applications.

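Layered Grid Architecture
[Diagram: two protocol stacks shown side by side.
Grid Protocol Architecture (top to bottom): Application, Collective, Resource, Connectivity, Fabric.
Internet Protocol Architecture (top to bottom): Application, Transport, Internet, Link.]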

Connectivity Layer
[Diagram: Grid Protocol Architecture stack (Application, Collective, Resource, Connectivity, Fabric). The Connectivity layer contains the Nexus interface and the Grid Security Infrastructure (GSI).]

Resource Layer
[Diagram: Grid Protocol Architecture stack (Application, Collective, Resource, Connectivity, Fabric). The Resource layer contains:
–Grid Information Services: Grid Resource Information Protocol (GRIP) and Grid Resource Registration Protocol (GRRP)
–Data Transfer: GridFTP
–Resource Management: Grid Resource Access Management (GRAM)]

Collective Layer
[Diagram: Grid Protocol Architecture stack (Application, Collective, Resource, Connectivity, Fabric). The Collective layer contains:
–Directory Services
–Data Replication Services
–Monitoring Services
–Scheduling and Brokering Services]

Application Layer
[Diagram: Grid Protocol Architecture stack (Application, Collective, Resource, Connectivity, Fabric). Applications are built from Languages & Frameworks, which sit on top of:
–Collective APIs and SDKs, over Collective Service Protocols
–Resource APIs and SDKs, over Resource Service Protocols
–Connectivity APIs, over Connectivity Protocols
–Fabric]

Communication Services
Diverse communication needs: IP alone does not meet these needs, while MPI does not provide a rich range of communication abstractions.
Nexus communication mechanism: communication link and remote service request (RSR).
–A communication link connects a startpoint (SP) to one or more endpoints (EP).
–An RSR is a one-sided, asynchronous RPC: it transfers data from an SP to the EP(s) and integrates it into the process containing the EP(s).
[Diagram: Nexus communication mechanism, showing a communication link between an SP and EPs; labels 0, 1, 2.]

Resource Management
Challenging resource management problems:
–Site autonomy: resources are typically owned and operated by different organizations, in different administrative domains.
–Heterogeneous substrate: different sites may use different local resource management systems.
–Policy extensibility: a resource management solution must support the frequent development of new domain-specific management structures.
–Co-allocation: using resources simultaneously at several sites.
–Online control: substantial negotiation can be required to adapt application requirements to resource availability.

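Resource Management Architecture
[Diagram: the Application submits RSL to a Broker. The Broker performs RSL specialization, using queries to and information from the Information Service, and produces ground RSL. A Co-allocator splits the ground RSL into simple ground RSL requests, each sent to a GRAM. Each GRAM fronts a local resource manager (LSF, Condor, NQE).]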

Resource Specification Language (RSL)
Based on the syntax for filter specifications in LDAP.
An RSL specification is constructed by combining simple parameter specifications and conditions with the following operators:
& : specifies conjunction
| : specifies disjunction
+ : combines two or more requests
Resource brokers, co-allocators, and resource managers can each define a set of parameters.
Example: I want "5 nodes with at least 256MB memory, or 10 nodes with 64MB, for myprog"
RSL: &(executable=myprog)(|(&(count=5)(memory>=256))(&(count=10)(memory>=64)))
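The request in the example can also be assembled programmatically. The following Python sketch is illustrative only and is not part of the Globus Toolkit: the class names `Relation` and `Compound` and the helpers `conj` and `disj` are assumptions introduced here. It shows how nesting conjunctions and disjunctions yields a balanced RSL string.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Relation:
    """A simple parameter specification such as (count=5) or (memory>=256)."""
    name: str
    op: str
    value: Union[int, str]

    def render(self) -> str:
        return f"({self.name}{self.op}{self.value})"


@dataclass(frozen=True)
class Compound:
    """Sub-specifications joined by '&' (conjunction), '|' (disjunction) or '+'."""
    operator: str
    parts: Tuple[object, ...]

    def render(self) -> str:
        rendered = []
        for part in self.parts:
            if isinstance(part, Compound):
                # A nested compound needs its own pair of parentheses.
                rendered.append(f"({part.render()})")
            else:
                # A relation already renders with parentheses.
                rendered.append(part.render())
        return self.operator + "".join(rendered)


def conj(*parts) -> Compound:
    return Compound("&", parts)


def disj(*parts) -> Compound:
    return Compound("|", parts)


# "5 nodes with at least 256MB memory, or 10 nodes with 64MB, for myprog"
request = conj(
    Relation("executable", "=", "myprog"),
    disj(
        conj(Relation("count", "=", 5), Relation("memory", ">=", 256)),
        conj(Relation("count", "=", 10), Relation("memory", ">=", 64)),
    ),
)
rsl_text = request.render()
```

Building the string from a tree rather than by hand keeps the parentheses balanced however deeply the alternatives are nested.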

Local Resource Management
The Globus Resource Allocation Manager (GRAM) provides the local component for resource management.
GRAM is responsible for:
1. Processing RSL specifications
2. Enabling remote monitoring and management of jobs
3. Periodically updating the information service
Two major software components of GRAM:
1. Gatekeeper: creates the Grid service
2. Job Manager Instance (JMI): resource management and job control

The Hour-Glass principle
A simple, well-defined interface forms the neck. It provides uniform access to diverse local implementations and a base for higher-level global services.

Grid Security Characteristics
–Single sign-on: users must be able to authenticate just once to access multiple grid resources.
–Delegation: users must be able to endow a program with the ability to run on their behalf.
–Integration with local security solutions: interoperate with various local solutions.
–User-based trust relationships: resource providers must not need to interact with each other to configure the security environment.

Security Policies
–The Grid environment consists of multiple trust domains.
–Operations confined to a single trust domain are subject to local security policy only.
–Both local and global participants exist. For each trust domain, there exists a partial mapping from global to local subjects.
–Operations between entities located in different trust domains require mutual authentication.
–An authenticated global subject mapped into a local subject is assumed to be equivalent to being locally authenticated as that local subject.
–All access control decisions are made locally on the basis of the local subject.
–A program or process is allowed to act on behalf of a user and be delegated a subset of the user's rights.
–Processes running on behalf of the same subject within the same trust domain may share a single set of credentials.
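The mapping and local-decision policies can be illustrated with a small Python sketch. This is a toy model under stated assumptions, not GSI code: the class `TrustDomain`, the subject names, and the operation names are all hypothetical, and mutual authentication is assumed to have already taken place before `authorize` is called.

```python
from typing import Dict, Optional, Set


class TrustDomain:
    """One administrative domain with its own local subjects and local policy."""

    def __init__(self, name: str, subject_map: Dict[str, str],
                 local_acl: Dict[str, Set[str]]) -> None:
        self.name = name
        self.subject_map = subject_map  # partial mapping: global -> local subject
        self.local_acl = local_acl      # local subject -> permitted operations

    def map_subject(self, global_subject: str) -> Optional[str]:
        # The mapping is partial: an unknown global subject maps to nothing.
        return self.subject_map.get(global_subject)

    def authorize(self, global_subject: str, operation: str) -> bool:
        # Assumes global_subject has already been mutually authenticated.
        local_subject = self.map_subject(global_subject)
        if local_subject is None:
            return False
        # The decision is made locally, on the basis of the local subject only.
        return operation in self.local_acl.get(local_subject, set())


# Hypothetical site configuration for illustration.
site_b = TrustDomain(
    name="site-b",
    subject_map={"/O=Grid/CN=Alice Example": "alice"},
    local_acl={"alice": {"create_process"}},
)
```

A call such as `site_b.authorize("/O=Grid/CN=Alice Example", "create_process")` succeeds only because the global subject maps to a local account that the site's own policy permits; the site never consults another domain.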

Globus Security Infrastructure
[Diagram; recoverable labels: User, User Proxy, Credentials, Globus Credentials, Certificate, Kerberos, Public Key, GSI, GRAM, User Process.]

Globus Security Scenario
[Diagram, as far as the labels can be recovered:
–The user, at a computer, performs single sign-on via "grid-id" and generates a proxy credential (or retrieves a proxy credential from an online repository). The user proxy holds the proxy credential.
–Remote process creation requests go to GSI-enabled GRAM servers at Site A (Kerberos) and Site B (Unix). Each server authorizes the request, maps it to a local id, creates the process, and generates credentials: a restricted proxy for each process, plus a Kerberos ticket at Site A and a local id at Site B.
–The processes communicate with each other.
–A remote file access request goes to a GSI-enabled FTP server in front of the storage system at Site C (Kerberos), which authorizes the request, maps it to a local id, and accesses the file.]

Information Services
Initial discovery and ongoing monitoring of resources.
Existing services such as LDAP and UDDI do not address the dynamic addition and deletion of resources.
Two fundamental entities in a Grid information service:
–Highly distributed information providers
–Specialized aggregate directory services
Both entities speak two fundamental protocols.

Information Services
[Diagram: information provider services (P) send registrations (GRRP) to VO-specific aggregate directories (D). Clients perform discovery (GRIP) against the directories and lookup (GRIP) against the providers.]

Information Services - Protocols
Grid Information Protocol (GRIP)
–Used to access information about entities.
–Supports both discovery and enquiry.
–Adopted from the Lightweight Directory Access Protocol (LDAP).
–LDAP defines a data model, query language, and wire protocol.
Grid Registration Protocol (GRRP)
–Defines a notification mechanism to push simple information from one 'element' to another 'element'.
–It is a soft-state protocol, which makes it resilient to failures.
–A GRRP message contains the name of the service, the type of notification, and a timestamp.
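The soft-state idea behind GRRP can be sketched in a few lines of Python. This is a minimal illustration, not the GRRP wire format: the `ttl` field, the class names, and the service names are assumptions added for the example. Only the service name, notification type, and timestamp come from the slide.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Registration:
    service_name: str
    notification_type: str
    timestamp: float  # time at which the provider sent the message
    ttl: float        # hypothetical: seconds the registration stays valid


class SoftStateDirectory:
    """Keeps registrations only while providers keep refreshing them."""

    def __init__(self) -> None:
        self._entries: Dict[str, Registration] = {}

    def receive(self, message: Registration) -> None:
        # A repeated message simply refreshes the entry, so a lost message
        # is repaired by the next periodic one.
        self._entries[message.service_name] = message

    def live_services(self, now: float) -> List[str]:
        # Entries that were not refreshed in time silently disappear;
        # no explicit "unregister" message is needed.
        self._entries = {
            name: reg for name, reg in self._entries.items()
            if now < reg.timestamp + reg.ttl
        }
        return sorted(self._entries)


directory = SoftStateDirectory()
directory.receive(Registration("gram.site-a", "register", timestamp=0.0, ttl=30.0))
directory.receive(Registration("gridftp.site-b", "register", timestamp=0.0, ttl=30.0))
# Only site-a refreshes its registration before it expires.
directory.receive(Registration("gram.site-a", "register", timestamp=25.0, ttl=30.0))
```

Because state expires on its own, a crashed provider needs no cleanup: its entry vanishes once the refreshes stop.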

Hierarchical Discovery
[Diagram: a network of aggregate directories.
–Information providers: hosts R1, R2, R3 at organization O1; hosts at organization O2.
–Center 1 Directory entries: Host: hn=R1; Host: hn=R2; Host: hn=R3.
–Center 2 Directory entries: Host: hn=R1; Host: hn=R2.
–VO Directory entries: Host: hn=R1,O=O1; Host: hn=R2,O=O1; Host: hn=R3,O=O1; Host: hn=R1,O=O2; Host: hn=R2,O=O2.]
Each directory uses GRIP and acts as an information provider.
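The naming scheme in the diagram can be mimicked with a toy model in which every directory answers the same query interface as a provider. This is a sketch under assumptions: the class names, the `search` method, and the `scope` argument are hypothetical and stand in for GRIP queries and GRRP registrations.

```python
from typing import List, Optional, Tuple


class InformationProvider:
    """Leaf provider that describes a single host."""

    def __init__(self, hostname: str) -> None:
        self.hostname = hostname

    def search(self) -> List[str]:
        return [f"hn={self.hostname}"]


class AggregateDirectory:
    """A directory that is itself queried like an information provider."""

    def __init__(self) -> None:
        self._registered: List[Tuple[object, Optional[str]]] = []

    def register(self, provider, scope: Optional[str] = None) -> None:
        # 'scope' is a hypothetical name suffix (e.g. "O=O1") that the parent
        # directory appends to everything reported by this child.
        self._registered.append((provider, scope))

    def search(self) -> List[str]:
        names: List[str] = []
        for provider, scope in self._registered:
            for name in provider.search():
                names.append(f"{name},{scope}" if scope else name)
        return names


center1 = AggregateDirectory()
for host in ("R1", "R2", "R3"):
    center1.register(InformationProvider(host))

center2 = AggregateDirectory()
for host in ("R1", "R2"):
    center2.register(InformationProvider(host))

vo_directory = AggregateDirectory()
vo_directory.register(center1, scope="O=O1")
vo_directory.register(center2, scope="O=O2")
```

Since a directory exposes the same `search` interface as a provider, directories can be stacked to any depth, which is the point of the slide's final sentence.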

Data Transfer - GridFTP
A high-speed transport protocol that extends the popular FTP protocol.
GridFTP functionality:
–GSI support
–Third-party control of data transfer
–Parallel data transfer
–Striped data transfer
–Partial file transfer
–Reliable and restartable data transfer
The implementation consists of two principal libraries: globus_ftp_control_library and globus_ftp_client_library.
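Parallel, partial, and restartable transfer all rest on addressing a file by byte ranges. The Python sketch below illustrates that bookkeeping only. It does not use the Globus libraries or the GridFTP wire protocol, and the function names `split_ranges` and `missing_ranges` are assumptions made for this example.

```python
from typing import Iterable, List, Tuple

ByteRange = Tuple[int, int]  # half-open interval [start, end)


def split_ranges(offset: int, length: int, streams: int) -> List[ByteRange]:
    """Split a partial-file request into contiguous ranges, one per stream."""
    if offset < 0 or length < 0 or streams < 1:
        raise ValueError("offset and length must be >= 0 and streams >= 1")
    base, extra = divmod(length, streams)
    ranges: List[ByteRange] = []
    start = offset
    for index in range(streams):
        # The first 'extra' streams carry one additional byte each.
        size = base + (1 if index < extra else 0)
        if size == 0:
            break  # more streams than bytes: leave the rest idle
        ranges.append((start, start + size))
        start += size
    return ranges


def missing_ranges(total: int, received: Iterable[ByteRange]) -> List[ByteRange]:
    """Return the gaps still to be fetched when a transfer is restarted."""
    gaps: List[ByteRange] = []
    cursor = 0
    for start, end in sorted(received):
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < total:
        gaps.append((cursor, total))
    return gaps
```

For instance, `split_ranges(0, 10, 4)` assigns ten bytes to four streams, and `missing_ranges(10, [(0, 3), (6, 8)])` lists what a restarted transfer would still need to fetch.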

Replica Management Service
[Diagram: components are the Application, Metadata Service, Replica Management Service, Replica Selection Service, and Information Services. Numbered flows with recoverable labels:
(1) Attributes of desired data
(2) Logical file names
(4) Location of one or more replicas
(6) Sources and destination
(7) Performance measurements and predictions
(8) Location of selected replicas
Flows (3) and (5) appear without labels.]

Replica Management Service
–Creates new copies of a complete or partial collection of files.
–Registers them in a Replica Catalog.
–Allows applications to query the catalog.
Data are organized into files.
–Logical file name vs. physical file name.
Key architecture decisions:
–Separation of replication and metadata information
–Does not enforce replication semantics
–Provides rollback to keep the state consistent in case of failures
–No distributed locking mechanism

Relationships to other technologies
–World Wide Web: Web technologies mainly support a client-server architecture. They lack features (at least for now) for rich interaction and single sign-on security.
–ASP and SSP: provide outsourced solutions that depend on the specific customer. They lack dynamic configuration.
–Enterprise computing: static arrangements for sharing resources.
–P2P computing: getting closer to Grid technology, but provides specific solutions rather than common protocols.

Other Grid Perspectives
–Grid as a next-generation Internet
–Grid is a source of free cycles
–Grid requires new programming models
–Grid makes high-performance computers superfluous

Closing Remarks
"We will probably see the spread of 'computer utilities', which, like present electric and telephone utilities, will service individual homes and offices across the country." (Len Kleinrock)
We are a little late, but we are ready now!

Extra-1: A Model Architecture for Data Grids
[Diagram, as far as the labels can be recovered:
–The Application sends an attribute specification to the Metadata Catalog and receives a logical collection and logical file name.
–The Replica Catalog returns multiple locations for that logical file.
–Replica Selection uses performance information and predictions from NWS and MDS to pick the selected replica.
–Data moves over the GridFTP control channel and GridFTP data channel from Replica Location 1, 2, or 3. Storage shown at the locations: tape library, disk cache, disk array.]

Extra-2: Replica Catalog Structure
[Diagram:
Replica Catalog
–Logical Collection: CO2 measurements 1998
  Filename: Jan 1998; Filename: Feb 1998; …
  –Location: jupiter.isi.edu
    Filename: Mar 1998; Filename: Jun 1998; Filename: Oct 1998
    Protocol: gsiftp
    UrlConstructor: gsiftp://jupiter.isi.edu/nfs/v6/climate
  –Location: sprite.llnl.gov
    Filename: Jan 1998 … Filename: Dec 1998
    Protocol: ftp
    UrlConstructor: ftp://sprite.llnl.gov/pub/pcmdi
  –Logical File Parent
  –Logical File: Jan 1998
  –Logical File: Feb 1998 (Size: )
–Logical Collection: CO2 measurements 1999]
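The catalog structure maps a logical file name to the physical locations that hold a copy. The Python sketch below models that lookup. It is not the Globus Replica Catalog API: the class names, the method names, the short file names such as `mar1998`, and the rule that a URL is the UrlConstructor followed by the file name are all assumptions. The hosts and UrlConstructor values are taken from the slide.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Location:
    protocol: str
    url_constructor: str  # URL prefix for files held at this location
    filenames: List[str] = field(default_factory=list)

    def url_for(self, logical_name: str) -> str:
        # Assumption: physical URL = UrlConstructor + "/" + file name.
        return f"{self.url_constructor.rstrip('/')}/{logical_name}"


@dataclass
class LogicalCollection:
    name: str
    locations: Dict[str, Location] = field(default_factory=dict)

    def register(self, host: str, location: Location) -> None:
        self.locations[host] = location

    def replicas(self, logical_name: str) -> List[str]:
        """Return every physical URL that holds a copy of the logical file."""
        return [
            location.url_for(logical_name)
            for location in self.locations.values()
            if logical_name in location.filenames
        ]


MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]

collection = LogicalCollection("CO2 measurements 1998")
collection.register("jupiter.isi.edu", Location(
    protocol="gsiftp",
    url_constructor="gsiftp://jupiter.isi.edu/nfs/v6/climate",
    filenames=["mar1998", "jun1998", "oct1998"],  # partial copy
))
collection.register("sprite.llnl.gov", Location(
    protocol="ftp",
    url_constructor="ftp://sprite.llnl.gov/pub/pcmdi",
    filenames=[f"{month}1998" for month in MONTHS],  # complete copy
))
```

A query for `mar1998` returns two candidate URLs, which is the input a replica selection service would then rank; a query for `jan1998` returns only the complete copy.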