Grid Computing 7700 Fall 2005 Lecture 5: Grid Architecture and Globus Gabrielle Allen

1 Grid Computing 7700 Fall 2005 Lecture 5: Grid Architecture and Globus Gabrielle Allen allen@bit.csc.lsu.edu http://www.cct.lsu.edu/~gallen

2 Concrete Example
• I have a source file Main.F on machine A and an input file on machine B. Main.F is written using MPI; it will need around 4 GB of core memory to run, will take several hours to complete, and will produce a large output file.
• What functionality do we need?

3 Issues
• How to select a machine to run it on?
• How to provide an executable which can run on that machine?
• How to move the input file?
• How to start the executable?
• How to monitor the job? When does it start? When does it finish?
• How to move the output file back?
• What about security?
• How do we know if it didn’t work and how it failed?

4 How to Select a Machine
• What properties of a machine are we interested in?
– What resources does my executable require? 4 GB memory, “several hours of compute time”, enough disk space for the output
– What kind of environment do I need on the machine? OS limitations? MPI (which version)? Fortran?
– What resources am I authorized to run on?
– How quickly will it run?
– How much will it cost / what is my allocation there?
– How to find all this information? What should the user provide?
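The selection questions above can be read as a filter followed by a ranking. The following is a minimal sketch of that idea, not a real broker: the `Machine` and `JobRequirements` fields, the machine names, and the "shortest queue wait" ranking rule are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    """Hypothetical description of a candidate machine."""
    name: str
    memory_gb: float
    free_disk_gb: float
    max_walltime_hours: float
    software: set = field(default_factory=set)
    authorized: bool = False
    queue_wait_hours: float = 0.0


@dataclass
class JobRequirements:
    """What the user would have to state about the job (assumed fields)."""
    memory_gb: float
    disk_gb: float
    walltime_hours: float
    software: set = field(default_factory=set)


def eligible(machine, job):
    """True if the machine satisfies every hard requirement of the job."""
    return (
        machine.authorized
        and machine.memory_gb >= job.memory_gb
        and machine.free_disk_gb >= job.disk_gb
        and machine.max_walltime_hours >= job.walltime_hours
        and job.software <= machine.software
    )


def select_machine(machines, job):
    """Pick the eligible machine with the shortest estimated queue wait."""
    candidates = [m for m in machines if eligible(m, job)]
    if not candidates:
        return None
    return min(candidates, key=lambda m: m.queue_wait_hours)


# Made-up inventory: the job needs 4 GB, 50 GB of disk, 6 hours, MPI and Fortran.
job = JobRequirements(memory_gb=4, disk_gb=50, walltime_hours=6,
                      software={"mpi", "fortran"})
machines = [
    Machine("alpha", 2, 500, 48, {"mpi", "fortran"}, True, 0.5),   # too little memory
    Machine("beta", 8, 200, 12, {"mpi", "fortran"}, True, 3.0),
    Machine("gamma", 16, 900, 24, {"mpi", "fortran"}, False, 0.1),  # not authorized
    Machine("delta", 8, 100, 24, {"mpi", "fortran"}, True, 1.0),
]
chosen = select_machine(machines, job)
print(chosen.name if chosen else "no suitable machine")
```

In a real grid the machine descriptions would come from an information service and the ranking might weigh cost or allocation as well; here both are hard-coded.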

5 More Complicated
• What if the program might need to read in data kept on machine C while it is running?
• What about distributing across processors on different machines?
• What if I have a lot of interconnected programs?
• How do I find the output file afterwards?
• What if it doesn’t work?

7 Questions
• What kind of functionality do we need?
• What tools exist to do this?
• What features of distributed computing do they need to be designed for?
• What design issues should we watch for?

8 Abstract Requirements
• Single sign-on
• Job submission, monitoring and management
– submit a job to a resource on the grid
– monitor the progress of a submitted job
– retrieve results
– cancel job
• File transfer
– move files from A to B, securely, reliably and efficiently
• Resource discovery
– locate resources or services with particular characteristics
Less typical:
• Metacomputing, workflow enactment, resource brokering, ...

9 What do I have to choose from?
• Globus Toolkit
– version 2 is widely deployed; nearest thing to a de facto standard
– horizontally integrated bag of tools
– suits grid application developers better than end users
– brand new V4 based on web services
• UNICORE
– less widely deployed; few UK deployments
– vertically integrated
– suits end users better than application developers
• Condor
– high throughput computing
– great for cycle harvesting
• Web Services?
– GT4 or roll your own using Web Services tools
• Others
– yes, there are others

10 Generation Game (figure adapted from Ian Foster, GGF7 Plenary)
Axes: increased functionality and standardization against time
Progression: Custom solutions → Globus Toolkit, Condor, Unicore (de facto standards: GridFTP, GSI; X.509, LDAP, FTP, …) → Open Grid Services Architecture (Web services; app-specific services)
Earlier generation: computationally intensive; file access/transfer; bag of various heterogeneous protocols and toolkits; monolithic design; recognised the Internet, ignored the Web; academic teams
Later generation: data and knowledge intensive; open services-based architecture; builds on Web services; GGF + OASIS + W3C; multiple implementations; Global Grid Forum; industry participation

11 Grid Architecture
Layers (bottom to top): Fabric, Connectivity, Resource, Collective, Application

12 Fabric Layer
• Contains the resources themselves, which the Grid infrastructure needs to access
• Fabric components implement the local, resource-specific operations on which higher-level Grid operations are built
– NFS storage protocol
– Kerberos security
– PBS queuing system
• The Grid cannot provide more than the local operations can support (e.g. advance reservation)

13 Fabric Layer
• Computational resources
• Storage resources
• Network resources
• But also
– Database resources
– Code repository resources
– Etc.

14 Fabric Layer
• What is the minimum functionality?
– Introspection mechanisms:
    Computational resources: hardware and software characteristics, state information such as current load and queue state
    Storage resources: hardware and software characteristics, available space
    Network resources: network characteristics and load
– Resource management mechanisms:
    Computational resources: starting programs, monitoring and controlling execution of the resulting processes
    Storage resources: file put and get

15 Fabric Layer
• What is desirable?
– Introspection mechanisms:
    Storage resources: bandwidth utilization
– Resource management mechanisms:
    Computational resources: control over resources allocated to processes, advance reservation
    Storage resources: third-party transfers, high-performance transfers, put and get of file subsets, callback functionality
    Network resources: control of resources, prioritization, reservation

16 Connectivity Layer
• Core communication and authentication protocols needed for network transactions
• Exchange of data between fabric layer resources
• Security
• Requirements: transport, routing, naming
• Assumed to use protocols from the TCP/IP stack (IP, ICMP, TCP, UDP, DNS, OSPF, RSVP, …), but could be others

17 Connectivity Layer
• Security requirements
– Single sign-on to all resources
– Delegation of rights
– Integration with local security
– Implementation of trust relations
– Secure transport of data

18 Resource Layer
• Protocols for secure negotiation, initiation, monitoring, control and accounting on individual resources
• Concerned only with individual resources (collections of resources are addressed in the next layer)
• Information protocols
– Obtaining information about the structure and state of a resource
• Management protocols
– Negotiating access for given resource requirements, performing operations (job starting, data access), and monitoring and controlling resources and processes

21 Collective Layer
• Deals with operations across collections of resources
• Builds on a relatively small number of resource/connectivity protocols
• Examples
– Directory services (to provide information about resources)
– Co-allocation, scheduling, brokering services
– Monitoring and diagnostic services
– Data replication services
– Community authorization and accounting services
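As one illustration of a collective service, co-allocation has to acquire processors on several resources at once or not at all. The sketch below shows only that all-or-nothing rule; the resource names and the dictionary-based bookkeeping are assumptions, and real co-allocators must also cope with queues, failures and timing.

```python
def co_allocate(free, requests):
    """All-or-nothing allocation across several resources.

    free:     dict mapping resource name -> free processors (hypothetical data)
    requests: dict mapping resource name -> processors wanted there
    Returns the updated free-processor table, or None if any single
    request cannot be met (in which case nothing is allocated).
    """
    if any(free.get(name, 0) < need for name, need in requests.items()):
        return None
    updated = dict(free)
    for name, need in requests.items():
        updated[name] -= need
    return updated


free = {"clusterA": 32, "clusterB": 16}
print(co_allocate(free, {"clusterA": 24, "clusterB": 8}))
print(co_allocate(free, {"clusterA": 24, "clusterB": 20}))
```

The second request fails as a whole even though clusterA alone could have satisfied its share, which is the point of co-allocation.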

23 UNICORE
• Packaged software with GUI
• Open source
– http://unicore.sourceforge.net/
• Designed for firewalls
• Strict security model
– explicit delegation
• Abstract Job Object (AJO)
– built-in workflow management
• Resource Broker
– can submit to Globus grids
• Has notion of software resource
• Few APIs
– extend through plug-ins
– starting to expose service interfaces
• Serves the user
http://www.unicore.org/

24 Condor: High-throughput computing
Condor converts collections of workstations and clusters into a distributed high-throughput computing facility
• Emphasis on policy management and reliability
• High-throughput scheduler
• Supports job checkpoint and migration
– single processor jobs only
• Remote system calls
Condor-G lets Condor users add Globus-enabled resources to their private view of a Condor pool ("flock")
• "glide-in"
http://www.cs.wisc.edu/condor/

25 Legion/Avaki
• Object-based meta-system, providing a single integrated infrastructure
• All components are objects (unlike GT)
– Data abstraction, encapsulation, inheritance, polymorphism
• API to core services
• Core object types
– Classes/metaclasses: managers and policy makers
– Host objects: abstractions of processing resources (one or many)
– Vault objects: persistent storage
– Implementation objects and caches: “executables”
– Binding agents: map objects to physical addresses
– Context objects: naming of objects

26 Globus Toolkit V2
• GT2 “Implements Grid protocols for security, information discovery, resource management, data management, communication, fault detection and portability”
• Bag of tools rather than a uniform programming model; aims to provide distinct services with well-defined APIs
• Assumes suitable software is deployed on resources to provide basic fabric functionality (although some tools to help with this are provided)
– Discovering and packaging structure and state information

27 Globus Toolkit version 2
• "Single sign-on" through Grid Security Infrastructure (GSI)
• Remote execution of jobs
– GRAM, job-managers, Resource Specification Language (RSL)
• GridFTP
– Efficient, reliable file transfer; third-party file transfers
• MDS (Metacomputing Directory Service)
– Resource discovery (GRIS and GIIS)
• Co-allocation (DUROC)
– Limited by support from scheduling infrastructure
• Other GSI-enabled utilities
– gsi-ssh, grid-cvs, etc.
• Low-level APIs and command-line interfaces
• Commodity Grid Kits (CoG-kits), Java, Perl, Python
• Widespread deployment, lots of projects
Figure: layered diagram (Applications; Diverse global services; Core services; Local OS)

28 Globus Toolkit V2
• Connectivity
– Grid Security Infrastructure (GSI) protocols
– Based on public-key infrastructure (PKI) and Internet protocols
– Single sign-on (authentication creates a proxy credential: a digitally signed certificate that grants the holder the right to perform operations on behalf of the signer for a limited time)
– Delegation (communication of a (restricted) proxy credential to a remote service)
– Credential format is an extension of the X.509 certificate
– Remote delegation protocol based on the Transport Layer Security (TLS) protocol (the follow-on to SSL)
– High-level programming API: extensions of the Generic Security Service Application Programming Interface (GSS-API)
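The proxy and delegation ideas above can be illustrated with a toy model. This sketch contains no cryptography and is not the GSI implementation: the `Credential` class, the "/CN=proxy" naming, the set of named rights and the numeric clock are assumptions used only to show that a proxy is time-limited, can be restricted, and is only as valid as the chain that issued it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Credential:
    """Toy credential: no signatures, just the fields that matter here."""
    subject: str
    issuer: Optional["Credential"]  # None for the user's long-term credential
    not_after: float                # expiry time on an arbitrary clock (seconds)
    rights: frozenset


def make_proxy(parent, lifetime, now, rights=None):
    """Derive a short-lived proxy; it can never outlive or exceed its parent."""
    granted = parent.rights if rights is None else parent.rights & frozenset(rights)
    expiry = min(parent.not_after, now + lifetime)
    return Credential(parent.subject + "/CN=proxy", parent, expiry, granted)


def is_valid(cred, now, right):
    """Walk the chain back to the user; every link must be live and grant the right."""
    while cred is not None:
        if now > cred.not_after or right not in cred.rights:
            return False
        cred = cred.issuer
    return True


# Hypothetical user with a long-lived credential and three named rights.
user = Credential("/O=Grid/CN=Alice", None, 1_000_000.0,
                  frozenset({"submit", "read", "write"}))
proxy = make_proxy(user, lifetime=12 * 3600, now=0.0)        # single sign-on
delegated = make_proxy(proxy, lifetime=3600, now=100.0,
                       rights={"read"})                      # restricted delegation

print(is_valid(delegated, 200.0, "read"))    # within lifetime, right granted
print(is_valid(delegated, 200.0, "submit"))  # right was not delegated
print(is_valid(delegated, 4000.0, "read"))   # delegated proxy has expired
```

The user signs on once to create `proxy`; remote services then receive further-restricted proxies rather than the long-term credential itself.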

29 Globus Toolkit V2
• Resource Layer
– Grid Resource Allocation and Management (GRAM) protocol
– Monitoring and Discovery Service (MDS-2)
– Grid File Transfer Protocol (GridFTP)

30 GRAM Protocol
• Grid Resource Allocation and Management
– Creation and management of remote computations
– GSI for authentication, authorization, delegation
– GRAM implementations map requests expressed in a Resource Specification Language (RSL) into commands understood by local schedulers and computers
– Multiple GRAM implementations exist (with C, Java, Python interfaces)
– GT2 implementation
    Based on the HTTP protocol
    “gatekeeper” initiates remote computations
    “jobmanager” manages a remote computation
    GRAM reporter monitors and publishes information
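To give a feel for what an RSL request looks like, the sketch below assembles one for the running example (an MPI job needing about 4 GB for several hours). The helper functions are hypothetical, and the attribute names (executable, count, jobType, maxMemory, maxWallTime, stdout), their units and the quoting rule follow the GT2 RSL conventions as best recalled; they should be checked against the Globus documentation before use.

```python
def rsl_value(value):
    """Quote a value unless it consists only of simple path-like characters."""
    text = str(value)
    if text and all(ch.isalnum() or ch in "/._-" for ch in text):
        return text
    # Assumed RSL convention: embedded double quotes are doubled.
    return '"' + text.replace('"', '""') + '"'


def build_rsl(**attributes):
    """Build a conjunction of (attribute=value) relations: &(a=1)(b=2)..."""
    relations = "".join(
        f"({name}={rsl_value(value)})" for name, value in attributes.items()
    )
    return "&" + relations


request = build_rsl(
    executable="/home/user/main",  # hypothetical path to the compiled Main.F
    count=8,                       # number of processes
    jobType="mpi",
    maxMemory=4096,                # assumed to be in megabytes
    maxWallTime=360,               # assumed to be in minutes
    stdout="main.out",
)
print(request)
```

A GRAM client would send such a string to the gatekeeper, and the jobmanager would translate it into a submission for the local scheduler (for example a PBS script).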

31 MDS-2
• Monitoring and Discovery Service
– Framework for discovering and accessing structure and status information about resources (and services)
    Data model for representing information
    Protocols for publishing and accessing information
– GT2 implementation
    Based on LDAP (Lightweight Directory Access Protocol)
    Local registry to manage collection and publication of information at a single location
    Collective registry to support queries for information from multiple locations
    Caching for performance
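The local registry, collective registry and caching roles can be sketched without LDAP. The classes below are toy stand-ins, not the MDS-2 API: the class names, attribute keys, host names and time-to-live policy are assumptions chosen to show why a collective registry caches what local registries report.

```python
import time


class LocalRegistry:
    """Toy stand-in for a per-resource registry (the local role)."""

    def __init__(self, name, probe):
        self.name = name
        self._probe = probe  # callable returning a dict of current attributes

    def query(self):
        return dict(self._probe(), host=self.name)


class CollectiveRegistry:
    """Toy stand-in for an aggregate registry with a time-to-live cache."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._members = {}
        self._cache = {}  # name -> (timestamp, record)

    def register(self, registry):
        self._members[registry.name] = registry

    def _record(self, name):
        now = self._clock()
        cached = self._cache.get(name)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]  # fresh enough: do not bother the resource
        record = self._members[name].query()
        self._cache[name] = (now, record)
        return record

    def search(self, predicate):
        """Return the records of all registered resources matching predicate."""
        return [r for r in map(self._record, self._members) if predicate(r)]


# Demonstration with a fake clock and a probe that counts how often it is called.
calls = {"n": 0}
fake_time = [0.0]


def probe_a():
    calls["n"] += 1
    return {"cpus": 64, "free_memory_gb": 8}


index = CollectiveRegistry(ttl_seconds=30, clock=lambda: fake_time[0])
index.register(LocalRegistry("a.example.org", probe_a))
index.register(LocalRegistry("b.example.org",
                             lambda: {"cpus": 8, "free_memory_gb": 2}))

hits = index.search(lambda r: r["free_memory_gb"] >= 4)
print([r["host"] for r in hits])
index.search(lambda r: True)   # served from cache, probe not called again
fake_time[0] = 31.0
index.search(lambda r: True)   # cache expired, probe called again
print(calls["n"])
```

The trade-off is the usual one: a longer time-to-live reduces load on resources but returns staler state such as queue length or free memory.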

32 GridFTP Protocol
• Extended version of the File Transfer Protocol
– GSI security
– Partial file access, high-speed striping
– Third-party transfers
– Separate control/data channels
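Partial file access is what makes striped or parallel transfers possible: each data channel moves a different byte range, and the receiver places blocks by offset. The sketch below shows only that arithmetic on an in-memory buffer; it is not the GridFTP wire protocol, and the function name and equal-sized split are assumptions.

```python
def stripe_ranges(file_size, stripes):
    """Split [0, file_size) into contiguous (offset, length) blocks."""
    if file_size < 0 or stripes < 1:
        raise ValueError("file_size must be >= 0 and stripes >= 1")
    base, extra = divmod(file_size, stripes)
    ranges = []
    offset = 0
    for i in range(stripes):
        length = base + (1 if i < extra else 0)  # spread the remainder
        if length:
            ranges.append((offset, length))
        offset += length
    return ranges


# Simulate three data channels delivering their blocks out of order.
data = bytes(range(10))
ranges = stripe_ranges(len(data), 3)
blocks = [(off, data[off:off + length]) for off, length in reversed(ranges)]

received = bytearray(len(data))
for off, chunk in blocks:
    received[off:off + len(chunk)] = chunk  # place each block by its offset

print(ranges)
print(bytes(received) == data)
```

Because every block carries its own offset, arrival order does not matter, which is why blocks can travel over separate channels.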

34 Web Services
• A Web service is a software system designed to support interoperable machine-to-machine interaction over a network.
• It has an interface that is described in a machine-processable format such as WSDL.
• Other systems interact with the Web service in a manner prescribed by its interface, using messages (usually enclosed in a SOAP envelope).
• These messages are typically conveyed using HTTP and are normally composed of XML.
• Software applications written in various programming languages and running on various platforms can use web services to exchange data over networks.
• This interoperability (e.g., between Java and Python, or Windows and Linux applications) is due to the use of open standards.
• OASIS and the W3C are the primary committees responsible for the architecture and standardization of web services.
• Specifications for additional features are under development.
• Basically: Web service = TRANSPORT (HTTP) + MESSAGING (SOAP) + DESCRIPTION (WSDL) + DISCOVERY (UDDI) + MESSAGE (XML)
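The MESSAGING (SOAP) + MESSAGE (XML) part of that equation can be made concrete with the Python standard library. This is a minimal sketch: the SOAP 1.1 envelope namespace is the standard one, but the service namespace, the `submitJob` operation and its parameters are invented for illustration, and no HTTP transport or WSDL is involved.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope
SERVICE_NS = "http://example.org/jobs"                 # hypothetical service


def build_request(operation, params):
    """Wrap an operation call and its parameters in a SOAP envelope."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def parse_request(xml_text):
    """Recover the operation name and parameters from a SOAP envelope."""
    envelope = ET.fromstring(xml_text)
    body = envelope.find(f"{{{SOAP_NS}}}Body")
    call = body[0]
    operation = call.tag.split("}", 1)[1]
    params = {child.tag.split("}", 1)[1]: child.text for child in call}
    return operation, params


message = build_request("submitJob", {"executable": "Main", "count": 8})
print(message)
print(parse_request(message))
```

Everything travels as text, so the parsed `count` comes back as the string "8"; in a real service the types would be fixed by the schema referenced from the WSDL description.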

35 Service Oriented Architecture
• Components are defined by service interfaces (e.g. Web Services)
• Characterized by:
– Abstract logical view of programs, databases, etc.
– Services defined by exchanged messages (not by properties of the agents themselves)
– Internal structure of an agent is not relevant (can accommodate legacy systems)
– Services defined by machine-processable metadata (documented semantics)
– Small number of operations
– Services oriented towards network usage
– Platform neutral (e.g. messages in XML)

36 Open Grid Services Architecture
• Resulted from an attempt to standardize GT protocols, influenced by the uptake of web services and SOA ideas:
– Modularize components for different grid functions
– Uniform treatment of network entities (service orientation)
– Standard IDLs aligned with Web services
– Develop within a standards body (Global Grid Forum)

37 Open Grid Services Architecture
• Grid Service
– A web service which is extended to include transient and stateful services
• OGSI specification
– Open Grid Services Infrastructure
– Defines interfaces, behaviours and conventions for grid services
– Now replaced by a range of web service definitions
• OGSA defines services and interfaces required in a working grid environment
– GGF working groups are identifying required functions and then making OGSI-compliant interfaces
• Multiple implementations
– GT3: reference implementation of OGSI and basic OGSA services
– GT4: pure web services

38 GT4
• Released April 2005
• Service oriented architecture
• Web services to describe and invoke most components
• GT4 web service containers for deploying and managing GT4 services (Java, C, Python)
• Most interfaces still need to be standardized

40 Coursework 3
• Write one or two pages describing each of the following Globus components:
– GRAM
– MDS
– GridFTP
• Best documentation and relevant papers at http://www.globus.org

41 Required Reading
• The Physiology of the Grid
– See course page for link

