Presentation is loading. Please wait.

Presentation is loading. Please wait.

Grid Computing and the Globus Toolkit

Similar presentations


Presentation on theme: "Grid Computing and the Globus Toolkit"— Presentation transcript:

1 Grid Computing and the Globus Toolkit
Jennifer M. Schopf Argonne National Lab National eScience Centre

2 What is a Grid? Resource sharing Coordinated problem solving
Computers, storage, sensors, networks, … Sharing always conditional: issues of trust, policy, negotiation, payment, … Coordinated problem solving Beyond client-server: distributed data analysis, computation, collaboration, … Dynamic, multi-institutional virtual orgs Community overlays on classic org structures Large or small, static or dynamic

3 Why Is this Hard or Different?
Lack of central control Where things run When they run Shared resources Contention, variability Communication Different sites implies different sys admins, users, institutional goals, and often “strong personalities”

4 So Why Do It? Computations that need to be done with a time limit
Data that can’t fit on one site Data owned by multiple sites Applications that need to be run bigger, faster, more

5 What Kinds of Applications?
Computation intensive Interactive simulation (climate modeling) Large-scale simulation and analysis (galaxy formation, gravity waves, event simulation) Engineering (parameter studies, linked models) Data intensive Experimental data analysis (e.g., physics) Image & sensor analysis (astronomy, climate) Distributed collaboration Online instrumentation (microscopes, x-ray) Remote visualization (climate studies, biology) Engineering (large-scale structural testing)

6 Key Common Feature The size and/or complexity of the problem requires that people in several organizations collaborate and share computing resources, data, instruments

7 The Globus Approach

8 The Role of the Globus Toolkit
A collection of solutions to problems that come up frequently when building collaborative distributed applications Heterogeneity A focus, in particular, on overcoming heterogeneity for application developers Standards We capitalize on and encourage use of existing standards (IETF, W3C, OASIS, GGF) GT also includes reference implementations of new/proposed standards in these organizations Heterogenity: Trying to support heterogenity: no preconditions: - different hardware (types of machines, different configurations, different “data producers”, etc.) - different software (different applications, different languages, different operating systems, etc.) Standards: Standards enable interoperability. An example: - If the community agrees to standardize certain patterns for interacting with a type of service, then services can interact with each other - WSRF

9 Higher-Level Services
Globus is an Hour Glass Higher-Level Services and Users Local sites have an their own policies, installs – heterogeneity! Queuing systems, monitors, network protocols, etc Globus unifies Build on Web services Use WS-RF, WS-Notification to represent/access state Common management abstractions & interfaces Standard GT4 Interfaces Local heterogeneity

10 On April 29, 2005 the Globus Alliance released the finest version of the Globus Toolkit to date!
Don’t take our word for it! Read the UK eScience Evaluation of GT4 (Reachable from under “News”)

11 How it Really Happens Implementations are provided by a mix of
Application-specific code “Off the shelf” tools and services Tools and services from the Globus Toolkit Tools and services from the Grid community (compatible with GT) Glued together by… Application development System integration

12 Links instruments, data,
A Typical eScience Use of Globus: Network for Earthquake Eng. Simulation Links instruments, data, computers, people

13 Without the Globus Toolkit
A Compute Server Simulation Tool B Compute Server Web Browser Web Portal Registration Service Camera Telepresence Monitor Data Viewer Tool Application Developer 10 Off the Shelf 12 Globus Toolkit Grid Community Camera Example application This user needs access to particular resources (computers, cameras and databases) In this case the application developer has a lot of work to do C Database service Chat Tool Data Catalog D Database service Credential Repository E Database service Certificate authority Users work with client applications Application services organize VOs & enable access to other services Collective services aggregate &/or virtualize resources Resources implement standard access & management interfaces

14 With the Globus Toolkit
Globus GRAM Compute Server Simulation Tool Globus GRAM Compute Server Web Browser CHEF Globus Index Service Camera Telepresence Monitor Data Viewer Tool Application Developer 2 Off the Shelf 9 Globus Toolkit 4 Grid Community Camera CHEF is a portal environment from ???? Key points: reuse rather than reinvent! The application developer can take a long vacation instead of working so hard implementing grid functionality interoperability: Globus DAI Database service CHEF Chat Teamlet Globus MCS/RLS Globus DAI Database service MyProxy Globus DAI Database service Certificate Authority Users work with client applications Application services organize VOs & enable access to other services Collective services aggregate &/or virtualize resources Resources implement standard access & management interfaces

15 Globus is Grid Infrastructure
Software for Grid infrastructure Service enable new & existing resources E.g., GRAM on computer, GridFTP on storage system, custom application service Uniform abstractions & mechanisms Tools to build applications that exploit Grid infrastructure Registries, security, data management, … Open source & open standards Each empowers the other Enabler of a rich tool & service ecosystem

16 The Globus Toolkit: “Standard Plumbing” for the Grid
Not turnkey solutions, but building blocks & tools for application developers & system integrators Some components (e.g., file transfer) go farther than others (e.g., remote job submission) toward end-user relevance Easier to reuse than to reinvent Compatibility with other Grid systems comes for free Today the majority of the GT public interfaces are usable by application developers and system integrators Relatively few end-user interfaces In general, not intended for direct use by end users (scientists, engineers, marketing specialists)

17 Globus is a Building Block
Basic components for grid functionality Highest-level services are often application specific, we let applications concentrate there Easier to reuse than to reinvent Compatibility with other Grid systems comes for free We provide basic infrastructure to get you one step closer

18 Standards

19 Leveraging Existing and Proposed Standards
WSRF and WS-N (GGF, OASIS) WS-Agreement, WSDL 2.0, WSDM GridFTP v1.0 (GGF) OGSI v1.0 (GGF) SSL/TLS v1 (from OpenSSL) (IETF) X.509 Proxy Certificates (IETF) SAML, XACML

20 GT Protocols Web service protocols WSDL, SOAP WS Addressing, WSRF, WSN
WS Security, SAML, XACML WS-Interoperability profile Non Web service protocols Standards-based, such as GridFTP Custom

21 WSRF & WS-Notification
Naming and bindings (basis for virtualization) Every resource can be uniquely referenced, and has one or more associated services for interacting with it Lifecycle (basis for fault resilient state management) Resources created by services following factory pattern Resources destroyed immediately or scheduled Information model (basis for monitoring & discovery) Resource properties associated with resources Operations for querying and setting this info Asynchronous notification of changes to properties Service Groups (basis for registries & collective svcs) Group membership rules & membership management Base Fault type

22 WS Core Enables Frameworks: E.g., Resource Management
Applications of the framework (Compute, network, storage provisioning, job reservation & submission, data management, application service QoS, …) WS-Agreement (Agreement negotiation) Service-Oriented Frameworks General functionality can be used, extended, specialized for specific purposes Both specifications and implementations Captures commonality that can be applied across resources/services Avoid least common denominator solutions Allows service consumers to operate at different levels of abstraction General operations against any service that implements the frameworks Specific operations against services that extend/specialize the frameworks WS Distributed Management (Lifecycle, monitoring, …) WS-Resource Framework & WS-Notification (*) (Resource identity, lifetime, inspection, subscription, …) Web services (WSDL, SOAP, WS-Security, WS-ReliableMessaging, …) * An evolution of Open Grid Services Infrastructure (OGSI)

23 WSRF vs XML/SOAP The definition of WSRF means that the Grid and Web services communities can move forward on a common base Why Not Just Use XML/SOAP? WSRF and WS-N are just XML and SOAP WSRF and WS-N are just Web services Benefits of following the specs: These patterns represent best practices that have been learned in many Grid applications There is a community behind them Why reinvent the wheel? Standards facilitate interoperability

24 Globus is a Tool A Grid development environment
Develop new OGSA-compliant Web Services Develop applications using Java or C/C++ Grid APIs Secure applications using basic security mechanisms A set of basic Grid functionality Services and clients Libraries Development tools and examples The prerequisites for many Grid community tools

25 GT Domain Areas Core runtime Security Execution management
Infrastructure for building new services Security Apply uniform policy across distinct systems Execution management Provision, deploy, & manage services Data management Discover, transfer, & access large data Monitoring Discover & monitor dynamic services

26 Globus Toolkit: Open Source Grid Infrastructure
Globus Toolkit v4 Data Replication Credential Mgmt Replica Location Grid Telecontrol Protocol Delegation Data Access & Integration Community Scheduling Framework WebMDS Python Runtime Community Authorization Reliable File Transfer Workspace Management Trigger C Runtime Authentication Authorization GridFTP Grid Resource Allocation & Management Index Java Runtime Security Data Mgmt Execution Mgmt Info Services Common Runtime

27 Our Goals for GT4 Usability, reliability, scalability, …
Web service components have quality equal or superior to pre-WS components Documentation at acceptable quality level Consistency with latest standards (WS-*, WSRF, WS-N, etc.) and Apache platform WS-I Basic Profile compliant WS-I Basic Security Profile compliant New components, platforms, languages And links to larger Globus ecosystem

28 WSRF vs XML/SOAP The definition of WSRF means that the Grid and Web services communities can move forward on a common base Why Not Just Use XML/SOAP? WSRF and WS-N are just XML and SOAP WSRF and WS-N are just Web services Benefits of following the specs: These patterns represent best practices that have been learned in many Grid applications There is a community behind them Why reinvent the wheel? Standards facilitate interoperability

29 GT2 Evolution To GT4 ALL of GT2 functionality is in GT4
What happened to the GT2 key protocols? Security: Adapted X.509 proxy certs to integrate with emerging WS standards GRAM: Took ad hoc protocols away and now use WS-RF standards (ManagedJobFactory and related service definitions) GridFTP: Using GridFTP standard from GGF, Reliable File Transfer (RFT) supplies the WSRF-compliant interface MDS/LDAP: Replaced LDAP extensions with WSRF standards for notification, subscription, and registration

30 GT2 vs GT4 Pre-WS Globus is in GT4 release
Both WS and pre-WS components (ala 2.4.3) are shipped These do NOT interact, but both can run on the same resource independently Basic functionality is the same Run a job Transfer a file Monitoring Security Code base is completely different

31 Why Use GT4? Performance and reliability Scalability Support
Literally millions of tests and queries run against GT4 services Scalability Many lessons learned from GT2 have been addressed in GT4 Support This is our active code base, much more attention Additional functionality New features are here Additional GRAM interfaces to schedulers, MDS Trigger service, GridFTP protocol interfaces, etc Easier to contribute to

32 4.0 is not a typical “.0” release, but the culmination of months of testing
3.0.2 4.0.2 4.0 was actually the 7th release of the WSRF-based code: Abstract toolkit, difficult to test in isolation Better testing and documentation was actually the 6th release of GT4 - Documentation and code was tested at each release by an engaged community The recent release included: Important bug fixes, including the WS MDS exploit and C message-level security fixes Improved support for Tomcat container A version of OGSA-DAI with a complete set of public interfaces MPICH-G2 support Updates of many contributions and tech previews Full Apache license 3.0.1 3.2.1 4.0.1 3.0.0 3.2.0 3.9.0 3.9.2 3.9.4 4.0.0 3.3.0 3.9.1 3.9.3 3.9.5 CVS trunk Stable release branch Development release Stable release

33 Versioning and Support
Evens are production (4.0.x, 4.2.x), Odds are development (4.1.x) We support this version and the one previous Currently we’re at so we support 3.2 and 4.0 There is also a development release

34 Several Possible Next Versions
4.0.3 – stable release 100% same interfaces, bug fixes only Perhaps in the fall? 4.1.1 – development release New functionality Likely 6-10 weeks? 4.2 - stable release When 4.1 has “enough” new functionality, and is stable 5.0 – substantial code base change With any luck, not for years :)

35 Testing Overview Nightly builds and tests TestGrid at USC/ISI
Stand up services for several weeks Perform stress tests TestGrid at LBNL Focus on WS Core performance and interoperability tests Performance and reliability testing is a major focus Component-specific approaches mostly Calls for Community Testing near release time - we welcome new testing help!

36 Tested Platforms Debian Fedora Core FreeBSD HP/UX IBM AIX Red Hat Sun Solaris SGI Altix (IA64 running Red Hat) SuSE Linux Tru64 Unix Apple MacOS X (no binaries) Windows – Java components only List of binaries and known platform-specific install bugs at docbook/ ch03.html

37 Documentation Overview
Current document significantly more detailed than earlier versions Tutorials available for those of you building a new service Globus® Toolkit 4: Programming Java Services (The Morgan Kaufmann Series in Networking), by Borja Sotomayor, Lisa Childers (Available through Amazon, £ or $20)

38 Grid Packaging Technology (GPT)
Collection of XML-based packaging tools Straight forward definition of complex dependency and compatibility relationships between packages Way for developers to define the packaging data and include it as part of their source code distribution Automatic generation of binary packages Developer tools Convert a source distribution into a GPT package Patch-n-build capability similar to RPM spec files so you can retain their own build system if needed User Tools Enable collections of packages to be built and/or installed Package manager for those systems that don't have one Developed at NCSA

39 Virtual Data Toolkit (VDT)
Grid middleware distribution focused on ease of use (and installation) Contents Globus Toolkit & Condor, Condor-G Virtual Data Tools (Chimera, Pegasus, RLS) Utilities (GSI-OpenSSH, MyProxy, MonaLisa, NetLogger, KX.509, etc.) GriPhyN Virtual Data System (containing Chimera and Pegasus) Uses PACMAN for distribution, install, configuration. Used by GriPhyN, iVDGL, Open Science Grid, LCG, UK NGS, and others

40 Installation in a nutshell
Quickstart guide is very useful admin/docbook/quickstart.html Verify your prereqs! Security – check spellings and permissions Globus is system software – plan accordingly

41 Now that you’ve done your installation… Lets talk about what you get!

42 Globus Toolkit: Open Source Grid Infrastructure
Globus Toolkit v4 Data Replication Credential Mgmt Replica Location Grid Telecontrol Protocol Delegation Data Access & Integration Community Scheduling Framework WebMDS Python Runtime Community Authorization Reliable File Transfer Workspace Management Trigger C Runtime Authentication Authorization GridFTP Grid Resource Allocation & Management Index Java Runtime Security Data Mgmt Execution Mgmt Info Services Common Runtime

43 GT4 Web Services Runtime
Supports both GT (GRAM, RFT, Delegation, etc.) & user-developed services Redesign to enhance scalability, modularity, performance, usability Leverages existing WS standards WS-I Basic Profile: WSDL, SOAP, etc. WS-Security, WS-Addressing Adds support for emerging WS standards WS-Resource Framework, WS-Notification Java, Python, & C hosting environments Java is standard Apache

44 What does Core give you? Reference implementation of WSRF and WS-N functions Naming and bindings (basis for virtualization) Every resource can be uniquely referenced and has one or more associated services for interacting Lifecycle (basis for resilient state management) Resources created by svcs following a factory pattern Resource destroyed immediately or scheduled Information model (basis for monitoring & discovery) Resource properties associated with resources Operations for querying and setting this info Asynchronous notification of changes to properties Service groups (basis for registries & collective svcs) Group membership rules and membership management Base fault type

45 Apache Axis Web Services Container
Good news for Java WS developers: GT4.0 works with standard Axis* and Tomcat* GT provides Axis-loadable libraries, handlers Includes useful behaviors such as inspection, notification, lifetime mgmt (WSRF) Others implement GRAM, etc. Major Globus contributions to Apache ~50% of WS-Addressing code ~15% of WS-Security code Many bug fixes WSRF code a possible next contribution GT bits App bits Security Addressing Axis * Modulo Axis and Tomcat release cycle issues

46 GT4 Web Services Runtime
Custom Web Services WS-Addressing, WSRF, WS-Notification WSRF Web GT4 WSRF Web WSDL, SOAP, WS-Security User Applications Administration Registry GT4 Container

47 WS Core Enables Frameworks: E.g., Resource Management
Applications of the framework (Compute, network, storage provisioning, job reservation & submission, data management, application service QoS, …) WS-Agreement (Agreement negotiation) Service-Oriented Frameworks General functionality can be used, extended, specialized for specific purposes Both specifications and implementations Captures commonality that can be applied across resources/services Avoid least common denominator solutions Allows service consumers to operate at different levels of abstraction General operations against any service that implements the frameworks Specific operations against services that extend/specialize the frameworks WS Distributed Management (Lifecycle, monitoring, …) WS-Resource Framework & WS-Notification (Resource identity, lifetime, inspection, subscription, …) Web services (WSDL, SOAP, WS-Security, WS-ReliableMessaging, …)

48 WSRF/WSNs Compared (Humphrey et al, HPDC 2005)
GT4-Java GT4-C pyGridWare WSRF::Lite WSRF.NET Languages supported Java C Python Perl C#/C++/VBasic, etc. WS-Security password profile Yes No In progress WS-Security X.509 profile WS-SecureConversation TLS/SSL Authorization Multiple Callout None Persistence of WS-Resources Not default Memory Footprint JVM + 10M 22 KB 12 MB Depends Memory size per WS-Resource Depends on resource state 70B 0 (file/DB) or 10B (process) Unmodified hosting environment Yes (Apache) Compliance with WS-I Basic Profile Compliance with WS-I Basic Security Profile Logging Log4J WSE diagnostics WS-ResourceLifetime WS-ResourceProperties WS-ServiceGroup WS-BaseFaults WS-BaseNotification Consumer WS-BrokeredNotification Partial WS-Topics WSRF/WSNs Compared (Humphrey et al, HPDC 2005)

49 (times in milliseconds)
GetRP Test Distributed client and service on same LAN (times in milliseconds) GT4 - Java GT4 - C pyGridWare WSRF::Lite WSRF.NET 149.67 No Security X509 Signing HTTPS This is Marty Humphrey’s slide GT4-C is fastest WSRF.NET and GT4-Java are comparable for no security and HTTPS WSRF.NET is fastest (except GT4-C) for X509 Message layer security is an order of magnitude slower than transport layer security Largely, we see these patterns repeat across all tests Mention that pyG does not implement https connection caching and that’s why their HTTPS perf is slow (but they are implementing it) 25.57 181.96 GT4 - Java GT4 - C pyGridWare WSRF::Lite WSRF.NET 17.1 140.5 55.6 81.39 10.05 8.23 N/A 2.34 14.8 11.46 12.91 2.85 GT4 - Java GT4 - C pyGridWare WSRF::Lite WSRF.NET

50 GT4 WS Core Performance (1) Message-level security (times in milliseconds) GT4 Java GT4 C GT4 Python WSRF.NET GetRP 181.96 14.77 140.50 81.39 SetRP 182.04 14.99 142.21 82.48 CreateR 188.46 14.98 132.26 96.22 DestroyR 182.03 15.76 136.12 86.89 Notify 219.51 N/A 244.93 101.57 (2) Transport-level security (times in milliseconds) GT4 Java GT4 C GT4 Python WSRF.NET getRP 11.46 2.85 149.67 12.91 setRP 11.47 2.86 150.79 12.3 createR 18.00 2.82 132.60 20.84 destroyR 14.92 2.71 149.21 16.05 Notify 29.26 9.67 169.07 45.0 “WSRF/WSNs Compared,” HPDC 2005.

51 Globus Toolkit: Open Source Grid Infrastructure
Globus Toolkit v4 Data Replication Credential Mgmt Replica Location Grid Telecontrol Protocol Delegation Data Access & Integration Community Scheduling Framework WebMDS Python Runtime Community Authorization Reliable File Transfer Workspace Management Trigger C Runtime Authentication Authorization GridFTP Grid Resource Allocation & Management Index Java Runtime Security Data Mgmt Execution Mgmt Info Services Common Runtime

52 Globus Security Control access to shared services
Address autonomous management, e.g., different policy in different work-groups Support multi-user collaborations Federate through mutually trusted services Local policy authorities rule Allow users and application communities to set up dynamic trust domains Personal/VO collection of resources working together based on trust of user/VO

53 Virtual Organization (VO) Concept
VO for each application or workload Carve out and configure resources for a particular use and set of users

54 GT4 Security Services (running Access Compute Center Rights Users
Authz Callout: SAML, XACML SSL/WS-Security with Proxy Certificates Services (running on user’s behalf) Rights Compute Center Access CAS or VOMS issuing SAML or X.509 ACs Local policy on VO identity or attribute authority VO Rights Users Rights’ KCA MyProxy

55 GT Authorization Framework

56 GT4 Security Public-key-based authentication
Transport- and message-level authentication Extensible authorization framework based on Web services standards SAML-based authorization callout Integrated policy decision engine XACML policy language, per-operation policies, pluggable Credential management service MyProxy (One time password support) Community Authorization Service Standalone delegation service Ability to map between Grid and local identity

57 Security Tools Basic Grid Security Mechanisms
Certificate Generation Tools Certificate Management Tools Getting users “registered” to use a Grid Getting Grid credentials to wherever they’re needed in the system Authorization/Access Control Tools Storing and providing access to system-wide authorization information

58 Other Security Services Include …
MyProxy Simplified credential management Web portal integration Single-sign-on support KCA & kx.509 Bridging into/out-of Kerberos domains SimpleCA Online credential generation PERMIS Authorization service callout

59 A Cautionary Note Grid security mechanisms are tedious to set up
If exposed to users, hand-holding is usually required These mechanisms can be hidden entirely from end users, but still used behind the scenes These mechanisms exist for good reasons. Many useful things can be done without Grid security It is unlikely that an ambitious project could go into production operation without security like this Most successful projects end up using Grid security, but using it in ways that end users don’t see much

60 GT4’s Use of Security Standards
Supported, Supported, Fastest, but slow but insecure so default

61 GT-XACML Integration eXtensible Access Control Markup Language
OASIS standard, open source implementations XACML: sophisticated policy language Globus Toolkit ships with XACML runtime Included in every client and server built on GT Turned-on through configuration … that can be called transparently from runtime and/or explicitly from application … … and we use the XACML-”model” for our Authz Processing Framework

62 Globus Certificate Service
An online service that issues low-quality GSI certificates Intended for people who want to experiment with Grid components that require certificates but do not have any other means of acquiring certificates. These certificates are not to be used on production systems. Not a true Certificate Authority (CA) No revoking or reissuing certificates No verification of identities The service itself is not especially secure.

63 Simple CA A convenient method of setting up a certificate authority (CA). The Certificate Authority can then be used to issue certificates for users and services that work with GSI and WS-Security. Simple CA is intended for operators of small Grid testing environments and users who are not part of a larger Grid. Most production Grids will not accept certificates that are not signed by a well-known CA, so the certificates generated by Simple CA will usually not be sufficient to gain access to production services.

64 MyProxy MyProxy is a remote service that stores user credentials.
Users can request proxies for local use on any system on the network. Web Portals can request user proxies for use with back-end Grid services. Grid administrators can pre-load credentials in the server for users to retrieve when needed. Greatly simplifies certificate management!

65 CAS: Community Authorization Service
CAS allows resource providers to specify course-grained access control policies in terms of communities as a whole. Fine-grained access control is delegated to the community. Resource providers maintain ultimate authority over their resources (including per-user control and auditing) but are spared most day-to-day policy administration tasks.

66 VOMS A community-level group membership system Database of user roles
Administrative tools Client interface voms-proxy-init Uses client interface to produce an attribute certificate (instead of proxy) that includes roles & capabilities signed by VOMS server Works with non-VOMS services, but gives more info to VOMS-aware services Allows VOs to centrally manage user roles

67 Globus Toolkit: Open Source Grid Infrastructure
Globus Toolkit v4 Data Replication Credential Mgmt Replica Location Grid Telecontrol Protocol Delegation Data Access & Integration Community Scheduling Framework WebMDS Python Runtime Community Authorization Reliable File Transfer Workspace Management Trigger C Runtime Authentication Authorization GridFTP Grid Resource Allocation & Management Index Java Runtime Security Data Mgmt Execution Mgmt Info Services Common Runtime

68 Execution Management (GRAM)
Common WS interface to schedulers Unix, Condor, LSF, PBS, SGE, … More generally: interface for process execution management Lay down execution environment Stage data Monitor & manage lifecycle Kill it, clean up A basis for application-driven provisioning

69 GRAM - Basic Job Submission and Control Service
A uniform service interface for remote job submission and control Includes file staging and I/O management Includes reliability features Supports basic Grid security mechanisms Available in Pre-WS and WS GRAM is not a scheduler. No scheduling No metascheduling/brokering Often used as a front-end to schedulers, and often used to simplify metaschedulers/brokers

70 GT4 WS GRAM 2nd-generation WS implementation optimized for performance, flexibility, stability, scalability Streamlined critical path Use only what you need Flexible credential management Credential cache & delegation service GridFTP & RFT used for data operations Data staging & streaming output Eliminates redundant GASS code

71 GRAM Intended for jobs where arbitrary programs, stateful monitoring, credential management, and file staging are important If the application is lightweight, with modest input/output, may be a better candidate for hosting directly as a WSRF service

72 GT4 WS GRAM Architecture
Service host(s) and compute element(s) Job events SEG GT4 Java Container Compute element GRAM services GRAM services Local job control Local scheduler Job functions sudo GRAM adapter Delegate Client Transfer request Delegation Delegate GridFTP User job RFT File Transfer FTP control FTP data Remote storage element(s) GridFTP

73 GT4 WS GRAM Architecture
Service host(s) and compute element(s) Job events SEG GT4 Java Container Compute element GRAM services GRAM services Local job control Local scheduler Job functions sudo GRAM adapter Delegate Client Transfer request Delegation Delegate GridFTP User job RFT File Transfer FTP control FTP data Remote storage element(s) GridFTP Delegated credential can be: Made available to the application

74 GT4 WS GRAM Architecture
Service host(s) and compute element(s) Job events SEG GT4 Java Container Compute element GRAM services GRAM services Local job control Local scheduler Job functions sudo GRAM adapter Delegate Client Transfer request Delegation Delegate GridFTP User job RFT File Transfer FTP control FTP data Remote storage element(s) GridFTP Delegated credential can be: Used to authenticate with RFT

75 GT4 WS GRAM Architecture
Service host(s) and compute element(s) Job events SEG GT4 Java Container Compute element GRAM services GRAM services Local job control Local scheduler Job functions sudo GRAM adapter Delegate Client Transfer request Delegation Delegate GridFTP User job RFT File Transfer FTP control FTP data Remote storage element(s) GridFTP Delegated credential can be: Used to authenticate with GridFTP

76 Submitting a Sample Job
Specify a remote host with –F globusrun-ws –submit –F host2 –c /bin/true The return code will be the job’s exit code if supported by the scheduler

77 Data Staging and Streaming
Simplest stage-in/stage-out example is stdout/stderr globusrun-ws –S –s –c /bin/date -S is short for “-submit” -s is short for –streaming The output will be sent back to the terminal, control will not return until the job is done

78 Resource Specification Language
For more complicated jobs, we’ll use RSL to specify the job <job> <executable>/bin/echo</executable> <argument>this is an example_string </argument> <argument>Globus was here</argument> <stdout>${GLOBUS_USER_HOME}/stdout</stdout> <stderr>${GLOBUS_USER_HOME}/stderr</stderr> </job>

79 Resource Specification Language
<job> <executable>/bin/echo</executable> <directory>/tmp</directory> <argument>12</argument> <environment><name>PI</name> <value>3.141</value></environment> <stdin>/dev/null</stdin> <stdout>stdout</stdout> <stderr>stderr</stderr> </job>

80 Resource Specification Language
<job> <executable>/bin/echo</executable> <directory>/tmp</directory> <argument>12</argument> <environment><name>PI</name> <value>3.141</value></environment> <stdin>/dev/null</stdin> <stdout>stdout</stdout> <stderr>stderr</stderr> </job>

81 Submitting Using XML If the file validates, submit using -submit
Create the file containing the RSL You may validate the RSL ahead of time globusrun-ws –validate –f rslfile.xml If the file validates, submit using submit

82 At Most Once Submission
You may specify a UUID with your job submission If you’re not sure the submission worked, you may submit the job again with the same UUID If the job has already been submitted, the new submission will have no effect If you do not specify a UUID, one will be generated for you

83 Staging Data GRAM’s RSL allows many fileStageIn/fileStageOut directives The transfers will be executed by RFT May specify additional RFT options using the RFTOptions tag There is no GASS cache staging option anymore

84 Staging Data – Stage In GRAM’s RSL allows many fileStageIn/fileStageOut directives <fileStageIn> <transfer> <sourceUrl> gsiftp://job.submitting.host:2811/bin/echo</sourceUrl> <destinationUrl>file:///${GLOBUS_USER_HOME}/my_echo</destinationUrl> </transfer> </fileStageIn>

85 Staging Data – Stage Out
<fileStageOut> <transfer> <sourceUrl>file://${GLOBUS_USER_HOME}/stdout </sourceUrl> <destinationUrl>gsiftp://job.submitting.host:2811/tmp/stdout</destinationUrl> </transfer> </fileStageOut>

86 Staging Data - Cleanup <fileCleanUp> <deletion>
<file>file://${GLOBUS_USER_HOME}/my_echo</file> </deletion> </fileCleanUp>

87 Staging Data - Credentials
The GridFTP servers you’re using may require different credentials than the GRAM service you’re submitting to The RSL allows you to specify separate credentials for the executable and staging components of the job

88 RSL Substitutions GRAM will perform some variable substitutions for you GLOBUS_USER_HOME GLOBUS_USER_NAME GLOBUS_SCRATCH_DIR GLOBUS_LOCATION SCRATCH_DIR will be a compute-node local high-speed storage if defined, or GLOBUS_USER_HOME if not

89 Batch Submission Your client does not have to stay attached to the execution of the job -batch will disconnect from the job and output an EPR You may redirect the EPR to a file with –o Use the EPR file with –monitor or -status You may also kill the job using -kill

90 Specifying Scheduler Options
RSL lets you specify various scheduler options what queue to submit to which project to select for accounting max CPU and wallclock time to spend min/max memory required All defined online under the schema document for GRAM

91 Choosing User Accounts
You may be authorized to use more than one account at the remote site By default, the first listed in the grid-mapfile will be used You may request a specific user account using the <localUserId> element

92 Multijobs You may specify more than one <job> element in a <multijob> At that point, you want to specify the <factoryEndpoint> in the RSL rather than the commandline Will be used by MPICH-G to support MPI jobs

93 WS GRAM Performance Time to submit a basic GRAM job Concurrent jobs
Pre-WS GRAM: < 1 second WS GRAM: 2 seconds Concurrent jobs Pre-WS GRAM: 300 jobs WS GRAM: 32,000 jobs

94 CondorG The Condor project has produced a “helper front-end” to GRAM
Managing sets of subtasks Reliable front-end to GRAM to manage computational resources Note: this is not Condor which promotes high-throughput computing, and use of idle resources

95 Chimera “Virtual Data”
Captures both logical and physical steps in a data analysis process. Transformations (logical) Derivations (physical) Builds a catalog. Results can be used to “replay” analysis. Generation of DAG (via Pegasus) Execution on Grid Catalog allows introspection of analysis process. Sloan Survey Data Galaxy cluster size distribution

96 Pegasus Workflow Transformation
Converts Abstract Workflow (AW) into Concrete Workflow (CW). Uses Metadata to convert user request to logical data sources Obtains AW from Chimera Uses replication data to locate physical files Delivers CW to DAGman Executes using Condor Publishes new replication and derivation data in RLS and Chimera (optional) Chimera Virtual Data Catalog Metadata Catalog t DAGman Replica Location Service Condor Compute Server Storage System Compute Server Storage System Storage System Compute Server Compute Server

97

98 Globus Toolkit: Open Source Grid Infrastructure
Globus Toolkit v4 Data Replication Credential Mgmt Replica Location Grid Telecontrol Protocol Delegation Data Access & Integration Community Scheduling Framework WebMDS Python Runtime Community Authorization Reliable File Transfer Workspace Management Trigger C Runtime Authentication Authorization GridFTP Grid Resource Allocation & Management Index Java Runtime Security Data Mgmt Execution Mgmt Info Services Common Runtime

99 GT4 Data Management Stage/move large data to/from nodes
GridFTP, Reliable File Transfer (RFT) Alone, and integrated with GRAM Locate data of interest Replica Location Service (RLS) Replicate data for performance/reliability Distributed Replication Service (DRS) Provide access to diverse data sources File systems, parallel file systems, hierarchical storage: GridFTP Databases: OGSA DAI

100 GridFTP A high-performance, secure, reliable data transfer protocol optimized for high-bandwidth wide-area networks FTP with well-defined extensions Uses basic Grid security (control and data channels) Multiple data channels for parallel transfers Partial file transfers Third-party (direct server-to-server) transfers Reusable data channels Command pipelining GGF recommendation GFD.20

101 GridFTP in GT4 Disk-to-disk on TeraGrid 100% Globus code IPv6 Support
No licensing issues Stable, extensible IPv6 Support XIO for different transports Striping  multi-Gb/sec wide area transport Pluggable Front-end: e.g., future WS control channel Back-end: e.g., HPSS, cluster file systems Transfer: e.g., UDP, NetBLT transport Future: integration of provisioning E.g., storage reservation

102 Striped Server Multiple nodes work together and act as a single GridFTP server An underlying parallel file system allows all nodes to see the same file system and must deliver good performance (usually the limiting factor in transfer speed) I.e., NFS does not cut it Each node then moves (reads or writes) only the pieces of the file that it is responsible for. This allows multiple levels of parallelism, CPU, bus, NIC, disk, etc. Critical if you want to achieve better than 1 Gbs without breaking the bank

103 Striped GridFTP Service
Parallel Transfer Fully utilizes bandwidth of network interface on single nodes. Striped Transfer Fully utilizes bandwidth of Gb+ WAN using multiple nodes. Parallel Filesystem A distributed GridFTP service that runs on a storage cluster Every node of the cluster is used to transfer data into/out of the cluster Head node coordinates transfers Multiple NICs/internal busses lead to very high performance Maximizes use of Gbit+ WANs

104

105 Typical Approach (without XIO)
Network Protocol Network Protocol Protocol API Network Protocol Disk Application POSIX IO Proprietary API Special Device

106 Globus XIO Approach Globus XIO Driver Driver Application Driver
Network Protocol Network Protocol Driver Network Protocol Disk Globus XIO Driver Application Driver Special Device

107 Drivers Make 1 API do many types of IO
Specific drivers for specific protocols/devices Transform Manipulate or examine data Do not move data outside of process space Compression, Security, Logging Transport Moves data across a wire TCP, UDP, File IO, Device IO Typically move data outside of process space

108 Stack Transport Transform
Example Driver Stack Transport Exactly one per stack Must be on the bottom Transform Zero or many per stack Control flows from user to the top of the stack, to the transport driver. Compression Logging TCP

109 Copying Files (in a nutshell)
globus-url-copy [options] srcURL dstURL guc gsiftp://localhost/foo file:///bar Client/server, using FTP stream mode guc –vb –dbg –tcp-bs –p 8 gsiftp://localhost/foo gsiftp://localhost/bar 3rd party transfer, MODE E guc https://host.domain.edu/foo ftp://host.domain.gov/bar from secure http to ftp server

110 The Options: Improving Performance
-p (parallelism or number of streams) rule of thumb 4-8, start with 4 -tcp-bs (TCP buffer size) use either ping or traceroute to determine the RTT between hosts buffer size = BW (Mbs) * RTT (ms) *1000/8/<(parallelism value – 1)> If that is still too complicated use 2MB -vb if you want performance feedback -dbg if you have trouble

111 Tuning GridFTP Many ways you can tune the performance
Two sources of data are MaxGridFTP.pdf

112 Exercise: Simple File Movement
grid-proxy-init echo test > /tmp/test look at servers started for each: guc gsiftp://hostname/tmp/test file:///tmp/test2 get (from server to client) guc file:///tmp/test2 gsiftp://hostname/tmp/test3 put (from client to server) guc gsiftp://hostname1/tmp/test3 gsiftp://hostname2/tmp/test4 Third party transfer (between two servers) guc –dcpriv gsiftp://localhost/dev/zero gsiftp://localhost/dev/null transfer with encryption on data channel

113 Troubleshooting Can I get connected?
telnet to the port: telnet hostname port 2811 is the default port You should get something like this: 220 GridFTP Server gridftp.mcs.anl.gov 0.17 (gcc32dbg, ) ready. ** Development Release ** If not, you have firewall problems, or xinetd config problems. You are never even starting the server.

114 Troubleshooting no proxy grid-proxy-destroy
guc gsiftp://localhost/dev/zero file:///dev/null add –dbg grid-proxy-init

115 Troubleshooting Bad source file grid-proxy-init
guc gsiftp://localhost:2812/tmp/junk file:///tmp/empty junk does not exist Note that an empty file named empty is created We need to fix this in globus-url-copy, but for now it is there

116 RFT - File Transfer Queuing
A WSRF service for queuing file transfer requests Server-to-server transfers Checkpointing for restarts Database back-end for failovers Allows clients to requests transfers and then “disappear” No need to manage the transfer Status monitoring available if desired

117 Reliable File Transfer: Third Party Transfer
Fire-and-forget transfer Web services interface Many files & directories Integrated failure recovery Has transferred 900K files RFT Client SOAP Messages Notifications (Optional) RFT Service GridFTP Server GridFTP Server Master DSI Protocol Interpreter Data Channel IPC Receiver Slave DSI IPC Link Data Channel Protocol Interpreter Master DSI Slave DSI IPC Receiver IPC Link

118 Replica Location Service
Identify location of files via logical to physical name map Distributed indexing of names, fault tolerant update protocols GT4 version scalable & stable Managing ~40 million files across ~10 sites Index Index Local DB Update send (secs) Bloom filter (secs) Bloom filter (bits) 10K <1 2 1 M 24 10 M 5 M 7 175 50 M

119 Reliable Wide Area Data Replication
LIGO Gravitational Wave Observatory Cardiff AEI/Golm Birmingham• Replicating >1 Terabyte/day to 8 sites >30 million replicas so far MTBF = 1 month

120 OGSA-DAI Grid Interfaces to Databases Extensible & Efficient framework
Data access Relational & XML Databases, semi-structured files Data integration Multiple data delivery mechanisms, data translation Extensible & Efficient framework Request documents contain multiple tasks A task = execution of an activity Group work to enable efficient operation Extensible set of activities > 30 predefined, framework for writing your own Moves computation to data Pipelined and streaming evaluation Concurrent task evaluation

121 OGSA-DAI Provide service-based access to structured data resources as part of Globus Specify a selection of interfaces tailored to various styles of data access—starting with relational and XML

122 The OGSA-DAI Framework
Application Client Toolkit OGSA-DAI service Data resources used to be called connection managers Engine readFile XPath SQLQuery GZip XSLT GridFTP Activities JDBC XMLDB File Data Resources MySQL SQL Server DB2 XIndice SWISS PROT Data- bases

123 Extensibility Example
OGSA-DAI service Engine SQLQuery SQLQuery Data resources used to be called connection managers Multiple SQL GDS JDBC MySQL SQL JDBC SQL JDBC SQL JDBC SQL JDBC

124 OGSA-DAI: A Framework for Building Applications
Supports data access, insert and update Relational: MySQL, Oracle, DB2, SQL Server, Postgres XML: Xindice, eXist Files – CSV, BinX, EMBL, OMIM, SWISSPROT,… Supports data delivery SOAP over HTTP FTP; GridFTP Inter-service Supports data transformation XSLT ZIP; GZIP Supports security X.509 certificate based security

125 OGSA-DAI: Other Features
A framework for building data clients Client toolkit library for application developers A framework for developing functionality Extend existing activities, or implement your own Mix and match activities to provide functionality you need Highly extensible Customise our out-of-the-box product Provide your own services, client-side support, and data-related functionality

126 Data Replication Service (tech preview)
Pull “missing” files to local site Site A Site B List of required Files Data Replication Service Reliable File Transfer Service Reliable File Transfer Service Data Replication Service Local Replica Catalog Replica Location Index GridFTP GridFTP Local Replica Catalog Replica Location Index

127 MCS - Metadata Catalog Service
A stand-alone metadata catalog service WSRF service interface Stores system-defined and user-defined attributes for logical files/objects Supports manipulation and query Integrated with OGSA-DAI OGSA-DAI provides metadata storage When run with OGSA-DAI, basic Grid authentication mechanisms are available

128 Globus Toolkit: Open Source Grid Infrastructure
Globus Toolkit v4 Data Replication Credential Mgmt Replica Location Grid Telecontrol Protocol Delegation Data Access & Integration Community Scheduling Framework WebMDS Python Runtime Community Authorization Reliable File Transfer Workspace Management Trigger C Runtime Authentication Authorization GridFTP Grid Resource Allocation & Management Index Java Runtime Security Data Mgmt Execution Mgmt Info Services Common Runtime

129 Monitoring and Discovery System (MDS4)
Grid-level monitoring system Aid user/agent to identify host(s) on which to run an application Warn on errors Uses standard interfaces to provide publishing of data, discovery, and data access, including subscription/notification WS-ResourceProperties, WS-BaseNotification, WS-ServiceGroup Functions as an hourglass to provide a common interface to lower-level monitoring tools

130 Schedulers, Portals, Warning Systems, etc.
Information Users : Schedulers, Portals, Warning Systems, etc. WS standard interfaces for subscription, registration, notification Standard Schemas (GLUE schema, eg) Cluster monitors (Ganglia, Hawkeye, Clumon, and Nagios) Queuing systems (PBS, LSF, Torque) Services (GRAM, RFT, RLS)

131 MDS4 Components Information providers Higher level services Clients
Monitoring is a part of every WSRF service Non-WS services are also be used Higher level services Index Service – a way to aggregate data Trigger Service – a way to be notified of changes Both built on common aggregator framework Clients WebMDS All of the tool are schema-agnostic, but interoperability needs a well-understood common language

132 Information Providers
Data sources for the higher-level services Some are built into services Any WSRF-compliant service publishes some data automatically WS-RF gives us standard Query/Subscribe/Notify interfaces GT4 services: ServiceMetaDataInfo element includes start time, version, and service type name Most of them also publish additional useful information as resource properties

133 Information Providers (2)
Other sources of data Any executables Other (non-WS) services Interface to another archive or data store File scraping Just need to produce a valid XML document

134 Information Providers: GT4 Services
Reliable File Transfer Service (RFT) Service status data, number of active transfers, transfer status, information about the resource running the service Community Authorization Service (CAS) Identifies the VO served by the service instance Replica Location Service (RLS) Note: not a WS Location of replicas on physical storage systems (based on user registrations) for later queries

135 Information Providers: Cluster and Queue Data
Interfaces to Hawkeye, Ganglia, CluMon, Nagios Basic host data (name, ID), processor information, memory size, OS name and version, file system data, processor load data Some condor/cluster specific data This can also be done for sub-clusters, not just at the host level Interfaces to PBS, Torque, LSF Queue information, number of CPUs available and free, job count information, some memory statistics and host info for head node of cluster

136 Higher-Level Services
Index Service Caching registry Trigger Service Warn on error conditions Archive Service Database store for history (in development) All of these have common needs, and are built on a common framework

137 Common Aggregator Framework
Basic framework for higher-level functions Subscribe to Information Provider(s) Do some action Present standard interfaces

138 Aggregator Framework Features
1) Common configuration mechanism Specify what data to get, and from where 2) Self cleaning Services have lifetimes that must be refreshed 3) Soft consistency model Published information is recent, but not guaranteed to be the absolute latest 4) Schema Neutral Valid XML document needed only

139 MDS4 Index Service Index Service is both registry and cache
Datatype and data provider info, like a registry (UDDI) Last value of data, like a cache In memory default approach DB backing store currently being developed to allow for very large indexes Can be set up for a site or set of sites, a specific set of project data, or for user-specific data only Can be a multi-rooted hierarchy No *global* index

140 MDS4 Trigger Service Subscribe to a set of resource properties
Evaluate that data against a set of pre-configured conditions (triggers) When a condition matches, action occurs is sent to pre-defined address Website updated Similar functionality in Hawkeye

141 WebMDS User Interface Web-based interface to WSRF resource property information User-friendly front-end to Index Service Uses standard resource property requests to query resource property data XSLT transforms to format and display them Customized pages are simply done by using HTML form options and creating your own XSLT transforms Sample page:

142

143 Working with TeraGrid Large US project across 9 different sites
Different hardware, queuing systems and lower level monitoring packages Starting to explore MetaScheduling approaches GRMS (Poznan) W. Smith (TACC) K. Yashimoto (SDSC) User Portal Need a common source of data with a standard interface for basic scheduling info

144 Data Collected Provide data at the subcluster level
Sys admin defines a subcluster, we query one node of it to dynamically retrieve relevant data Can also list per-host details Interfaces to Ganglia, Hawkeye, CluMon, and Nagios available now Other cluster monitoring systems can write into a .html file that we then scrape Also collect basic queuing data, some TeraGrid specific attributes

145

146 Scalability Experiments
MDS index Dual 2.4GHz Xeon processors, 3.5 GB RAM Sizes: 1, 10, 25, 50, 100 Clients 20 nodes also dual 2.6 GHz Xeon, 3.5 GB RAM 1, 2, 3, 4, 5, 6, 7, 8, 16, 32, 64, 128, 256, 384, 512, 640, 768, 800 Nodes connected via 1Gb/s network Each data point is average of 8 minutes Ran for 10 mins but first 2 spent getting clients up and running Error bars are SD over 8 mins Experiments by Ioan Raicu, U of Chicago, using DiPerf

147

148

149 Globus Toolkit: Open Source Grid Infrastructure
Globus Toolkit v4 Data Replication Credential Mgmt Replica Location Grid Telecontrol Protocol Delegation Data Access & Integration Community Scheduling Framework WebMDS Python Runtime Community Authorization Reliable File Transfer Workspace Management Trigger C Runtime Authentication Authorization GridFTP Grid Resource Allocation & Management Index Java Runtime Security Data Mgmt Execution Mgmt Info Services Common Runtime

150 The Globus Ecosystem Globus components address core issues relating to resource access, monitoring, discovery, security, data movement, etc. GT4 being the latest version A larger Globus ecosystem of open source and proprietary components provide complementary components A growing list of components These components can be combined to produce solutions to Grid problems We’re building a list of such solutions

151 Many Tools Build on, or Can Contribute to, GT4-Based Grids
Condor-G, DAGman MPICH-G2 GRMS Nimrod-G Ninf-G Open Grid Computing Env. Commodity Grid Toolkit GriPhyN Virtual Data System Virtual Data Toolkit GridXpert Synergy Platform Globus Toolkit VOMS PERMIS GT4IDE Sun Grid Engine PBS scheduler LSF scheduler GridBus TeraGrid CTSS NEES IBM Grid Toolbox

152 Global Community

153 Example Solutions Portal-based User Reg. System (PURSE)
VO Management Registration Service Service Monitoring Service TeraGrid TGCP Tool Lightweight Data Replicator GriPhyN Virtual Data System

154 Documenting The Grid Ecosystem
The Grid Ecosystem: Software Components for Grid Systems And Applications

155 Example Solutions Portal-based User Reg. System (PURSE)
VO Management Registration Service Service Monitoring Service TeraGrid TGCP Tool Lightweight Data Replicator GriPhyN Virtual Data System

156 Condor-G The Condor U Wisconsin Madison develops software for high-throughput computing on collections of distributed compute resources Condor-G is an interface to GRAM created by the Condor team that allows users to submit jobs to GRAM servers

157 GridShib Allows the use of Shibboleth-transported attributes for authorization in GT4 deployments And, more generally, SAML support 2 year project started December 1, 2004 Participants Von Welch, UIUC/NCSA (PI) Kate Keahey, UChicago/Argonne (PI) Frank Siebenlist, Argonne Tom Barton, UChicago Beta software released September 16, 2005 GridShib Objectives: Priority 1: Pull mode operation Globus services contact Shibboleth to obtain attributes about identified user Support both GT4.x Web Services and pre-WS code Priority 2: Push mode operation User obtains Shib attributes and push to service Allows role selection Priority 3: Online CAs Pseudonymous operation Integration with local authentication services (This slide from a presentation by Tom Barton)

158 Handle System The Handle System from CNRI (http://www.handle.net) is a general-purpose global name service enabling secure name resolution over the internet The Handle System-GT Integration Project leverages the Handle System for identifier and resolution services through tight integration with GT4’s Web services protocols

159 MPICH-G2 MPICH-G2, developed at Northern Illinois University and Argonne National Lab, is a grid-enabled implementation of the MPI v1.1 standard MPICH-G2 is implemented using the pre-WS GRAM component in GT4; integration with GT4 WS GRAM is expected in the near future

160 Nimrod/G Nimrod is a specialized parametric modeling system from Monash University Nimrod/G uses a simple declarative parametric modeling language to express parameter sweep experiments. Based on GT4 WS services, Nimrod/G enables the formulation, execution and monitoring of multiple individual parametric experiments

161 Ninf-G4 Ninf-G4, from AIST, is a reference implementation of the GGF standard GridRPC API Ninf-G4 is provides higher-level programming APIs for the development and execution of parallel applications on the Grid

162 PERMIS PERMIS is an EU-funded Privilege Management service that implements Role-Based Access Control Thanks to the work of the UK Grid Engineering Task Force, services running in a Java WS Core container can use PERMIS via GT4’s SAML authorization callouts

163 SRB SRB is a package from SDSC providing a uniform interface for connecting to network-based heterogeneous data resources GT4’s GridFTP includes an interface to SRB data sources, and vice versa

164 Sun Grid Engine Sun Grid Engine is an open source distributed resource management system from Sun Microsystems In a collaboration between the London e-Science Centre, Gridwise and MCNC, the Sun Grid Engine has been integrated with GT4

165 Tells Us About Your Grid Tools & Solutions
We list links to related projects on the “Related Software” of the Globus Toolkit web “Solutions” are documented on the Globus web If we’ve got details wrong or you have a GT4-related tool to list on our website, please send mail to

166 The Globus Commitment to Open Source
Globus was first established as an open source project in 1996 The Globus Toolkit is open source to: allow for inspection for consideration in standardization processes encourage adoption in pursuit of ubiquity and interoperability encourage contributions harness the expertise of the community The Globus Toolkit is distributed under the (BSD-style) Apache License version 2

167 The Future: Structure NSF Community Driven Improvement of Globus Software (CDIGS) project 5 years of funding for GT enhancement GlobDev Globus Development Envionrment

168 Why is Globus Software Open Source?
To allow for inspection For consideration in standardization processes To encourage adoption In pursuit of ubiquity and interoperability To encourage contributions Harness the expertise of the community

169 Open development requires open processes
Open Contribution But distributing code under an open source license does not guarantee open development! Open development requires open processes So we have created dev.globus to facilitate contributions

170 Based on Apache Jakarta
Governance Model Based on Apache Jakarta Individual development efforts organized as projects Consensus-based decision making Control over each project in the hands of its most active and respected contributors (committers) Globus Management Committee (GMC) providing overall guidance and conflict resolution

171 Common Infrastructure
Code repositories (CVS, SVN) Mailing lists *-dev, *-user, *-announce, *-commit for every project Issue tracking (bugzilla) Including roadmap info for future development Wikis Known interactions for people accessing your project

172 Sample

173

174

175

176

177 Technology Projects Common runtime projects Data projects
C Core Utilities, C WS Core, CoG jglobus, Core WS Schema, Java WS Core, XIO Data projects GridFTP, Reliable File Transfer, Replica Location, Data Replication Execution projects GRAM, Dynamic accounts, Virtual workspaces Information services projects MDS4 Security Projects C Security, CAS/SAML Utilities, Delegation Service, MyProxy

178 Non-Technolgy Projects
Distribution Projects Globus Toolkit Distribution Process was used for April release Documentation Projects GT Release Manuals Incubation Projects Incubation management project And any new projects wanting to join

179 Incubator Process in dev.globus
Entry point for new Globus projects Incubator Management Project (IMP) Oversees incubator process form first contact to becoming a Globus project Quarterly reviews of current projects Process being debugged by “Incubator Pioneers”

180 Incubator Process (1 of 3)
Project proposes itself as a Candidate A proposed name for the project; A proposed project chair, with contact info; A list of the proposed committers for the project; An overview of the aims of the project; An overview of any current user base or user community, if applicable; An overview of how the project relates to other parts of Globus; A summary of why the project would enhance and benefit Globus.

181 Incubator Process (2 of 3)
IMP meet, discuss, and accept project as a ProtoProject ProtoProject now part of the Incubator framework Get assigned a Mentor to help Member of IMP Bridge between Globus and new ProtoProject Opportunity to get up to speed on Globus Development process

182 Incubator Process (3 of 3)
Quarterly reviews by IMP determine Stay a ProtoProject Retire Escalate to a full Globus project Escalation when ProtoProject passes checklist Legal Meritocracy Alignment/Synergy Infrastructure

183 Current List of “Incubator Pioneers”
Metrics Project for usage stats and usage metrics for GT4 Lee Liming, University Chicago GridWay MetaScheduler that works over GT2 and GT4 GRAM, MDS Ignacio Lorente, University Madrid GridShib Integration of GSI and Shibboleth security Von Welch, NCSA Send to to discuss new project ideas!

184 You Can Begin Participating in Globus Development Today!
Monitor and comment on Globus development discussions; recent threads include: GT Backward Compatibility 4.2 Wish List for GRAM Submit bug fixes and other contributions Start your own Globus project!

185 GlobDev The current set of Globus components will be organized into several “Globus Projects” Projects release products Each project will have its own group of “Committers” Committers are responsible for governance on matters relating to their products The “Globus Management Committee” will provide overall guidance and conflict resolution approve the creation of new Globus Projects

186 The Future: Content We now have a solid and extremely powerful Web services base Next, we will build an expanded open source Grid infrastructure Virtualization New services for provisioning, data management, security, VO management End-user tools for application development Etc., etc. And of course responding to user requests for other short-term needs

187 Upcoming Tutorials Globus World Possibly a EU Globus week?
Joint with GGF Sept 11-15, DC Possibly a EU Globus week?

188 General Globus Help and Support
Globus-discuss list Each project has specific lists Bugzilla bugzilla.globus.org

189 Opportunities for Collaboration
Use of Globus software Feedback & involvement in design Development of new Globus components Help an existing project New information providers to enable use of GT to manage an entire Grid GRAM interfaces, etc Become an incubator project New applications and tools E.g., Grid operations, emergency response, ecogrid, bioinformatics, …

190 What’s This About a Globus Company?
Univa was announces yesterday (Dec 13, 2004) Steve Tuecke is CEO Both Carl Kesselman and Ian Foster are in advisory role Basic concept: “Redhat Linux for Globus” This will NOT affect the GT open source policy This WILL allow greater industrial involvement and investment in Grids

191 Credits Globus Toolkit v4 is the work of many talented Globus Alliance members, at Argonne Natl. Lab & U.Chicago USC Information Sciences Corporation National Center for Supercomputing Applns U. Edinburgh Swedish PDC Univa Corporation Other contributors at other institutions Supported by DOE, NSF, UK EPSRC, and other sources

192 Some Useful Pointers UK ETF GT4 report GRAM command line man page
UKeS pdf GRAM command line man page execution/wsgram/rn01re01.html GridFTP command line man page MDS4 TG site

193 For More Information Jennifer Schopf Globus Alliance GlobusWORLD 2006
Globus Alliance GlobusWORLD 2006 September 10-14 Washington, D.C. 2nd Edition

194

195 Short-Term Priorities: Security
Improve GSI error reporting & diagnostics Secure password, one-time password, Kerberos support for initial log on Trust roots, use of GridLogon Identity/attribute assertions in GT auth. callouts (e.g., Shib, PERMIS, VOMS, SAML) Extend CAS admin & policy support Security logging with management control for audit purposes

196 Short-Term Priorities: Data Management
Space & bandwidth management in GridFTP Concurrency in globus-url-copy Priorities in RFT Data replication service Enhance policy support in data services Physical file name creation service Scalable & distributed metadata manager

197 Short-Term Priorities: Execution Management
Implement GGF JSDL once finalized Advance reservation support Policy-driven restart of “persistent” jobs Improved information collection for jobs Improved management of job collections Credential refresh Development of workspace service Integration of virtual machines (Xen, VMware) and associated services Windows port of WS GRAM

198 Short-Term Priorities: Information Services
Many more information sources, including gateways to other systems Automated configuration of monitoring Specialized monitoring displays Performance optimization of registry Archiver service Helper tools to streamline integration of new information sources

199 Short-Term Priorities: WS Core
Streamlined container configuration Remote management interface Dynamic service deployment Service isolation: multiple service instances WS-Notification, subscription performance Full functionality in C WS Core Optimized WS-ServiceGroup support WS-SecureConversation support


Download ppt "Grid Computing and the Globus Toolkit"

Similar presentations


Ads by Google