Presentation is loading. Please wait.

Presentation is loading. Please wait.

Grids and the Globus Community Dr. Jennifer M. Schopf Argonne National Lab

Similar presentations


Presentation on theme: "Grids and the Globus Community Dr. Jennifer M. Schopf Argonne National Lab"— Presentation transcript:

1 Grids and the Globus Community Dr. Jennifer M. Schopf Argonne National Lab http://www.mcs.anl.gov/~jms/Talks/

2 2 What is a Grid? l Resource sharing –Computers, storage, sensors, networks, … –Sharing always conditional: issues of trust, policy, negotiation, payment, … l Coordinated problem solving –Beyond client-server: distributed data analysis, computation, collaboration, … l Dynamic, multi-institutional virtual orgs –Community overlays on classic org structures –Large or small, static or dynamic

3 3 Why Is this Hard or Different? l Lack of central control –Where things run –When they run l Shared resources –Contention, variability l Communication and coordination –Different sites implies different sys admins, users, institutional goals, and often socio- political constraints

4 4 So Why Do It? l Computations that need to be done with a time limit l Data that cant fit on one site l Data owned by multiple sites l Applications that need to be run bigger, faster, more

5 5 What Kinds of Applications? l Computation intensive –Interactive simulation (climate modeling) –Large-scale simulation and analysis (galaxy formation, gravity waves, event simulation) –Engineering (parameter studies, linked models) l Data intensive –Experimental data analysis (e.g., physics) –Image & sensor analysis (astronomy, climate) l Distributed collaboration –Online instrumentation (microscopes, x-ray) Remote visualization (climate studies, biology) –Engineering (large-scale structural testing)

6 6 Grid Infrastructure l Distributed management –Of physical resources –Of software services –Of communities and their policies l Unified treatment –Build on Web services framework –Use WS-RF, WS-Notification (or WS- Transfer/Man) to represent/access state –Common management abstractions & interfaces

7 7 Globus is… l A collection of solutions to problems that come up frequently when building collaborative distributed applications l Software for Grid infrastructure –Service enable new & existing resources –Uniform abstractions & mechanisms l Tools to build applications that exploit Grid infrastructure –Registries, security, data management, … l Open source & open standards –Each empowers the other l Enabler of a rich tool & service ecosystem

8 8 Globus is an Hour Glass l Local sites have their own policies, installs – heterogeneity! –Queuing systems, monitors, network protocols, etc l Globus unifies – standards! –Build on Web services –Use WS-RF, WS-Notification to represent/access state –Common management abstractions & interfaces Local heterogeneity Higher-Level Services and Users Standard Interfaces

9 9 Globus is a Building Block l Basic components for Grid functionality –Not turnkey solutions, but building blocks & tools for application developers & system integrators l Highest-level services are often application specific, we let aps concentrate there l Easier to reuse than to reinvent –Compatibility with other Grid systems comes for free l We provide basic infrastructure to get you one step closer

10 10 dev.globus l Governance model based on Apache Jakarta –Consensus based decision making l Globus software is organized as several dozen Globus Projects –Each project has its own Committers responsible for their products –Cross-project coordination through shared interactions and committers meetings l A Globus Management Committee –Overall guidance and conflict resolution

11 11 http://dev.globus.org Guidelines (Apache Jakarta) Infrastructure (CVS, email, bugzilla, Wiki) Projects Include …

12 12 Incubator Projects Globus Software: dev.globus.org Security Execution Mgmt Info Services Common Runtime Globus Projects Other MPICH G2 GridWay Data Mgmt Incubation Mgmt Cog WF LRMA GAARDS OGROGDTEUGP HOC-SAPURSE GridShib Introduce Dyn Acct WEEP Gavia JSC Gavia MS DDM Virt WkSp SGGC Metrics ServMark GridFTP Reliable File Transfer OGSA-DAI GRAM MDS4 CAS Data Rep Delegation Replica Location Java Runtime C Runtime Python Runtime GT4 C SecGT4 Docs MEDICUS GSI- OpenSSH MyProxy SwiftMonMan NetLogger GEMLCA

13 13 Globus Technology Areas l Core runtime –Infrastructure for building new services l Security –Apply uniform policy across distinct systems l Execution management –Provision, deploy, & manage services l Data management –Discover, transfer, & access large data l Monitoring –Discover & monitor dynamic services

14 14 Web Service Basics l Web Services are basic distributed computing technology that let us construct client-server interactions Borja Sotomayor, http://gdp.globus.org/gt4-tutorial/multiplehtml/ch01s02.html

15 15 Core Runtime Provides Web Service Basics l Web services are platform independent and language independent –Client and server program can be written in diff langs, run in diff envts and still interact l Web services describe themselves –Once located you can ask it how to use it l Web service is *not* a website –Web service is accessed by sw, not humans l Web services are ideal for loosely coupled systems –Unlike CORBA, EJB, etc.

16 16 WSDL: Web Services Description Language Define expected messages for a service, and their (input or output parameters) An interface groups together a number of messages (operations) Bind an Interface via a definition to a specific transport (e.g. HTTP) and messaging (e.g. SOAP) protocol The network location where the service is implemented, e.g. http://localhost:8080

17 17 Real Web Service Invocation Borja Sotomayor, http://gdp.globus.org/gt4-tutorial/multiplehtml/ch01s02.html Discover Describe Invoke

18 18 Web Services Server Applications l Web service – software that exposes a set of operations l SOAP Engine – handle SOAP requests and responses (Apache Axis) l Application Server – provides living space for applications that must be accessed by different clients (Tomcat) l HTTP server- also called a Web server, handles http messages Borja Sotomayor, http://gdp.globus.org/gt4-tutorial/multiplehtml/ch01s02.html Container

19 19 Lets talk about state l Plain Web services are stateless Borja Sotomayor, http://gdp.globus.org/gt4-tutorial/multiplehtml/ch01s03.html

20 20 However, Many Grid Applications Require State Borja Sotomayor, http://gdp.globus.org/gt4-tutorial/multiplehtml/ch01s03.html

21 21 Keep the Web Service and the State Separate l Instead of putting state in a Web service, we keep it in a resource l Each resource has a unique key Borja Sotomayor, http://gdp.globus.org/gt4-tutorial/multiplehtml/ch01s03.html

22 22 Web Service + Resource = WS-Resource Address of a WS-resource is called an end- point reference Resources Can Be Anything Stored

23 23 Need For Standards l Web services are self describing using WSDL l But wed really like is a common way to –Name and do bindings –Start and end services –Query, subscription, and notification –Share error messages

24 24 Standard Interfaces l Service information l State representation –Resource –Resource Property l State identification –Endpoint Reference l State Interfaces –GetRP, QueryRPs, GetMultipleRPs, SetRP l Lifetime Interfaces –SetTerminationTime –ImmediateDestruction l Notification Interfaces –Subscribe, Notify l ServiceGroups Web Service GetRP GetMultRPs SetRP QueryRPs Subscribe SetTerm Time Destroy Client

25 25 WSRF & WS-Notification l Naming and bindings (basis for virtualization) –Every resource can be uniquely referenced, and has one or more associated services for interacting with it l Lifecycle (basis for fault resilient state management) –Resources created by services following factory pattern –Resources destroyed immediately or scheduled l Information model (basis for monitoring & discovery) –Resource properties associated with resources –Operations for querying and setting this info –Asynchronous notification of changes to properties l Service Groups (basis for registries & collective svcs) –Group membership rules & membership management l Base Fault type

26 26 WSRF vs XML/SOAP l The definition of WSRF means that the Grid and Web services communities can move forward on a common base l Why Not Just Use XML/SOAP? –WSRF and WS-N are just XML and SOAP –WSRF and WS-N are just Web services l Benefits of following the specs: –These patterns represent best practices that have been learned in many Grid applications –There is a community behind them –Why reinvent the wheel? –Standards facilitate interoperability

27 27 Leveraging Existing and Proposed Standards l WSRF and WS-N (GGF, OASIS) l WS-Agreement, WSDL 2.0, WSDM l GridFTP v1.0 (GGF) l OGSI v1.0 (GGF) l SSL/TLS v1 (from OpenSSL) (IETF) l X.509 Proxy Certificates (IETF) l SAML, XACML

28 28 Globus and Web Services WSDL, SOAP, WS-Security WS-A, WSRF, WS-Notification Globus WSRF Web Services Registry and Admin Globus Container (e.g., Apache Axis) User Applications Globus Core: Java, C (fast, small footprint), Python

29 29 Globus and Web Services WSDL, SOAP, WS-Security WS-A, WSRF, WS-Notification Custom WSRF Services Globus WSRF Web Services Registry and Admin Globus Container (e.g., Apache Axis) User Applications Globus Core: Java, C (fast, small footprint), Python

30 30 Globus and Web Services WSDL, SOAP, WS-Security Custom Web Services WS-A, WSRF, WS-Notification Custom WSRF Services Globus WSRF Web Services Registry and Admin Globus Container (e.g., Apache Axis) User Applications Globus Core: Java, C (fast, small footprint), Python

31 31 Globus Technology Areas l Core runtime –Infrastructure for building new services l Security –Apply uniform policy across distinct systems l Execution management –Provision, deploy, & manage services l Data management –Discover, transfer, & access large data l Monitoring –Discover & monitor dynamic services

32 32 Grid Security Concerns l Control access to shared services –Address autonomous management, e.g., different policy in different work groups l Support multi-user collaborations –Federate through mutually trusted services –Local policy authorities rule l Allow users and application communities to set up dynamic trust domains –Personal/VO collection of resources working together based on trust of user/VO

33 33 Globus Security Tools l Basic Grid Security Mechanisms l Certificate Generation Tools l Certificate Management Tools –Getting users registered to use a Grid –Getting Grid credentials to wherever theyre needed in the system l Authorization/Access Control Tools –Storing and providing access to system- wide authorization information

34 34 Globuss Use of Security Standards Supported, Supported, Fastest, but slow but insecure so default

35 35 Execution Management: GRAM l GRAM: Grid Resource Allocation Manager l A uniform service interface for remote job submission and control –Unix, Condor, LSF, PBS, SGE, … l More generally: interface for process execution management –Lay down execution environment –Stage data –Monitor & manage lifecycle –Kill it, clean up

36 36 GRAM4 (aka WS GRAM) l 2nd-generation WS implementation optimized for performance, flexibility, stability, scalability l Streamlined critical path –Use only what you need l Flexible credential management –Credential cache & delegation service l GridFTP & RFT used for data operations –Data staging & streaming output –Eliminates redundant GASS code l GRAM is not a scheduler. –Used as a front-end to schedulers,

37 37 GridWay Meta-Scheduler l Scheduler virtualization layer on top of Globus services –A LRM-like environment for submitting, monitoring, and controlling jobs –A way to submit jobs to the Grid, without having to worry about the details of exactly which local resource will run the job –A policy-driven job scheduler, implementing a variety of access and Grid-aware load balancing policies –Accounting GridWay: http://www.gridway.org

38 38 Application-Infrastructure decoupling PBS GridWay SGE $> CLI Results.C,.java DRMAA.C,.java Infrastructure Grid Middleware Applications Globus Grid Meta- Scheduler standard API (OGF DRMAA) Command Line Interface open source job execution management resource brokering Globus services Standard interfaces end-to-end (e.g. TCP/IP) highly dynamic & heterogeneous high fault rate GridWay: http://www.gridway.org

39 39 GT4 Data Management l Stage/move large data to/from nodes –GridFTP, Reliable File Transfer (RFT) –Alone, and integrated with GRAM l Locate data of interest –Replica Location Service (RLS) l Replicate data for performance/reliability –Distributed Replication Service (DRS) l Provide access to diverse data sources –File systems, parallel file systems, hierarchical storage: GridFTP –Databases: OGSA DAI

40 40 GridFTP in GT4 l A high-performance, secure, reliable data transfer protocol optimized for high-bw wide-area networks l GSI support for security l 3 rd party and partial file transfer support l IPv6 Support l XIO for different transports l Parallelism and striping multi-Gb/sec wide area transport Disk-to-disk on TeraGrid

41 41 Reliable File Transfer RFT Service RFT Client SOAP Messages Notifications (Optional) Data Channel Protocol Interpreter Master DSI Data Channel Slave DSI IPC Receiver IPC Link Master DSI Protocol Interpreter Data Channel IPC Receiver Slave DSI Data Channel IPC Link GridFTP Server l Fire-and-forget transfer l Web services interface l Many files & directories l Integrated failure recovery l Has transferred 900K files

42 42 Replica Location Service l Identify location of files via logical to physical name map l Distributed indexing of names, fault tolerant update protocols l New WS-RF version available l Managing ~40 million files across ~10 sites Index Local DB Update send (secs) Bloom filter (secs) Bloom filter (bits) 10K<121 M 22410 M 5 M717550 M

43 43 OGSA-DAI l Grid Interfaces to Databases –Data access >Relational & XML Databases, semi-structured files –Data integration >Multiple data delivery mechanisms, data translation l Extensible & Efficient framework –Request documents contain multiple tasks >A task = execution of an activity >Group work to enable efficient operation –Extensible set of activities >> 30 predefined, framework for writing your own –Moves computation to data –Pipelined and streaming evaluation –Concurrent task evaluation

44 44 Monitoring and Discovery System (MDS4) l Grid-level monitoring system –Aid user/agent to identify host(s) on which to run an application –Warn on errors l Uses standard interfaces to provide publishing of data, discovery, and data access, including subscription/notification –WS-ResourceProperties, WS- BaseNotification, WS-ServiceGroup l Functions as an hourglass to provide a common interface to lower-level monitoring tools

45 45 Standard Schemas (GLUE schema, eg) Information Users : Schedulers, Portals, Warning Systems, etc. Cluster monitors (Ganglia, Hawkeye, Clumon, Nagios) Services (GRAM, RFT, RLS) Queuing systems (PBS, LSF, Torque) WS standard interfaces for subscription, registration, notification

46 46 Globus Technology Areas l Core runtime –Infrastructure for building new services l Security –Apply uniform policy across distinct systems l Execution management –Provision, deploy, & manage services l Data management –Discover, transfer, & access large data l Monitoring –Discover & monitor dynamic services

47 47 Non-Technology Projects l Incubation Projects –Incubation management project –And any new projects wanting to join l Distribution Projects –Globus Toolkit Distribution l Documentation Projects –GT Release Manuals

48 48 Incubator Projects Globus Software: dev.globus.org Security Execution Mgmt Info Services Common Runtime Globus Projects Other MPICH G2 GridWay Data Mgmt Incubation Mgmt Cog WF LRMA GAARDS OGROGDTEUGP HOC-SAPURSE GridShib Introduce Dyn Acct WEEP Gavia JSC Gavia MS DDM Virt WkSp SGGC Metrics ServMark GridFTP Reliable File Transfer OGSA-DAI GRAM MDS4 CAS Data Rep Delegation Replica Location Java Runtime C Runtime Python Runtime GT4 C SecGT4 Docs MEDICUS GSI- OpenSSH MyProxy SwiftMonMan NetLogger GEMLCA

49 49 Incubator Process in dev.globus l Entry point for new Globus projects l Incubator Management Project (IMP) –Oversees incubator process form first contact to becoming a Globus project –Quarterly reviews of current projects http://dev.globus.org/wiki/Incubator/ Incubator_Process

50 50 24 Active Incubator Projects l CoG Workflow l Distributed Data Management (DDM) l Dynamic Accounts l Grid Authentication and Authorization with Reliably Distributed Services (GAARDS) l Gavia-Meta Scheduler l Gavia- Job Submission Client l Grid Development Tools for Eclipse (GDTE) l Grid Execution Mgmt. for Legacy Code Apps. (GEMLCA) l Open GRid OCSP (Online Certificate Status Protocol) l Portal-based User Registration Service (PURSe) l ServMark l SJTU GridFTP GUI Client (SGGC) l Swift l UCLA Grid Portal Software (UGP) l Workflow Enactment Engine Project (WEEP) l Virtual Workspaces l GridShib l Higher Order Component Service Architecture (HOC- SA) l Introduce l Local Resource Manager Adaptors (LRMA) l MEDICUS (Medical Imaging and Computing for Unified Information Sharing) l Metrics l MonMan l NetLogger

51 51 Active Committers from 28 Institutions l Aachen Univ. (Germany) l Argonne National Laboratory l CANARIE (Canada) l CertiVeR l Childrens Hospital Los Angeles l Delft Univ. (The Netherlands) l Indiana Univ. l Kungl. Tekniska Högskolan (Sweden) l Lawrence Berkeley National Lab l Univ. of Marburg (Germany) l Univ. of Muenster (Germany) l Univ. Politecnica de Catalunya (Spain) l Univ. of Rochester l USC Information Sciences Institute l Univ. of Victoria (Canada) l Univ. of Vienna (Austria) l Univ. of Westminster (UK) l Univa Corp. l Leibniz Supercomputing Center (Germany) l NCSA l National Research Council of Canada l Ohio State Univ. l Semantic Bits l Shanghai Jiao Tong University (China) l Univ. of British Columbia (Canada) l UCLA l Univ. of Chicago l Univ. of Delaware

52 52 Incubator Projects Globus Software: dev.globus.org Security Execution Mgmt Info Services Common Runtime Globus Projects Other MPICH G2 GridWay Data Mgmt Incubation Mgmt Cog WF LRMA GAARDS OGROGDTEUGP HOC-SAPURSE GridShib Introduce Dyn Acct WEEP Gavia JSC Gavia MS DDM Virt WkSp SGGC Metrics ServMark GridFTP Reliable File Transfer OGSA-DAI GRAM MDS4 CAS Data Rep Delegation Replica Location Java Runtime C Runtime Python Runtime GT4 C SecGT4 Docs MEDICUS GSI- OpenSSH MyProxy SwiftMonMan NetLogger GEMLCA

53 53 Incubator Projects Globus Software: dev.globus.org Security Execution Mgmt Info Services Common Runtime Globus Projects Other MPICH G2 GridWay Data Mgmt Incubation Mgmt Cog WF LRMA GAARDS OGROGDTEUGP HOC-SAPURSE GridShib Introduce Dyn Acct WEEP Gavia JSC Gavia MS DDM Virt WkSp SGGC Metrics ServMark GridFTP Reliable File Transfer OGSA-DAI GRAM MDS4 CAS Data Rep Delegation Replica Location Java Runtime C Runtime Python Runtime GT4 C SecGT4 Docs MEDICUS GSI- OpenSSH MyProxy SwiftMonMan NetLogger GEMLCA

54 54 GT4 Distribution l Usability, reliability –All components meet a quality standard –Testing, logging, coding standards –Documentation at acceptable quality level –Guarantee that interfaces wont change within a major version (4.0.1 == 4.0.any) l Consistency with latest standards (WS-*, WSRF, WS-N, etc.) and Apache platform –WS-I Basic Profile compliant –WS-I Basic Security Profile compliant

55 55 Versioning and Support l Versioning –Evens are production (4.0.x, 4.2.x), –Odds are development (4.1.x) l We support this version and the one previous –Currently stable version 4.0.5 –We support 3.2.x and 4.0.x –Weve also got the 4.1.3 dev release available

56 56 Several Next Versions l 4.0.6 – stable release –100% same interfaces –When we get enough bug fixes l 4.1.4 – development release(s) –New functionality –Expected every 6-8 weeks (mid September) l 4.2.0 - stable release –Tested, documented 4.1.x branch –Likely Q1 2008 –Discussed on gt-dev@globus.org l 5.0 – substantial code base change –With any luck, not for years :)

57 57 Tested Platforms l Debian l Fedora Core l FreeBSD l HP/UX l IBM AIX l Red Hat l Sun Solaris l SGI Altix (IA64 running Red Hat) l SuSE Linux l Tru64 Unix l Apple MacOS X (no binaries) l Windows – Java components only List of binaries and known platform-specific install bugs at http://www.globus.org/toolkit/docs/4.0/admin/ docbook/ ch03.html

58 58 Globus User Community l Large & diverse –10s of national Grids, 100s of applications, 1000s of users; probably much more –Every continent except Antarctica –Applications ranging across many fields –Dozens (at least) of commercial deployments l Successful –Many production systems doing real work –Many applications producing real results –Hundreds of papers published because of grid deployments l Smart, energetic, demanding –Constant stream of new use cases & tools

59 59 Global Community

60 60 Examples of Production Scientific Grids l APAC (Australia) l China Grid l China National Grid l DGrid (Germany) l EGEE l NAREGI (Japan) l Open Science Grid l Taiwan Grid l TeraGrid l ThaiGrid l UK Natl Grid Service

61 61 How Can You Contribute? Create a New Project l Do you have a project youd like to contribute? l Does your software solve a problem you think the Globus community would be interested in? l Contact incubator-committers@globus.orgincubator-committers@globus.org

62 62 How Can You Contribute? Help an Existing Project l Contribute code, documentation, design ideas, and feature requests l Joining the mailing lists –*-dev, *-user, *-commit for each project –See the project wiki page at dev.globus.org l Chime in at any time l Regular contributors can become committers, with a role in defining project directions http://dev.globus.org/wiki/How_to_contribute

63 63 Globus Next Steps l Expanded open source Grid infrastructure –Updates for current standards –New services for data management, security, VO management, troubleshooting –End-user tools for application development –Virtualization l Some infrastructure work –Outside projects joining Globus –Expanded outreach: outreach@globus.org l And of course responding to user requests for other short-term needs

64 64 For More Information l Jennifer Schopf –jms@mcs.anl.govjms@mcs.anl.gov –http://www.mcs.anl.gov/~jms l Globus Alliance –http://www.globus.org l Dev.globus –http://dev.globus.org l Upcoming Events –http://dev.globus.org/wiki/Outreach


Download ppt "Grids and the Globus Community Dr. Jennifer M. Schopf Argonne National Lab"

Similar presentations


Ads by Google