Globus: Where Are We Going For the Next 10 Years? Jennifer M. Schopf Argonne National Laboratory UK National eScience Centre.

1 Globus: Where Are We Going For the Next 10 Years? Jennifer M. Schopf Argonne National Laboratory UK National eScience Centre

2 Globus: Where Are We Going Next? A Pragmatic Viewpoint Jennifer M. Schopf Argonne National Laboratory UK National eScience Centre

3 Grids l You've all just spent the last 4 days learning about hands-on approaches to Grids l By now you should know a little about several different ways to use Grid technology l But we all know what we have isn't enough, and if you talk to current users you'll hear this as well

4 Globus Futures l Users l Virtual Environments l Higher level services u Security – l GridShib l Handle u Scheduling l GridWay l WorkFlow with Chimera/Pegasus u Data l Open everything

5 I'm Going To Talk About High Level Concepts l If you want to know what is planned for each individual component – check out their roadmaps! l – search by project name, and component roadmap


7 Users and Applications l Growing and growing l Not every application is well suited to Grids; those that are u Collaborate u Share data or resources l Today's applications may not be as willing to be cutting edge as the old standbys

8 User Communities l Traditionally, we started with the physicists u Hard core users (heroic users) u Large computational problems u Already had strong national and international collaborations l This is growing and changing as it becomes better understood how the resources can be used to further science and research

9 CeSC (Cambridge) The eScience Centres

10 CeSC (Cambridge) The High Energy Physicists EGEE ATLAS CMS D0 Star QCD Lattice Grid GridPP

11 Bio-Medical Community CeSC (Cambridge) CancerGrid eDiamond myGrid Integrative Biology Mouse Atlas

12 CeSC (Cambridge) e-Science Institute Support Services Grid Operations Support Centre Digital Curation Centre National eScience Centre National Grid Service

13 CeSC (Cambridge) New Application-Focused Centres Arts and Humanities e-Science Support Centre National Centre for e-Social Science National Institute for Environmental e-Science

14 We Need the Users l Too many tools have been built without regard to what a user needs and wants l If you're building tools u Talk to users, early and often l If you're a user u Tell the toolmakers what you like and don't like u Be constructive u Offer to alpha test


16 Virtualization l Vision of the Grid: u Plug in and get the services you need u Just like electricity u Doesn't matter what resource is supplying it, or where it is, just use the juice l Concrete example, a user might ask: u Run my job, finish by lunch u Get a data set that has these attributes u Tell me when that simulation will finish

17 Where are we today? l Run my job, finish by lunch becomes u Run my job on this exact machine u With these data files transferred in using this protocol u I think it will take 2 hours, the queues have been slow lately, so I should make sure I send this off by 9am, or earlier if I want to be safe in having results for 2
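The manual estimate on this slide is backward arithmetic from the deadline. A minimal sketch, assuming lunch is at 12:00 and the slow queue costs about one hour (the slide gives only the 2-hour run estimate and the 9am submission time; the date, the queue wait, and the function name are illustrative):

```python
from datetime import datetime, timedelta

def latest_submit_time(deadline, run_estimate, queue_wait, safety_margin=timedelta(0)):
    """Work backwards from the deadline to the latest safe submission time."""
    return deadline - run_estimate - queue_wait - safety_margin

# Arbitrary date; the noon deadline and one-hour queue wait are assumptions.
lunch = datetime(2000, 1, 1, 12, 0)
submit_by = latest_submit_time(lunch, timedelta(hours=2), timedelta(hours=1))
print(submit_by.strftime("%H:%M"))  # prints 09:00
```

Passing a `safety_margin` corresponds to the "or earlier if I want to be safe" part of the slide.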

18 Where are we today? l Get a data set that has these attributes becomes u Given a set of attributes, give me a set of logical file names u Given those, map them to physical file names u Given physical placements of the file, figure out which one is easiest to access u Copy the file to my machine
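The lookup chain on this slide can be sketched with two in-memory dictionaries standing in for a metadata catalog and a replica catalog. Everything here is hypothetical (the catalog contents, the URLs, the cost numbers, and the function names); it is not the API of any Globus service, only an illustration of the steps a user currently performs by hand:

```python
# Hypothetical stand-ins for a metadata catalog and a replica catalog.
# Every name, URL, and cost below is invented for illustration.
METADATA = {
    "lfn:run-001": {"instrument": "camera-a", "year": 2005},
    "lfn:run-002": {"instrument": "camera-a", "year": 2004},
    "lfn:run-003": {"instrument": "camera-b", "year": 2005},
}
REPLICAS = {
    "lfn:run-001": {"gsiftp://far.example.org/run-001": 40.0,
                    "gsiftp://near.example.org/run-001": 5.0},
    "lfn:run-002": {"gsiftp://far.example.org/run-002": 40.0},
    "lfn:run-003": {"gsiftp://near.example.org/run-003": 5.0},
}

def logical_names(attributes):
    """Step 1: attributes -> logical file names."""
    return sorted(lfn for lfn, meta in METADATA.items()
                  if all(meta.get(key) == value for key, value in attributes.items()))

def cheapest_replica(lfn):
    """Steps 2 and 3: logical name -> physical names -> the easiest one to access."""
    physical = REPLICAS[lfn]          # maps physical URL -> assumed access cost
    return min(physical, key=physical.get)

def plan_copies(attributes):
    """Return {logical name: physical URL to copy from}; the copy itself is step 4."""
    return {lfn: cheapest_replica(lfn) for lfn in logical_names(attributes)}

plan = plan_copies({"instrument": "camera-a"})
for lfn, url in plan.items():
    print(lfn, "<-", url)
```

The final copy is left out on purpose, since it depends on whichever transfer tool is in use.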

19 Where are we today (cont) l General agreement we have basic functionality u Tell me what this set of resources looks like u Run this job on that resource u Transfer this file u Globus (among others) does give these basic building blocks (mostly) l General agreement general functionality isn't enough by far

20 Virtualization is happening l General concept of service-oriented grids is being accepted l Service Level Agreements (SLAs) are coming into place; this is a first step l Higher-level tools to separate user from basic resources are more common l We need to be careful to virtualize what a user is comfortable with u Hide security hassles u Tell users where their job is running in case of failures

21 Virtual Workspace Project l Kate Keahey, ANL l Virtual Workspace u Abstraction of an execution environment u Dynamically available to authorized clients l Abstraction captures u Resource quota for execution environment on deployment (CPU or memory share) u Software configuration aspects of the environment (OS installation or provided services) l Workspace Service allows a Grid client to dynamically deploy and manage workspaces

22 Workspaces l Workspaces can be implemented and deployed in many ways u Boot images u Virtual machines u Simply dynamically provide access to already deployed workspaces by creating Unix accounts on the fly l Our infrastructure focuses on the deployment and management of virtual machines l Describe an environment using the workspace meta-data l Deploy it on a specified resource quota l Environments that can be deployed in this way range from atomic workspaces to clusters and more complex constructs l Incubator Project: Virtual Workspaces
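To make the "describe an environment, then deploy it on a resource quota" idea concrete, here is a minimal sketch. The class names, fields, and the admission check are all assumptions made for illustration; they are not the Workspace Service's actual meta-data schema or interface:

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class ResourceQuota:
    cpu_share: float   # fraction of one host's CPU, 0.0 to 1.0 (assumed unit)
    memory_mb: int

@dataclass(frozen=True)
class Workspace:
    """Hypothetical workspace description: a quota plus software configuration."""
    name: str
    quota: ResourceQuota
    os_image: str
    services: tuple = ()

@dataclass
class Host:
    free_cpu_share: float
    free_memory_mb: int
    deployed: list = field(default_factory=list)

    def deploy(self, workspace):
        """Deploy only if the requested quota fits in what the host has left."""
        quota = workspace.quota
        if quota.cpu_share > self.free_cpu_share or quota.memory_mb > self.free_memory_mb:
            return False
        self.free_cpu_share -= quota.cpu_share
        self.free_memory_mb -= quota.memory_mb
        self.deployed.append(workspace.name)
        return True

host = Host(free_cpu_share=1.0, free_memory_mb=2048)
analysis = Workspace("analysis", ResourceQuota(0.5, 1024), os_image="linux-image")
results = [host.deploy(analysis) for _ in range(3)]
print(results)  # prints [True, True, False]
```

The third request is refused because the first two have used up the host's CPU share and memory.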

23 Dynamic Accounts l Additional work has been done to provide basic services for creating dynamic accounts l Service to allow a Grid client to dynamically assign Unix accounts on a remote resource l Based on PKI credentials and the authorization information they carry l Incubator project: Dynamic Accounts
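A sketch of the idea behind dynamic account assignment: a pool of pre-created Unix accounts is leased to authorized credential holders on request. The `AccountPool` class, the account names, and the distinguished names are hypothetical, and real credential verification is omitted; this is not the Dynamic Accounts service's interface:

```python
class AccountPool:
    """Hypothetical pool that leases Unix accounts to authorized Grid identities."""

    def __init__(self, accounts, authorized_dns):
        self._free = list(accounts)
        self._authorized = set(authorized_dns)
        self._leases = {}   # distinguished name -> leased account

    def assign(self, dn):
        """Return the account leased to this identity, leasing a new one if needed."""
        if dn not in self._authorized:
            raise PermissionError(f"{dn} is not authorized")
        if dn in self._leases:
            return self._leases[dn]
        if not self._free:
            raise RuntimeError("no free accounts in the pool")
        account = self._free.pop(0)
        self._leases[dn] = account
        return account

    def release(self, dn):
        """Return the identity's account to the pool."""
        self._free.append(self._leases.pop(dn))

alice, bob = "/O=Example/CN=Alice", "/O=Example/CN=Bob"
pool = AccountPool(["grid001", "grid002"], {alice, bob})
first = pool.assign(alice)
repeat = pool.assign(alice)   # same identity gets the same account back
second = pool.assign(bob)
print(first, repeat, second)  # prints grid001 grid001 grid002
```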


25 Higher Level Services l Security – u GridShib u Handle l Scheduling u GridWay u WorkFlow with Chimera/Pegasus l Data

26 GridShib l Von Welch, NCSA l Allows the use of Shibboleth-transported attributes for authorization in GT4 deployments u And, more generally, SAML support l 2 year project started December 1, 2004 l Beta software released September 16, 2005 l Now an Incubator Project as well

27 Globus Toolkit Handle System Integration l Frank Siebenlist, ANL l The Handle System u CNRI u General-purpose global name service u Secure name resolution over the internet l The Handle System-GT Integration Project u Leverage the Handle System for identifier and resolution services u Tight integration with GT4's Web services protocols l Incubator project: gt-hs

28 GridWay MetaScheduler l Ignacio M. Llorente, Universidad Complutense Madrid l Automatically perform job scheduling steps l Provide the runtime mechanisms needed for dynamically adapting execution l Runs over pre-WS GRAM, WS-GRAM, MDS2, MDS4 l Basic scheduling, flexible policies l Incubator Project

29 Chimera Virtual Data l Captures both logical and physical steps in a data analysis process. u Transformations (logical) u Derivations (physical) l Builds a catalog. l Results can be used to replay analysis. u Generation of DAG (via Pegasus) u Execution on Grid l Catalog allows introspection of analysis process. [Figure: galaxy cluster size distribution, Sloan Survey data]

30 Pegasus Workflow Transformation Converts Abstract Workflow (AW) into Concrete Workflow (CW) u Uses Metadata to convert user request to logical data sources u Obtains AW from Chimera u Uses replication data to locate physical files u Delivers CW to DAGman u Executes using Condor u Publishes new replication and derivation data in RLS and Chimera (optional) [Diagram: Chimera Virtual Data Catalog, Replica Location Service, Metadata Catalog, Storage System, Compute Server, DAGman, Condor]
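The abstract-to-concrete step can be illustrated in miniature: order the jobs by their dependencies and replace each logical input with a physical location. This is a toy sketch using Python's standard `graphlib` (Python 3.9+); the workflow, the replica map, and the `concretize` function are invented for illustration and do not reflect Pegasus's actual data formats or planner:

```python
from graphlib import TopologicalSorter

# Hypothetical abstract workflow: jobs, their dependencies, and logical inputs.
ABSTRACT_WORKFLOW = {
    "extract":   {"needs": [],          "inputs": ["lfn:raw-images"]},
    "measure":   {"needs": ["extract"], "inputs": []},
    "histogram": {"needs": ["measure"], "inputs": []},
}
# Hypothetical replica data: logical name -> physical location.
REPLICA_MAP = {"lfn:raw-images": "gsiftp://storage.example.org/raw-images"}

def concretize(workflow, replicas):
    """Return [(job, [physical inputs])] in an order that respects dependencies."""
    graph = {job: spec["needs"] for job, spec in workflow.items()}
    concrete = []
    for job in TopologicalSorter(graph).static_order():
        inputs = workflow[job]["inputs"]
        missing = [lfn for lfn in inputs if lfn not in replicas]
        if missing:
            raise LookupError(f"no replica known for {missing}")
        concrete.append((job, [replicas[lfn] for lfn in inputs]))
    return concrete

concrete_workflow = concretize(ABSTRACT_WORKFLOW, REPLICA_MAP)
for job, physical_inputs in concrete_workflow:
    print(job, physical_inputs)
```

In the real pipeline the resulting plan is handed to DAGman and executed with Condor; here it is only printed.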

31 Data Services Being Developed l Scheduling GridFTP u Allowing GridFTP to control how many requests come to a server, and how they're handled l Data Replica Management u One tool to locate a replica and transfer it u Tools to verify that replicas are the same
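One common way to verify that replicas are the same is to compare content checksums. The slide does not say which method the planned tools use, so the SHA-256 comparison below is an assumption, and the function names are illustrative; it works on local files only:

```python
import hashlib
import tempfile
from pathlib import Path

def file_digest(path, chunk_size=1 << 20):
    """Hash a file in chunks so large replicas need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def replicas_match(paths):
    """True when every listed file has the same SHA-256 digest."""
    return len({file_digest(path) for path in paths}) == 1

# Small demonstration with throwaway files.
with tempfile.TemporaryDirectory() as tmp:
    a, b, c = (Path(tmp) / name for name in ("a.dat", "b.dat", "c.dat"))
    a.write_bytes(b"payload")
    b.write_bytes(b"payload")
    c.write_bytes(b"payl0ad")   # one byte differs
    identical = replicas_match([a, b])
    different = replicas_match([a, b, c])

print(identical, different)  # prints True False
```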

32 Globus Futures l Users l Virtual Environments l Higher level services u Security u Scheduling u Data l Open everything u Open Source u Open Contribution

33 Why is Globus Software Open Source? l To allow for inspection u For consideration in standardization processes l To encourage adoption u In pursuit of ubiquity and interoperability l To encourage contributions u Harness the expertise of the community

34 Open Contribution l But distributing code under an open source license does not guarantee open development! l Open development requires open processes l So we have created dev.globus to facilitate contributions u

35 Governance Model l Based on Apache Jakarta u Individual development efforts organized as projects u Consensus-based decision making l Control over each project in the hands of its most active and respected contributors (committers) l Globus Management Committee (GMC) providing overall guidance and conflict resolution

36 Common Infrastructure l Code repositories (CVS, SVN) l Mailing lists u *-dev, *-user, *-announce, *-commit for every project l Issue tracking (bugzilla) u Including roadmap info for future development l Wikis l Known interactions for people accessing your project

37 Incubator l Process in which an outside project becomes part of Globus u Overseen by the Incubator Management Project (IMP) l Outside project proposes itself as a candidate u IMP meet and discuss, and accept project as a ProtoProject l ProtoProject now part of the Incubator framework u Get assigned a Mentor to help u Opportunity to get up to speed on Globus Development process u Quarterly reviews

38 Current Incubator Projects l Incubator Management l Dynamic Accounts l GridShib l GridWay l gt-hs l Metrics l OGCE l Virtual Data System l Virtual Workspaces

39 Futures Summary l Users l Virtual Environments l Higher level services l Open everything

40 For More Information Jennifer M. Schopf l l l Support from NeSC/JISC, NSF, DOE This talk (not there yet)


42 Short-Term Priorities: Security l Improve GSI error reporting & diagnostics l Secure password, one-time password, Kerberos support for initial log on l Trust roots, use of GridLogon l Identity/attribute assertions in GT auth. callouts (e.g., Shib, PERMIS, VOMS, SAML) l Extend CAS admin & policy support l Security logging with management control for audit purposes

43 Short-Term Priorities: Data Management l Space & bandwidth management in GridFTP l Concurrency in globus-url-copy l Priorities in RFT l Data replication service l Enhance policy support in data services l Physical file name creation service l Scalable & distributed metadata manager

44 Short-Term Priorities: Execution Management l Implement GGF JSDL once finalized l Advance reservation support l Policy-driven restart of persistent jobs l Improved information collection for jobs l Improved management of job collections l Credential refresh l Development of workspace service l Integration of virtual machines (Xen, VMware) and associated services l Windows port of WS GRAM

45 Short-Term Priorities: Information Services l Many more information sources, including gateways to other systems l Automated configuration of monitoring l Specialized monitoring displays l Performance optimization of registry l Archiver service l Helper tools to streamline integration of new information sources
