Ian Foster Computation Institute Argonne National Lab & University of Chicago Services for Science.

Slides:



Advertisements
Similar presentations
LEAD Portal: a TeraGrid Gateway and Application Service Architecture Marcus Christie and Suresh Marru Indiana University LEAD Project (
Advertisements

Open Grid Forum 19 January 31, 2007 Chapel Hill, NC Stephen Langella Ohio State University Grid Authentication and Authorization with.
CVRG Presenter Disclosure Information Tahsin Kurc, PhD Center for Comprehensive Informatics Emory University CardioVascular Research Grid Core Infrastructure.
This product includes material developed by the Globus Project ( Introduction to Grid Services and GT3.
Service Oriented Grid Architecture Hui Li ICT in Business Colloquium, LIACS Mar 1 st, 2006 Note: Part of this presentation is based on Dr. Ian Foster’s.
Microsoft Research Faculty Summit Ian Foster Computation Institute University of Chicago & Argonne National Laboratory.
Ian Foster Computation Institute Argonne National Lab & University of Chicago Education in the Science 2.0 Era.
Ian Foster Computation Institute Argonne National Lab & University of Chicago Cyberinfrastructure and the Role of Grid Computing Or, “Science 2.0”
The Global Storage Grid Or, Managing Data for “Science 2.0” Ian Foster Computation Institute Argonne National Lab & University of Chicago.
Building Scientific Workflows with Taverna and BPEL: a Comparative Study in caGrid Wei Tan 1, Paolo Missier 2, Ravi Madduri 1, Ian Foster 1 1 University.
CaGrid Service Metadata Scott Oster - Ohio State
Milos Kobliha Alejandro Cimadevilla Luis de Alba Parallel Computing Seminar GROUP 12.
Ian Foster Computation Institute Argonne National Lab & University of Chicago Service-Oriented Science: Scaling eScience Impact Or, “Science 2.0”
Web-based Portal for Discovery, Retrieval and Visualization of Earth Science Datasets in Grid Environment Zhenping (Jane) Liu.
Globus 4 Guy Warner NeSC Training.
Kate Keahey Argonne National Laboratory University of Chicago Globus Toolkit® 4: from common Grid protocols to virtualization.
Scientific Workflows on the Grid. Goals Enhance scientific productivity through: Discovery and application of datasets and programs at petabyte scale.
Using Globus to Scale an Application Case Study 4: Scientific Workflow for Computational Economics Tiberiu Stef-Praun, Gabriel Madeira, Ian Foster, Robert.
State of Service Oriented Science Tools Open Source Grid Cluster Conference Oakland.
TeraGrid Information Services John-Paul “JP” Navarro TeraGrid Grid Infrastructure Group “GIG” Area Co-Director for Software Integration and Information.
Cancer Bioinformatics Grid (caBIG) CANS 2006 Chicago, Illinois Shannon Hastings Department of Biomedical Informatics Ohio State University.
A Swift Talk about Globus Technology: What Can It Do for Me? OOI Cyberinfrastructure Design Meeting, San Diego, October The Globus Team (presented.
Digital Object Architecture
Globus Data Replication Services Ann Chervenak, Robert Schuler USC Information Sciences Institute.
Department of Biomedical Informatics Service Oriented Bioscience Cluster at OSC Umit V. Catalyurek Associate Professor Dept. of Biomedical Informatics.
Swift: a scientist’s gateway to campus clusters, grids and supercomputers Swift project: Presenter contact:
Swift: implicitly parallel dataflow scripting for clusters, clouds, and multicore systems Michael Wilde -
Ian Foster Computation Institute Argonne National Lab & University of Chicago Globus and Service Oriented Architecture.
GT Components. Globus Toolkit A “toolkit” of services and packages for creating the basic grid computing infrastructure Higher level tools added to this.
Swift Fast, Reliable, Loosely Coupled Parallel Computation Ben Clifford, Tibi Stef-Praun, Mike Wilde Computation Institute University.
The GriPhyN Virtual Data System Ian Foster for the VDS team.
Ian Foster Computation Institute Argonne National Lab & University of Chicago From the heroic to the logistical Non-traditional applications for parallel.
Using the Open Metadata Registry (openMDR) to create Data Sharing Interfaces October 14 th, 2010 David Ervin & Rakesh Dhaval, Center for IT Innovations.
Grid Resource Allocation and Management (GRAM) Execution management Execution management –Deployment, scheduling and monitoring Community Scheduler Framework.
GRAM5 - A sustainable, scalable, reliable GRAM service Stuart Martin - UC/ANL.
IPlant Collaborative Tools and Services Workshop iPlant Collaborative Tools and Services Workshop Collaborating with iPlant.
OOI CI LCA REVIEW August 2010 Ocean Observatories Initiative OOI Cyberinfrastructure Architecture Overview Michael Meisinger Life Cycle Architecture Review.
Middleware Support for Virtual Organizations Internet 2 Fall 2006 Member Meeting Chicago, Illinois Stephen Langella Department of.
Grid Services Overview & Introduction Ian Foster Argonne National Laboratory University of Chicago Univa Corporation OOSTech, Baltimore, October 26, 2005.
The Future of the iPlant Cyberinfrastructure: Coming Attractions.
CYBERINFRASTRUCTURE FOR THE GEOSCIENCES Data Replication Service Sandeep Chandra GEON Systems Group San Diego Supercomputer Center.
Shannon Hastings Multiscale Computing Laboratory Department of Biomedical Informatics.
Introduce Grid Service Authoring Toolkit Shannon Hastings, Scott Oster, Stephen Langella, David Ervin Ohio State University Software Research Institute.
Communicating Security Assertions over the GridFTP Control Channel Rajkumar Kettimuthu 1,2, Liu Wantao 3,4, Frank Siebenlist 1,2 and Ian Foster 1,2,3 1.
Data and storage services on the NGS Mike Mineter Training Outreach and Education
NA-MIC National Alliance for Medical Image Computing UCSD: Engineering Core 2 Portal and Grid Infrastructure.
Grid Services I - Concepts
Ian Foster Computation Institute Argonne National Lab & University of Chicago Cyberinfrastructure and the Role of Grid Computing Or, “Science 2.0”
Wide Area Data Replication for Scientific Collaborations Ann Chervenak, Robert Schuler, Carl Kesselman USC Information Sciences Institute Scott Koranda.
CaGrid Overview and Core Services caGrid Knowledge Center February 2011.
Cyberinfrastructure What is it? Russ Hobby Internet2 Joint Techs, 18 July 2007.
GRID Overview Internet2 Member Meeting Spring 2003 Sandra Redman Information Technology and Systems Center and Information Technology Research Center National.
Ian Foster Computation Institute Argonne National Lab & University of Chicago Scaling eScience Impact.
US LHC OSG Technology Roadmap May 4-5th, 2005 Welcome. Thank you to Deirdre for the arrangements.
Cole David Ronnie Julio. Introduction Globus is A community of users and developers who collaborate on the use and development of open source software,
1 Service Creation, Advertisement and Discovery Including caCORE SDK and ISO21090 William Stephens Operations Manager caGrid Knowledge Center February.
GraDS MacroGrid Carl Kesselman USC/Information Sciences Institute.
Ian Foster Computation Institute Argonne National Lab & University of Chicago Grid Enabling Open Science.
Grid Rapid Application Virtualization Interface (gRAVI) - Service Oriented Science Ravi K Madduri, Argonne National Laboratory/ University of Chicago Joshua.
Globus.org/genomics Globus Galaxies Science Gateways as a Service Ravi K Madduri, University of Chicago and Argonne National Laboratory
CEDPS Services Area Update CEDPS Face-to-Face Meeting ANL October 2007.
Ian Foster Computation Institute Argonne National Lab & University of Chicago Application Hosting Services — Enabling Science 2.0 —
International Planetary Data Alliance Registry Project Update September 16, 2011.
Ian Foster Computation Institute Argonne National Lab & University of Chicago Towards an Open Analytics Environment (A “Data Cauldron”)
Beyond simple features: Do complex feature types need complex service descriptions?' B.N. Lawrence (1,2), D. Lowe (1,2), S. Pascoe (1,2) and A. Woolf (1).
Cancer Bioinformatics Grid (caBIG) CANS 2006 Chicago, Illinois
Ravi K Madduri, Argonne National Laboratory/ University of Chicago
StratusLab Final Periodic Review
StratusLab Final Periodic Review
The Anatomy and The Physiology of the Grid
Presentation transcript:

Ian Foster Computation Institute Argonne National Lab & University of Chicago Services for Science

2 Thanks! l DOE Office of Science l NSF Office of Cyberinfrastructure l National Institutes of Health l Colleagues at Argonne, U.Chicago, USC/ISI, OSU, Manchester, and elsewhere

3 Scientific Communication, ~1600 BraheKepler

4

5 Scientific Communication, ~2000 Data Archives User Analysis tools Gateway Figure: S. G. Djorgovski Discovery tools Service-Oriented Science

6 caBIG: sharing of infrastructure, applications, and data. Data Integration! Services & Cancer Biology Globus

7 Raffaele Montella, U. Napoli Parthenope Services & Environmental Science

8

9 Service-Oriented Science People create services (data, code, instr.) … which I discover (& decide whether to use) … & compose to create a new function... & then publish as a new service.  I find “someone else” to host services, so I don’t have to become an expert in operating services & computers!  I hope that this “someone else” can manage security, reliability, scalability, … !! “Service-Oriented Science”, Science, 2005

10 Creating Services People create services (data, code, instr.) … which I discover (& decide whether to use) … & compose to create a new function... & then publish as a new service.  I find “someone else” to host services, so I don’t have to become an expert in operating services & computers!  I hope that this “someone else” can manage security, reliability, scalability, … !! “Service-Oriented Science”, Science, 2005

11 Anatomy of a Service op1opN (meta)data Implementation(s) Clients Registry Management Clients … Service Attribute Authority Attribute Authority Persistence

12 Exposing Service State State Interfaces  Get resource property  Get multiple RPs  Set RP state  Query a set of RPs  Subscribe: notifications  Manage termination time  Immediate destruction Resource property Resource Service GetRP GetMultRPs SetRP QueryRPs Subscribe SetTermTime Destroy EPR State representation u Resource, resource property State identification u Endpoint Reference

13 Apache Tomcat Service Container Globus Toolkit Web Services Container RPs Resource Service GetRP GetMultRPs SetRP QueryRPs Subscribe SetTermTime Destroy EPR ResourceHome RPs Resource Service GetRP GetMultRPs SetRP QueryRPs Subscribe SetTermTime Destroy EPR ResourceHome RPs Resource Service GetRP GetMultRPs SetRP QueryRPs Subscribe SetTermTime Destroy EPR ResourceHome PIP PDP WorkManagerDB Conn Pool JNDI Directory Security Persistence Management State Authorization Globus Toolkit Version 4: Software for Service-Oriented Systems, LNCS 3779, 2-13, 2005 Globus

14 Creating Services (~2005) “This full-day tutorial provides an introduction to programming Java services with the latest version of the Globus Toolkit version 4 (GT4). The tutorial teaches how to build a Java Service that makes use of GT4 mechanisms for state management, security, registry and related topics.”

15 Appln Service Create Index service Store Repository Service Advertize Discover Invoke; get results Introduce Container Transfer GAR Deploy Argonne/U.Chicago & Ohio State University Creating Services in 2008 Introduce and gRAVI l Introduce u Define service u Create skeleton u Discover types u Add operations u Configure security l Grid Remote Application Virtualization Infrastructure u Wrap executables Globus

Creating Services Introduce + gRAVI Shannon Hastings Scott Oster David Ervin Stephen Langella Kyle Chard Ravi Madduri

17 Discovering Services People create services (data or functions) … which I discover (& decide whether to use) … & compose to create a new function... & then publish as a new service.  I find “someone else” to host services, so I don’t have to become an expert in operating services & computers!  I hope that this “someone else” can manage security, reliability, scalability, … !! “Service-Oriented Science”, Science, 2005

18  The ultimate arbiter?  Types, ontologies  Can I use it?  Billions of services Discovering Services Assume success Semantics Permissions Reputation A B

19 Discovery (1): Registries Globus

20 Discovery (2): Standardized Vocabularies Core Services Grid Service Uses Terminology Described In Cancer Data Standards Repository Enterprise Vocabulary Services References Objects Defined in Service Metadata Publishes Subscribes to and Aggregates Queries Service Metadata Aggregated In Registers To Discovery Client API Index Service Globus

21

22 Discovery (3): Tagging & Social Networking GLOSS: Generalized Labels Over Scientific data Sources (Foster, Nestorov)

23 Discovery (3): Tagging & Social Networking David de Roure, Carole Goble, et al.

24 Composing Services People create services (data or functions) … which I discover (& decide whether to use) … & compose to create a new function... & then publish as a new service.  I find “someone else” to host services, so I don’t have to become an expert in operating services & computers!  I hope that this “someone else” can manage security, reliability, scalability, … !! “Service-Oriented Science”, Science, 2005

25 Composing Services: E.g., BPEL Workflow System Data uchicago.edu Analytic osu.edu Analytic duke.edu <BPEL Workflow Doc> <Workflow Inputs> <Workflow Results> BPEL Engine link caBiG: BPEL work: Ravi Madduri et al. link See also Kepler & Taverna Globus

26 Composing Services Globus

Composing Services Taverna + GT4 Taverna team Wei Tan Ravi Madduri

28 Publishing Services People create services (data or functions) … which I discover (& decide whether to use) … & compose to create a new function... & then publish as a new service.  I find “someone else” to host services, so I don’t have to become an expert in operating services & computers!  I hope that this “someone else” can manage security, reliability, scalability, … !! “Service-Oriented Science”, Science, 2005

29 Publishing Services Description  Syntax, semantics State  Availability, load, … Policies  Who, what, when, … Hosting  Location, scalability, …

30 Defining Community: Membership and Laws l Identify VO participants and roles u For people and services l Specify and control actions of members u Empower members  delegation u Enforce restrictions  federate policy A 12 B 12 A B Access granted by community to user Site admission- control policies Effective Access Policy of site to community

31 Authorization: SAML & XACML VOMSShibbolethLDAP PERMIS … GT4 Client GT4 Server PDP Attributes Authorization Decision PIP SAML XACML Globus

32 Hosting Services People create services (data or functions) … which I discover (& decide whether to use) … & compose to create a new function... & then publish as a new service.  I find “someone else” to host services, so I don’t have to become an expert in operating services & computers!  I hope that this “someone else” can manage security, reliability, scalability, … !! “Service-Oriented Science”, Science, 2005

33 The Importance of “Hosting” and “Management” Tell me about this star Tell me about these 20K stars Support 1000s of users E.g., Sloan Digital Sky Survey, ~10 TB; others much bigger

34 The Two Dimensions of Service-Oriented Science l Decompose across network Clients integrate dynamically u Select & compose services u Select “best of breed” providers u Publish result as new services l Decouple resource & service providers Function Resource Data Archives Analysis tools Discovery tools Users Fig: S. G. Djorgovski

35 Dynamic Provisioning: Falkon Applied to Stacking l Purpose u On-demand “stacks” of random locations within ~10TB dataset l Challenge u Rapid access to K “random” files u Time-varying load l Solution u Dynamic acquisition of compute, storage = + S4S4 Sloan Data Web page or Web Service Joint work with Ioan Raicu & Alex Szalay Globus

Dynamic Provisioning with Falkon: Release after 15 Seconds Idle

Dynamic Provisioning with Falkon: Release after 180 Seconds Idle

Building Scalable Service Implementations Functional MRI Ben Clifford, Mihael Hatigan, Mike Wilde, Yong Zhao Globus

AIRSN Program Definition (Run snr) functional ( Run r, NormAnat a, Air shrink ) { Run yroRun = reorientRun( r, "y" ); Run roRun = reorientRun( yroRun, "x" ); Volume std = roRun[0]; Run rndr = random_select( roRun, 0.1 ); AirVector rndAirVec = align_linearRun( rndr, std, 12, 1000, 1000, "81 3 3" ); Run reslicedRndr = resliceRun( rndr, rndAirVec, "o", "k" ); Volume meanRand = softmean( reslicedRndr, "y", "null" ); Air mnQAAir = alignlinear( a.nHires, meanRand, 6, 1000, 4, "81 3 3" ); Warp boldNormWarp = combinewarp( shrink, a.aWarp, mnQAAir ); Run nr = reslice_warp_run( boldNormWarp, roRun ); Volume meanAll = strictmean( nr, "y", "null" ) Volume boldMask = binarize( meanAll, "y" ); snr = gsmoothRun( nr, boldMask, "6 6 6" ); } (Run or) reorientRun (Run ir, string direction) { foreach Volume iv, i in ir.v { or.v[i] = reorient(iv, direction); }

Virtual Node(s) SwiftScript Abstract computation Virtual Data Catalog SwiftScript Compiler SpecificationExecution Virtual Node(s) Provenance data Provenance data Provenance collector launcher file1 file2 file3 App F1 App F2 Scheduling Execution Engine (Karajan w/ Swift Runtime) Swift runtime callouts C CCC Status reporting Provisioning Falkon Resource Provisioner Amazon EC2 Dynamic Provisioning: Swift Architecture Yong Zhao, Mihael Hatigan, Ioan Raicu, Mike Wilde, Ben Clifford, et al. Globus

41 Services for Science l They’re new u A new approach to communicating u A (not-so new) approach to structuring systems l They’re real u Excellent infrastructure and tools (Globus, Introduce, gRAVI, Taverna, Swift, etc.) u Substantial numbers of services out there l They’re challenging u Sociology: incentives, rewards u Infrastructure: hosting u Provenance: justifying “results” u Scaling: services, requests