Accessing Grid Resources via Portals and Workflow Tools
Sriram Krishnan, Ph.D.

1 Accessing Grid Resources via Portals and Workflow Tools
Sriram Krishnan, Ph.D. sriram@sdsc.edu

2 NBCR Grid
[diagram labels: Gemstone, PMV/Vision, Kepler; Application Services, Security Services (GAMA), State Mgmt; Globus; Condor pool, SGE Cluster, PBS Cluster]

3 User Interfaces: Gemstone

4 User Interfaces: AutoDockTools (ADT), PMV

5 User Interfaces: What is a Portal?
“A portal is a web based application that commonly provides personalization, single sign on, content aggregation from different sources and hosts the presentation layer of Information Systems” (JSR 168)
Grid/Science Portals build upon the familiar Web portal model, such as Yahoo or Amazon, to deliver the benefits of Grid computing to virtual communities of users, providing a single access point to Grid services and resources.

6 User Interfaces: Portals
Pros
– Ubiquitous access to applications
– No need to install complex software
Cons
– Limited interaction with local desktop tools
– Interfaces may not be rich enough for complex tasks such as visualization
– Not very easy to make highly interactive interfaces

7 User Interfaces: The CAMERA Labs Portal

8 CAMERA Labs Demo

9 Portal Technology
Built on top of the GridSphere Portal Framework
– http://www.gridsphere.org
JSR 168 Portlet API compliant
– Similar to Servlet API in providing reusable Web applications
– Ratified in August 2003 by vendors including BEA, Sun, IBM, Oracle, Plumtree, etc.

10 What is a Portlet?
Standardized packaging model to share portlet applications among portal vendors
Builds off the Servlet API and spec, so no major surprises for existing Java portal developers
Supports window states and mode settings, like a desktop environment
API provides useful methods for storing per-user data and configuration settings

11 What makes GridSphere different?
Already many other open-source portals out there:
– Jetspeed2, uPortal, StringBeans, Exo, Liferay, JBoss
A handy template build system using Apache Ant:
– ant new-project
Lightweight: no EJB, based on popular, robust libraries
– e.g. Hibernate for persistence
Visual UI tags and beans make presentation development much easier
Support for the Grid!!
– GridPortlets offered as add-on webapp
– Provides a library and collection of portlets for credential support, job launch (GRAM), data transfer (GridFTP)
Used by several CyberInfrastructure projects like BIRN, NBCR, GEON, CAMERA
– Lots of reusable software!

12 Advanced Usage: Workflows
Need for automation of processes (scientific or otherwise)
– An end-to-end application is typically more than a single application run
– Must be reproducible and maintainable
– Should be easy to compose from individual components

13 Workflow Scenario: Business
[diagram labels: client, travel agent, airline A, airline B, bank/CC, delivery; buy a ticket, confirm, tickets arrive]

14 Scientific Workflows: Phylogeny Analysis
[diagram labels: Local Disk, Multiple Sequence Alignment, Phylogeny Analysis, Tree Visualization]

15 Scientific Workflow Systems
Combination of
– data integration, analysis, and visualization steps
– larger, automated "scientific process"
Mission of scientific workflow systems
– Promote “scientific discovery” by providing tools and methods to generate scientific workflows
– Create an extensible and customizable graphical user interface for scientists from different scientific domains
– Support computational experiment creation, execution, sharing, reuse and provenance
– Design frameworks which define efficient ways to connect to the existing data and integrate heterogeneous data from multiple resources

16 Why not just a Python script?
End-users who define, reuse, modify, and specialize workflows would find visual interfaces much easier than scripts
– Typically also possible to compile scripts from designed workflows
Other advantages:
– Modular reuse, application interoperability
– Debugging and monitoring
– Automated data management (e.g. provenance)
– Validation (e.g. data, structural, semantic typing)
From integrated modeling to execution, optimization, and archival

17 Kepler: A Scientific Workflow System
Ptolemy II: A laboratory for investigating design
KEPLER: A problem-solving environment for Scientific Workflow
KEPLER = “Ptolemy II + X” for Scientific Workflows
1st Beta release (June 2, 2006)
www.kepler-project.org
Builds upon the open-source Ptolemy II framework

18 Actor-Oriented Design
Actor
– Encapsulation of parameterized actions
– Interface defined by ports and parameters
Port
– Communication between input and output data
– Without call-return semantics
Model of computation
– Communication semantics among ports
– Flow of control
– Implementation is a framework
Actors: Processing Components
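The actor and port ideas on this slide can be illustrated with a minimal Python sketch. This is not Kepler or Ptolemy II code (those are Java frameworks); the names `Port`, `Actor`, `Scale`, `connect` and `fire` are hypothetical and chosen only for this illustration. It shows an actor whose interface consists solely of ports and parameters, and ports that hand tokens over through a queue instead of through a call that returns a value.

```python
from collections import deque


class Port:
    """A one-way channel: the sender puts tokens, the receiver takes them later."""

    def __init__(self):
        self._tokens = deque()

    def put(self, token):
        self._tokens.append(token)

    def has_token(self):
        return bool(self._tokens)

    def get(self):
        return self._tokens.popleft()


class Actor:
    """Interface = named ports plus parameters; behaviour lives in fire()."""

    def __init__(self, **parameters):
        self.parameters = parameters
        self.inputs = {}
        self.outputs = {}

    def fire(self):
        raise NotImplementedError


class Scale(Actor):
    """Hypothetical actor: multiplies each input token by its 'factor' parameter."""

    def __init__(self, factor):
        super().__init__(factor=factor)
        self.inputs["in"] = Port()
        self.outputs["out"] = Port()

    def fire(self):
        value = self.inputs["in"].get()
        self.outputs["out"].put(value * self.parameters["factor"])


def connect(source, out_name, sink, in_name):
    """Link two actors by letting them share one Port object."""
    sink.inputs[in_name] = source.outputs[out_name]


double = Scale(factor=2)
triple = Scale(factor=3)
connect(double, "out", triple, "in")

double.inputs["in"].put(7)  # a token arrives from outside the workflow
double.fire()               # the sender never calls the receiver directly
triple.fire()               # when this runs is decided elsewhere (by a director)
result = triple.outputs["out"].get()
```

Neither actor knows about the other; they only see their own ports, which is what lets a separate model of computation decide the flow of control.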

19 Available Actors
Generic Web Service Client and Web Service Harvester
Customizable RDBMS query and update
Command-line wrapper tools (local, ssh, scp, ftp, etc.)
Some Grid actors
– Globus Job runner, GridFTP-based file access, Proxy Certificate Generator
SRB support
Imaging, Visualization Support
Textual and Graphical Output
Some domain-specific actors for Geosciences and Bioinformatics

20 Directors: Definition of Workflow Semantics
Implement different computational models
Define the semantics of
– execution of actors and workflows
– interactions between actors
Kepler is extending Ptolemy directors with specialized ones for Web service based workflows, and distributed workflows
[diagram labels: Process Networks, Rendezvous, Publish and Subscribe, Continuous Time, Finite State Machines, Dataflow, Time Triggered, Synchronous/reactive model, Discrete Event, Wireless]
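To make the separation between actors and directors concrete, here is a small Python sketch, assuming a simple linear chain of actors. It is not the Ptolemy II director API: `sequential_director`, `threaded_director`, `wire`, `drain` and the `END` marker are all hypothetical names. The same two actors are run under two execution semantics: one fires each actor to completion in a fixed order, the other gives every actor its own thread and blocking queues, loosely in the spirit of the Process Networks model listed on the slide.

```python
import queue
import threading

END = None  # hypothetical end-of-stream marker


def source(inp, out):
    """Emit three tokens, then the end marker. Has no input channel."""
    for n in (1, 2, 3):
        out.put(n)
    out.put(END)


def square(inp, out):
    """Square every token until the end marker arrives."""
    while (token := inp.get()) is not END:
        out.put(token * token)
    out.put(END)


def wire(actors):
    """Create one unbounded queue per actor; actor i reads the queue of actor i-1."""
    channels = [queue.Queue() for _ in actors]
    wired = [(actor, channels[i - 1] if i else None, channels[i])
             for i, actor in enumerate(actors)]
    return wired, channels[-1]


def drain(channel):
    """Collect tokens from the final channel up to the end marker."""
    tokens = []
    while (token := channel.get()) is not END:
        tokens.append(token)
    return tokens


def sequential_director(actors):
    """Run each actor to completion, in order, in the calling thread."""
    wired, last = wire(actors)
    for actor, inp, out in wired:
        actor(inp, out)
    return drain(last)


def threaded_director(actors):
    """Run every actor concurrently; blocking reads synchronise them."""
    wired, last = wire(actors)
    threads = [threading.Thread(target=actor, args=(inp, out))
               for actor, inp, out in wired]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return drain(last)


workflow = [source, square]
sequential_result = sequential_director(workflow)
threaded_result = threaded_director(workflow)
```

The actors are unchanged between the two runs; only the director differs. The sequential version works here only because the queues are unbounded and the chain has no cycles, which is the kind of assumption a real model of computation has to state explicitly.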

21 Dataflow as a Computation Model
Dataflow: Abstract representation of how data flows in the system
A dataflow program: a graph
– Nodes represent operations, edges represent data paths
Sound, simple, powerful model of parallel computation
– NOT having a locus of control makes it simple!
– Naturally distributed model of computation:
  – Asynchronous: Many actors can be ready to fire simultaneously
  – Execution ("firing") of a node starts when (matching) data is available at a node's input ports.
  – Locally controlled events
  – Events correspond to the “firing” of an actor
  – Actor: a single instruction or a sequence of instructions
  – Actors fire when all the inputs are available
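The firing rule described on this slide can be sketched in a few lines of Python. This is an illustrative toy, not how Kepler schedules actors; the graph, the node names and `run_dataflow` are hypothetical. Each node fires as soon as every node it depends on has produced a value, and nodes that become ready in the same round could in principle run in parallel.

```python
import operator

# node name -> (operation, names of the nodes whose outputs it consumes)
graph = {
    "a": (lambda: 3, []),
    "b": (lambda: 4, []),
    "sum": (operator.add, ["a", "b"]),
    "prod": (operator.mul, ["a", "b"]),
    "diff": (operator.sub, ["prod", "sum"]),
}


def run_dataflow(graph):
    """Fire nodes whenever all their inputs are available; no fixed program order."""
    values = {}         # tokens produced so far, keyed by node name
    firing_rounds = []  # which nodes fired together in each round
    pending = set(graph)
    while pending:
        ready = sorted(
            name for name in pending
            if all(dep in values for dep in graph[name][1])
        )
        if not ready:
            raise ValueError("no node can fire: cycle or missing input")
        # Every node in a round sees only values from earlier rounds,
        # so the nodes of one round are independent of each other.
        produced = {
            name: graph[name][0](*(values[dep] for dep in graph[name][1]))
            for name in ready
        }
        values.update(produced)
        pending -= set(ready)
        firing_rounds.append(ready)
    return values, firing_rounds


values, rounds = run_dataflow(graph)
```

Here `sum` and `prod` become ready in the same round because both depend only on `a` and `b`, while `diff` has to wait for both of them.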

22 Vergil is the GUI for Kepler
Actor ontology and semantic search for actors
Search -> Drag and drop -> Link via ports
Metadata-based search for datasets
[screenshot labels: Actor Search, Data Search]

23 Actor Search
Kepler Actor Ontology
– Used in searching actors and creating conceptual views (= folders)
Currently more than 200 Kepler actors added!

24 Kepler Provenance Framework
OPTIONAL!
– Modeled as a separate concern in the system
– Listens to the execution and saves information customized by a set of parameters
Context: who, what, where, when, and why that is associated with the run
Input data and its associated metadata
Workflow outputs and intermediate data products
Workflow definition (entities, parameters, connections): a specification of what exists in the workflow and can have a context of its own
Information about the workflow evolution (workflow trail)
Types of Provenance Information:
– Data provenance: intermediate and end results including files and db references
– Process provenance: keep the workflow definition with data and parameters used in the run
– Error and execution logs
– Workflow design provenance

25 Kepler Provenance Recording Utility
Parametric and customizable
– Different report formats
– Variable levels of detail: verbose-all, verbose-some, medium, on error
– Multiple cache destinations
Saves information on
– User name, Date, Run, etc.
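The idea of provenance as an optional listener with a configurable level of detail can be sketched as follows. This is a Python illustration under stated assumptions, not the Kepler provenance recorder: `ProvenanceRecorder`, `run_workflow`, the callback names and the meaning given to the two levels are hypothetical, and only two of the four levels named on the slide are modelled.

```python
import datetime


class ProvenanceRecorder:
    """Listens to run events; the workflow itself never depends on it."""

    LEVELS = ("verbose-all", "on-error")  # assumed subset of the slide's levels

    def __init__(self, user, run_id, level="verbose-all"):
        if level not in self.LEVELS:
            raise ValueError(f"unknown level: {level}")
        self.level = level
        # Context of the run: who, which run, when.
        self.context = {
            "user": user,
            "run": run_id,
            "date": datetime.date.today().isoformat(),
        }
        self.records = []

    def on_fire(self, actor_name, inputs, output):
        # Successful firings are kept only at the most detailed level.
        if self.level == "verbose-all":
            self.records.append(
                {"actor": actor_name, "inputs": inputs, "output": output}
            )

    def on_error(self, actor_name, inputs, error):
        # Errors are recorded at every level.
        self.records.append(
            {"actor": actor_name, "inputs": inputs, "error": repr(error)}
        )


def run_workflow(steps, value, listeners=()):
    """Run steps in order, notifying listeners; works with no listeners at all."""
    for name, func in steps:
        try:
            result = func(value)
        except Exception as exc:
            for listener in listeners:
                listener.on_error(name, value, exc)
            raise
        for listener in listeners:
            listener.on_fire(name, value, result)
        value = result
    return value


steps = [("double", lambda x: x * 2), ("invert", lambda x: 1 / x)]

full = ProvenanceRecorder(user="alice", run_id=1, level="verbose-all")
answer = run_workflow(steps, 4, listeners=[full])

quiet = ProvenanceRecorder(user="alice", run_id=2, level="on-error")
try:
    run_workflow(steps, 0, listeners=[quiet])  # "invert" divides by zero
except ZeroDivisionError:
    pass
```

Because the recorder is passed in as a listener, the same workflow runs unchanged with full recording, error-only recording, or no recording at all.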

26 Kepler Basics: Hello World Demo

27 Advanced Kepler: MEME-MAST Workflow

28 Advantages of Scientific Workflow Systems
Formalization of the scientific process
Easy to share, adapt and reuse
– Deployable, customizable, extensible
Management of complexity and usability
– Support for hierarchical composition
– Interfaces to different technologies from a unified interface
– Can be annotated with domain-knowledge
Tracking provenance of the data and processes
– Keep the association of results to processes
– Make it easier to validate/regenerate results and processes
– Enable comparison between different workflow versions
Execution monitoring and fault tolerance
Interaction with multiple tools and resources at once

29 Summary
Presented access to Grid applications via Portals and Workflow tools
References
– PMV, ADT: http://mgltools.scripps.edu/
– CAMERA: http://camera.calit2.net
– GridSphere: http://www.gridsphere.org
– Kepler: http://www.kepler-project.org

30 Acknowledgements
CAMERA labs portal built in conjunction with the rest of the CAMERA team
Several slides borrowed from Kepler tutorials presented by Ilkay Altintas [altintas@sdsc.edu]

