National Aeronautics and Space Administration Jet Propulsion Laboratory Supporting Science Through Workflows: Infrastructure, Architecture and Modeling.


1 National Aeronautics and Space Administration Jet Propulsion Laboratory Supporting Science Through Workflows: Infrastructure, Architecture and Modeling David Woollard NASA Jet Propulsion Laboratory University of Southern California

2 National Aeronautics and Space Administration Jet Propulsion Laboratory D.M. Woollard. Supporting Science Through Workflows.2 Agenda »Motivation »Classification of in silico Experimentation »Research Problem »Related Work »Introduction to Workflow Systems »Research Goals »Methodology »Refactoring existing software »Domain Specific Software Architecture »Evaluation »Conclusions & Future Work

3 Motivation
The nature of scientific investigations has changed. Two major trend lines:
–Simulation via computer has, for many, replaced in vivo and in vitro science.
–Collaborations are growing (system of systems science).
New discoveries in materials science, chemistry, physics, planetary science, and even social sciences are made via in silico experimentation.

4 in silico Experimentation
Discovery is a phase in which a scientist rapidly prototypes, tests hypotheses, and develops a methodology.
[Figure: the phases Discovery, Production, and Distribution, with the labels Theory, Practice, Development, Execution, and Lone Researcher; after [Kepner 03].]

5 in silico Experimentation
Production is the engineering of replicating an experiment on large volumes of data.
We will focus on Production Systems in this talk.
[Figure: the phases Discovery, Production, and Distribution, with Production labeled "Production Systems".]

6 in silico Experimentation
Distribution is a phase in which data is dispersed to peers for review and further experimentation, including:
–Papers
–Federated Data
–Digital Libraries
[Figure: the phases Discovery, Production, and Distribution.]

7 The Role of Technology
In silico science, especially system of systems science, is facilitated by the Grid.
“The sharing that we are concerned with is not primarily file exchange but rather direct access to computers, software, data, and other resources, as is required by a range of collaborative problem-solving and resource-brokering strategies emerging in industry, science, and engineering.” The Anatomy of the Grid (2001)

8 Research Problem
Scientists harness complex hardware and software systems in order to conduct scientific research in silico.
Once algorithms and processes are established, production systems are created to produce large volumes of data.
Designing a production system is a complex engineering task as well as a complex scientific task.
Meeting these production requirements forces either scientists to engineer a production system or software engineers to rewrite scientific code. This is both inefficient and costly.

9 Introduction to Workflows
Grid Systems have traditionally focused on creating Virtual Organizations.
In Grids, workflows orchestrate processing tasks in production systems.
Workflows are a processing model that incorporates actors, tasks, data, and rules.
Workflow management systems execute tasks on data once each task’s dependencies are satisfied, based on rules.
[Figure: labels Production Systems, Grid Systems, Virtual Organizations, and Workflows; a workflow graph of tasks T0 through T4.]
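The execution rule on this slide (run a task once its dependencies are satisfied) can be sketched in a few lines of Python. This is a minimal illustration, not the scheduler of any system named in the talk: the function name `run_workflow`, the task bodies, and the edges between T0 through T4 are assumptions, since the transcript does not preserve the topology of the slide's graph.

```python
from collections import deque


def run_workflow(tasks, dependencies):
    """Run each task once all of its prerequisites have completed.

    tasks: dict mapping task name -> zero-argument callable
    dependencies: dict mapping task name -> set of prerequisite task names
    Returns the list of task names in the order they were executed.
    """
    # Copy the prerequisite sets so the caller's data is not modified.
    remaining = {name: set(dependencies.get(name, ())) for name in tasks}

    # Reverse index: for each task, which tasks are waiting on it.
    dependents = {name: [] for name in tasks}
    for name, deps in remaining.items():
        for dep in deps:
            if dep not in tasks:
                raise ValueError(f"{name} depends on unknown task {dep}")
            dependents[dep].append(name)

    # Tasks with no prerequisites are ready immediately.
    ready = deque(name for name, deps in remaining.items() if not deps)
    order = []
    while ready:
        name = ready.popleft()
        tasks[name]()
        order.append(name)
        # Completing a task may satisfy the last dependency of others.
        for child in dependents[name]:
            remaining[child].discard(name)
            if not remaining[child]:
                ready.append(child)

    if len(order) != len(tasks):
        raise ValueError("workflow contains a dependency cycle")
    return order


# Hypothetical topology: T0 fans out to T1 and T2, which join at T3, then T4.
log = []
tasks = {name: (lambda n=name: log.append(n)) for name in ["T0", "T1", "T2", "T3", "T4"]}
dependencies = {"T1": {"T0"}, "T2": {"T0"}, "T3": {"T1", "T2"}, "T4": {"T3"}}
order = run_workflow(tasks, dependencies)
```

A production workflow manager would also apply rules about data availability and resources before dispatching a ready task; here "ready" means only that all prerequisite tasks have finished.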

10 Workflow System Model
[Figure: workflow system model.]

11 Workflows, Workflows Everywhere
Condor-G, Pegasus, Wings, Taverna, Grid Workflow, Yawl, DAG-Man, Triana, ICENI, VDS, GridAnt, GrADS, GridFlow, Unicore, Gridbus, Askalon, Kepler, Karajan, SciFlow, OODT

12 Bottom-up Taxonomy
Yu & Buyya presented a taxonomy [Yu & Buyya 05]:
–Based on workflow properties like model representation and scheduling policy
–Illustrates the divergence in the field
No existing taxonomy classifies workflow systems by their interface to task code.

13 Insights from an Architect
Each production workflow task is a complex software application with two primary stakeholders: the scientist and the engineer.
Software architectures are a system’s blueprint: its form, elements, and rationale [Perry & Wolf, 92].
An architecture provides appropriate views for each stakeholder in addition to encapsulation of computation and communication. These are the architecture’s components, connectors and topology.
Reification of architectural elements in code is a method of bridging the gap between design and implementation. First-class connectors and explicit interfaces are such reifications.

14 Research Goals
–Develop a Domain Specific Software Architecture (DSSA) for tasks in scientific workflows.
–Develop a methodology for refactoring existing scientific code into this DSSA.
–Minimize overhead (computation time and memory footprint).
–Maximize science code reuse.

15 Agenda »Motivation »Classification of in silico Experimentation »Research Problem »Related Work »Introduction to Workflow Systems »Research Goals »Methodology »Refactoring existing software »Domain Specific Software Architecture »Evaluation »Conclusions & Future Work

16 Decomposing Software
Decomposition, the first step in the approach, is a process in which scientific modules are identified and control flow is determined.
Scientific modules are like functions: they have internal scope and a single entry and exit point. In graph theoretic terms, the call dominancy tree for the basic blocks in the module has only one source and one sink.
The proper level of decomposition is dependent on both scientific functionality and engineering requirements. Therefore, it should be “tunable.”
[Figure: the steps of the approach: Decomposition, Architecting, Deployment.]
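The single-entry, single-exit property can be checked mechanically once a candidate module's basic blocks are written down as a graph. The sketch below is a simplification and not the author's tooling: it works on a plain control flow graph (a dict of successor lists) rather than on a dominance tree, and the names `entry_and_exit_blocks` and `is_candidate_module` are hypothetical. It treats a block with no predecessors as a source and a block with no successors as a sink, so a loop that jumps back to the first block would hide the entry; it is only a rough filter.

```python
def entry_and_exit_blocks(cfg):
    """Return (sources, sinks) for a control flow graph.

    cfg: dict mapping basic block name -> list of successor block names.
    A source has no predecessors; a sink has no successors.
    """
    blocks = set(cfg)
    for successors in cfg.values():
        blocks.update(successors)
    with_predecessor = {s for successors in cfg.values() for s in successors}
    sources = sorted(blocks - with_predecessor)
    sinks = sorted(b for b in blocks if not cfg.get(b))
    return sources, sinks


def is_candidate_module(cfg):
    """True when the graph has exactly one source and exactly one sink."""
    sources, sinks = entry_and_exit_blocks(cfg)
    return len(sources) == 1 and len(sinks) == 1


# A loop with one way in and one way out qualifies as a module.
loop_module = {"entry": ["test"], "test": ["body", "exit"], "body": ["test"], "exit": []}

# Two distinct exit blocks: this region would need to be split or merged.
two_exits = {"entry": ["ok", "error"], "ok": [], "error": []}

loop_ok = is_candidate_module(loop_module)
two_exits_ok = is_candidate_module(two_exits)
```

Choosing how large a region to test is where the "tunable" granularity comes in: the same check can be applied to a whole subroutine or to a smaller region inside it.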

17 “Injecting” Architecture
In the second part of the approach, these modules must be “architected” into a workflow task with connectors to services at appropriate levels (to satisfy production requirements).
We use Prism-MW wrappers to encapsulate and componentize these decomposed modules. This provides us with a standard interface and utilities at the module level for employing event-based communication.
We use the Exogenous Connector style [Lau et al.] to mimic the original control and data flow in the workflow task and augment these connectors with a specialized version of the invoking connector.
[Figure: the steps of the approach: Decomposition, Architecting, Deployment.]
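The division of labor described here (wrapped modules that only compute, connectors that own control flow and reach out to services) can be illustrated with a small sketch. It does not use the Prism-MW API, and the classes `ModuleWrapper` and `SequenceConnector`, the event dictionaries, and the `on_event` hook are all hypothetical names chosen for illustration. The hook stands in for whatever production service (logging, provenance, cataloging) a real connector would invoke.

```python
class ModuleWrapper:
    """Wraps one decomposed scientific module behind a uniform event interface."""

    def __init__(self, name, func):
        self.name = name
        self._func = func  # the unchanged scientific code

    def handle(self, event):
        # Unpack the incoming event, run the science code, emit a result event.
        result = self._func(event["payload"])
        return {"name": f"{self.name}.done", "payload": result}


class SequenceConnector:
    """Exogenous-style connector: control flow lives here, not in the modules.

    Wrapped modules never call each other; the connector invokes them in
    order and forwards each result event to the next module.
    """

    def __init__(self, wrappers, on_event=None):
        self._wrappers = list(wrappers)
        self._on_event = on_event  # stand-in for a production service

    def invoke(self, payload):
        event = {"name": "start", "payload": payload}
        for wrapper in self._wrappers:
            event = wrapper.handle(event)
            if self._on_event is not None:
                self._on_event(event)
        return event["payload"]


# Two toy "scientific modules" composed without knowing about each other.
seen = []
scale = ModuleWrapper("scale", lambda values: [2 * v for v in values])
total = ModuleWrapper("total", sum)
task = SequenceConnector([scale, total], on_event=lambda e: seen.append(e["name"]))
answer = task.invoke([1, 2, 3])
```

Because the science functions are passed in unchanged, swapping the connector (for example, to add a branch or a different service hook) does not require touching the scientific code.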

18 Deploying to the Grid
Deployment is the last step in our approach.
We currently deploy the resulting workflow component into the OODT Science Data System environment. This is a grid workflow management system used at JPL.
We should note that this choice is purely for the sake of developer convenience; the approach should be deployable to any target workflow management system.
[Figure: the steps of the approach: Decomposition, Architecting, Deployment.]

19 SWSA Architecture
Scientific Workflow Software Architecture (SWSA), a domain specific software architecture for workflow tasks.

20 Preliminary Evaluation
We chose a canonical scientific application (matrix multiplication) implemented in both Fortran and C.
Six different metrics were taken:
–Execution time for: base application; wrapper (no data exchanged); wrapper (data exchanged)
–Memory footprint for: base application; wrapper (no data exchanged); wrapper (data exchanged)
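A measurement harness for metrics of this kind might look like the sketch below. It is an assumption-laden illustration in Python, whereas the evaluation on the slide used Fortran and C: the helper names `matmul`, `wrapped_matmul`, `measure`, and `overhead_percent` are hypothetical, and relative overhead is taken here to mean (wrapped - base) / base, which the slides do not state explicitly.

```python
import time
import tracemalloc


def matmul(a, b):
    """Plain triple-loop matrix multiplication on lists of lists."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(rows)
    ]


def wrapped_matmul(a, b):
    """Same computation, routed through an event-like envelope (hypothetical)."""
    event = {"name": "multiply", "payload": (a, b)}
    return matmul(*event["payload"])


def measure(func, *args):
    """Return (result, elapsed seconds, peak traced bytes) for one call."""
    tracemalloc.start()
    start = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, elapsed, peak


def overhead_percent(base, wrapped):
    """Relative overhead of the wrapped measurement, in percent."""
    return 100.0 * (wrapped - base) / base


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
base_result, base_time, base_peak = measure(matmul, a, b)
wrap_result, wrap_time, wrap_peak = measure(wrapped_matmul, a, b)
```

For matrices this small the timings are dominated by noise, so a real comparison would use large inputs and repeat each measurement several times before applying `overhead_percent`.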

21 Preliminary Evaluation
Refactoring Methodology Example: Molecular Dynamics Simulation
Performance results are very promising:
–Time Overhead: 1.85%
–Code Reuse: 96.77%

22 Conclusions & Future Work
Scientific Workflow Software Architecture (SWSA) improves upon existing workflow systems by providing:
–A methodology for accessing services.
–A separation of concerns between scientific algorithms and production features of code.
–A clean separation of roles between the scientist and the engineer.
Satisfies the “cult of performance.”
Future Work
–Extended evaluation on more advanced simulation codes.
–Expansion of the architecture to support parallel codes.

23 Thank You
Portions of this research were conducted at the Jet Propulsion Laboratory managed by the California Institute of Technology under a contract with the National Aeronautics and Space Administration.
For more information, please see:
D. Woollard, N. Medvidovic, Y. Gil, and C. Mattmann. “Scientific Software as Workflows: From Discovery to Distribution.” To appear in IEEE Software Special Issue on Developing Scientific Software, 2008.
D. Woollard, D. Freeborn, E. Kay-Im, S. LaVoie. “Case Studies in Science Data Systems: Meeting Software Challenges in Competitive Environments.” To appear in Proceedings of the 10th International Conference on Space Operations (SpaceOps-2008), AIAA press, Heidelberg, Germany, May 2008.
D. Woollard. “Supporting Scientific Workflows Through First-Class Connectors.” Qualifying Examination Report. University of Southern California. May, 2007.
D. Woollard, C. Mattmann, and N. Medvidovic. “Injecting Software Architectural Constraints into Legacy Scientific Applications.” USC Center for Software Engineering Technical Report, USC-CSE-2007-701, January 2007.

