Using Kepler to Perform Parameter Studies in Subsurface Sciences Jared Chase Scientific Data Management CET All Hands Meeting 11/28/2007



2 Project Descriptions/Goals
CHIPPS (Tim Scheibe, Environmental Technology)
  Project Goal: To develop an integrated multiscale modeling framework with the capability of directly linking different process models at continuum, pore, and sub-pore scales.
SALSSA (Karen Schuchardt, Applied Computer Science)
  Project Goal: To develop a process integration framework that combines and extends leading edge technologies for process automation, data and metadata management, and large-scale data visualization.
GWACCAMOLE (Bruce Palmer, High Performance Computing)
  Project Goal: To apply a component-based framework to the development of a new hybrid model for performing subsurface simulations that will combine different physical models into a single coherent simulation.

3 Different Scale Models
Continuum, Pore, Sub-Pore

4 Calcite Precipitation Problem
We are interested in understanding (and ultimately controlling) the distribution of solid minerals that form from the reaction of two dissolved chemicals (solutes).
This study will allow us to understand how high- or low-permeability inclusions along a mixing pathway affect the effectiveness of mixing.
The results of modeling studies such as this will be used to design mesoscale laboratory experiments to validate our conclusions, which will in turn be used to design field-scale pilot and full-scale implementation strategies.

5 Hybrid Multiscale Modeling Benchmark Problem

6

7 Project SALSSA’s Goals and Requirements
Create a system that:
  Automates and integrates research processes.
  Provides records for verifiability.
  Shares and documents data, results, tools, and, ideally, processes.
  Can be used by all types of users, from model developers to experimentalists.
  Has longevity, so scientists can modify the system to suit their needs.

8 Continuum Workflow
[Flow chart; boxes listed by the numbers shown on the slide]
  1 Numerical Model Configuration (Fixed / Uncertain)
  2 Initial Grid Generation
  3 Grid Parameter Specification
  4 Run Numerical Model
  5 Output Visualization
  6 Output Analysis
  7 Qualitative / Quantitative Comparisons
  8 Horizontal Flow Simulation
  9 Grid Refinement
  10 Parameter Modification
  11 Mathematical Model Definition
  12 Data Preparation and Management
  13 Simulation Data Management (I/O Documentation and Storage)
  14 Summary Graphics
  Comparative Analysis (with results of previous runs and/or observational data)
  Done? Yes: Stop. No: Refine Grid, Modify Parameter(s), or Modify Model.

9 Calcite Precipitation Use Case
Create Stomp study
  1. First run a single job with both the porosity and permeability the same to serve as a base case.
  2. Next run a set of jobs where the fine sand (material 2) becomes progressively less permeable (decrease the value by factors of 10, 100, 1000). Keep porosity the same as in case #1.
  3. Starting with the settings from #1, increase permeability by factors of 10, 100, 1000. Hold porosity the same.
  4. Starting with the settings from #1, keep permeability the same but decrease porosity by 0.05 for a couple of iterations. Again, this applies to the fine sand.
  5. Take the result where we decreased permeability by 10 and use it to create a new study. It's not clear to me why you would start a new study. Maybe it's just an artificial case of the notion of making a new study? We could also use the case of switching to a finer grid as the cause for a new study if you think it's less artificial.
[Diagram: Stomp Study]
  Stomp Run: [Permeability (init) = Porosity (init)]
  Stomp Run: [Permeability = Permeability (init) * 0.1]
  Stomp Run: [Permeability = Permeability (init) * 0.01]
  Stomp Run: [Permeability = Permeability (init) * 0.001]
  Stomp Run: [Porosity = Porosity (init) * 0.05]
  Stomp Run: [Porosity = Porosity (init) * 0.10]
  Stomp Run: [Porosity = Porosity (init) * 0.15]
  Stomp Run: [Permeability = Permeability (init) * 10]
  Stomp Run: [Permeability = Permeability (init) * 100]
  Stomp Run: [Permeability = Permeability (init) * 1000]
  Create New Stomp Study?
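The parameter study in steps 1 to 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `StompRun`, `build_study`, and the base values in the usage note are hypothetical names and numbers, not part of Stomp or Kepler. Porosity is decreased by subtracting 0.05 per iteration, following the wording of step 4; the diagram labels show multiplicative porosity factors instead, so that reading may differ from the author's intent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StompRun:
    """One job in the study (hypothetical container, not a Stomp type)."""
    label: str
    permeability: float
    porosity: float


def build_study(base_perm, base_poro,
                perm_factors=(0.1, 0.01, 0.001, 10, 100, 1000),
                poro_step=0.05, poro_iterations=2):
    """Return the base case plus one run per varied parameter value."""
    # Step 1: the base case.
    runs = [StompRun("base", base_perm, base_poro)]
    # Steps 2 and 3: scale permeability down and up, porosity unchanged.
    for factor in perm_factors:
        runs.append(StompRun(f"permeability x{factor:g}",
                             base_perm * factor, base_poro))
    # Step 4: lower porosity in fixed steps, permeability unchanged.
    for i in range(1, poro_iterations + 1):
        delta = i * poro_step
        runs.append(StompRun(f"porosity -{delta:.2f}",
                             base_perm, round(base_poro - delta, 10)))
    return runs
```

Calling `build_study(1e-12, 0.35)` with these made-up base values yields nine runs: the base case, six permeability variations, and two porosity variations. Each run would then be turned into its own Stomp input file and submitted as a separate job.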

10 SALSSA Components and Architecture
[Architecture diagram; components grouped as they appear on the slide]
Data Services
  Provenance Store, Content Store, Archive
  Content management via Alfresco
  User services
  Pluggable metadata extraction
  Provenance in Sesame RDF store
Organizer
  User Tools, Editors
  Central organizing tool
  Long term interactive workflows
  Data organization & access
Automated Workflow
  Job Execution
  Parameter Studies
  Job Monitoring
  Data Archiving
  Analysis
Analysis & Visualization
  Multiple viz tools (Tecplot, VisIt…)
  Parallel Visualization
  Hybrid visualizations
  Data Analysis
  Translation/Analysis workflows
Connections between components
  Provenance and data management (RDF, file transfer)
  “Jobs”
  Update Messages
  Provenance Recording (RDF)

11 Applying Kepler to Subsurface Research Workflow
Using Kepler as an End User Tool
Approach
  End users are able to add components and tools.
  End users can manage their own processes using Kepler.
  End users would create their own workflows using pre-made higher-level actor abstractions.
Conclusion
  Kepler/Vergil is NOT suitable for end users. Most of this pertains to “workflow designers” as well.
  Complex Types
  Type Checking
  Recording of Provenance
  Animation
  Creating Actors
  Managing technology
  Multiple Instances of Kepler
  Robustness

12 Applying Kepler to Subsurface Research Workflow
Using Kepler for Job Execution
Execute Parameter Studies and Sensitivity Studies
  Launch and monitor multiple jobs using various queuing systems: SGE, LSF (mpp2), fork.
  Monitor each job within the workflow.
  Notify other tools of job state.
  Move input/output files.
Workflow Provenance Capture
  Working to define an API specifically for provenance capture.
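The launch, monitor, and notify steps listed above can be sketched outside Kepler for the simplest case, the "fork" queue, where each job is a local child process. This is a minimal illustration and not the Kepler actors used in the project: `launch_fork`, `monitor`, and the state names are hypothetical, and submission to SGE or LSF is not covered.

```python
import subprocess
import sys
import time


def launch_fork(command):
    """Start one job as a local child process (the 'fork' queue case)."""
    return subprocess.Popen(command,
                            stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)


def monitor(jobs, notify, poll_seconds=0.1):
    """Poll every job until all have finished.

    jobs maps a job name to its Popen handle. notify(name, state) is
    called on every state change, which is how other tools would be
    told about job state. Returns the final state of each job.
    """
    states = {}
    for name in jobs:
        states[name] = "running"
        notify(name, "running")
    while "running" in states.values():
        for name, proc in jobs.items():
            if states[name] != "running":
                continue
            code = proc.poll()  # None while the process is still alive
            if code is not None:
                states[name] = "done" if code == 0 else "failed"
                notify(name, states[name])
        if "running" in states.values():
            time.sleep(poll_seconds)
    return states


if __name__ == "__main__":
    # Two stand-in jobs: trivial Python processes instead of Stomp runs.
    demo = {
        "job1": launch_fork([sys.executable, "-c", "pass"]),
        "job2": launch_fork([sys.executable, "-c", "pass"]),
    }
    monitor(demo, lambda name, state: print(name, state))
```

Keeping the notification as a plain callback means the same loop could report to a log, a user interface, or a provenance recorder without changes to the monitoring logic.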

13 Issues To Address In Kepler
Performance
  Only one instance of Kepler can run on the client machine at a time.
  Kepler uses a lot of memory. Possibly there could be a mechanism for packaging just the parts you use.
  Kepler takes a long time to start up.
Building Workflows
  No simple plug-in model (à la Spring). A mechanism is needed to reuse/extend existing code instead of writing new custom classes (i.e., a framework for connecting existing components instead of a framework for developing components).
  Better documentation for actors, so that the end user is not required to read code to understand components and know which ones can be connected.
  Components are at too low a level. There is a need for high-level components for job launching/monitoring/file movement.
  Support for parameter studies, including a component for load balancing across machines.
  A system built for extensibility to complex and semantic data types.
  A set of actors should be built for easy iteration and parameter studies.
  More control is needed within execution domains (e.g., using PN Directors inside composite actors when a PN Director is used in the parent workflow).
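To make the load-balancing request concrete, here is one possible strategy such a component could use: a greedy assignment that always gives the next-largest job to the machine with the least work so far. The slides do not specify any strategy, so this is an assumption for illustration only; `balance`, the job names, the cost estimates, and the machine names are all hypothetical.

```python
import heapq


def balance(job_costs, machines):
    """Assign jobs to machines, largest job first, least-loaded machine first.

    job_costs maps a job name to an estimated cost (for example, expected
    run time). Returns a dict mapping each machine to its list of jobs.
    """
    # Min-heap of (accumulated load, machine name).
    heap = [(0.0, machine) for machine in machines]
    heapq.heapify(heap)
    assignment = {machine: [] for machine in machines}
    # Placing the most expensive jobs first keeps the final loads closer.
    for job, cost in sorted(job_costs.items(),
                            key=lambda item: item[1], reverse=True):
        load, machine = heapq.heappop(heap)
        assignment[machine].append(job)
        heapq.heappush(heap, (load + cost, machine))
    return assignment
```

For example, `balance({"a": 4, "b": 3, "c": 2, "d": 1}, ["m1", "m2"])` places jobs `a` and `d` on `m1` and jobs `b` and `c` on `m2`, giving each machine a total estimated cost of 5.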

14 Organizer

15 Stomp Input Wizard

16 Demo
Workflow Parameters
  numInstances: the number of jobs that the workflow will execute.
  InputData: input data for each of the jobs.
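How the two workflow parameters relate can be shown with a small sketch. It assumes that InputData holds exactly one entry per job, which the slide implies but does not state; `plan_jobs` and the file names in the usage note are hypothetical and not part of the demo workflow.

```python
def plan_jobs(num_instances, input_data):
    """Pair each of the num_instances jobs with its own input entry."""
    if num_instances != len(input_data):
        raise ValueError(
            f"numInstances is {num_instances} but InputData has "
            f"{len(input_data)} entries"
        )
    # Jobs are numbered from 1 so they match names like Stomp1.in.
    return [{"job": index, "input": entry}
            for index, entry in enumerate(input_data, start=1)]
```

With `plan_jobs(2, ["Stomp1.in", "Stomp2.in"])`, job 1 receives `Stomp1.in` and job 2 receives `Stomp2.in`. A mismatch between the two parameters is reported before any job is launched.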

17 Next Generation
User works within a “Study”, where a Study can be represented as a graph of processes and data inputs/outputs. Some processes are triggered by the user; others appear as by-products of user actions.
[Diagram: a study graph in three stages]
  1. Baseline computation
  2. Vary permeability in material 2 …
  3. Vary other parameters…
  Process nodes: Setup Stomp, Setup Stomp branch, Launch, Some Analysis, more Analysis
  Data nodes: parameters, Stomp.in, Stomp1.in, Stomp2.in, Job outputs, graphics, More data
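A Study represented as a graph of processes and data can be sketched with a small adjacency structure. This is an illustration, not the SALSSA data model: the `Study` class and its methods are hypothetical, and the edges in the usage note are one assumed reading of the diagram, whose connections are not recoverable from the transcript.

```python
from collections import defaultdict


class Study:
    """Directed graph linking processes to the data they read and write."""

    def __init__(self):
        # Maps each node to the nodes it was directly derived from.
        self.parents = defaultdict(list)

    def add_process(self, name, inputs, outputs):
        """Record a process, the data it consumed, and the data it produced."""
        for source in inputs:
            self.parents[name].append(source)
        for product in outputs:
            self.parents[product].append(name)

    def lineage(self, node):
        """Return every process and data item that node depends on."""
        found = []
        stack = [node]
        while stack:
            current = stack.pop()
            for parent in self.parents.get(current, []):
                if parent not in found:
                    found.append(parent)
                    stack.append(parent)
        return found


if __name__ == "__main__":
    study = Study()
    # Assumed edges for the baseline computation and one branch.
    study.add_process("Setup Stomp", ["parameters"], ["Stomp1.in"])
    study.add_process("Launch", ["Stomp1.in"], ["Job outputs"])
    study.add_process("Some Analysis", ["Job outputs"], ["graphics"])
    study.add_process("Setup Stomp branch", ["Stomp1.in"], ["Stomp2.in"])
    print(study.lineage("graphics"))
    print(study.lineage("Stomp2.in"))
```

Storing only parent links keeps lineage queries simple: walking backwards from any output recovers the processing steps and inputs that produced it, which is the kind of question a provenance view of a study has to answer.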

18 Future SALSSA Work
Deploy current tools to INEL to support experimental work on the calcite precipitation problem.
Apply the current parameter study workflow to the CCA-based SPH code under development by Bruce Palmer.
Integrate SciDAC Visualization and Analysis capabilities.
Work with the SDM center to develop higher-level Kepler components:
  Job launching, file movement, real-time monitoring
Build a workflow environment that combines interactive and automated workflows with appropriate user abstractions:
  Connect all steps into a meta workflow through provenance
  User control over details of view
  Different views of data lineage, processing steps
Extend the Stomp UI wrapper to support more input options.
Support the Hybrid model and the additional processing required for setup, execution, and analysis.

19 Acknowledgment
Funding for this research is provided by the U.S. Department of Energy through the following programs:
  Office of Science, Biological and Environmental Research and Advanced Scientific Computing Research, Scientific Discovery through Advanced Computing (SciDAC) program.
  Office of Science, Biological and Environmental Research, Environmental Remediation Sciences Program (ERSP).