Using Kepler to Perform Parameter Studies in Subsurface Sciences
Jared Chase
Scientific Data Management CET All Hands Meeting, 11/28/2007
http://subsurface.pnl.gov
2 Project Descriptions/Goals
CHIPPS (Tim Scheibe, Environmental Technology). Project goal: to develop an integrated multiscale modeling framework with the capability of directly linking different process models at continuum, pore, and sub-pore scales.
SALSSA (Karen Schuchardt, Applied Computer Science). Project goal: to develop a process integration framework that combines and extends leading-edge technologies for process automation, data and metadata management, and large-scale data visualization.
GWACCAMOLE (Bruce Palmer, High Performance Computing). Project goal: to apply a component-based framework to the development of a new hybrid model for performing subsurface simulations that will combine different physical models into a single coherent simulation.
4 Calcite Precipitation Problem
We are interested in understanding (and ultimately controlling) the distribution of solid minerals that form from the reaction of two dissolved chemicals (solutes).
This study will help us understand how high- or low-permeability inclusions along a mixing pathway affect the effectiveness of mixing.
The results of modeling studies such as this will be used to design mesoscale laboratory experiments to validate our conclusions, which will in turn be used to design field-scale pilot and full-scale implementation strategies.
7 Project SALSSA’s Goals and Requirements
Create a system that:
  Automates and integrates research processes.
  Provides records for verifiability.
  Shares and documents data, results, tools, and hopefully processes.
  Can be used by all types of users, from model developers to experimentalists.
  Has longevity, so scientists can modify the system to suit their needs.
8 Continuum Workflow
[Flowchart; numbered steps as labeled in the diagram]
  1. Numerical Model Configuration (Fixed / Uncertain)
  2. Initial Grid Generation
  3. Grid Parameter Specification
  4. Run Numerical Model
  5. Output Visualization
  6. Output Analysis
  7. Qualitative / Quantitative Comparisons
  8. Horizontal Flow Simulation
  9. Grid Refinement
  10. Parameter Modification
  11. Mathematical Model Definition
  12. Data Preparation and Management
  13. Simulation Data Management (I/O Documentation and Storage)
  14. Summary Graphics
Decision point: Comparative Analysis (with results of previous runs and/or observational data). Done?
  Yes: Stop
  No: Refine Grid
  No: Modify Parameter(s)
  No: Modify Model
9 Calcite Precipitation Use Case
Create Stomp study
  1. First, run a single job with both the porosity and permeability the same, to serve as a base case.
  2. Next, run a set of jobs where the fine sand (material 2) becomes progressively less permeable (decrease the value by a factor of 10, 100, 1000). Keep porosity the same as in case #1.
  3. Starting with the settings from #1, increase permeability by a factor of 10, 100, 1000. Hold porosity the same.
  4. Starting with the settings from #1, keep permeability the same but decrease porosity by 0.05 for a couple of iterations. Again, this applies to the fine sand.
  5. Take the result where we decreased permeability by 10 and use it to create a new study. It's not clear to me why you would start a new study. Maybe it's just an artificial case of the notion of making a new study? We could also use the case of switching to a finer grid as the cause for a new study if you think it's less artificial.
[Diagram: Stomp Study, steps 1 to 5, with run labels]
  Stomp Run: [Permeability (init) = Porosity (init)]
  Stomp Run: [Permeability = Permeability (init) * 0.1]
  Stomp Run: [Permeability = Permeability (init) * 0.01]
  Stomp Run: [Permeability = Permeability (init) * 0.001]
  Stomp Run: [Porosity = Porosity (init) * 0.05]
  Stomp Run: [Porosity = Porosity (init) * 0.10]
  Stomp Run: [Porosity = Porosity (init) * 0.15]
  Stomp Run: [Permeability = Permeability (init) * 10]
  Stomp Run: [Permeability = Permeability (init) * 100]
  Stomp Run: [Permeability = Permeability (init) * 1000]
  Create New Stomp Study?
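Steps 1 to 4 of the use case can be sketched as a small generator of run settings. This is a minimal illustration only, not the SALSSA or Stomp tooling: the names `StompRun` and `build_study` are hypothetical, "decrease by 10, 100, 1000" is read as dividing by those factors (matching the `* 0.1` style diagram labels), and the porosity change is read as subtracting 0.05 per iteration, as the numbered text states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StompRun:
    """One job in the study (hypothetical record, not a Stomp input format)."""
    label: str
    permeability: float
    porosity: float


def build_study(base_permeability, base_porosity,
                factors=(10.0, 100.0, 1000.0),
                porosity_step=0.05, porosity_iterations=2):
    """Return the list of runs described in steps 1-4 of the use case."""
    # Step 1: base case.
    runs = [StompRun("base", base_permeability, base_porosity)]
    # Step 2: progressively less permeable, porosity unchanged.
    for f in factors:
        runs.append(StompRun(f"perm/{f:g}", base_permeability / f, base_porosity))
    # Step 3: progressively more permeable, porosity unchanged.
    for f in factors:
        runs.append(StompRun(f"perm*{f:g}", base_permeability * f, base_porosity))
    # Step 4: permeability unchanged, porosity reduced by a fixed step (assumed subtraction).
    for i in range(1, porosity_iterations + 1):
        runs.append(StompRun(f"por-{i}", base_permeability,
                             base_porosity - i * porosity_step))
    return runs


if __name__ == "__main__":
    # The base values here are placeholders chosen for illustration.
    for run in build_study(1.0e-12, 0.35):
        print(run)
```

Keeping each run as an immutable record makes it easy to hand the list to whatever launches the jobs, and to start a new study (step 5) by calling `build_study` again with one run's values as the new base.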
10 SALSSA Components and Architecture
[Architecture diagram; components and labels]
Data Services: Provenance Store, Content Store, Archive
  Content management via Alfresco
  User services
  Pluggable metadata extraction
  Provenance in Sesame RDF store
Organizer: User Tools, Editors
  Central organizing tool
  Long-term interactive workflows
  Data organization & access
Automated Workflow
  Job Execution
  Parameter Studies
  Job Monitoring
  Data Archiving
Analysis & Visualization
  Multiple viz tools (Tecplot, VisIt…)
  Parallel Visualization
  Hybrid visualizations
  Data Analysis
  Translation/Analysis workflows
Connections between components: Provenance and data management (RDF, file transfer); Analysis “Jobs”; Update Messages; Provenance Recording (RDF)
11 Applying Kepler to Subsurface Research Workflow
Using Kepler as an End User Tool
Approach
  End users are able to add components and tools.
  End users can manage their own processes using Kepler.
  End users would create their own workflows using pre-made higher-level actor abstractions.
Conclusion
  Kepler/Vergil is NOT suitable for end users. Most of this pertains to “workflow designers” as well.
  Complex Types
  Type Checking
  Recording of Provenance
  Animation
  Creating Actors
  Managing technology
  Multiple Instances of Kepler
  Robustness
12 Applying Kepler to Subsurface Research Workflow
Using Kepler for Job Execution
Execute Parameter Studies and Sensitivity Studies
  Launch and monitor multiple jobs using various queuing systems: SGE, LSF (mpp2), fork.
  Monitor each job within the workflow.
  Notify other tools of job state.
  Move input/output files.
Workflow Provenance Capture
  Working to define an API specific to provenance capture.
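The launch, monitor, and notify pattern above can be sketched outside Kepler for the simplest case, "fork" (local processes). This is an illustrative sketch only, not a Kepler actor: `run_jobs` and the `notify` callback are hypothetical names, and SGE or LSF submission would need the scheduler's own commands, which are not shown.

```python
import subprocess
import sys
import time


def run_jobs(commands, notify, poll_interval=0.05):
    """Launch each command as a local process and poll until all have exited.

    commands: dict mapping a job id to an argument list.
    notify:   callback(job_id, state) standing in for messages to other tools.
    Returns a dict mapping job id to exit code.
    """
    procs = {}
    for job_id, cmd in commands.items():
        procs[job_id] = subprocess.Popen(cmd)
        notify(job_id, "running")

    results = {}
    while procs:
        # Copy the items so finished jobs can be removed while iterating.
        for job_id, proc in list(procs.items()):
            code = proc.poll()  # None while the process is still running
            if code is not None:
                results[job_id] = code
                notify(job_id, "done" if code == 0 else "failed")
                del procs[job_id]
        if procs:
            time.sleep(poll_interval)
    return results


if __name__ == "__main__":
    # Placeholder jobs: trivial Python one-liners instead of real simulations.
    jobs = {
        "job-1": [sys.executable, "-c", "print('job 1 finished')"],
        "job-2": [sys.executable, "-c", "import sys; sys.exit(3)"],
    }
    print(run_jobs(jobs, lambda job_id, state: print(job_id, state)))
```

Polling keeps the sketch short; a real workflow would also move input and output files around each job and record the state changes for provenance.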
13 Issues To Address In Kepler
Performance
  Only one instance of Kepler can run on the client machine at a time.
  Kepler uses a lot of memory. Possibly there could be a mechanism for packaging just the parts you use.
  Kepler takes a long time to start up.
Building Workflows
  No simple plug-in model (à la Spring).
  A mechanism is needed to reuse/extend existing code instead of writing new custom classes (i.e., a framework for connecting existing components instead of a framework for developing components).
  Better documentation for actors is needed, so that the end user is not required to read code to understand components and know which ones can be hooked up.
  Components are at too low a level. There is a need for high-level components for job launching, monitoring, and file movement.
  Support for parameter studies, including a component for load balancing across machines.
  A system built for extensibility to complex and semantic data types.
  A set of actors should be built for easy iteration and parameter studies.
  More control is needed within execution domains (e.g., using PN Directors inside composite actors when a PN Director is used in the parent workflow; see http://www.mail- email@example.com/msg00381.html).
16 Demo
Workflow Parameters
  numInstances: the number of jobs that the workflow will execute.
  InputData: input data for each of the jobs.
17 Next Generation
User works within a “Study”, where a Study can be represented as a graph of processes and data inputs/outputs. Some processes are triggered by the user; others appear as by-products of user actions.
[Diagram: study graph]
  1. Baseline computation: Setup Stomp, parameters, Stomp.in, Stomp1.in, Launch, Job outputs, Some Analysis, graphics, More data, more Analysis
  2. Vary permeability in material 2 …: Setup Stomp branch, Stomp2.in, Launch, Job outputs
  3. Vary other parameters…
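The idea of a Study as a graph of processes and data can be sketched with a small lineage structure. This is a hypothetical illustration, not the SALSSA data model: the `Study` class and its methods are invented here, and the node names are borrowed loosely from the diagram labels.

```python
from collections import defaultdict


class Study:
    """A study as a directed graph: data -> process -> data (hypothetical)."""

    def __init__(self):
        # Maps each node to the nodes it was directly derived from.
        self._parents = defaultdict(list)

    def add_process(self, name, inputs, outputs):
        """Record a process together with the data it consumed and produced."""
        for src in inputs:
            self._parents[name].append(src)
        for out in outputs:
            self._parents[out].append(name)

    def lineage(self, node):
        """Return every upstream process and data item for a node."""
        seen, stack = [], [node]
        while stack:
            current = stack.pop()
            for parent in self._parents.get(current, []):
                if parent not in seen:
                    seen.append(parent)
                    stack.append(parent)
        return seen


study = Study()
# 1. Baseline computation
study.add_process("Setup Stomp", ["parameters"], ["Stomp1.in"])
study.add_process("Launch 1", ["Stomp1.in"], ["Job outputs 1"])
# 2. Vary permeability in material 2 (branch from the baseline input)
study.add_process("Setup Stomp branch", ["Stomp1.in"], ["Stomp2.in"])
study.add_process("Launch 2", ["Stomp2.in"], ["Job outputs 2"])

if __name__ == "__main__":
    print(study.lineage("Job outputs 2"))
```

Storing only "derived from" edges is enough to answer lineage questions such as which input file and setup step a given output came from, which is the kind of view a provenance-linked workflow would expose.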
18 Future SALSSA Work
Deploy current tools to INEL to support experimental work on the calcite precipitation problem.
Apply the current parameter study workflow to the CCA-based SPH code under development by Bruce Palmer.
Integrate SciDAC Visualization and Analysis capabilities.
Work with the SDM center to develop higher-level Kepler components:
  Job launching, file movement, real-time monitoring
A workflow environment that combines interactive and automated workflow into one environment with appropriate user abstractions:
  Connect all steps into a meta workflow through provenance
  User control over details of view
  Different views of data lineage, processing steps
Extend the Stomp UI wrapper to support more input options.
Support the Hybrid model and the additional processing required for setup, execution, and analysis.
19 Acknowledgment
Funding for this research is provided by the U.S. Department of Energy through the following programs:
  Office of Science, Biological and Environmental Research and Advanced Scientific Computing Research, Scientific Discovery through Advanced Computing (SciDAC) program.
  Office of Science, Biological and Environmental Research, Environmental Remediation Sciences Program (ERSP).