DANSE Distributed Data Analysis for Neutron Scattering Experiments Michael M. McKerns, Michael A.G. Aivazis, Tim M. Kelley, June Kim, and Brent Fultz.


DANSE Distributed Data Analysis for Neutron Scattering Experiments Michael M. McKerns, Michael A.G. Aivazis, Tim M. Kelley, June Kim, and Brent Fultz Materials Science and Applied Physics California Institute of Technology

Abstract
The DANSE system will merge the various computational tasks of neutron scattering into a unified, component-based run-time environment. Standard components will implement data analysis, visualization, modeling, and instrument simulation for all areas of neutron scattering. A core technology of DANSE is an open source framework that supports the components and mediates their interactions. Within the DANSE environment, users will be able to mix and match different software components without compilation, and execute calculations seamlessly across distributed resources. DANSE will provide tools to help instrument scientists and expert users migrate their existing routines (written in any number of languages) to components, and an interface that will allow new and casual users to access a stock set of standard analysis applications or configure their own new computing procedures for novel experiments. The modular structure of DANSE parallels the steps of data analysis performed by scientists, thus making it a natural environment for creating flexible computing procedures. DANSE will lower barriers to sharing software, and extend the experimentalist’s toolkit with capabilities of analysis and interpretation such as high-performance simulations (band structure, molecular dynamics, etc.), co-analysis of data from multiple experiments, and real-time feedback for experimental control.

An introduction to DANSE
DANSE is a community-organizing project with the potential to provide a unique facility/user interaction:
  – a single environment for data analysis, visualization, modeling, and instrument simulation for all areas of neutron scattering
  – a collaborative effort between software professionals, neutron scattering scientists, and facilities
  – provides tools for remote collaboration and co-analysis
  – support from members of the international community and from the directors of SNS, IPNS, HFIR/CNS, Lujan Center, NCNR
  – potentially the software environment for all instruments at the SNS
DANSE provides a unified component-based runtime environment for computational neutron scattering:
  – open-source framework provides seamless use of distributed and high-performance resources
  – a flexible, extensible, dynamic, interactive, cross-platform, cross-compiler, object-oriented software architecture
  – integration of legacy codes and community-standard software
  – well suited for the development of new science, standard stock computation, quality and plausibility assessment, and as an educational tool

Tools for each level of user
Beginning student
  – user of prepackaged tools and documentation as a learning environment
Visiting scientist
  – user of prepackaged and specialized analysis tools
Instrument scientist
  – author of prepackaged specialized tools
Analysis expert
  – author of analysis, modeling or simulation software
Established researcher
  – collaboration coordinator, designer of new analysis procedures
Software integrator
  – responsible for extending software with new technology
Framework maintainer
  – responsible for maintaining and extending the DANSE infrastructure

Encourages Better Science
More science from experiment execution
  – Single crystals on chopper spectrometers
  – Feedback control for engineering diffraction
  – Alter experiment depending on results:
      visualization of science trends, not data trends (e.g., see structure, not I(Q))
      on-demand modeling, ab-initio calculations
      reality checks against scattering theory
Better science by planning experiments
  – Plausibility tests before submitting a proposal
  – Assessment of sample plus instrument
  – Contingency planning using prior simulations
  – Assessments of trends in previous data

Facilitates New Science
New science with better data analysis
  – FEM calculations of strains in microstructures
  – Monte-Carlo inversions of S(Q,E) to obtain parameters of structure and dynamics models
  – Model refinements with multiple data sets
New science by leveraging theory
  – VASP, CASTEP, ABINIT are commodities today; use them for assessing structures and dynamics
  – Micromechanics – correlations of local strains
  – Phase diagrams – thermodynamic functions
  – Ab-initio calculations of spin interactions
  – Soft matter structure – atomic force fields guided by diffraction

Simulation and plausibility testing on virtual instruments

Ni Pd Pt

Phonon Partition Function, fcc Ni
    for E, g_E in spectrum:
        Z *= one_osc(E, T) ** g_E
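The loop on this slide multiplies single-oscillator partition functions, each raised to the weight g_E of its energy in the spectrum. A minimal sketch follows. It assumes `one_osc` is the partition function of a quantum harmonic oscillator, exp(-x/2) / (1 - exp(-x)) with x = E / (k_B T); the slide does not define it. The helper `partition_function` and the three-point spectrum are made up for illustration and are not the fcc Ni phonon data.

```python
import math

K_B = 8.617333262e-5  # Boltzmann constant in eV/K


def one_osc(E, T):
    """Partition function of one quantum harmonic oscillator.

    Assumed form (not given on the slide): exp(-x/2) / (1 - exp(-x)),
    with x = E / (k_B T), E in eV and T in K.
    """
    x = E / (K_B * T)
    return math.exp(-x / 2.0) / (1.0 - math.exp(-x))


def partition_function(spectrum, T):
    """Multiply the oscillator terms, weighting each energy by g_E."""
    Z = 1.0
    for E, g_E in spectrum:
        Z *= one_osc(E, T) ** g_E
    return Z


# Hypothetical (E, g_E) pairs, for illustration only.
spectrum = [(0.010, 1.0), (0.020, 1.5), (0.030, 0.5)]
Z = partition_function(spectrum, T=300.0)
```

For a spectrum with many modes, accumulating the sum of g_E * log(one_osc(E, T)) instead of the product is a common way to avoid overflow or underflow.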

Built on the Pyre integration architecture
Pyre is a robust, stable foundation
  – 75,000 lines of Python; 30,000 lines of C++
  – multiply leveraged DoE ASCI project
Pyre is a software architecture:
  – a specification of the organization of the software system
  – a description of the crucial structural elements and their interfaces
  – a specification for the possible collaborations of these elements
  – a strategy for the composition of structural and behavioral elements
Pyre is multi-layered
  – flexibility
  – complexity management
  – robustness under evolutionary pressures
Pyre is a component framework
[Diagram labels: application-general, application-specific, framework, computational engines]

Component architecture
The integration framework is a set of co-operating abstract services
[Diagram labels: component, bindings, library, extension, custom code, core, facility, framework, service, requirement, implementation, package; FORTRAN/C/C++, python]

A Path for Software of Today
[Diagram labels: ANL, LANL, NIST, ISIS; java, F77, IDL, Matlab; ISAW, GSAS, DAVE, Mslice, …]
Finer-Grained Interoperable Components

Component dataflow
Granularity allows reusability of object-oriented components
[Diagram labels: NeXusReader, Selector (Bckgrnd), Selector (Energy), NeXusWriter; times, instrument info, raw counts; filename, time interval, energy bins, filename]
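The dataflow on this slide chains small components so that each one consumes the previous component's output. The sketch below illustrates that idea under stated assumptions: every class name, the `connect` helper, and the property names are hypothetical and are not the Pyre or DANSE API. In-memory tuples stand in for a NeXus file, and the mapping of "time interval" and "energy bins" onto the two selectors is a guess from the diagram labels.

```python
class Component:
    """Hypothetical base class: one processing step with named properties."""

    def __init__(self, **properties):
        self.properties = properties

    def process(self, data):
        raise NotImplementedError


class Reader(Component):
    """Stand-in for NeXusReader: returns in-memory (time, energy, counts) events."""

    def process(self, data=None):
        return list(self.properties["events"])


class TimeSelector(Component):
    """Keep events whose time falls inside [t_min, t_max)."""

    def process(self, events):
        t_min, t_max = self.properties["interval"]
        return [ev for ev in events if t_min <= ev[0] < t_max]


class EnergySelector(Component):
    """Sum counts into the energy bins defined by consecutive bin edges."""

    def process(self, events):
        edges = self.properties["edges"]
        histogram = [0] * (len(edges) - 1)
        for _, energy, counts in events:
            for i in range(len(edges) - 1):
                if edges[i] <= energy < edges[i + 1]:
                    histogram[i] += counts
                    break
        return histogram


class Writer(Component):
    """Stand-in for NeXusWriter: stores the result instead of writing a file."""

    def process(self, histogram):
        self.properties["sink"].append(histogram)
        return histogram


def connect(*components):
    """Chain components so each output feeds the next component's input."""
    def run(data=None):
        for component in components:
            data = component.process(data)
        return data
    return run


# Made-up events: (time, energy, counts).
events = [(0.5, 1.0, 3), (1.5, 2.5, 4), (2.5, 1.2, 5), (1.8, 1.9, 2)]
sink = []
pipeline = connect(
    Reader(events=events),
    TimeSelector(interval=(1.0, 3.0)),
    EnergySelector(edges=[1.0, 2.0, 3.0]),
    Writer(sink=sink),
)
result = pipeline()
```

Because each step only depends on the shape of its input, a selector can be swapped or reused in another procedure without touching the reader or writer.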

Component Templates Standard Data Streams Python objects Standard communication protocol between components that can reside anywhere Data Flow Paradigm histograms tables meta-data Code Place Name Place Initiate, terminate, error properties

'''Multiphonon.py
Calculates the multiphonon scattering, using a phonon DOS...
'''
from mpFunctions import *

def run(All_Inputs_List):
    """Multiphonon.py main loop..."""
    # check user inputs for validity, get data from disk
    checkUserInput(input_arglist)
    setup_arglist = setupRun(run_arglist)
    # 1-phonon quantities, multiphonon terms
    single_arglist = onePhonon(arglist)
    multi_arglist = multiPhonon(N_arglist)
    # prepare results for output, send to disk, etc.
    output_arglist = prepareResults(result_arglist)
    outputResults(output_arglist)
    return

if __name__ == '__main__':
    """Run main loop if launched standalone."""
    from mpUserInput import *
    run(All_Inputs_List)

Encapsulation
Abstraction
Launched standalone or inside Analysis Procedure

Component implementation strategy
Write engine
  – custom code, third party libraries
  – modularize by providing explicit support for life cycle management
  – implement handling of exceptional events
Construct python bindings
  – select entry points to expose
Integrate into framework
  – construct object oriented veneer
  – extend and leverage framework services
Cast as a component
  – provide object that implements component interface
  – describe user configurable parameters
  – provide meta data that specify the IO port characteristics
  – code custom conversions from standard data streams into lower level data structures

Flexibility through the use of scripting
Scripting enables us to
  – organize large numbers of user tunable parameters
  – allow the runtime environment to discover new capabilities without the need for recompilation or relinking
  – compose computations at runtime
The interpretive environment:
  – Python is a modern object oriented language
      robust, portable, mature, well supported, well documented
      easily extensible
      rapid application development
  – has been extended to support parallel programming
  – has no measurable impact on either performance or scalability
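Runtime discovery and composition, as described on this slide, can be illustrated with Python's standard `importlib` module: the names of the steps are plain data, so a new capability becomes available by naming it rather than by recompiling. This is only a sketch of the general technique, not how Pyre does it; `load_component` and `compose` are hypothetical helpers, and standard-library functions stand in for analysis components.

```python
import importlib


def load_component(module_name, attribute_name):
    """Import a module by name at runtime and return one of its attributes."""
    module = importlib.import_module(module_name)
    return getattr(module, attribute_name)


def compose(steps):
    """Build a callable that applies the named steps in order."""
    functions = [load_component(m, a) for m, a in steps]

    def run(value):
        for function in functions:
            value = function(value)
        return value
    return run


# The step list is plain data; it could come from a config file or user input.
steps = [("math", "sqrt"), ("math", "floor")]
procedure = compose(steps)
answer = procedure(17.0)
```

Changing the procedure means editing the `steps` list; nothing is rebuilt or relinked.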

Encapsulating critical technologies
Extensibility
  – new algorithms and analysis engines
  – technologies and infrastructure
High end
  – visualization
  – easy access to large data sets
      single runs, backgrounds, archived data
      metadata
  – distributed computing
  – parallel computing
Flexibility:
  – interactivity: web, GUI, scripts
  – must be able to do almost everything on a laptop

Data Analysis as a Distributed Service
Data analysis is a service controlled by the user
User’s laptop issues commands and receives results
Computation is arranged by your client software

Support for distributed computing
We are in the process of migrating the existing support for distributed processing into gsl, a new package that completely encapsulates the middleware
Provide both user space and grid-enabled solutions
User space:
  – ssh, scp
  – pyre service factories and component management
Web services
  – pyglobus
Advanced features
  – dynamic discovery for optimized deployment
  – reservation system for computational resources

Strengthening the neutron community
[Diagram labels: Fultz/Aivazis, Billinge, Ustundag, Kienzle, Butler, Fultz/Trouw]

3 SNS instruments on-line in IDT instruments
[Beamline diagram labels: PROTONS; Areas for User and Instrument Support; Reflectometers]
Engineering Diffractometer – BL 9
SANS – BL 6
Cold Neutron Chopper Spectrometer – BL 5
Magnetism – BL 4a
Liquids – BL 4b
High Pressure Diffractometer – BL 3
Backscattering Spectrometer – BL 2
Disordered Materials Diffractometer – BL 1b
ARCS Spectrometer – BL 18
High Resolution Chopper Spectrometer – BL 17
Single Crystal Diffractometer – BL 12
Fundamental Physics Beamline – BL 13
Powder Diffractometer – BL 11a
Powder Diffractometer – BL 11a
Software needs to be on-line to support BL 2, 4a, 4b, 5, 18

[Flowchart; boxes listed in extracted order]
Crystal model: C1XX, C1XY…
Calculate force constant matrix Phi_{alpha beta}(0 l_ kappa kappa_)
Sweep reciprocal space
Calculate dynamical matrix D(q)
Diagonalize D(q)
Update DOS histogram
Output DOS
Initial guess
Compare with experimental DOS
Powell minimize
Output force constants
Decision boxes: End?; RMS converged? (branches: n, y)
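The loop in this flowchart (guess force constants, sweep reciprocal space, diagonalize D(q), histogram the frequencies, compare with the experimental DOS, repeat) can be sketched on a toy model. The code below makes several assumptions that are not in the slide: a 1D monatomic chain with a single nearest-neighbor spring constant K replaces the crystal model, so D(q) is a 1x1 matrix; the "experimental" DOS is synthetic; and a scan over a short candidate list stands in for the Powell minimizer. All function names are hypothetical.

```python
import math


def dispersion(q, K, m=1.0, a=1.0):
    """Frequency of a 1D monatomic chain: the 1x1 dynamical matrix D(q)
    is (2K/m)(1 - cos(qa)), and 'diagonalizing' it is a square root."""
    d_q = (2.0 * K / m) * (1.0 - math.cos(q * a))
    return math.sqrt(max(d_q, 0.0))


def dos_histogram(K, n_q=2000, n_bins=20, w_max=2.5):
    """Sweep the Brillouin zone and accumulate a normalized DOS histogram."""
    width = w_max / n_bins
    counts = [0] * n_bins
    for i in range(n_q):
        q = -math.pi + 2.0 * math.pi * (i + 0.5) / n_q  # uniform q grid
        w = dispersion(q, K)
        counts[min(int(w / width), n_bins - 1)] += 1
    return [c / (n_q * width) for c in counts]


def rms(dos_a, dos_b):
    """Root-mean-square difference between two histograms."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(dos_a, dos_b)) / len(dos_a))


def fit_force_constant(experimental_dos, candidates):
    """Return the candidate K whose DOS is closest to the experimental one.
    A plain scan is used here in place of the Powell minimizer."""
    return min(candidates, key=lambda K: rms(dos_histogram(K), experimental_dos))


# Synthetic "experiment": a DOS generated from a known force constant.
K_true = 1.0
experimental = dos_histogram(K_true)
K_fit = fit_force_constant(experimental, [0.6, 0.8, 1.0, 1.2, 1.4])
```

In a real refinement the scan would be replaced by a derivative-free minimizer such as Powell's method, which suits an objective built from histograms because it needs no gradients.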