Pyre: a distributed component framework Michael Aivazis Caltech DANSE Developers Workshop January 22-23, 2007.

Presentation transcript:


2 Overview
  What is a distributed component framework?
    what is a component?
    what is a framework?
    why be distributed?
  Why bother building a framework?
    is it the solution to any relevant problem?
    is it the right solution?
  High level description of the specific solution provided by pyre

3 Pyre overview
  Projects
    Caltech ASC Center (DOE)
    Computational Infrastructure in Geodynamics (NSF)
    DANSE (NSF)
  Portability
    languages: C, C++, F77, F90
    compilers: all native compilers on supported platforms, gcc, Absoft, PGI
    platforms: all common Unix variants, OSX, Windows
  Statistics
    1200 classes, 75,000 lines of Python, 30,000 lines of C++
    largest run: nirvana at LANL, 1764 processors for 24 hrs, generated 1.5 Tb

4 Flexibility through the use of scripting
  Scripting enables us to
    organize the large number of simulation parameters
    allow the simulation environment to discover new capabilities without the need for recompilation or relinking
  The python interpreter
    the interpreter
      modern object oriented language
      robust, portable, mature, well supported, well documented
      easily extensible
      rapid application development
    support for parallel programming
      trivial embedding of the interpreter in an MPI compliant manner
      a python interpreter on each compute node
      MPI is fully integrated: bindings + OO layer
  No measurable impact on either performance or scalability
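
The point about discovering new capabilities without recompilation or relinking can be illustrated with a minimal sketch. It uses only the standard-library importlib module and is not Pyre code: the helper load_capability and the requested table are hypothetical names, and standard-library modules stand in for simulation components.

```python
import importlib


def load_capability(module_name, attribute):
    """Import a module by name at runtime and return one of its attributes."""
    module = importlib.import_module(module_name)
    return getattr(module, attribute)


# The names would normally come from user input or a configuration file.
# Standard-library modules stand in for simulation components here.
requested = {"encoder": ("json", "dumps"), "root": ("math", "sqrt")}

capabilities = {
    key: load_capability(module_name, attribute)
    for key, (module_name, attribute) in requested.items()
}

print(capabilities["root"](16.0))
print(capabilities["encoder"]({"steps": 10}))
```

Because the lookup happens by name at run time, adding a capability means making a new module importable and naming it in the table; nothing already loaded has to be rebuilt.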

5 User stereotypes
  End-user: occasional user of prepackaged and specialized analysis tools
  Application author: author of prepackaged specialized tools
  Expert user: investigator with a specific scientific goal
  Domain expert: author of analysis, modeling or simulation software
  Software integrator: responsible for extending software with new technology
  Framework maintainer: responsible for maintaining and extending the infrastructure

6 Facilitating common tasks
  Interchangeable components; e.g. create and initialize a fluid mesh by
    reading some geometry input description
    reading a checkpoint file
    invoking a user provided callback for setting initial conditions
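
A minimal sketch of the interchangeable-component idea on this slide, assuming a toy one-dimensional mesh stored as a dictionary. The class names FromGeometry, FromCheckpoint, and FromCallback, the function create_fluid_mesh, and the JSON checkpoint format are all hypothetical and are not taken from Pyre.

```python
import json


class FromGeometry:
    """Build a uniform 1D mesh from a geometry description (hypothetical format)."""

    def __init__(self, length, cells):
        self.length, self.cells = length, cells

    def initialize(self):
        step = self.length / self.cells
        nodes = [i * step for i in range(self.cells + 1)]
        return {"nodes": nodes, "values": [0.0] * len(nodes)}


class FromCheckpoint:
    """Restore a mesh that was saved earlier as JSON text."""

    def __init__(self, text):
        self.text = text

    def initialize(self):
        return json.loads(self.text)


class FromCallback:
    """Set initial conditions by calling a user-provided function at each node."""

    def __init__(self, nodes, callback):
        self.nodes, self.callback = list(nodes), callback

    def initialize(self):
        return {"nodes": self.nodes, "values": [self.callback(x) for x in self.nodes]}


def create_fluid_mesh(initializer):
    # The caller depends only on the shared initialize() method.
    return initializer.initialize()


checkpoint = json.dumps({"nodes": [0.0, 1.0], "values": [3.0, 4.0]})
meshes = [
    create_fluid_mesh(FromGeometry(length=2.0, cells=4)),
    create_fluid_mesh(FromCheckpoint(checkpoint)),
    create_fluid_mesh(FromCallback([0.0, 0.5, 1.0], lambda x: 2.0 * x)),
]
```

The three initializers can be swapped freely because create_fluid_mesh never inspects which one it was given.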

7 Distributed services
  [Diagram labels: Workstation; Front end; Compute nodes; launcher; journal; monitor; solid; fluid]

8 Pyre: the integration architecture
  Pyre is a software architecture
    a specification of the organization of the software system
    a description of the crucial structural elements and their interfaces
    a specification for the possible collaborations of these elements
    a strategy for the composition of structural and behavioral elements
  Pyre is multi-layered
    flexibility
    complexity management
    robustness under evolutionary pressures
  Pyre is a component framework
  [Diagram labels: application-general; application-specific; framework; computational engines]

9 Example application
  [Diagram labels: controller; coupler; optimizer; script; gui; cgi; analysis; stager; journal; archiver; monitor; viz]

10 Component architecture
  The integration framework is a set of co-operating abstract services
  [Diagram labels: component; bindings; library; extension; custom code; core; facility; framework; service; requirement; implementation; package; FORTRAN/C/C++; python]

11 Encapsulating critical technologies
  Extensibility
    new algorithms and analysis engines
    technologies and infrastructure
  High-end computations
    visualization
    easy access to large data sets: single runs, backgrounds, archived data, metadata
    distributed computing
    parallel computing
  Flexibility
    interactivity: web, GUI, scripts
    must be able to debug almost everything on a laptop

12 Component
  [Component schematic labels: input ports; output ports; properties; component core; name; control]

13 Component anatomy
  Core: encapsulation of computational engines
    middleware that manages the interaction between the framework and codes written in low level languages
  Harness: an intermediary between a component’s core and the external world
    framework services: control port deployment
    core services: deployment, launching, teardown

14 Component core
  Three tier encapsulation of access to computational engines
    engine
    bindings
    facility implementation by extending abstract framework services
  Cores enable the lowest integration level available
    suitable for integrating large codes that interact with one another by exchanging complex data structures
    UI: text editor
  [Diagram labels: facility; bindings; custom code; core]

15 Computational engines
  Normal engine life cycle
    deployment: staging, instantiation, static initialization, dynamic initialization, resource allocation
    launching: input delivery, execution control, hauling of output
    teardown: resource de-allocation, archiving, execution statistics
  Exceptional events
    core dumps, resource allocation failures
    diagnostics: errors, warnings, informational messages
    monitoring: debugging information, self consistency checks
  Distributed computing
  Parallel processing
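
The life cycle above (deployment, launching, teardown) and the handling of exceptional events can be sketched as follows. This is an illustration under stated assumptions, not Pyre's engine interface: the Engine class, its method names, and the run driver are hypothetical, and a RuntimeError stands in for a resource allocation failure.

```python
class Engine:
    """Toy computational engine that records the stages it goes through."""

    def __init__(self, fail=False):
        self.fail = fail
        self.log = []

    def deploy(self):
        # Staging, instantiation, initialization, resource allocation.
        self.log.append("deploy")

    def launch(self, data):
        # Input delivery and execution; the sum stands in for real work.
        self.log.append("launch")
        if self.fail:
            raise RuntimeError("resource allocation failure")
        return sum(data)

    def teardown(self):
        # Resource de-allocation, archiving, execution statistics.
        self.log.append("teardown")


def run(engine, data):
    """Drive an engine through its life cycle and collect diagnostics."""
    diagnostics = []
    result = None
    engine.deploy()
    try:
        result = engine.launch(data)
    except RuntimeError as error:
        diagnostics.append("error: %s" % error)
    finally:
        # Teardown runs whether or not the launch succeeded.
        engine.teardown()
    return result, diagnostics


healthy = Engine()
broken = Engine(fail=True)
print(run(healthy, [1, 2, 3]))
print(run(broken, [1, 2, 3]))
```

Placing teardown in the finally clause is the design choice worth noting: resources are released after an exceptional event as well as after a normal run.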

16 Component harness
  The harness
    collects and delivers user configurable parameters
    interacts with the data transport mechanisms
    guides the core through the various stages of its lifecycle
    provides monitoring services
  Parallelism and distributed computing are achieved by specialized harness implementations
  The harness enables the second level of integration
    adding constraints makes code interaction more predictable
    provides complete support for an application generic interface
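
One responsibility listed above, collecting user configurable parameters and delivering them to the core, can be sketched like this. The Harness class, the --name=value argument convention, and the advect function are hypothetical illustrations and do not reproduce Pyre's configuration machinery.

```python
class Harness:
    """Collect user-configurable properties and deliver them to a core."""

    def __init__(self, core, defaults):
        self.core = core
        self.properties = dict(defaults)

    def configure(self, argv):
        """Override defaults from arguments of the form --name=value."""
        for argument in argv:
            if not argument.startswith("--") or "=" not in argument:
                continue
            name, value = argument[2:].split("=", 1)
            if name not in self.properties:
                raise KeyError("unknown property: " + name)
            # Convert the string to the type of the declared default.
            kind = type(self.properties[name])
            self.properties[name] = kind(value)

    def run(self):
        # The core receives plain values and never sees the command line.
        return self.core(**self.properties)


def advect(steps, dt):
    """Stand-in for a computational core: returns the simulated time span."""
    return steps * dt


harness = Harness(advect, {"steps": 10, "dt": 0.5})
harness.configure(["--steps=4"])
print(harness.run())
```

The core stays a plain function of typed values, so the same core could be driven from a script, a GUI, or a web form by swapping the harness.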

17 Support for concurrent applications
  Python as the driver for concurrent applications that
    are embarrassingly parallel
    have custom communication strategies: sockets, ICE, shared memory
  Excellent support for MPI
    mpipython.exe: MPI enabled interpreter (needed only on some platforms)
    mpi: package with python bindings for MPI
      support for staging and launching
      communicator and processor group manipulation
      support for exchanging python objects among processors
    mpi.Application: support for launching and staging MPI applications
      descendant of pyre.application.Application
      auto-detection of parallelism
      fully configurable at runtime
      used as a base class for user defined application classes

18 Support for distributed computing
  We are in the process of migrating the existing support for distributed processing into gsl, a new package that completely encapsulates the middleware
  Provide both user space and grid-enabled solutions
    User space
      ssh, scp
      pyre service factories and component management
    Web services
      pyGridWare from Keith Jackson’s group
  Advanced features
    dynamic discovery for optimized deployment
    reservation system for computational resources

19 Ports and pipes
  Ports further enable the physical decoupling of components by encapsulating data exchange
  Runtime connectivity implies a two stage negotiation process
    when the connection is first established, the io ports exchange abstract descriptions of their requirements
    appropriate encoding and decoding takes place during data flow
  Pipes are data transport mechanisms chosen for efficiency
    intra-process or inter-process
    components need not be aware of the location of their neighbors
  Standardized data types obviate the need for a complicated runtime typing system
    meta-data in a format that is easy to parse (XML)
    tables
    histograms
  [Diagram labels: data pipe; ports]
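
The two stage negotiation can be sketched as follows, assuming that each port describes itself with a plain data type name. The classes OutputPort, InputPort, and Pipe are hypothetical, and JSON is swapped in for the XML metadata mentioned on the slide purely to keep the example short.

```python
import json


class OutputPort:
    def __init__(self, provides):
        self.provides = provides  # abstract description, e.g. "histogram"


class InputPort:
    def __init__(self, accepts):
        self.accepts = set(accepts)  # data type names this port can consume
        self.received = []


class Pipe:
    """Connect two ports; negotiate once, then encode and decode per message."""

    def __init__(self, source, sink):
        # Stage 1: the ports exchange descriptions when the connection is made.
        if source.provides not in sink.accepts:
            raise TypeError("sink cannot accept %r" % source.provides)
        self.kind = source.provides
        self.sink = sink

    def send(self, payload):
        # Stage 2: encoding and decoding happen during data flow.
        wire = json.dumps({"type": self.kind, "data": payload})
        message = json.loads(wire)
        self.sink.received.append(message["data"])


sink = InputPort(accepts=["histogram", "table"])
pipe = Pipe(OutputPort("histogram"), sink)
pipe.send({"bins": [0, 1], "counts": [5]})
print(sink.received)
```

Neither port knows whether the other lives in the same process; only the pipe would change if the wire format or transport did.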

20 Component implementation strategy
  Write engine
    custom code, third party libraries
    modularize by providing explicit support for life cycle management
    implement handling of exceptional events
  Construct python bindings
    select entry points to expose
  Integrate into framework
    construct object oriented veneer
    extend and leverage framework services
  Cast as a component
    provide object that implements component interface
    describe user configurable parameters
    provide meta data that specify the IO port characteristics
    code custom conversions from standard data streams into lower level data structures
  All steps are well localized!

21 Cost/benefit
  Drawbacks
    some reengineering required
    paradigm shift
    learning curve, not helped by the lack of documentation…
  Benefits
    clear path forward for “legacy” applications
    easy, normalized access to large number of facilities
    structured way for enabling engines in modern computational environments
    rigorous separation of UI from computational engines
    easy re-hosting of compliant applications

22 Status update
  Database access
    backend access
    data types
    SQL queries
  Application hosting (user interfaces)
    GUI
    web portals
    web services
  Distributed services
    semi-asynchronous, fully asynchronous
    authentication, GUIDs, session management, monitoring
    distributed control layer
    “steering”

23 Wrap up
  Expect pyre 1.0 early 2007
    largely a documentation effort
    minor re-design of some internals
    re-examination of the component-inventory coupling
  There is a lot of material on the web
    under extensive reorganization
    currently at
    soon to be at
  Contact info