A Software Framework for Distributed Services
Michael M. McKerns and Michael A.G. Aivazis
California Institute of Technology, Pasadena, CA 91125


Sections: Introduction; Deployment Model; Interfacing to the Physical Model; Execution and Deployment Strategy; Handling Iterators with Services; References; Acknowledgements

Acknowledgements
The authors acknowledge portions of this work developed as part of the DANSE project, funded by the NSF under grant DMR-0520547, and the Caltech PSAAP Center, funded by DOE under grant DE-FC52-08NA28613.

References
[1] pyre http://danse.us/trac/pyre
[2] DANSE http://danse.us
[3] PARK http://www.reflectometry.org/danse/park.html
[4] SciPy http://scipy.org

Introduction
Parameter sensitivity and uncertainty quantification (PS/UQ) are global optimization problems requiring efficient exploration of a huge parameter space. The need to stage, monitor, and analyze the output of thousands of simulations against the backdrop of constantly shifting computational resources is a complex problem. Managing such a complex computational environment requires a sophisticated software infrastructure. Pyre [1], the DANSE [2] component framework, has partial support for much of the required underlying infrastructure and basic services for distributed and parallel computing. The distributed service framework is being developed in Pyre.

Component-based solutions, like Pyre, encourage the decomposition of software development into manageable functional units. Each type of component performs a specific job; simplicity of design and encapsulation of responsibility within a single entity provide greater flexibility and robustness in the applications that can be built from a given set of software components. If we define a SERVICE as the simplest fundamental unit capable of doing the “real work” of iterating parameters within a given physical model, then we have encapsulated the responsibility to update the physical models within a single entity.
Thus the core of an optimization framework built on the notion of “fit services” can be abstracted away from the creation, monitoring, and control of these services.

The central element in the design of the distributed service framework is the JOB MANAGER, which is responsible for creating and managing jobs. The Job Manager contains:
• a STAGER and a LAUNCHER, which are responsible for composing and launching new jobs.
• the JOB REGISTRY, which maintains a unique identifier for each launched job.
• a BROADCASTER that sends requests to existing jobs, including execution control directives such as “pause job”, “resume job”, “get current best-fit”, “get current fit environment”, and “kill job”.
• a LISTENER that subscribes to events broadcast from launched jobs (i.e. services).
• the REQUEST HANDLER, which processes requests from the client or from an automated script.

In the above figure, the Job Manager communicates with launched jobs through the Multiplexer, and with the User through the Request Handler.

The Job Manager manages the framework’s distributed services. A Service contains the execution strategies to perform the required optimizations. The distributed framework is configured by connecting services under the control of the Job Manager. While Launchers are responsible for initiating execution, a Strategy contains all of the rules for directing workflow between or within the various optimization services. There are a few basic types of Strategies:
• a SingleIterator Strategy directs the Job Manager to have a single Iterator within a single Job act on a Model.
• ConcurrentIterator and CascadedIterator Strategies direct the Job Manager to execute Jobs in parallel and in series, respectively.
• Adaptive Strategies couple multiple Iterators together, where the CoupledAdaptiveIterator Strategy utilizes two-way communication.
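The difference between the SingleIterator, CascadedIterator, and ConcurrentIterator Strategies can be illustrated with a minimal sketch. This is not the Pyre or PARK API: the `Job` type, the three function names, and the toy jobs `halve` and `shift` are hypothetical stand-ins, and a thread pool is assumed in place of a real Launcher.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence

# Hypothetical stand-in: a "job" maps starting parameters to updated parameters.
Job = Callable[[List[float]], List[float]]


def single_iterator(job: Job, start: List[float]) -> List[float]:
    """SingleIterator: one Iterator within one Job acts on the Model."""
    return job(start)


def cascaded_iterator(jobs: Sequence[Job], start: List[float]) -> List[float]:
    """CascadedIterator: run Jobs in series, feeding each result to the next."""
    params = start
    for job in jobs:
        params = job(params)
    return params


def concurrent_iterator(jobs: Sequence[Job], start: List[float]) -> List[List[float]]:
    """ConcurrentIterator: run Jobs in parallel from the same starting point."""
    with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
        # Each job gets its own copy of the starting parameters.
        futures = [pool.submit(job, list(start)) for job in jobs]
        # Results are collected in submission order, not completion order.
        return [future.result() for future in futures]


# Toy jobs, for illustration only.
def halve(params: List[float]) -> List[float]:
    return [p / 2.0 for p in params]


def shift(params: List[float]) -> List[float]:
    return [p - 1.0 for p in params]


if __name__ == "__main__":
    print(single_iterator(halve, [4.0, 8.0]))               # [2.0, 4.0]
    print(cascaded_iterator([halve, shift], [4.0, 8.0]))    # [1.0, 3.0]
    print(concurrent_iterator([halve, shift], [4.0, 8.0]))  # [[2.0, 4.0], [3.0, 7.0]]
```

In this sketch the workflow logic lives entirely in the strategy functions, so the same job can be reused unchanged under any of the three.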
Strategies can be grouped into three interaction types:
• “Single”, which only passes control to a single Iterator.
• “Coordination”, which involves Iterators that are nested, cascaded, concurrent, or interactive.
• “Parallelism”, which involves Iterators that communicate through asynchronous local, message-passing, or nested (master/slave or peer/static) mechanisms.

Launchers contain the logic required to initiate execution on the selected platform for the selected network configuration. All communication across process boundaries in the framework is sent as events; thus extending the pipeline to distributed computing is a matter of building an appropriate Launcher and Controller.

On a more basic level, a Model is composed of PARAMETERS, RESPONSE, and INTERFACE. A Response is composed of the function, gradient, and Hessian evaluations. Function evaluations are composed of objectives, constraints, least-squares terms, etc., while gradients and Hessians can be either numerical or analytic. The Parameters are collections of the fit variables, organized to best suit the selected Strategy. The simplest Model is a wrapper around a single forward Solver. Models can also allow increasing levels of complexity. For example, one mechanism for coupling two Solvers within a single PARK model is shown on the left.

The framework is configured to process a Service. A Service follows a Deployment Strategy to execute one or more ITERATORs (i.e. inverse solvers). An Iterator acts on a MODEL (i.e. forward solver) to determine the optimal Model parameter values. The Model takes a set of parameters and produces a response, while the Iterator produces the next set of Model parameters based on the response history. A Model contains all of the physics in the optimization problem, and is handled directly in PARK. PARK can provide the mechanisms for wrapping the physical models into a COSTFUNCTION. A Costfunction contains a Cost Metric that the Iterator measures progress against.
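The relationship between a Model, a Costfunction, and an Iterator can be sketched as follows. None of these names come from PARK: `LineModel`, `make_costfunction`, and `CoordinateSearchIterator` are hypothetical, the Cost Metric is assumed to be a sum of squared differences, and a simple derivative-free coordinate search stands in for a real inverse solver.

```python
from typing import Callable, List, Sequence, Tuple


class LineModel:
    """Hypothetical forward solver: response y = a*x + b at fixed x values."""

    def __init__(self, xs: Sequence[float]):
        self.xs = list(xs)

    def response(self, params: Sequence[float]) -> List[float]:
        a, b = params
        return [a * x + b for x in self.xs]


def sum_of_squares(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Assumed Cost Metric: sum of squared differences."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))


def make_costfunction(model, observed, metric=sum_of_squares) -> Callable:
    """Wrap a Model and reference data into a scalar cost of the parameters."""
    def cost(params: Sequence[float]) -> float:
        return metric(model.response(params), observed)
    return cost


class CoordinateSearchIterator:
    """Hypothetical inverse solver that proposes the next parameters
    from the history of evaluated (parameters, cost) pairs."""

    def __init__(self, cost: Callable, start: Sequence[float], step: float = 1.0):
        self.cost = cost
        self.step = step
        self.history: List[Tuple[List[float], float]] = [(list(start), cost(start))]

    def best(self) -> Tuple[List[float], float]:
        return min(self.history, key=lambda entry: entry[1])

    def next_parameters(self) -> List[float]:
        best_params, best_cost = self.best()
        for i in range(len(best_params)):
            for delta in (self.step, -self.step):
                trial = list(best_params)
                trial[i] += delta
                trial_cost = self.cost(trial)
                self.history.append((trial, trial_cost))
                if trial_cost < best_cost:
                    return trial
        self.step /= 2.0  # no neighbor improved: refine the step size
        return best_params

    def solve(self, tol: float = 1e-6, max_iter: int = 10000):
        for _ in range(max_iter):
            self.next_parameters()
            if self.step < tol:
                break
        return self.best()


# Usage: recover a = 2, b = 1 from noise-free data.
model = LineModel([0.0, 1.0, 2.0, 3.0])
observed = model.response([2.0, 1.0])
cost = make_costfunction(model, observed)
best_params, best_cost = CoordinateSearchIterator(cost, [0.0, 0.0]).solve()
print(best_params, best_cost)
```

The Iterator only ever sees the scalar cost, so swapping in a different Model or Cost Metric does not require changes to the solver.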
The use of a Model Factory and a Cost Factory ensures the proper functor relationship is built between a Costfunction and an Iterator. Iterators built from algorithms within PARK [3] (and other optimization toolkits such as SciPy [4]) are augmented with Monitors and an Event Handler that can, respectively, broadcast iteration results and respond to control events sent by other parts of the framework. The mapping of an Iterator to a distributed Service utilizes the same design as the mapping of a physical Model to an Iterator. The deployment Strategy contains the logic required to parse the workload across distributed resources.

One of the simplest Cost Metrics is the difference of squares. The framework also provides tools to construct more complex ways of measuring solver progress.

A Service knows nothing about optimization or physical models, but is a mechanism for coupling solvers together to provide the complexity required for Parameter Sensitivity or Uncertainty Quantification. The Iterator acts on a Costfunction to provide the next set of values for the evaluation parameters. The complexity of the workflow is contained purely within the Strategy. The simplest optimization strategy involves a single Iterator acting on a single Model. However, providing a Coupler within an Iterator or a Model allows complex strategies (e.g. Models with nested Iterators) to be built with the same fundamental unit (i.e. a Service).
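The way Monitors and an Event Handler augment an Iterator can be sketched with a small, single-process example. The `FitService` class, its `inbox` and `outbox` queues, and the reply tuple format are assumptions made for illustration; only the directive strings are taken from the text, and the in-memory queues stand in for whatever event transport the framework actually uses.

```python
import queue
from typing import Callable, List


class FitService:
    """Hypothetical Service: steps an iterator, reports to Monitors,
    and obeys control directives delivered as events."""

    def __init__(self, step: Callable[[float], float], start: float):
        self.step = step                 # one iteration: parameter -> next parameter
        self.best = start
        self.iteration = 0
        self.paused = False
        self.alive = True
        self.monitors: List[Callable[[int, float], None]] = []
        self.inbox: "queue.Queue[str]" = queue.Queue()     # directives from a Broadcaster
        self.outbox: "queue.Queue[tuple]" = queue.Queue()  # replies for a Listener

    def _apply(self, directive: str) -> None:
        if directive == "pause job":
            self.paused = True
        elif directive == "resume job":
            self.paused = False
        elif directive == "kill job":
            self.alive = False
        elif directive == "get current best-fit":
            self.outbox.put(("best-fit", self.best))

    def handle_events(self) -> None:
        """Event Handler: drain pending directives without blocking."""
        while True:
            try:
                self._apply(self.inbox.get_nowait())
            except queue.Empty:
                return

    def run(self, max_iterations: int) -> float:
        while self.alive and self.iteration < max_iterations:
            self.handle_events()
            if not self.alive:
                break
            if self.paused:
                # Block until the next directive arrives, then re-check state.
                self._apply(self.inbox.get())
                continue
            self.best = self.step(self.best)
            self.iteration += 1
            for monitor in self.monitors:  # Monitors broadcast iteration results
                monitor(self.iteration, self.best)
        return self.best


# Usage: a toy iterator that halves its parameter on every step.
service = FitService(step=lambda x: x / 2.0, start=16.0)
log = []
service.monitors.append(lambda i, best: log.append((i, best)))


def scripted_controller(i: int, best: float) -> None:
    """Stands in for a Job Manager reacting to broadcast results."""
    if i == 2:
        service.inbox.put("get current best-fit")
    elif i == 3:
        service.inbox.put("kill job")


service.monitors.append(scripted_controller)
final = service.run(max_iterations=100)
reply = service.outbox.get_nowait()
print(final, reply, log)
```

Because the stepping function is opaque to `FitService`, the same wrapper could surround any iterator, which mirrors the statement that a Service knows nothing about optimization or physical models.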

