Loads Balanced with CQoS Nicole Lemaster, Damian Rouson, Jaideep Ray Sandia National Laboratories Sponsor: DOE CCA Meeting – January 22, 2009.

Presentation transcript:


Computational Quality of Service
Definition: The ability to change a simulation code (a collection of CCA components) on-the-fly in order to maintain optimality
  – Optimality: Determined by a user-defined cost function of simulation behavior and/or solution properties
Choose components to make the resultant code
  – Fast
  – Robust
  – Accurate
Component behavior/performance depends on the problem at hand (i.e., the input)
  – Requirement: quantification of problem “difficulty”
  – Requirement: metrics for component performance and input difficulty
Competing constraints
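As a minimal sketch of the definition above, assume the user-defined cost function is a weighted sum of runtime, failure rate, and error. The `Measurement` fields, the weights, and the component names are hypothetical and only illustrate how competing constraints could be folded into one number; they are not part of CQoS itself.

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    """Hypothetical measurements of one component on one problem input."""
    name: str
    runtime_s: float     # speed: wall-clock seconds
    failure_rate: float  # robustness: fraction of failed runs, 0..1
    error: float         # accuracy: a user-chosen error norm


def cost(m, w_time=1.0, w_fail=100.0, w_err=10.0):
    """User-defined cost, lower is better. The weights (assumed values)
    encode the trade-off between fast, robust, and accurate."""
    return w_time * m.runtime_s + w_fail * m.failure_rate + w_err * m.error


def choose_component(measurements, **weights):
    """Return the name of the component that minimizes the cost function."""
    return min(measurements, key=lambda m: cost(m, **weights)).name


candidates = [
    Measurement("solver_a", runtime_s=2.0, failure_rate=0.10, error=0.01),
    Measurement("solver_b", runtime_s=5.0, failure_rate=0.00, error=0.01),
]
```

With the default weights the slower but more robust `solver_b` wins; setting `w_fail=0.0` makes the faster `solver_a` the choice, which is the sense in which the constraints compete.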

Theoretical Needs
What metrics define performance?
  – FLOPS, iterations to convergence, % load imbalance
What metrics define robustness?
  – Convergence failure, bad load-imbalance, etc.
What metrics define accuracy?
  – Global, local, statistical, deterministic, etc.
All metrics are functions of the component’s input
We need a model that, given component inputs and machine characteristics, predicts the component’s performance
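To make one of these metrics concrete, the sketch below computes a percent load imbalance from per-process loads. It assumes one common definition (how far the busiest process exceeds the mean load); the slides do not say which definition is used, and the function name is illustrative.

```python
def percent_load_imbalance(loads):
    """Percent by which the most loaded process exceeds the mean load.

    Assumed definition: (max / mean - 1) * 100. A perfectly balanced
    run gives 0.0.
    """
    if not loads:
        raise ValueError("loads must be non-empty")
    mean = sum(loads) / len(loads)
    if mean == 0:
        return 0.0  # no work anywhere, so nothing is imbalanced
    return (max(loads) / mean - 1.0) * 100.0
```

For example, four processes with loads 10, 10, 10, and 30 have a mean of 15, so the busiest process carries twice the mean and the metric reports 100 percent.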

Practical Needs
To make performance models, we need
  – A collection of components to choose from
  – A test harness for components
  – A performance measurement tool: TAU
  – A database for storing empirical performance data
  – Statistical tools for model making
To use performance models and make adaptive codes, we need a control system that contains
  – An optimization system to choose the “best” component
  – A feedback system that can take corrective action if a bad component is chosen by the optimization system
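The two halves of this list can be sketched together. The example below assumes the empirical database is a plain dictionary of (problem size, runtime) pairs and that a straight-line fit is an adequate performance model; both are simplifications chosen for illustration, and every name here is hypothetical rather than a TAU or CCA interface.

```python
def fit_linear(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def build_models(database):
    """database maps component name -> list of (problem_size, runtime)."""
    return {name: fit_linear(*zip(*rows)) for name, rows in database.items()}


def predict(model, size):
    """Predicted runtime of a fitted model at the given problem size."""
    a, b = model
    return a + b * size


def choose(models, size, excluded=()):
    """Optimization step: lowest predicted runtime among usable components."""
    usable = {n: m for n, m in models.items() if n not in excluded}
    return min(usable, key=lambda n: predict(usable[n], size))


def control_step(models, size, measure, tolerance=1.5):
    """Feedback step: if the chosen component runs much slower than
    predicted, exclude it and fall back to the next-best prediction."""
    excluded = []
    while len(excluded) < len(models):
        name = choose(models, size, excluded)
        if measure(name) <= tolerance * predict(models[name], size):
            return name
        excluded.append(name)
    return excluded[0]  # nothing met its prediction; keep the first choice
```

Here `measure` stands in for the performance measurement tool: it is any callable that returns the observed runtime of a named component, and `tolerance` is an assumed threshold for deciding that a choice was bad.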

Our Interests
Problem: Create a control system that can choose the best load-balancer for a simulation
  – Load per grid cell varies in both space and time over a Cartesian mesh
What numerical techniques lead to imbalance?
  – Adaptive Cartesian meshes
  – Some operator-split constructions
What applications show such behavior?
  – Hydrocarbon combustion
  – Astrophysics: Type II supernovae
Net result: Simulation becomes load-imbalanced

Considerations
Solution: Repartition frequently using a fast, dynamic load-balancer
  – Speed is achieved mainly by sacrificing partition quality
  – Some favor load balance; others minimize communication time or data migration during repartitioning
Physics and numerics determine whether the simulation is computation- or communication-dominated, so
  – The same load-balancer may not work throughout the run
  – We need to choose load-balancers anew every time we repartition

Control System Configuration
What would a control infrastructure for an analytical control law look like?
[Diagram. Labels in the first configuration: Driver, Partitioner, Mesh, Partitions. Labels in the second configuration: Driver, Meta-Partitioner (if-then-else), Mesh Characterizer, Control law, Partitioner-A, Partitioner-B, Partitioner-C, Mesh, Partitions.]

Load-Balancer Selection
Model the simulation to formulate metrics that depend on the current state
  – e.g., communication/computation cost
Characterize the dynamic load-balancers with simplified metrics
  – e.g., communication time, data migration effort, grid shape, runtime
Develop rules to pair simulation state with appropriate partitioner
  – Implement a “meta-partitioner” to select a load-balancing partitioner using the rules
Essentially, the code adapts to the problem!
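A rule-based meta-partitioner of this kind can be sketched as a small if-then-else function. The state metrics, the thresholds, and the partitioner names below are all hypothetical placeholders; the actual rules would come from characterizing real load-balancers.

```python
def select_partitioner(comm_to_comp_ratio, migration_cost):
    """Pair the current simulation state with a partitioner.

    comm_to_comp_ratio: communication cost divided by computation cost.
    migration_cost: normalized cost of moving data during repartitioning.
    Thresholds (1.0 and 0.5) are assumed values for illustration only.
    """
    if comm_to_comp_ratio > 1.0:
        # Communication-dominated: favor partitions with short boundaries.
        return "min_communication"
    if migration_cost > 0.5:
        # Moving data is expensive: favor partitions close to the old ones.
        return "min_migration"
    # Computation-dominated with cheap migration: favor even load.
    return "best_balance"
```

Because the function is evaluated at every repartitioning step, the same run can switch partitioners as the simulation state changes.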

Control Systems Research
Mostly done by J. Steensland & H. Johansson
  – Johansson H.; Design and Implementation of a Dynamic and Adaptive Meta-Partitioner for Parallel SAMR Grid Hierarchies
Have a set of parameterized load-balancers
Modeled the relationship between mesh characteristics and the load-balancer inputs that lead to optimal partitions
Performed tests to determine whether, given a mesh, the model can predict the best load-balancer
  – It cannot predict it reliably, but
  – It provides a set of (~10) candidates that contains the best one
  – Brute-force solution: test all candidates and select the best (takes ~10 seconds)
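The brute-force step can be sketched as a loop that runs every candidate and keeps the one with the best partition quality. The two toy partitioners and the quality measure (largest per-part load on a 1D list of cell loads) are assumptions made for this example, not the parameterized load-balancers from the cited work.

```python
def pick_best(candidates, mesh, quality):
    """Run every candidate partitioner on the mesh; return (name, cost)
    of the one whose partition has the lowest quality cost."""
    best_name, best_cost = None, float("inf")
    for name, partition in candidates.items():
        c = quality(partition(mesh))
        if c < best_cost:
            best_name, best_cost = name, c
    return best_name, best_cost


def split_by_count(loads, parts=2):
    """Toy partitioner: equal number of cells per part."""
    size = -(-len(loads) // parts)  # ceiling division
    return [loads[i:i + size] for i in range(0, len(loads), size)]


def split_by_load(loads, parts=2):
    """Toy partitioner: cut once a part reaches its share of total load."""
    target = sum(loads) / parts
    out, current = [], []
    for w in loads:
        current.append(w)
        if sum(current) >= target and len(out) < parts - 1:
            out.append(current)
            current = []
    out.append(current)
    return out


def max_part_load(partition):
    """Quality cost: load of the busiest part (lower is better)."""
    return max(sum(p) for p in partition)
```

The cost of this approach grows with the number of candidates, which is why narrowing the field to about ten before testing matters.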

Required Components
Essentially, things in the CCA toolkit
  – Simulation components: a mesh, some integrators, some linear solvers, some physics components, etc.
  – A variety of load-balancers
And a control system to choose load-balancers

Mesh Component Status
2D mesh already exists
Part of the tutorial and toolkit
Parallel capabilities
Works in Bocca
Used in reaction-diffusion problems, with multiple integration techniques
Can accommodate slab-wise and block-wise decomposition
No connection to load-balancers yet
  – Does its own simple domain decomposition
Great for tutorials, but too simple for CQoS
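For readers unfamiliar with the two decompositions named above, the sketch below splits an nx-by-ny Cartesian mesh slab-wise (cuts along one axis) and block-wise (cuts along both). It is an illustration in plain Python with hypothetical function names, not the mesh component's own code.

```python
def split_range(n, parts):
    """Split range(n) into contiguous (start, stop) intervals whose
    sizes differ by at most one cell."""
    base, extra = divmod(n, parts)
    bounds, start = [], 0
    for p in range(parts):
        stop = start + base + (1 if p < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def slab_decomposition(nx, ny, nprocs):
    """Slab-wise: cut along x only; each process gets a full-height strip."""
    return [((x0, x1), (0, ny)) for x0, x1 in split_range(nx, nprocs)]


def block_decomposition(nx, ny, px, py):
    """Block-wise: cut along both axes into a px-by-py process grid."""
    return [(xr, yr)
            for xr in split_range(nx, px)
            for yr in split_range(ny, py)]
```

Both schemes balance cell counts only. When the load per cell varies in space and time, equal cell counts no longer mean equal work, which is the motivation for connecting the mesh to real load-balancers.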

Over the next 6 months...
Extend mesh to 3D to tackle harder problems
Lemaster, Stone, & Gardiner (2007)


Over the next 6 months...
Extend mesh to 3D to tackle harder problems
Extend it to incorporate domain decomposition beyond slab- and block-wise
  – Sub-domains consisting of a disjoint set of abutting rectangles
Design ports to load-balancers
Identify more interesting applications for use in CQoS testing
  – Construct any extra components needed
  – Solve the problem; quantify the degree of difficulty

Results to come! Contact info: Nicole Lemaster Sandia National Labs