Algebraic Solvers in FASTMath
Argonne Training Program on Extreme-Scale Computing, August 2015



Algebraic Solvers in FASTMath (FASTMath SciDAC Institute)
- Hypre: see detailed presentation
- PARPACK
- PETSc: see detailed presentation
- SUNDIALS: see detailed presentation
- SuperLU: see detailed presentation
- Trilinos: ML, NOX

PARPACK
- Capabilities:
  - Compute a few eigenpairs of a Hermitian or non-Hermitian matrix
  - Both standard and generalized eigenvalue problems
  - Extremal and interior eigenvalues
  - Reverse communication allows easy integration with the application
  - MPI/BLACS communication
- Download:
- Further information: beyond PARPACK
  - EIGPEN (based on penalty trace minimization for computing many eigenpairs)
  - Parallel multiple shift-invert interface for computing many eigenpairs

Other large-scale eigensolvers
- PPCG (Projected Preconditioned Conjugate Gradient) method for symmetric eigenvalue problems
  - For computing a relatively large number of smallest eigenpairs
  - Reduce Rayleigh-Ritz cost
- GPLHR (Generalized Preconditioned Local Harmonic Ritz) method for interior eigenvalues of a non-Hermitian sparse matrix
- Special solver for linear response eigenvalue problems (TDDFT linear response and Bethe-Salpeter equation)

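PARPACK usage
- A list of drivers is provided in $ARPACKTOPDIR/PARPACK/EXAMPLES
- Reverse communication interface
- Hybrid MPI/OpenMP implementation
- MATLAB interface available (eigs)

      ! Reverse communication loop: pdsaupd returns whenever it needs y = A*x
 10   continue
      call pdsaupd(comm, ido, ….)
      if (ido .eq. -1 .or. ido .eq. 1) then
         ! user-supplied product: input workd(ipntr(1)), output workd(ipntr(2))
         call matvec(…, workd(ipntr(1)), workd(ipntr(2)) ….)
         goto 10
      endif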
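The reverse communication pattern on this slide can be illustrated without PARPACK itself. The following is a minimal Python sketch, assuming a toy power iteration in place of the implicitly restarted Lanczos method; the class `PowerIterationRC`, its `ido` codes (0, 1, 99), and the helper names are hypothetical and only mimic the shape of the Fortran loop. The point is that the solver never sees the matrix: it hands back a vector and asks the caller for the product.

```python
"""Toy reverse-communication eigen-iteration (power method). NOT PARPACK."""
import math


class PowerIterationRC:
    """Hypothetical solver that requests y = A*x from the caller via `ido`."""

    def __init__(self, n, tol=1e-10, max_iter=1000):
        self.x = [1.0 / math.sqrt(n)] * n  # unit-norm vector the caller multiplies
        self.y = [0.0] * n                 # the caller stores A*x here
        self.tol = tol
        self.max_iter = max_iter
        self.iters = 0
        self.eigenvalue = None
        self.ido = 0                       # 0: first call, 1: need y = A*x, 99: done

    def step(self):
        """Consume the caller's product (if any) and set the next `ido`."""
        if self.ido == 0:                  # first call: nothing to consume yet
            self.ido = 1
            return
        # Rayleigh quotient x^T A x (x has unit norm).
        new_eig = sum(xi * yi for xi, yi in zip(self.x, self.y))
        norm_y = math.sqrt(sum(yi * yi for yi in self.y))
        converged = (self.eigenvalue is not None and
                     abs(new_eig - self.eigenvalue) <= self.tol * abs(new_eig))
        self.eigenvalue = new_eig
        self.iters += 1
        if converged or self.iters >= self.max_iter or norm_y == 0.0:
            self.ido = 99
        else:
            self.x = [yi / norm_y for yi in self.y]
            self.ido = 1


def matvec(A, x):
    """User-owned operator: dense row-times-vector product."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def largest_eigenvalue(A):
    """Caller-side loop, same shape as the Fortran `10 continue ... goto 10`."""
    solver = PowerIterationRC(len(A))
    while True:
        solver.step()
        if solver.ido != 1:
            break
        solver.y = matvec(A, solver.x)
    return solver.eigenvalue, solver.x


if __name__ == "__main__":
    A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
    print(largest_eigenvalue(A)[0])
```

Because the product lives in the caller, the same loop works for a matrix-free operator or a distributed matrix; only `matvec` changes.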

ML and MueLu: Multigrid libraries in Trilinos
- ML: aggregation-based algebraic multigrid algorithms
  - Support for scalar problems (diffusion, convection-diffusion), PDE systems (elasticity), electromagnetic problems (eddy current)
  - Various coarsening and data rebalancing options
  - Smoothers (SOR, polynomial, ILU, block variants, line, user-provided)
  - Written in C
- MueLu: templated multigrid framework
  - Support for energy minimizing multigrid algorithms in addition to many algorithms from ML
  - Leverages Trilinos templated sparse linear algebra stack
    - Optimized kernels for multiple architectures (GPU, OpenMP, Xeon Phi)
    - Templated scalar type allowing mixed precision, UQ, …
  - Advanced data reuse possibilities, extensible by design
  - Written in C++
- Download/further information:
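As a rough illustration of what "aggregation-based" means, the sketch below groups fine unknowns into aggregates and forms the Galerkin coarse operator A_c = P^T A P for a piecewise-constant prolongator P. It is a toy in plain Python with hypothetical function names, not ML or MueLu code, and it omits the prolongator smoothing used in smoothed aggregation as well as the smoothers and the multilevel cycle.

```python
def laplacian_1d(n):
    """Dense 1D Poisson matrix tridiag(-1, 2, -1), used only as test data."""
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = 2.0
        if i > 0:
            A[i][i - 1] = -1.0
        if i < n - 1:
            A[i][i + 1] = -1.0
    return A


def aggregate_1d(n, size):
    """Map each unknown to an aggregate of `size` consecutive unknowns."""
    return [i // size for i in range(n)]


def galerkin_coarse(A, agg):
    """Coarse operator P^T A P with P[i][agg[i]] = 1 and zeros elsewhere.

    For this P the triple product reduces to summing the fine entries
    A[i][j] into the coarse entry (agg[i], agg[j]), so P is never formed.
    """
    nc = max(agg) + 1
    Ac = [[0.0] * nc for _ in range(nc)]
    for i, row in enumerate(A):
        for j, a in enumerate(row):
            if a != 0.0:
                Ac[agg[i]][agg[j]] += a
    return Ac


if __name__ == "__main__":
    for row in galerkin_coarse(laplacian_1d(6), aggregate_1d(6, 2)):
        print(row)
```

For the 6-point 1D Laplacian with aggregates of two unknowns, the coarse operator is again tridiag(-1, 2, -1) on three unknowns, which is why the construction can be applied recursively.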

ML and MueLu: Application highlights
- Magnetohydrodynamics (Drekar): ML scales to 512K cores on BG/Q and to 128K cores on Titan
- Fluid dynamics (Nalu): MueLu scales to 524K cores of BG/Q

MueLu: Research framework
- Component reuse in multigrid can be effective in reducing setup costs while maintaining solver convergence. We have demonstrated that reuse can yield 2.5x speedup on 25K cores of Cray XE6.
- Block systems arise naturally in mixed discretizations. Our new multigrid algorithm preserves such block structure on coarse levels for Stokes and Navier-Stokes systems.
- MueLu/ML provide a specialized AMG for the PISCEES project through semi-coarsening and line smoothers that exploit partial structure in meshes arising in ice sheet modeling.

Figure captions:
- Fig 3: Automatically coarsened 17x17 mesh
- Automatically generated coarse mesh for Q2-Q1 discretization of a Stokes system.
- Semicoarsening followed by regular 2D coarsening for Greenland model.

Trilinos/NOX Nonlinear Solver
- Capabilities:
  - Newton-Based Nonlinear Solver
    - Linked to Trilinos linear solvers for scalability
    - Matrix-Free option
  - Anderson Acceleration for Fixed-Point iterations
  - Globalizations for improved robustness
    - Line Searches, Trust Region, Homotopy methods
  - Customizable: C++ abstractions at every level
  - Extended by LOCA package
    - Parameter continuation, Stability analysis, Bifurcation tracking
- Download: Part of Trilinos (trilinos.sandia.gov)
- Further information: Andy Salinger
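To make "globalization" concrete, here is a minimal Python sketch of Newton's method with a backtracking line search on a scalar problem. It does not use the NOX API; the function name, the sufficient-decrease constant, and the test function f(x) = arctan(x) are assumptions chosen for illustration. From x0 = 3 the full Newton step increases |f|, so the search shortens the step until the residual decreases.

```python
import math


def newton_line_search(f, dfdx, x0, tol=1e-12, max_iter=50):
    """Newton iteration with backtracking on |f| (scalar toy problem).

    Returns (x, iterations). The full step is halved until
    |f(x + lam*step)| <= (1 - 1e-4*lam) * |f(x)| or lam becomes tiny.
    """
    x = x0
    for it in range(max_iter):
        fx = f(x)
        if abs(fx) <= tol:
            return x, it
        step = -fx / dfdx(x)          # full Newton step
        lam = 1.0
        while (abs(f(x + lam * step)) > (1.0 - 1e-4 * lam) * abs(fx)
               and lam > 1e-10):
            lam *= 0.5                # backtrack
        x += lam * step
    return x, max_iter


def f(x):
    return math.atan(x)


def dfdx(x):
    return 1.0 / (1.0 + x * x)


if __name__ == "__main__":
    root, iters = newton_line_search(f, dfdx, 3.0)
    print(root, iters)
```

Once the iterate is close to the root the full step is accepted without backtracking, so the fast local convergence of Newton's method is retained.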

Trilinos/NOX: Robustness for Ice Sheet Simulation
PISCEES SciDAC Application project (BER-ASCR)
- Ice sheets are modeled by the nonlinear Stokes equations
  - Initial solve is fragile: full Newton fails
  - Homotopy continuation on the regularization parameter saves the day

Figure: Greenland Ice Sheet Surface Velocities (constant friction model), shown for a sequence of regularization parameter values ending at 10^-10.
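The control flow of homotopy continuation on a regularization parameter can be sketched as a sequence of Newton solves, each warm-started from the previous solution. The Python toy below is not the ice sheet model and does not use NOX or LOCA; the scalar residual u (u^2 + gamma)^(-1/3) - b, the symbol gamma, and the schedule of gamma values are assumptions, loosely imitating a regularized power-law viscosity. This toy is far more benign than the real solve, so it only shows the mechanism.

```python
def residual(u, gamma, b):
    """Toy regularized power-law residual: u * (u^2 + gamma)^(-1/3) - b."""
    return u * (u * u + gamma) ** (-1.0 / 3.0) - b


def jacobian(u, gamma):
    """Derivative of the residual with respect to u (positive for gamma > 0)."""
    s = u * u + gamma
    return s ** (-4.0 / 3.0) * (u * u / 3.0 + gamma)


def newton(u, gamma, b, tol=1e-12, max_iter=50):
    """Plain Newton iteration for a fixed regularization gamma."""
    for _ in range(max_iter):
        r = residual(u, gamma, b)
        if abs(r) <= tol:
            return u
        u -= r / jacobian(u, gamma)
    raise RuntimeError("Newton did not converge for gamma = %g" % gamma)


def continuation_solve(b, gammas, u0=0.0):
    """Solve for each gamma in turn, reusing the last solution as the guess."""
    u = u0
    history = []
    for gamma in gammas:
        u = newton(u, gamma, b)       # warm start from the previous gamma
        history.append((gamma, u))
    return u, history


if __name__ == "__main__":
    schedule = [1.0, 1e-2, 1e-4, 1e-6, 1e-8, 1e-10]
    u, history = continuation_solve(1.0, schedule)
    for gamma, value in history:
        print(gamma, value)
```

With a large gamma the problem is nearly linear and easy to solve; each later solve starts close to its own solution, which is the property the continuation relies on.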

NOX and ML are part of the larger Trilinos solver stack: linear solvers, equation solvers, analysis tools (FASTMath SciDAC Institute)

Diagram (Trilinos solver stack), box labels:
- Analysis Tools (black-box): UQ (sampling), Parameter Studies, Optimization
- Analysis Tools (embedded): Optimization, Continuation, Sensitivity Analysis, Stability Analysis
- Equation Solvers: UQ Solver, Nonlinear Solver (NOX), Time Integration
- Linear Solvers: Direct Solvers, Linear Algebra, Algebraic Preconditioners, Iterative Solvers, Multilevel Preconditioners (ML)
- Interfaces: Linear Solver Interface, Nonlinear Model Interface, Solved Problem Interface
- Your Model Here