Consequences for scalability arising from multi-material modeling. Allen C. Robinson, Jay Mosso, Chris Siefert, Jonathan Hu (Sandia National Laboratories); Tom Gardiner (Cray Inc.); Joe Crepeau (Applied Research Associates, Inc.). Numerical Methods for Multi-material Fluid Flows, Czech Technical University, Prague, Czech Republic, September 10–14, 2007. Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy's National Nuclear Security Administration under Contract DE-AC04-94AL85000.

Slide 2: The ALEGRA-HEDP mission: a predictive design and analysis capability to effectively use the Z machine. [Diagram: ALEGRA-HEDP at the center, linked to Z-pinches as x-ray sources, magnetic flyers for EOS, algorithms, meshing, platforms, Computer and Information Sciences, Pulsed Power Sciences, analysis, material models, IMC (joint LLNL/SNL development), and optimization/UQ.]

Slide 3: Multi-material and multi-physics modeling in arbitrary-mesh ALE codes. Complicated geometries and many materials are a fact of life for realistic simulations, and future machines may be less tolerant of load imbalances. Multi-material issues play a key role in algorithmic performance, for example:
– Interface reconstruction
– Implicit solver performance
– Material models
What processes are required to confront and solve performance and load-balancing issues in a timely manner? [Figure: R-T unstable Z-pinch; density perturbation from a diagnostic slot, shown in (r, z, θ).]

Slide 4: What do current and future machines look like? Representative largest platforms:
– Purple: 1,336 compute nodes * 8 sockets/node = 10,688 CPUs (cores); IBM Power5 at 1.9 GHz; theoretical system peak performance = … TFlop/s.
– Red Storm: … sockets * 2 cores/socket; AMD Opteron at 2.4 GHz; theoretical system peak performance = 124 TFlop/s.

Slide 5: What do future machines look like? Representative largest platform in 5 years (likely):
– 10+ petaflops
– 40,000 sockets * 25 cores/socket = 1 million cores
– 0.5 GByte/core
Representative largest platforms in 10 years (crystal ball):
– Exaflops
– 100 million cores
Sounds great, but:
– Memory bandwidth is clearly at serious risk.
– Can latency and cross-sectional bandwidth keep up?
Minor software, algorithmic, or process flaws today may be near-fatal weaknesses tomorrow, from both a scalability and a robustness point of view.

Slide 6: ALEGRA scalability testing process:
– Define sequences of gradually more complicated problems in a software environment (Python/XML) that easily generates large-scale scalability tests; a minimal generator sketch follows below.
– Budget and assign personnel and computer time to exercise these tests on a regular basis.
– Take action as required to minimize the impact of problematic results observed on large-scale systems.
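The slides do not show the actual ALEGRA test harness; the following is a minimal sketch, assuming a hypothetical XML deck template, hypothetical placeholder names, and an eightfold core-growth rule, of how a small Python generator can emit a weak-scaling sequence of gradually larger tests.

```python
# Hypothetical sketch (not the ALEGRA harness): emit a weak-scaling series of
# XML input decks by substituting mesh resolution and core count into a
# template.  The template, placeholder names, and growth rule are assumptions.
from string import Template

DECK_TEMPLATE = Template(
    "<problem name='InterfaceTrack'>"
    "<mesh nx='$nx' ny='$ny' nz='$nz'/>"
    "<parallel cores='$cores'/>"
    "</problem>"
)

def weak_scaling_decks(base_cells=64, base_cores=8, steps=6):
    """Yield (cores, deck) pairs that keep cells-per-core roughly constant."""
    cells, cores = base_cells, base_cores
    for _ in range(steps):
        yield cores, DECK_TEMPLATE.substitute(nx=cells, ny=cells, nz=cells,
                                              cores=cores)
        cells *= 2      # double resolution in each direction ...
        cores *= 8      # ... and grow the core count by 8x to compensate

if __name__ == "__main__":
    for cores, deck in weak_scaling_decks():
        print(f"{cores:>8} cores: {deck[:48]}...")
```

Generating the whole sequence from one template is what makes it cheap to rerun the same tests regularly as machines and code versions change.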

Slide 7: Available interface reconstruction options in ALEGRA:
– SLIC: single line interface reconstruction.
– SMYRA: Sandia modified Youngs' reconstruction; works with a unit-cube description.
– New SMYRA: an alternate version of the SMYRA algorithm.
– PIR: Patterned Interface Reconstruction.
  – Works with the physical element description (not unit cubes).
  – Additional smoothing steps yield second-order accuracy.
  – Strict ordering and polygonal removal by material guarantee self-consistent geometry.
  – More expensive.
Interface reconstruction is not needed for single-material problems.

Slide 8: Problem descriptions:
– AdvectBlock: single-material simple advection.
– InterfaceTrack: a 6-material advection problem in a periodic box with spheres and hemispheres.

Slide 9: Large-scale testing smokes out errors (1/12/2007). [Plot of parallel communication overhead; SN = 1 core/node, VN = 2 cores/node.] A nose dive showed up at 6000 cores and was traced to a misplaced all-to-one communication; the performance impact existed at small scale but was difficult to diagnose there. Roughly 13% of performance is lost to multi-core contention. Before this fix was found, Purple showed similar behavior, then suddenly dropped to about 5% at this point.

Slide 10: InterfaceTrack (6/22/2007). [Scaling plot; annotation: percentage loss due to interface tracking.] The periodic boundary condition is always "parallel," but no real communication occurs on a single processor. Mileage varies, presumably due to improved locality on the machine, and the curve flattens out as the worst-case communication pattern is reached.

Slide 11: Next-generation Pattern Interface Reconstruction (PIR) algorithm. Basic PIR is an extension of Youngs' 3D algorithm (D.L. Youngs, "An Interface Tracking Method for a Three-Dimensional Hydrodynamics Code", Technical Report 44/92/35, AWRE, 1984):
– Approximate the interface normal by -grad(Vf).
– Position a planar interface (polygon) in the element to conserve volume exactly for arbitrarily shaped elements.
– Not spatially second-order accurate.
Smoothed PIR:
– The planar algorithm generates a trial normal.
– The spherical algorithm generates an alternative trial normal.
– A "roughness" measure determines which trial normal agrees best with the local neighborhood.
PIR utility: more accurately move materials through the computational mesh; visualization. A minimal sketch of the basic Youngs step follows below.
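As a concrete illustration of the two Youngs steps just listed, here is a minimal Python/NumPy sketch for a single unit-cube cell: the caller supplies the normal (in practice -grad(Vf)), and the plane offset is found by bisection so that the clipped volume matches the cell's volume fraction. The Monte Carlo volume estimate is an assumption made for brevity; a production code such as ALEGRA clips the element polyhedron exactly.

```python
# Illustrative sketch of first-order Youngs interface positioning in a unit
# cube.  The sampled volume estimate stands in for exact polyhedron clipping.
import numpy as np

def clipped_fraction(normal, d, samples):
    """Fraction of the cell on the material side of the plane n.x <= d."""
    return float(np.mean(samples @ normal <= d))

def position_plane(normal, target_vf, n_samples=100_000, seed=0):
    """Return the offset d so the volume below n.x = d matches target_vf."""
    rng = np.random.default_rng(seed)
    samples = rng.random((n_samples, 3))          # points in the unit cube
    corners = np.array([[i, j, k] for i in (0, 1) for j in (0, 1) for k in (0, 1)])
    lo, hi = (corners @ normal).min(), (corners @ normal).max()
    for _ in range(50):                           # bisection on the plane offset
        mid = 0.5 * (lo + hi)
        if clipped_fraction(normal, mid, samples) < target_vf:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    n = np.array([1.0, 1.0, 0.0])                 # stands in for -grad(Vf)
    n /= np.linalg.norm(n)
    print(position_plane(n, target_vf=0.3))
```

Volume conservation is exact up to the bisection tolerance and the accuracy of the volume estimate; the normal itself is only first-order accurate, which is what the smoothing steps on the following slides address.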

Slide 12: PIR smoothing algorithms. Smoothing uses Swartz stability points (S.J. Mosso, B.K. Swartz, D.B. Kothe, R.C. Ferrell, "A Parallel Volume Tracking Algorithm for Unstructured Meshes", Parallel CFD Conference Proceedings, Capri, Italy): the centroid of each interface is a 'stable' position. Algorithm:
– Compute the centroid of each interface.
– Fit surface(s) to the neighboring centroids.
– Compute the normal(s) of the fit(s).
– Choose the best normal.
– Re-adjust interface positions to conserve volume.
– Iterate to convergence.

Slide 13: Planar normal algorithm: a least-squares fit of a plane to the centroids in the immediate 3D neighborhood. Of the three eigenvectors of the fit, two lie approximately in the plane and one, corresponding to the minimal eigenvalue, points out of the plane and supplies the planar trial normal. A sketch follows below.
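A minimal sketch of this least-squares plane fit, assuming the neighborhood centroids arrive as an (n, 3) array: the eigenvector of the centroid covariance matrix with the smallest eigenvalue is the out-of-plane direction and serves as the planar trial normal.

```python
# Planar trial normal: least-squares plane through neighboring interface
# centroids via an eigendecomposition of their covariance matrix.
import numpy as np

def planar_normal(centroids):
    """centroids: (n, 3) array of interface centroids in the neighborhood."""
    pts = np.asarray(centroids, dtype=float)
    dev = pts - pts.mean(axis=0)
    evals, evecs = np.linalg.eigh(dev.T @ dev)   # eigenvalues in ascending order
    return evecs[:, 0]                           # minimal-eigenvalue eigenvector

if __name__ == "__main__":
    # Centroids lying (nearly) in the plane z = 0.1 * x.
    rng = np.random.default_rng(1)
    xy = rng.random((20, 2))
    z = 0.1 * xy[:, 0] + 1e-3 * rng.standard_normal(20)
    print(planar_normal(np.column_stack([xy, z])))   # ~ (-0.0995, 0, 0.995), up to sign
```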

Slide 14: Spherical normal algorithm. A planar fit is non-optimal for curved interfaces. Instead, construct a plane at the midpoint of the chord joining the home centroid S_0 to each neighboring centroid S_i, then compute the point V closest to all mid-chord planes; V acts as the local center of curvature, and the spherical trial normal points from V toward S_0. A sketch follows below.
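A sketch of the spherical trial normal under the same assumed input layout: the point V closest, in the least-squares sense, to every mid-chord plane is the center of the sphere that best fits the neighborhood, and the trial normal at S_0 points from V toward S_0.

```python
# Spherical trial normal: V is the least-squares intersection of the planes
# through the midpoints of the chords S0-Si, each with normal along its chord.
import numpy as np

def spherical_normal(s0, neighbors):
    """s0: (3,) home centroid; neighbors: (n, 3) neighboring centroids."""
    s0 = np.asarray(s0, dtype=float)
    nb = np.asarray(neighbors, dtype=float)
    chords = nb - s0
    n_hat = chords / np.linalg.norm(chords, axis=1, keepdims=True)
    mids = 0.5 * (nb + s0)
    # Minimize sum_i ((V - m_i) . n_i)^2  ->  (sum_i n_i n_i^T) V = sum_i n_i (n_i . m_i)
    A = n_hat.T @ n_hat
    b = n_hat.T @ np.einsum("ij,ij->i", n_hat, mids)
    center = np.linalg.solve(A, b)
    normal = s0 - center
    return normal / np.linalg.norm(normal), center

if __name__ == "__main__":
    # Centroids sampled from a sphere of radius 2 centered at the origin.
    rng = np.random.default_rng(2)
    pts = rng.standard_normal((12, 3))
    pts *= 2.0 / np.linalg.norm(pts, axis=1, keepdims=True)
    n, c = spherical_normal(pts[0], pts[1:])
    print(np.round(c, 3), np.round(n, 3))   # center ~ origin, normal ~ pts[0] / 2
```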

Slide 15: Roughness measure. [Figure panels: displacement roughness; orientation roughness.] Roughness is the sum of a displacement volume and a relative orientation volume.

Slide 16: Selection of the 'best' normal. There are three candidate normals: gradient, planar, and spherical. For each candidate, extrapolate the interface shape and compute the spatial agreement with the neighboring centroids s_i and the agreement of the normals; this combination is the 'roughness'. The method with the lowest roughness is selected. A simplified stand-in for this selection is sketched below.
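The exact roughness measure is not spelled out on the slides, so the following is only a simplified stand-in, assuming a distance-from-plane term for spatial agreement and a dot-product term for normal agreement; it illustrates the pick-the-lowest-score structure, not ALEGRA's displacement and orientation volumes.

```python
# Simplified stand-in for roughness-based normal selection.  The two terms and
# their equal weighting are illustrative assumptions, not the ALEGRA measure.
import numpy as np

def roughness(candidate, s0, nbr_centroids, nbr_normals, w_orient=1.0):
    d = (np.asarray(nbr_centroids) - s0) @ candidate       # signed distances to plane
    displacement = float(np.sum(d * d))                    # spatial disagreement
    orientation = float(np.sum(1.0 - np.asarray(nbr_normals) @ candidate))
    return displacement + w_orient * orientation

def select_normal(candidates, s0, nbr_centroids, nbr_normals):
    scores = [roughness(c, s0, nbr_centroids, nbr_normals) for c in candidates]
    return candidates[int(np.argmin(scores))]

if __name__ == "__main__":
    s0 = np.zeros(3)
    nbrs = np.array([[1.0, 0, 0], [-1.0, 0, 0], [0, 1.0, 0], [0, -1.0, 0]])
    nbr_normals = np.tile([0.0, 0.0, 1.0], (4, 1))
    cands = [np.array([0.0, 0.0, 1.0]), np.array([1.0, 0.0, 0.0])]
    print(select_normal(cands, s0, nbrs, nbr_normals))      # picks the z-normal
```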

April 18, 2015 Sept 2007 Prague 17 InterfaceTrack Test Problem (modified – not periodic)

Slide 18: PIR smoothing algorithm illustration. [Figure panels: unsmoothed vs. smoothed reconstruction.]

Slide 19: PIR status. Smoothed PIR is nearing completion in both 2D and 3D as a fully functional feature in ALEGRA. The method significantly reduces the numerical distortion of the shape of a body as it moves through the mesh. The increased fidelity comes at a cost: roughly 50% more floating-point operations, but about 10x the run time. Why? Non-optimized code. Using tools such as valgrind with cachegrind, we expect rapid improvements; for example, a one-line modification to STL vector usage has already yielded a 32% improvement in this algorithm. [Figure: comparison of non-smoothed PIR with the other reconstruction options.]

Slide 20: Eddy current equations: a model for many E&M phenomena; the Sandia interest is the Z-pinch. The 3D magnetic diffusion step sits inside a Lagrangian operator split. Challenge: the large null space of the curl operator. Solution: a compatible (edge) discretization based on the de Rham complex
H^1(Ω) --grad--> H(curl; Ω) --curl--> H(div; Ω) --div--> L^2(Ω),
with degrees of freedom on nodes, edges (N(curl)), faces (N(div)), and elements, respectively. A hedged sketch of the eddy current system follows below.
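The equations themselves were images on the original slide; the following is only a hedged sketch of the standard quasi-static (eddy current) system that such a magnetic diffusion step solves, with the coefficients μ and σ assumed rather than taken from the slide.

```latex
% Hedged sketch of the standard eddy current model (assumed form):
\begin{aligned}
  \partial_t \mathbf{B} &= -\nabla \times \mathbf{E}, &
  \nabla \times \bigl(\mu^{-1}\mathbf{B}\bigr) &= \sigma\,\mathbf{E},\\
  \Rightarrow\quad
  \sigma\,\partial_t \mathbf{E} &+ \nabla \times \bigl(\mu^{-1}\,\nabla \times \mathbf{E}\bigr) = 0.
\end{aligned}
```

An implicit time step then yields a curl-curl-plus-mass system for the edge unknowns; gradients of nodal functions span the large null space of the curl term, which is exactly what the compatible edge discretization and the H(curl) solvers on the following slides are built to handle.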

Slide 21: Algebraic multigrid solvers. Setup: coarsen, project, recurse. Each grid resolves the modes that look "smooth" on that grid. P is the prolongator and P^T the restriction. [Figure: hierarchy of grids connected by P and P^T.] A minimal two-grid sketch follows below.
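To make the P / P^T picture concrete, here is a minimal two-grid sketch in Python/NumPy. The 1D Laplacian test matrix, the piecewise-constant aggregation prolongator, and the damped-Jacobi smoother are illustrative assumptions, not the ML/ALEGRA configuration.

```python
# Minimal two-grid cycle: smooth, restrict the residual with P^T, solve the
# projected coarse problem, prolongate the correction with P, smooth again.
import numpy as np

def jacobi(A, b, x, sweeps=2, omega=2.0 / 3.0):
    """Damped Jacobi smoother: knocks down high-frequency error."""
    d = np.diag(A)
    for _ in range(sweeps):
        x = x + omega * (b - A @ x) / d
    return x

def two_grid(A, b, P, cycles=20):
    x = np.zeros_like(b)
    Ac = P.T @ A @ P                        # Galerkin coarse operator
    for _ in range(cycles):
        x = jacobi(A, b, x)                 # pre-smooth
        r = b - A @ x
        ec = np.linalg.solve(Ac, P.T @ r)   # coarse grid handles "smooth" modes
        x = x + P @ ec                      # prolongate the correction
        x = jacobi(A, b, x)                 # post-smooth
    return x

if __name__ == "__main__":
    n = 64
    A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)    # 1D Laplacian
    P = np.kron(np.eye(n // 2), np.array([[1.0], [1.0]]))   # pairwise aggregation
    b = np.ones(n)
    print(np.linalg.norm(b - A @ two_grid(A, b, P)))
```

Recursing on the coarse solve instead of factoring it directly turns this two-grid cycle into a full multilevel V-cycle.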

Slide 22: H(curl) multigrid operates on two grids: nodes and edges. We have developed two H(curl) AMG solvers:
– a special (commuting) prolongator (Hu et al., 2006);
– a discrete Hodge Laplacian reformulation (Bochev et al., 2007, in review).
[The de Rham diagram from slide 20 is repeated here.]

Slide 23: New AMG: Laplace reformulation. Idea: reformulate the edge system as a Hodge Laplacian using a discrete Hodge decomposition. The resulting preconditioner is block diagonal, with the Hodge part interpolated to a vector nodal Laplacian, and standard AMG algorithms are then applied to each diagonal block: "Multigrid was designed for Laplacians." A hedged sketch of the general form follows below.
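The formulas on the original slide are not reproduced here; as a hedged sketch only, the generic shape of such a reformulation looks roughly like the following, where the operators, scalings, and the interpolation to the nodal vector Laplacian are assumptions rather than the actual Bochev et al. construction.

```latex
% Hedged sketch of a Hodge-Laplacian-style reformulation (assumed form):
\begin{aligned}
  &\text{edge system:} &&
    \nabla \times \bigl(\mu^{-1}\,\nabla \times \mathbf{E}\bigr)
      + \tfrac{\sigma}{\Delta t}\,\mathbf{E} = \mathbf{f},\\
  &\text{discrete Hodge split:} &&
    \mathbf{E} = \boldsymbol{\Psi} + \nabla p,\\
  &\text{block preconditioner:} &&
    M \approx
    \begin{pmatrix}
      -\Delta_{\mathrm{nodal}} + \tfrac{\sigma}{\Delta t} & 0\\
      0 & -\nabla \cdot (\sigma\,\nabla)
    \end{pmatrix}.
\end{aligned}
```

Each diagonal block then looks like a vector or scalar nodal Laplacian, which is precisely the kind of operator standard AMG handles well.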

Slide 24: Theory: multigrid and multi-material. Recent work by Xu and Zhu (2007) for the Laplace problem is encouraging. Idea: material jumps have a limited effect on AMG.
– Only a small number of eigenvalues get perturbed.
– Without these eigenvalues, the reduced condition number is O(|log h|^2).
Caveats:
– The theory is only for Laplace (not Maxwell).
– It assumes the number of materials is small.
– If we really have varying properties, which we do in real problems, then there are more bad eigenvalues.

Slide 25: Test problems (10^6 jump in conductivity), run as weak scaling tests:
– Sphere: a ball in a box; half-filled elements near the surface.
– Liner: a cylindrical liner; non-orthogonal mesh with slight stretching.
– LinerF: a fingered cylindrical liner; non-orthogonal mesh with slight stretching and material fingering.

Slide 26: Multi-material issues and scalability. The basic issue is that the coefficient (the conductivity σ) changes. Physics and discretization issues:
– Multi-material regions and mesh stretching.
– Material fingering.
– Half-filled elements at material boundaries.
Multigrid issues:
– Aggregates crossing material boundaries.
– What is an appropriate semi-coarsening?
– H(grad) theory is not directly applicable.

Slide 27: Old H(curl) solver iteration counts (7/9/2007). [Plot.] Liner and LinerF: 1 Hiptmair fine smoothing sweep, LU coarse-grid solve, smoothed prolongator. Sphere: 2 Hiptmair fine sweeps, 6 coarse Hiptmair sweeps, smoothed prolongator off. Performance is sensitive to solver settings and to the problem.

Slide 28: Old H(curl) solver run time (7/9/2007). [Plot.] Liner and LinerF: 1 Hiptmair fine smoothing sweep, LU coarse-grid solve, smoothed prolongator; note the degradation due to fingering. Sphere: 2 Hiptmair fine sweeps, 6 coarse Hiptmair sweeps, smoothed prolongator off. Performance is sensitive to solver settings and to the problem.

April 18, 2015 Sept 2007 Prague 29 Sphere - Old/New Comparison

April 18, 2015 Sept 2007 Prague 30 Liner- Old/New Comparison

April 18, 2015 Sept 2007 Prague 31 LinerF- Old/New Comparison

Slide 32: Observations. Multi-material issues have a significant effect on AMG performance. However, getting the overall multigrid solver settings right appears at least as important as the multi-material effects themselves on a given problem. We need to:
– expand our test suite to include smoothly varying properties;
– improve the matrix of tests versus AMG option settings;
– investigate whether "optimal" default settings exist.
This is an expensive process.

Slide 33: Summary.
– Multi-material modeling impacts scalable performance.
– Interface reconstruction algorithms impact scalable performance to a significant degree. High-quality reconstruction such as PIR is needed but comes at a cost, which justifies dedicated attention to the performance of high-order interface reconstruction.
– AMG performance can depend strongly on material discontinuities, details of the problem, and solver settings. The new H(curl) Hodge Laplacian multigrid shows promise at large scale.
– A continual testing and improvement process is required for large-scale capacity computing success today, and even more so in the future. Continued emphasis on answering questions of optimal algorithmic choice appears to be key to meeting future requirements.