High Performance Computing Seminar II
Parallel mesh partitioning with ParMETIS
Parallel iterative solvers with Hypre
M.Sc. Caroline Mendonça Costa



Caroline Mendonça Costa ParMETIS and Hypre Page 2
Outline
ParMETIS
● Review of multilevel K-way partitioning
● Parallel K-way partitioning
● Recursive bisection
● Geometric partitioning
● Work in progress
PETSc and Hypre
● Solution of PDEs
● CARP
● Large linear systems
● PCG (PETSc)
● Preconditioning
● AMG
● BoomerAMG (Hypre)
● Performance comparison

Parallel mesh partitioning with ParMETIS

Overview of the package
● Parallel version of the METIS mesh partitioner
● Developed by the Karypis lab at the Digital Technology Center of the University of Minnesota
● Can be downloaded at
● What it can do...
  ● Mesh/graph partitioning
  ● Graph repartitioning
  ● Partition refinement
  ● Matrix reordering

K-way graph partitioning problem
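The slide's own statement of the problem did not survive in the transcript. As a hedged illustration, the usual formulation is: split the vertices into k parts of roughly equal size while keeping the total weight of edges that cross between parts (the edge cut) small. The sketch below evaluates both quantities for a given partition. The function names and the small grid graph are illustrative assumptions, not part of METIS or ParMETIS.

```python
def edge_cut(edges, part):
    """Sum the weights of edges whose endpoints lie in different parts."""
    return sum(w for u, v, w in edges if part[u] != part[v])

def imbalance(part, k):
    """Largest part size divided by the ideal size n/k (1.0 is perfect)."""
    sizes = [0] * k
    for p in part:
        sizes[p] += 1
    return max(sizes) * k / len(part)

# 2 x 3 grid graph with unit edge weights; vertices are numbered row by row.
edges = [(0, 1, 1), (1, 2, 1), (3, 4, 1), (4, 5, 1),
         (0, 3, 1), (1, 4, 1), (2, 5, 1)]

by_rows = [0, 0, 0, 1, 1, 1]      # balanced, cuts the three vertical edges
by_columns = [0, 0, 1, 0, 0, 1]   # fewer cut edges, but unbalanced

for name, part in (("rows", by_rows), ("columns", by_columns)):
    print(name, edge_cut(edges, part), round(imbalance(part, 2), 3))
```

The two partitions show the trade-off a partitioner has to handle: the second one cuts fewer edges only because it gives up on balance.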

Multilevel K-way partitioning
● Coarsening: matching, nodes are collapsed together
● Initial K-way partitioning: multilevel recursive bisection
● Uncoarsening: the partitioning is projected onto the finer graphs, followed by partitioning refinement

Parallel multilevel K-way partitioning
Graph coloring algorithm
● Luby's algorithm
Parallel coloring
● Distributed memory
● Communication setup
● Exchange of random numbers
● Single augmentation step
● Drawbacks
  ● Independent set is not maximal
  ● Increases the number of colors
  ● Not all nodes are colored
● Advantages
  ● Reduces the overall time
  ● Most nodes participate
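A minimal sequential sketch of the random-number idea behind this coloring, assuming a simplified Luby-style scheme rather than ParMETIS's actual implementation: in every round each uncolored vertex draws a random number, and the vertices that beat all of their uncolored neighbors form an independent set that receives the round's color. The function name and the ring graph are illustrative.

```python
import random

def luby_style_coloring(adj, seed=0):
    """Color a graph in rounds; adj maps each vertex to a list of neighbors."""
    rng = random.Random(seed)
    color = {v: None for v in adj}
    current = 0
    while any(c is None for c in color.values()):
        # Every uncolored vertex draws a random number. In a distributed
        # setting these draws would be exchanged with neighboring processors.
        # The vertex id is appended only to break exact ties.
        draw = {v: (rng.random(), v) for v in adj if color[v] is None}
        # A vertex wins if its draw beats every uncolored neighbor's draw.
        # Winners are pairwise non-adjacent, so they can share one color.
        winners = [v for v in draw
                   if all(draw[v] > draw[u] for u in adj[v] if u in draw)]
        for v in winners:
            color[v] = current
        current += 1
    return color

ring = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
colors = luby_style_coloring(ring)
print(colors)
```

Because only one selection step is made per round, the set of winners is independent but generally not maximal, which is why this kind of scheme tends to use more colors than a careful sequential coloring.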

Parallel multilevel K-way partitioning
Coarsening phase
● Parallel matching algorithm
  ● Uses the coloring of the graph to structure the computations
  ● During the c-th iteration, the vertices of color c select one of their unmatched neighbors using the heavy-edge matching (HEM) heuristic
  ● After each iteration, the vertices synchronize
● Distributed memory
  ● The write and read operations are gathered together and sent in a single message
  ● During the read operation the processors also determine whether they will store the collapsed vertex
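The color-structured matching can be sketched sequentially as follows. This is an illustration under simplifying assumptions, not ParMETIS code: vertices are visited color by color, and each unmatched vertex picks its unmatched neighbor with the heaviest connecting edge. A real distributed implementation also has to resolve the case where two vertices of the same color pick the same neighbor; the sequential loop below sidesteps that. All names are hypothetical.

```python
def hem_matching_by_color(adj, color):
    """Heavy-edge matching visited in color order.

    adj[v] is a dict {neighbor: edge_weight}; color[v] is an integer color.
    Returns a dict mapping every vertex to its partner (or to itself).
    """
    match = {}
    for c in sorted(set(color.values())):
        for v in sorted(adj):
            if color[v] != c or v in match:
                continue
            # Candidate partners: neighbors that are still unmatched.
            free = [(w, u) for u, w in adj[v].items() if u not in match]
            if free:
                _, u = max(free)          # heaviest edge wins
                match[v], match[u] = u, v
    for v in adj:
        match.setdefault(v, v)            # leftovers stay single
    return match

# Path 0 - 1 - 2 - 3 with a heavy middle edge.
path = {0: {1: 1}, 1: {0: 1, 2: 5}, 2: {1: 5, 3: 1}, 3: {2: 1}}
coloring = {0: 1, 1: 0, 2: 1, 3: 0}
matching = hem_matching_by_color(path, coloring)
print(matching)
```

Each matched pair would then be collapsed into a single vertex of the coarser graph, so preferring heavy edges removes as much edge weight as possible from the coarse graph.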

Parallel multilevel K-way partitioning
Uncoarsening/refinement phase
Uncoarsening
● Each processor contains one subgraph
● Each subgraph is expanded independently
Refinement
● Uses the coloring to structure the vertex swaps
● During the c-th iteration, all vertices of color c are considered for movement
● The subset that leads to a reduction in the edge cut (EC) is “moved”
● Distributed memory
  ● The vertices do not move immediately: only the partition number is updated
  ● At the end of the algorithm the vertices are moved to the corresponding processor
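The refinement step can be illustrated with a small sequential sketch. It assumes a greedy rule (move a vertex to the neighboring part it is most strongly connected to if that lowers the edge cut) and a simple size cap for balance; both are assumptions for illustration and not necessarily the criteria ParMETIS uses. Since vertices of one color are never adjacent, the gains of the moves chosen within one color simply add up.

```python
def refine_by_color(adj, part, color, max_size):
    """One greedy refinement sweep, processing one color at a time.

    adj[v] is {neighbor: weight}; part[v] is the current part of v.
    Only the part labels change; no data is physically moved.
    """
    part = dict(part)
    sizes = {}
    for p in part.values():
        sizes[p] = sizes.get(p, 0) + 1
    for c in sorted(set(color.values())):
        moves = {}
        for v in sorted(adj):
            if color[v] != c:
                continue
            # Edge weight connecting v to each part.
            conn = {}
            for u, w in adj[v].items():
                conn[part[u]] = conn.get(part[u], 0) + w
            internal = conn.pop(part[v], 0)
            if not conn:
                continue                      # v is not on a boundary
            target = max(conn, key=conn.get)
            if conn[target] > internal and sizes[target] < max_size:
                moves[v] = target
                sizes[target] += 1
                sizes[part[v]] -= 1
        part.update(moves)                    # apply this color's moves together
    return part

# Path 0-1-2-3-4-5 where vertices 2 and 3 start on the wrong sides.
chain = {0: {1: 2}, 1: {0: 2, 2: 2}, 2: {1: 2, 3: 1},
         3: {2: 1, 4: 2}, 4: {3: 2, 5: 2}, 5: {4: 2}}
start = {0: 0, 1: 0, 2: 1, 3: 0, 4: 1, 5: 1}
two_coloring = {0: 0, 1: 1, 2: 0, 3: 1, 4: 0, 5: 1}
refined = refine_by_color(chain, start, two_coloring, max_size=4)
print(refined)
```

In this example the edge cut drops from 5 to 1 after one sweep over both colors.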

Parallel multilevel K-way partitioning
K-way partitioning
● Nested dissection
● Parallelization?
Geometric partitioning
● Space-filling curves
● Parallelization?
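To make the space-filling-curve idea concrete, here is a minimal sketch that uses a Z-order (Morton) curve, which is one possible choice; the slides do not say which curve ParMETIS uses. Points are sorted by their position along the curve and the sorted sequence is cut into k equal chunks. Function names and the 4 x 4 test grid are assumptions.

```python
def morton_key(ix, iy, bits=16):
    """Interleave the bits of two non-negative integers (Z-order index)."""
    key = 0
    for b in range(bits):
        key |= ((ix >> b) & 1) << (2 * b)
        key |= ((iy >> b) & 1) << (2 * b + 1)
    return key

def sfc_partition(points, k, bits=16):
    """Assign each 2D point to one of k parts by cutting the Z-order curve."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    scale = (1 << bits) - 1

    def cell(value, lo, hi):
        # Map a coordinate to an integer grid cell in [0, 2**bits - 1].
        return 0 if hi == lo else int((value - lo) / (hi - lo) * scale)

    keys = [morton_key(cell(x, min(xs), max(xs)),
                       cell(y, min(ys), max(ys)), bits)
            for x, y in points]
    order = sorted(range(len(points)), key=lambda i: keys[i])
    part = [0] * len(points)
    for rank, i in enumerate(order):
        part[i] = rank * k // len(points)   # equal-sized contiguous chunks
    return part

grid = [(float(x), float(y)) for y in range(4) for x in range(4)]
parts = sfc_partition(grid, 4)
for row in range(4):
    print(parts[4 * row: 4 * row + 4])
```

On the regular grid each part ends up being one quadrant, which shows why sorting along the curve tends to keep geometrically close points together. Only coordinates are needed, so no graph has to be built.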

Work in progress
METIS vs. ParMETIS
● METIS: stand-alone software and library
● ParMETIS: distributed only as a library
Using ParMETIS to partition meshes
● A few steps...
  ● Obtain graph from mesh
  ● Include geometric information (not working)
  ● Use the graph partitioning routines
    ● K-way partitioning
    ● Geometric K-way partitioning
● Challenges
  ● The structure of the code is not well described in the manual or in the code
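The first step, obtaining a graph from a mesh, can be sketched as below. The sketch builds the dual graph (one vertex per element, with an edge between elements that share `ncommon` nodes) and stores it in compressed adjacency arrays. The array names `xadj` and `adjncy` follow the layout described in the METIS and ParMETIS manuals; the function itself and the tiny triangle mesh are hypothetical. ParMETIS additionally expects a `vtxdist` array describing how the vertices are distributed across processors, which is omitted here.

```python
from itertools import combinations

def mesh_to_dual_csr(elements, ncommon=2):
    """Build the dual graph of a mesh in compressed adjacency (CSR) form.

    elements: list of node-id tuples, one per element.
    ncommon:  number of shared nodes that makes two elements neighbors
              (2 for triangles sharing an edge).
    """
    # Map every group of ncommon nodes to the elements that contain it.
    face_to_elems = {}
    for e, nodes in enumerate(elements):
        for face in combinations(sorted(nodes), ncommon):
            face_to_elems.setdefault(face, []).append(e)

    neighbors = [set() for _ in elements]
    for elems in face_to_elems.values():
        for a, b in combinations(elems, 2):
            neighbors[a].add(b)
            neighbors[b].add(a)

    # Neighbors of element i are adjncy[xadj[i]:xadj[i + 1]].
    xadj, adjncy = [0], []
    for nb in neighbors:
        adjncy.extend(sorted(nb))
        xadj.append(len(adjncy))
    return xadj, adjncy

# Two squares split into four triangles; nodes are numbered
#   3 4 5
#   0 1 2
triangles = [(0, 1, 4), (0, 4, 3), (1, 2, 5), (1, 5, 4)]
xadj, adjncy = mesh_to_dual_csr(triangles)
print(xadj, adjncy)
```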

Parallel Iterative Solvers with Hypre

Solution of PDE systems
[Diagram labels: Bidomain model; Elliptic system; Nonlinear ODE system; Parabolic PDE; Operator splitting; FEM; Bottleneck of computation; Large linear system]

Solution of PDE systems
CARP – Cardiac Arrhythmia Research Package
Contact: Dr. Gernot Plank, Institute of Biophysics, MedUniGraz

Large Linear Systems
● Preconditioned Conjugate Gradient (PCG)
● Preconditioning step: M s = r
● Apply a different iterative solver to the residual (defect) system: AMG
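A compact sketch of the preconditioned conjugate gradient iteration, written in plain Python for readability. The preconditioner is passed in as a function that returns s from the residual r, that is, it approximately solves M s = r. Here a Jacobi (diagonal) preconditioner and a 1D Poisson matrix stand in as assumptions for illustration; in the setting of the talk, one AMG cycle would take the place of `jacobi`.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pcg(apply_A, b, apply_Minv, tol=1e-10, max_iter=200):
    """Solve A x = b for symmetric positive definite A with preconditioner M."""
    x = [0.0] * len(b)
    r = list(b)                 # residual r = b - A x for the initial x = 0
    s = apply_Minv(r)           # preconditioning step: solve M s = r
    p = list(s)
    rs = dot(r, s)
    for iteration in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            return x, iteration
        Ap = apply_A(p)
        alpha = rs / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * ai for ri, ai in zip(r, Ap)]
        s = apply_Minv(r)
        rs_new = dot(r, s)
        p = [si + (rs_new / rs) * pi for si, pi in zip(s, p)]
        rs = rs_new
    return x, max_iter

def poisson_1d(v):
    """Apply the tridiagonal matrix with stencil [-1, 2, -1]."""
    n = len(v)
    return [2 * v[i]
            - (v[i - 1] if i > 0 else 0.0)
            - (v[i + 1] if i < n - 1 else 0.0)
            for i in range(n)]

def jacobi(r):
    """Diagonal (Jacobi) preconditioner: the matrix diagonal is 2."""
    return [ri / 2.0 for ri in r]

b = [1.0] * 8
x, iterations = pcg(poisson_1d, b, jacobi)
print(iterations, [round(xi, 6) for xi in x])
```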

Preconditioning
Algebraic Multigrid
● Setup phase
  ● Assemble
    ● Coarse levels
    ● Restriction
    ● Prolongation
[Figure labels: Restriction, Interpolation]
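What the setup phase assembles can be shown on a tiny example. The sketch assumes a fixed linear-interpolation prolongation P for a 1D Poisson matrix, takes the restriction as R = P^T, and forms the coarse-level operator as the Galerkin product R A P, which is a common choice. A real AMG setup would instead derive the coarse points and P from the matrix entries; the helper names here are illustrative.

```python
def matmul(X, Y):
    """Dense matrix product for small lists-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def poisson_matrix(n):
    """Tridiagonal matrix with stencil [-1, 2, -1]."""
    return [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0
             for j in range(n)] for i in range(n)]

def linear_interpolation(n_coarse):
    """Prolongation from n_coarse points to 2 * n_coarse + 1 fine points."""
    n_fine = 2 * n_coarse + 1
    P = [[0.0] * n_coarse for _ in range(n_fine)]
    for j in range(n_coarse):
        P[2 * j][j] = 0.5        # left fine neighbor
        P[2 * j + 1][j] = 1.0    # fine point that coincides with coarse point j
        P[2 * j + 2][j] = 0.5    # right fine neighbor
    return P

A = poisson_matrix(7)
P = linear_interpolation(3)
R = transpose(P)                       # restriction taken as the transpose
A_coarse = matmul(matmul(R, A), P)     # Galerkin coarse-level operator
for row in A_coarse:
    print(row)
```

The coarse operator is again tridiagonal with the same stencil shape, so the same construction can be repeated to build further levels.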

Large Linear Systems
PCG
PETSc - Portable, Extensible Toolkit for Scientific Computation
Freely distributed at

Preconditioning
Hypre – High Performance Preconditioners
BoomerAMG
Offered together with PETSc
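When PETSc is built with Hypre support, BoomerAMG can be selected as the preconditioner for PETSc's CG solver through run-time options. The snippet below only assembles such an option list; it does not call PETSc. The executable name `./my_solver` is a placeholder, and the sketch assumes the application reads its solver settings from the command line in the usual PETSc way.

```python
def petsc_arguments(coarsen_type="Falgout"):
    """Build command-line options selecting CG preconditioned by BoomerAMG."""
    return [
        "-ksp_type", "cg",                 # conjugate gradient Krylov solver
        "-pc_type", "hypre",               # hand preconditioning to Hypre
        "-pc_hypre_type", "boomeramg",     # choose BoomerAMG within Hypre
        "-pc_hypre_boomeramg_coarsen_type", coarsen_type,
    ]

# One command line per coarsening strategy; ./my_solver is hypothetical.
for name in ("CLJP", "Ruge-Stueben", "Falgout"):
    print("./my_solver " + " ".join(petsc_arguments(name)))
```

Keeping the solver choice in options rather than in code makes it easy to rerun the same program with different coarsening strategies.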

Preconditioning
BoomerAMG: Parallel Algebraic Multigrid
● General parallelization can be extended from geometric multigrid (GMG)
  ● Except for coarse-grid selection
  ● Naturally sequential
● Parallelization of the coarse-grid selection
  ● Parallel CLJP algorithm
  ● Parallel Ruge-Stüben coarsening
  ● Parallel Falgout coarsening

Preconditioning
BoomerAMG: Parallel Algebraic Multigrid
● Parallel CLJP algorithm
  ● Based on Luby's graph coloring algorithm
  ● Coarsening is computed for each color separately
  ● Poor coarsening in the interior of processor domains
● Parallel Ruge-Stüben coarsening
  ● Nodes with maximal weight are included in the coarse grid
  ● Each processor applies RS to its own subset of nodes
  ● Difficulties on the boundaries of processor domains
● Parallel Falgout coarsening
  ● Uses CLJP near the boundary nodes of each processor
  ● Uses RS on the internal nodes of each processor
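The hybrid idea behind Falgout coarsening, as described on the slide, relies on telling apart the nodes in the interior of a processor's domain from those near its boundary. The sketch below shows only that classification step, under the assumption that a node counts as a boundary node when at least one neighbor is owned by another processor. It does not implement RS or CLJP, and the names are illustrative rather than Hypre's.

```python
def split_interior_boundary(adj, owner):
    """Group each processor's nodes into interior and boundary sets.

    adj[v] lists the neighbors of v; owner[v] is the processor that owns v.
    A node is 'boundary' if any neighbor lives on another processor.
    """
    interior, boundary = {}, {}
    for v in sorted(adj):
        crosses = any(owner[u] != owner[v] for u in adj[v])
        target = boundary if crosses else interior
        target.setdefault(owner[v], []).append(v)
    return interior, boundary

# Chain of six nodes split between two processors.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
owner = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
interior, boundary = split_interior_boundary(chain, owner)
print("RS candidates (interior):", interior)
print("CLJP candidates (boundary):", boundary)
```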

A Few Results...
Table 1. Solution time
● Unable to isolate the setup time of the AMG
● Does RS tend to behave like Falgout as the number of processors increases?
● Setup time is included in the final solution time
● Does Falgout lead to a better-conditioned system than RS and CLJP, or does it take less time to perform coarsening than RS and CLJP?

More to come...
● Isolate setup and solution time of the AMG
● Show more results
  ● Using different unstructured and larger meshes
  ● Using more processors
  ● Using different interpolation types
  ● Using different smoothers

Some bibliography
● ParMETIS
  ● Manual:
  ● Parallel Multilevel k-way Partitioning Scheme for Irregular Graphs:
● Hypre
  ● Manual:
  ● BoomerAMG:

Thank you for your attention!