High Performance Computing Seminar II
Parallel mesh partitioning with ParMETIS
Parallel iterative solvers with Hypre
M.Sc. Caroline Mendonça Costa

Caroline Mendonça Costa ParMETIS and Hypre Page 2
Outline
ParMETIS
● Review of Multilevel K-way partitioning
● Parallel K-way partitioning
● Recursive bisection
● Geometric partitioning
● Work in progress
PETSc and Hypre
● Solution of PDEs
● CARP
● Large linear systems
● PCG (PETSc)
● Preconditioning
● AMG
● BoomerAMG (Hypre)
● Performance comparison

Parallel mesh partitioning with ParMETIS

Overview of the package
● Parallel version of the Metis mesh partitioner
● Developed by the Karypis lab at the Digital Technology Center of the University of Minnesota
● Can be downloaded at
● What it can do...
  ● Mesh/graph partitioning
  ● Graph repartitioning
  ● Partition refinement
  ● Matrix reordering

K-way graph partitioning problem
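
As a concrete reading of the problem, the sketch below evaluates the two quantities a K-way partitioner trades off: the edge cut (total weight of edges crossing between parts) and the load balance. The small grid graph, the partition vector and the function names are illustrative assumptions and are not taken from ParMETIS.

```python
from collections import Counter

def edge_cut(edges, part):
    """Sum the weights of edges whose endpoints lie in different parts."""
    return sum(w for u, v, w in edges if part[u] != part[v])

def imbalance(part, k):
    """Largest part size divided by the ideal size n/k (1.0 means perfect balance)."""
    sizes = Counter(part)
    return max(sizes.values()) / (len(part) / k)

# 2 x 3 grid graph with unit edge weights, vertices numbered row by row
edges = [(0, 1, 1), (1, 2, 1), (3, 4, 1), (4, 5, 1),
         (0, 3, 1), (1, 4, 1), (2, 5, 1)]
part = [0, 0, 1, 0, 0, 1]  # hypothetical 2-way partition

print(edge_cut(edges, part), imbalance(part, 2))
```

A good partition keeps the edge cut small while the imbalance stays close to 1.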

Multilevel K-way partitioning
● Coarsening: nodes are collapsed together (matching)
● Initial K-way partitioning: multilevel recursive bisection
● Uncoarsening: the partitioning is projected back onto the finer graphs
● Partitioning refinement

Parallel multilevel K-way partitioning
Graph coloring algorithm
● Luby's algorithm
Parallel coloring
● Distributed memory
● Communication setup
● Exchange of random numbers
● Single augmentation step
● Drawbacks
  ● Independent set is not maximal
  ● Increases the number of colors
  ● Not all nodes are colored
● Advantages
  ● Reduces the overall time
  ● Most nodes participate
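
The coloring scheme can be sketched serially as follows. This is a minimal illustration assuming a Luby-style rule: an uncolored vertex takes the current color when its random number beats those of all its uncolored neighbors, with a single augmentation step per color. The function name, the `max_colors` parameter and the example graph are hypothetical, and the list comprehension over vertices stands in for what the processors would do concurrently after exchanging random numbers.

```python
import random

def luby_coloring(adj, max_colors=None, seed=0):
    """Color a graph given as adjacency lists; -1 marks vertices left uncolored."""
    rng = random.Random(seed)
    n = len(adj)
    color = [-1] * n
    c = 0
    while -1 in color and (max_colors is None or c < max_colors):
        # Every uncolored vertex draws a random number (vertex id breaks ties).
        key = {v: (rng.random(), v) for v in range(n) if color[v] == -1}
        # Single augmentation step: keep only local maxima among uncolored neighbors.
        winners = [v for v in key
                   if all(key[v] > key[u] for u in adj[v] if u in key)]
        for v in winners:
            color[v] = c
        c += 1
    return color

# 6-cycle as a small example graph
adj = [[1, 5], [0, 2], [1, 3], [2, 4], [3, 5], [4, 0]]
colors = luby_coloring(adj)
```

Because each color class is built in one step, it is an independent set but usually not a maximal one, which is why more colors are needed than with a full Luby iteration. Capping `max_colors` leaves some vertices uncolored, matching the drawback listed above.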

Parallel multilevel K-way partitioning
Coarsening phase
● Parallel matching algorithm
● Uses the coloring of the graph to structure the computations
● During the c-th iteration, the vertices of color c select one of their unmatched neighbors using the heavy-edge matching (HEM) heuristic
● After each iteration, the vertices synchronize
● Distributed memory
  ● The write and read operations are gathered together and sent in a single message
  ● During the read operation the processors also determine if they will store the collapsed vertex
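
A serial sketch of color-structured matching is given below. It assumes a proper coloring of all vertices, so that vertices of the same color are never adjacent and can issue requests at the same time. The "heaviest request wins" rule used to resolve two vertices asking for the same neighbor is an assumption made for this illustration, as are the function name and the data layout.

```python
def hem_matching(adj_w, color):
    """adj_w[v] = {neighbor: edge_weight}; color[v] = color id of a proper coloring.

    Returns match, where match[v] == v means v stayed unmatched.
    """
    n = len(adj_w)
    match = list(range(n))
    for c in sorted(set(color)):
        # "Write" step: each unmatched vertex of color c requests its
        # heaviest-edge unmatched neighbor.
        requests = {}  # target vertex -> (edge weight, requesting vertex)
        for v in range(n):
            if color[v] != c or match[v] != v:
                continue
            candidates = [(w, u) for u, w in adj_w[v].items() if match[u] == u]
            if not candidates:
                continue
            w, u = max(candidates)
            if u not in requests or (w, v) > requests[u]:
                requests[u] = (w, v)
        # "Read" / synchronization step: each target grants one request.
        for u, (w, v) in requests.items():
            match[u], match[v] = v, u
    return match

# Path 0 - 1 - 2 - 3 with a heavy middle edge
adj_w = [{1: 1}, {0: 1, 2: 5}, {1: 5, 3: 1}, {2: 1}]
color = [0, 1, 0, 1]
match = hem_matching(adj_w, color)
```

In this example vertices 0 and 2 both request vertex 1, and the heavier edge (1, 2) is the one that gets collapsed.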

Parallel multilevel K-way partitioning
Uncoarsening/refinement phase
Uncoarsening
● Each processor contains one subgraph
● Each subgraph is expanded independently
Refinement
● Uses the coloring to structure the vertex swaps
● During the c-th iteration all vertices of color c are considered for movement
● The subset that leads to a reduction in the edge cut (EC) is “moved”
● Distributed memory
  ● The vertices do not move immediately: only the partition number is updated
  ● By the end of the algorithm the vertices are moved to the corresponding processor
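
The refinement step can be illustrated with a greedy, serial sketch. It assumes a proper coloring, so the gains of same-colored vertices do not interfere, and it only updates partition numbers, as described above. The gain rule (move to the most strongly connected part if that strictly reduces the edge cut) and the `max_size` balance limit are simplifying assumptions for this example, not the exact ParMETIS criteria.

```python
def refine_by_color(adj_w, color, part, k, max_size):
    """Greedy K-way refinement, one color class at a time.

    adj_w[v] = {neighbor: edge_weight}; part[v] = current part of v.
    Returns the updated list of partition numbers.
    """
    part = list(part)
    size = [part.count(p) for p in range(k)]
    for c in sorted(set(color)):
        for v in range(len(adj_w)):
            if color[v] != c:
                continue
            # Total edge weight connecting v to each part.
            conn = [0] * k
            for u, w in adj_w[v].items():
                conn[part[u]] += w
            best = max(range(k), key=lambda p: conn[p])
            gain = conn[best] - conn[part[v]]  # reduction in edge cut
            if gain > 0 and size[best] < max_size:
                size[part[v]] -= 1
                size[best] += 1
                part[v] = best  # only the partition number changes here
    return part

# Path 0 - 1 - 2 - 3 - 4 - 5 with unit weights and a poor initial partition
adj_w = [{1: 1}, {0: 1, 2: 1}, {1: 1, 3: 1}, {2: 1, 4: 1}, {3: 1, 5: 1}, {4: 1}]
color = [0, 1, 0, 1, 0, 1]
refined = refine_by_color(adj_w, color, [0, 0, 1, 0, 1, 1], k=2, max_size=4)
```

Here vertex 2 is moved to part 0, which lowers the edge cut from 3 to 1.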

Parallel multilevel K-way partitioning
K-way partitioning
● Nested dissection
● Parallelization?
Geometric partitioning
● Space-filling curves
● Parallelization?
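
To make the space-filling-curve idea concrete, the sketch below orders 2D points along a Morton (Z-order) curve and cuts the ordering into k equal chunks. The choice of the Morton curve, the function names and the 16-bit quantization are assumptions made for illustration; the curve actually used by a given package may differ.

```python
def morton_key(ix, iy, bits=16):
    """Interleave the bits of two non-negative integers (Z-order key)."""
    key = 0
    for b in range(bits):
        key |= ((ix >> b) & 1) << (2 * b)
        key |= ((iy >> b) & 1) << (2 * b + 1)
    return key

def sfc_partition(points, k, bits=16):
    """Assign each 2D point to one of k parts by cutting the Morton ordering."""
    n = len(points)
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    scale = (1 << bits) - 1

    def quantize(v, lo, hi):
        # Map a coordinate to an integer in [0, 2^bits - 1].
        return int((v - lo) / (hi - lo) * scale) if hi > lo else 0

    order = sorted(
        range(n),
        key=lambda i: morton_key(quantize(xs[i], min(xs), max(xs)),
                                 quantize(ys[i], min(ys), max(ys)), bits))
    part = [0] * n
    for rank, i in enumerate(order):
        part[i] = rank * k // n  # equal-sized contiguous chunks of the curve
    return part

points = [(x, y) for y in range(4) for x in range(4)]  # 4 x 4 grid
parts = sfc_partition(points, 4)
```

On the 4 x 4 grid each of the four parts ends up being one quadrant, which shows how locality along the curve translates into compact subdomains.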

Work in progress
Metis vs. ParMetis
● Metis: stand-alone software and library
● ParMetis: only distributed as a library
Using ParMetis to partition meshes
● A few steps...
  ● Obtain graph from mesh
  ● Include geometric information
    – Not working
  ● Use the graph partitioning routines
    ● K-way partitioning
    ● Geometric K-way partitioning
● Challenges
  ● Structure of the code is not well described in the manual or in the code
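
The first step, obtaining a graph from the mesh, can be sketched as building the dual graph: one vertex per element, with an edge between two elements that share at least `ncommon` nodes (2 for triangles sharing an edge). The output uses the compressed `xadj`/`adjncy` adjacency arrays that the Metis family expects. To the best of my knowledge ParMETIS offers its own routine for this (ParMETIS_V3_Mesh2Dual); the code below is an independent serial illustration with hypothetical names, not that routine's implementation.

```python
from collections import defaultdict
from itertools import combinations

def mesh_to_dual_csr(elements, ncommon=2):
    """Build the dual graph of a mesh in CSR form (xadj, adjncy)."""
    # Map each mesh node to the elements that contain it.
    node_to_elems = defaultdict(list)
    for e, nodes in enumerate(elements):
        for node in nodes:
            node_to_elems[node].append(e)
    # Count how many nodes each pair of elements has in common.
    shared = defaultdict(int)
    for elems in node_to_elems.values():
        for a, b in combinations(elems, 2):
            shared[(a, b)] += 1
    neighbors = [[] for _ in elements]
    for (a, b), count in shared.items():
        if count >= ncommon:
            neighbors[a].append(b)
            neighbors[b].append(a)
    # Flatten the adjacency lists into CSR arrays.
    xadj, adjncy = [0], []
    for lst in neighbors:
        adjncy.extend(sorted(lst))
        xadj.append(len(adjncy))
    return xadj, adjncy

# Four triangles on a 2 x 3 node grid (nodes 0 1 2 / 3 4 5)
triangles = [(0, 1, 4), (0, 4, 3), (1, 2, 5), (1, 5, 4)]
xadj, adjncy = mesh_to_dual_csr(triangles)
```

The neighbors of element i are `adjncy[xadj[i]:xadj[i + 1]]`.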

Parallel Iterative Solvers with Hypre

Solution of PDE systems
● Bidomain model
● Operator splitting: elliptic system, nonlinear ODE system, parabolic PDE
● FEM
● Large linear system: bottleneck of computation

Solution of PDE systems
CARP – Cardiac Arrhythmia Research Package
Contact Dr. Gernot Plank at the Institute of Biophysics of the MedUniGraz

Large Linear Systems
M s = r
● Preconditioned Conjugate Gradient
● Apply a different iterative solver to the residual (defect) system
● AMG
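
The structure of PCG can be sketched in a few lines. The preconditioner enters only through the step that solves the residual system (written here as M s = r), so any approximate solver can be plugged in as `apply_Minv`. In this sketch a Jacobi (diagonal) preconditioner is swapped in for AMG to keep the example short; the function names, the dense list-of-lists matrix and the test problem are assumptions for illustration and do not reflect how PETSc implements CG.

```python
def pcg(A, b, apply_Minv, tol=1e-10, max_iter=200):
    """Preconditioned conjugate gradient for a dense SPD matrix A."""
    n = len(b)

    def matvec(x):
        return [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(a * c for a, c in zip(u, v))

    x = [0.0] * n
    r = list(b)              # residual for the zero initial guess
    s = apply_Minv(r)        # preconditioning: approximately solve M s = r
    p = list(s)
    rs = dot(r, s)
    for it in range(max_iter):
        if dot(r, r) ** 0.5 <= tol:
            return x, it
        Ap = matvec(p)
        alpha = rs / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        s = apply_Minv(r)
        rs_new = dot(r, s)
        p = [si + (rs_new / rs) * pi for si, pi in zip(s, p)]
        rs = rs_new
    return x, max_iter

# 1D Laplacian test problem: tridiag(-1, 2, -1), right-hand side of ones
n = 5
A = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
     for i in range(n)]
b = [1.0] * n
jacobi = lambda r: [ri / A[i][i] for i, ri in enumerate(r)]  # stand-in for AMG
x, iterations = pcg(A, b, jacobi)
```

Replacing `jacobi` with one AMG cycle applied to r gives the AMG-preconditioned CG discussed in the following slides.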

Preconditioning
Algebraic Multigrid
● Setup phase
  ● Assemble
  ● Coarse levels
  ● Restriction
  ● Prolongation
[Figure: restriction and interpolation]
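
The roles of restriction and prolongation can be illustrated with a two-grid sketch. The setup phase builds the restriction as the transpose of the prolongation and the coarse operator as the Galerkin product R A P; the solve phase smooths, corrects on the coarse level and smooths again. The coarse points are chosen by hand here (every second point of a 1D Laplacian) rather than by an AMG coarsening algorithm, and the function names, the damped Jacobi smoother and the dense matrices are assumptions made to keep the example self-contained.

```python
def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def matmul(A, B):
    Bt = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def solve_dense(A, b):
    """Gaussian elimination with partial pivoting (coarsest-level solve)."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def amg_setup(A, P):
    """Setup phase: restriction R = P^T and Galerkin coarse operator R A P."""
    R = [list(col) for col in zip(*P)]
    return R, matmul(matmul(R, A), P)

def smooth(A, x, b, sweeps=2, omega=2.0 / 3.0):
    """Damped Jacobi smoother."""
    for _ in range(sweeps):
        Ax = matvec(A, x)
        x = [x[i] + omega * (b[i] - Ax[i]) / A[i][i] for i in range(len(b))]
    return x

def two_grid_cycle(A, P, R, Ac, b, x):
    x = smooth(A, x, b)                                 # pre-smoothing
    r = [bi - ai for bi, ai in zip(b, matvec(A, x))]    # fine-level residual
    ec = solve_dense(Ac, matvec(R, r))                  # restrict and solve
    x = [xi + ei for xi, ei in zip(x, matvec(P, ec))]   # prolongate and correct
    return smooth(A, x, b)                              # post-smoothing

n = 7
A = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
     for i in range(n)]
P = [[0.0] * 3 for _ in range(n)]   # linear interpolation from 3 coarse points
for j in range(3):
    P[2 * j][j], P[2 * j + 1][j], P[2 * j + 2][j] = 0.5, 1.0, 0.5
b = [1.0] * n
R, Ac = amg_setup(A, P)
x = [0.0] * n
for _ in range(10):
    x = two_grid_cycle(A, P, R, Ac, b, x)
residual = max(abs(bi - ai) for bi, ai in zip(b, matvec(A, x)))
```

A full AMG method applies the same cycle recursively instead of solving the coarse system directly, and derives the prolongation from the matrix entries.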

Large Linear Systems
PCG
PETSc - Portable, Extensible Toolkit for Scientific Computation
Freely distributed at

Preconditioning
Hypre – High Performance Preconditioners
BoomerAMG
Offered together with PETSc

Preconditioning
BoomerAMG: Parallel Algebraic Multigrid
● General parallelization can be extended from GMG
● Except for coarse-grid selection
  ● Naturally sequential
● Parallelization of the coarse-grid selection
  ● Parallel CLJP algorithm
  ● Parallel Ruge-Stüben coarsening
  ● Parallel Falgout coarsening

Preconditioning
BoomerAMG: Parallel Algebraic Multigrid
● Parallel CLJP algorithm
  ● Based on Luby's graph coloring algorithm
  ● Coarsening is computed for each color separately
  ● Poor coarsening in the interior of processor domains
● Parallel Ruge-Stüben (RS) coarsening
  ● Nodes with maximal weight are included in the coarse grid
  ● Each processor applies RS to its own subset of nodes
  ● Difficulties on the boundaries of processor domains
● Parallel Falgout coarsening
  ● Uses CLJP near the boundary nodes of each processor
  ● Uses RS on the internal nodes of each processor
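
When BoomerAMG is used through PETSc, the coarsening scheme is normally chosen with run-time options rather than code changes. The helper below only assembles the option list for the three schemes discussed above. The option names are PETSc's as I understand them and should be checked against the installed version; `./solver` is a hypothetical executable name. `-log_view` is included because PETSc's log reports preconditioner setup and the Krylov solve as separate events, which may help when trying to separate AMG setup time from solution time.

```python
def boomeramg_options(coarsen_type):
    """Return PETSc run-time options selecting CG with Hypre BoomerAMG."""
    return [
        "-ksp_type", "cg",              # Krylov method: conjugate gradient
        "-pc_type", "hypre",            # preconditioner provided by Hypre
        "-pc_hypre_type", "boomeramg",  # parallel algebraic multigrid
        "-pc_hypre_boomeramg_coarsen_type", coarsen_type,
        "-log_view",                    # per-event timing summary at exit
    ]

for name in ("CLJP", "Ruge-Stueben", "Falgout"):
    # "./solver" is a placeholder for the user's PETSc application
    print("./solver " + " ".join(boomeramg_options(name)))
```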

A Few Results...
Table 1. Solution time
Unable to isolate the setup time of the AMG
Does RS tend to behave like Falgout as the number of processors increases?
Setup time is included in the final solution time
Does Falgout lead to a better-conditioned system than RS and CLJP, or does it take less time to perform coarsening than RS and CLJP?

More to come...
● Isolate setup and solution time of the AMG
● Show more results
  ● Using different unstructured and larger meshes
  ● Using more processors
  ● Using different interpolation types
  ● Using different smoothers

Thank you for your attention!