Parallel solution of the Helmholtz equation with large wave numbers
July 1, 2010
Dan Gordon, Computer Science, University of Haifa
Rachel Gordon, Aerospace Eng., Technion

Outline of Talk
The Kaczmarz algorithm (KACZ)
From KACZ to CARP (Component-Averaged Row Projections)
Applications of CARP
CARP-CG: CG acceleration of CARP
Sample results with the Helmholtz equation

KACZ: The Kaczmarz algorithm
Iterative method, due to Kaczmarz (1937); rediscovered for CT as ART
Geometric algorithm: consider the hyperplane defined by each equation
Start from an arbitrary initial point
Successively project the current point onto the next hyperplane, in cyclic order

KACZ: Geometric Description
[Figure: successive projections, starting from an initial point, onto the hyperplanes of eq. 1, eq. 2 and eq. 3]

KACZ with Relaxation Parameter
KACZ can be used with a relaxation parameter ω
ω = 1: project exactly onto the hyperplane
ω < 1: project in front of the hyperplane
ω > 1: project beyond the hyperplane
Cyclic relaxation: eq. i is assigned its own relaxation parameter ω_i

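A minimal Python sketch of the cyclic projections with relaxation described above. The function name `kaczmarz`, the parameter name `omega` for the relaxation parameter, and the 2x2 test system are illustrative assumptions, not taken from the talk.

```python
def kaczmarz(A, b, x0, omega=1.0, sweeps=100):
    """Cyclic Kaczmarz: move the iterate toward each hyperplane a_i . x = b_i in turn."""
    x = list(x0)
    for _ in range(sweeps):
        for a_i, b_i in zip(A, b):
            norm_sq = sum(v * v for v in a_i)
            residual = b_i - sum(v * xv for v, xv in zip(a_i, x))
            # omega = 1 lands exactly on the hyperplane,
            # omega < 1 stops short of it, omega > 1 goes beyond it.
            step = omega * residual / norm_sq
            x = [xv + step * v for xv, v in zip(x, a_i)]
    return x


# Hypothetical consistent 2x2 system whose exact solution is (1, 1).
A = [[2.0, 1.0], [1.0, 3.0]]
b = [3.0, 4.0]
x_exact_projection = kaczmarz(A, b, [0.0, 0.0], omega=1.0, sweeps=200)
x_over_relaxed = kaczmarz(A, b, [0.0, 0.0], omega=1.5, sweeps=200)
```

Both calls should approach (1, 1); a cyclic variant would simply look up a different `omega` for each equation index.
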
Convergence Properties of KACZ
KACZ with relaxation parameter 0 < ω < 2 converges for consistent systems:
  – Herman, Lent & Lutz, 1978
  – Trummer, 1981
For inconsistent systems, KACZ converges cyclically:
  – Tanabe, 1971
  – Eggermont, Herman & Lent, 1981 (for cyclic relaxation parameters)

Algebraic formulation of KACZ
Given the system Ax = b
Consider the "normal equations" system AA^T y = b, x = A^T y
Well-known fact: KACZ is SOR applied to the normal equations
The relaxation parameter of KACZ is the usual relaxation parameter of SOR

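The equivalence between KACZ and SOR on AA^T y = b can be checked numerically. The sketch below runs a few KACZ sweeps on Ax = b and the same number of SOR sweeps on AA^T y = b, both starting from zero, and compares x with A^T y. The 3x3 matrix, the right-hand side and the value 1.3 for the relaxation parameter are arbitrary illustrative choices.

```python
def dot(u, v):
    return sum(p * q for p, q in zip(u, v))


def kaczmarz_sweep(A, b, x, omega):
    """One cyclic pass of relaxed row projections on A x = b."""
    x = list(x)
    for a_i, b_i in zip(A, b):
        step = omega * (b_i - dot(a_i, x)) / dot(a_i, a_i)
        x = [xv + step * v for xv, v in zip(x, a_i)]
    return x


def sor_sweep(M, b, y, omega):
    """One SOR pass on M y = b, always using the latest values of y."""
    y = list(y)
    for i in range(len(b)):
        y[i] += omega * (b[i] - dot(M[i], y)) / M[i][i]
    return y


A = [[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 4.0]]
b = [1.0, 2.0, 3.0]
omega = 1.3
M = [[dot(r, s) for s in A] for r in A]  # M = A A^T

x = [0.0, 0.0, 0.0]
y = [0.0, 0.0, 0.0]
for _ in range(5):
    x = kaczmarz_sweep(A, b, x, omega)
    y = sor_sweep(M, b, y, omega)

# Map the SOR iterate back to the original unknowns: x = A^T y.
x_from_y = [sum(A[i][j] * y[i] for i in range(3)) for j in range(3)]
max_diff = max(abs(p - q) for p, q in zip(x, x_from_y))
```

The two iterates should agree up to rounding, because changing y_i by some amount changes A^T y by that amount times row i of A, which is exactly the KACZ step.
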
Block Mode & Parallelization
Block KACZ: projection onto the affine subspace defined by a block of eqns
Block-sequential KACZ:
  – partition the eqns into blocks
  – each block consists of independent eqns
  – iterate over the blocks
  – in each block, perform the projections in parallel

CARP: Component-Averaged Row Projections
A block-parallel version of KACZ
The equations are divided into blocks (not necessarily disjoint)
Initial estimate: vector x = (x_1, …, x_n)
Suppose x_1 is a variable (component of x) that appears in 3 blocks
x_1 is "cloned" as y_1, z_1, t_1 in the different blocks
Perform one (or more) KACZ iteration(s) on each block (independently, in parallel)

CARP – Explanation (cont.)
The internal iterations in each block produce 3 new values for the clones of x_1: y_1', z_1', t_1'
The next iterative value of x_1 is x_1' = (y_1' + z_1' + t_1')/3
The next iterate is x' = (x_1', ..., x_n')
Repeat iterations as needed for convergence

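The cloning and averaging steps can be sketched serially in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the names `carp`, `blocks` and `inner_sweeps`, the 4x4 test system and the two-block partition are hypothetical, and the loop over blocks stands in for work that would run on separate processors.

```python
def dot(u, v):
    return sum(p * q for p, q in zip(u, v))


def kacz_sweep(rows, rhs, x, omega):
    """One cyclic pass of relaxed row projections over the given equations."""
    x = list(x)
    for a_i, b_i in zip(rows, rhs):
        step = omega * (b_i - dot(a_i, x)) / dot(a_i, a_i)
        x = [xv + step * v for xv, v in zip(x, a_i)]
    return x


def carp(A, b, blocks, omega=1.0, iterations=300, inner_sweeps=1):
    n = len(A[0])
    # For each block, the set of variables that appear in its equations.
    members = [{j for i in blk for j in range(n) if A[i][j] != 0.0} for blk in blocks]
    x = [0.0] * n
    for _ in range(iterations):
        clones = []
        for blk in blocks:  # independent of each other; could run in parallel
            y = x  # cloning: every block starts from the current iterate
            for _s in range(inner_sweeps):
                y = kacz_sweep([A[i] for i in blk], [b[i] for i in blk], y, omega)
            clones.append(y)
        new_x = list(x)
        for j in range(n):
            # Average component j over the blocks in which it appears.
            vals = [clones[k][j] for k in range(len(blocks)) if j in members[k]]
            if vals:
                new_x[j] = sum(vals) / len(vals)
        x = new_x
    return x


# Hypothetical consistent system with exact solution (1, 2, 3, 4).
A = [[4.0, 1.0, 0.0, 0.0],
     [1.0, 4.0, 1.0, 0.0],
     [0.0, 1.0, 4.0, 1.0],
     [0.0, 0.0, 1.0, 4.0]]
x_true = [1.0, 2.0, 3.0, 4.0]
b = [dot(row, x_true) for row in A]
x_carp = carp(A, b, blocks=[[0, 1], [2, 3]])
max_error = max(abs(p - q) for p, q in zip(x_carp, x_true))
```

With this partition, components 2 and 3 (1-based) appear in both blocks and are averaged over two clones, while the first and last components are each owned by a single block.
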
CARP as Domain Decomposition
[Figure: domains A and B, showing an external grid point of A and the clones of the shared variables]
Note: domains may overlap

Overview of CARP
[Figure: domains A and B each perform KACZ iterations, followed by averaging and cloning]
The whole cycle is KACZ in a superspace (with cyclic relaxation)

Convergence of CARP
Averaging Lemma: the component-averaging and cloning operations of CARP are equivalent to KACZ row projections in a certain superspace (with relaxation parameter ω = 1)
CARP is therefore equivalent to KACZ in the superspace with cyclic relaxation parameters, which is known to converge

CARP Application: Solution of stiff linear systems from PDEs
Elliptic PDEs with a large convection term result in stiff linear systems (large off-diagonal elements)
CARP is very robust on these systems, compared to leading solver/preconditioner combinations
Downside: not always efficient

CARP Application: Electron Tomography (joint work with J.-J. Fernández)
3D reconstructions: each processor is assigned a block of consecutive slices
Data is in overlapping blobs
The blocks are processed in parallel
The values of shared variables are transmitted between the processors that share them, averaged, and redistributed

CARP-CG: CG acceleration of CARP
CARP is KACZ in some superspace (with cyclic relaxation parameters)
Björck & Elfving (BIT, 1979) developed CGMN, a (sequential) CG acceleration of KACZ (double sweep, fixed relaxation parameter)
We extended this result to allow cyclic relaxation parameters
Result: CARP-CG

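A sequential sketch of the CGMN idea, written under the following assumptions: a double sweep (forward, then backward) of relaxed KACZ maps x to Qx + Rb with Q symmetric, and CG is applied to (I - Q)x = Rb, where both Qp and the initial residual are obtained by calling the sweep routine. Function names, the default relaxation value and the test system are illustrative; this is not the authors' code and does not include the parallel averaging of CARP-CG.

```python
def dot(u, v):
    return sum(p * q for p, q in zip(u, v))


def double_sweep(A, b, x, omega):
    """Forward then backward pass of relaxed row projections: returns Q x + R b."""
    x = list(x)
    order = list(range(len(A)))
    for i in order + order[::-1]:
        a_i = A[i]
        step = omega * (b[i] - dot(a_i, x)) / dot(a_i, a_i)
        x = [xv + step * v for xv, v in zip(x, a_i)]
    return x


def cgmn(A, b, omega=1.5, tol=1e-10, max_iter=200):
    n = len(A[0])
    zero_rhs = [0.0] * len(b)
    x = [0.0] * n
    swept = double_sweep(A, b, x, omega)
    r = [s - xv for s, xv in zip(swept, x)]  # r = R b - (I - Q) x
    p = list(r)
    rr = dot(r, r)
    for _ in range(max_iter):
        if rr <= tol * tol:
            break
        Qp = double_sweep(A, zero_rhs, p, omega)  # zero right-hand side gives Q p
        q = [pv - qv for pv, qv in zip(p, Qp)]    # q = (I - Q) p
        alpha = rr / dot(p, q)
        x = [xv + alpha * pv for xv, pv in zip(x, p)]
        r = [rv - alpha * qv for rv, qv in zip(r, q)]
        rr_new = dot(r, r)
        p = [rv + (rr_new / rr) * pv for rv, pv in zip(r, p)]
        rr = rr_new
    return x


# Hypothetical consistent system with exact solution (1, 2, 3, 4).
A = [[4.0, 1.0, 0.0, 0.0],
     [1.0, 4.0, 1.0, 0.0],
     [0.0, 1.0, 4.0, 1.0],
     [0.0, 0.0, 1.0, 4.0]]
x_true = [1.0, 2.0, 3.0, 4.0]
b = [dot(row, x_true) for row in A]
x_cgmn = cgmn(A, b)
max_error = max(abs(p - q) for p, q in zip(x_cgmn, x_true))
```

The matrix A is never squared or transposed explicitly; only row projections are needed, which is what makes the scheme attractive for the row-based parallelization of CARP.
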
CARP-CG: Properties
Same robustness as CARP
Very significant improvement in performance on stiff linear systems derived from elliptic PDEs
Very competitive runtime compared to leading solver/preconditioner combinations on systems derived from convection-dominated PDEs
Improved performance in electron tomography (ET)

CARP-CG: Properties
On one processor, CARP-CG is identical to CGMN
Particularly useful on systems with LARGE off-diagonal elements
  – example: convection-dominated PDEs
Discontinuous coefficients are handled without requiring domain decomposition (DD)

Robustness of CARP-CG
KACZ inherently normalizes the eqns
After normalization, the diagonal elements of AA^T are larger than the off-diagonal ones (in each row)
This is not diagonal dominance, but it makes the normal eqns manageable
Normalization was also found to be useful for discontinuous coefficients

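The effect of normalization on AA^T can be seen on a small example. Once every row has unit norm, each diagonal entry of AA^T equals 1 and each off-diagonal entry is the cosine of the angle between two rows, so its magnitude cannot exceed 1 (Cauchy-Schwarz). The 4x4 matrix below, with large off-diagonal entries, is a made-up illustration and not one of the systems from the talk.

```python
import math


def dot(u, v):
    return sum(p * q for p, q in zip(u, v))


def normalize_rows(A):
    """Scale every row of A to unit Euclidean norm."""
    return [[v / math.sqrt(dot(row, row)) for v in row] for row in A]


# Hypothetical matrix whose off-diagonal entries dominate the diagonal.
A = [[4.0, 30.0, 0.0, 0.0],
     [-20.0, 4.0, 30.0, 0.0],
     [0.0, -20.0, 4.0, 30.0],
     [0.0, 0.0, -20.0, 4.0]]

N = normalize_rows(A)
G = [[dot(r, s) for s in N] for r in N]  # G = N N^T
n = len(G)
diagonal = [G[i][i] for i in range(n)]
largest_off_diagonal = max(abs(G[i][j]) for i in range(n) for j in range(n) if i != j)
```

Row sums of the off-diagonal magnitudes can still exceed 1, which is why the property is weaker than diagonal dominance.
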
The Helmholtz Equation
Eqn: -Δu - k^2 u = f
Wavelength: λ = 2π/k
No. of grid pts per wavelength: N_g = 2π/(kh)
Shifted Laplacian approach:
  – Bayliss, Goldstein & Turkel, 1983
  – Erlangga, Vuik & Oosterlee, 2004/06
The preconditioner (PC) is based on -Δu - (α + iβ) k^2 u = f; multigrid is used to solve the PC

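The relation between wave number, mesh width and grid points per wavelength can be worked through numerically. The sketch below assumes a unit-square domain and truncates 1/h to an integer; that rounding convention and the function name `grid_for_wave_number` are assumptions made for illustration.

```python
import math


def grid_for_wave_number(k, n_g):
    """Mesh width and points per side of the unit square for n_g points per wavelength."""
    wavelength = 2.0 * math.pi / k
    h = wavelength / n_g            # equivalent to n_g = 2*pi / (k*h)
    points_per_side = int(1.0 / h)  # truncation is an assumed rounding convention
    return h, points_per_side


h, m = grid_for_wave_number(600, 8)
unknowns = m * m
```

For k = 600 and 8 points per wavelength this gives 763 points per side, which shows how quickly the system size grows with the wave number: the number of unknowns scales like k^2 in 2D for a fixed number of points per wavelength.
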
The Helmholtz Equation
Bollhöfer, Grote & Schenk, 2009: introduced an algebraic multilevel PC for the Helmholtz eqn in heterogeneous media
Uses symmetric maximum weight matchings and an inverse-based pivoting method
Apologies to many other contributors to this problem!

Experiments
CARP-CG was used with a fixed relaxation parameter of 1.7 in all cases
Domain: unit square [0,1]x[0,1]
2nd-order central difference scheme

Problem 1 (with analytic sol'n)
Based on Erlangga et al '04, §6.1
Eqn: (Δ + k^2)u = (k^2 - 5π^2) sin(πx) sin(2πy)
Bndry condition: u = 0 on all sides
Analytic solution: u = sin(πx) sin(2πy)
Grid points per wavelength: N_g = 6, 8, 10, 12
No. of processors: 1 – 32
k = 100, 300

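As a sanity check on Problem 1, the sketch below applies the standard 5-point, 2nd-order central difference Laplacian to u = sin(πx) sin(2πy) on the unit square and measures how well the discrete equation (Δ + k^2)u = (k^2 - 5π^2)u is satisfied. Since Δu = -5π^2 u exactly, the residual is pure truncation error and should shrink by about a factor of 4 when h is halved. The grid sizes 32 and 64 and the function name are arbitrary illustrative choices.

```python
import math


def u(x, y):
    return math.sin(math.pi * x) * math.sin(2.0 * math.pi * y)


def max_residual(k, n):
    """Largest residual of the discrete Helmholtz eqn at interior points, with h = 1/n."""
    h = 1.0 / n
    worst = 0.0
    for i in range(1, n):
        for j in range(1, n):
            x, y = i * h, j * h
            # 5-point central difference approximation of the Laplacian.
            lap = (u(x - h, y) + u(x + h, y) + u(x, y - h) + u(x, y + h)
                   - 4.0 * u(x, y)) / (h * h)
            rhs = (k * k - 5.0 * math.pi ** 2) * u(x, y)
            worst = max(worst, abs(lap + k * k * u(x, y) - rhs))
    return worst


res_coarse = max_residual(100, 32)
res_fine = max_residual(100, 64)
ratio = res_coarse / res_fine
```

A ratio close to 4 is consistent with the 2nd-order accuracy of the scheme; it says nothing about the pollution error of the discrete solution at large k, which is a separate issue.
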
Problem 2 (homogeneous)
Based on Erlangga et al '04, §6.2
Eqn: Δu + k^2 u = 0
Domain: unit square [0,1]x[0,1]
Dirichlet bndry cond. on one side, with a discontinuity at the midpoint
1st-order absorbing bndry cond. on the other sides
Grid points per wavelength: N_g = 6, 8, 10
No. of processors: 1 – 32
k = 75, 150, 300, 450, 600

Problem 3 (heterogeneous)
3-layer heterogeneous problem
Based on Erlangga et al '04, §6.3
Everything is identical to Problem 2 EXCEPT that k takes a different value in each of the three layers: k = 600, k = 450, k = 300

Comparative Timing Results
Columns: method; time/iter on 1 proc; time/iter on 16 proc; it-ratio on 1 proc; it-ratio on 16 proc
Methods compared: CARP-CG, Bi-CGSTAB, GMRES (restart = 10)
Time/iter of Bi-CGSTAB and GMRES is given relative to CARP-CG:
it-ratio = (time/iter of algorithm) / (time/iter of CARP-CG)
Results from the CARP-CG paper (PARCO, 2010)

Timing and Speedup Results
Columns: No. proc; No. iter; Time (s); Speed-up; Efficiency (%)
Problem 2, k = 600, N_g = 8, grid: 763 x 763, 582,169 (complex) equations, rel-res < 10^-7

Summary
CARP-CG is highly scalable on the Helmholtz eqn with high wave numbers
Applicable to discontinuous coefficients
Very simple to implement
General-purpose: also useful for other problems with large off-diagonal elements and discontinuous coefficients

Other Potential Applications
Fourth-order schemes for the Helmholtz equation (already have good initial results)
Maxwell equations
Saddle-point problems
Circuit problems
Linear solvers in some eigenvalue methods
...

Relevant Publications
CARP: SIAM J Sci Comp 2005
CGMN: ACM Trans Math Software 2008
Microscopy: J Parallel & Distr Comp 2008
Large convection + discontin coef: CMES 2009
CARP-CG: Parallel Comp 2010
Scaling for discont coef: J Comp & Appl Math 2010
CARP-CG SOFTWARE AVAILABLE ON REQUEST
THANK YOU!