A Software Framework for Easy Parallelization of PDE Solvers

1 A Software Framework for Easy Parallelization of PDE Solvers
Hans Petter Langtangen and Xing Cai, Dept. of Informatics, University of Oslo

2 Outline of the Talk
Intro & motivation
Parallelization based on domain decomposition
Parallelization at the linear algebra level
Implementational aspects
Numerical experiments

3 The Question
Starting point: sequential PDE solvers. How to do the parallelization?
We need:
a good parallelization strategy
a good and simple implementation of the strategy
Resulting parallel solvers should have:
good parallel efficiency
good overall numerical performance

4 Problem Domain
Partial differential equations
Finite elements/differences
Communication through message passing

5 A Known Problem
“The hope among early domain decomposition workers was that one could write a simple controlling program which would call the old PDE software directly to perform the subdomain solves. This turned out to be unrealistic because most PDE packages are too rigid and inflexible.” (Smith, Bjørstad and Gropp)
One remedy: use of object-oriented programming techniques

6 Domain Decomposition
Solution of the original large problem through iteratively solving many smaller subproblems
Can be used as solution method or preconditioner
Flexibility: localized treatment of irregular geometries, singularities, etc.
Very efficient numerical methods, even on sequential computers
Suitable for coarse-grained parallelization

7 Overlapping DD
Alternating Schwarz method for two subdomains
Example: solving an elliptic boundary value problem in $\Omega = \Omega_1 \cup \Omega_2$
A sequence of approximations $u_1^n, u_2^n$, where each new subdomain solution uses the latest values from the other subdomain on the artificial internal boundary
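
A minimal sketch, assuming a Poisson model problem $-\nabla^2 u = f$ in $\Omega = \Omega_1 \cup \Omega_2$ with Dirichlet conditions on the physical boundary: for $n = 1, 2, \dots$

$$ -\nabla^2 u_1^n = f \ \text{in } \Omega_1, \qquad u_1^n = u_2^{n-1} \ \text{on } \partial\Omega_1 \cap \Omega_2, $$
$$ -\nabla^2 u_2^n = f \ \text{in } \Omega_2, \qquad u_2^n = u_1^n \ \text{on } \partial\Omega_2 \cap \Omega_1. $$

Each subdomain solve reuses the latest approximation of the other subdomain on the artificial internal boundary.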

8 Additive Schwarz Method
Subproblems can be solved in parallel
Subproblems are of the same form as the original large problem, with possibly different boundary conditions on artificial boundaries
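
In the same sketch notation as above, the additive variant lets all subproblems use data from the previous iteration, so they decouple:

$$ -\nabla^2 u_i^n = f \ \text{in } \Omega_i, \qquad u_i^n = u^{n-1} \ \text{on } \partial\Omega_i \setminus \partial\Omega, \qquad i = 1, \dots, M, $$

after which the global approximation $u^n$ is composed from the subdomain solutions in the overlap regions. This decoupling is exactly what makes the subdomain solves parallel.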

9 Convergence of the Solution
Single-phase groundwater flow

10 Coarse Grid Correction
This DD algorithm is a kind of block Jacobi iteration (CBJ)
Problem: often (very) slow convergence
Remedy: coarse grid correction
A kind of two-grid multigrid algorithm
Coarse grid solve on each processor
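
In preconditioner form, this two-level method can be written in the standard way (a sketch, assuming restriction operators $R_i$ onto the subdomains and $R_0$ onto the coarse grid):

$$ M^{-1} = R_0^T A_0^{-1} R_0 + \sum_{i=1}^{M} R_i^T A_i^{-1} R_i, \qquad A_i = R_i A R_i^T, $$

where the first term is the coarse grid correction; every term can be applied in parallel.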

11 Observations
DD is a good parallelization strategy
The approach is not PDE-specific
A program for the original global problem can be reused (modulo B.C.) for each subdomain
Must communicate overlapping point values
No need for global data
Data distribution implied
Explicit temporal schemes are a special case where no iteration is needed (“exact DD”)

12 Goals for the Implementation
Reuse sequential solver as subdomain solver
Add DD management and communication as separate modules
Collect common operations in generic library modules
Flexibility and portability
Simplified parallelization process for the end-user

13 Generic Programming Framework

14 The Administrator
Parameters: solution method or preconditioner, max iterations, stopping criterion, etc.
DD algorithm: subdomain solve + coarse grid correction
Operations: matrix-vector product, inner product, etc.
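
A hypothetical C++ sketch of an administrator carrying these three responsibilities (illustrative names and members only, not the actual Diffpack API):

#include <vector>

class SubdomainSimulator;   // generic subdomain interface (cf. slide 18)

class Administrator
{
public:
  // Parameters: solution method or preconditioner, max iterations,
  // stopping criterion, ...
  void setMaxIterations (int n) { maxIter = n; }

  // DD algorithm: parallel subdomain solves + coarse grid correction,
  // repeated until the stopping criterion is met.
  void solve ();

private:
  // Operations: matrix-vector product, inner product, etc., with the
  // communication hidden in a separate module (cf. slide 16).
  std::vector<SubdomainSimulator*> subdomains;
  int maxIter;
};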

15 The Subdomain Simulator
Subdomain simulator = sequential solver + add-on communication

16 The Communicator
Need functionality for exchanging point values inside the overlapping regions
Build a generic communication module: the communicator
The communicator works with a hidden communication model; MPI in use, but easy to change
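
A minimal sketch of what such a communicator might do for one neighbour exchange, assuming each subdomain knows the local indices of shared overlap points (illustrative code, not the actual Diffpack communicator):

#include <mpi.h>
#include <cstddef>
#include <vector>

// Exchange point values in the overlap region with one neighbour.
// sendIdx/recvIdx list the local indices of the shared points.
void exchangeOverlap (std::vector<double>& u,
                      const std::vector<int>& sendIdx,
                      const std::vector<int>& recvIdx,
                      int neighbourRank)
{
  std::vector<double> sendBuf(sendIdx.size()), recvBuf(recvIdx.size());
  for (std::size_t i = 0; i < sendIdx.size(); ++i)
    sendBuf[i] = u[sendIdx[i]];

  MPI_Sendrecv (sendBuf.data(), (int)sendBuf.size(), MPI_DOUBLE, neighbourRank, 0,
                recvBuf.data(), (int)recvBuf.size(), MPI_DOUBLE, neighbourRank, 0,
                MPI_COMM_WORLD, MPI_STATUS_IGNORE);

  for (std::size_t i = 0; i < recvIdx.size(); ++i)
    u[recvIdx[i]] = recvBuf[i];
}

Only the index lists differ from application to application, so the same routine serves any grid; swapping MPI for another message-passing layer would only touch this module.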

17 Realization
Object-oriented programming (C++, Java, Python)
Use inheritance:
simplifies modularization
supports reuse of sequential solver (without touching its source code!)

18 Generic Subdomain Simulators
SubdomainSimulator: abstract interface to all subdomain simulators, as seen by the Administrator
SubdomainFEMSolver: special case of SubdomainSimulator for finite element-based simulators
These are generic classes, not restricted to specific application areas
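
A hypothetical sketch of the shape of this hierarchy (illustrative; only createLocalMatrix is taken from the next slide, the other details are assumptions):

// Hypothetical sketch; the actual Diffpack declarations differ.
class SubdomainSimulator            // abstract view seen by the Administrator
{
public:
  virtual ~SubdomainSimulator () {}
  virtual void createLocalMatrix () = 0;   // build the subdomain system
};

class SubdomainFEMSolver : public SubdomainSimulator
{
  // adds finite element specifics: local grid, assembly, boundary
  // conditions on the artificial internal boundaries, ...
};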

19 Making the Simulator Parallel

class SimulatorP : public SubdomainFEMSolver, public Simulator
{
  // ... just a small amount of code
  virtual void createLocalMatrix ()
  { Simulator::makeSystem (); }
};

Class hierarchy: the Administrator works with the SubdomainSimulator interface; SimulatorP derives from both SubdomainFEMSolver (a SubdomainSimulator) and the existing sequential Simulator.
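
The design choice is multiple inheritance: SimulatorP is at once a SubdomainFEMSolver, so the Administrator can drive it through the generic interface, and a Simulator, so the existing sequential code (here Simulator::makeSystem) is reused verbatim, without touching its source.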

20 Performance
Algorithmic efficiency: efficiency of original sequential simulator(s); efficiency of domain decomposition method
Parallel efficiency: communication overhead (low); coarse grid correction overhead (normally low); load balancing (subproblem size, work on subdomain solves)
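
For reference, with $T_P$ the wall-clock time on $P$ processors, the usual definitions are

$$ S(P) = \frac{T_1}{T_P}, \qquad \eta(P) = \frac{S(P)}{P}, $$

and the algorithmic factor matters because the number of DD iterations may itself grow with the number of subdomains.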

21 Summary So Far
A generic approach
Works if the DD algorithm works
Make use of class hierarchies
The new parallel-specific code, SimulatorP, is very small and simple to write

22 Application
Single-phase groundwater flow
DD as the global solution method
Subdomain solvers use CG+FFT
Fixed number of subdomains M=32 (independent of P)
Straightforward parallelization of an existing simulator
P: number of processors

23 Diffpack
O-O software environment for scientific computation
Rich collection of PDE solution components: portable, flexible, extensible
H. P. Langtangen: Computational Partial Differential Equations, Springer, 1999

24 Straightforward Parallelization
Develop a sequential simulator, without paying attention to parallelism
Follow the Diffpack coding standards
Need Diffpack add-on libraries for parallel computing
Add a few new statements for transformation to a parallel simulator

25 Linear-algebra-level Approach
Parallelize matrix/vector operations:
inner product of two vectors (sketched below)
matrix-vector product
preconditioning: block contribution from subgrids
Easy to use:
access to all Diffpack v3.0 iterative methods, preconditioners and convergence monitors
“hidden” parallelization: need only to add a few lines of new code
arbitrary choice of number of procs at run-time
Less flexibility than DD
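
A minimal sketch of one such parallelized operation, the inner product (illustrative MPI code; Diffpack hides this inside its linear algebra classes):

#include <mpi.h>
#include <cstddef>
#include <vector>

// Global inner product: local part + one global reduction.
// Points duplicated in overlap regions must be counted only once,
// here via an ownership mask (1.0 for owned points, 0.0 for copies).
double parallelInnerProd (const std::vector<double>& x,
                          const std::vector<double>& y,
                          const std::vector<double>& owned)
{
  double local = 0.0, global = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i)
    local += owned[i] * x[i] * y[i];
  MPI_Allreduce (&local, &global, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
  return global;
}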

26 Library Tool
class GridPartAdm:
Generate overlapping or non-overlapping subgrids
Prepare communication patterns
Update global values: matvec, innerProd, norm

27 Mesh Partition Example

28 A Simple Coding Example
GridPartAdm* adm;   // access to parallelization functionality
LinEqAdm* lineq;    // administrator for linear system & solver
// ...
#ifdef PARALLEL_CODE
adm->scan (menu);
adm->prepareSubgrids ();
adm->prepareCommunication ();
lineq->attachCommAdm (*adm);
#endif
lineq->solve ();

Accompanying run-time menu input:

set subdomain list = DEFAULT
set global grid = grid1.file
set partition-algorithm = METIS
set number of overlaps = 0

29 Single-phase Groundwater Flow
Highly unstructured grid
Discontinuity in the coefficient K (0.1 & 1)

30 Measurements
130,561 degrees of freedom
Overlapping subgrids
Global BiCGStab using (block) ILU prec.

31 A Fast FEM N-S Solver
Operator splitting in the tradition of pressure correction, velocity correction, and Helmholtz decomposition
This version is due to Ren & Utnes, 1993

32 A Fast FEM N-S Solver
Calculation of an intermediate velocity
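
In standard pressure-correction form, this step computes an intermediate velocity $u^*$ from the momentum equation with the old pressure (a sketch of the generic scheme; the exact Ren & Utnes (1993) variant may differ in its treatment of the individual terms):

$$ \frac{u^* - u^n}{\Delta t} = -(u^n \cdot \nabla) u^n + \nu \nabla^2 u^n - \nabla p^n + f^n. $$

In general $\nabla \cdot u^* \neq 0$; the next step corrects this.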

33 A Fast FEM N-S Solver
Solution of a Poisson equation
Correction of the intermediate velocity
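
In the same sketch notation, a Poisson equation is solved for a pressure increment $\phi$, and $u^*$ is projected onto a divergence-free field:

$$ \nabla^2 \phi = \frac{1}{\Delta t} \nabla \cdot u^*, \qquad u^{n+1} = u^* - \Delta t \, \nabla \phi, \qquad p^{n+1} = p^n + \phi. $$

The Poisson solve dominates the CPU time, which makes it the natural target for parallelization (cf. the CG measurements on slide 39).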

34 Test Case: Vortex-Shedding

35 Simulation Snapshots: Pressure

36 Animated Pressure Field

37 Simulation Snapshots: Velocity

38 Animated Velocity Field

39 Some CPU-Measurements
The pressure equation is solved by the CG method

40 Combined Approach
Use a CG-like method as basic solver (i.e. use a parallelized Diffpack linear solver)
Use DD as preconditioner (i.e. SimulatorP is invoked as a preconditioner solve)
Combine with coarse grid correction
CG-like method + DD prec. is normally faster than DD as a basic solver

41 Two-phase Porous Media Flow
SEQ: saturation equation; PEQ: pressure equation
BiCGStab + DD prec. for global pressure eq.
Multigrid V-cycle in subdomain solves

42 Two-phase Porous Media Flow
Simulation result obtained on 16 processors

43 Two-phase Porous Media Flow
History of saturation for water and oil

44 Nonlinear Water Waves

45 Nonlinear Water Waves
Fully nonlinear 3D water waves
Primary unknowns:
Parallelization based on an existing sequential Diffpack simulator

46 Nonlinear Water Waves
CG + DD prec. for global solver
Multigrid V-cycle as subdomain solver
Fixed number of subdomains M=16 (independent of P)
Subgrids from partition of a global 41x41x41 grid

47 Nonlinear Water Waves
3D Poisson equation in water wave simulation

48 Application
Test case: 2D linear elasticity, 241 x 241 global grid
Vector equation
Straightforward parallelization based on an existing Diffpack simulator

49 2D Linear Elasticity

50 2D Linear Elasticity
BiCGStab + DD prec. as global solver
Multigrid V-cycle in subdomain solves
I: number of global BiCGStab iterations needed
P: number of processors (P = #subdomains)

51 Summary
Goal: provide software and programming rules for easy parallelization of sequential simulators
Two parallelization strategies:
domain decomposition: very flexible, compact visible code/algorithm
parallelization at the linear algebra level: “automatic” hidden parallelization
Performance: satisfactory speed-up

52 Future Application
DD with different PDEs and local solvers:
Out in deep sea: Eulerian, finite differences, Boussinesq PDEs, F77 code
Near shore: Lagrangian, finite elements, shallow water PDEs, C++ code

