
1 Computational Science R&D for Electromagnetic Modeling: Recent Advances and Perspective to Extreme-Scale
Lie-Quan Lee, for the SLAC Computational Team
ComPASS All-Hands Meeting, Boulder, CO, October 2009

2 Overview
*Recent advances in CS/AM for electromagnetic modeling
–Eigensolvers
–Meshing
–Load balancing
–Visualization
*Perspective to extreme-scale
–Extreme-scale problems
–Perspective from computational science

3 Frequency Domain Eigenmode Analysis: Omega3P
Find the frequency, quality factor, and field vector of cavity modes.
Maxwell's equations in the frequency domain, ∇×(μ⁻¹∇×E) = ω²εE, discretized with the ACE3P finite element method, yield a generalized eigenvalue problem Kx = λMx. The eigenvalues of interest are interior eigenvalues, and with more complex boundary conditions the problem can become a complex nonlinear eigenvalue problem.
Curved tetrahedral finite elements with higher-order vector basis functions Nᵢ:
–For order p=2: 20 different Nᵢ's
–For order p=6: 216 different Nᵢ's
(Figure: vector basis functions N₁, N₂ on a curved tetrahedral element.)
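The slide's equations were images in the original page. As a minimal, hedged illustration of the FEM-to-GEP step, here is a toy 1D Laplacian discretization (not the Maxwell curl-curl system) that produces a generalized eigenvalue problem of the same Kx = λMx form:

```python
# Toy illustration: 1D Laplacian finite elements on [0, 1] with linear
# "hat" functions yield a generalized eigenvalue problem K x = lambda M x,
# structurally like the one Omega3P assembles (which uses curved tets
# and higher-order vector bases instead).
import numpy as np
import scipy.sparse as sp
from scipy.linalg import eigh

n = 50                  # interior nodes
h = 1.0 / (n + 1)
K = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(n, n)) / h        # stiffness
M = sp.diags([1, 4, 1], [-1, 0, 1], shape=(n, n)) * (h / 6.0)  # mass

vals = eigh(K.toarray(), M.toarray(), eigvals_only=True)
print(vals[:3])  # close to (k*pi)^2 = 9.87, 39.5, 88.8 for k = 1, 2, 3
```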

4 Example Spectrum of the Eigensystem
(Figure: example spectrum of the eigensystem, with a zoom-in on the eigenvalue of interest.)

5 Eigensolver and Linear Solver for Modeling Accelerator Cavities
*To solve the eigensystem for interior eigenvalues
*Use the shift-invert spectral transformation: apply (K − σM)⁻¹M with the shift σ near the eigenvalues of interest
*Need to solve a highly indefinite linear system
–Sparse direct solver, or an iterative solver with a good preconditioner
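A minimal sketch of the shift-invert transformation, using SciPy/ARPACK on toy stand-in matrices; in Omega3P the real K and M come from the finite-element assembly, and the inner solve is the expensive indefinite system discussed above:

```python
# Shift-invert eigenanalysis sketch: ARPACK applies (K - sigma*M)^{-1} M
# internally (via a sparse direct factorization) and returns the
# eigenvalues closest to the shift sigma.
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

n = 200
# Hypothetical stand-ins: a sparse symmetric "stiffness" K and "mass" M.
K = sp.diags(np.linspace(0.0, 100.0, n)).tocsc()
M = sp.identity(n, format="csc")

sigma = 10.0  # shift placed in the interior of the spectrum
vals, vecs = eigsh(K, k=4, M=M, sigma=sigma, which="LM")
print(vals)   # the four eigenvalues nearest sigma
```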

6 Memory Configuration of Recent Supercomputers
Memory-usage scalability of algorithms is a critical issue!

7 Memory Usage of a Sparse Direct Solver
*Maximal per-core memory usage is 4-5 times larger than the average
*Once the problem cannot fit into N cores, it most likely will not fit into 2N cores either
*A more memory-scalable solver is needed
(Figure: MUMPS per-core memory usage; N = 1.11M, nnz = 46.1M, complex matrix.)

8 Another Sparse Direct Solver
*Speed: scales better
*Memory usage: scales poorly (the bottleneck)

9 Memory Bottleneck
*Need solvers that are scalable in memory usage
–Hybrid linear solver (LBNL)
–Domain-specific spectral multilevel preconditioner
–Scalable eigensolvers (today's topic)

10 Construct Explicit Gradient Space G
*Tree-cotree splitting for the lowest-order vector basis functions (see the sketch below)
–Build a minimum spanning tree over the mesh edges (edges on electric boundary conditions need special care)
–Remove all the DOFs on the tree edges
–Add the gradients of the vertex basis functions as the replacement basis functions
*Explicit formulation of the gradient space for higher-order basis functions
(Figure: example of tree-cotree splitting for a 2D circular cavity.)
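A rough sketch of the tree-cotree pass on a toy edge graph, with networkx's spanning-tree routine standing in for the MST step; the special handling of electric-boundary edges and the higher-order gradient construction are omitted:

```python
# Tree-cotree splitting sketch: build a spanning tree over the mesh
# edges, drop the edge DOFs on tree edges, keep the cotree edge DOFs.
import networkx as nx

# Hypothetical toy mesh: the edges of two triangles sharing edge (1, 2).
mesh_edges = [(0, 1), (1, 2), (0, 2), (1, 3), (2, 3)]

G = nx.Graph(mesh_edges)
tree = nx.minimum_spanning_tree(G)

tree_edges = set(map(frozenset, tree.edges()))
# DOFs on tree edges are removed; gradients of the vertex functions
# replace them, while cotree edges keep their edge DOFs.
cotree_edges = [e for e in mesh_edges if frozenset(e) not in tree_edges]
print("tree:", sorted(tuple(sorted(e)) for e in tree_edges))
print("cotree:", cotree_edges)
```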

11 New Scalable Eigensolver: Method 1
Decompose the finite element bases {Nᵢ} into a gradient space G and a rotational space R. That puts the GEP Kx = λMx into two-by-two block form,
K = [ K₁₁ 0 ; 0 0 ],  M = [ M₁₁ M₁₂ ; M₂₁ M₂₂ ],
where K₁₁ is symmetric positive definite. Thus the null space of the matrix K is Y = [ 0 ; I ].

12 More Scalable Eigensolver (continued)
*We can prove that the original GEP Kx = λMx has the same nonzero eigenvalues as the eigenvalue problem (I − YYᵀ)M⁻¹Kx = λx
*We use the Arnoldi algorithm to compute the smallest (extreme) eigenvalues of this EP
*(I − YYᵀ) is very easy to apply because Y = [ 0 ; I ] has a trivial 0/1 block structure
*Corresponding eigenvectors can be recovered by M-orthogonalization against the null space Y: x ← x − Y(YᵀMY)⁻¹YᵀMx
*The benefit: we can solve Mp = q in a very scalable way (in memory usage), for example with conjugate gradient and an incomplete Cholesky preconditioner
A sketch of this projected operator follows below.
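A minimal sketch, on toy diagonal stand-ins (not Omega3P's matrices), of applying the Method 1 operator (I − YYᵀ)M⁻¹K with CG as the M-solve; Y here is the single null vector of the toy K, assumed orthonormal. The final check confirms the nonzero eigenvalues match the original pencil:

```python
# Method 1 operator sketch: q = (I - Y Y^T) M^{-1} K p, with a CG solve
# for M (preconditioned in practice). In a real run this matvec would be
# wrapped in a LinearOperator and handed to an Arnoldi routine.
import numpy as np
import scipy.sparse as sp
from scipy.linalg import eigh
from scipy.sparse.linalg import cg

n = 100
M = sp.identity(n, format="csc") * 2.0           # SPD "mass" matrix
K = sp.diags(np.arange(n, dtype=float)).tocsc()  # singular "stiffness"
Y = np.zeros((n, 1)); Y[0, 0] = 1.0              # null-space basis of K

def apply_op(p):
    z, _ = cg(M, K @ p)        # the scalable M-solve
    return z - Y @ (Y.T @ z)   # project out the null space

# Check on the toy problem: the projected operator reproduces the
# nonzero eigenvalues of the original pencil K x = lambda M x.
A = np.column_stack([apply_op(e) for e in np.eye(n)])
print(np.sort(np.linalg.eigvals(A).real)[:4])
print(eigh(K.toarray(), M.toarray(), eigvals_only=True)[:4])
```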

13 Preliminary Results for Method 1
*DDS cell; compute the first two nonzero eigenpairs
*Shift-invert: number of applications of (K − σM)⁻¹M: 53
*New method: number of applications of (I − YYᵀ)M⁻¹K: 1361
*Remember that applying M⁻¹ is much more scalable and cheaper than applying (K − σM)⁻¹
(Figure: computer model of 1/8th of the DDS cell.)

14 Caveats for Method 1
*The residual of the transformed eigenvalue problem cannot be driven very small (not a big issue)
–After transforming back, the residual of the original eigenvalue problem is small
*Needs a larger search space
*The convergence of the eigenvalue problem is mesh-dependent
–The Arnoldi method makes the extreme (smallest and largest) eigenvalues converge first
–A denser mesh makes the largest eigenvalue larger! Deflate the converged but unwanted eigenvalues?
No. of elements:   913 | 19788
No. of iterations: 1361 | 3469

15 New Thoughts
*Consider the block eigen-system [ K₁₁ 0 ; 0 0 ][ x₁ ; x₂ ] = λ [ M₁₁ M₁₂ ; M₂₁ M₂₂ ][ x₁ ; x₂ ]; for λ ≠ 0 the second block row gives x₂ = −M₂₂⁻¹M₂₁x₁
*That is equivalent to the reduced problem K₁₁x₁ = λ(M₁₁ − M₁₂M₂₂⁻¹M₂₁)x₁

16 Another New Scalable Eigensolver: Method 2
*Make a transformation: reduce the block GEP to an ordinary eigenvalue problem for A = K₁₁⁻¹(M₁₁ − M₁₂M₂₂⁻¹M₂₁), whose largest eigenvalues 1/λ correspond to the smallest nonzero λ
*Use the Arnoldi method on A
*Each matrix-vector multiplication Ap = q requires two solves: one with M₂₂ and the other with K₁₁
*Use conjugate gradient with an incomplete Cholesky preconditioner for the inner solves
A sketch of one application of A follows below.
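A minimal sketch of one matvec Ap = q on toy block matrices. The blocks are stand-ins, and SciPy's ILU (spilu) is used where the slides call for incomplete Cholesky, since SciPy ships no incomplete-Cholesky factorization:

```python
# Method 2 matvec sketch: A p = K11^{-1} (M11 - M12 M22^{-1} M21) p,
# i.e. one preconditioned CG solve with M22 and one with K11 per
# application; the LinearOperator is what an Arnoldi routine would see.
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import LinearOperator, cg, spilu

n = 80
K11 = sp.diags(np.arange(1.0, n + 1)).tocsc()  # SPD rotational-space block
M11 = sp.identity(n, format="csc") * 2.0
M22 = sp.identity(n, format="csc") * 3.0
M12 = sp.random(n, n, density=0.05, format="csc", random_state=0) * 0.1
M21 = M12.T.tocsc()

# ILU preconditioners standing in for incomplete Cholesky.
pcK = LinearOperator((n, n), matvec=spilu(K11).solve)
pcM = LinearOperator((n, n), matvec=spilu(M22).solve)

def apply_A(p):
    y, _ = cg(M22, M21 @ p, M=pcM)  # inner solve with M22
    r = M11 @ p - M12 @ y           # Schur-complement action of M
    q, _ = cg(K11, r, M=pcK)        # inner solve with K11
    return q

A = LinearOperator((n, n), matvec=apply_A)  # ready for an Arnoldi solver
print(apply_A(np.ones(n))[:5])
```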

17 Preliminary Results for Method 2
*DDS cell; compute the first two nonzero eigenpairs
*Shift-invert: number of applications of (K − σM)⁻¹M: 53
*New method: number of applications of A: 38
*Remember that applying A is much more scalable and cheaper than applying (K − σM)⁻¹
(Figure: computer model of 1/8th of the DDS cell.)

18 Future Work
K₁₁ becomes more and more difficult to solve as the mesh gets denser
–Further study is needed to solve it efficiently (e.g., can convergence be made independent of the mesh size?)
(Figure: spectra of K₁₁ from different meshes.)

19 Multi-file NetCDF Format
*SLAC and RPI collaborated on parallel mesh generation for meshes with large numbers of elements
*A multi-file NetCDF format was designed to remove the synchronized-parallel-writing bottleneck (see the sketch below)
*Preliminary testing has shown the success and efficacy of the format
(Figure: the SLAC finite-element simulation suite reading a multi-file NetCDF mesh: part files File1-File7 plus a summary file.)
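A minimal sketch of the idea, one NetCDF file per mesh part plus a small summary file, written with the netCDF4 package. All file, dimension, and variable names here are hypothetical; the actual SLAC/RPI layout is not specified on this slide:

```python
# Multi-file mesh write sketch: each writer owns its own part file, so
# no synchronized parallel write is needed; a summary file records how
# many part files exist.
import numpy as np
from netCDF4 import Dataset

def write_part(rank, coords, connectivity):
    with Dataset(f"mesh_part{rank}.nc", "w") as nc:
        nc.createDimension("num_nodes", coords.shape[0])
        nc.createDimension("num_elems", connectivity.shape[0])
        nc.createDimension("dim", 3)
        nc.createDimension("nodes_per_elem", 4)  # tetrahedra
        nc.createVariable("coords", "f8", ("num_nodes", "dim"))[:] = coords
        nc.createVariable("connect", "i4",
                          ("num_elems", "nodes_per_elem"))[:] = connectivity

def write_summary(num_parts):
    with Dataset("mesh_summary.nc", "w") as nc:
        nc.num_parts = num_parts  # global attribute pointing at part files

# Toy single-part example: one tetrahedron.
write_part(0, np.eye(4, 3), np.array([[0, 1, 2, 3]], dtype=np.int32))
write_summary(1)
```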

20 Effects of Inverted Curved Elements
*An inverted curved element: e.g., a curved edge crosses the element volume at points other than the vertices (see the 2D sketch below)
(Figure: 2D example of an inverted curved element.)
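A minimal sketch of one common way to detect inversion, sampling the sign of the Jacobian determinant of a quadratic (6-node) triangle, the 2D analogue of the curved tetrahedron; the sample points and the non-positive-determinant criterion are illustrative, not the RPI tool's algorithm:

```python
# Inverted-curved-element check: if det J of the element map is not
# positive everywhere sampled, the element folds over itself.
import numpy as np

def jacobian_det(nodes, xi, eta):
    """det J of the 6-node quadratic triangle map at reference (xi, eta)."""
    l1 = 1.0 - xi - eta
    # derivatives of the 6 quadratic shape functions (corners 1-3, midsides 4-6)
    dxi  = np.array([1 - 4*l1, 4*xi - 1, 0.0, 4*(l1 - xi), 4*eta, -4*eta])
    deta = np.array([1 - 4*l1, 0.0, 4*eta - 1, -4*xi, 4*xi, 4*(l1 - eta)])
    J = np.stack([dxi @ nodes, deta @ nodes])  # rows: d(x,y)/dxi, d(x,y)/deta
    return np.linalg.det(J)

def is_inverted(nodes):
    # sample where inversion shows up first: the corners, plus the centroid
    samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1/3, 1/3)]
    return min(jacobian_det(nodes, *s) for s in samples) <= 0.0

# Straight-sided triangle: midside nodes at the edge midpoints.
good = np.array([[0,0],[1,0],[0,1],[0.5,0],[0.5,0.5],[0,0.5]], float)
# Slide the bottom midside node past the quarter point toward a corner:
# the curved edge folds back and det J goes negative at that corner.
bad = good.copy(); bad[3] = [0.1, 0.0]
print(is_inverted(good), is_inverted(bad))  # expect: False True
```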

21 Spectra Comparison for Frequency Domain Analysis
(Figure: red — from the original mesh with inverted curved elements; blue — from the fixed mesh.)

22 Spectra Comparison for Frequency Domain Analysis
(Figure: red — from the original mesh with inverted curved elements; blue — from the fixed mesh. Note the abnormal eigenvalue in the red spectrum.)

23 Impact on Frequency Domain Analysis
*The largest eigenvalue from the mesh with an inverted curved element is an order of magnitude larger
*The small nonzero eigenvalues do not change much! (Good)
*If a sparse direct solver (SDS) is used for the shifted linear system, the impact is minimal because the SDS is relatively robust

24 Spectra Comparison for Time Domain Analysis
(Figure: red — from the original mesh with inverted curved elements; blue — from the fixed mesh.)

25 Spectra Comparison for Time Domain Analysis
(Figure: red — from the original mesh with inverted curved elements; blue — from the fixed mesh. Note the abnormal eigenvalue in the red spectrum.)

26 Impact on Time Domain Analysis
*The largest eigenvalue from the mesh with an inverted curved element is an order of magnitude larger
*The smallest eigenvalue does not change
*The condition number of a matrix is the ratio of the largest eigenvalue to the smallest eigenvalue
*Convergence of the iterative solver suffers greatly because of the order-of-magnitude-larger condition number (see the sketch below)
*The noise associated with this large eigenvalue is unknown (likely to have adverse effects over longer times)
*Correcting inverted curved elements and controlling element shape are crucial!
–RPI implemented an inverted-curved-element correction tool
–They recently added a shape-control measure
–The geometry-mesh relation is very important, but it is missing from the tool
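A small sketch of why the order-of-magnitude jump in the largest eigenvalue hurts the iterative solver: CG iteration counts grow roughly like √κ, where κ = λmax/λmin. The matrices here are synthetic diagonal stand-ins, not assembled FEM systems:

```python
# Count CG iterations on SPD matrices whose condition numbers differ by
# one order of magnitude, mimicking the effect of an inverted element.
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import cg

def cg_iterations(lmax, n=500, lmin=1.0):
    A = sp.diags(np.linspace(lmin, lmax, n)).tocsc()  # spectrum [lmin, lmax]
    iters = []
    cg(A, np.ones(n), callback=lambda xk: iters.append(1))
    return len(iters)

for lmax in (1e2, 1e3):  # condition numbers one order of magnitude apart
    print(f"kappa = {lmax:.0e}: {cg_iterations(lmax)} CG iterations")
```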

27 Mesh Partitioning for Balanced Load
*Mismatch
–Partitioning is done on mesh elements
–The load depends on the number of DOFs
*Need an improved graph model for unstructured meshes (see the sketch below)
–The current graph model represents only element face sharing
–Tetrahedral elements sharing only edges will also be represented accurately
–Graph edges are weighted to balance the number of degrees of freedom
*Refine the partitioning to balance the number of DOFs (>10k cores)
(Figure: partitioned mesh with 56 million elements.)
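A minimal sketch of the improved graph model: a dual graph over elements in which tetrahedra sharing a face or only an edge are connected, with node weights set to per-element DOF counts. The weighted graph would then go to a partitioner such as ParMETIS (call not shown); the connectivity, DOF count, and edge weights below are toy assumptions:

```python
# Build a weighted element dual graph: nodes = tets (weighted by DOFs),
# edges between tets sharing a face (3 common vertices) or an edge (2),
# with heavier weight on face sharing.
from itertools import combinations
from collections import defaultdict
import networkx as nx

tets = [(0, 1, 2, 3), (1, 2, 3, 4), (2, 3, 4, 5), (0, 1, 5, 6)]
dofs_per_elem = 20  # e.g., order p=2: 20 basis functions per tet

edge_to_elems = defaultdict(set)
for e, tet in enumerate(tets):
    for pair in combinations(sorted(tet), 2):  # the 6 edges of a tet
        edge_to_elems[pair].add(e)

G = nx.Graph()
G.add_nodes_from((e, {"weight": dofs_per_elem}) for e in range(len(tets)))
for elems in edge_to_elems.values():
    for a, b in combinations(sorted(elems), 2):
        shared = len(set(tets[a]) & set(tets[b]))
        G.add_edge(a, b, weight=3 if shared >= 3 else 1)

print(G.edges(data=True))
```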

28 Visualization *Greg Schussman’s talk today

29 Extreme-Scale Computing Need: ILC/Project X
Cryomodule for the International Linear Collider / Project X
What we accomplished:
–Frequency domain: 10⁷ elements, 10⁸ DOFs
–Time domain: 10⁸ elements, 10⁹ DOFs
Physics goal:
–Broadband beam heating in a superconducting cryomodule (~10 m long) with a 300 μm beam size
Problem size:
–5 × 10¹² elements, > 10¹³ DOFs, 10²² flops
HEP Extreme-Scale Workshop, SLAC, Dec. 2008

30 Extreme-Scale Computing Need: CLIC
Two-beam module for CLIC: drive-beam PETS coupled to the main-beam accelerating structure.
Extreme-scale run: coupling, wakefield, and dark current
–5 × 10¹³ elements, ~10¹⁴ DOFs, 100K time steps, 10²⁴ flops
Preliminary study (Arno Candel): 17 million elements with 20 million DOFs
(Figure: a snapshot of the fields in the coupled PETS/accelerating-structure system, by Arno Candel.)
HEP Extreme-Scale Workshop, SLAC, Dec. 2008

31 Omega3P Success in EM Modeling
An increase of 10⁵ in problem size, at 0.01% relative error, over a decade; from closed cavities to waveguide-loaded cavities.
(Figure: problem size vs. year, tracking Moore's law: 2D cell, 3D cell, 2D detuned structure, 3D detuned structure, SCR cavity, cryomodule, RF unit.)

32 Current Simulation and Analysis Flow
CAD model → mesh generation → NetCDF mesh file → partition mesh (with input parameters) → assemble matrices → frequency-/time-domain solvers → postprocess E/B fields → analysis & visualization

33 Path to Extreme-Scale Computing
*Meshing (ITAPS)
–Parallel mesh generation
–Adaptive mesh refinement
–Online mesh generation
*Partitioning and load balancing (CSCAPES & ITAPS)
–Improved scalability and better balancing
*Solvers (TOPS)
–Speed and memory-usage scalability
–Advancement at different levels: use new methods (domain decomposition, discontinuous Galerkin, etc.); devise new algorithms for eigensolvers, linear solvers, and preconditioners
*Visualization and analysis (IUSV)
–Parallel visualization
–Explorative methods
–Integrated simulation and analysis

