Published by Sawyer Norrie. Modified about 1 year ago.

1
Weighted Matrix Reordering and Parallel Banded Preconditioners for Nonsymmetric Linear Systems
Murat Manguoğlu*, Mehmet Koyutürk**, Ananth Grama* and Ahmed Sameh*
* Purdue University
** Case Western Reserve University
Support: DARPA, NSF, Intel, NCSA

2
A computational loop
- Integration
- Newton Iteration
- Linear system solvers

3
Motivation
- New architectures increasingly rely on parallelism
- Concurrency and localization play an important role
- Algorithms for such platforms must account for concurrency and memory references

4
Implications: General Sparse Solvers
- Maximal use of dense kernels
- Development of methods that optimize concurrency
- A banded matrix is a natural candidate as a preconditioner

5
Preprocessing to Obtain the Preconditioner
(BiCGStab/GMRES is used as the iterative solver)
- ILUPACK: Multilevel ILU [Bollhöfer]
  http://www.math.tu-berlin.de/ilupack/
- ILUT: Incomplete LU Factorization from SPARSKIT [Saad]
  http://www-users.cs.umn.edu/~saad/software/SPARSKIT/sparskit.html
- ILUT-I: Improved ILUT [Benzi et al.]
  1. Reorder using HSL-MC64 to maximize the product of the diagonal entries, and scale the matrix
  2. Apply symmetric RCM reordering
  3. Compute the incomplete factorization via ILUT
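The three ILUT-I steps can be sketched in Python with SciPy. This is only an illustration under loud assumptions, not the code behind the slides: HSL-MC64 is replaced by SciPy's `min_weight_full_bipartite_matching` applied to logarithmic costs (which yields a maximum-product matching but no scaling), and SPARSKIT's ILUT is replaced by `spilu`, SciPy's wrapper around SuperLU's incomplete LU. The function names, the drop tolerance and the test matrix are all hypothetical.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.csgraph import (min_weight_full_bipartite_matching,
                                  reverse_cuthill_mckee)
from scipy.sparse.linalg import LinearOperator, bicgstab, spilu


def max_product_row_perm(A):
    """Row permutation p so that A[p] has a diagonal of maximal absolute product."""
    C = sp.csr_matrix(A, copy=True)
    C.eliminate_zeros()
    C = C.tocoo()
    mag = np.abs(C.data)
    # Maximizing prod |a_ij| equals minimizing sum -log|a_ij|; the constant
    # shift keeps every cost strictly positive without changing the optimum.
    cost = np.log(mag.max()) - np.log(mag) + 1.0
    G = sp.csr_matrix((cost, (C.row, C.col)), shape=C.shape)
    rows, cols = min_weight_full_bipartite_matching(G)
    p = np.empty(C.shape[0], dtype=int)
    p[cols] = rows                      # the row matched to column j moves to position j
    return p


def reorder_and_factor(A, drop_tol=1e-3, fill_factor=10):
    A = sp.csr_matrix(A)
    p = max_product_row_perm(A)         # step 1 (stand-in for MC64, scaling omitted)
    A1 = A[p]
    sym = sp.csr_matrix(abs(A1) + abs(A1).T)
    q = reverse_cuthill_mckee(sym, symmetric_mode=True)   # step 2: symmetric RCM
    A2 = A1[q][:, q].tocsc()
    ilu = spilu(A2, drop_tol=drop_tol, fill_factor=fill_factor)  # step 3 (stand-in for ILUT)
    return A2, p, q, ilu


def solve(A, b):
    """Solve A x = b with BiCGStab on the reordered system, preconditioned by the ILU."""
    A2, p, q, ilu = reorder_and_factor(A)
    n = A2.shape[0]
    M = LinearOperator((n, n), matvec=ilu.solve, dtype=A2.dtype)
    y, info = bicgstab(A2, b[p][q], M=M)
    x = np.empty(n)
    x[q] = y                            # undo the symmetric (column) permutation
    return x, info, A2


# Hypothetical test matrix: strong diagonal plus random couplings, rows shuffled
# so that the diagonal of A contains zeros before step 1.
rng = np.random.default_rng(0)
n = 100
T = sp.csr_matrix((rng.uniform(-1, 1, 200),
                   (rng.integers(0, n, 200), rng.integers(0, n, 200))),
                  shape=(n, n)) + 8.0 * sp.identity(n, format="csr")
A = T[rng.permutation(n)]
b = A @ np.ones(n)
x, info, A2 = solve(A, b)
```

Taking logarithms turns the product over the diagonal into a sum, which is why an ordinary minimum-weight matching routine can stand in for the maximum-product step.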

6
WSO: Our proposed method
1. Reorder using HSL-MC64 to make the diagonal zero-free
2. Reorder |A| + |A^T| using HSL-MC73 to place larger elements closer to the main diagonal
3. Extract a banded preconditioner such that 99.9% of the weight is inside the band
4. Factorize the banded preconditioner
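Steps 2 and 3 can be illustrated on a small dense matrix with NumPy. The sketch below is an assumption-laden stand-in, not the authors' implementation: instead of calling HSL-MC73 it sorts the unknowns by the Fiedler vector (the eigenvector of the second-smallest eigenvalue) of the weighted graph Laplacian of |A| + |A^T|, computed with a dense eigensolver, and it assumes that graph is connected. Step 1 (MC64) and step 4 (banded factorization) are omitted, and all function names are hypothetical.

```python
import numpy as np


def fiedler_order(A):
    """Permutation that sorts unknowns by the Fiedler vector of |A| + |A^T|."""
    W = np.abs(A) + np.abs(A.T)
    np.fill_diagonal(W, 0.0)
    L = np.diag(W.sum(axis=1)) - W          # weighted graph Laplacian
    _, vecs = np.linalg.eigh(L)             # eigenvalues in ascending order
    return np.argsort(vecs[:, 1])


def half_bandwidth_for_weight(A, fraction=0.999):
    """Smallest k such that entries with |i - j| <= k hold `fraction` of sum |a_ij|."""
    n = A.shape[0]
    i, j = np.indices(A.shape)
    per_dist = np.bincount(np.abs(i - j).ravel(),
                           weights=np.abs(A).ravel(), minlength=n)
    cum = np.cumsum(per_dist)
    return int(np.searchsorted(cum, fraction * cum[-1]))


def extract_band(A, k):
    """Copy of A with every entry outside the band |i - j| <= k set to zero."""
    i, j = np.indices(A.shape)
    return np.where(np.abs(i - j) <= k, A, 0.0)


# Hypothetical example: a nonsymmetric tridiagonal matrix hidden by a random
# symmetric permutation (which keeps the diagonal zero-free).
rng = np.random.default_rng(0)
n = 30
T = (np.diag(np.full(n, 4.0))
     + np.diag(np.full(n - 1, -1.0), 1)
     + np.diag(np.full(n - 1, -2.0), -1))
p = rng.permutation(n)
A = T[np.ix_(p, p)]

q = fiedler_order(A)
R = A[np.ix_(q, q)]                          # reordered matrix
k_before = half_bandwidth_for_weight(A)
k_after = half_bandwidth_for_weight(R)
B = extract_band(R, k_after)                 # banded preconditioner candidate
```

On this example the spectral ordering recovers the hidden tridiagonal structure, so the band needed to capture 99.9% of the weight shrinks to half-bandwidth 1. For large sparse matrices the dense eigensolver would have to be replaced by a sparse one.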

7
Test Problems
(digits lost in extraction are shown as "…")

#   Matrix Name      Application          n          nnz
1   ASIC_680K        Circuit Simulation   680,000    2,638,…
2   DC1              Circuit Simulation   116,…      …
3   FINAN512         Econometrics         74,…       …
4   H2O              Quantum Chemistry    67,024     2,216,…
5   D_54019_HIGHK    Device Simulation    54,…       …
6   APPU             NASA Benchmark       14,000     1,853,104

8
Comparison to ILUPACK AMF/PQ preconditioners on a uniprocessor [of SGI-Altix]
Results per method, in slide order (per-matrix column alignment lost in extraction):
ILUPACK-AMF: >600 s, Conv., Best, Conv.
ILUPACK-PQ: >600 s, Conv., Best, Conv.
WSO: Best, Conv., Best
Outer Iterative Solver: unrestarted GMRES
ILUPACK Parameters: droptol: 1e-1, bound for inv(L), inv(U): 10, elbow space: 100

9
Comparison to ILUT and Improved-ILUT Preconditioners on a uniprocessor [of Clovertown]
Results per method, in slide order (per-matrix column alignment lost in extraction):
ILUT(1e-1, n): Fail, Conv., Best, Fail, Conv.
ILUTI(1e-1, n): Conv.
ILUT(1e-3, n): Fail, Conv., Fail, >600 s
ILUTI(1e-3, n): Conv., >600 s
ILUT(0, k): Fail, >600 s, Conv., >600 s, Conv., >600 s
ILUTI(0, k): Conv., >600 s, Conv., >600 s, Conv., >600 s
WSO: Best, Conv., Best
Outer Iterative Solver: BiCGStab

10
WSO: Factorization + Solve Time Scalability
Speed improvement over uniprocessor timing on SGI-Altix

11
Reordering and Solve Times of 3 Different Systems on a Uniprocessor

12
Reservoir Simulation (SPE10 benchmarks)
- Problem #1: N = 2,244,000
- Problem #2: N = 2,462,265
- "Banded systems" → simple or no reordering to extract a central band as a preconditioner
- Results on an SGI-Altix
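When the matrix is already close to banded, extracting the central band needs no reordering at all. A minimal SciPy sketch of that idea follows; the matrix, the half-bandwidth `k` and the function names are made up for illustration and have nothing to do with the SPE10 data. The band is stored in LAPACK's diagonal-ordered form and applied through `solve_banded`, which refactors on every call; a production code would factor the band once and reuse the factors.

```python
import numpy as np
import scipy.sparse as sp
from scipy.linalg import solve_banded
from scipy.sparse.linalg import LinearOperator, bicgstab


def band_storage(A, k):
    """Diagonal-ordered storage of the entries of A with |i - j| <= k."""
    C = sp.coo_matrix(A)
    keep = np.abs(C.row - C.col) <= k
    ab = np.zeros((2 * k + 1, A.shape[0]))
    # solve_banded expects ab[u + i - j, j] == a[i, j], here with u = k.
    np.add.at(ab, (k + C.row[keep] - C.col[keep], C.col[keep]), C.data[keep])
    return ab


def band_preconditioner(A, k):
    """LinearOperator that applies the inverse of the central band of A."""
    ab = band_storage(A, k)
    n = A.shape[0]
    return LinearOperator((n, n), dtype=float,
                          matvec=lambda v: solve_banded((k, k), ab, v))


# Hypothetical nonsymmetric system: a strong tridiagonal part plus weak
# couplings far from the diagonal.
n = 200
A = (sp.diags([-1.0, 4.0, -2.0], [-1, 0, 1], shape=(n, n))
     + sp.diags([0.05, 0.05], [-20, 20], shape=(n, n))).tocsr()
b = A @ np.ones(n)

ab = band_storage(A, 1)
M = band_preconditioner(A, k=1)
x, info = bicgstab(A, b, M=M)
```

Because the entries dropped outside the band are small, the preconditioned operator stays close to the identity and BiCGStab is expected to need only a few iterations here.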

13
Reservoir Simulation #1
Algebraic Multigrid time: 31.4 seconds (AMD dual core)

14
Reservoir Simulation #1

15

16
Reservoir Simulation #

17
Reservoir Simulation #2

18
Summary and Future Work
- Weighted reordering is an effective method for obtaining a banded preconditioner
- Overall, the method we propose is both reliable and scalable
- Spectral reordering is relatively inexpensive for extracting banded preconditioners when solving several systems with "roughly the same" coefficient matrix
- Parallel weighted reordering schemes need to be developed
