Published by Lola Aster. Modified about 1 year ago.

1
A NOVEL APPROACH TO SOLVING LARGE-SCALE LINEAR SYSTEMS
Ken Habgood, Itamar Arel
Department of Electrical Engineering & Computer Science
April 3, 2009
[Image: Gabriel Cramer (1704-1752)]

2
Outline
- Problem statement and motivation
- Novel approach
  - Revisiting Cramer’s rule
  - Matrix condensation
- Illustration of the proposed scheme
- Implementation results
- Challenges ahead

3
Solving large-scale linear systems
- Many scientific applications
- Computer models in finance, biology, physics
- Real-time load flow calculations for electric utilities
- Short-circuit fault, economic analysis
- Consumer generation on the electric grid may soon require real-time calculations (hybrid cars, solar panels)

4
Why try to improve?
- We want parallel processing for speed
- Current schemes use Gaussian elimination
- Mainstream approach: LU decomposition
- $O(N^3)$ computational complexity, $O(N^2)$ parallelizable
- If you had $N^2$ processors: $O(N)$ time
- So far so good ...
- The “catch”: irregular communication patterns for load balancing across processing nodes

5
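Cramer’s rule revisited
For the 2 x 2 system

$$ax + by = e, \qquad cx + dy = f$$

the first unknown is

$$x = \frac{\begin{vmatrix} e & b \\ f & d \end{vmatrix}}{\begin{vmatrix} a & b \\ c & d \end{vmatrix}}$$

- The solution to a linear system $Ax = b$ is given by $x_i = |A_i(b)| / |A|$, where $A_i(b)$ denotes $A$ with its $i$th column replaced by $b$
- $O(N!)$ computational complexity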
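The rule above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the names `det_cofactor` and `cramer_solve` are hypothetical, and the determinant is computed by plain cofactor expansion, which is what gives the $O(N!)$ cost quoted on the slide.

```python
from fractions import Fraction

def det_cofactor(m):
    """Determinant by cofactor expansion along the first row (O(N!) work)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        # Minor: drop row 0 and column j.
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det_cofactor(minor)
    return total

def cramer_solve(a, b):
    """Solve A x = b via x_i = |A_i(b)| / |A| (hypothetical helper name)."""
    d = det_cofactor(a)
    if d == 0:
        raise ValueError("singular matrix: Cramer's rule does not apply")
    xs = []
    for i in range(len(a)):
        # A_i(b): copy of A with column i replaced by b.
        a_i = [row[:i] + [b[r]] + row[i + 1:] for r, row in enumerate(a)]
        xs.append(Fraction(det_cofactor(a_i), d))
    return xs

# Example system: 2x + y = 5, x + 3y = 10
solution = cramer_solve([[2, 1], [1, 3]], [5, 10])
```

Exact `Fraction` arithmetic is used here so that the ratios of determinants are not blurred by rounding; a production solver would use floating point.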

6
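Chio’s matrix condensation
For a 3 x 3 matrix ($n = 3$):

$$|A| = \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}
= a_1^{-(n-2)} \begin{vmatrix} \begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix} & \begin{vmatrix} a_1 & c_1 \\ a_2 & c_2 \end{vmatrix} \\[2ex] \begin{vmatrix} a_1 & b_1 \\ a_3 & b_3 \end{vmatrix} & \begin{vmatrix} a_1 & c_1 \\ a_3 & c_3 \end{vmatrix} \end{vmatrix}
= a_1^{-(n-2)} \begin{vmatrix} b_2' & c_2' \\ b_3' & c_3' \end{vmatrix}$$

- $a_{1,1}$ cannot be 0
- If $a_{1,1}$ is 1 then $a_{1,1}^{-(n-2)} = 1$
- Let $D$ denote the matrix obtained by replacing each element $a_{i,j}$ by $\begin{vmatrix} a_{1,1} & a_{1,j} \\ a_{i,1} & a_{i,j} \end{vmatrix}$. Then

$$|A| = \frac{|D|}{a_{1,1}^{\,n-2}}$$

- Recursive determinant calculation
- $O(N^3)$ computational complexity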
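The condensation step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation (which the deck says was written in C with SSE): the function name `chio_det` is hypothetical, exact rational arithmetic is used for clarity, and the row swap for a zero $a_{1,1}$ is an assumption added here, since the slides only state that $a_{1,1}$ cannot be zero.

```python
from fractions import Fraction

def chio_det(matrix):
    """Determinant via repeated Chio condensation (exact arithmetic)."""
    m = [[Fraction(v) for v in row] for row in matrix]
    scale = Fraction(1)  # accumulates the a11^(n-2) divisors
    sign = 1
    while len(m) > 1:
        n = len(m)
        if m[0][0] == 0:
            # Assumption: swap in a row with a nonzero lead entry,
            # which flips the sign of the determinant.
            for r in range(1, n):
                if m[r][0] != 0:
                    m[0], m[r] = m[r], m[0]
                    sign = -sign
                    break
            else:
                return Fraction(0)  # whole first column is zero
        pivot = m[0][0]
        scale *= pivot ** (n - 2)
        # D[i][j] = | a11  a1j |
        #           | ai1  aij |   for every row i and column j below/right of a11
        m = [[pivot * m[i][j] - m[0][j] * m[i][0] for j in range(1, n)]
             for i in range(1, n)]
    return sign * m[0][0] / scale

d = chio_det([[1, 2, 3], [4, 5, 6], [7, 8, 10]])
```

Each pass shrinks the matrix by one row and one column, and every new entry depends only on the lead row, the lead column, and one old entry, which is the property the deck relies on for parallel processing.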

7
Highlights of the approach
- Chio’s condensation combined with Cramer’s rule results in $O(N^4)$
- Goal: remain at $O(N^3)$
- Retain attractive parallel processing potential
- Solution: clever bookkeeping to reduce computations
  - “Mirror” the matrix before applying condensation
  - Each matrix solves for half of the unknowns
  - Condense each until the matrix size matches the number of unknowns
  - Mirror the matrices again
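The $O(N^4)$ figure for the naive combination can be checked with a short count. This is a sketch that assumes each Chio step on a $k \times k$ matrix produces $(k-1)^2$ new entries, one 2 x 2 determinant each, and that Cramer’s rule is applied directly with one determinant per unknown plus $|A|$.

```latex
% One full condensation of an N x N matrix, counted in 2 x 2 determinants
\sum_{k=2}^{N} (k-1)^2 = \frac{(N-1)\,N\,(2N-1)}{6} \approx \frac{N^3}{3}

% Naive Cramer's rule needs |A| and |A_1(b)|, ..., |A_N(b)|
(N+1) \cdot \frac{N^3}{3} = O(N^4)
```

The bookkeeping described on the slide aims to share condensation work between these $N+1$ determinants instead of recomputing each one from scratch.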

8
Mirroring the matrix
[Figure: the example 9 x 9 system with its right-hand-side column, shown next to a copy of itself. 9 unknowns to solve for.]

9
Mirroring the matrix (cont’)
[Figure: the example 9 x 9 system next to its mirrored copy.]

10
Mirroring the matrix (cont’)
[Figure: the example 9 x 9 system next to its mirrored copy. One matrix has 5 unknowns to solve for, the other has 4 unknowns to solve for.]

12
Chio’s matrix condensation
[Figure: the example 9 x 9 system before condensation.]

13
Chio’s matrix condensation (cont’)
[Figure: animation step on the example matrix; a 2 x 2 determinant is evaluated, value shown: 0.]

14
Chio’s matrix condensation (cont’)
[Figure: animation step on the example matrix; the next 2 x 2 determinant is evaluated, value shown: -6.]

15
Chio’s matrix condensation (cont’)
[Figure: animation step on the example matrix; the first column of condensed values is filled in, and a further 2 x 2 determinant is evaluated.]

16
Chio’s matrix condensation (cont’)
[Figure: animation step on the example matrix; a further 2 x 2 determinant is evaluated, value shown: -13.]

17
Chio’s matrix condensation (cont’)
[Figure: the lead row and column of the example matrix next to the fully condensed matrix, one row and one column smaller.]

18
Chio’s matrix condensation (cont’)
[Figure: the condensed matrix, ready for the next condensation step.]
The value in the $a_{1,1}$ position cannot be zero

19
Mirroring the matrix
[Figure: the condensed example matrix about to be mirrored.]

20
Mirroring the matrix (cont’)
[Figure: the condensed example matrix next to its mirrored copy.]

21
Chio’s matrix condensation
[Figure: the two mirrored matrices are condensed further. One has 3 unknowns to solve for, the other has 2 unknowns to solve for.]

22
Applying Cramer’s rule
[Figure: the final condensed matrix with its right-hand-side columns, and the submatrix used for the determinants.]

23
Applying Cramer’s rule
[Figure: ratio of two determinants taken from the final condensed matrix.]
Answer for $x_9$: the slide shows the determinant values 7728 and 2688 and the result 3.

24
Applying Cramer’s rule
[Figure: ratio of two determinants taken from the final condensed matrix.]
Answer for $x_8$: the slide shows the determinant values 180 and 2688 and the result 0.07.

25
Overview of data flow structure
Mirroring of the matrix keeps an $O(N^3)$ algorithm.
[Figure: tree of matrices. The original matrix (N) is mirrored into the original and a mirror image (N). Chio’s condensation reduces each to size N/2, where each is mirrored again into a matrix and its image (N/2), then to N/4, and so on. Example labels: 24 variables, 12 variables, 6 variables, 3 x 3.]
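A rough operation count suggests why the mirrored scheme can stay at $O(N^3)$. This is a back-of-the-envelope sketch rather than a result from the slides: it assumes the number of matrices doubles each time the size halves, that one Chio step on a $k \times k$ matrix costs $(k-1)^2$ two-by-two determinants, and it ignores the right-hand-side columns and the final Cramer step.

```latex
% Cost of condensing one matrix from size m down to size m/2
C(m) \approx \sum_{k=m/2+1}^{m} (k-1)^2 \approx \int_{m/2}^{m} k^2 \, dk = \frac{7}{24}\, m^3

% Level j = 0, 1, 2, ... holds 2^{j+1} matrices of size N / 2^j
W \approx \sum_{j \ge 0} 2^{j+1} \cdot \frac{7}{24} \left( \frac{N}{2^j} \right)^3
  = \frac{7}{12}\, N^3 \sum_{j \ge 0} \frac{1}{4^j}
  = \frac{7}{9}\, N^3 = O(N^3)
```

Under these assumptions the work per level shrinks by a factor of four, so the total is dominated by the first level and the doubling of matrices does not raise the order of the algorithm.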

26
Parallel computations
- Similar to LU-decomposition (access by rows)
- Broadcast communication only
- Send-ahead on lead row values
Mirroring provides an advantage
- Algorithm mirrors as the matrix reduces in size
- Load naturally redistributed among processors
- LU-decomposition needs blocking and interleaving to avoid idle processors, which leads to complex communication patterns (overhead)
[Figure: Figure 9.2 from Parallel Scientific Computing in C++ and MPI, George Em Karniadakis and Robert M. Kirby II]

27
Paradigm shift: key points
- Apply Cramer’s rule
- Employ matrix condensation for efficient determinant calculations
  - Highly parallel $O(N^3)$ process
- Clever bookkeeping to re-use information
- Final result: $O(N^3)$ computation with $O(N^2)$ communication
- Key advantage: regular communication patterns with low communication overhead and a balanced processing load

28
Implementation results
- Trial platform
  - Single-core Pentium M @ 1.5 GHz
  - 64 KB L1 cache, 1 MB L2 cache
- Coded in C with SSE used for the core function (Chio’s condensation)
- Memory access optimized using cache blocking
- Double precision variables and calculations
- Result: ~2.4x slower than Matlab (consistently)

29
Challenges ahead
- Further code improvement/optimization
  - Current L2 miss rate is high
- Precision improvement
- Parallel implementation
  - GPU implementation
  - Distributed architecture implementation
- Sparse matrix optimization
- Other linear algebra applications (e.g. matrix inversion)

30
Thank you
