
1 A NOVEL APPROACH TO SOLVING LARGE-SCALE LINEAR SYSTEMS Ken Habgood, Itamar Arel Department of Electrical Engineering & Computer Science GABRIEL CRAMER (1704–1752) April 3, 2009

2 Outline • Problem statement and motivation • Novel approach • Revisiting Cramer's rule • Matrix condensation • Illustration of the proposed scheme • Implementation results • Challenges ahead

3 Solving large-scale linear systems • Many scientific applications • Computer models in finance, biology, physics • Real-time load-flow calculations for electric utilities (short-circuit fault, economic analysis); consumer generation on the electric grid (hybrid cars, solar panels) may soon require real-time calculations

4 Why try to improve? • We want parallel processing for speed • Current schemes use Gaussian elimination • Mainstream approach: LU decomposition • O(N^3) computational complexity, O(N^2) parallelizable • If you had N^2 processors → O(N) time • So far so good… • The "catch": irregular communication patterns for load balancing across processing nodes

5 Cramer's rule revisited • Example: for the 2×2 system ax + by = e, cx + dy = f, Cramer's rule gives x = |e b; f d| / |a b; c d| (numerator: the coefficient matrix with its x-column replaced by the right-hand side) • In general, the solution to a linear system Ax = b is given by x_i = |A_i(b)| / |A|, where A_i(b) denotes A with its i-th column replaced by b • O(N!) computational complexity when determinants are expanded naively
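As an aside (not part of the original slides), a minimal C sketch of Cramer's rule for the 2×2 system above; the coefficient values are made up for the illustration:

```c
#include <stdio.h>

/* 2x2 determinant | p q ; r s | */
static double det2(double p, double q, double r, double s) {
    return p * s - q * r;
}

int main(void) {
    /* Hypothetical system:  3x + 2y = 5,  1x + 4y = 6 */
    double a = 3, b = 2, e = 5;
    double c = 1, d = 4, f = 6;

    double denom = det2(a, b, c, d);      /* |A|            */
    double x = det2(e, b, f, d) / denom;  /* |A_1(b)| / |A| */
    double y = det2(a, e, c, f) / denom;  /* |A_2(b)| / |A| */

    printf("x = %g, y = %g\n", x, y);     /* prints x = 0.8, y = 1.3 */
    return 0;
}
```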

6 Chio's matrix condensation • Example: for the 3×3 matrix A = [a1 b1 c1; a2 b2 c2; a3 b3 c3], condensation produces the 2×2 matrix [b2' c2'; b3' c3'] with b2' = |a1 b1; a2 b2|, c2' = |a1 c1; a2 c2|, b3' = |a1 b1; a3 b3|, c3' = |a1 c1; a3 c3|, and |A| = a1^-(n-2) · |b2' c2'; b3' c3'| • a_{1,1} cannot be 0 • If a_{1,1} is 1 then a_{1,1}^-(n-2) = 1 • In general, let D denote the (n−1)×(n−1) matrix obtained by replacing each element a_{i,j} (i, j ≥ 2) by the 2×2 determinant |a_{1,1} a_{1,j}; a_{i,1} a_{i,j}|; then |A| = |D| / a_{1,1}^(n-2) • Recursive determinant calculation • O(N^3) computational complexity
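A minimal C sketch of one Chio condensation step (an illustration of the formula above, not the authors' implementation): it reduces an n×n matrix to an (n−1)×(n−1) matrix of 2×2 determinants and returns the factor a_{1,1}^(n−2) that the smaller determinant must later be divided by.

```c
#include <math.h>

/* One step of Chio's condensation.
 * a: n x n input matrix (row-major, leading dimension lda)
 * d: (n-1) x (n-1) output matrix (row-major, leading dimension ldd)
 * Returns a[0][0]^(n-2), so that det(A) = det(D) / returned factor.
 * Assumes a[0][0] != 0 (the slides note this pivot must be nonzero).
 */
double chio_step(int n, const double *a, int lda, double *d, int ldd)
{
    double pivot = a[0];                       /* a_{1,1} */
    for (int i = 1; i < n; ++i) {
        for (int j = 1; j < n; ++j) {
            /* 2x2 determinant | a_{1,1} a_{1,j} ; a_{i,1} a_{i,j} | */
            d[(i - 1) * ldd + (j - 1)] =
                pivot * a[i * lda + j] - a[i * lda] * a[j];
        }
    }
    return pow(pivot, n - 2);
}
```

Applying this step repeatedly is the "recursive determinant calculation" on the slide; the scale factors accumulate multiplicatively.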

7 Highlights of the approach • Chio's condensation combined naively with Cramer's rule results in O(N^4) • Goal: remain at O(N^3) • Retain attractive parallel processing potential • Solution: clever bookkeeping to reduce computations • "Mirror" the matrix before applying condensation • Each matrix solves for half of the unknowns • Condense each until the matrix size matches its number of unknowns • Mirror the matrices again
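One plausible reading of the "mirror" step, sketched in C (my illustration and interpretation, not the authors' code): the coefficient matrix is duplicated with its columns in reverse order, so that condensation, which consumes the leading column at each step, preserves a different half of the unknowns in each copy. Right-hand-side handling and the determinant bookkeeping are omitted.

```c
/* Hedged illustration of "mirroring": copy matrix a into m with the
 * column order reversed, and record in perm[] which original unknown
 * each column of the mirror refers to.
 */
void mirror(int n, const double *a, double *m, int *perm)
{
    for (int j = 0; j < n; ++j) {
        perm[j] = n - 1 - j;                     /* column j of mirror = unknown n-1-j */
        for (int i = 0; i < n; ++i)
            m[i * n + j] = a[i * n + (n - 1 - j)];
    }
}
```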

8 Mirroring the matrix (figure; label: unknowns to solve for)

9 Mirroring the matrix (cont’)

10 Mirroring the matrix (cont') (figure; labels: 5 unknowns to solve for, 4 unknowns to solve for)

11 Mirroring the matrix (cont') (figure; labels: 5 unknowns to solve for, 4 unknowns to solve for)

12 Chio’s matrix condensation

13 Chio's matrix condensation (cont') (worked figure)

14 Chio's matrix condensation (cont') (worked figure)

15 Chio's matrix condensation (cont') (worked figure)

16 Chio's matrix condensation (cont') (worked figure)

17 Chio's matrix condensation (cont') (worked figure)

18 Chio's matrix condensation (cont') The value in the a_{1,1} position cannot be zero

19 Mirroring the matrix

20 Mirroring the matrix (cont’)

21 Chio's matrix condensation (figure; labels: … unknowns to solve for, 2 unknowns to solve for)

22 Applying Cramer’s rule

23 Applying Cramer's rule (worked figure; the determinant ratio evaluates to 3, the answer for x_9)

24 Applying Cramer's rule (worked figure; the determinant ratio evaluates to 0.07, the answer for x_8)

25 Overview of data flow structure • Mirroring of the matrix keeps an O(N^3) algorithm. (Figure: data-flow tree — the original N×N matrix and its mirror are condensed to size N/2, each of those is mirrored and condensed to size N/4, and so on; in the example, 24 variables → 12 → 6 → 3×3 systems, with Chio's condensation applied between mirroring steps.)
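A short back-of-the-envelope argument (not on the slides) for why the mirrored recursion stays at O(N^3), assuming that condensing a matrix from size m down to size m/2 costs on the order of m^3 operations (about m/2 Chio steps of O(m^2) each):

```latex
% Level k of the recursion holds 2^k matrices of size N/2^k, each condensed to half its size.
T(N) \;\approx\; \sum_{k=0}^{\log_2 N} 2^{k}\, c\!\left(\frac{N}{2^{k}}\right)^{3}
     \;=\; cN^{3} \sum_{k=0}^{\log_2 N} \left(\tfrac{1}{4}\right)^{k}
     \;\le\; \tfrac{4}{3}\, cN^{3} \;=\; O(N^{3}).
```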

26 Parallel computations • Similar to LU decomposition (access by rows) • Broadcast communication only • Send-ahead on lead-row values • Mirroring provides an advantage • Algorithm mirrors as the matrix reduces in size • Load naturally redistributed among processors • LU decomposition needs blocking and interleaving to avoid idle processors, which leads to complex communication patterns (overhead) [Figure 9.2: Parallel Scientific Computing in C++ and MPI, George Em Karniadakis and Robert M. Kirby II]
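A hedged MPI sketch in C (my illustration, not the authors' code) of the broadcast-only pattern described above, for one condensation step: the rank owning the lead row broadcasts it, then every rank updates the rows it owns. Row distribution, pivoting, and the mirroring bookkeeping are omitted.

```c
#include <mpi.h>

/* One row-parallel Chio condensation step (illustrative sketch).
 * my_rows:   nloc x n block of rows owned by this rank (row-major)
 * lead_row:  length-n buffer; valid on lead_owner before the call
 */
void chio_step_parallel(double *my_rows, int nloc, int n,
                        double *lead_row, int lead_owner, MPI_Comm comm)
{
    /* Broadcast the lead row to every rank -- the only communication needed. */
    MPI_Bcast(lead_row, n, MPI_DOUBLE, lead_owner, comm);

    double pivot = lead_row[0];            /* a_{1,1}; assumed nonzero */

    /* Each rank condenses its own rows: 2x2 determinants against the lead row. */
    for (int i = 0; i < nloc; ++i) {
        double ai1 = my_rows[i * n];       /* a_{i,1} */
        for (int j = 1; j < n; ++j)
            my_rows[i * n + (j - 1)] =
                pivot * my_rows[i * n + j] - ai1 * lead_row[j];
    }
    /* After this step each local row holds n-1 valid entries. */
}
```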

27 Paradigm shift – key points • Apply Cramer's rule • Employ matrix condensation for efficient determinant calculations • Highly parallel O(N^3) process • Clever bookkeeping to re-use information • Final result: O(N^3) computation with O(N^2) communication • Key advantage: regular communication patterns with low communication overhead and balanced processing load

28 Implementation results • Trial platform: single-core Pentium, 1.5 GHz; 64 KB L1 cache, 1 MB L2 cache • Coded in C, with SSE used for the core function (Chio's condensation) • Memory access optimized using cache blocking • Double-precision variables and calculations • Result: ~2.4x slower than MATLAB (consistently)
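A rough sketch (my illustration, not the authors' kernel) of what cache blocking of the condensation update might look like: the (n−1)×(n−1) update is walked in BLOCK×BLOCK tiles so the lead row and the pivot column stay in cache while they are reused; the tile size is a made-up tuning parameter.

```c
#define BLOCK 64   /* hypothetical tile size, tuned to the L1/L2 cache */

/* Cache-blocked Chio update: d[i-1][j-1] = a11*a[i][j] - a[i][0]*a[0][j]. */
void chio_step_blocked(int n, const double *a, double *d)
{
    double a11 = a[0];
    for (int ib = 1; ib < n; ib += BLOCK) {
        int imax = ib + BLOCK < n ? ib + BLOCK : n;
        for (int jb = 1; jb < n; jb += BLOCK) {
            int jmax = jb + BLOCK < n ? jb + BLOCK : n;
            /* Update one tile; the relevant slice of the lead row is reused
             * while it is still resident in cache. */
            for (int i = ib; i < imax; ++i) {
                double ai1 = a[i * n];
                for (int j = jb; j < jmax; ++j)
                    d[(i - 1) * (n - 1) + (j - 1)] =
                        a11 * a[i * n + j] - ai1 * a[j];
            }
        }
    }
}
```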

29 Challenges ahead • Further code improvement/optimization • Current L2 miss rate is high • Precision improvement • Parallel implementation • GPU implementation • Distributed-architecture implementation • Sparse-matrix optimization • Other linear algebra applications (e.g., matrix inversion)

30 Thank you

