ECE 530 – Analysis Techniques for Large-Scale Electrical Systems
Lecture 25: Krylov Subspace Methods
Prof. Hao Zhu
Dept. of Electrical and Computer Engineering
University of Illinois at Urbana-Champaign
12/2/
Announcements
– No class on Thursday, Dec 4
– Homework 8 posted, due on Thursday, Dec 11
Krylov Subspace Outline
– Review of fields and vector spaces
– Eigensystem basics
– Definition of Krylov subspaces and annihilating polynomial
– Generic Krylov subspace solver
– Steepest descent
– Conjugate gradient
Krylov Subspace
Iterative methods to solve Ax = b build on the following idea. Given a matrix A and a vector v, the i-th order Krylov subspace is defined as
K_i(A, v) = span{v, Av, A^2 v, …, A^(i-1) v}
For a specified n x n matrix A and vector v, the largest dimension this subspace can attain is bounded: it cannot exceed n, and more precisely it is bounded by the degree of the annihilating polynomial of v with respect to A
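The saturation of the subspace dimension can be checked numerically. Below is a minimal NumPy sketch, using a hypothetical 3x3 matrix and starting vector (not from the lecture): the rank of the Krylov basis grows with i until it hits n = 3, then stops growing.

```python
import numpy as np

def krylov_basis(A, v, i):
    """Return the matrix whose columns span K_i(A, v) = span{v, Av, ..., A^(i-1) v}."""
    cols = [v]
    for _ in range(i - 1):
        cols.append(A @ cols[-1])   # each new basis vector is one more multiply by A
    return np.column_stack(cols)

# Hypothetical 3x3 example (illustration only)
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
v = np.array([1.0, 0.0, 0.0])

for i in (1, 2, 3, 5):
    print(i, np.linalg.matrix_rank(krylov_basis(A, v, i)))
```

The printed ranks saturate at 3 even as i keeps increasing, which is why the generic solver below makes no further progress past that point.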
Generic Krylov Subspace Solver
The following is a generic Krylov subspace method for solving Ax = b using only matrix-vector multiplies
Step 1: Start with an initial guess x(0) and some predefined error tolerance ε > 0; compute the residual r(0) = b – Ax(0); set i = 0
Step 2: While ||r(i)|| > ε Do
(a) i := i + 1
(b) construct K_i(r(0), A)
(c) find x(i) in {x(0) + K_i(r(0), A)} to minimize ||r(i)||
Stop
Krylov Subspace Solver
Note that no further progress is made in Step 2 once i exceeds the largest attainable dimension of the Krylov subspace
The Krylov subspace methods differ from each other in
– the construction scheme for the Krylov subspace in Step 2(b)
– the residual minimization criterion used in Step 2(c)
A common initial guess is x(0) = 0, giving r(0) = b – Ax(0) = b
Every solver involves the A matrix only in matrix-vector products: A^i r(0), i = 1, 2, …
Iterative Optimization Methods
Directly constructing the Krylov subspace for an arbitrary A and r(0) would be computationally expensive
We will instead introduce iterative optimization methods for solving Ax = b, which turn out to be special cases of Krylov subspace methods
Without loss of generality, consider the system Ax = b where A is symmetric (i.e., A = A^T) and positive definite (i.e., A ≻ 0, all eigenvalues positive)
Any Ax = b with nonsingular A is equivalent to A^T Ax = A^T b, where A^T A is symmetric and positive definite
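The "without loss of generality" step can be verified directly. A minimal sketch, using a hypothetical nonsymmetric but nonsingular 2x2 matrix: A^T A is symmetric positive definite, and the normal-equations system has the same solution as the original one.

```python
import numpy as np

# Hypothetical nonsymmetric, nonsingular A (illustration only)
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
b = np.array([3.0, 3.0])

M = A.T @ A                  # symmetric by construction
rhs = A.T @ b
x = np.linalg.solve(M, rhs)  # solve the normal equations M x = A^T b
print(x)                     # same x that solves the original Ax = b
```

In practice the normal equations square the condition number, so this transformation is used with care; it is stated here only to justify restricting attention to symmetric positive definite A.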
Optimization Problem
For symmetric positive definite A, solving Ax = b is equivalent to minimizing the quadratic function
f(x) = (1/2) x^T A x – b^T x
since the gradient ∇f(x) = Ax – b vanishes exactly at the solution; note that the negative gradient equals the residual, r = b – Ax
Steepest Descent Algorithm
At each iteration, move along the negative gradient (the residual) with an exact line search:
Step 1: x(0) given; r(0) = b – Ax(0); i = 0
Step 2: While ||r(i)|| > ε Do
(a) α(i) = r(i)^T r(i) / (r(i)^T A r(i))
(b) x(i+1) = x(i) + α(i) r(i)
(c) r(i+1) = r(i) – α(i) A r(i)
(d) i := i + 1
Note there is only one matrix-vector multiply per iteration
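The iteration above can be sketched in a few lines of NumPy. This is a minimal illustration, not the lecture's own code; the test system is a hypothetical 3x3 SPD matrix. Note how the residual update r := r – αAr reuses the single matrix-vector product Ar.

```python
import numpy as np

def steepest_descent(A, b, x0, eps=1e-8, max_iter=10000):
    """Steepest descent for SPD A: search direction = residual (negative gradient)."""
    x = x0.astype(float)
    r = b - A @ x
    it = 0
    while np.linalg.norm(r) > eps and it < max_iter:
        Ar = A @ r                    # the single matrix-vector multiply
        alpha = (r @ r) / (r @ Ar)    # exact line search along r
        x = x + alpha * r
        r = r - alpha * Ar            # residual update reuses Ar
        it += 1
    return x, it

# Hypothetical SPD test system (illustration only)
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])
x, it = steepest_descent(A, b, np.zeros(3))
print(x, it)
```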
Steepest Descent Convergence
The convergence rate depends on the condition number κ = λ_max / λ_min of A: measured in the A-norm, the error contracts by roughly a factor of (κ – 1)/(κ + 1) per iteration
Hence convergence is fast for well-conditioned A but can be very slow when κ is large, with the iterates "zig-zagging" across narrow level sets of f
Conjugate Direction Methods
An improvement over steepest descent is to take steps along a set of n search directions and obtain the exact solution after n such steps; this is the basic idea of the conjugate direction methods
(Figure: comparison of steepest descent with a conjugate direction approach)
Conjugate Direction Methods
The basic idea is that the n search directions, denoted by d(0), d(1), …, d(n–1), need to be A-orthogonal; that is,
d(i)^T A d(j) = 0 for all i ≠ j
At the i-th iteration, we update
x(i+1) = x(i) + α(i) d(i)
Stepsize Selection
The stepsize α(i) is chosen by an exact line search along d(i), minimizing f(x(i) + α d(i)) over α, which gives
α(i) = d(i)^T r(i) / (d(i)^T A d(i))
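The exact line-search stepsize can be sanity-checked numerically. A minimal sketch, with a hypothetical 2x2 SPD system and an arbitrary direction d (neither from the lecture): f evaluated at the chosen α is no larger than at nearby stepsizes.

```python
import numpy as np

def f(A, b, x):
    """Quadratic whose minimizer solves Ax = b (A symmetric positive definite)."""
    return 0.5 * x @ A @ x - b @ x

# Hypothetical SPD system and search direction (illustration only)
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, 1.0])
x = np.zeros(2)
d = np.array([1.0, 0.5])

r = b - A @ x                     # residual = negative gradient at x
alpha = (d @ r) / (d @ (A @ d))   # exact line-search stepsize along d
print(alpha)
```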
Convergence Proof
Linearly Independent Directions
Proposition: If A is positive definite, and the set of nonzero vectors d(0), d(1), …, d(n–1) are A-orthogonal, then these vectors are linearly independent (l.i.)
Proof: Suppose there are constants α_i, i = 0, 1, …, n–1, such that
α_0 d(0) + α_1 d(1) + … + α_(n–1) d(n–1) = 0
Multiplying by A and then taking the scalar product with d(i) gives
α_i d(i)^T A d(i) = 0
Since A is positive definite, d(i)^T A d(i) > 0, so it follows that α_i = 0 for every i
Hence, the vectors are l.i. (recall the vectors are l.i. only if all the α's = 0)
Conjugate Direction Method
What we have not yet covered is how to get the n search directions. We'll cover that shortly, but the next slide presents an algorithm, followed by an example.
Orthogonalization
To quickly generate A-orthogonal search directions, one can use the Gram-Schmidt orthogonalization procedure
Suppose we are given a l.i. set of n vectors {u_0, u_1, …, u_(n–1)}; successively construct d(j), j = 0, 1, …, n–1, by removing from u_j all of its components along the earlier directions d(0), …, d(j–1)
The trick is to use the gradient directions, i.e., u_i = r(i) for all i = 0, 1, …, n–1, which yields the very popular conjugate gradient method
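The Gram-Schmidt procedure with the A-inner product can be sketched directly. A minimal illustration, starting from the hypothetical choice of the unit vectors as the l.i. set (not the residuals the lecture ultimately recommends): the resulting directions satisfy d(i)^T A d(j) = 0 for i ≠ j.

```python
import numpy as np

def a_orthogonalize(A, U):
    """Gram-Schmidt in the A-inner product: from l.i. columns u_0..u_{n-1} of U,
    build directions d_j by removing from u_j its A-components along d_0..d_{j-1}."""
    ds = []
    for j in range(U.shape[1]):
        d = U[:, j].copy()
        for dk in ds:
            d -= (dk @ (A @ U[:, j])) / (dk @ (A @ dk)) * dk
        ds.append(d)
    return np.column_stack(ds)

# Hypothetical SPD matrix; start from the unit vectors (illustration only)
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
D = a_orthogonalize(A, np.eye(3))
print(np.round(D.T @ A @ D, 8))   # off-diagonal entries vanish
```

Done naively like this, the cost grows with the number of stored directions; the point of choosing u_i = r(i) is that the recursion then collapses to a cheap two-term update.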
Conjugate Gradient Method
With u_i = r(i), the Gram-Schmidt recursion collapses to a two-term update: each new direction is the current residual plus a multiple of the previous direction,
d(i+1) = r(i+1) + β(i) d(i), with β(i) = r(i+1)^T r(i+1) / (r(i)^T r(i))
Conjugate Gradient Algorithm
Step 1: x(0) given; r(0) = b – Ax(0); d(0) = r(0); i = 0
Step 2: While ||r(i)|| > ε Do
(a) α(i) = r(i)^T r(i) / (d(i)^T A d(i))
(b) x(i+1) = x(i) + α(i) d(i)
(c) r(i+1) = r(i) – α(i) A d(i)
(d) β(i) = r(i+1)^T r(i+1) / (r(i)^T r(i))
(e) d(i+1) = r(i+1) + β(i) d(i)
(f) i := i + 1
Note that there is only one matrix-vector multiply per iteration!
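The algorithm above translates almost line for line into NumPy. A minimal sketch, again on a hypothetical 3x3 SPD system (not the lecture's example): the single product A d(i) is reused for both the stepsize and the residual update.

```python
import numpy as np

def conjugate_gradient(A, b, x0, eps=1e-10, max_iter=1000):
    """Conjugate gradient for SPD A; in exact arithmetic it terminates in <= n steps."""
    x = x0.astype(float)
    r = b - A @ x
    d = r.copy()
    it = 0
    while np.linalg.norm(r) > eps and it < max_iter:
        Ad = A @ d                        # the single matrix-vector multiply
        alpha = (r @ r) / (d @ Ad)        # stepsize along d
        x = x + alpha * d
        r_new = r - alpha * Ad            # residual update reuses Ad
        beta = (r_new @ r_new) / (r @ r)  # coefficient for the next direction
        d = r_new + beta * d
        r = r_new
        it += 1
    return x, it

# Hypothetical SPD test system (illustration only)
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])
x, it = conjugate_gradient(A, b, np.zeros(3))
print(x, it)
```

Compared with the steepest-descent sketch earlier, the only extra work per iteration is computing β(i) and the two-term direction update; the matrix-vector cost is identical.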
Conjugate Gradient Example
Using the same system as before, select i = 0, x(0) = 0, ε = 0.1; then r(0) = b
With i = 0, d(0) = r(0) = b
Conjugate Gradient Example
This first step exactly matches steepest descent
Conjugate Gradient Example
With i = 1, solve for α(1); then update the iterate and residual
Conjugate Gradient Example
And form the next search direction
Conjugate Gradient Example
With i = 2, solve for α(2); then take the final step
Conjugate Gradient Example
And the solution is reached: done in 3 = n iterations!
Krylov Subspace Method
The conjugate gradient method is indeed a Krylov subspace method: after i iterations the iterate satisfies x(i) ∈ x(0) + K_i(r(0), A)
References
– D. P. Bertsekas, Nonlinear Programming, Chapter 1, 2nd Edition, Athena Scientific, 1999
– Y. Saad, Iterative Methods for Sparse Linear Systems, 2002, available free online