cs542g-term1-2007
Slide 1: Notes
- Assignment 1 due tonight (email me by tomorrow morning)

Slide 2: The Power Method
- Start with some random vector v, ||v||_2 = 1
- Iterate v = (Av) / ||Av||_2
- The eigenvector of the largest-magnitude eigenvalue tends to dominate
- How fast? Linear convergence, slowed down by close eigenvalues
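A minimal NumPy sketch of the iteration described above; the function name and the convergence test are illustrative choices, not from the slides:

```python
import numpy as np

def power_method(A, iters=1000, tol=1e-10):
    # Start with some random vector v, ||v||_2 = 1
    v = np.random.randn(A.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(iters):
        w = A @ v
        w /= np.linalg.norm(w)               # v = (Av) / ||Av||_2
        if np.linalg.norm(w - np.sign(w @ v) * v) < tol:
            v = w
            break
        v = w
    return v @ (A @ v), v                    # Rayleigh quotient, eigenvector
```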

Slide 3: Shift and Invert (Rayleigh Iteration)
- Say the eigenvalue we want is approximately σ
- The matrix (A − σI)^{-1} has the same eigenvectors as A
- But the eigenvalues are 1/(λ_i − σ)
- Use this in the power method instead
- Even better, update the guess at the eigenvalue each iteration with the Rayleigh quotient σ = vᵀAv
- Gives cubic convergence! (triples the number of significant digits each iteration when converging)
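A sketch of shift-and-invert with the Rayleigh quotient update (the symbol σ and the solver call are my reconstruction; one solves with A − σI rather than forming an explicit inverse):

```python
import numpy as np

def rayleigh_quotient_iteration(A, sigma, iters=20, tol=1e-12):
    n = A.shape[0]
    v = np.random.randn(n)
    v /= np.linalg.norm(v)
    for _ in range(iters):
        # One power-method step with (A - sigma*I)^{-1}: solve instead of invert
        # (if the solve becomes singular, sigma is an exact eigenvalue)
        w = np.linalg.solve(A - sigma * np.eye(n), v)
        v = w / np.linalg.norm(w)
        sigma_new = v @ (A @ v)              # update the eigenvalue guess
        if abs(sigma_new - sigma) < tol:
            return sigma_new, v
        sigma = sigma_new
    return sigma, v
```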

Slide 4: Maximality and Orthogonality
- Unit eigenvectors v_1 of the maximum-magnitude eigenvalue satisfy |v_1ᵀAv_1| = max_{||v||_2 = 1} |vᵀAv|
- Unit eigenvectors v_k of the k'th eigenvalue satisfy the same maximization restricted to v ⟂ v_1, …, v_{k−1}
- Can pick them off one by one (see the sketch below), or… (next slide)
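One way to "pick them off one by one", as a hedged sketch assuming A is symmetric so the eigenvectors are orthogonal: run the power method repeatedly, projecting out the eigenvectors already found.

```python
import numpy as np

def deflated_power_method(A, k, iters=1000):
    n = A.shape[0]
    vecs, vals = [], []
    for _ in range(k):
        v = np.random.randn(n)
        for _ in range(iters):
            v = A @ v
            for u in vecs:                   # project out eigenvectors already found
                v -= (u @ v) * u
            v /= np.linalg.norm(v)
        vals.append(v @ (A @ v))
        vecs.append(v)
    return np.array(vals), np.column_stack(vecs)
```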

Slide 5: Orthogonal Iteration
- Solve for lots (or all) of the eigenvectors simultaneously
- Start with an initial guess V
- For k = 1, 2, …
  - Z = AV
  - VR = Z (QR decomposition: orthogonalize Z)
- Easy, but slow (linear convergence; nearby eigenvalues slow things down a lot)
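The loop above in NumPy, as a minimal sketch (the iteration count and random start are my choices):

```python
import numpy as np

def orthogonal_iteration(A, k, iters=500):
    V, _ = np.linalg.qr(np.random.randn(A.shape[0], k))  # initial guess V
    for _ in range(iters):
        Z = A @ V
        V, R = np.linalg.qr(Z)               # VR = Z: orthogonalize Z
    return np.diag(V.T @ A @ V), V           # Rayleigh estimates, eigenvector basis
```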

Slide 6: Rayleigh-Ritz
- Aside: find a subset of the eigenpairs, e.g. the largest k or the smallest k
- Given an orthogonal estimate V (n×k) of the eigenvectors
- Simple Rayleigh estimate of the eigenvalues: diag(VᵀAV)
- Rayleigh-Ritz approach:
  - Solve the k×k eigenproblem VᵀAV
  - Use those eigenvalues (Ritz values) and the associated orthogonal combinations of the columns of V
- Note: another instance of "assume the solution lies in the span of a few basis vectors, solve a reduced-dimension problem"
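A sketch of the Rayleigh-Ritz step, assuming A is symmetric so that eigh applies:

```python
import numpy as np

def rayleigh_ritz(A, V):
    H = V.T @ A @ V                          # the small k-by-k eigenproblem
    ritz_vals, W = np.linalg.eigh(H)         # its eigenvalues are the Ritz values
    ritz_vecs = V @ W                        # orthogonal combinations of V's columns
    return ritz_vals, ritz_vecs
```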

Slide 7: Solving the Full Problem
- Orthogonal iteration works, but it's slow
- First speed-up: make A tridiagonal
  - Sequence of symmetric Householder reflections
  - Then Z = AV runs in O(n²) instead of O(n³)
- Other ingredients:
  - Shifting: if we shift A by an exact eigenvalue, A − λI, we get an exact eigenvector out of QR (the last column); this improves on linear convergence
  - Division: once an off-diagonal entry is almost zero, the problem separates into decoupled blocks
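A sketch of the tridiagonalization by symmetric Householder reflections (the index conventions are mine; production codes such as LAPACK use packed storage and accumulate the reflectors):

```python
import numpy as np

def tridiagonalize(A):
    T = np.array(A, dtype=float)             # works on a copy; assumes A symmetric
    n = T.shape[0]
    for j in range(n - 2):
        x = T[j + 1:, j]
        alpha = -np.copysign(np.linalg.norm(x), x[0])
        u = x.copy()
        u[0] -= alpha                        # Householder vector for column j
        if np.linalg.norm(u) == 0:
            continue                         # column already in tridiagonal form
        u /= np.linalg.norm(u)
        # Apply H = I - 2uu^T symmetrically to the trailing rows and columns
        T[j + 1:, :] -= 2.0 * np.outer(u, u @ T[j + 1:, :])
        T[:, j + 1:] -= 2.0 * np.outer(T[:, j + 1:] @ u, u)
    return T
```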

Slide 8: Nonlinear Optimization
- Switch gears a little: we've already seen plenty of instances of minimizing, with linear least squares
- What about nonlinear problems? Find x minimizing f(x); f is called the objective
- This is an unconstrained problem, since there are no limits on x

Slide 9: Classes of Methods
- Only evaluate f:
  - Stochastic search, pattern search, cyclic coordinate descent (Gauss-Seidel), genetic algorithms, etc.
- Also evaluate ∂f/∂x (gradient vector):
  - Steepest descent and relatives
  - Quasi-Newton methods
- Also evaluate ∂²f/∂x² (Hessian matrix):
  - Newton's method and relatives

Slide 10: Steepest Descent
- The gradient is the direction of fastest change
- Locally, f(x + Δx) is smallest when Δx is in the direction of the negative gradient −∇f
- The algorithm (see the sketch below):
  - Start with a guess x⁰
  - Until converged:
    - Find the direction d = −∇f(xᵏ)
    - Choose a step size α_k
    - Next guess is x^{k+1} = xᵏ + α_k d
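A minimal sketch of the loop, using a fixed step size for simplicity (step-size selection is the subject of the last slide):

```python
import numpy as np

def steepest_descent(grad, x0, alpha=1e-2, tol=1e-8, iters=100000):
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        d = -grad(x)                         # direction of steepest descent
        if np.linalg.norm(d) < tol:          # gradient ~ 0: stationary point
            break
        x = x + alpha * d                    # fixed step size, for simplicity
    return x

# e.g. minimize f(x, y) = (x - 1)^2 + 10*y^2
xmin = steepest_descent(lambda x: np.array([2*(x[0] - 1), 20*x[1]]),
                        x0=[0.0, 1.0])
```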

Slide 11: Convergence?
- At the global minimum the gradient is zero: ∇f(x) = 0
- Can test if the gradient is smaller than some threshold for convergence
- Note the scaling problem: rescaling the problem as min A·f(B·x) + C leaves the minimization essentially unchanged, but rescales the gradient, so a fixed threshold is not scale invariant
- However, the gradient is also zero at:
  - Every local minimum
  - Every local maximum
  - Every saddle point
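A toy demonstration of the scaling problem (the constants are arbitrary): the same point looks "converged" or not depending only on how the objective is scaled.

```python
grad_f = lambda x: 2.0 * x                   # f(x) = x^2
A, B = 1e-6, 1.0                             # g(x) = A*f(B*x) + C, same minimizer
grad_g = lambda x: A * B * grad_f(B * x)     # chain rule

x, threshold = 0.1, 1e-4
print(abs(grad_f(x)) < threshold)            # False: f looks unconverged at x
print(abs(grad_g(x)) < threshold)            # True:  g looks converged at the same x
```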

Slide 12: Convexity
- A function is convex if f(αx + (1−α)y) ≤ αf(x) + (1−α)f(y) for all x, y and all α ∈ [0, 1]
- Eliminates the possibility of multiple strict local minima
- Strictly convex (strict inequality for x ≠ y, 0 < α < 1): at most one local minimum
- Very good property for a problem to have!

Slide 13: Selecting a Step Size
- Scaling problem again: the physical dimensions of x and of the gradient may not match
- Choosing a step too large: may end up further from the minimum
- Choosing a step too small: slow, maybe too slow to actually converge
- Line search: keep picking different step sizes until satisfied
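One standard way to "keep picking step sizes until satisfied" is backtracking with a sufficient-decrease (Armijo) condition; the slide doesn't name a particular line search, so this choice is an assumption:

```python
import numpy as np

def backtracking_line_search(f, grad, x, d, alpha=1.0, rho=0.5, c=1e-4):
    # Shrink alpha until the step gives sufficient decrease (Armijo condition)
    fx, slope = f(x), grad(x) @ d            # slope is negative for a descent direction d
    while f(x + alpha * d) > fx + c * alpha * slope and alpha > 1e-12:
        alpha *= rho                         # step too large: back off
    return alpha
```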
