An Introduction to the Conjugate Gradient Method without the Agonizing Pain, by Jonathan Richard Shewchuk. Reading group presentation by David Cline.

Linear System: Ax = b, where A is a square matrix, x is the unknown vector (what we want to find), and b is the known vector.

Matrix Multiplication: the product y = Ax has components y_i = sum_j A_ij x_j.

Positive Definite Matrix: A is positive definite if x^T A x > 0 for every nonzero vector x = [x1 x2 ... xn]^T. Also, all eigenvalues of the matrix are positive.
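As a quick aside (not from the slides), both characterizations can be spot-checked numerically; the symmetric matrix below and the use of NumPy are choices made here for illustration.

```python
import numpy as np

# Arbitrary symmetric example matrix (chosen here for illustration only).
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])

# Characterization 1: x^T A x > 0 for every nonzero x (spot-check random vectors).
rng = np.random.default_rng(0)
xs = rng.standard_normal((1000, 2))
print(np.all(np.einsum('ij,jk,ik->i', xs, A, xs) > 0))  # True

# Characterization 2: all eigenvalues are positive.
print(np.all(np.linalg.eigvalsh(A) > 0))                # True
```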

Quadratic form: an expression of the form f(x) = (1/2) x^T A x - b^T x + c.

Why do we care? The gradient of the quadratic form is f'(x) = (1/2)(A^T + A) x - b. If A is symmetric this reduces to f'(x) = Ax - b, so setting the gradient to zero recovers our original system Ax = b.
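A minimal sketch (mine, not the presentation's) that checks this claim: for a symmetric A, a finite-difference gradient of f should match Ax - b.

```python
import numpy as np

A = np.array([[4.0, 1.0], [1.0, 3.0]])   # symmetric, positive definite
b = np.array([1.0, 2.0])
c = 5.0

def f(x):
    """Quadratic form f(x) = 1/2 x^T A x - b^T x + c."""
    return 0.5 * x @ A @ x - b @ x + c

x = np.array([0.7, -1.2])
h = 1e-6
# Central finite differences of f, one coordinate at a time.
fd_grad = np.array([(f(x + h * e) - f(x - h * e)) / (2 * h) for e in np.eye(2)])
print(fd_grad)        # approximately equal to...
print(A @ x - b)      # ...the analytic gradient Ax - b
```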

Visual interpretation

Example Problem:
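The slide's equations are images and do not survive in this transcript. Shewchuk's paper uses the running example A = [[3, 2], [2, 6]], b = [2, -8], c = 0; assuming the slide follows the paper, the code sketches below reuse it.

```python
import numpy as np

# Running example from Shewchuk's paper (assumed to match the slide).
A = np.array([[3.0, 2.0],
              [2.0, 6.0]])
b = np.array([2.0, -8.0])

# Exact solution for reference; the iterative methods sketched below should approach it.
x_exact = np.linalg.solve(A, b)
print(x_exact)  # [ 2. -2.]
```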

Visual representation: plots of f(x) and its derivative f'(x).

Solution: if A is symmetric, the solution x of Ax = b is a critical point of f, and since A is positive definite, x is the global minimum of f.

Definitions: the error e_i = x_i - x measures how far we are from the solution; the residual r_i = b - A x_i = -f'(x_i) measures how far we are from the correct value of b. Whenever you read 'residual', think 'the direction of steepest descent'.

Method of steepest descent: start at an arbitrary point x_0; move in the direction opposite the gradient of f, i.e. along the residual r_0; stop at the minimum of f along that direction, a distance alpha away; repeat.

Steepest descent, mathematically: alpha_i = (r_i^T r_i) / (r_i^T A r_i) and x_(i+1) = x_i + alpha_i r_i, where the residual is recomputed as r_i = b - A x_i, or, equivalently, updated recurrently as r_(i+1) = r_i - alpha_i A r_i.
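A minimal NumPy sketch of these updates (the function name, tolerance, and iteration cap are choices made here, not part of the presentation):

```python
import numpy as np

def steepest_descent(A, b, x0, tol=1e-10, max_iter=1000):
    """Minimize f(x) = 1/2 x^T A x - b^T x by repeated steepest-descent steps."""
    x = x0.astype(float)
    for i in range(max_iter):
        r = b - A @ x                    # residual = direction of steepest descent
        if np.linalg.norm(r) < tol:
            return x, i                  # converged
        alpha = (r @ r) / (r @ A @ r)    # exact line search along r
        x = x + alpha * r
    return x, max_iter

A = np.array([[3.0, 2.0], [2.0, 6.0]])
b = np.array([2.0, -8.0])
x, iters = steepest_descent(A, b, np.zeros(2))
print(x, iters)   # approaches the exact solution [2, -2]
```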

Steepest descent, graphically

Eigenvectors: a nonzero vector v is an eigenvector of A if Av = lambda v for some scalar lambda, the corresponding eigenvalue.

Steepest descent does well: it converges in one iteration if the error term is an eigenvector, or if all the eigenvalues are equal.
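A small numerical illustration of the first claim (mine, not the slides'): if the starting error is an eigenvector of A, a single exact line-search step lands on the solution.

```python
import numpy as np

A = np.array([[3.0, 2.0], [2.0, 6.0]])
b = np.array([2.0, -8.0])
x_exact = np.linalg.solve(A, b)

# Start so that the error x0 - x_exact is an eigenvector of A.
_, eigvecs = np.linalg.eigh(A)
x0 = x_exact + eigvecs[:, 0]

r = b - A @ x0
alpha = (r @ r) / (r @ A @ r)
x1 = x0 + alpha * r
print(np.linalg.norm(x1 - x_exact))   # ~0: converged in a single step
```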

Steepest descent does poorly: if the error term is a mix of eigenvectors with large and small eigenvalues, steepest descent zigzags toward the solution and takes many iterations to converge. The worst-case convergence rate depends on the ratio of the largest and smallest eigenvalues of A, called the "condition number": kappa = lambda_max / lambda_min.

Convergence of steepest descent: the energy norm of the error after i iterations satisfies ||e_i||_A <= ((kappa - 1) / (kappa + 1))^i ||e_0||_A, where ||e||_A = sqrt(e^T A e) is the energy norm.
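An illustrative computation (not from the slides) of the condition number and the resulting worst-case factor for the example matrix:

```python
import math
import numpy as np

A = np.array([[3.0, 2.0], [2.0, 6.0]])
eigs = np.linalg.eigvalsh(A)
kappa = eigs.max() / eigs.min()        # condition number: largest / smallest eigenvalue
factor = (kappa - 1) / (kappa + 1)     # worst-case per-iteration energy-norm reduction
print(kappa, factor)                   # 3.5, ~0.556

# Worst-case iterations to shrink the energy norm of the error by a factor of 1e6:
print(math.ceil(math.log(1e-6) / math.log(factor)))   # 24
```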

How can we speed up or guarantee convergence? Use the eigenvectors as search directions: taking one exact step along each eigenvector terminates in n iterations.

Method of conjugate directions: instead of eigenvectors, which are too hard to compute, use directions that are "conjugate" or "A-orthogonal": d_i^T A d_j = 0 for i != j.

Method of conjugate directions: take x_(i+1) = x_i + alpha_i d_i with alpha_i = (d_i^T r_i) / (d_i^T A d_i), the exact minimizer of f along d_i.

How to find conjugate directions? Gram-Schmidt conjugation: start with n linearly independent vectors u_0 ... u_(n-1); for each vector, subtract the components that are not A-orthogonal to the previously processed directions: d_i = u_i + sum_(k<i) beta_ik d_k, with beta_ik = -(u_i^T A d_k) / (d_k^T A d_k).
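A sketch of Gram-Schmidt conjugation in NumPy; the function and variable names are chosen here for illustration, not taken from the presentation.

```python
import numpy as np

def gram_schmidt_conjugate(A, U):
    """Turn the columns of U (linearly independent) into A-orthogonal directions."""
    n = U.shape[1]
    D = np.zeros_like(U, dtype=float)
    for i in range(n):
        d = U[:, i].astype(float)
        for k in range(i):
            # Subtract the component of u_i that is not A-orthogonal to d_k.
            beta = -(U[:, i] @ A @ D[:, k]) / (D[:, k] @ A @ D[:, k])
            d += beta * D[:, k]
        D[:, i] = d
    return D

A = np.array([[3.0, 2.0], [2.0, 6.0]])
D = gram_schmidt_conjugate(A, np.eye(2))   # start from the coordinate axes
print(D[:, 0] @ A @ D[:, 1])               # ~0: the directions are A-orthogonal
```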

Problem: Gram-Schmidt conjugation is slow and we have to store all of the vectors we have created.

Conjugate Gradient Method: apply the method of conjugate directions, but use the residuals as the u vectors: u_i = r_i.

How does this help us? It turns out that the residual r_i is A-orthogonal to all of the previous residuals except r_(i-1), so we simply make the new search direction A-orthogonal to r_(i-1), and we are set.

Simplifying further: only the k = i-1 term of the Gram-Schmidt sum survives, and it simplifies to beta_i = (r_i^T r_i) / (r_(i-1)^T r_(i-1)).

Putting it all together:
d_0 = r_0 = b - A x_0 (start with steepest descent)
alpha_i = (r_i^T r_i) / (d_i^T A d_i) (compute the distance to the bottom of the parabola)
x_(i+1) = x_i + alpha_i d_i (slide down to the bottom of the parabola)
r_(i+1) = r_i - alpha_i A d_i (compute steepest descent at the next location)
beta_(i+1) = (r_(i+1)^T r_(i+1)) / (r_i^T r_i), d_(i+1) = r_(i+1) + beta_(i+1) d_i (remove the part of the vector that is not A-orthogonal to d_i)
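Putting those updates into code gives the sketch below (again NumPy, with function and parameter names of my choosing); it is a standard CG loop matching the slide's steps, not code taken from the presentation.

```python
import numpy as np

def conjugate_gradient(A, b, x0=None, tol=1e-10, max_iter=None):
    """Conjugate gradient for symmetric positive definite A, following the slide's steps."""
    n = b.shape[0]
    x = np.zeros(n) if x0 is None else x0.astype(float)
    max_iter = n if max_iter is None else max_iter
    r = b - A @ x                      # start with steepest descent: d_0 = r_0
    d = r.copy()
    rs_old = r @ r
    for _ in range(max_iter):
        if np.sqrt(rs_old) < tol:
            break
        Ad = A @ d
        alpha = rs_old / (d @ Ad)      # distance to the bottom of the parabola
        x = x + alpha * d              # slide down to the bottom of the parabola
        r = r - alpha * Ad             # steepest descent at the next location
        rs_new = r @ r
        beta = rs_new / rs_old         # remove the part not A-orthogonal to d_i
        d = r + beta * d
        rs_old = rs_new
    return x

A = np.array([[3.0, 2.0], [2.0, 6.0]])
b = np.array([2.0, -8.0])
print(conjugate_gradient(A, b))        # [ 2. -2.] after at most n = 2 iterations
```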

Starting and stopping: start either with a rough estimate of the solution, or with the zero vector. Stop when the norm of the residual is small enough (e.g., below a small fraction of the initial residual norm).

Benefit over steepest descent: CG's worst-case bound depends on the square root of the condition number: ||e_i||_A <= 2 ((sqrt(kappa) - 1) / (sqrt(kappa) + 1))^i ||e_0||_A.

Preconditioning: instead of Ax = b, solve M^(-1) A x = M^(-1) b, where M approximates A but is easy to invert; if M^(-1) A has a smaller condition number, CG converges faster.

Diagonal preconditioning: just use the diagonal of A as M. A diagonal matrix is easy to invert, but of course it isn’t the best method out there.
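A sketch of diagonally preconditioned (Jacobi) CG, assuming the standard preconditioned-CG recurrences rather than anything shown on the slides; applying M^(-1) is just an elementwise division by diag(A). Function and variable names are mine.

```python
import numpy as np

def jacobi_pcg(A, b, tol=1e-10, max_iter=None):
    """CG with M = diag(A): apply M^{-1} (an elementwise division) to the residual."""
    n = b.shape[0]
    max_iter = n if max_iter is None else max_iter
    Minv = 1.0 / np.diag(A)            # inverting a diagonal matrix is trivial
    x = np.zeros(n)
    r = b - A @ x
    z = Minv * r                       # preconditioned residual M^{-1} r
    d = z.copy()
    rz_old = r @ z
    for _ in range(max_iter):
        if np.linalg.norm(r) < tol:
            break
        Ad = A @ d
        alpha = rz_old / (d @ Ad)
        x = x + alpha * d
        r = r - alpha * Ad
        z = Minv * r
        rz_new = r @ z
        d = z + (rz_new / rz_old) * d
        rz_old = rz_new
    return x

A = np.array([[3.0, 2.0], [2.0, 6.0]])
b = np.array([2.0, -8.0])
print(jacobi_pcg(A, b))                # [ 2. -2.]
```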

CG on the normal equations: if A is not symmetric, not positive-definite, or not square, we can’t use CG directly to solve Ax = b. However, we can apply it to the normal equations A^T A x = A^T b: the matrix A^T A is always square and symmetric, and it is positive definite whenever A has full column rank. The problem we solve this way is the least-squares fit, minimizing ||Ax - b||^2, but the condition number becomes kappa(A)^2. Also note that we never actually have to form A^T A; instead we multiply by A and then by A^T.
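A sketch of CG on the normal equations that never forms A^T A explicitly: each iteration multiplies by A and then by A^T. The function name and the rectangular test problem are invented here for illustration.

```python
import numpy as np

def cg_normal_equations(A, b, tol=1e-10, max_iter=None):
    """Solve min ||Ax - b||^2 via CG on A^T A x = A^T b, using only products with A and A^T."""
    n = A.shape[1]
    max_iter = n if max_iter is None else max_iter
    x = np.zeros(n)
    r = A.T @ (b - A @ x)              # residual of the normal equations
    d = r.copy()
    rs_old = r @ r
    for _ in range(max_iter):
        if np.sqrt(rs_old) < tol:
            break
        Ad = A @ d                      # multiply by A ...
        AtAd = A.T @ Ad                 # ... then by A^T (never forming A^T A)
        alpha = rs_old / (d @ AtAd)
        x = x + alpha * d
        r = r - alpha * AtAd
        rs_new = r @ r
        d = r + (rs_new / rs_old) * d
        rs_old = rs_new
    return x

# Overdetermined least-squares example (3 equations, 2 unknowns), chosen for illustration.
A = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
b = np.array([1.0, 2.0, 2.0])
print(cg_normal_equations(A, b))
print(np.linalg.lstsq(A, b, rcond=None)[0])   # should agree
```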
