Singular Value Decomposition COS 323

Underconstrained Least Squares
What if you have fewer data points than parameters in your function?
– Intuitively, you can't do standard least squares
– Recall that the solution takes the form A^T A x = A^T b
– When A has more columns than rows, A^T A is singular: you can't take its inverse, etc.
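
A tiny Octave/Matlab sketch of that failure mode (the 2 × 3 matrix below is made up purely for illustration): with fewer rows than columns, A'*A cannot have full rank, so the normal equations have no unique solution.

    % 2 data points, 3 unknowns: A is 2x3, so A'*A is 3x3 but only rank 2
    A = [1 2 3;
         4 5 6];
    M = A' * A;
    rank(M)        % prints 2, not 3: M is singular
    % inv(M) would fail (or warn and return Inf entries), so the normal
    % equations A'*A*x = A'*b cannot be solved by inversion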

Underconstrained Least Squares
More subtle version: more data points than unknowns, but the data poorly constrains the function
Example: fitting to y = ax^2 + bx + c when the data covers only a narrow range of x

Underconstrained Least Squares
Problem: if the problem is very close to singular, roundoff error can have a huge effect
– Even on "well-determined" values!
Can detect this:
– Uncertainty is proportional to the covariance C = (A^T A)^{-1}
– In other words, unstable if A^T A has small values
– More precisely, worry if x^T (A^T A) x is small for some x
Idea: if part of the solution is unstable, set that part of the answer to 0
– Avoid corrupting the good parts of the answer
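
As a hedged illustration of how this shows up in practice, here is a small Octave/Matlab sketch (the data values are invented) that fits the quadratic above to points whose x values barely vary; the condition number of A'*A explodes even though the system is formally overdetermined.

    % Fit y = a*x^2 + b*x + c to data whose x values barely vary
    x = [1.00; 1.01; 1.02; 1.03];      % nearly identical x values (made up)
    y = [2.0; 2.1; 1.9; 2.05];
    A = [x.^2, x, ones(size(x))];      % design matrix for the quadratic
    cond(A' * A)                       % enormous: the fitted a, b, c are unstable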

Singular Value Decomposition (SVD)
Handy mathematical technique that has applications to many problems
Given any m × n matrix A, an algorithm to find matrices U, V, and W such that
A = U W V^T
– U is m × n and orthonormal
– W is n × n and diagonal
– V is n × n and orthonormal

SVD
Treat it as a black box: code is widely available
In Matlab: [U,W,V] = svd(A,0)
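
As a quick sanity check, a short Octave/Matlab sketch (using a random matrix purely for illustration) verifying that the factors returned by svd(A,0) multiply back to A and have orthonormal columns:

    A = randn(5, 3);                 % any m x n matrix
    [U, W, V] = svd(A, 0);           % "economy" SVD: U is 5x3, W and V are 3x3
    norm(A - U*W*V', 'fro')          % ~1e-15: the factorization reproduces A
    norm(U'*U - eye(3), 'fro')       % ~1e-15: columns of U are orthonormal
    norm(V'*V - eye(3), 'fro')       % ~1e-15: V is orthogonal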

SVD
The w_i are called the singular values of A
If A is singular, some of the w_i will be 0
In general, rank(A) = number of nonzero w_i
The SVD is mostly unique (up to permutation of singular values, or if some w_i are equal)
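
A small sketch of reading the rank off the singular values (the matrix below is constructed so that one row is a multiple of another, hence rank 2):

    A = [1 2 3;
         2 4 6;      % row 2 = 2 * row 1, so A is singular
         1 1 1];
    w = svd(A);      % just the singular values, in decreasing order
    % w(3) is ~0 up to roundoff; counting the "large" w_i gives the rank
    sum(w > max(size(A)) * eps(w(1)))   % prints 2, matching rank(A)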

SVD and Inverses
Why is SVD so useful?
Application #1: inverses
A^{-1} = (V^T)^{-1} W^{-1} U^{-1} = V W^{-1} U^T
– Using the fact that inverse = transpose for orthogonal matrices
– Since W is diagonal, W^{-1} is also diagonal, with the reciprocals of the entries of W

SVD and Inverses
A^{-1} = (V^T)^{-1} W^{-1} U^{-1} = V W^{-1} U^T
This fails when some w_i are 0
– It's supposed to fail: singular matrix
Pseudoinverse: if w_i = 0, set 1/w_i to 0 (!)
– "Closest" matrix to an inverse
– Defined for all matrices (even non-square, singular, etc.)
– Equal to (A^T A)^{-1} A^T if A^T A is invertible
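
A sketch of building the pseudoinverse exactly this way, zeroing the reciprocals of (near-)zero singular values; for a full-rank A it should agree with Matlab's built-in pinv. The tolerance used here is one common convention, not the only choice.

    A = randn(6, 3);                  % made-up overdetermined matrix
    [U, W, V] = svd(A, 0);
    w = diag(W);
    tol = max(size(A)) * eps(w(1));   % treat anything below this as zero
    winv = zeros(size(w));
    winv(w > tol) = 1 ./ w(w > tol);  % 1/w_i, but 0 where w_i is (near) zero
    Apinv = V * diag(winv) * U';      % the pseudoinverse, n x m
    norm(Apinv - pinv(A), 'fro')      % ~1e-15 for this full-rank A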

SVD and Least Squares
Solving Ax = b by least squares:
x = pseudoinverse(A) times b
Compute the pseudoinverse using SVD
– Lets you see if the data is singular
– Even if not singular, the ratio of the max to min singular values (the condition number) tells you how stable the solution will be
– Set 1/w_i to 0 if w_i is small (even if not exactly 0)
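
A sketch of the whole recipe on a made-up overdetermined system; the backslash comparison at the end is just a sanity check for the well-conditioned case.

    A = randn(8, 3);  b = randn(8, 1);       % made-up overdetermined system
    [U, W, V] = svd(A, 0);
    w = diag(W);
    w(1) / w(end)                            % condition number of A
    winv = zeros(size(w));
    keep = w > 1e-10 * w(1);                 % drop tiny singular values
    winv(keep) = 1 ./ w(keep);
    x = V * diag(winv) * (U' * b);           % least-squares solution via SVD
    norm(x - A \ b)                          % ~1e-15 when A is well-conditioned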

SVD and Eigenvectors
Let A = U W V^T, and let x_i be the i-th column of V
Consider A^T A x_i:
A^T A x_i = V W U^T U W V^T x_i = V W^2 V^T x_i = w_i^2 x_i
So the elements of W are the square roots of the eigenvalues, and the columns of V are the eigenvectors, of A^T A
– What we wanted for robust least-squares fitting!
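
A quick numerical check of this relationship (random matrix for illustration): the eigenvalues of A'*A should equal the squares of the singular values of A.

    A = randn(5, 3);
    w = svd(A);                          % singular values of A, decreasing
    lam = sort(eig(A' * A), 'descend');  % eigenvalues of A'*A
    norm(lam - w.^2)                     % ~1e-14: eigenvalues = singular values squared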

SVD and Matrix Similarity
One common definition for the norm of a matrix is the Frobenius norm:
||A||_F = sqrt( Σ_{i,j} a_{ij}^2 )
The Frobenius norm can be computed from the SVD:
||A||_F = sqrt( Σ_i w_i^2 )
So changes to a matrix can be evaluated by looking at changes to its singular values
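
A one-line confirmation in Octave/Matlab (random matrix for illustration):

    A = randn(4, 6);
    norm(A, 'fro') - norm(svd(A))   % ~1e-15: ||A||_F equals the 2-norm of the singular values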

SVD and Matrix Similarity
Suppose you want to find the best rank-k approximation to A
Answer: set all but the largest k singular values to zero
Can form a compact representation by eliminating the columns of U and V corresponding to the zeroed w_i
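
A sketch of the compact rank-k form (sizes and k chosen arbitrarily for illustration); the reconstruction error comes entirely from the discarded singular values.

    A = randn(100, 50);
    k = 10;
    [U, W, V] = svd(A, 0);
    Ak = U(:, 1:k) * W(1:k, 1:k) * V(:, 1:k)';   % best rank-k approximation of A
    % Frobenius error equals the norm of the dropped singular values:
    norm(A - Ak, 'fro') - norm(diag(W(k+1:end, k+1:end)))   % ~1e-13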

SVD and PCA
Principal Components Analysis (PCA): approximating a high-dimensional data set with a lower-dimensional subspace
[Figure: scatter of data points shown with the original axes and with the first and second principal component directions]

SVD and PCA
Form the data matrix with points as rows and take its SVD
– Subtract out the mean first (centering the data)
The columns of V_k are the principal components
The value of w_i gives the importance of each component
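
A sketch of PCA done through the SVD, on a made-up data matrix with points as rows:

    X = randn(200, 5);                             % 200 points in 5 dimensions (made up)
    Xc = X - repmat(mean(X, 1), size(X, 1), 1);    % subtract the mean of each coordinate
    [U, W, V] = svd(Xc, 0);
    pcs    = V(:, 1:2);                            % first two principal components (columns of V)
    scores = Xc * pcs;                             % data projected onto those components
    diag(W).^2 / (size(X, 1) - 1)                  % variance carried by each component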

PCA on Faces: "Eigenfaces"
[Figure: the average face, the first principal component, and other components; for all except the average, "gray" = 0, "white" > 0, "black" < 0]

Using PCA for Recognition
Store each person as the coefficients of their projection onto the first few principal components
Compute the projection of the target image and compare it to the database (a "nearest neighbor classifier")
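
A heavily simplified sketch of this pipeline; the variables faces, target, and k are hypothetical stand-ins (one vectorized training image per row, a new image to recognize, and the number of components kept), not part of the original slides.

    % faces: m x d matrix, one vectorized face image per row (hypothetical data)
    meanface = mean(faces, 1);
    F  = faces - repmat(meanface, size(faces, 1), 1);
    [U, W, V] = svd(F, 0);
    Vk = V(:, 1:k);                       % first k principal components ("eigenfaces")
    db = F * Vk;                          % stored coefficients, one row per training face

    % Recognize a new image: project it, then nearest neighbor in coefficient space
    t = (target - meanface) * Vk;
    dists = sum((db - repmat(t, size(db, 1), 1)).^2, 2);
    [~, match] = min(dists);              % index of the closest training face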

Total Least Squares
One final least-squares application
Fitting a line: vertical vs. perpendicular error

Total Least Squares
Distance from a point x_i to the line: |x_i · n + a|, where n is the (unit) normal vector to the line and a is a constant
Minimize: Σ_i (x_i · n + a)^2

Total Least Squares
First, let's pretend we know n and solve for a:
setting the derivative of Σ_i (x_i · n + a)^2 with respect to a to zero gives a = -( (1/m) Σ_i x_i ) · n, i.e. minus the centroid x̄ dotted with n
Then the objective becomes Σ_i ( (x_i - x̄) · n )^2

Total Least Squares
So, let's define the centered points x̃_i = x_i - x̄ and minimize Σ_i (x̃_i · n)^2

Total Least Squares
Write this as a linear system: stack the x̃_i^T as the rows of a matrix A
We want An = 0
– Problem: lots of n are solutions, including n = 0
– Standard least squares will, in fact, return n = 0

Constrained Optimization
Solution: constrain n to be unit length
So, try to minimize |An|^2 subject to |n|^2 = 1
Expand n in the eigenvectors e_i of A^T A: n = Σ_i α_i e_i, so that
|An|^2 = n^T (A^T A) n = Σ_i λ_i α_i^2 and |n|^2 = Σ_i α_i^2
where the λ_i are the eigenvalues of A^T A

Constrained Optimization
To minimize Σ_i λ_i α_i^2 subject to Σ_i α_i^2 = 1, set α_min = 1 and all other α_i = 0
That is, n is the eigenvector of A^T A with the smallest corresponding eigenvalue
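
Putting the whole total-least-squares derivation together, a sketch (with made-up points) that fits a line by taking the right singular vector with the smallest singular value of the centered data matrix, which is the same vector as the smallest eigenvector of A^T A described above:

    % Points roughly along a line (made up for illustration)
    x = (0:0.5:5)';
    y = 2*x + 1 + 0.05*randn(size(x));
    P  = [x, y];
    Pc = P - repmat(mean(P, 1), size(P, 1), 1);   % subtract the centroid
    [U, W, V] = svd(Pc, 0);
    n = V(:, end);                % normal: right singular vector with smallest w_i
    a = -mean(P, 1) * n;          % offset chosen so the line passes through the centroid
    % Fitted line: n(1)*x + n(2)*y + a = 0, the perpendicular-error best fit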