Optimal Column-Based Low-Rank Matrix Reconstruction (SODA'12)
Ali Kemal Sinop. Joint work with Prof. Venkatesan Guruswami.

Outline
Introduction – Notation – Problem Definition – Motivations – Results
Upper Bound
Randomized Algorithm
Summary

Notation: Vectors and Matrices
X: an m-by-n real matrix.
X_i: the i-th column of X.
C: a subset of the columns of X.
X_C: the sub-matrix of X on the columns in C.

Formal Problem Definition
Given an m-by-n matrix X = [X_1 X_2 ... X_n], find the set C of r columns minimizing
  Σ_i dist(X_i, span(X_C))²,
the total squared projection distance of the columns X_i to span(X_C). This is equal to
  ‖X_C^⊥ X‖_F² (see below).

What is Distance to Span?
Given any matrix A and vector x, let P_A^⊥ be the orthogonal projection matrix onto the orthogonal complement of span(A), i.e. onto the null space of Aᵀ. Writing x = (I − P_A^⊥)x + P_A^⊥x, the Pythagorean theorem gives
  ‖x‖² = ‖(I − P_A^⊥)x‖² + ‖P_A^⊥x‖².
Thus dist(x, span(A))² = ‖P_A^⊥x‖².
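A minimal numpy sketch of this identity (function and variable names are my own; the least-squares residual realizes P_A^⊥x):

```python
import numpy as np

def dist_to_span_sq(A, x):
    """Squared distance from x to the column span of A.

    The least-squares residual x - A t* equals the projection of x
    onto the orthogonal complement of span(A).
    """
    t, *_ = np.linalg.lstsq(A, x, rcond=None)
    residual = x - A @ t
    return float(residual @ residual)

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 2))
x = rng.standard_normal(5)
proj = A @ np.linalg.lstsq(A, x, rcond=None)[0]
# Pythagorean theorem: ||x||^2 = ||proj||^2 + dist(x, span(A))^2
assert np.isclose(x @ x, proj @ proj + dist_to_span_sq(A, x))
```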

Problem Formulation
Given an m-by-n matrix X = [X_1 X_2 ... X_n], find the set C of r columns minimizing the reconstruction error
  ‖X_C^⊥ X‖_F² = Σ_i ‖X_C^⊥ X_i‖²,
where X_C^⊥ is the orthogonal projection matrix onto the orthogonal complement of span(X_C). No books, no web pages, no images: only geometry.
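For very small instances the problem can be solved by brute force, which makes a useful sanity check on the definition. A sketch, with my own function names:

```python
import numpy as np
from itertools import combinations

def reconstruction_error(X, C):
    """||X_C_perp X||_F^2: squared Frobenius norm of X after projecting
    out span(X_C). Assumes X_C has full column rank."""
    Q, _ = np.linalg.qr(X[:, list(C)])   # orthonormal basis for span(X_C)
    residual = X - Q @ (Q.T @ X)
    return float(np.sum(residual ** 2))

def best_columns(X, r):
    """Exhaustive search over all r-subsets of columns (tiny n only)."""
    n = X.shape[1]
    return min(combinations(range(n), r),
               key=lambda C: reconstruction_error(X, C))
```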

An Example
n=2, m=2, r=1, with an angle of 135° between X_1 and X_2 (figure: X_1 and X_2 drawn from the origin). For C={1}, the error is dist(X_2, span(X_1))² = ‖X_2‖² sin²(135°); for C={2}, it is ‖X_1‖² sin²(135°).
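To make the example concrete, here is a quick numeric check; the slide does not specify the column lengths, so ‖X_1‖ = 1 and ‖X_2‖ = 2 are my own assumptions:

```python
import numpy as np

theta = 3 * np.pi / 4                               # 135 degrees
X1 = np.array([1.0, 0.0])                           # assumed unit length
X2 = 2 * np.array([np.cos(theta), np.sin(theta)])   # assumed length 2

def err_keeping(kept, dropped):
    """Squared distance of `dropped` to span(`kept`)."""
    u = kept / np.linalg.norm(kept)
    residual = dropped - (u @ dropped) * u
    return residual @ residual

print(err_keeping(X1, X2))  # ||X2||^2 sin^2(135 deg) = 4 * 0.5 = 2.0
print(err_keeping(X2, X1))  # ||X1||^2 sin^2(135 deg) = 1 * 0.5 = 0.5
```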

What is the minimum possible?
X is m-by-n and X_C is m-by-r, so the rank of X_C is at most |C| = r. Relax the column restriction to a rank restriction: choose any rank-r matrix X(r) minimizing ‖X − X(r)‖_F². Since |C| ≤ r implies rank ≤ r, the best rank-r error lower-bounds the best column-selection error.

Low Rank Matrix Approximation
Therefore, let X(r) be a rank-r matrix minimizing ‖X − X(r)‖_F². It can be found by the Singular Value Decomposition (SVD).

Singular Values of X
There exist m unique non-negative reals σ_1 ≥ σ_2 ≥ ... ≥ σ_m ≥ 0, the singular values of X. The best rank-r reconstruction error is
  ‖X − X(r)‖_F² = Σ_{i>r} σ_i².
This tail sum acts as a “smooth rank” of X: for example, if rank(X) = k, then σ_{k+1} = ... = σ_m = 0 and the best rank-k error is 0.
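Both facts, that truncating the SVD is optimal (Eckart-Young) and that its error is the tail of the squared singular values, are easy to check numerically; a sketch:

```python
import numpy as np

def best_rank_r(X, r):
    """Best rank-r approximation of X in Frobenius norm (Eckart-Young),
    obtained by truncating the SVD."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 8))
r = 3
s = np.linalg.svd(X, compute_uv=False)
# Best rank-r error equals the sum of the squared tail singular values.
assert np.isclose(np.sum((X - best_rank_r(X, r)) ** 2), np.sum(s[r:] ** 2))
```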

First Example, Revisited
n=2, m=2, r=1. Remember the two columns at 135° (figure: X_1 and X_2 drawn from the origin). Quick check: how does the best single-column error compare with the best rank-1 error σ_2²? Is this the worst possible gap?

Our Goal: Do As Well As the Best Rank-k
Given a target rank k and allowed error ε > 0, choose the smallest C with |C| = r such that
  ‖X_C^⊥ X‖_F² ≤ (1+ε) Σ_{i>k} σ_i²,
the right-hand side being (1+ε) times the best possible rank-k approximation error. How does r depend on k and ε?

Practical Motivations
[Drineas, Mahoney'09] DNA microarray data: unsupervised feature selection for cancer detection; column selection followed by k-means gives better classification. Many other classification problems use the same idea.

Theoretical Applications (our motivation)
Throughout, r = r(k, ε) is the number of columns needed to get within a (1+ε) factor of the best rank-k approximation.
[Guruswami, S'11] Approximation schemes for many graph partitioning problems; running time exponential in r(k, ε), where k = number of eigenvalues < 1−ε.
[Guruswami, S'12] Significantly faster algorithms for sparsest cut and related problems; running time exponential in r(k, ε), where k = number of eigenvalues < Φ/ε.

Previous Results
[Frieze, Kannan, Vempala'04] introduced this problem. Further progress: [Deshpande, Vempala'06], [Sarlos'06]. [Deshpande, Rademacher'10]: r = k columns suffice for a (k+1)-factor approximation.

Recent Results
[This paper] We showed:
– r = k/ε + k − 1 columns suffice, and r ≥ k/ε − o(k) columns are necessary; so r is optimal up to low-order terms.
– A randomized algorithm and, building on [Deshpande, Rademacher'10], a deterministic algorithm (the running-time bounds, stated in terms of the matrix multiplication exponent ω, are omitted here).
(Independently) [Boutsidis, Drineas, Magdon-Ismail'11]: r ≤ 2k/ε columns, in randomized time O(knm/ε + k³ε^(−2/3)n).

Outline
Introduction
Upper Bound – Strategy – An Algebraic Expression – Eliminating Min – Wrapping Up
Randomized Algorithm
Summary

Upper Bound
Input: an m-by-n matrix X, target rank k, number of columns r.
Problem: relate min_{|C|=r} ‖X_C^⊥ X‖_F² to Σ_{i>k} σ_i², the best possible rank-k approximation error.
Our approach:
– Represent the error in an algebraic form.
– Eliminate the minimum by randomly sampling C.
– Express the expected error as a function of the σ's.
– Bound it in terms of Σ_{i>k} σ_i².

An Algebraic Expression
Remember, our problem is to minimize ‖X_C^⊥ X‖_F² over |C| = r. The projection matrix X_C^⊥ is hard to manipulate directly. Is there an equivalent algebraic expression?

Base Case: r=1
A simple case. When C = {c}, each term is the squared distance of X_i to the line through X_c:
  dist(X_i, span(X_c))² = ‖X_i‖² − ⟨X_i, X_c⟩²/‖X_c‖².

Case of r=2
Consider C = {c, d} in 3 dimensions: dist(X_i, span(X_c, X_d)) is the height of X_i above the plane spanned by X_c and X_d, i.e. the height of the parallelepiped spanned by X_c, X_d, X_i over the base spanned by X_c and X_d.

General Case
Fact: the squared volume of the parallelepiped spanned by a set of vectors equals the determinant of their Gram matrix, e.g. vol(X_C)² = det(X_Cᵀ X_C). Using the volume = base-volume × height formula,
  dist(X_i, span(X_C))² = det(X_{C∪{i}}ᵀ X_{C∪{i}}) / det(X_Cᵀ X_C).
Hence
  ‖X_C^⊥ X‖_F² = Σ_i det(X_{C∪{i}}ᵀ X_{C∪{i}}) / det(X_Cᵀ X_C).
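The base-times-height identity for Gram determinants can be verified numerically; a sketch with random data (A plays the role of X_C, x the role of X_i):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3))   # plays the role of X_C
x = rng.standard_normal(6)        # plays the role of X_i

# vol([A x])^2 / vol(A)^2 via Gram determinants
Ax = np.column_stack([A, x])
det_ratio = np.linalg.det(Ax.T @ Ax) / np.linalg.det(A.T @ A)

# Squared distance of x to span(A), via least squares
t, *_ = np.linalg.lstsq(A, x, rcond=None)
residual = x - A @ t
assert np.isclose(det_ratio, residual @ residual)
```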

Eliminating Min
Volume sampling [Deshpande, Rademacher, Vempala, Wang'06]: choose C with probability proportional to vol(X_C)² = det(X_Cᵀ X_C). The minimum over C is then at most the expected error of the sampled C.
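A naive sketch of volume sampling by explicit enumeration; this is exponential in n and for illustration only (the cited paper gives efficient implementations):

```python
import numpy as np
from itertools import combinations

def volume_sample(X, r, rng):
    """Sample an r-subset C of columns with P(C) proportional to
    det(X_C^T X_C) = vol(X_C)^2. Enumeration is for illustration only."""
    n = X.shape[1]
    subsets = list(combinations(range(n), r))
    vols = np.array([np.linalg.det(X[:, C].T @ X[:, C]) for C in subsets])
    probs = vols / vols.sum()
    return subsets[rng.choice(len(subsets), p=probs)]

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 7))
print(volume_sample(X, 3, rng))
```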

Symmetric Forms
Fact: for any k, the k-th elementary symmetric polynomial of the squared singular values satisfies
  e_k(σ_1², ..., σ_m²) = Σ_{|S|=k} Π_{i∈S} σ_i² = Σ_{|C|=k} det(X_Cᵀ X_C),
the last equality by Cauchy-Binet. Combining this with the determinant formula for the error, volume sampling gives
  E[‖X_C^⊥ X‖_F²] = (r+1) e_{r+1}(σ²) / e_r(σ²).
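The Cauchy-Binet identity above can be checked numerically; in this sketch e_r is read off the characteristic polynomial whose roots are the σ_i²:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(1)
X = rng.standard_normal((4, 6))
r = 2

lhs = sum(np.linalg.det(X[:, C].T @ X[:, C])
          for C in combinations(range(X.shape[1]), r))

s = np.linalg.svd(X, compute_uv=False)
# Monic polynomial with roots sigma_i^2 has coefficients (-1)^k e_k.
coeffs = np.poly(s ** 2)
e_r = (-1) ** r * coeffs[r]
assert np.isclose(lhs, e_r)
```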

Schur Concavity
The ratio e_{r+1}(σ²)/e_r(σ²) is Schur-concave: flattening the spectrum only increases it. In other words, the worst case for the bound is σ_1 = ... = σ_k and σ_{k+1} = ... = σ_m (figure: bar charts of the spectrum, progressively flattened under the majorization order).

Wrapping Up
Evaluating the bound at the worst-case (flattened) spectrum gives
  E[‖X_C^⊥ X‖_F²] ≤ (r+1)/(r+1−k) · Σ_{i>k} σ_i².
For r = k/ε + k − 1 the factor is 1+ε. QED

Algorithms for Choosing C (Main idea)
Volume sampling admits a nice recursion. Randomized algorithm:
1. Choose the first column j with its correct marginal probability under volume sampling (the probability formula and the time bound for this step are omitted here).
2. Add j to C.
3. For all i, replace X_i by its projection onto the orthogonal complement of X_j.
4. Recursively choose r−1 columns from these projected vectors.
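A sketch of the recursion's shape (pick a column, project it out, recurse). Caveat: for simplicity it samples each column with probability proportional to its squared residual norm, i.e. adaptive sampling in the style of [Deshpande, Vempala'06], not the exact volume-sampling marginals used in the paper:

```python
import numpy as np

def recursive_column_select(X, r, rng):
    """Pick r columns by the recursion: sample a column, project all
    columns onto its orthogonal complement, repeat on the residuals.

    Assumes r <= rank(X). Sampling probabilities here are proportional
    to squared residual norms, a simplification of the paper's scheme.
    """
    Y = X.astype(float)
    chosen = []
    for _ in range(r):
        norms = np.sum(Y ** 2, axis=0)
        j = rng.choice(Y.shape[1], p=norms / norms.sum())
        chosen.append(int(j))
        q = Y[:, j] / np.linalg.norm(Y[:, j])
        Y = Y - np.outer(q, q @ Y)   # project out the chosen direction
    return chosen
```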

Outline
Introduction
Upper Bound
Randomized Algorithm
Summary

Summary
(Upper Bound) r = k/ε + k − 1 columns suffice to achieve (1+ε) times the best rank-k error.
(Randomized) Such columns can be found in time r · T_SVD, where T_SVD is the time of one SVD; for constant ε this is O(k) · T_SVD.
(Lower Bound) k/ε − o(k) columns are needed.
Thanks! Job market alert.
