Information Retrieval in Text, Part III. Reference: Michael W. Berry and Murray Browne, Understanding Search Engines: Mathematical Modeling and Text Retrieval, SIAM. Reading Assignment: Chapter 4.

Outline: Matrix Decompositions
- QR Factorization
- Singular Value Decomposition
- Updating Techniques

Matrix Decomposition: To produce a reduced-rank approximation of the m × n term-by-document matrix A, one must identify the dependence between the columns or rows of A. For a rank-k matrix, the k basis vectors of its column space serve in place of its n columns to represent its column space.

QR Factorization: The QR factorization of a matrix A is defined as A = QR, where Q is an m × m orthogonal matrix and R is an m × n upper triangular matrix.
- A square matrix is orthogonal if its columns are orthonormal, i.e., if q_j denotes a column of the orthogonal matrix Q, then q_j has unit Euclidean norm (||q_j||_2 = 1 for j = 1, 2, …, m) and is orthogonal to every other column of Q (q_j^T q_i = 0 for all i ≠ j).
- The rows of Q are also orthonormal, i.e., Q^T Q = QQ^T = I.
- Such a factorization exists for any matrix A.
- There are many ways to compute the factorization.
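
As an illustration of the definition above, here is a minimal NumPy sketch (NumPy and the random test matrix are assumptions, not part of the slides) that factors a small term-by-document matrix and checks the orthogonality of Q:

```python
import numpy as np

# Small random stand-in for a term-by-document matrix (values illustrative).
A = np.random.rand(8, 5)

# Full QR factorization: Q is m x m orthogonal, R is m x n upper triangular.
Q, R = np.linalg.qr(A, mode="complete")

print(np.allclose(Q.T @ Q, np.eye(Q.shape[0])))  # columns of Q are orthonormal
print(np.allclose(A, Q @ R))                     # A = QR is recovered exactly
```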

QR Factorization: Given A = QR, the columns of the matrix A are all linear combinations of the columns of Q. Thus, a subset of k of the columns of Q forms a basis for the column space of A, where k = rank(A).

QR Factorization: Example

The QR factorization of the previous example can be represented in partitioned form, with Q = [Q_1 Q_2].
- Note that the first 7 columns of Q, denoted Q_1, are orthonormal and hence constitute a basis for the column space of A.
- The bottom zero submatrix of R is not always guaranteed to be produced automatically by the QR factorization; column pivoting may be needed to guarantee the zero submatrix.
- Q_2 does not contribute to producing any nonzero value in A.

QR Factorization: One motivation for using the QR factorization is that the basis vectors can be used to describe the semantic content of the corresponding text collection. The cosines of the angles θ_j between a query vector q and the document vectors a_j are given by cos θ_j = (a_j^T q) / (||a_j||_2 ||q||_2), j = 1, …, n. Note that for the query "Child Proofing" it gives exactly the same cosines. Why?
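
A small sketch of the cosine computation just described, assuming queries and document columns are plain NumPy vectors (the data and the helper name cosine_scores are illustrative, not from the slides):

```python
import numpy as np

def cosine_scores(A, q):
    """Cosine of the angle between query q and each column a_j of A."""
    numerators = A.T @ q                      # a_j^T q for every column j
    denominators = np.linalg.norm(A, axis=0) * np.linalg.norm(q)
    return numerators / denominators

# Illustrative data only: 8 terms, 5 documents, query with two terms present.
A = np.random.rand(8, 5)
q = np.zeros(8)
q[[2, 6]] = 1.0                               # hypothetical query-term positions
print(cosine_scores(A, q))
```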

Frobenius Matrix Norm. Definition: The Frobenius matrix norm ||·||_F of an m × n matrix B = [b_ij] is defined by ||B||_F = ( Σ_{i=1}^{m} Σ_{j=1}^{n} b_ij^2 )^{1/2}.
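
The definition can be checked directly in a few lines; this sketch (NumPy assumed, data illustrative) compares the formula above with NumPy's built-in Frobenius norm:

```python
import numpy as np

B = np.arange(6, dtype=float).reshape(2, 3)    # illustrative 2 x 3 matrix

# Direct use of the definition: square root of the sum of squared entries.
frob_manual = np.sqrt(np.sum(B ** 2))
# NumPy's built-in Frobenius norm should agree.
frob_numpy = np.linalg.norm(B, ord="fro")
print(frob_manual, frob_numpy)
```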

Low Rank Approximation for QR Factorization: Initially, the rank of A is not known. However, after performing the QR factorization, its rank is obviously the rank of R. With column pivoting, we know that there exists a permutation matrix P such that AP = QR, where the larger entries of R are moved toward the upper left corner. Such an arrangement, when possible, partitions R so that the smallest entries are isolated in the bottom submatrix.
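
A hedged sketch of column pivoting using SciPy's pivoted QR (scipy.linalg.qr with pivoting=True; the rank-deficient test matrix is illustrative only):

```python
import numpy as np
from scipy.linalg import qr

# Rank-deficient example: the last column duplicates the first.
A = np.random.rand(6, 4)
A[:, 3] = A[:, 0]

# Column pivoting (AP = QR) moves the large entries of R toward the upper
# left, so negligible entries collect in the bottom rows of R.
Q, R, piv = qr(A, pivoting=True)
print(np.allclose(A[:, piv], Q @ R))   # AP = QR holds
print(np.abs(np.diag(R)))              # last diagonal entry is (near) zero
```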

Low Rank Approximation for QR Factorization

Computing: Redefining R_22 to be the 4 × 2 zero matrix, the modified upper triangular matrix R has rank 5 rather than 7.
- Hence, the resulting approximation to A has rank 5.
- Show that ||E||_F = ||R_22||_F.
- Show that ||E||_F / ||A||_F = ||R_22||_F / ||R||_F = 0.3237.
- Therefore, the relative change of 32.37% in R yields the same relative change in A.
- With r = 4, the relative change is 76%.
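
The relationship ||E||_F / ||A||_F = ||R_22||_F / ||R||_F stated above can be verified numerically; a minimal sketch, assuming NumPy/SciPy and an arbitrary test matrix rather than the slides' example:

```python
import numpy as np
from scipy.linalg import qr

A = np.random.rand(9, 7)            # illustrative matrix, not the slides' example
Q, R, piv = qr(A, pivoting=True)    # A[:, piv] = Q R

r = 5                               # keep a rank-5 approximation
R_trunc = R.copy()
R_trunc[r:, :] = 0.0                # redefine the bottom block (R_22) as zero

A_r = Q @ R_trunc                   # rank-r approximation of the permuted A
E = A[:, piv] - A_r                 # perturbation E; Q is orthogonal, so
                                    # ||E||_F = ||R - R_trunc||_F = ||R_22||_F
print(np.linalg.norm(E, "fro") / np.linalg.norm(A, "fro"))
print(np.linalg.norm(R[r:, :], "fro") / np.linalg.norm(R, "fro"))
```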

Low Rank Approximation for QR Factorization: Example

Comparing Cosine Similarities for the Query "Child Proofing" (table columns: Doc, A, r = 5, r = …)

Comparing Cosine Similarities for the Query "Child Home Safety" (table columns: Doc, A, r = 5, r = …)

Singular Value Decomposition: While the QR factorization provides a reduced-rank basis for the column space, no information is provided about the row space of A. The SVD can provide:
- a reduced-rank approximation for both spaces;
- a rank-k approximation to A of minimal change for any value of k.

Singular Value Decomposition: A = U Σ V^T, where
- U: m × m orthogonal matrix whose columns define the left singular vectors of A;
- V: n × n orthogonal matrix whose columns define the right singular vectors of A;
- Σ: m × n diagonal matrix containing the singular values σ_1 ≥ σ_2 ≥ … ≥ σ_min{m,n}.
Such a factorization exists for any matrix A.
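
A minimal NumPy sketch of the factorization just defined (the test matrix is illustrative; np.linalg.svd returns the singular values as a vector, so Σ is rebuilt explicitly here):

```python
import numpy as np

A = np.random.rand(8, 5)                # illustrative term-by-document matrix

# full_matrices=True gives U (m x m) and V^T (n x n); s holds the singular
# values sigma_1 >= sigma_2 >= ... that sit on the diagonal of Sigma.
U, s, Vt = np.linalg.svd(A, full_matrices=True)

Sigma = np.zeros_like(A)
np.fill_diagonal(Sigma, s)
print(np.allclose(A, U @ Sigma @ Vt))   # A = U Sigma V^T
print(s)                                # non-increasing singular values
```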

Component Matrices of the SVD

SVD vs. QR: What is the relationship between the rank of A and the ranks of the matrices in both factorizations? In QR, the first r_A columns of Q form a basis for the column space of A; in the SVD, so do the first r_A columns of U (where r_A = rank(A)). The first r_A rows of V^T form a basis for the row space of A. The rank-k approximation in the SVD can be obtained by setting all but the k largest singular values in Σ to zero.

SVD Theorem: The rank-k approximation obtained from the SVD is the closest rank-k approximation to A.
- Proven by Eckart and Young.
- They showed that the error in approximating A by A_k is given by ||A − A_k||_F = (σ_{k+1}^2 + σ_{k+2}^2 + … + σ_{rank(A)}^2)^{1/2}, where A_k = U_k Σ_k V_k^T.
- Hence, the error in approximating the original matrix is determined by the discarded singular values (σ_{k+1}, σ_{k+2}, …, σ_{rank(A)}).
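
A short sketch (NumPy assumed; rank_k_approx is a hypothetical helper) confirming that the Frobenius error of the truncated SVD equals the root-sum-of-squares of the discarded singular values:

```python
import numpy as np

def rank_k_approx(A, k):
    """Best rank-k approximation A_k = U_k Sigma_k V_k^T (Eckart-Young)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :], s

A = np.random.rand(9, 7)                 # illustrative data only
k = 4
A_k, s = rank_k_approx(A, k)

# Frobenius error equals the root-sum-of-squares of the discarded
# singular values sigma_{k+1}, ..., sigma_{rank(A)}.
print(np.linalg.norm(A - A_k, "fro"))
print(np.sqrt(np.sum(s[k:] ** 2)))
```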

SVD: Example

||A − A_6||_F = …; hence, the relative change in the matrix A is …. Therefore, a rank-5 approximation may be appropriate in our case. Determining the best rank approximation for any database depends on empirical testing:
- For very large databases, the number could be between 100 and 300.
- Computational feasibility, rather than accuracy, determines the rank reduction.
Table of k-rank approximation vs. % change: Rank-6: 7.4%; Rank-…: …%; Rank-…: …%; Rank-…: …%.
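
Relative changes such as those tabulated above can be computed directly from the singular values, without forming each A_k; a sketch with illustrative data (NumPy assumed):

```python
import numpy as np

A = np.random.rand(9, 7)                       # illustrative data only
U, s, Vt = np.linalg.svd(A, full_matrices=False)
normA = np.linalg.norm(A, "fro")

# Relative change ||A - A_k||_F / ||A||_F for several candidate ranks,
# computed from the discarded singular values alone.
for k in (6, 5, 4, 3):
    rel = np.sqrt(np.sum(s[k:] ** 2)) / normA
    print(f"rank-{k}: {100 * rel:.1f}% change")
```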

Low Rank Approximations: Visual comparison of rank-reduced approximations to A can be misleading.
- Check the rank-4 QR approximation vs. the more accurate rank-4 SVD approximation.
- The rank-4 SVD approximation shows associations made with terms not originally in the document title, e.g., Term 4 (Health) and Term 8 (Safety) in Document 1 (Infant & Toddler First Aid).

Query Matching: Given the query vector q, to be compared with the columns of the reduced-rank matrix A_k:
- Let e_j denote the j-th canonical vector (the j-th column of the n × n identity matrix I_n). Then A_k e_j represents the j-th column of A_k.
- It is easy to show that cos θ_j = (A_k e_j)^T q / (||A_k e_j||_2 ||q||_2) = s_j^T (U_k^T q) / (||s_j||_2 ||q||_2), where s_j = Σ_k V_k^T e_j.
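
A sketch of this query-matching computation (NumPy assumed; svd_query_cosines is a hypothetical helper, and the query and document data are illustrative):

```python
import numpy as np

def svd_query_cosines(A, q, k):
    """cos(theta_j) between query q and each column of the rank-k matrix A_k."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    Uk, Sk, Vtk = U[:, :k], np.diag(s[:k]), Vt[:k, :]

    S = Sk @ Vtk                  # column j of S is s_j = Sigma_k V_k^T e_j
    Uq = Uk.T @ q                 # U_k^T q, computed once for all documents
    numer = S.T @ Uq              # s_j^T (U_k^T q) = (A_k e_j)^T q
    denom = np.linalg.norm(S, axis=0) * np.linalg.norm(q)
    return numer / denom

A = np.random.rand(8, 5)          # illustrative term-by-document matrix
q = np.zeros(8)
q[[1, 4]] = 1.0                   # hypothetical positions of the query terms
print(svd_query_cosines(A, q, k=3))
```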

Query Matching: An alternate formula for the cosine computation is cos θ_j = s_j^T (U_k^T q) / (||s_j||_2 ||U_k^T q||_2), in which the query is compared within the column space of A_k.
- Note that ||U_k^T q||_2 ≤ ||q||_2, which means that the number of retrieved documents using this query matching technique is larger (each cosine can only increase, so more documents exceed a fixed threshold).
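
A corresponding sketch of the alternate formula, normalizing by ||U_k^T q||_2 instead of ||q||_2 (same assumptions as the previous sketch; the helper name is hypothetical):

```python
import numpy as np

def svd_query_cosines_alt(A, q, k):
    """Alternate cosine: normalize by ||U_k^T q|| instead of ||q||."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    S = np.diag(s[:k]) @ Vt[:k, :]        # columns s_j = Sigma_k V_k^T e_j
    Uq = U[:, :k].T @ q                   # projected query coordinates U_k^T q
    # Since ||U_k^T q|| <= ||q||, each cosine can only grow, so more
    # documents clear any fixed similarity threshold.
    return (S.T @ Uq) / (np.linalg.norm(S, axis=0) * np.linalg.norm(Uq))

A = np.random.rand(8, 5)                  # illustrative data only
q = np.zeros(8)
q[[1, 4]] = 1.0
print(svd_query_cosines_alt(A, q, k=3))
```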