Presentation is loading. Please wait.

Presentation is loading. Please wait.

Paper: Indexing by Latent Semantic Analysis for course cs630 presented by: Haiyan Qiao.

Similar presentations


Presentation on theme: "Paper: Indexing by Latent Semantic Analysis for course cs630 presented by: Haiyan Qiao."— Presentation transcript:

1 Paper: Indexing by Latent Semantic Analysis for course cs630 presented by: Haiyan Qiao

2 Problem Introduction Traditional term-matching method doesn’t work well in information retrieval We want to capture the concepts instead of words. Concepts are reflected in the words. However, –One term may have multiple meaning –Different terms may have the same meaning.

3 LSI (Latent Semantic Analysis) LSI approach tries to overcome the deficiencies of term-matching retrieval by treating the unreliability of observed term-document association data as a statistical problem. The goal is to find effective models to represent the relationship between terms and documents. Hence a set of terms, which is by itself incomplete and unreliable, will be replaced by some set of entities which are more reliable indicants.

4 SVD (Singular Value Decomposition) How to learn the concepts from data? SVD is applied to derive the latent semantic structure model. What is SVD? http://kwon3d.com/theory/jkinem/svd.html http://mathworld.wolfram.com/SingularValueDecomposition.html http://www.cs.ut.ee/~toomas_l/linalg/lin2/node13.html#SECTION000 13200000000000000

5 SVDcont’ SVD of the term-by-document matrix X: If the singular values of S0 are ordered by size, we only keep the first k largest values and get a reduced model: – doesn’t exactly match X and it gets closer as more and more singular values are kept –This is what we want. We don’t want perfect fit since we think some of 0’s in X should be 1 and vice versa. –It reflects the major associative patterns in the data, and ignores the smaller, less important influence and noise.

6 Fundamental Comparison Quantities from the SVD Model Comparing Two Terms: the dot product between two row vectors of reflects the extent to which two terms have a similar pattern of occurrence across the set of document. Comparing Two Documents: dot product between two column vectors of Comparing a Term and a Document

7 Example -Technical Memo Query: human-computer interaction Dataset: c1Human machine interface for Lab ABC computer application c2A survey of user opinion of computer system response time c3The EPS user interface management system c4System and human system engineering testing of EPS c5Relations of user-perceived response time to error measurement m1The generation of random, binary, unordered trees m2 The intersection graph of paths in trees m3 Graph minors IV: Widths of trees and well-quasi-ordering m4Graph minors: A survey

8 Example cont’ % 12-term by 9-document matrix >> X=[1 0 0 1 0 0 0 0 0; 1 0 1 0 0 0 0 0 0; 1 1 0 0 0 0 0 0 0; 0 1 1 0 1 0 0 0 0; 0 1 1 2 0 0 0 0 0 0 1 0 0 1 0 0 0 0; 0 0 1 1 0 0 0 0 0; 0 1 0 0 0 0 0 0 1; 0 0 0 0 0 1 1 1 0; 0 0 0 0 0 0 1 1 1; 0 0 0 0 0 0 0 1 1;];

9 Examplecont’ % X=T0*S0*D0', T0 and D0 have orthonormal columns and So is diagonal % T0 is the matrix of eigenvectors of the square symmetric matrix XX' % D0 is the matrix of eigenvectors of X’X % S0 is the matrix of eigenvalues in both cases >> [T0, S0] = eig(X*X'); >> T0 T0 = 0.1561 -0.2700 0.1250 -0.4067 -0.0605 -0.5227 -0.3410 -0.1063 -0.4148 0.2890 -0.1132 0.2214 0.1516 0.4921 -0.1586 -0.1089 -0.0099 0.0704 0.4959 0.2818 -0.5522 0.1350 -0.0721 0.1976 -0.3077 -0.2221 0.0336 0.4924 0.0623 0.3022 -0.2550 -0.1068 -0.5950 -0.1644 0.0432 0.2405 0.3123 -0.5400 0.2500 0.0123 -0.0004 -0.0029 0.3848 0.3317 0.0991 -0.3378 0.0571 0.4036 0.3077 0.2221 -0.0336 0.2707 0.0343 0.1658 -0.2065 -0.1590 0.3335 0.3611 -0.1673 0.6445 -0.2602 0.5134 0.5307 -0.0539 -0.0161 -0.2829 -0.1697 0.0803 0.0738 -0.4260 0.1072 0.2650 -0.0521 0.0266 -0.7807 -0.0539 -0.0161 -0.2829 -0.1697 0.0803 0.0738 -0.4260 0.1072 0.2650 -0.7716 -0.1742 -0.0578 -0.1653 -0.0190 -0.0330 0.2722 0.1148 0.1881 0.3303 -0.1413 0.3008 0.0000 0.0000 0.0000 -0.5794 -0.0363 0.4669 0.0809 -0.5372 -0.0324 -0.1776 0.2736 0.2059 0.0000 0.0000 0.0000 -0.2254 0.2546 0.2883 -0.3921 0.5942 0.0248 0.2311 0.4902 0.0127 -0.0000 -0.0000 -0.0000 0.2320 -0.6811 -0.1596 0.1149 -0.0683 0.0007 0.2231 0.6228 0.0361 0.0000 -0.0000 0.0000 0.1825 0.6784 -0.3395 0.2773 -0.3005 -0.0087 0.1411 0.4505 0.0318

10 Examplecont’ >> [D0, S0] = eig(X'*X); >> D0 D0 = 0.0637 0.0144 -0.1773 0.0766 -0.0457 -0.9498 0.1103 -0.0559 0.1974 -0.2428 -0.0493 0.4330 0.2565 0.2063 -0.0286 -0.4973 0.1656 0.6060 -0.0241 -0.0088 0.2369 -0.7244 -0.3783 0.0416 0.2076 -0.1273 0.4629 0.0842 0.0195 -0.2648 0.3689 0.2056 0.2677 0.5699 -0.2318 0.5421 0.2624 0.0583 -0.6723 -0.0348 -0.3272 0.1500 -0.5054 0.1068 0.2795 0.6198 -0.4545 0.3408 0.3002 -0.3948 0.0151 0.0982 0.1928 0.0038 -0.0180 0.7615 0.1522 0.2122 -0.3495 0.0155 0.1930 0.4379 0.0146 -0.5199 -0.4496 -0.2491 -0.0001 -0.1498 0.0102 0.2529 0.6151 0.0241 0.4535 0.0696 -0.0380 -0.3622 0.6020 -0.0246 0.0793 0.5299 0.0820

11 Examplecont’ >> S0=eig(X'*X) >> S0=S0.^0.5 S0 = 0.3637 0.5601 0.8459 1.3064 1.5048 1.6445 2.3539 2.5417 3.3409 % We only keep the largest two singular values % and the corresponding columns from the T and D

12 Examplecont’ >> T=[0.2214 -0.1132; 0.1976 -0.0721; 0.2405 0.0432; 0.4036 0.0571; 0.6445 -0.1673; 0.2650 0.1072; 0.3008 -0.1413; 0.2059 0.2736; 0.0127 0.4902; 0.0361 0.6228; 0.0318 0.4505;]; >> S = [ 3.3409 0; 0 2.5417 ]; >> D’ =[0.1974 0.6060 0.4629 0.5421 0.2795 0.0038 0.0146 0.0241 0.0820; -0.0559 0.1656 -0.1273 -0.2318 0.1068 0.1928 0.4379 0.6151 0.5299;] >> T*S*D’ 0.1621 0.4006 0.3790 0.4677 0.1760 -0.0527 0.1406 0.3697 0.3289 0.4004 0.1649 -0.0328 0.1525 0.5051 0.3580 0.4101 0.2363 0.0242 0.2581 0.8412 0.6057 0.6973 0.3924 0.0331 0.4488 1.2344 1.0509 1.2658 0.5564 -0.0738 0.1595 0.5816 0.3751 0.4168 0.2766 0.0559 0.2185 0.5495 0.5109 0.6280 0.2425 -0.0654 0.0969 0.5320 0.2299 0.2117 0.2665 0.1367 -0.0613 0.2320 -0.1390 -0.2658 0.1449 0.2404 -0.0647 0.3352 -0.1457 -0.3016 0.2028 0.3057 -0.0430 0.2540 -0.0966 -0.2078 0.1520 0.2212

13 Summary What is the common and difference between PCA and SVD? Both are related to standard eigenvalue- eigenvector, to remove noise or correlation and get the most important info. PCA is on covariance matrix and SVD works on original matrix.


Download ppt "Paper: Indexing by Latent Semantic Analysis for course cs630 presented by: Haiyan Qiao."

Similar presentations


Ads by Google