Published by Trace Lord. Modified over 2 years ago.

1
Dimensionality Reduction

2
High-dimensional == many features
Find concepts/topics/genres:
– Documents: the features are thousands of words, millions of word pairs
– Surveys
– Netflix: 480k users x 17.7k movies
Slides by Jure Leskovec

3
Dimensionality Reduction
Compress / reduce dimensionality:
– $10^6$ rows; $10^3$ columns; no updates
– Random access to any cell(s); small error: OK

4
Dimensionality Reduction
Assumption: the data lies on or near a low, d-dimensional subspace
The axes of this subspace are an effective representation of the data

5
Why Reduce Dimensions?
Discover hidden correlations/topics
– Words that commonly occur together
Remove redundant and noisy features
– Not all words are useful
Interpretation and visualization
Easier storage and processing of the data

6
SVD - Definition
$A_{[m \times n]} = U_{[m \times r]} \, \Sigma_{[r \times r]} \, (V_{[n \times r]})^T$
A: Input data matrix
– m x n matrix (e.g., m documents, n terms)
U: Left singular vectors
– m x r matrix (m documents, r concepts)
Σ: Singular values
– r x r diagonal matrix (strength of each ‘concept’), where r is the rank of the matrix A
V: Right singular vectors
– n x r matrix (n terms, r concepts)
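A minimal sketch of the definition, assuming NumPy is available. The 7 x 5 ratings matrix is a hypothetical example invented for illustration (the values on the original slides are not recoverable here), and trimming to the numerical rank r is one way to get the r-column shapes named in the definition:

```python
import numpy as np

# Hypothetical ratings: 7 users (rows) x 5 movies (columns).
A = np.array([
    [1, 1, 1, 0, 0],
    [3, 3, 3, 0, 0],
    [4, 4, 4, 0, 0],
    [5, 5, 5, 0, 0],
    [0, 2, 0, 4, 4],
    [0, 0, 0, 5, 5],
    [0, 1, 0, 2, 2],
], dtype=float)

# Thin SVD: U is m x min(m, n), s holds the singular values, Vt is V transposed.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Keep only the r = rank(A) non-zero singular values and their vectors.
r = np.linalg.matrix_rank(A)
U, s, Vt = U[:, :r], s[:r], Vt[:r, :]
Sigma = np.diag(s)  # r x r diagonal matrix

print(U.shape, Sigma.shape, Vt.T.shape)  # shapes of U, Sigma, V
reconstructs = np.allclose(A, U @ Sigma @ Vt)
print(reconstructs)
```

Note that `np.linalg.svd` returns the singular values as a 1-D array and returns $V^T$ rather than V, so the diagonal matrix has to be built explicitly.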

7
SVD
[Figure: the m x n matrix A shown as the product of U, Σ, and $V^T$]

8
SVD
$A = \sigma_1 u_1 v_1^T + \sigma_2 u_2 v_2^T + \dots$
$\sigma_i$ … scalar
$u_i$ … vector
$v_i$ … vector

9
SVD - Properties
It is always possible to decompose a real matrix A into $A = U \Sigma V^T$, where
U, Σ, V: unique (Σ always; the columns of U and V up to sign when the singular values are distinct)
U, V: column orthonormal
– $U^T U = I$; $V^T V = I$ (I: identity matrix)
– (Columns are orthogonal unit vectors)
Σ: diagonal
– Entries (singular values) are positive, and sorted in decreasing order ($\sigma_1 \ge \sigma_2 \ge \dots \ge 0$)
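These properties can be checked numerically. The sketch below assumes NumPy and uses a random 6 x 4 matrix chosen only for illustration. It also shows one caveat on uniqueness: flipping the sign of a matching pair of columns in U and V gives another valid decomposition.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(6, 4))  # arbitrary real matrix, for illustration only

U, s, Vt = np.linalg.svd(A, full_matrices=False)
V = Vt.T

# Column orthonormality: U^T U = I and V^T V = I.
utu_is_identity = np.allclose(U.T @ U, np.eye(4))
vtv_is_identity = np.allclose(V.T @ V, np.eye(4))

# Singular values are non-negative and sorted in decreasing order.
sorted_desc = bool(np.all(np.diff(s) <= 0))
non_negative = bool(np.all(s >= 0))

# Flip the sign of the first column of both U and V:
# the product U Sigma V^T is unchanged.
U2, V2 = U.copy(), V.copy()
U2[:, 0] *= -1
V2[:, 0] *= -1
sign_flip_ok = np.allclose(A, U2 @ np.diag(s) @ V2.T)

print(utu_is_identity, vtv_is_identity, sorted_desc, non_negative, sign_flip_ok)
```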

10
SVD – Example: Users-to-Movies
$A = U \Sigma V^T$ - example:
[Figure: users-to-movies rating matrix A written as the product $U \Sigma V^T$. Columns: Matrix, Alien, Serenity, Casablanca, Amelie. Row groups: SciFi, Romance.]

11
SVD – Example: Users-to-Movies
$A = U \Sigma V^T$ - example:
[Figure: the same decomposition with the SciFi-concept and Romance-concept columns labeled. Columns of A: Matrix, Alien, Serenity, Casablanca, Amelie. Row groups: SciFi, Romance.]

12
SVD - Example
$A = U \Sigma V^T$ - example:
U is “user-to-concept” similarity matrix
[Figure: the decomposition with U highlighted; SciFi-concept and Romance-concept columns labeled]

13
SVD - Example
$A = U \Sigma V^T$ - example:
[Figure: the decomposition with the first diagonal entry of Σ highlighted as the ‘strength’ of the SciFi-concept]

14
SVD - Example
$A = U \Sigma V^T$ - example:
V is “movie-to-concept” similarity matrix
[Figure: the decomposition with V highlighted; SciFi-concept labeled]

15
SVD - Example
$A = U \Sigma V^T$ - example:
V is “movie-to-concept” similarity matrix
[Figure: the decomposition with the SciFi-concept row of $V^T$ highlighted]

16
SVD - Interpretation #1
‘movies’, ‘users’ and ‘concepts’:
U: user-to-concept similarity matrix
V: movie-to-concept similarity matrix
Σ: its diagonal elements are the ‘strength’ of each concept
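To make the three roles concrete, here is a small sketch assuming NumPy. The rating values are hypothetical (they are not the numbers from the slides), and treating the two strongest concepts as "SciFi" and "Romance" is an interpretation of this toy data, not something the decomposition labels itself:

```python
import numpy as np

movies = ["Matrix", "Alien", "Serenity", "Casablanca", "Amelie"]

# Hypothetical ratings: first four users like SciFi, last three like Romance.
A = np.array([
    [1, 1, 1, 0, 0],
    [3, 3, 3, 0, 0],
    [4, 4, 4, 0, 0],
    [5, 5, 5, 0, 0],
    [0, 2, 0, 4, 4],
    [0, 0, 0, 5, 5],
    [0, 1, 0, 2, 2],
], dtype=float)

U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2                            # look at the two strongest concepts
user_to_concept = U[:, :k]       # one row per user
concept_strength = s[:k]         # diagonal of Sigma
movie_to_concept = Vt[:k].T      # one row per movie

# For each movie, find the concept it loads on most heavily.
# Absolute values are used because the sign of a singular vector is arbitrary.
dominant = {
    name: int(np.argmax(np.abs(movie_to_concept[i])))
    for i, name in enumerate(movies)
}

print(np.round(concept_strength, 2))
print(dominant)
```

In this toy data the three SciFi titles load mainly on concept 0 and the two Romance titles on concept 1, which is the grouping the slides describe.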

17
SVD - Interpretation #2
SVD gives the best axis to project on:
– ‘best’ = minimum sum of squares of projection errors
– i.e., minimum reconstruction error
[Figure: scatter plot with axes “Movie 1 rating” and “Movie 2 rating”; $v_1$ is the first right singular vector]

18
SVD - Interpretation #2
$A = U \Sigma V^T$ - example:
[Figure: the decomposition with $v_1$, the first right singular vector, highlighted, next to the plot with axes “Movie 1 rating” and “Movie 2 rating”]

19
SVD - Interpretation #2
$A = U \Sigma V^T$ - example:
[Figure: the decomposition with the variance (‘spread’) on the $v_1$ axis highlighted]
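The "best axis" claim can be tested by brute force. The sketch below assumes NumPy and uses synthetic two-movie ratings invented for illustration. It compares the projection error along $v_1$ with the error along a sweep of other directions. Since the data is not mean-centered, "axis" here means a line through the origin.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data: each row is a user, columns are (movie 1 rating, movie 2 rating).
x = rng.uniform(1, 5, size=50)
A = np.column_stack([x, x + rng.normal(scale=0.3, size=50)])

_, s, Vt = np.linalg.svd(A, full_matrices=False)
v1 = Vt[0]  # first right singular vector

def projection_error(A, v):
    """Sum of squared distances between the rows of A and their projections onto unit vector v."""
    proj = np.outer(A @ v, v)
    return np.sum((A - proj) ** 2)

# Try unit directions at 1-degree steps over half a circle.
angles = np.linspace(0, np.pi, 181)
errors = [projection_error(A, np.array([np.cos(t), np.sin(t)])) for t in angles]

best_v1 = projection_error(A, v1)
spread_on_v1 = np.sum((A @ v1) ** 2)  # 'spread' of the data along v1

print(best_v1 <= min(errors) + 1e-9)
print(np.isclose(spread_on_v1, s[0] ** 2), np.isclose(best_v1, s[1] ** 2))
```

The spread along $v_1$ equals $\sigma_1^2$, and what is left over after projecting (the reconstruction error) equals $\sigma_2^2$ in this two-column case.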

20
SVD - Interpretation #2
More details
Q: How exactly is dim. reduction done?
[Figure: $A = U \Sigma V^T$]

21
SVD - Interpretation #2
More details
Q: How exactly is dim. reduction done?
A: Set the smallest singular values to zero
[Figure: $A = U \Sigma V^T$ with the smallest singular values marked]

22
SVD - Interpretation #2
More details
Q: How exactly is dim. reduction done?
A: Set the smallest singular values to zero
[Figure: $A \approx U \Sigma V^T$ with the smallest singular values zeroed]

23
SVD - Interpretation #2
More details
Q: How exactly is dim. reduction done?
A: Set the smallest singular values to zero:
[Figure: $A \approx U \Sigma V^T$ with the smallest singular values zeroed]

24
SVD - Interpretation #2
More details
Q: How exactly is dim. reduction done?
A: Set the smallest singular values to zero:
[Figure: $A \approx U \Sigma V^T$ with the smallest singular values zeroed]

25
SVD - Interpretation #2
More details
Q: How exactly is dim. reduction done?
A: Set the smallest singular values to zero
[Figure: $A \approx B$, where B is the product with the smallest singular values zeroed]
Frobenius norm:
$\|M\|_F = \sqrt{\sum_{ij} M_{ij}^2}$
$\|A - B\|_F = \sqrt{\sum_{ij} (A_{ij} - B_{ij})^2}$ is “small”
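A minimal sketch of this truncation step, assuming NumPy. The ratings matrix is hypothetical, and k = 2 is an arbitrary choice for illustration:

```python
import numpy as np

# Hypothetical 7 users x 5 movies ratings matrix (rank 3).
A = np.array([
    [1, 1, 1, 0, 0],
    [3, 3, 3, 0, 0],
    [4, 4, 4, 0, 0],
    [5, 5, 5, 0, 0],
    [0, 2, 0, 4, 4],
    [0, 0, 0, 5, 5],
    [0, 1, 0, 2, 2],
], dtype=float)

U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2                      # number of singular values to keep
s_trunc = s.copy()
s_trunc[k:] = 0.0          # set the smallest singular values to zero

B = U @ np.diag(s_trunc) @ Vt   # approximation of A

# Frobenius norm of the error, computed two ways.
err = np.linalg.norm(A - B, "fro")
err_manual = np.sqrt(np.sum((A - B) ** 2))

print(np.round(B, 2))
print(err, err_manual)
```

Because the zeroed singular values contribute nothing, the matching columns of U and rows of $V^T$ can also be dropped, which is where the storage saving comes from.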

26
[Figure: $A = U \Sigma V^T$, and B is the same product with the smallest singular values zeroed]
B is approximately A

27
SVD – Best Low Rank Approx.

28
SVD – Best Low Rank Approx.
We apply:
– P column orthonormal
– R row orthonormal
– Q is diagonal

29
SVD – Best Low Rank Approx.
$U \Sigma V^T - U S V^T = U (\Sigma - S) V^T$
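This step can be completed into a short worked derivation. The lemma in the first line (the Frobenius norm is unchanged by a column-orthonormal factor on the left and a row-orthonormal factor on the right) is filled in from standard linear algebra rather than from the slide text. Writing $B = U S V^T$ and using k for the number of singular values kept in S are notational assumptions made here:

```latex
\begin{aligned}
M = P Q R &\;\Rightarrow\; \|M\|_F = \|Q\|_F
  \quad (P^T P = I,\; R R^T = I) \\
\|A - B\|_F &= \|U \Sigma V^T - U S V^T\|_F
  = \|U (\Sigma - S) V^T\|_F
  = \|\Sigma - S\|_F \\
&= \sqrt{\sum_{i=1}^{r} (\sigma_i - s_i)^2}
  = \sqrt{\sum_{i=k+1}^{r} \sigma_i^2}
  \quad \text{when } s_i = \sigma_i \text{ for } i \le k,\; s_i = 0 \text{ otherwise}
\end{aligned}
```

With the singular values sorted in decreasing order, keeping the first k leaves only the smallest $\sigma_i$ in the error term.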

30
SVD - Interpretation #2
Equivalent: ‘spectral decomposition’ of the matrix:
[Figure: A written as the product of the columns $u_1, u_2$, the diagonal entries $\sigma_1, \sigma_2$, and the rows $v_1^T, v_2^T$]

31
SVD - Interpretation #2
Equivalent: ‘spectral decomposition’ of the matrix:
$A = \sigma_1 u_1 v_1^T + \sigma_2 u_2 v_2^T + \dots$ (k terms)
Each $u_i$ is an m x 1 column and each $v_i^T$ is a 1 x n row, so every term is an m x n matrix.
Assume: $\sigma_1 \ge \sigma_2 \ge \sigma_3 \ge \dots \ge 0$
Why is setting the small σs to zero the right thing to do?
The vectors $u_i$ and $v_i$ are unit length, so $\sigma_i$ scales them. Zeroing the smallest σs therefore introduces the least error.
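The sum-of-terms view can be checked directly. This sketch assumes NumPy and a hypothetical ratings matrix. It builds each rank-1 term $\sigma_i u_i v_i^T$ and confirms that the size of each term (its Frobenius norm) is exactly $\sigma_i$, which is why dropping the terms with the smallest σ costs the least.

```python
import numpy as np

# Hypothetical 7 users x 5 movies ratings matrix.
A = np.array([
    [1, 1, 1, 0, 0],
    [3, 3, 3, 0, 0],
    [4, 4, 4, 0, 0],
    [5, 5, 5, 0, 0],
    [0, 2, 0, 4, 4],
    [0, 0, 0, 5, 5],
    [0, 1, 0, 2, 2],
], dtype=float)

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# One m x n rank-1 matrix per singular value: sigma_i * u_i * v_i^T.
terms = [s[i] * np.outer(U[:, i], Vt[i]) for i in range(len(s))]

# Adding all terms gives back A.
full_sum = sum(terms)

# u_i and v_i are unit vectors, so each term's Frobenius norm equals sigma_i.
term_norms = np.array([np.linalg.norm(t, "fro") for t in terms])

print(np.allclose(A, full_sum))
print(np.round(term_norms, 3), np.round(s, 3))
```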

32
SVD - Interpretation #2
Q: How many σs to keep?
A: Rule of thumb: keep 80-90% of the ‘energy’ ($= \sum_i \sigma_i^2$)
$A = \sigma_1 u_1 v_1^T + \sigma_2 u_2 v_2^T + \dots$
Assume: $\sigma_1 \ge \sigma_2 \ge \sigma_3 \ge \dots$
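One way to apply this rule of thumb is sketched below, assuming NumPy. The helper name `choose_k`, the 0.9 threshold, and the ratings matrix are all illustrative choices, not part of the original slides:

```python
import numpy as np

def choose_k(s, energy=0.9):
    """Smallest k such that the first k singular values hold at least
    the given fraction of the total energy (sum of squared singular values)."""
    cumulative = np.cumsum(s ** 2) / np.sum(s ** 2)
    k = int(np.searchsorted(cumulative, energy)) + 1
    return min(k, len(s))  # guard against floating-point round-off at energy = 1.0

# Hypothetical 7 users x 5 movies ratings matrix.
A = np.array([
    [1, 1, 1, 0, 0],
    [3, 3, 3, 0, 0],
    [4, 4, 4, 0, 0],
    [5, 5, 5, 0, 0],
    [0, 2, 0, 4, 4],
    [0, 0, 0, 5, 5],
    [0, 1, 0, 2, 2],
], dtype=float)

s = np.linalg.svd(A, compute_uv=False)
k = choose_k(s, energy=0.9)

retained = np.sum(s[:k] ** 2) / np.sum(s ** 2)
print(k, round(float(retained), 4))
```

For this toy matrix the first singular value alone falls short of 90% of the energy, while the first two together exceed it.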

33
SVD - Complexity
To compute SVD:
– $O(nm^2)$ or $O(n^2 m)$ (whichever is less)
But:
– Less work if we just want the singular values
– or if we want the first k singular vectors
– or if the matrix is sparse
Implemented in linear algebra packages like
– LINPACK, Matlab, SPlus, Mathematica...
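As one illustration of the "just the singular values" shortcut, NumPy (not on the slide's package list, used here as an assumption) lets the caller skip computing U and V. The random matrix is only a stand-in for real data:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(200, 50))  # stand-in data matrix

# Singular values only: U and V are never formed.
s_only = np.linalg.svd(A, compute_uv=False)

# Full thin decomposition, for comparison.
_, s_full, _ = np.linalg.svd(A, full_matrices=False)

same_values = np.allclose(s_only, s_full)
print(s_only.shape, same_values)
```

For the other two cases (only the first k singular vectors, or a sparse matrix), iterative solvers are the usual route; this sketch does not cover them.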

34
SVD - Conclusions so far
SVD: $A = U \Sigma V^T$: unique
– U: user-to-concept similarities
– V: movie-to-concept similarities
– Σ: strength of each concept
Dimensionality reduction:
– keep the few largest singular values (80-90% of ‘energy’)
– SVD: picks up linear correlations

35
Case study: How to query?
Q: Find users that like ‘Matrix’ and ‘Alien’
[Figure: users-to-movies rating matrix A written as the product $U \Sigma V^T$. Columns: Matrix, Alien, Serenity, Casablanca, Amelie. Row groups: SciFi, Romance.]

36
Case study: How to query?
Q: Find users that like ‘Matrix’
A: Map the query into a ‘concept space’ – how?
[Figure: users-to-movies rating matrix A written as the product $U \Sigma V^T$. Columns: Matrix, Alien, Serenity, Casablanca, Amelie. Row groups: SciFi, Romance.]

37
Case study: How to query?
Q: Find users that like ‘Matrix’
A: Map the query vector into ‘concept space’ – how?
[Figure: query vector q over (Matrix, Alien, Serenity, Casablanca, Amelie), plotted on Matrix/Alien axes together with the concept vectors $v_1$ and $v_2$]
Project into concept space: inner product with each ‘concept’ vector $v_i$

38
Case study: How to query?
Q: Find users that like ‘Matrix’
A: Map the vector into ‘concept space’ – how?
[Figure: q plotted on Matrix/Alien axes with $v_1$ and $v_2$; the projection $q \cdot v_1$ is marked]
Project into concept space: inner product with each ‘concept’ vector $v_i$

39
Case study: How to query?
Compactly, we have: $q_{concept} = q V$
E.g.:
[Figure: q (ratings over Matrix, Alien, Serenity, Casablanca, Amelie) multiplied by the movie-to-concept similarities V gives the coordinates of q in concept space, including the SciFi-concept]
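A minimal sketch of this mapping, assuming NumPy. The ratings matrix, the rating of 5 for ‘Matrix’, and the choice of k = 2 concepts are hypothetical values picked for illustration:

```python
import numpy as np

movies = ["Matrix", "Alien", "Serenity", "Casablanca", "Amelie"]

# Hypothetical 7 users x 5 movies ratings matrix.
A = np.array([
    [1, 1, 1, 0, 0],
    [3, 3, 3, 0, 0],
    [4, 4, 4, 0, 0],
    [5, 5, 5, 0, 0],
    [0, 2, 0, 4, 4],
    [0, 0, 0, 5, 5],
    [0, 1, 0, 2, 2],
], dtype=float)

_, _, Vt = np.linalg.svd(A, full_matrices=False)

k = 2
V = Vt[:k].T  # movie-to-concept matrix, 5 movies x k concepts

# Query "user" who rated only 'Matrix'.
q = np.zeros(len(movies))
q[movies.index("Matrix")] = 5.0

# q V takes the inner product of q with each concept vector v_i.
q_concept = q @ V

print(np.round(q_concept, 3))
```

The sign of each coordinate depends on the sign convention the SVD routine happens to pick for $v_i$, so only the magnitudes (and comparisons between vectors mapped with the same V) are meaningful.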

40
Case study: How to query?
How would the user d that rated (‘Alien’, ‘Serenity’) be handled?
$d_{concept} = d V$
E.g.:
[Figure: d (ratings over Matrix, Alien, Serenity, Casablanca, Amelie) multiplied by the movie-to-concept similarities V gives the coordinates of d in concept space, including the SciFi-concept]

41
Case study: How to query?
Observation: User d that rated (‘Alien’, ‘Serenity’) will be similar to query “user” q that rated (‘Matrix’), although d did not rate ‘Matrix’!
[Figure: d and q as rating vectors over Matrix, Alien, Serenity, Casablanca, Amelie, and their SciFi-concept coordinates]
Similarity = 0 in the original rating space (no rated movies in common)
Similarity ≠ 0 in concept space
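The observation can be reproduced with a few lines, assuming NumPy. The ratings matrix, the specific rating values for q and d, and the use of cosine similarity as the similarity measure are assumptions made for this sketch:

```python
import numpy as np

# Hypothetical 7 users x 5 movies ratings matrix.
# Columns: Matrix, Alien, Serenity, Casablanca, Amelie.
A = np.array([
    [1, 1, 1, 0, 0],
    [3, 3, 3, 0, 0],
    [4, 4, 4, 0, 0],
    [5, 5, 5, 0, 0],
    [0, 2, 0, 4, 4],
    [0, 0, 0, 5, 5],
    [0, 1, 0, 2, 2],
], dtype=float)

_, _, Vt = np.linalg.svd(A, full_matrices=False)
V = Vt[:2].T  # movie-to-concept matrix for the two strongest concepts

q = np.array([5.0, 0.0, 0.0, 0.0, 0.0])  # rated only 'Matrix'
d = np.array([0.0, 4.0, 5.0, 0.0, 0.0])  # rated only 'Alien' and 'Serenity'

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

sim_original = cosine(q, d)          # no movies in common
sim_concept = cosine(q @ V, d @ V)   # both map mostly onto the same concept

print(sim_original, round(sim_concept, 3))
```

In the original space the two users share no rated movie, so their similarity is exactly zero. After mapping, both land close to the same concept axis.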

42
SVD: Drawbacks
+ Optimal low-rank approximation in Frobenius norm
- Interpretability problem:
– A singular vector specifies a linear combination of all input columns or rows
- Lack of sparsity:
– Singular vectors are dense!
[Figure: $A = U \Sigma V^T$ with dense U and $V^T$]
