1/11 Tea Talk: Weighted Low Rank Approximations
Ben Marlin, Machine Learning Group, Department of Computer Science, University of Toronto
April 30, 2003

2/11 Paper Details:
Authors: Nathan Srebro, Tommi Jaakkola (MIT)
Title: Weighted Low Rank Approximations
URL: http://www.ai.mit.edu/~nati/LowRank/icml.pdf
Submitted: ICML 2003

3/11 Motivation:
Missing Data: Weighted LRA naturally handles data matrices with missing elements by using a 0/1 weight matrix.
Noisy Data: Weighted LRA naturally handles data matrices with a different noise variance estimate σ_ij for each element of the matrix by setting W_ij = 1/σ_ij.
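A minimal MATLAB sketch (not from the slides) of how such weight matrices might be built; the NaN-coded missing values and the per-entry variance matrix Sigma are illustrative assumptions:

% 0/1 weights: observed entries get weight 1, missing (NaN) entries get weight 0
W = double(~isnan(D));
D(isnan(D)) = 0;          % placeholder values; zero-weighted entries are ignored anyway

% inverse-variance weights from per-entry noise variance estimates Sigma
W = 1 ./ Sigma;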

4/11 The Problem: Given an n×m data matrix D and an n×m weight matrix W, construct a rank-K approximation X = UV' to D that minimizes the error in the weighted Frobenius norm E_WF.
(Slide diagram: D, W, and X are n×m; U is n×K and V' is K×m, so X = UV'.)
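Written out, the objective is E_WF(U, V) = sum_ij W_ij (D_ij - (UV')_ij)^2. A two-line MATLAB computation of this error, assuming D, W, U, and V are already in the workspace:

X = U * V';                          % rank-K reconstruction
E_WF = sum(sum(W .* (D - X).^2));    % weighted Frobenius (squared) error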

5/11 Relationship to standard SVD:
Critical points of E_WF can be local minima that are not global minima.
wSVD does not admit a solution based on eigenvectors of the data matrix D.
Adding the requirement that U and V be orthogonal results in a weighted low rank approximation analogous to the SVD.

6/11 Optimization Approach: Main Idea: For a given V, the optimal U_V* can be calculated analytically, as can the gradient of the projected objective function E*_WF(V) = E_WF(U_V*, V). Thus, perform gradient descent on E*_WF(V). Row by row, the optimum is U_i = D_i d(W_i) V (V' d(W_i) V)^-1, where d(W_i) is the m×m matrix with the i-th row of W along the diagonal and D_i is the i-th row of D.
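A minimal MATLAB sketch of that row-wise solve and of the projected objective (the function name and structure are illustrative, not from the paper):

function [U, E] = optimalU(D, W, V)
% For a fixed V, each row of U is a weighted least-squares solution.
[n, m] = size(D);
K = size(V, 2);
U = zeros(n, K);
for i = 1:n
    Wi = diag(W(i, :));                            % d(W_i): m x m diagonal weight matrix
    U(i, :) = (D(i, :) * Wi * V) / (V' * Wi * V);  % U_i = D_i d(W_i) V (V' d(W_i) V)^-1
end
E = sum(sum(W .* (D - U * V').^2));                % projected objective E*_WF(V)
end

Gradient descent would then update V using the gradient of E with U held at this optimum.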

7/11 Missing Value Approach: Main Idea: Consider a model of the data matrix given by D = X + Z, where Z is white Gaussian noise. The weighted cost of X is equivalent (up to sign and constants) to the log-likelihood of the observed variables. This suggests an EM approach: in the E step, the missing values in D are filled in according to the values in X, creating a matrix F; in the M step, X is re-estimated as the rank-K SVD of F.
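To spell out the equivalence (a worked step, assuming unit noise variance): with 0/1 weights, the log-likelihood of the observed entries under D_ij = X_ij + Z_ij, Z_ij ~ N(0, 1), is -(1/2) sum_ij W_ij (D_ij - X_ij)^2 plus a constant, i.e. -(1/2) E_WF(X) plus a constant, so maximizing the likelihood over rank-K matrices X is the same as minimizing the weighted cost.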

8/11 Missing Value Approach: Extension to General Weights: Consider a system with several data matrices D_n = X + Z_n, where the Z_n are independent Gaussian white noise. The maximum likelihood X in this case is found by taking the rank-K SVD of the mean of the F_n's. Now consider a weighted rank-K approximation problem where W_ij = w_ij/N and w_ij ∈ {1, …, N}. Such a problem can be converted to the type of problem described above by observing D_ij in w_ij of a total of N matrices D_n. For any N, the mean of the N matrices F_n is W.*D + (1 - W).*X (elementwise), i.e. each entry is W_ij D_ij + (1 - W_ij) X_ij, the same combination used in the E step on the next slide.
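A small numerical check of that reduction (illustrative values; D, X, and the counts w are made up):

% each entry D_ij is observed in w_ij of N copies and filled from X in the other N - w_ij
N = 10;
D = randn(5, 4);                       % data (illustrative)
X = randn(5, 4);                       % current low rank estimate (illustrative)
w = randi([1 N], 5, 4);                % observation counts w_ij in {1, ..., N}
W = w / N;                             % fractional weights
Fbar = (w .* D + (N - w) .* X) / N;    % mean of the N filled-in matrices
max(max(abs(Fbar - (W .* D + (1 - W) .* X))))   % should be ~0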

9/11 Missing Value Approach: EM Algorithm:
This approach yields an extremely simple EM algorithm:
E-Step: F = W.*D + (1 - W).*X^t (fill unobserved entries of D from the current X)
M-Step: Obtain U, S, V from the SVD of F, zero out all but the top K singular values, and set X^{t+1} = U S V'

function X = wsvd(D, W, K)
% EM algorithm for weighted low rank approximation.
X = zeros(size(D));                    % initial estimate
Xold = inf * ones(size(D));
while (sum(sum((X - Xold).^2)) > eps)  % iterate until the estimate stops changing
    Xold = X;
    F = W.*D + (1 - W).*X;             % E-step: weighted fill-in
    [U, S, V] = svd(F);                % M-step: truncate the SVD to rank K
    S(K+1:end, K+1:end) = 0;
    X = U * S * V';
end

10/11 Example:
(Slide figure: a 5×4 synthetic rank-2 data matrix, a 0/1 weight matrix marking which entries are observed, and the rank-2 wSVD reconstruction; the reconstruction fills in the missing entries with values consistent with the underlying rank-2 structure. The numeric tables are not reproduced here.)
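A small MATLAB script in the same spirit (freshly generated numbers, not the slide's; uses the wsvd function from the previous slide):

% build a synthetic rank-2 matrix, hide a few entries, and recover them with wsvd
rng(0);                              % fix the random seed (illustrative)
U0 = randn(5, 2); V0 = randn(4, 2);
D  = U0 * V0';                       % exact rank-2 data, 5 x 4
W  = ones(5, 4);
W([1 6 12 19]) = 0;                  % mark four entries as missing (linear indices)
X  = wsvd(D .* W, W, 2);             % zero-weighted entries of D are never used
max(max(abs(X - D)))                 % error at all entries; typically small here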

11/11 The End

