# Distributed Nuclear Norm Minimization for Matrix Completion

Morteza Mardani, Gonzalo Mateos, and Georgios Giannakis
ECE Department, University of Minnesota
Acknowledgments: MURI (AFOSR FA ) grant
Cesme, Turkey, June 19, 2012

## Learning from "Big Data"

"Data are widely available, what is scarce is the ability to extract wisdom from them." (Hal Varian, Google's chief economist)

[Slide graphic: word cloud with "fast," "big," "ubiquitous," "revealing," "productive," "smart," "messy"]

K. Cukier, "Harnessing the data deluge," Nov.

## Context

Applications:
- Preference modeling
- Imputation of network data
- Smart metering
- Network cartography

Goal: given a few incomplete rows per agent, impute the missing entries in a distributed fashion by leveraging the low rank of the data matrix.

## Low-rank matrix completion

Consider a matrix X ∈ R^{T×L} and a set Ω of observed entries, with sampling operator P_Ω(·) that retains the entries indexed by Ω and zeroes out the rest.

Given incomplete (noisy) data Y = P_Ω(X + E), where X has low rank.

Goal: denoise the observed entries and impute the missing ones.

Nuclear-norm minimization [Fazel '02], [Candes-Recht '09]:
- Noise-free: min_X ||X||_* s.t. P_Ω(X) = Y
- Noisy: min_X (1/2) ||Y − P_Ω(X)||_F^2 + λ ||X||_*
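The setup above can be sketched numerically. This is a minimal illustration (sizes, rank, and noise level are arbitrary choices, not from the slides): build a low-rank matrix, apply a sampling mask standing in for P_Ω, and compute the nuclear norm as the sum of singular values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes; the slide's X is T x L with rank r << min(T, L).
T, L, r = 20, 15, 3
X = rng.standard_normal((T, r)) @ rng.standard_normal((r, L))    # low-rank ground truth

# Sampling operator P_Omega: keep each entry independently with probability 0.5.
mask = rng.random((T, L)) < 0.5
Y = np.where(mask, X + 0.01 * rng.standard_normal((T, L)), 0.0)  # noisy, incomplete data

# Nuclear norm = sum of singular values, the convex surrogate for rank.
nuclear_norm = np.linalg.svd(X, compute_uv=False).sum()
print(int(np.linalg.matrix_rank(X)), round(float(nuclear_norm), 2))
```

Only the entries where `mask` is true carry information; everything else must be imputed.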

## Problem statement

Network: an undirected, connected graph; each node n observes a few incomplete rows of the data matrix.

Goal: given the data observed at each node and single-hop exchanges, solve the nuclear-norm minimization (P1) in a distributed fashion.

Challenges:
- The nuclear norm is not separable across nodes.
- The optimization variable is global.

## Separable regularization

Key result [Recht et al '11]: for any ρ ≥ rank(X),

  ||X||_* = min over {L, R : X = L R'} of (1/2)(||L||_F^2 + ||R||_F^2)

Substituting this characterization into (P1) gives a new formulation (P2), equivalent to (P1). (P2) is nonconvex, but it reduces complexity.

Proposition 1. If {L̄, R̄} is a stationary point of (P2) satisfying a qualification condition (elided in this transcript), then X̄ = L̄ R̄' is a global optimum of (P1).

## Distributed estimator

Replacing the global factors with local copies per agent, coupled through consensus constraints with neighboring nodes, yields (P3). Under network connectivity, (P3) is equivalent to (P2).

Solver: the alternating-direction method of multipliers (ADMM)
- Method: [Glowinski-Marrocco '75], [Gabay-Mercier '76]
- Learning over networks: [Schizas et al '07]

Each agent n keeps its own primal variables and exchanges messages only with its single-hop neighbors.
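To illustrate the single-hop message-passing pattern (not the talk's actual ADMM recursions, which were lost in the transcript), here is a toy consensus iteration: each agent repeatedly moves its local scalar toward its neighbors' values, and all agents converge to the network average. Graph, step size, and initial values are illustrative.

```python
import numpy as np

# Small connected, undirected graph: adjacency as an edge list.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
N = 4
neighbors = {i: [] for i in range(N)}
for a, b in edges:
    neighbors[a].append(b)
    neighbors[b].append(a)

x = np.array([1.0, 5.0, 3.0, 7.0])   # local estimates, one per agent
target = x.mean()                    # consensus value (here 4.0)

# Each iteration uses only single-hop exchanges: agent i sees x[j] for j ~ i.
for _ in range(200):
    x = np.array([x[i] + 0.2 * sum(x[j] - x[i] for j in neighbors[i])
                  for i in range(N)])
```

The same communication pattern underlies (P3): instead of scalars, each agent would exchange its (small) local factor variables with its neighbors.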

## Distributed iterations

[The per-agent ADMM recursions on this slide were not captured in the transcript.]

## Attractive features

- Highly parallelizable, with simple recursions
- Unconstrained QPs per agent; no SVD per iteration
- Low overhead for message exchanges: the exchanged variables are small, so communication cost is independent of network size

Recap:

| Problem | Formulation | Convexity |
|---------|-------------|-----------|
| (P1) | Centralized | Convex |
| (P2) | Separable regularization | Nonconvex |
| (P3) | Consensus | Nonconvex |

Stationary point of (P3) ⇒ stationary point of (P2) ⇒ global optimum of (P1).
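The "unconstrained QPs, no SVD" structure can be sketched with a centralized stand-in: alternating ridge-regularized least squares on the factorized problem. Each row/column solve is a small closed-form QP, analogous in spirit to the per-agent updates (the consensus and multiplier terms of the actual distributed solver are omitted; sizes, rank, and λ are arbitrary).

```python
import numpy as np

rng = np.random.default_rng(2)

# Low-rank ground truth and a 60%-observed mask (illustrative sizes).
T, N, r, lam = 30, 25, 3, 0.1
X_true = rng.standard_normal((T, r)) @ rng.standard_normal((r, N))
mask = rng.random((T, N)) < 0.6
Y = np.where(mask, X_true, 0.0)

L = rng.standard_normal((T, r))
R = rng.standard_normal((N, r))
for _ in range(50):
    for i in range(T):                 # unconstrained QP for row factor l_i
        A = R[mask[i]]
        L[i] = np.linalg.solve(A.T @ A + lam * np.eye(r), A.T @ Y[i, mask[i]])
    for j in range(N):                 # unconstrained QP for column factor r_j
        A = L[mask[:, j]]
        R[j] = np.linalg.solve(A.T @ A + lam * np.eye(r), A.T @ Y[mask[:, j], j])

rel_err = np.linalg.norm(L @ R.T - X_true) / np.linalg.norm(X_true)
print(rel_err)
```

Every update is an r×r linear solve, so no SVD is needed per iteration, which is what keeps the per-agent cost low.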

## Optimality

Proposition 2. If the iterates converge to a limit point satisfying a qualification condition (elided in this transcript), then:
i) [statement not captured in the transcript];
ii) the resulting estimate is the global optimum of (P1).

- ADMM can converge even for nonconvex problems [Boyd et al '11]
- Simple distributed algorithm for optimal matrix imputation
- Centralized performance guarantees, e.g., [Candes-Recht '09], carry over

## Synthetic data

Random network topology with N = 20 nodes; L = 66, T = 66.

[Slide plots of the synthetic-data results were not captured in the transcript.]
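A sketch of this synthetic setup, using the slide's sizes (N = 20 agents, T = L = 66). The rank, sampling ratio, noise level, and row assignment are assumptions, since the slide's exact data model was not captured.

```python
import numpy as np

rng = np.random.default_rng(3)

# Slide sizes: N agents, T x L data matrix. Rank r, 30% sampling, and the
# noise level 0.05 are illustrative assumptions.
N, T, L, r = 20, 66, 66, 4
X = rng.standard_normal((T, r)) @ rng.standard_normal((r, L))     # low-rank data
mask = rng.random((T, L)) < 0.3                                   # observed entries
Y = np.where(mask, X + 0.05 * rng.standard_normal((T, L)), 0.0)   # noisy observations
rows_per_agent = np.array_split(np.arange(T), N)                  # few rows per agent
```

Each agent would then run the distributed solver using only its rows of `Y` and messages from its neighbors.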

## Real data

Network distance prediction [Liau et al '12]: Abilene network data (Aug. 18-22, 2011); end-to-end latency matrix with N = 9, L = T = N; 80% missing data.

Relative error: 10%

[Slide figures: (1) ROC curve; (2) 3-D plot of the detected anomalies.]