SAND2009-2389C 1/17 Coupled Matrix Factorizations using Optimization Daniel M. Dunlavy, Tamara G. Kolda, Evrim Acar Sandia National Laboratories SIAM Conference.

Presentation transcript:

SAND2009-2389C
1/17 Coupled Matrix Factorizations using Optimization
Daniel M. Dunlavy, Tamara G. Kolda, Evrim Acar
Sandia National Laboratories
SIAM Conference on Computational Science and Engineering, March 4, 2009
Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000.

2/17 Motivating Problems
Data with multiple types of two-way relationships
– Bibliometric analysis: author-document, term-document, author-venue, etc.
  Can we predict potential co-authors?
– Movie ratings: movie-actor, user-movie, actor-award
  Can we predict useful movie ratings for other users?
Consistent dimensionality reduction
Improved interpretation through non-negativity constraints

3/17 Some Related Work
Simultaneous factor analysis
– Gramian matrices [Levin, 1966]
– Test score covariance matrices over time [Millsap, et al., 1988]
Simultaneous diagonalization
– Population differentiation in biology [Thorpe, 1988]
– Blind source separation [Ziehe, et al., 2004]
Generalized SVD
– Damped or constrained least squares [Van Loan, 1976]
– Microarray data analysis [Alter, et al., 2003]
– Multimicrophone speech filtering [Doclo and Moonen, 2002]
Simultaneous Non-negative Matrix Factorization
– Gene clustering in microarray data [Badea, 2007; 2008]
Tensor decompositions
– Data mining, chemometrics, neuroscience [Kolda, Acar, Bro, Park, Zhang, Berry, Chen, Martin, CSE09]
Slide annotations: matrices of same size; only 2 matrices; slow; at least one common dimension

4/17 Coupled Non-negative Matrix Factorization (CNMF) Given Solve document-term document-author
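The equations on this slide did not survive in the transcript. One plausible reading, given the document-term and document-author labels and the dimensions m, n, p, r named on the experiments slide, is the following sketch. The symbols X, Y, W, H, V are assumptions introduced here for illustration and are not taken from the slides: X is the m-by-n document-term matrix, Y is the m-by-p document-author matrix, and W is the factor shared along the common document dimension.

```latex
\text{Given } X \in \mathbb{R}^{m \times n}_{\ge 0},\quad Y \in \mathbb{R}^{m \times p}_{\ge 0},\quad \text{and a rank } r,
\qquad
\text{solve}\quad
\min_{W,\,H,\,V}\; \lVert X - W H^{T} \rVert_F^{2} + \lVert Y - W V^{T} \rVert_F^{2}
\quad \text{s.t.}\quad W \ge 0,\; H \ge 0,\; V \ge 0,
```

where W is m-by-r, H is n-by-r, and V is p-by-r. The coupling comes from W appearing in both terms.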

5/17 Method: CNMF-ALS CNMF-ALS: Alternating Least Squares [Extends Berry, et al., 2006] linear least squares + simple projection to constraint boundary
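A minimal sketch of the "linear least squares + simple projection" idea, assuming the coupled model X ≈ W Hᵀ and Y ≈ W Vᵀ with a shared factor W. The function name `cnmf_als`, the symbols, and the choice to stack X and Y side by side are illustrative assumptions, not the authors' Matlab code.

```python
import numpy as np

def cnmf_als(X, Y, r, n_iter=100, seed=0):
    """Hypothetical coupled NMF via alternating least squares with projection."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    Z = np.hstack([X, Y])          # m x (n + p): both matrices share the row factor W
    W = rng.random((m, r))
    for _ in range(n_iter):
        # Fix W, solve W @ F.T ~= Z in the least-squares sense, then clip negatives to 0.
        F = np.linalg.lstsq(W, Z, rcond=None)[0].T      # (n + p) x r, stacked H and V
        F = np.maximum(F, 0.0)
        # Fix F, solve F @ W.T ~= Z.T for W, then clip negatives to 0.
        W = np.linalg.lstsq(F, Z.T, rcond=None)[0].T    # m x r
        W = np.maximum(W, 0.0)
    return W, F[:n], F[n:]         # W, H (n x r), V (p x r)
```

Clipping to zero is the simplest projection onto the non-negativity constraint; it does not guarantee that the objective decreases at every step.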

6/17 Method: CNMF-MULT CNMF-MULT: Multiplicative Updates [Badea, 2007; Badea, 2008; extends Lee and Seung, 2001]
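A sketch of what multiplicative updates could look like for the coupled problem, assuming the objective ‖X − W Hᵀ‖² + ‖Y − W Vᵀ‖² and extending the Lee and Seung update pattern in the obvious way. The function name `cnmf_mult`, the symbols, and the small constant `eps` are assumptions made for illustration; the exact update rules on the slide were not preserved.

```python
import numpy as np

def cnmf_mult(X, Y, r, n_iter=200, eps=1e-12, seed=0):
    """Hypothetical coupled NMF with Lee-Seung style multiplicative updates."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    p = Y.shape[1]
    W, H, V = rng.random((m, r)), rng.random((n, r)), rng.random((p, r))

    def objective():
        return np.sum((X - W @ H.T) ** 2) + np.sum((Y - W @ V.T) ** 2)

    history = [objective()]
    for _ in range(n_iter):
        # Each factor is scaled elementwise by (negative gradient part) / (positive part),
        # so entries that start non-negative stay non-negative.
        H *= (X.T @ W) / (H @ (W.T @ W) + eps)
        V *= (Y.T @ W) / (V @ (W.T @ W) + eps)
        W *= (X @ H + Y @ V) / (W @ (H.T @ H + V.T @ V) + eps)
        history.append(objective())
    return W, H, V, history
```

The update for W combines contributions from both X and Y, which is where the coupling enters.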

7/17 Method: CNMF-OPT CNMF-OPT: Projective Nonlinear CG, More-Thuente LS [Extends Acar, Kolda, and Dunlavy, 2009 and Lin, 2007]
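An optimization-based method needs the objective value and its gradients with respect to all factors at once. The sketch below computes both for the assumed objective ‖X − W Hᵀ‖² + ‖Y − W Vᵀ‖² and takes one projected gradient step with a simple backtracking (Armijo) line search. This is a deliberately simplified stand-in: it is not nonlinear CG and does not use a Moré-Thuente line search, and all names and symbols are illustrative assumptions.

```python
import numpy as np

def cnmf_objective_and_grads(X, Y, W, H, V):
    """Objective ||X - W H^T||^2 + ||Y - W V^T||^2 and its gradients (assumed model)."""
    RX = W @ H.T - X
    RY = W @ V.T - Y
    f = np.sum(RX ** 2) + np.sum(RY ** 2)
    gW = 2.0 * (RX @ H + RY @ V)   # W receives terms from both matrices
    gH = 2.0 * (RX.T @ W)
    gV = 2.0 * (RY.T @ W)
    return f, gW, gH, gV

def projected_gradient_step(X, Y, W, H, V, step=1.0, shrink=0.5, c=1e-4, max_tries=30):
    """One projected gradient step with backtracking; returns unchanged factors on failure."""
    f, gW, gH, gV = cnmf_objective_and_grads(X, Y, W, H, V)
    for _ in range(max_tries):
        # Move against the gradient, then project onto the non-negative orthant.
        Wn = np.maximum(W - step * gW, 0.0)
        Hn = np.maximum(H - step * gH, 0.0)
        Vn = np.maximum(V - step * gV, 0.0)
        fn = cnmf_objective_and_grads(X, Y, Wn, Hn, Vn)[0]
        # Sufficient-decrease test using the actual (projected) displacement.
        slope = np.sum(gW * (Wn - W)) + np.sum(gH * (Hn - H)) + np.sum(gV * (Vn - V))
        if fn <= f + c * slope:
            return Wn, Hn, Vn, fn
        step *= shrink
    return W, H, V, f
```

Updating W, H, and V together in one step is the main contrast with the alternating schemes, which fix all but one factor at a time.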

8/17 Matlab Experiments
Noise:
Table columns: m, n, p, r*, # var

9/17 Results: No noise, r = r*

10/17 Results: No noise, r = r*

11/17 Results: No noise, r = r*

12/17 Results: No noise, r = r* + 1

13/17 Results: No noise, r = r* + 1

14/17 Results: Noisy data, r = r* + 1

15/17 Future Work
Extending other promising methods to CNMF
– Block principal pivoting based NMF [Park, et al., 2008]
– Projected gradient NMF [Lin, 2007]
– Projected Newton NMF [Kim, et al., 2008]
CNMF-OPT extensions
– Sparse data, regularization [Acar, Kolda, and Dunlavy, 2009]
– Sparsity constraints [Park, et al., 2008]
Numerical experiments
– Scale to larger data sets
– Comparisons on real data sets [Park, et al., 2008]
Alternate models / problem formulations
– Coupling matrix and tensor decompositions (CNMF/CNTF)

16/17 Conclusions
Coupled matrix factorizations
– Method for computing factorizations consistent along common dimensions in data
Results
– CNMF-OPT: Fast and accurate
  – Overfactors well and handles noise well
– CNMF-ALS: Fast, but not accurate
  – Overfactoring is a big challenge
– CNMF-MULT: Accurate, but may be too slow (similar to NMF results)
Future Work
– Identified several promising paths forward

17/17 Thank You Coupled Matrix Factorizations using Optimization Danny Dunlavy