# “Non-negative Matrix Factorization (NMF) for Pattern Recognition” T. Ensari, J. Chorowski, J. M. Zurada University of Louisville, USA.

Outline
- Definition
- Why NMF?
- NMF Algorithms
- Clustering with NMF
- Application Areas: Gene/Protein, Image, Audio and Text Data Analysis
- NMF for Document Clustering
- Several Types of NMF
- Conclusion

Definition of NMF

- History: Proposed by Lee and Seung, Nature, 1999.
- NMF can be used as an unsupervised dimension reduction / clustering method.
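For reference, NMF is usually stated as follows. This is a sketch of the standard formulation: the data matrix is written as A and the factors as W and H, matching the symbols used elsewhere in these slides, while the dimension symbols m and n, the typical condition on k, and the choice of the Frobenius norm as the objective are assumptions rather than content taken from the slide.

```latex
A \approx W H, \qquad
A \in \mathbb{R}_{\ge 0}^{m \times n},\quad
W \in \mathbb{R}_{\ge 0}^{m \times k},\quad
H \in \mathbb{R}_{\ge 0}^{k \times n},\quad
k < \min(m, n)

\min_{W \ge 0,\; H \ge 0} \; \lVert A - W H \rVert_F^2
```

Here k is the reduced dimension (or number of clusters), and every entry of W and H is constrained to be non-negative.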

Why NMF?
- Non-negativity constraints are physically meaningful:
  - Pixels in a digital image → Biomedical Image Processing
  - Molecule concentrations in bioinformatics (e.g. mRNA, protein, miRNA) → Microarray Analysis
  - Signal intensities in mass spectrometry → Computational Proteomics
- Speed: fast convergence.
- It can be applied to several tasks (gene/protein microarray data analysis, digital image processing, text data mining, etc.).
- Both hard and soft clustering are possible.

NMF Algorithms
- Multiplicative Update Rule (Lee & Seung, 2000).
- Gradient Descent (Hoyer, 2004).
- Alternating Least Squares (Paatero, 1994).
- The result of NMF is algorithm dependent, so W and H are not unique!
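A minimal sketch of the multiplicative update rule for the Frobenius-norm objective, assuming NumPy. The function name `nmf_multiplicative`, the iteration count, the random initialization, and the small constant `eps` (added only to avoid division by zero) are illustrative choices, not details taken from the slides.

```python
import numpy as np

def nmf_multiplicative(A, k, n_iter=200, eps=1e-9, seed=0):
    """Approximate a non-negative matrix A (m x n) by W (m x k) @ H (k x n)."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    # Random non-negative starting point (an assumption; other schemes exist).
    W = rng.random((m, k)) + eps
    H = rng.random((k, n)) + eps
    for _ in range(n_iter):
        # Element-wise multiplicative updates keep W and H non-negative.
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy data: a non-negative matrix that has an exact rank-3 factorization.
rng = np.random.default_rng(1)
A = rng.random((6, 3)) @ rng.random((3, 8))

W1, H1 = nmf_multiplicative(A, k=3, seed=0)
W2, H2 = nmf_multiplicative(A, k=3, seed=1)
rel_err = np.linalg.norm(A - W1 @ H1) / np.linalg.norm(A)
print("relative reconstruction error:", rel_err)
```

Running the sketch with two different seeds generally yields different factors `W1` and `W2` that both reconstruct `A` well, which illustrates why W and H are not unique.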

NMF Algorithms

Clustering with NMF
- NMF is one of several dimension reduction / clustering methods. Other methods in the literature include:
  - k-Means Clustering
  - Singular Value Decomposition (SVD)
  - Self Organizing Maps (SOM)
  - Hierarchical Clustering
  - Principal Component Analysis (PCA)
  - Mixture of Gaussians

Clustering with NMF
- What is clustering?
- Finding groups of objects such that the objects in a group are similar (or related) to one another and different from (or unrelated to) the objects in other groups.
  - Inter-cluster distances are maximized.
  - Intra-cluster distances are minimized.

Clustering with NMF
- Document Clustering: Grouping of text documents into meaningful clusters in an unsupervised manner.
- Example clusters: Government, Science, Arts

Clustering with NMF
- It is an unsupervised clustering example.
- For low-dimensional data sets, our eyes are excellent at clustering.
- Cluster analysis becomes much more challenging (and much more interesting) if the data set is both large and high-dimensional.
- The goal of cluster analysis is to find hidden structure in a data set.
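One common way to read clusters off an NMF result can be sketched as follows, assuming the samples are the columns of the data matrix so that each column of H holds one sample's cluster weights. The matrix `H` below is a hypothetical hand-written example, not output from the slides.

```python
import numpy as np

# Hypothetical coefficient matrix H: k = 2 clusters (rows), n = 4 samples (columns).
H = np.array([[0.9, 0.1, 0.4, 0.0],
              [0.1, 0.7, 0.6, 0.5]])

# Hard clustering: assign each sample to the cluster with the largest weight.
hard = H.argmax(axis=0)

# Soft clustering: normalize each column so the weights act as memberships.
soft = H / H.sum(axis=0, keepdims=True)

print(hard)
print(soft.round(3))
```

The hard assignment keeps only the dominant cluster per sample, while the soft version retains how strongly each sample belongs to every cluster.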

Applications on Gene/Protein Data
- Goal: Discover hidden patterns in large quantities of data produced from microarray experiments.
- Explore data to identify structure without supervision.
- Data can be represented as a non-negative matrix (genes × samples).

Applications on Gene/Protein Data

Applications on Image Processing
- Data Compression
- Clustering Images
- Finding Similar Images

Applications on Image Processing
- Reconstructed images:

Applications on Audio Data
- Audio demonstration: We can separate the sounds.

Applications on Audio Data
- Amplitude spectrogram: Audio represented as a non-negative matrix.
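A rough sketch of how an amplitude spectrogram can be built as a non-negative matrix, assuming NumPy and a simple short-time Fourier transform. The function name `amplitude_spectrogram`, the frame length, the hop size, the Hann window, and the synthetic two-tone signal are all illustrative assumptions.

```python
import numpy as np

def amplitude_spectrogram(signal, frame_len=256, hop=128):
    """Return a (frequency bins x time frames) matrix of non-negative magnitudes."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    # The magnitude of the complex spectrum is non-negative by construction.
    return np.abs(np.fft.rfft(frames, axis=1)).T

# Synthetic one-second signal: a 440 Hz tone plus a quieter 1000 Hz tone.
sr = 8000
t = np.arange(sr) / sr
signal = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 1000 * t)

S = amplitude_spectrogram(signal)
print(S.shape)
```

Because every entry of `S` is a magnitude, the matrix satisfies the non-negativity requirement and can be passed to an NMF routine directly.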

Applications on Text Data
a) Typical document matrix before clustering
b) Document clustering with NMF (k=2)
c) Document clustering with NMF (k=5)

Applications on Text Data
- Data Compression
- Finding Similar Terms
- Finding Similar Documents
- Cluster Documents
- Topic Detection and Tracking

NMF for Document Clustering
- 20 Newsgroups dataset:

| Dataset | Number of Documents | Number of Classes |
|---|---|---|
| Newsgroups | 20,000 | 20 |
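Before NMF can be applied, the documents have to be turned into a non-negative matrix. A minimal sketch of a term-by-document count matrix, assuming NumPy and whitespace tokenization; the four-document toy corpus is hypothetical and stands in for the real dataset, and the slides do not say which weighting (raw counts, TF-IDF, etc.) was used.

```python
import numpy as np

# Hypothetical toy corpus of already-stemmed documents (not the Newsgroups data).
docs = ["god jesus christian god",
        "vga monitor driver card",
        "jesus god moral",
        "driver vga card card"]

# Vocabulary: one row of the matrix per distinct term.
vocab = sorted({w for d in docs for w in d.split()})
index = {w: i for i, w in enumerate(vocab)}

# A[i, j] = number of times term i occurs in document j (always >= 0).
A = np.zeros((len(vocab), len(docs)))
for j, d in enumerate(docs):
    for w in d.split():
        A[index[w], j] += 1

print(vocab)
print(A)
```

With this layout each column of A is a document, so factoring A ≈ WH gives term weights per cluster in W and cluster weights per document in H.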

NMF for Document Clustering
- Synonyms
- Noise in the data matrix A. For example: Century, Symbol,…

NMF for Document Clustering
- Matlab outputs after using NMF (k = 10):

religion christian peopl god line detail valu moral server scienc talk object jesus saw mac built arab frank dwyer configur → RELIGION name uk mathew shall folk tree righteous pin speed ram ps mb meg isa centri slot ns simm mail drive help pleas info anybodi video manufactur monitor vga → COMPUTER name com server file help sandvik newton appl kent ignor spread window stein brad kill guess imagin final water org reveal river sourc israel isra arab civilian ncsu mb alan norton lebanon hasan nysernet hernlem lebanes net know object option thank summar advanc compil righteous anybodi driver latest site ftp ati window bio avail price street charg card uk mathew cost sorri plus display fix driver super vga ati ultra mb ship diamond beast armenian atheism version atheist exist god stein edu answer cs keith ve ac charley wingat mango umd contradictori imag ultb isc rit mozumd il version word god rather man brad keep shall said hear turkish org tree heart righteous receiv luke bless davidian ps isa turkey armenian sdpa armenia urartu → POLITICS card file po cwru hear format mous summar compil islam convert job email muslim luke bless bus diamond slot
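Term lists like the ones above are typically obtained by sorting each column of W and keeping the highest-weighted terms. A small sketch of that step, assuming NumPy; the vocabulary, the matrix `W`, and the helper name `top_terms` are hypothetical and are not the Matlab code or data behind the slide.

```python
import numpy as np

# Hypothetical vocabulary and basis matrix W (terms x k clusters, here k = 2).
vocab = ["god", "jesus", "moral", "vga", "driver", "card"]
W = np.array([[0.9, 0.0],
              [0.7, 0.1],
              [0.4, 0.0],
              [0.0, 0.8],
              [0.1, 0.6],
              [0.0, 0.5]])

def top_terms(W, vocab, n_top=3):
    """For each column of W, list the n_top terms with the largest weights."""
    # argsort on -W orders each column from largest to smallest weight.
    order = np.argsort(-W, axis=0)[:n_top]
    return [[vocab[i] for i in order[:, c]] for c in range(W.shape[1])]

topics = top_terms(W, vocab)
for c, terms in enumerate(topics):
    print(f"cluster {c}:", " ".join(terms))
```

A human reader then assigns a label such as RELIGION or COMPUTER to each cluster by inspecting its top terms.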

Several Types of NMF
- Several types of NMF have been proposed in the literature, including:
  - Sparse NMF
  - Quadratic NMF
  - Probabilistic NMF
  - Orthogonal NMF
  - Nonsmooth NMF
  - Weighted NMF
  - Convex NMF
  - Bayesian NMF
  - Gaussian NMF
  - Projective NMF

Conclusion
- We can use NMF for:
  - Dimensionality Reduction (Data Mining)
  - Clustering Analysis (Pattern Analysis)
- Current NMF Research:
  - Algorithms
  - Alternative Objective Functions
  - Convergence Criterion
  - Updating NMF
  - Initializing NMF
  - Choosing k

Thank you… and Questions?
