DIMENSIONALITY REDUCTION
Computer Graphics Course, June 2013

What is high dimensional data?
Images. Videos. Documents. Most data, actually!

What is high dimensional data?
- Images – dimension 3·X·Y
- Videos – dimension of an image × number of frames
- Documents
- Most data, actually

How many dimensions?
- Images – dimension 3·X·Y
- This is the number of bytes in the image file
- We can treat each byte as a dimension
- Each image is a point in high dimensional space
- Which space? The "space of images of size X·Y"
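
As a concrete (hypothetical) illustration, the sketch below flattens a randomly generated 640×480 RGB array into a single vector; the image size and the use of numpy are assumptions, not part of the slides.

```python
import numpy as np

# Hypothetical example: a 640x480 RGB image, filled with random pixels here
# because no real image file is part of the slides.
X, Y = 640, 480
img = np.random.randint(0, 256, size=(Y, X, 3), dtype=np.uint8)

# Treat each byte as a dimension: the image becomes one point in R^(3*X*Y).
point = img.reshape(-1).astype(np.float64)
print(point.shape)  # (921600,) -> a single point in a 921,600-dimensional space
```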

How many dimensions?
- But we can describe an image using far fewer bytes!
- "Blue sky, green grass, yellow road…"
- "Drawing of a kung-fu rat"

Why do Dimensionality Reduction?
- Visualization: understanding the structure of the data

Why do Dimensionality Reduction?
- Visualization: understanding the structure of the data
- Fewer dimensions make it easier to describe the data and to find correlations (rules)
- Compression of the data for efficiency
- Clustering
- Discovering similarities between elements

Why do Dimensionality Reduction?
- Curse of dimensionality: …
- All these vectors are the same Euclidean distance from each other
- But some dimensions could be "worth more"
- Can you work with 1,000 images of 1,000,000 dimensions each?
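
The example vectors on this slide did not survive the transcript; as a stand-in, here is a small sketch (my assumption) using one-hot vectors, which are all exactly the same Euclidean distance apart:

```python
import numpy as np

# 1000 one-hot vectors in a 1000-dimensional space.
n = 1000
vectors = np.eye(n)

# Any two distinct one-hot vectors are exactly sqrt(2) apart,
# so Euclidean distance alone cannot say which pairs are "more similar".
print(np.linalg.norm(vectors[0] - vectors[1]))    # ~1.4142
print(np.linalg.norm(vectors[0] - vectors[999]))  # ~1.4142
```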

How to reduce dimensions?
- Image features:
  - Average colors
  - Histograms
  - FFT-based features (frequency space)
  - More…
- Video features
- Document features
- Etc.
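
A minimal sketch of what such features might look like, assuming numpy and an image given as an (H, W, 3) array; the function name image_features and the choice of 8 histogram bins are illustrative, not from the slides:

```python
import numpy as np

def image_features(img, bins=8):
    """Turn an (H, W, 3) RGB image into a short feature vector."""
    # Average color: 3 numbers.
    avg_color = img.reshape(-1, 3).mean(axis=0) / 255.0
    # Per-channel intensity histogram: 3 * bins numbers, normalized per channel.
    hists = [np.histogram(img[..., c], bins=bins, range=(0, 255))[0] / img[..., c].size
             for c in range(3)]
    return np.concatenate([avg_color] + hists)

img = np.random.randint(0, 256, size=(480, 640, 3))
print(image_features(img).shape)  # (27,) = 3 average-color values + 3 * 8 histogram bins
```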

How to reduce dimensions?
- The feature dimension is still quite high (512, 1024, etc.)
- What now?

Linear Dimensionality Reduction
- Simplest way: project all points onto a plane (2D) or a lower dimensional sub-space

Linear Dimensionality Reduction
- Simplest way: project all points onto a plane (2D)
- Only one question: which plane is the best?
- PCA (SVD)

Linear Dimensionality Reduction
- Simplest way: project all points onto a plane (2D)
- Only one question: which plane is the best?
- PCA (SVD)
- For specific applications:
  - CCA (correlation)
  - LDA (data with labels)
  - NMF (non-negative components)
  - ICA (multiple sources)
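
A minimal PCA-via-SVD sketch in numpy, assuming the data is given as rows of a matrix; this is a generic implementation, not code from the course:

```python
import numpy as np

def pca(X, d=2):
    """Project n points (rows of X) onto the d directions of largest variance."""
    Xc = X - X.mean(axis=0)                    # center the data
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:d].T                       # coordinates along the top d principal axes

features = np.random.randn(1000, 512)          # e.g. 1000 feature vectors of dimension 512
projected = pca(features, d=2)
print(projected.shape)                         # (1000, 2)
```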

Non-Linear Dimensionality Reduction
- What if the data is not linear?
- No plane will work here

Non-Linear Dimensionality Reduction
- MDS – MultiDimensional Scaling
- Uses only the distances between elements
- Tries to reconstruct element positions from the distances, such that the reconstructed distances match the given ones as closely as possible
- Reconstruction can happen in 1D, 2D, 3D, …
- More dimensions = less error

Non-Linear Dimensionality Reduction
- MDS – MultiDimensional Scaling
- Classical MDS: an algebraic solution
  - Construct a squared proximity matrix and normalize it ("double centering")
  - Extract the d largest eigenvectors / eigenvalues
  - Multiply each eigenvector by sqrt(eigenvalue)
  - Each row gives the coordinates of its corresponding point
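
A sketch of these steps, assuming numpy and a precomputed distance matrix D; the function name classical_mds is illustrative:

```python
import numpy as np

def classical_mds(D, d=2):
    """Recover d-dimensional coordinates from an n x n matrix of pairwise distances D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n                 # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                         # "double centering" of squared distances
    w, V = np.linalg.eigh(B)                            # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:d]                       # take the d largest
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0))   # each row = coordinates of one point

# Sanity check: distances between 5 random 3D points can be embedded back into 3D.
pts = np.random.randn(5, 3)
D = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)
coords = classical_mds(D, d=3)
```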

Non-Linear Dimensionality Reduction
- MDS – MultiDimensional Scaling
- Classical MDS: an algebraic solution
- [Figure: a matrix with eigenvectors e1…e5 as columns and points x1…x5 as rows; each eigenvector adds a dimension to the mapping]

Non-Linear Dimensionality Reduction
- Non-metric MDS: an optimization problem
- Example: Sammon's projection
  - Start from random positions for each element
  - Define the stress of the system (how far the embedded distances are from the original ones)
  - In each step, move towards positions that reduce the stress (gradient descent)
  - Continue until convergence
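
A hedged sketch of the gradient-descent idea, using a simplified raw stress sum_ij (||y_i − y_j|| − D_ij)²; Sammon's projection proper additionally weights each term by 1/D_ij, which is omitted here for brevity:

```python
import numpy as np

def stress_mds(D, d=2, steps=1000, lr=0.01):
    """Gradient descent on the raw stress between embedded and target distances."""
    n = D.shape[0]
    Y = np.random.randn(n, d) * 0.1               # random initial positions
    for _ in range(steps):
        diff = Y[:, None, :] - Y[None, :, :]      # (n, n, d) displacement vectors
        dist = np.linalg.norm(diff, axis=-1) + 1e-9
        grad = ((dist - D) / dist)[:, :, None] * diff
        Y -= lr * grad.sum(axis=1)                # move each point to reduce the stress
    return Y
```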

Non-Linear Dimensionality Reduction
- Spectral embedding:
  - Create a graph of nearest neighbors
  - Compute the graph Laplacian (related to the probability of walking along each edge in a random walk)
  - Compute its eigenvalues and eigenvectors – why? Finding the eigenvectors is like multiplying the matrix by itself many, many times (towards infinity), which is like performing the random walk over and over until it reaches a stable point
  - Again, the eigenvectors give the coordinates
  - Does not preserve distances like MDS – instead it groups together points that are likely neighbors
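
A rough sketch of this pipeline using an unnormalized graph Laplacian (the slides do not say which Laplacian variant was used, so that choice is an assumption), with numpy only:

```python
import numpy as np

def spectral_embedding(X, d=2, k=10):
    """Laplacian-eigenmaps-style embedding from a k-nearest-neighbour graph."""
    n = X.shape[0]
    D2 = ((X[:, None] - X[None, :]) ** 2).sum(-1)     # squared pairwise distances
    W = np.zeros((n, n))
    nn = np.argsort(D2, axis=1)[:, 1:k + 1]            # indices of the k nearest neighbours
    for i in range(n):
        W[i, nn[i]] = 1.0
    W = np.maximum(W, W.T)                             # symmetrize the adjacency matrix
    L = np.diag(W.sum(1)) - W                          # unnormalized graph Laplacian
    w, V = np.linalg.eigh(L)
    return V[:, 1:d + 1]                               # skip the constant eigenvector
```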

Non-Linear Dimensionality Reduction
- Other non-linear methods:
  - Locally Linear Embedding (LLE): express each point as a linear combination of its neighbors
  - Isomap: takes an adjacency graph as input and computes MDS on the geodesic distances (distances along the graph)
  - Self Organizing Maps (SOM): next part…
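
For Isomap, a sketch under the assumption that scipy is available for the shortest-path computation; it reuses the classical_mds sketch from earlier:

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path

def isomap(X, d=2, k=10):
    """Isomap sketch: geodesic (graph) distances followed by classical MDS."""
    n = X.shape[0]
    D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
    G = np.full((n, n), np.inf)                   # inf = no edge
    nn = np.argsort(D, axis=1)[:, 1:k + 1]
    for i in range(n):
        G[i, nn[i]] = D[i, nn[i]]                 # keep only edges to the k nearest neighbours
    G = np.minimum(G, G.T)
    geo = shortest_path(G, method='D')            # geodesic distances along the graph
    return classical_mds(geo, d)                  # classical_mds defined in the MDS sketch above
```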

SELF ORGANIZING MAPS & RECENT APPLICATIONS
Computer Graphics Course, June 2013

Self Organizing Maps (SOM)
- Originated from neural networks
- Created by Kohonen, 1982
- Also known as Kohonen Maps
- Teuvo Kohonen: a Finnish researcher working on learning and neural networks
- Thanks to the SOM, he became the most cited Finnish scientist!
- More than 8,000 citations
- So what is it?

What is a SOM?
- A type of neural network
- What is a neuron?
  - A function with several inputs and one output
  - In this case – usually a linear combination of the inputs according to weights

What is a SOM?
[Figure: a grid of neurons; each neuron receives the input (x_k) through weights (m_ik); there are no connections (feedback / feed forward) between the neurons]

Training a SOM
- Start from random weights
- For each input X(t) at iteration t:
  - Find the Best Matching Cell (BMC), also called the Best Matching Unit (BMU), for X(t)
  - Update the weights of each neuron close to the BMC
  - Weights are updated according to a decaying learning rate and radius

Training a SOM
[Figure: the grid of neurons (m_i); input X(1) is mapped to BMC(1), input X(2) to BMC(2)]

Training a SOM – The Math
- Best Matching Cell: the m_c for which ||x(t) − m_c(t)|| is minimal
- Another option for the BMC: maximal dot product x(t)^T m_c(t)
- Weight adaptation: m_i(t+1) = m_i(t) + α(t, d(m_i, m_c)) · [x(t) − m_i(t)]
- α is a learning rate that depends on both the time t and the distance of m_i from the BMC m_c
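
Putting the training loop and the update rule together, here is a minimal SOM sketch in numpy; the Gaussian form of the neighborhood kernel is an assumption, while the 0.25·(H+W)·(1 − t/n_L) width schedule follows the motion-map example on the next slide. The function name train_som is illustrative:

```python
import numpy as np

def train_som(X, H=10, W=10, n_iters=10000, lr0=0.1):
    """Minimal SOM: an H x W grid of neurons with weight vectors m_i in input space."""
    dim = X.shape[1]
    weights = np.random.rand(H, W, dim)                     # start from random weights
    grid = np.stack(np.meshgrid(np.arange(H), np.arange(W), indexing='ij'), axis=-1)
    for t in range(n_iters):
        x = X[np.random.randint(len(X))]                    # pick a training sample
        # Best Matching Cell: the neuron whose weights are closest to x
        d2 = ((weights - x) ** 2).sum(axis=-1)
        c = np.unravel_index(np.argmin(d2), (H, W))
        # Decaying learning rate and neighbourhood radius
        lr = lr0 * (1 - t / n_iters)
        sigma = max(0.25 * (H + W) * (1 - t / n_iters), 1e-3)
        # Pull every neuron towards x, more strongly the closer it sits to the BMC on the grid
        grid_d2 = ((grid - np.array(c)) ** 2).sum(axis=-1)
        h = lr * np.exp(-grid_d2 / (2 * sigma ** 2))
        weights += h[..., None] * (x - weights)
    return weights
```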

Training a SOM – The Math
- Example (motion map): the neighborhood function depends on the distance between the BMC and m_i, a learning rate, and a kernel width
- Kernel width: σ = 0.25·(H+W)·(1 − t/n_L), where H and W are the height and width of the neuron map and n_L is the maximum number of iterations

Presenting a SOM
- Option 1: at each node, present the data that corresponds to the vector m_i (3D data, colors, continuous spaces)
- For a color map with 3 inputs, if a neuron's weights are (0.7, 0.2, 0.3) we would show a reddish color with red component 0.7, green component 0.2 and blue component 0.3
- For a map of points in the plane with 2 inputs, we would draw a point for each neuron at position (w_x, w_y)
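
A small sketch of the color-map presentation, reusing the hypothetical train_som sketch from above and assuming matplotlib for display:

```python
import numpy as np
import matplotlib.pyplot as plt

# Train a SOM on random RGB colors (3 inputs), then show each neuron as the color
# encoded by its 3 weights, e.g. (0.7, 0.2, 0.3) is drawn as a reddish cell.
colors = np.random.rand(5000, 3)
som = train_som(colors, H=20, W=20)          # sketch defined after the SOM math above
plt.imshow(np.clip(som, 0, 1))
plt.axis('off')
plt.show()
```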

Presenting a SOM
- Option 1: at each node, present the data that corresponds to the vector m_i (3D data, colors, continuous spaces)

Presenting a SOM
- Option 2: give each neuron a representation from the training set X – the sample which is closest to the vector m_i

More Examples

Motion Map
- Motion Map: Image-based Retrieval and Segmentation of Motion Data
- Sakamoto, Kuriyama, Kaneko
- SCA: Symposium on Computer Animation 2004
- Goal: present the user with a grid of postures in order to select a clip of motion data from a large database
- Perform clustering on the SOM instead of on the abstract data

Motion Map
- Example results: 436 posture samples from 55K frames of 51 motion files

Motion Map
- Example results: clustering based on the SOM

Motion Map - Details
- A map of posture samples is created from all motion files together
- Samples are chosen so that each sample's similarity to its closest sample exceeds a given threshold, which reduces the computation time
- A standard SOM is calculated
- Each posture is then connected to a hash table of the motion files that contain similar postures
- Clustering the SOM makes it possible to display a simplified map to the user (next slide)

Motion Map - Details
- Simplified map after SOM clustering: 17 dance styles

Procedural Texture Preview
- Eurographics 2012
- Goal: present the user with a single image which shows all the possibilities of a procedural texture
- Method overview:
  - Select candidate parameter vectors that maximize completeness, variety and smoothness
  - Organize the candidates in a SOM
  - Synthesize a continuous map

Procedural Texture Preview
[Results figure: thumbnails of random parameter settings compared with the texture preview combined into a single image, annotated with the texture parameters]

Procedural Texture Preview - Details
- Candidates for the parameter map are selected using the following optimizations:
  - C = a set of dense samples; X = the candidates in the parameter map
  - Completeness: minimize
  - Variety: maximize
  - Smoothness: minimize

Procedural Texture Preview - Details
- A standard SOM will jointly optimize completeness and smoothness
- To optimize variety as well, the SOM implementation alternates between the completeness and variety objectives
- Instead of a regular learning-rate update, at each step the candidates (weight vectors) are replaced by new candidates according to the above optimizations

Procedural Texture Preview - Details
- After the candidate selection, an image is synthesized which smoothly combines all the selected candidates
- Stitching is done using standard patch-based texture synthesis methods (Graphcut Textures, Kwatra et al., TOG 2003)

Procedural Texture Preview
- Some more results

That's all folks!
- Questions?