What is high dimensional data?
- Images: dimension 3·X·Y
- Videos: dimension of one image × number of frames
- Documents
- Most data, actually!
Images: dimension 3·X·Y
- This is the number of bytes in an uncompressed 8-bit RGB image of X·Y pixels
- We can treat each byte as a dimension
- Each image is then a point in a high dimensional space
- Which space? The "space of images of size X·Y"
- How many dimensions?
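A minimal sketch of "each image is a point", assuming an uncompressed 8-bit RGB image held in a NumPy array. The sizes `X` and `Y` and the random pixel values are illustrative assumptions, not values from the slides.

```python
import numpy as np

X, Y = 4, 3  # illustrative image width and height (assumed values)
rng = np.random.default_rng(0)

# An 8-bit RGB image: Y rows, X columns, 3 color channels, one byte each.
image = rng.integers(0, 256, size=(Y, X, 3), dtype=np.uint8)

# Flattening turns the image into one point in a 3*X*Y-dimensional space.
point = image.reshape(-1).astype(np.float64)
print(point.shape)
```

Every image of the same size maps to a vector of the same length, so distances between images can be computed like distances between ordinary points.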
- But we can describe an image using fewer bytes!
- "Blue sky, green grass, yellow road…"
- "Drawing of a kung-fu rat"
- How many dimensions?
Why do Dimensionality Reduction?
- Visualization: understanding the structure of the data
- Fewer dimensions are easier to describe, and correlations (rules) are easier to find
- Compression of data for efficiency
- Clustering
- Discovering similarities between elements
Why do Dimensionality Reduction?
- Curse of dimensionality
- [Figure: a set of high dimensional vectors] All these vectors are at the same Euclidean distance from each other
- But some dimensions could be "worth more"
- Can you work with 1,000 images of 1,000,000 dimensions each?
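The vectors from the original figure are not preserved, so the sketch below assumes a simple stand-in: vectors with a single 1 in different dimensions. It only illustrates how a set of distinct high dimensional vectors can all be equally far apart.

```python
import numpy as np

n = 1000  # number of dimensions (illustrative)
# Assumed example: five vectors, each with a single 1 in a different dimension.
vectors = np.eye(n)[:5]

# Pairwise Euclidean distances between all five vectors.
diff = vectors[:, None, :] - vectors[None, :, :]
dist = np.sqrt((diff ** 2).sum(axis=-1))

# Ignore the zero distances of each vector to itself.
off_diagonal = dist[~np.eye(5, dtype=bool)]
print(off_diagonal.min(), off_diagonal.max())  # both equal sqrt(2)
```

Since every pair is at distance sqrt(2), plain Euclidean distance cannot tell which of these vectors are "similar" without knowing which dimensions matter more.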
How to reduce dimensions?
- Image features: average colors, histograms, FFT-based features (frequency space), and more
- Video features
- Document features
- Etc.
How to reduce dimensions?
- Feature dimension is still quite high (512, 1024, etc.)
- What now?
Linear Dimensionality Reduction
- Simplest way: project all points onto a plane (2D) or another lower dimensional subspace
- Only one question: which plane is the best?
- PCA (computed via SVD): the subspace that preserves the most variance
- For specific applications: CCA (correlation), LDA (data with labels), NMF (non-negative components), ICA (multiple sources)
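A minimal sketch of PCA through the SVD, assuming synthetic data that lies close to a 2D plane inside a 10-dimensional space. The helper name `pca_project` and the data are illustrative assumptions.

```python
import numpy as np

def pca_project(points, d=2):
    """Project points (n_samples x n_dims) onto their top-d principal axes."""
    centered = points - points.mean(axis=0)
    # Rows of vt are the principal directions, ordered by singular value.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:d].T, singular_values

rng = np.random.default_rng(0)
# Synthetic data: points near a 2D plane embedded in 10 dimensions.
plane = rng.normal(size=(2, 10))
data = rng.normal(size=(200, 2)) @ plane + 0.01 * rng.normal(size=(200, 10))

embedded, s = pca_project(data, d=2)
print(embedded.shape)
print(s[:4])  # the first two singular values dominate the rest
```

The sharp drop after the second singular value indicates that two dimensions are enough for this data set.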
Non-Linear Dimensionality Reduction
- What if the data is not linear?
- No plane will work here
Non-Linear Dimensionality Reduction
MDS – MultiDimensional Scaling
- Use only the distances $d_{ij}$ between elements
- Try to reconstruct element positions $x_i$ from the distances such that $\|x_i - x_j\| \approx d_{ij}$
- Reconstruction can happen in 1D, 2D, 3D, …
- More dimensions = less error
Non-Linear Dimensionality Reduction
MDS – MultiDimensional Scaling
Classical MDS: an algebraic solution
- Construct the matrix of squared proximities and normalize it ("double centering")
- Extract the d largest eigenvalues and their eigenvectors
- Multiply each eigenvector by sqrt(eigenvalue)
- With the scaled eigenvectors as columns, each row holds the coordinates of its corresponding point
Non-Linear Dimensionality Reduction
MDS – MultiDimensional Scaling
Classical MDS: an algebraic solution
[Figure: matrix with one column per eigenvector (e1 … e5) and one row per point (x1 … x5)]
- Each eigenvector adds a dimension to the mapping
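The classical MDS steps can be sketched as follows. This assumes the input is a matrix of Euclidean distances; the function name `classical_mds` and the test points are illustrative.

```python
import numpy as np

def classical_mds(dist, d=2):
    """Embed n points in d dimensions from an n x n distance matrix."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double centering of the squared distances gives a Gram matrix.
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigenvalues, eigenvectors = np.linalg.eigh(gram)  # ascending order
    top = np.argsort(eigenvalues)[::-1][:d]           # d largest
    # Scale each eigenvector by sqrt(eigenvalue); rows are point coordinates.
    scale = np.sqrt(np.maximum(eigenvalues[top], 0.0))
    return eigenvectors[:, top] * scale

rng = np.random.default_rng(0)
original = rng.normal(size=(6, 2))
dist = np.linalg.norm(original[:, None] - original[None, :], axis=-1)

coords = classical_mds(dist, d=2)
recovered = np.linalg.norm(coords[:, None] - coords[None, :], axis=-1)
print(np.abs(recovered - dist).max())  # close to zero for Euclidean input
```

The recovered coordinates match the original points only up to rotation, reflection and translation, which is why the check compares distances rather than positions.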
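Non-Linear Dimensionality Reduction
Iterative MDS: an optimization problem
Example: Sammon's projection
- Start from random positions $x_i$ for each element
- Define the stress of the system, where $d_{ij}$ are the given distances: $E = \frac{1}{\sum_{i<j} d_{ij}} \sum_{i<j} \frac{(d_{ij} - \|x_i - x_j\|)^2}{d_{ij}}$
- In each step, move towards positions that reduce the stress (gradient descent)
- Continue until convergence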
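A rough sketch of stress minimization by gradient descent, assuming the standard Sammon stress. The step-halving rule, the fixed number of steps (instead of a convergence test) and all function names are simplifying assumptions made for this example.

```python
import numpy as np

def pairwise(points):
    """Matrix of Euclidean distances between all rows of points."""
    return np.linalg.norm(points[:, None] - points[None, :], axis=-1)

def sammon_stress(target, positions):
    """Sammon stress between target distances and embedded distances."""
    iu = np.triu_indices_from(target, k=1)
    t, e = target[iu], pairwise(positions)[iu]
    return ((t - e) ** 2 / t).sum() / t.sum()

def sammon_gradient(target, positions, eps=1e-12):
    """Gradient of the Sammon stress with respect to every position."""
    embedded = pairwise(positions)
    np.fill_diagonal(embedded, 1.0)  # avoid division by zero on the diagonal
    safe_target = target.copy()
    np.fill_diagonal(safe_target, 1.0)
    weights = (safe_target - embedded) / (safe_target * np.maximum(embedded, eps))
    np.fill_diagonal(weights, 0.0)
    diff = positions[:, None] - positions[None, :]
    c = target[np.triu_indices_from(target, k=1)].sum()
    return (-2.0 / c) * (weights[:, :, None] * diff).sum(axis=1)

def sammon(target, d=2, steps=200, rate=1.0, seed=0):
    rng = np.random.default_rng(seed)
    positions = rng.normal(size=(target.shape[0], d))  # random start
    history = [sammon_stress(target, positions)]
    for _ in range(steps):
        candidate = positions - rate * sammon_gradient(target, positions)
        stress = sammon_stress(target, candidate)
        if stress < history[-1]:
            positions = candidate       # accept the step
            history.append(stress)
        else:
            rate *= 0.5                 # step too large: shrink it and retry
    return positions, history

rng = np.random.default_rng(1)
data = rng.normal(size=(20, 5))         # 20 points in 5 dimensions
positions, history = sammon(pairwise(data), d=2)
print(history[0], history[-1])          # stress before and after descent
```

Because a step is accepted only when it lowers the stress, the recorded stress values never increase.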
Non-Linear Dimensionality Reduction
Spectral embedding:
- Create a graph of nearest neighbors
- Compute the graph Laplacian (related to the probability of walking along each edge in a random walk)
- Compute its eigenvectors. Why? Computing eigenvectors is like multiplying the matrix by itself many, many times (towards infinity), which is like performing random walks over and over until we reach a stable point
- Again, the eigenvectors give the coordinates
- Does not preserve distances like MDS; instead, it groups together points that are likely neighbors
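A small sketch of spectral embedding, under simplifying assumptions: an unweighted, symmetrized k-nearest-neighbor graph and the unnormalized Laplacian L = D - A (the random-walk normalization mentioned on the slide is not applied here). The function name and the spiral test data are illustrative.

```python
import numpy as np

def spectral_embedding(points, k=5, d=2):
    n = points.shape[0]
    dist = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    # Symmetric k-nearest-neighbor adjacency matrix (unweighted edges).
    adjacency = np.zeros((n, n))
    neighbors = np.argsort(dist, axis=1)[:, 1:k + 1]  # column 0 is the point itself
    rows = np.repeat(np.arange(n), k)
    adjacency[rows, neighbors.ravel()] = 1.0
    adjacency = np.maximum(adjacency, adjacency.T)
    # Unnormalized graph Laplacian L = D - A.
    laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
    eigenvalues, eigenvectors = np.linalg.eigh(laplacian)  # ascending order
    # Skip the first (constant) eigenvector; the next d give the coordinates.
    return eigenvectors[:, 1:d + 1], eigenvalues

# Points along a spiral: a curved, non-linear structure in the plane.
t = np.linspace(0.0, 3.0 * np.pi, 60)
spiral = np.column_stack([t * np.cos(t), t * np.sin(t)])

embedding, eigenvalues = spectral_embedding(spiral, k=5, d=2)
print(embedding.shape)
```

With this Laplacian the coordinates come from the eigenvectors with the smallest non-zero eigenvalues, which vary slowly along the graph and therefore keep neighbors together.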
Non-Linear Dimensionality Reduction
Other non-linear methods
- Locally Linear Embedding (LLE): express each point as a linear combination of its neighbors, then find low dimensional positions that preserve these combinations
- Isomap: takes an adjacency graph as input and computes the MDS of the geodesic distances (shortest-path distances on the graph)
- Self Organizing Maps (SOM): next part…
Self Organizing Maps (SOM)
- Originated from neural networks
- Created by Kohonen, 1982
- Also known as Kohonen maps
- Teuvo Kohonen: a Finnish researcher in learning and neural networks
- Due to SOM, he became the most cited Finnish scientist, with more than 8,000 citations
- So what is it?
What is a SOM?
- A type of neural network
- What is a neuron? A function with several inputs and one output
- In this case, usually a linear combination of the inputs according to weights
What is a SOM?
[Figure: inputs $x_k$ connected to the neurons through weights $m_{ik}$]
- No connections (feedback or feed-forward) between neurons
Training a SOM
- Start from random weights
- For each input X(t) at iteration t:
  - Find the Best Matching Cell (BMC), also called the Best Matching Unit (BMU), for X(t)
  - Update the weights of each neuron close to the BMC
- Weights are updated according to a learning rate and a neighborhood radius that both decay over time
Training a SOM
[Figure: map of neurons $m_i$, showing inputs X(1) and X(2) with their best matching cells BMC(1) and BMC(2)]
Training a SOM – The Math
- Best Matching Cell: the neuron $m_c$ for which $\|x(t) - m_c(t)\|$ is minimal
- Another option for the BMC: maximal dot product $x(t)^T m_c(t)$
- Weight adaptation: $m_i(t+1) = m_i(t) + h_{ci}(t)\,[x(t) - m_i(t)]$
- $h_{ci}(t)$ is a learning rate dependent on both the time and the distance of $m_i$ from the BMC $m_c$
Training a SOM – The Math
Example (motion map):
[Equation: the adaptation term, annotated with the labels below; only the fragment $\ldots = 0.25\,(H+W)\,(1 - t/n_L)$ survives]
- distance between the BMC and $m_i$
- learning rate
- kernel width
- $n_L$: maximum number of iterations
- $H$, $W$: height and width of the neuron map
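The training procedure can be sketched as below. Several pieces are assumptions made for illustration rather than the formulas from the slides: a Gaussian neighborhood kernel, a linearly decaying learning rate starting at `alpha0`, and reading the surviving expression 0.25·(H+W)·(1 − t/n_L) as the kernel width. The function names and the clustered color data are hypothetical.

```python
import numpy as np

def train_som(data, height=6, width=6, n_iter=2000, alpha0=0.5, seed=0):
    rng = np.random.default_rng(seed)
    weights = rng.random((height, width, data.shape[1]))  # random start
    # Grid coordinates of every neuron, used for distances on the map.
    grid = np.stack(np.meshgrid(np.arange(height), np.arange(width),
                                indexing="ij"), axis=-1)
    for t in range(n_iter):
        x = data[rng.integers(len(data))]
        # Best matching cell: the neuron whose weights are closest to x.
        dists = np.linalg.norm(weights - x, axis=-1)
        bmc = np.unravel_index(np.argmin(dists), dists.shape)
        decay = 1.0 - t / n_iter
        alpha = alpha0 * decay                    # assumed linear decay
        sigma = 0.25 * (height + width) * decay   # assumed kernel width
        grid_dist2 = ((grid - np.array(bmc)) ** 2).sum(axis=-1)
        h = alpha * np.exp(-grid_dist2 / (2.0 * sigma ** 2))
        # Move every neuron towards x, weighted by its closeness to the BMC.
        weights += h[..., None] * (x - weights)
    return weights

def quantization_error(data, weights):
    """Mean distance from each sample to its best matching cell."""
    flat = weights.reshape(-1, weights.shape[-1])
    dists = np.linalg.norm(data[:, None, :] - flat[None, :, :], axis=-1)
    return dists.min(axis=1).mean()

rng = np.random.default_rng(1)
centers = np.array([[0.9, 0.1, 0.1], [0.1, 0.9, 0.1], [0.1, 0.1, 0.9]])
colors = np.clip(centers[rng.integers(3, size=300)]
                 + 0.03 * rng.normal(size=(300, 3)), 0.0, 1.0)

som = train_som(colors)
print(som.shape, quantization_error(colors, som))
```

Because each update is a convex combination of the old weights and the input, the weights stay inside the range of the data.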
Presenting a SOM
Option 1: at each node, present the data that corresponds to the vector $m_i$ (3D data, colors, continuous spaces)
- For a color map with 3 inputs, if a neuron's weights are (0.7, 0.2, 0.3), we would show a reddish color with red component 0.7, green component 0.2 and blue component 0.3
- For a map of points on the plane with 2 inputs, we would draw a point for each neuron at position $(W_x, W_y)$
Presenting a SOM
Option 2: give each neuron the representative from the training set X that is closest to its vector $m_i$
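A minimal sketch of Option 2, assuming Euclidean distance between neuron weights and training samples. The function name `closest_training_samples` and the tiny hand-written map and training set are illustrative.

```python
import numpy as np

def closest_training_samples(weights, data):
    """For each neuron, index of the training sample nearest to its weights."""
    flat = weights.reshape(-1, weights.shape[-1])
    dists = np.linalg.norm(flat[:, None, :] - data[None, :, :], axis=-1)
    return dists.argmin(axis=1).reshape(weights.shape[:-1])

# A 2x2 map with 2D weights and three training samples (assumed values).
weights = np.array([[[0.0, 0.0], [1.0, 0.0]],
                    [[0.0, 1.0], [1.0, 1.0]]])
data = np.array([[0.1, 0.1], [0.9, 0.2], [0.5, 0.95]])

representatives = closest_training_samples(weights, data)
print(representatives)  # [[0 1]
                        #  [2 2]]
```

Each entry of the result selects the training item (for example an image or a posture) to display at that node of the map.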
Motion Map
- "Motion Map: Image-based Retrieval and Segmentation of Motion Data", Sakamoto, Kuriyama, Kaneko, SCA (Symposium on Computer Animation) 2004
- Goal: present the user with a grid of postures in order to select a clip of motion data from a large database
- Perform clustering on the SOM instead of on the abstract data
Motion Map Example results: 436 posture samples from 55K frames of 51 motion files
Motion Map Example results: Clustering based on SOM
Motion Map - Details
- A map of posture samples is created from all motion files together
- To reduce computation time, samples are selected using a similarity threshold between each sample and its closest sample
- A standard SOM is calculated
- Each posture is then connected to a hash table of the motion files that contain similar postures
- Clustering the SOM enables display of a simplified map to the user (next page)
Motion Map - Details Simplified map after SOM clustering: 17 dance styles
Procedural Texture Preview
- Eurographics 2012
- Goal: present the user with a single image that shows all the possibilities of a procedural texture
- Method overview:
  - Select candidate parameter vectors that maximize completeness, variety and smoothness
  - Organize the candidates in a SOM
  - Synthesize a continuous map
Procedural Texture Preview - Results
[Figure: texture parameters; thumbnails of random parameters; texture preview in a single image]
Procedural Texture Preview - Details
Selecting candidates for the parameter map using the following optimizations:
- C = a set of dense samples
- X = the candidates in the parameter map
- Completeness: to be minimized
- Variety: to be maximized
- Smoothness: to be minimized
Procedural Texture Preview - Details
- A standard SOM jointly optimizes completeness and smoothness
- To optimize variety as well, the SOM implementation alternates between optimizing the variety term (Ev) and the completeness term (Ec)
- Instead of applying a regular learning rate, at each step the candidates (weight vectors) are replaced by new candidates chosen according to the above optimizations
Procedural Texture Preview - Details
- After the candidate selection, an image is synthesized that smoothly combines all selected candidates
- Stitching is done using standard patch-based texture synthesis methods (Graphcut Textures, Kwatra et al., TOG 2003)