René Vidal Center for Imaging Science

Clustering Linear and Nonlinear Manifolds using Generalized Principal Components Analysis
René Vidal Center for Imaging Science Institute for Computational Medicine Johns Hopkins University Title: Segmentation of Dynamic Scenes and Textures Abstract: Dynamic scenes are video sequences containing multiple objects moving in front of dynamic backgrounds, e.g. a bird floating on water. One can model such scenes as the output of a collection of dynamical models exhibiting discontinuous behavior both in space, due to the presence of multiple moving objects, and in time, due to the appearance and disappearance of objects. Segmentation of dynamic scenes is then equivalent to the identification of this mixture of dynamical models from the image data. Unfortunately, although the identification of a single dynamical model is a well understood problem, the identification of multiple hybrid dynamical models is not. Even in the case of static data, e.g. a set of points living in multiple subspaces, data segmentation is usually thought of as a "chicken-and-egg" problem. This is because in order to estimate a mixture of models one needs to first segment the data and in order to segment the data one needs to know the model parameters. Therefore, static data segmentation is usually solved by alternating data clustering and model fitting using, e.g., the Expectation Maximization (EM) algorithm. Our recent work on Generalized Principal Component Analysis (GPCA) has shown that the "chicken-and-egg" dilemma can be tackled using algebraic geometric techniques. In the case of data living in a collection of (static) subspaces, one can segment the data by fitting a set of polynomials to all data points (without first clustering the data) and then differentiating these polynomials to obtain the model parameters for each group. In this talk, we will present ongoing work addressing the extension of GPCA to time-series data living in a collection of multiple moving subspaces. The approach combines classical GPCA with newly developed recursive hybrid system identification algorithms. We will also present applications of DGPCA in image/video segmentation, 3-D motion segmentation, dynamic texture segmentation, and heart motion analysis.

Data segmentation and clustering
Given a set of points, separate them into multiple groups Discriminative methods: learn boundary Generative methods: learn mixture model, using, e.g. Expectation Maximization

Dimensionality reduction and clustering
In many problems data is high-dimensional: can reduce dimensionality using, e.g. Principal Component Analysis Image compression Recognition Faces (Eigenfaces) Image segmentation Intensity (black-white) Texture

Clustering data on non Euclidean spaces
Mixtures of linear spaces Mixtures of algebraic varieties Mixtures of Lie groups “Chicken-and-egg” problems Given segmentation, estimate models Given models, segment the data Initialization? Need to combine Algebra/geometry, dynamics and statistics

Part I Linear Subspaces
Title: Segmentation of Dynamic Scenes and Textures Abstract: Dynamic scenes are video sequences containing multiple objects moving in front of dynamic backgrounds, e.g. a bird floating on water. One can model such scenes as the output of a collection of dynamical models exhibiting discontinuous behavior both in space, due to the presence of multiple moving objects, and in time, due to the appearance and disappearance of objects. Segmentation of dynamic scenes is then equivalent to the identification of this mixture of dynamical models from the image data. Unfortunately, although the identification of a single dynamical model is a well understood problem, the identification of multiple hybrid dynamical models is not. Even in the case of static data, e.g. a set of points living in multiple subspaces, data segmentation is usually thought of as a "chicken-and-egg" problem. This is because in order to estimate a mixture of models one needs to first segment the data and in order to segment the data one needs to know the model parameters. Therefore, static data segmentation is usually solved by alternating data clustering and model fitting using, e.g., the Expectation Maximization (EM) algorithm. Our recent work on Generalized Principal Component Analysis (GPCA) has shown that the "chicken-and-egg" dilemma can be tackled using algebraic geometric techniques. In the case of data living in a collection of (static) subspaces, one can segment the data by fitting a set of polynomials to all data points (without first clustering the data) and then differentiating these polynomials to obtain the model parameters for each group. In this talk, we will present ongoing work addressing the extension of GPCA to time-series data living in a collection of multiple moving subspaces. The approach combines classical GPCA with newly developed recursive hybrid system identification algorithms. We will also present applications of DGPCA in image/video segmentation, 3-D motion segmentation, dynamic texture segmentation, and heart motion analysis.

Principal Component Analysis (PCA)
Given a set of points x1, x2, …, xN Geometric PCA: find a subspace S passing through them Statistical PCA: find projection directions that maximize the variance Solution (Beltrami’1873, Jordan’1874, Hotelling’33, Eckart-Householder-Young’36) Applications: data compression, regression, computer vision (eigenfaces), pattern recognition, genomics As we all know, Principal Component Analysis problem refers to the problem of estimating a SINGLE subspace from sample data points. Although there are various ways of solving PCA, a simple solution consist of building a matrix with all the data points, computing its SVD, and then extracting a basis for the subspace from the columns of the U matrix, and the dimension of the subspace from the rank of the U matrix. There is no question that PCA is one of the most popular techniques for dimensionality reduction in various engineering disciplines. In computer vision, in particular, a successful application has been found in face recognition under the name of eigenfaces. Basis for S

Generalized Principal Component Analysis
Given a set of points lying in multiple subspaces, identify The number of subspaces and their dimensions A basis for each subspace The segmentation of the data points “Chicken-and-egg” problem Given segmentation, estimate subspaces Given subspaces, segment the data Given a set of data points lying on a collection of linear subspaces, without knowing which point corresponds to which subspace and without even knowing the number of subspaces, estimate the number of subspaces, a basis for each subspace and the segmentation of the data. As stated, the GPCA problem is one of those challenging CHICKEN-AND-EGG problems in the following sense. If we knew the segmentation of the data points, and hence the number of subspaces, then we could solve the problem by applying standard PCA to each group. On the other hand, if we had a basis for each subspace, we could easily assign each point to each cluster. The main challenge is that we neither know the segmentation of the data points nor a basis for each subspace, and the questions is how to estimate everything from the data only. Previous geometric approaches to GPCA have been proposed in the context of motion segmentation, and under the assumption that the subspaces do NOT intersect. Such approaches are based on first clustering the data, using for example K-means or spectral clustering techniques, and then applying standard PCA to each group. Statistical methods, on the other hand, model the data as a mixture model whose parameters are the mixing proportions and the basis for each subspace. The estimation of the mixture is a (usually non-convex) optimization problem that can be solved with various techniques. One of the most popular ones is the expectation maximization algorithm (EM) which alternates between the clustering of the data and the estimation of the subspaces. Unfortunately, the convergence of such iterative algorithms depends on good initialization. At present, initialization is mostly done either at random, using another iterative algorithm such as K-means, or else in an ad-hoc fashion depending on the particular application. It is here where our main motivation comes into the picture. We would like to know if we can find a way of initializing iterative algorithms in an algebraic fashion. For example, if you look at the two lines in this picture, there has to be a way of estimating those two lines directly from data in a one shot solution, at least in the ABSENCE of noise.

Basic ideas behind GPCA
Towards an analytic solution to subspace clustering Can we estimate ALL models simultaneously using ALL data? When can we do so analytically? In closed form? Is there a formula for the number of models? Will consider the most general case Subspaces of unknown and possibly different dimensions Subspaces may intersect arbitrarily (not only at the origin) GPCA is an algebraic geometric approach to data segmentation Number of subspaces = degree of a polynomial Subspace basis = derivatives of a polynomial Subspace clustering is algebraically equivalent to Polynomial fitting Polynomial differentiation We are therefore interested in finding an analytic solution to the above segmentation problem. In particular we would like to know if we can estimate all the subspaces simultaneously from all the data without having to iterate between data clustering and subspace estimation. Furthermore we would like to know if we can do so in an ANALYTIC fashion, and even more, we would like to understand under what conditions we can do it in closed form and whether there is a formula for the number of subspaces. It turns out that all these questions can be answered using tools from algebraic geometry. Since at first data segmentation and algebraic geometry may seem to be totally disconnected, let me first tell you what the connection is. We will model the number of subspaces as the degree of a polynomial and the subspace parameters (basis) as the roots of a polynomial, or more precisely as the factors of the polynomial. With this algebraic geometric algebraic geometric interpretation in mind, we will be able to prove the following the GPCA problem. First, there are some geometric constraints that do not depend on the segmentation of the data. Therefore, the chicken-and-egg dilemma can be resolved since the segmentation of the dada can be algebraically eliminated. Second, one can prove that the problem has a unique solution, which can be obtained in closed form if and only if the number of subspaces is less than or equal to four And third, the solution can be computed using linear algebraic techniques. In the presence of noise, one can use the algebraic solution to GPCA to initialize any iterative algorithm, for example EM. In the case of zero-mean Gaussian noise, we will show that the E-step can be algebraically eliminated, since we will derive an objective function that depends on the motion parameters only.

Introductory example: algebraic clustering in 1D
A -> c Number of groups?

How to compute n, c, b’s? Number of clusters Cluster centers Solution is unique if Solution is closed form if A -> c

What about dimension 2? What about higher dimensions? Complex numbers in higher dimensions? How to find roots of a polynomial of quaternions? Instead Project data onto one or two dimensional space Apply same algorithm to projected data

Representing one subspace
One plane One line One subspace can be represented with Set of linear equations Set of polynomials of degree 1

Representing n subspaces
Two planes One plane and one line Plane: Line: A union of n subspaces can be represented with a set of homogeneous polynomials of degree n De Morgan’s rule

Fitting polynomials to data points
Polynomials can be written linearly in terms of the vector of coefficients by using polynomial embedding Coefficients of the polynomials can be computed from nullspace of embedded data Solve using least squares N = #data points Veronese map

Finding a basis for each subspace
To learn a mixture of subspaces we just need one positive example per class Polynomial Differentiation (GPCA-PDA) [CVPR’04]

Choosing one point per subspace
With noise and outliers Polynomials may not be a perfect union of subspaces Normals can estimated correctly by choosing points optimally Distance to closest subspace without knowing segmentation?

GPCA for hyperplane segmentation
Coefficients of the polynomial can be computed from null space of embedded data matrix Solve using least squares N = #data points Number of subspaces can be computed from the rank of embedded data matrix Normal to the subspaces can be computed from the derivatives of the polynomial

Motion segmentation using GPCA
Apply polynomial embedding to 5-D points Veronese map

Experimental results: Kanatani sequences
Sequence A Sequence B Sequence C Percentage of correct classification

Segmenting non-moving dynamic textures
One dynamic texture lives in the observability subspace Multiple textures live in multiple subspaces Cluster the data using GPCA z t + 1 = A v y C w water steam

Segmenting moving dynamic textures

Segmenting moving dynamic textures
Ocean-bird

Temporal video segmentation
Segmenting N=30 frames of a sequence containing n=3 scenes Host Guest Both Image intensities are output of linear system Apply GPCA to fit n=3 observability subspaces dynamics appearance images x t + 1 = A v y C w

Temporal video segmentation
Segmenting N=60 frames of a sequence containing n=3 scenes Burning wheel Burnt car with people Burning car Image intensities are output of linear system Apply GPCA to fit n=3 observability subspaces dynamics appearance images x t + 1 = A v y C w

Part 2 Submanifolds in Riemannian Spaces
Title: Segmentation of Dynamic Scenes and Textures Abstract: Dynamic scenes are video sequences containing multiple objects moving in front of dynamic backgrounds, e.g. a bird floating on water. One can model such scenes as the output of a collection of dynamical models exhibiting discontinuous behavior both in space, due to the presence of multiple moving objects, and in time, due to the appearance and disappearance of objects. Segmentation of dynamic scenes is then equivalent to the identification of this mixture of dynamical models from the image data. Unfortunately, although the identification of a single dynamical model is a well understood problem, the identification of multiple hybrid dynamical models is not. Even in the case of static data, e.g. a set of points living in multiple subspaces, data segmentation is usually thought of as a "chicken-and-egg" problem. This is because in order to estimate a mixture of models one needs to first segment the data and in order to segment the data one needs to know the model parameters. Therefore, static data segmentation is usually solved by alternating data clustering and model fitting using, e.g., the Expectation Maximization (EM) algorithm. Our recent work on Generalized Principal Component Analysis (GPCA) has shown that the "chicken-and-egg" dilemma can be tackled using algebraic geometric techniques. In the case of data living in a collection of (static) subspaces, one can segment the data by fitting a set of polynomials to all data points (without first clustering the data) and then differentiating these polynomials to obtain the model parameters for each group. In this talk, we will present ongoing work addressing the extension of GPCA to time-series data living in a collection of multiple moving subspaces. The approach combines classical GPCA with newly developed recursive hybrid system identification algorithms. We will also present applications of DGPCA in image/video segmentation, 3-D motion segmentation, dynamic texture segmentation, and heart motion analysis.

Overview of our approach
Propose a novel framework for simultaneous dimensionality reduction and clustering of data lying in different submanifolds of a Riemannian space. Extend nonlinear dimensionality reduction techniques from one submanifold of to m submanifolds of a Riemannian space. Show that when the different submanifolds are separated, all the points from one connected submanifold can be mapped to a single point in a low-dimensional space.

Nonlinear Dimensionality Reduction
Two kinds of dimensionality reduction: global and local techniques. Global techniques preserve global properties of the data lying on a submanifold similar to PCA for a linear subspace Eg: ISOMAP and Kernel PCA Local techniques preserve local properties, obtained from the small neighborhoods around the datapoints. also retain the global properties of the data Eg: Locally Linear Embedding (LLE), Laplacian Eigenmaps, Hessian LLE

Local Nonlinear Dimensionality Reduction
We show that segmentation of the data can be obtained from the null space of a matrix built from the local representation, using local NDR. When the different submanifolds are separated, there is a mapping from all the points in one connected submanifold to a single point in the low-dimensional space. effectively reduces to a standard central clustering problem. We will illustrate using Locally Linear Embedding (LLE).

Local Nonlinear Dimensionality Reduction
Focus on local NDR methods. Each method operates in a similar manner as follows: Step 1: Find the k-nearest neighbors of each data point according to the Euclidean distance Step 2: Compute a matrix , that depends on the weights represents the local geometry of the data points and incorporates reconstruction cost function in low dimension. Step 3: Solve a sparse eigenvalue problem on matrix The first eigenvector is the constant vector corresponding to eigenvalue 0.

Extending LLE to Riemannian Manifolds
Information about the local geometry of the manifold is essential only in the first two steps of each algorithm, modifications are made only to these two stages. Key issues: how to select the kNN by incorporating the Riemannian distance how to compute the matrix representing the local geometry using the new metric.

Extending LLE to Riemannian Manifolds
LLE involves writing each data point as a linear combination of its neighbors. Euclidean case, simply a least-squares problem Riemannian case: one needs to solve an interpolation problem on the manifold. How should the data points be interpolated? What cost function should be minimized?

Clustering on Riemannian Manifolds
If the data lie in a disconnected union of k-connected submanifolds, the matrix is block-diagonal with blocks. When the assumption of separated submanifolds is violated, we have and , respectively. Hence, we can cluster the data into groups by applying k-means to the columns of a matrix whose rows are the eigenvectors in the null space of .

Motion Segmentation (SPSD(3) space)
For Lambertian surfaces, spatial-temporal structure tensor

Segmenting fiber bundles in DTI (SPSD(3) space)
In DTI, the diffusion at each voxel is represented by a covariance matrix Cingulum Corpus Callosum Segmentation gives the separation between the two bundles. Corpus callosum: the red tensors pointing out of the plane and resembles the letter ‘C’. Cingulum: left to the corpus callosum with the green tensors oriented vertically.

Clustering of Probability Density Functions (Hilbert Unit Sphere)
Under square-root representation, the space of functions Unit sphere, geometry is well-known

Summary and Future Work
In this paper, we have presented a new way of looking at segmentation problems that is based on algebraic geometric techniques. In essence we showed that segmentation is equivalent to polynomial factorization and presented an algebraic solution for the case of mixtures of subspaces, which is a natural generalization of the well known Principal Component Analysis (PCA). Of course, as this theory is fairly new, there are still a lot of open problems. For example, how to do polynomial factorization in the presence of noise, or how to deal with outliers. We are currently working on such problems and expect to report our results in the near future. Also, we are currently working on generalizing the ideas in this paper to the cases of subspaces of different dimensions and to other types of data. In fact, I cannot resist doing some advertising for my talk tomorrow, where I will talk about the segmentation of multiple rigid motions, which corresponds to the case of bilinear data. Finally, before taking your questions, I would like to say that I think it is time that we start thinking about putting segmentation problems in a theoretical footing. I think this can be done by building a mathematical theory of segmentation that not only incorporates well known statistical modeling techniques, but also the geometric constraints inherent in various applications in computer vision. We hope that by combining the algebraic geometric approach to segmentation presented in this paper with the more standard statistical approaches, we should be able to build a new mathematical theory of segmentation in the followi should be able to build such a theory, which a nice mathematical theory of segmentation. Such a theory should find applications in various disciplines, including of course lots of segmentation and recognition problems in computer vision.

Thank You! For more information, Vision, Dynamics and Learning Lab @
Johns Hopkins University Thank You!

René Vidal Center for Imaging Science

Similar presentations

Presentation on theme: "René Vidal Center for Imaging Science"— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

René Vidal Center for Imaging Science

Similar presentations

Presentation on theme: "René Vidal Center for Imaging Science"— Presentation transcript:

Similar presentations

About project

Feedback