Segmentation by Clustering


1 Segmentation by Clustering
EECS 274 Computer Vision

2 Segmentation and grouping
Motivation: obtain a compact representation from an image, motion sequence, or set of tokens
should support the application
a broad theory is absent at present
Reading: FP Chapters 9-10, S Chapter 5
Grouping (or clustering): collect together tokens that "belong together"
Fitting: associate a model with tokens
Issues: which model? which token goes to which element? how many elements in the model?

3 How do they assemble?
Why do these tokens belong together (to human vision)? Because they form surfaces? Because they form the same texture? A disturbing possibility is that they all lie on a sphere; but then, if we didn't know that the tokens belonged together, where did the sphere come from?

4 Segmentation: examples
Summarizing videos: shots
Finding matching parts
Finding people
Finding buildings in satellite images
Searching a collection of images
Object detection
Object recognition

5 Model problems
Forming image segments: superpixels, image regions
Fitting lines to edge points
Fitting a fundamental matrix to a set of feature points: if the correspondence is right, there is a fundamental matrix connecting the points

6 Segmentation as clustering
Partitioning:
decompose an image into regions that have coherent color and texture
decompose an image into extended blobs, consisting of regions that have coherent color, texture, and motion
take a video sequence and decompose it into shots
Grouping:
collect together tokens that, taken together, form a line
collect together tokens that seem to share a fundamental matrix
What is the appropriate representation?

7 General ideas
Tokens: whatever we need to group (pixels, points, surface elements, etc.)
Top-down segmentation: tokens belong together because they lie on the same object
Bottom-up segmentation: tokens belong together because they are locally coherent
These two approaches are not mutually exclusive

8 Grouping and Gestalt
Context affects how things are perceived
Gestalt school of psychology: studies shape as a whole instead of as a collection of simple lines and curves
Grouping refers to the tendency of the visual system to assemble components together and perceive them together
The famous Müller-Lyer illusion: the horizontal bar has properties that come only from its membership in a group (it looks shorter in the lower picture, but is actually the same length), and these properties cannot be discounted: you can't look at this figure, ignore the arrowheads, and thereby make the two bars seem to be the same size

9 Some Gestalt properties
emergence, reification (illustrated in the figures)

10 Figure and ground
A driving force behind the Gestalt movement is the observation that it isn't enough to think about pictures in terms of separating figure and ground (e.g., foreground and background), partly because there are too many different possibilities in pictures like this one: is the figure a square with a hole in it, or a white circle, or something else?
(Left) White circle on a black rectangle: which is figure (foreground) and which is ground (background)? (Right) What object is in this picture?

11 Basic ideas of grouping in humans
Figure-ground discrimination: grouping can be seen in terms of allocating some elements to a figure and some to ground; on its own this is an impoverished theory
Gestalt properties: elements in a collection can have properties that result from their relationships (the Müller-Lyer effect); Gestaltqualität (form quality)
A series of factors affects whether elements should be grouped together: the Gestalt factors

12 Gestalt movement Some criteria that tend to cause tokens to be grouped.

13 Examples of Gestalt factors
More such factors are illustrated in the figure.

14 Example
Occlusion cues seem to be very important in grouping. Most people find it hard to read the five numerals in this picture...

15 Occlusion as a cue
...but find it easy in this one.

16 Grouping phenomena The story is in the book (figure 14.7)

17 Illusory contours
A curious phenomenon in which you see an object that appears to be occluding: tokens are grouped together because they provide a cue to the presence of an occluding object whose contour has no contrast.

18 Background subtraction
If we know what the background looks like, it is easy to identify the "interesting bits"
Applications: a person in an office, tracking cars on a road, surveillance
Approach:
use a moving average to estimate the background image
subtract it from the current frame; pixels with large absolute differences are interesting
trick: use morphological operations to clean up the resulting pixel mask (see the sketch below)
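A minimal sketch of the moving-average approach just described, using NumPy and SciPy; the update rate alpha, the threshold value, and the 3 x 3 structuring element are illustrative assumptions, not values from the slides.

```python
import numpy as np
from scipy import ndimage

def update_background(background, frame, alpha=0.05):
    """Moving-average estimate of the background image."""
    return (1.0 - alpha) * background + alpha * frame.astype(np.float64)

def foreground_mask(background, frame, threshold=30.0):
    """Subtract the background from the current frame; large absolute
    differences are the 'interesting' pixels."""
    diff = np.abs(frame.astype(np.float64) - background)
    mask = diff > threshold
    # The morphological trick: opening removes isolated noise pixels.
    return ndimage.binary_opening(mask, structure=np.ones((3, 3), dtype=bool))

# Usage with a sequence of grayscale frames of shape (H, W):
# background = frames[0].astype(np.float64)
# for frame in frames[1:]:
#     mask = foreground_mask(background, frame)
#     background = update_background(background, frame)
```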

19 (image-only slide)

20 Background subtraction results (see caption, figure 14.10)
Results using 80 x 60 frames: (a) average of all 120 frames; (b) results of thresholding; (c) results with a low threshold (with missing pixels); (d) background computed using the EM-based algorithm; (e) background subtraction results (with missing pixels).

21 Background subtraction results (see caption, figure 14.11)
An important point here is that background subtraction works quite badly in the presence of high spatial frequencies, because when we estimate the background, we almost always smooth it.
Results using 160 x 120 frames: (a) average of all 120 frames; (b) results of thresholding; (c) results with a low threshold (with missing pixels); (d) background computed using the EM-based algorithm; (e) background subtraction results (with missing pixels).

22 Shot boundary detection
Find the shots in a sequence of video
shot boundaries usually result in big differences between succeeding frames, while objects can still move within a given shot
Strategy: compute interframe distances and declare a boundary where they are big (a histogram-based sketch follows this list)
Possible distances: frame differences, histogram differences, block comparisons, edge differences
Applications: representations for movies or video sequences; find shot boundaries, obtain the "most representative" frame, support search
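As one concrete instance of the strategy above, here is a sketch using histogram differences (one of the distances listed); the bin count, the L1 distance, and the threshold are assumptions.

```python
import numpy as np

def frame_histogram(frame, bins=64):
    """Normalized intensity histogram of one grayscale frame."""
    hist, _ = np.histogram(frame, bins=bins, range=(0, 256))
    return hist / max(hist.sum(), 1)

def shot_boundaries(frames, threshold=0.3):
    """Compute interframe histogram distances and declare a boundary
    where the distance is big."""
    boundaries = []
    prev = frame_histogram(frames[0])
    for i in range(1, len(frames)):
        cur = frame_histogram(frames[i])
        if np.abs(cur - prev).sum() > threshold:  # L1 histogram distance
            boundaries.append(i)
        prev = cur
    return boundaries
```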

23 Interactive segmentation

24 GrabCut

25 Image matting

26 Superpixels

27 Segmentation as clustering
Cluster together tokens (pixels, points, etc.) that belong together
Agglomerative clustering: attach each token to the cluster it is closest to, and repeat
Divisive clustering: split a cluster along the best boundary, and repeat
Point-cluster distance:
single-link clustering (distance between the two closest elements)
complete-link clustering (maximum distance between two elements)
group-average clustering
Dendrograms (tree diagrams) yield a picture of the output as the clustering process continues (see the sketch below)
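The point-cluster distances above map directly onto SciPy's hierarchical clustering; a small sketch follows, where the toy data and the cluster count are assumptions for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster, dendrogram

rng = np.random.default_rng(0)
points = rng.random((20, 2))          # toy 2-D tokens

# method='single'   -> single-link   (distance between two closest elements)
# method='complete' -> complete-link (maximum distance between two elements)
# method='average'  -> group-average
Z = linkage(points, method='single')

labels = fcluster(Z, t=3, criterion='maxclust')  # cut the tree into 3 clusters
# dendrogram(Z) draws the tree diagram of the merge sequence
```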

28 (figure: a data set and its dendrogram)

29 K-means
Choose a fixed number of clusters k
Choose cluster centers and point-cluster allocations to minimize the total error
This can't be done by exhaustive search, because there are too many possible allocations
Algorithm: alternate between
fixing the cluster centers and allocating each point to the closest cluster
fixing the allocations and computing the best cluster centers
x could be any set of features for which we can compute a distance (be careful about scaling); a sketch follows

30 K-means clustering using intensity alone and color alone
Image; clusters on intensity; clusters on color (using 5 clusters)
I gave each pixel the mean intensity or mean color of its cluster; this is basically just vector quantizing the image intensities/colors. Notice that there is no requirement that clusters be spatially localized, and they're not.
How many clusters do we need in an image?

31 K-means using color alone, 11 segments (6 kinds of vegetables)
Image; clusters on color

32 K-means using color alone, 11 segments (not using texture or position); segments may be disconnected and scattered

33 K-means using color and position, 20 segments
The peppers are now separated, but the cabbage is still broken up.
Here I've represented each pixel as (r, g, b, x, y), which means that segments prefer to be spatially coherent. These are just some of the 20 segments. (A sketch of this feature construction follows.)

34 Graph-theoretic clustering
Represent tokens using a weighted graph
affinity matrix: similar tokens have large edge weights
Cut up this graph to get connected components with strong interior links, by cutting edges with low weights
A graph is a set of vertices V and edges E, G = {V, E}
A weighted graph is one in which each edge is associated with a weight
Every graph consists of a disjoint set of connected components

35 Flow networks
A flow network G = (V, E) is a directed graph in which each edge (u, v) ∈ E has a nonnegative capacity c(u, v) ≥ 0. If (u, v) ∉ E, we assume c(u, v) = 0. There are two distinguished vertices: a source s and a sink t.
(figure: example flow network with source s, sink t, and edge capacities 16, 12, 20, 10, 4, 9, 7, 13, 14)

36 Maximum flow, minimum cut
Also known as the s-t cut problem
Several deterministic and randomized algorithms exist
For the network above, the resulting maximum flow is 23 (a sketch follows)

37 Weighted graph
(figure: 9 points/pixels; the corresponding 9 x 9 weight matrix, which looks roughly block-diagonal; a cut producing 2 connected components)

38 (Left) A set of points. (Right) The affinity matrix computed with a decaying exponential of distance (large values map to bright pixels, small values to dark pixels).

39 Graph cut and normalized cut
Cut a graph V into 2 connected components with minimal cost
cut(A,B): sum of the weights of all edges between A and B
assoc(A,V): sum of all the weights associated with nodes in A

40 Affinity
Affinity can be measured by intensity, distance, and texture (standard exponential forms are sketched below).
For texture, c(x) is a vector of filter outputs. A natural thing to do is to square the outputs of a range of different filters at different scales and orientations, smooth the result, and stack these into a vector.

41 Scale affects affinity
(figure 14.18: affinities computed with σ = 0.1, σ = 0.2, and σ = 1)

42 Extracting a single good cluster
Simplest idea: we want a vector a giving the association between each element and a cluster
We want elements within this cluster to, on the whole, have strong affinity with one another
We could maximize a^T A a, but we need the constraint a^T a = 1
Solving this constrained optimization with a Lagrange multiplier yields the eigenvalue problem A a = λ a; choose the eigenvector of A with the largest eigenvalue
Cluster weights are the elements of that eigenvector
Note that a^T A a sums, over all pairs (i, j), (association of element i with cluster n) × (affinity between i and j) × (association of element j with cluster n); a sketch follows

43 Example
(figure: a set of points, the corresponding affinity matrix, and its leading eigenvector)

44 Clustering with eigenvectors
If we reorder the nodes, the rows and columns of the affinity matrix M are shuffled
We expect to deal with a few clusters with large association
We could shuffle the rows and columns of M to form a matrix that is roughly block-diagonal
Shuffling M simply shuffles the elements of its eigenvectors, so we can reason about the eigenvectors by working with a shuffled version of M (see the identity below)

45 Example
data = [1 8; 7 2; 9 3; 2 7] (four 2-D points)
First row of the affinity matrix: 1.0000 0.0144 0.0089 0.4931
(figure: the full affinity matrix and its leading eigenvector nEigVec)

46 Example (same points with the second and fourth rows swapped)
data = [1 8; 2 7; 9 3; 7 2]
First row of the affinity matrix: 1.0000 0.4931 0.0089 0.0144
(figure: the correspondingly shuffled affinity matrix and its leading eigenvector nEigVec)

47 Eigenvectors and clusters
Eigenvectors of block-diagonal matrices consist of the eigenvectors of the blocks, padded with zeros
Each block has an eigenvector corresponding to a rather large eigenvalue
So if there are c significant clusters, expect the eigenvectors corresponding to the c largest eigenvalues to represent them (a small demo follows)

48 Large eigenvectors and clusters

49 Eigengap and clusters
(figure: eigenvalues of the affinity matrix for σd = 0.1, σd = 0.2, and σd = 1)

50 More than two segments
Two options:
recursively split each side to get a tree, continuing until the eigenvalues are too small
use the other eigenvectors

51 Normalized cuts
The current criterion evaluates within-cluster similarity, but not across-cluster difference (one could also use the Cheeger constant)
Instead, we would like to maximize the within-cluster similarity relative to the across-cluster difference
Write the graph as V, one cluster as A and the other as B
Maximize assoc(A,A)/assoc(A,V) + assoc(B,B)/assoc(B,V), i.e. construct A, B such that their within-cluster similarity is high compared to their association with the rest of the graph
cut(A,B): sum of the weights of all edges between A and B
assoc(A,V): sum of all the weights associated with nodes in A

52 Normalized cuts
Write a vector y whose elements are 1 if the item is in A and -b if it is in B
Write the weight matrix of the graph as W, and let D be the matrix with the row sums of W on its diagonal; 1 is the vector of all ones
The criterion becomes (y^T (D - W) y) / (y^T D y), to be minimized subject to the constraint y^T D 1 = 0
This is hard to do exactly, because y's values are quantized

53 Normalized cuts and spectral graph theory
Instead, relax the problem and solve the generalized eigenvalue problem (D - W) y = λ D y, taking the eigenvector y with the second-smallest eigenvalue
Now look for a quantization threshold that maximizes the criterion, i.e. all components of y above that threshold go to 1, all below go to -b (a sketch follows)

54 Using affinity based on texture and intensity
Figure from "Image and video segmentation: the normalized cut framework", Shi and Malik, copyright IEEE, 1998

55 Using a spatio-temporal affinity metric
Figure from "Normalized cuts and image segmentation", Shi and Malik, copyright IEEE, 2000

56 Spectral clustering
Ng, Jordan, and Weiss, NIPS 2001 (a sketch of the algorithm follows)
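A sketch of the algorithm from that paper, assuming a precomputed symmetric affinity matrix A with positive row sums:

```python
import numpy as np
from scipy.linalg import eigh
from scipy.cluster.vq import kmeans2

def njw_spectral_clustering(A, k):
    """Ng-Jordan-Weiss: normalize the affinity matrix, take the k leading
    eigenvectors, normalize their rows, and run k-means on the rows."""
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    L = (A * d_inv_sqrt[:, None]) * d_inv_sqrt[None, :]   # D^{-1/2} A D^{-1/2}
    eigvals, eigvecs = eigh(L)
    X = eigvecs[:, -k:]                                   # k largest eigenvalues
    Y = X / np.linalg.norm(X, axis=1, keepdims=True)      # unit-length rows
    _, labels = kmeans2(Y, k, minit='++')
    return labels
```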

57 Comparisons
Ng, Jordan, and Weiss, NIPS 2001
(figure: comparison of clustering results, each data set split into 2 clusters)

58 Spectral graph theory
The generalized eigenvalue problem above is the subject of spectral graph theory; see EECS 275 Lecture 23 for more detail
See also: Laplacian eigenmap, Laplace-Beltrami operator, spectral clustering, graph cut, minimum-cut/maximum-flow, normalized cut, random walk, heat kernel, diffusion map

59 More information
G. Scott and H. Longuet-Higgins, "Feature grouping by relocalisation of eigenvectors of the proximity matrix", BMVC, 1990
F. Chung, Spectral Graph Theory, AMS, 1997
Y. Weiss, "Segmentation using eigenvectors: a unifying view", ICCV, 1999
L. Saul et al., "Spectral methods for dimensionality reduction", in Semi-Supervised Learning, 2006
U. von Luxburg, "A tutorial on spectral clustering", Statistics and Computing, 2007

60 Related topics
Watershed
Multiscale segmentation
Mean-shift segmentation
Parsing tree / stochastic grammar
Graph cut
ObjCut
GrabCut
Random walk
Energy-based methods

