
1 Motion Segmentation from Clustering of Sparse Point Features Using Spatially Constrained Mixture Models
Shrinivas Pundlik
Committee members: Dr. Stan Birchfield (chair), Dr. Adam Hoover, Dr. Ian Walker, Dr. Damon Woodard

2 Motion Segmentation
Gestalt insight: grouping forms the basis of human perception.
Gestalt laws: factors (cues) that affect the grouping process: similarity, proximity, common motion (common fate), continuity.
Motion segmentation: segmenting images based on common motion, so that points moving together are grouped together.
Typically, motion segmentation uses common motion + proximity.

3 Applications of Motion Segmentation
- object detection: pedestrian detection (Viola et al., 2003)
- tracking: vehicle tracking (Kanhere et al., 2005)
- robotics
- surveillance
- image and video compression
- scene reconstruction
- video manipulation/editing: video matting, video annotation, motion magnification (Criminisi et al., 2006)

4 Previous Work
By approach:
- Motion layer estimation: Wang and Adelson 1994; Ayer and Sawhney 1995; Willis et al. 2003; Xiao and Shah 2005
- Multi-body factorization: Costeira and Kanade 1995; Ke and Kanade 2002; Vidal and Sastry 2003; Yan and Pollefeys 2006; Gruber and Weiss 2006
- Object-level grouping: Sivic et al. 2004; Kanhere et al. 2005
- Miscellaneous: Black and Fleet 1998; Birchfield 1999; Jojic and Frey 2001; Levine and Weiss 2006
By algorithm:
- Expectation maximization: Jojic and Frey 2001; Smith et al. 2004; Kokkinos and Maragos 2004
- Graph cuts: Willis et al. 2003; Xiao and Shah 2005
- Belief propagation: Criminisi et al. 2006; Kumar et al. 2005
- Normalized cuts: Shi and Malik 1998
- Variational methods: Cremers and Soatto 2005; Brox et al. 2005
By nature of data:
- Dense motion: Cremers and Soatto 2005; Brox et al. 2005
- Motion + image cues: Kumar et al. 2005; Criminisi et al. 2006; Xiao and Shah 2005
- Sparse features: Sivic et al. 2004; Kanhere et al. 2005; Rothganger et al. 2004

5 Challenges: Short Term
(Example scene with multiple regions: statue, wall, trees, grass, biker, pedestrian.)
- computation of motion in the scene; influence of the neighboring motion
- number of objects/regions in the scene
- initialization of motion parameters
- description of complex motions (e.g., articulated human motion)

6 Challenges: Long Term
(Figure: trajectories of slow, medium, and fast objects over time windows, contrasting batch processing with incremental processing.)
- batch processing vs. incremental processing; updating the reference frame
- maintaining existing groups: growing existing regions, splitting
- adding new groups (new objects), deleting invisible groups

7 Objectives
- motion segmentation using sparse point features
- automatically determine the number of groups
- handling dynamic sequences
- real-time performance
- handling complex motions
Framework: feature tracking supplies the observed data (motion computation); a mixture-model framework performs two-frame clustering (parameter estimation and group assignment) under motion models of increasing complexity (translation, affine, complex models such as articulated human motion); groups are then maintained over the long term.

8 Overview of the Topics
- Feature Tracking: tracking sparse point features for computation of image motion, and its extension to joint feature tracking.
  S. T. Birchfield and S. J. Pundlik, “Joint Tracking of Features and Edges”, CVPR, 2008.
- Motion Segmentation: clustering point features in videos based on their motion and spatial connectivity.
  S. J. Pundlik and S. T. Birchfield, “Motion Segmentation at Any Speed”, BMVC, 2006.
  S. J. Pundlik and S. T. Birchfield, “Real Time Motion Segmentation of Sparse Feature Points at Any Speed”, IEEE Trans. on Systems, Man, and Cybernetics, 2008.
- Articulated Human Motion Models: learning human walking motion from various poses and view angles for segmentation and pose estimation (a special handling of a complex motion model).
- Iris Segmentation: texture- and intensity-based segmentation of non-ideal iris images.
  S. J. Pundlik, D. L. Woodard and S. T. Birchfield, “Non-Ideal Iris Segmentation Using Graph Cuts”, CVPR Workshop on Biometrics, 2008.

9 Point Features
Popular features:
- Harris corner feature [Harris and Stephens 1988; Schmid et al. 2000]
- Shi-Tomasi feature [Shi and Tomasi 1994]
- Förstner corner feature [Förstner 1994]
- Scale Invariant Feature Transform (SIFT) [Lowe 2000]
- Gradient Location and Orientation Histogram (GLOH) [Mikolajczyk and Schmid 2005]
- Features from Accelerated Segment Test (FAST) [Rosten and Drummond 2005]
- Speeded-Up Robust Features (SURF) [Bay et al. 2006]
- DAISY [Tola et al. 2008]
(Figure: input image, gradients, and detected point features capturing the information content.)

10 Utility of Point Features
Advantages:
- highly repeatable and extensible (work for a variety of images)
- efficient to compute (real-time implementations available)
- local methods for processing (tracking through multiple frames)
Tracking multiple point features = sparse optical flow; sparse point feature tracks yield the image motion.

11 Tracking Point Features: Lucas-Kanade
Assume constant brightness: $I(x, y, t) = I(x + u, y + v, t + 1)$.
Linearizing gives the optic flow constraint equation $I_x u + I_y v + I_t = 0$, where $I_x, I_y$ are the image spatial derivatives, $I_t$ is the image temporal derivative, and $(u, v)$ is the pixel displacement.
Estimate the pixel displacement $\mathbf{u} = (u, v)^T$ by minimizing
$E_{LK}(\mathbf{u}) = \sum_{\mathbf{x} \in W} K(\mathbf{x}) \, (I_x u + I_y v + I_t)^2$,
where $K$ is a convolution kernel over the window $W$.
Differentiating with respect to $u$ and $v$ and setting the derivatives to zero leads to a linear system $Z\mathbf{u} = \mathbf{e}$, with gradient covariance matrix
$Z = \begin{bmatrix} \sum K I_x^2 & \sum K I_x I_y \\ \sum K I_x I_y & \sum K I_y^2 \end{bmatrix}$ and $\mathbf{e} = -\begin{bmatrix} \sum K I_x I_t \\ \sum K I_y I_t \end{bmatrix}$.
Iterate using the Newton-Raphson method.
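A minimal numpy sketch of a single Lucas-Kanade step for one feature, following the equations above. The window size, integer feature coordinates, and single solve (no pyramid, no sub-pixel warping, no Newton-Raphson loop) are simplifying assumptions, not the full tracker:

```python
import numpy as np

def lk_step(I1, I2, x, y, half_win=7):
    """One Lucas-Kanade update: solve Z u = e in a window around (x, y).

    Sketch only: a full tracker iterates this step (Newton-Raphson),
    warping I2 by the current estimate with sub-pixel interpolation.
    """
    Iy, Ix = np.gradient(I1.astype(float))    # np.gradient returns d/drow, d/dcol
    It = I2.astype(float) - I1.astype(float)  # temporal derivative
    r0, r1 = y - half_win, y + half_win + 1
    c0, c1 = x - half_win, x + half_win + 1
    ix = Ix[r0:r1, c0:c1].ravel()
    iy = Iy[r0:r1, c0:c1].ravel()
    it = It[r0:r1, c0:c1].ravel()
    # Gradient covariance matrix Z and right-hand side e (uniform kernel K).
    Z = np.array([[ix @ ix, ix @ iy],
                  [ix @ iy, iy @ iy]])
    e = -np.array([ix @ it, iy @ it])
    return np.linalg.solve(Z, e)              # displacement (u, v)
```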

12 Detection of Point Features
Gradient covariance matrix ($K$ a convolution kernel, $I_x, I_y$ the image gradients):
$Z = \sum_{\mathbf{x} \in W} K(\mathbf{x}) \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}$
Good feature: both eigenvalues of $Z$ exceed a threshold, i.e. $e_{min} > \tau$.
Three example patches:
1. No feature: low intensity variation; two small eigenvalues ($e_{max} = 5.15$, $e_{min} = 3.13$).
2. Edge feature: unidirectional intensity variation; one small and one large eigenvalue ($e_{max} = 1026.9$, $e_{min} = 29.9$).
3. Good feature: bidirectional intensity variation; two large eigenvalues ($e_{max} = 1672.44$, $e_{min} = 932.4$).
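A hedged sketch of the detection criterion: build the gradient covariance matrix at every pixel with a box kernel and keep pixels whose smaller eigenvalue clears a threshold. The relative threshold and the omission of non-maximum suppression are assumptions for brevity:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def min_eigenvalue_map(I, win=7):
    """Smaller eigenvalue of the 2x2 gradient covariance matrix at each pixel."""
    Iy, Ix = np.gradient(I.astype(float))
    a = uniform_filter(Ix * Ix, win)   # window-averaged entries of Z
    b = uniform_filter(Ix * Iy, win)
    c = uniform_filter(Iy * Iy, win)
    # Closed-form smaller eigenvalue of the symmetric matrix [[a, b], [b, c]].
    return 0.5 * ((a + c) - np.sqrt((a - c) ** 2 + 4.0 * b ** 2))

def good_features(I, win=7, rel_thresh=0.01):
    lam = min_eigenvalue_map(I, win)
    ys, xs = np.nonzero(lam > rel_thresh * lam.max())
    return np.stack([xs, ys], axis=1)  # (x, y) coordinates of good features
```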

13 Dense Optical Flow: Horn-Schunck
Horn-Schunck: find global displacement functions $u(x,y)$ and $v(x,y)$ by minimizing
$E_{HS} = \iint (I_x u + I_y v + I_t)^2 + \lambda \left( \|\nabla u\|^2 + \|\nabla v\|^2 \right) dx \, dy$,
a data term (the optical flow constraint) plus a smoothness term weighted by the regularization parameter $\lambda$.
Solve using the Euler-Lagrange equations:
$I_x (I_x u + I_y v + I_t) = \lambda \nabla^2 u$
$I_y (I_x u + I_y v + I_t) = \lambda \nabla^2 v$
Approximating the Laplacian as $\nabla^2 u \approx \kappa (\bar{u} - u)$, where $\bar{u}$ is the average displacement in the neighborhood and $\kappa$ is a constant, leads to a sparse system:
$(\lambda \kappa + I_x^2) u + I_x I_y v = \lambda \kappa \bar{u} - I_x I_t$
$I_x I_y u + (\lambda \kappa + I_y^2) v = \lambda \kappa \bar{v} - I_y I_t$
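A compact sketch of the resulting iteration in the classic Horn-Schunck form, with the neighborhood average computed by a convolution kernel; the kernel weights, the constant alpha2 (playing the role of $\lambda\kappa$), and the fixed iteration count are assumptions:

```python
import numpy as np
from scipy.ndimage import convolve

# Averaging kernel for the neighborhood means u_bar, v_bar.
AVG = np.array([[1, 2, 1], [2, 0, 2], [1, 2, 1]], float) / 12.0

def horn_schunck(I1, I2, alpha2=100.0, n_iters=100):
    Iy, Ix = np.gradient(I1.astype(float))
    It = I2.astype(float) - I1.astype(float)
    u = np.zeros_like(Ix)
    v = np.zeros_like(Ix)
    for _ in range(n_iters):
        u_bar = convolve(u, AVG)
        v_bar = convolve(v, AVG)
        # Shared term from the optical flow constraint at (u_bar, v_bar).
        t = (Ix * u_bar + Iy * v_bar + It) / (alpha2 + Ix**2 + Iy**2)
        u = u_bar - Ix * t
        v = v_bar - Iy * t
    return u, v
```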

14 Need for a Joint Approach
Lucas-Kanade (1981):
- local method (local smoothing)
- pixel displacement: constant within a small neighborhood
- robust under noise
- produces sparse optical flow
Horn-Schunck (1981):
- global method (global smoothing)
- pixel displacement: a smooth function over the image domain
- sensitive to noise
- produces dense optical flow
Use global smoothing to improve feature tracking: joint feature tracking. Use local smoothing to improve dense optical flow: the combined local-global approach (Bruhn et al., 2004).

15 Joint Lucas-Kanade (JLK)
Joint Lucas-Kanade energy functional over $N$ feature points:
$E_{JLK} = \sum_{i=1}^{N} \left[ E_D(i) + \lambda_i E_S(i) \right]$
with a data term (optical flow constraint) $E_D(i) = \sum_{\mathbf{x} \in W_i} K(\mathbf{x}) (I_x u_i + I_y v_i + I_t)^2$ and a smoothness term (regularization) $E_S(i) = (u_i - \hat{u}_i)^2 + (v_i - \hat{v}_i)^2$, where $(\hat{u}_i, \hat{v}_i)$ are the expected values of the displacement computed from neighboring features.
Differentiating $E_{JLK}$ w.r.t. $(u_i, v_i)$ gives a $2N \times 2N$ system whose $(2i-1)$th and $(2i)$th rows are
$(Z_i + \lambda_i I)\, \mathbf{u}_i = \lambda_i \hat{\mathbf{u}}_i - \begin{bmatrix} \sum K I_x I_t \\ \sum K I_y I_t \end{bmatrix}$.
The sparse system is solved using Jacobi iterations.
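A sketch of the Jacobi iteration this system suggests, assuming each feature comes with a precomputed 2x2 matrix Z[i], right-hand side e[i], weight lam[i], and neighbor list; taking the expected displacement as the plain neighbor average is an assumption (the paper's weighting may differ):

```python
import numpy as np

def jlk_jacobi(Z, e, lam, neighbors, n_iters=50):
    """Jacobi iterations for the Joint Lucas-Kanade sparse system.

    Z: (N, 2, 2) gradient covariance matrices; e: (N, 2) right-hand sides
    (-[sum K Ix It, sum K Iy It]); lam: (N,) smoothness weights;
    neighbors: list of neighbor-index lists, one per feature.
    """
    N = len(Z)
    U = np.zeros((N, 2))
    eye = np.eye(2)
    for _ in range(n_iters):
        U_new = np.empty_like(U)
        for i in range(N):
            # Expected displacement: average of neighbors' current estimates.
            u_hat = U[neighbors[i]].mean(axis=0) if neighbors[i] else np.zeros(2)
            # Block row i: (Z_i + lam_i I) u_i = lam_i u_hat + e_i.
            U_new[i] = np.linalg.solve(Z[i] + lam[i] * eye, lam[i] * u_hat + e[i])
        U = U_new
    return U
```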

16 Results of JLK
(Figures: tracking results on low-texture and repetitive-texture regions.)

17 Overview of the Topics (outline slide repeated; next topic: Motion Segmentation)

18 Mixture Models Basics
Setup: 3 bins (components); a sample is drawn from the mixture.
Posterior probability of the drawn sample having come from the Red bin:
$P(\text{Red} \mid \text{sample}) = \dfrac{P(\text{sample} \mid \text{Red}) \, P(\text{Red})}{P(\text{sample})}$
where $P(\text{sample} \mid \text{Red})$ is the likelihood of the sample being Red (how Red is the drawn sample?) and $P(\text{Red})$ is the prior probability of the Red bin (how big is the Red bin?).
Probability of drawing a sample from a mixture of three bins:
$P(\text{sample}) = P(\text{sample} \mid \text{Red})P(\text{Red}) + P(\text{sample} \mid \text{Green})P(\text{Green}) + P(\text{sample} \mid \text{Blue})P(\text{Blue})$
Challenge: the only available information is the drawn sample!
Mixture model: likelihoods and priors for all the components.
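The bin example in numbers: a minimal sketch of Bayes' rule over three components, with made-up likelihood and prior values:

```python
import numpy as np

priors = np.array([0.5, 0.3, 0.2])       # bin sizes: P(Red), P(Green), P(Blue)
likelihoods = np.array([0.9, 0.3, 0.1])  # assumed P(sample | component)

evidence = likelihoods @ priors          # P(sample), by total probability
posteriors = likelihoods * priors / evidence
print(posteriors)                        # -> [0.804 0.161 0.036]; Red dominates
```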

19 Mixture Model Example: GMM
Parameters of a Gaussian density: $\theta_j = \{\mu_j, \sigma_j^2\}$ (mean and variance). Example: grayscale values of an image modeled with four components $\theta_1, \theta_2, \theta_3, \theta_4$.
Gaussian density for the $i$th pixel conditioned on the parameters of the $j$th component:
$p(x_i \mid \theta_j) = \dfrac{1}{\sqrt{2\pi\sigma_j^2}} \exp\left( -\dfrac{(x_i - \mu_j)^2}{2\sigma_j^2} \right)$

20 Learning Mixture Models
Mixture model defined as
$f(x_i \mid \Theta) = \sum_{j=1}^{K} \pi_j \, p(x_i \mid \theta_j)$,
where $K$ is the number of components (known), $\pi_j$ are the mixing weights (unknown), $p(\cdot \mid \theta_j)$ is the component density with parameters $\theta_j$ (unknown), and $x_i$ is an observed data point (known).
Learning a mixture model (parameter estimation) means estimating the mixing weights and the component density parameters.
Circular nature of the problem: parameter estimation requires class association (segmentation), and segmentation requires the estimated parameters.

21 Expectation Maximization
EM: an iterative two-step algorithm for parameter estimation.
1. Initialize:
   a. number of components K
   b. component density parameters θ for all components
   c. mixing weights π
   d. convergence criterion
2. Repeat until convergence:
   E step (for all N data points):
   a. compute the likelihood from the component density
   b. estimate the weights w
   M step:
   c. estimate the mixing weights
   d. estimate the component density parameters
E step: find the expectation of the likelihood function (segmentation / label assignment).
M step: maximize the likelihood function (parameter estimation based on the segmentation).
Convergence: when the likelihood cannot be further maximized (estimates do not change between successive iterations). A sketch follows below.
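A self-contained sketch of these steps for a 1-D Gaussian mixture (e.g., grayscale values). The quantile-based initialization and the tolerance are assumptions:

```python
import numpy as np

def em_gmm_1d(x, K, n_iters=100, tol=1e-6):
    # 1. Initialize mixing weights, means, and variances.
    pi = np.full(K, 1.0 / K)
    mu = np.quantile(x, np.linspace(0.1, 0.9, K))
    var = np.full(K, x.var())
    prev_ll = -np.inf
    for _ in range(n_iters):
        # E step: responsibilities w[i, j] = P(component j | x_i).
        d = x[:, None] - mu[None, :]
        lik = np.exp(-0.5 * d**2 / var) / np.sqrt(2 * np.pi * var)
        joint = lik * pi
        w = joint / joint.sum(axis=1, keepdims=True)
        # M step: re-estimate mixing weights and density parameters.
        Nk = w.sum(axis=0)
        pi = Nk / len(x)
        mu = (w * x[:, None]).sum(axis=0) / Nk
        var = (w * (x[:, None] - mu)**2).sum(axis=0) / Nk
        # Converged when the log-likelihood stops increasing.
        ll = np.log(joint.sum(axis=1)).sum()
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    return pi, mu, var
```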

22 Various Mixture Models
- Finite Mixture Model (FMM): data term only (how closely the data follow the models); one prior per component (the mixing weights). Learned with the EM algorithm.
- Spatially Variant Finite Mixture Model, ML estimate (ML-SVFMM) [1]: a prior distribution for each data element (label probabilities).
- Spatially Variant Finite Mixture Model, MAP estimate (MAP-SVFMM) [1]: adds a smoothness term (spatial interaction of the data elements); neighbors mostly have similar labels (a loose constraint).
- Spatially Constrained Finite Mixture Model (SCFMM): enforces spatial connectivity of labels. Learned with a greedy EM algorithm.
[1] S. Sanjay-Gopal and T. Hebert, “Bayesian Pixel Classification Using Spatially Variant Finite Mixtures and Generalized EM Algorithm”, IEEE Trans. on Image Processing, 1998.

23 Greedy-EM (Iterative Region Growing)
(Figure: region growing from three different start locations on a 4-connected grid.)
Properties of greedy EM:
- enforces spatial connectivity of labels (SCFMM)
- automatically determines the number of groups
- local initialization of parameters
- primary user-defined parameters: the inclusion criterion and the minimum number of elements in a group

24 Grouping Point Features
Between two frames, repeat until all features have been considered (a sketch follows below):
- Randomly select a seed feature.
- Fit a motion model to its neighbors.
- Repeat until the group does not change:
  - Discard all features except the one nearest the centroid.
  - Grow the group by recursively including neighboring features with similar motion.
  - Update the motion model.
(Figure: growing a feature group from a single seed point; at each iteration the group is re-seeded from its centroid.)
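A sketch of growing one group from a seed, under a translational motion model (the group's mean displacement). The neighbor radius, motion tolerance, and round count are assumed parameters, and the helper names are hypothetical:

```python
import numpy as np
from scipy.spatial import cKDTree

def grow_group(pts, disp, seed, radius=20.0, tol=1.5, n_rounds=5):
    """pts: (N, 2) feature positions; disp: (N, 2) frame-to-frame motions."""
    tree = cKDTree(pts)
    members = {seed}
    model = disp[seed].copy()                 # translational motion model
    for _ in range(n_rounds):
        # Re-seed from the feature nearest the current group centroid.
        centroid = pts[list(members)].mean(axis=0)
        _, start = tree.query(centroid)
        members, frontier = {start}, [start]
        # Recursively include spatial neighbors with similar motion.
        while frontier:
            i = frontier.pop()
            for j in tree.query_ball_point(pts[i], radius):
                if j not in members and np.linalg.norm(disp[j] - model) < tol:
                    members.add(j)
                    frontier.append(j)
        model = disp[list(members)].mean(axis=0)  # update the motion model
    return members
```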

25 Grouping Consistent Features
Input: point features tracked between two frames. Output: groups of point features.
For N seed points: group the point features, then gather the sets of features that are always grouped together.
(Figure: groups produced from seed 1, seed 2, and seed 3, and the resulting consistent feature group.)

26 Grouping Consistent Features (continued)
Consistency check: keep features that are always grouped together, no matter which seed point is used. For each seed run, record a binary co-occurrence matrix over the features (a, b, c, d, ...); summing these matrices across runs identifies the pairs grouped together every time. In practice, we use 7 seed points. A sketch follows below.
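A sketch of the consistency check, assuming the grouping has been run once per seed and each run yields a label per feature; features sharing labels in every run form the consistent groups:

```python
import numpy as np

def consistent_groups(labels_per_seed):
    """labels_per_seed: (S, N) array, group label of each of N features per run."""
    S, N = labels_per_seed.shape
    co = np.zeros((N, N), dtype=int)
    for labels in labels_per_seed:        # one grouping run per seed (7 in practice)
        co += labels[:, None] == labels[None, :]
    always = co == S                      # pairs grouped together in every run
    seen, groups = set(), []
    for i in range(N):
        if i not in seen:
            members = set(np.nonzero(always[i])[0].tolist())
            seen |= members
            groups.append(members)
    return groups
```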

27 Consistent Features: Multiple Groups
(Figure: feature groups obtained across the various iterations, and the resulting consistent feature groups.)

28 Maintaining Groups Over Time
From frame k to frame k + n: track features, then find consistent groups. Lost features are dropped and newly added features are assigned. If the χ² test fails for an existing group, either its features are regrouped or multiple groups are found. A sketch of the test follows below.
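A hedged sketch of such a χ² test: compare a group's residuals against its motion model with an assumed noise variance and significance level (neither value is taken from the paper):

```python
import numpy as np
from scipy.stats import chi2

def group_still_coherent(disp, model, sigma2=0.25, alpha=0.01):
    """disp: (M, 2) member displacements; model: (2,) group motion estimate."""
    r = disp - model                      # residuals against the motion model
    stat = (r ** 2).sum() / sigma2        # chi-squared statistic with 2M dof
    return stat < chi2.ppf(1.0 - alpha, df=2 * len(disp))
```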

29 Experimental Results (sequences: mobile-calendar, freethrow, car-map, robots, statue)

30 Videos: statue sequence and mobile-calendar sequence.

31 Results Over Time (sequences: freethrow, mobile-calendar, statue, car-map, robots, vehicles). The algorithm dynamically determines the number of feature groups.

32 Comparison with Other Approaches

Algorithm                          Run time (sec/frame)   Max. number of groups
Xiao and Shah (PAMI, 2005)         520                    4
Kumar et al. (ICCV, 2005)          500                    6
Smith et al. (PAMI, 2004)          180                    3
Rothganger et al. (CVPR, 2004)     30                     3
Jojic and Frey (CVPR, 2001)        1                      3
Cremers and Soatto (IJCV, 2005)    40                     4
Our algorithm (TSMC, 2008)         0.16                   8

33 Effect of Joint Feature Tracking
(Figures: input sequence; standard Lucas-Kanade vs. Joint Lucas-Kanade tracking results.)

34 Overview of the Topics (outline slide repeated; next topic: Articulated Human Motion Models)

35 Articulated Motion Models
Theme: sparse motion alone captures a wealth of information.
Purpose of human motion analysis: pedestrian detection/surveillance, action recognition, pose estimation. Traditional approaches use appearance or frame differencing.
Objectives:
- learn articulated human motion models (motion only, no appearance)
- viewpoint- and scale-invariant detection
- varying lighting conditions (day and nighttime sequences)
- detection in the presence of camera and background motion
- pose estimation

36 Use of Motion Capture Data
Motion capture (mocap) data in 3D can be used in two ways:
- Top-down approach: train high-level descriptors (appearance- or motion-based) that describe articulated motion at a global level for detection.
- Bottom-up approach: learn the motion of individual joints from the training data and aggregate the information to detect human motion.
(Figure: displacement of the limbs, hands and feet, w.r.t. the body center.)

37 Approach Overview

38 Training
(Figures: 3D motion capture points; angular viewpoints; walking poses.)

39 Motion Descriptor
(Figures: Gaussian weight maps for the various means and orientations that constitute the motion descriptor; the spatial arrangement of the descriptor bins w.r.t. the body center; bin values of the motion descriptor describing human subjects from various viewpoints and pose configurations; a confusion matrix for the 64 training descriptors across views and poses.)

40 Segmentation Results
View-invariant segmentation of articulated motion using the motion descriptor (right profile, left profile, angular, and front views), and segmentation of articulated motion in a challenging sequence involving camera and background motion.

41 Pose Estimation Results
(Figures: front view, nighttime sequence, right-profile view, angular view.)

42 Videos of Detection and Pose Estimation

43 Overview of the Topics (outline slide repeated; next topic: Iris Segmentation)

44 Iris Image Segmentation
Non-ideal iris image segmentation using texture and intensity.
Ideas:
- Local intensity variations (computed from gradient magnitude and point features) can be used for a texture representation that separates eyelash from non-eyelash regions: eyelash regions show a higher density of point features and higher gradient magnitude; non-eyelash regions show a lower density of point features and lower gradient magnitude.
- Possible segments based on image intensity: iris, pupil, and background.
Together this yields four regions: eyelash (textured), and iris, pupil, and background (untextured). A sketch of the texture cue follows below.
(Figure: input image, point features, gradient magnitude, coarse texture computation.)
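A sketch of the coarse texture cue from the gradient-magnitude side only: smooth the gradient magnitude over a window and threshold into textured (eyelash-like) and untextured regions. The window size and relative threshold are assumptions, and the point-feature-density cue could be accumulated the same way:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def coarse_texture_mask(I, win=15, rel_thresh=0.1):
    """Binary map of textured regions from local average gradient magnitude."""
    Iy, Ix = np.gradient(I.astype(float))
    grad_mag = np.hypot(Ix, Iy)
    local_energy = uniform_filter(grad_mag, win)  # local intensity variation
    return local_energy > rel_thresh * local_energy.max()
```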

45 Iris Segmentation and Recognition
Iris segmentation pipeline: input iris image, preprocessed input (specular reflections removed), iris segmentation, iris refinement, iris mask, iris ellipse.
Iris recognition (a sketch of the comparison step follows below):
- unwrap and normalize the iris mask
- generate an iris signature from the iris mask (using texture in the iris)
- compare iris signatures using the Hamming distance
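A minimal sketch of the signature comparison, assuming binary iris codes with validity masks that exclude occluded bits; the rotation compensation (comparing shifted codes) used by typical iris matchers is omitted:

```python
import numpy as np

def hamming_distance(code_a, code_b, mask_a, mask_b):
    """Fraction of disagreeing bits over the jointly valid (unoccluded) bits."""
    valid = mask_a & mask_b
    return np.count_nonzero((code_a ^ code_b) & valid) / np.count_nonzero(valid)
```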

46 Image Segmentation Results
(Figures: input image; segmentation into iris, background, pupil, and eyelashes; iris mask.)

47 Iris Recognition
Iris recognition using our segmentation algorithm, evaluated on:
- West Virginia Non-Ideal Database: 1868 images (467 classes, 4 images/class)
- West Virginia Off-Axis Database: 584 images (146 classes, 4 images/class)

48 Conclusions and Future Work
Motion segmentation based on sparse feature clustering:
- spatially constrained mixture model and greedy EM algorithm
- automatically determines the number of groups
- real-time performance
- handles long, dynamic sequences and an arbitrary number of feature groups
Joint feature tracking:
- incorporates neighboring feature motion
- improved performance in areas of low texture or repetitive texture
Detection of articulated motion:
- motion-based approach for learning high-level human motion models
- segments and tracks human motion under varying pose, scale, and lighting conditions
- view-invariant pose estimation
Iris segmentation:
- graph-cuts-based dense segmentation using texture and intensity
- combines appearance and eye geometry
- handles non-ideal iris images with occlusion, illumination changes, and eye rotation
Future work:
- integration of motion segmentation, joint feature tracking, and articulated motion segmentation
- dense segmentation from the sparse feature groups
- handling non-rigid motions, non-textured regions, and occlusions
- combining sparse feature groups, discontinuities, and image contours for a novel representation of video

49 Questions?

