Simultaneous Segmentation and 3D Pose Estimation of Humans. Philip H.S. Torr, Pawan Kumar, Pushmeet Kohli, Matt Bray (Oxford Brookes University); Arasanathan Thayananthan, Bjorn Stenger, Roberto Cipolla (Cambridge University).

Presentation transcript:

Simultaneous Segmentation and 3D Pose Estimation of Humans. Philip H.S. Torr, Pawan Kumar, Pushmeet Kohli, Matt Bray (Oxford Brookes University); Arasanathan Thayananthan, Bjorn Stenger, Roberto Cipolla (Cambridge University).

Algebra
- Unifying conjecture:
- Tracking = Detection = Recognition
- Detection = Segmentation, therefore
- Tracking (pose estimation) = Segmentation?

Objective. Image, Segmentation, Pose Estimate?? Aim to get a clean segmentation of a human…

Developments
- ICCV 2003: pose estimation as fast nearest neighbour plus dynamics (inspired by Gavrila and by Toyama & Blake).
- BMVC 2004: parts-based chamfer matching to make the space of templates more flexible (à la the pictorial structures of Huttenlocher).
- CVPR 2005: ObjCut, combining segmentation and detection.
- ICCV 2005: dynamic graph cuts.
- ECCV 2006: interpolation of poses using the MVRVM (Agarwal and Triggs).
- ECCV 2006: combination of pose estimation and segmentation using graph cuts.

Tracking as Detection (Stenger et al., ICCV 2003)
Detection has become very efficient, e.g. real-time face detection and pedestrian detection.
Example: pedestrian detection [Gavrila & Philomin, 1999] finds a match among a large number of exemplar templates.
Issues:
- Number of templates needed
- Efficient search
- Robust cost function

Cascaded Classifiers

First filter: 19.8% of patches remaining. 1280x1024 image, 11 subsampling levels, 80 s. Average number of filters per patch: 6.7.

Filter 10: 0.74% of patches remaining.

Filter 20: 0.06% of patches remaining.

Filter 30: 0.01% of patches remaining.

Filter 70: % patches remaining

Hierarchical Detection. Efficient template matching (Huttenlocher & Olson, Gavrila). Idea: when matching similar objects, speed up the search by forming a template hierarchy found by clustering. Match the prototypes first, and search a sub-tree only if its prototype's cost is below a threshold.
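The pruning rule on this slide can be illustrated with a minimal sketch. Everything here is hypothetical: the `TemplateNode` class, the one-dimensional "templates", the absolute-difference cost and the single global threshold are stand-ins chosen for brevity, not the hierarchy or cost function used in the talk.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class TemplateNode:
    """A node of the template hierarchy: a prototype plus its sub-tree."""
    template: Tuple[float, ...]
    children: List["TemplateNode"] = field(default_factory=list)


def search_tree(node: TemplateNode,
                cost: Callable[[Tuple[float, ...]], float],
                threshold: float) -> List[Tuple[float, Tuple[float, ...]]]:
    """Return (cost, template) pairs for every leaf that was not pruned."""
    c = cost(node.template)
    if c > threshold:
        return []                      # prune: the whole sub-tree is skipped
    if not node.children:
        return [(c, node.template)]    # leaf template survived
    matches = []
    for child in node.children:
        matches.extend(search_tree(child, cost, threshold))
    return matches


# Toy data: each template is a single number, the observation is 2.1.
observation = 2.1


def toy_cost(template: Tuple[float, ...]) -> float:
    return abs(template[0] - observation)


tree = TemplateNode((5.0,), [
    TemplateNode((2.5,), [TemplateNode((2.0,)), TemplateNode((3.0,))]),
    TemplateNode((7.5,), [TemplateNode((7.0,)), TemplateNode((8.0,))]),
])

matches = search_tree(tree, toy_cost, threshold=3.0)
best_cost, best_template = min(matches)
print(best_template, len(matches))
```

In this toy tree the prototype 7.5 fails the threshold, so its two leaves are never evaluated; only the leaves under 2.5 are scored.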

Trees
- These search trees are the same as those used for efficient nearest neighbour search.
- Add a dynamic model and Detection = Tracking = Recognition.

Evaluation at Multiple Resolutions. One traversal of the tree per time step.

Evaluation at Multiple Resolutions. Tree: 9000 templates of hand pointing, rigid.

Templates at Level 1

Templates at Level 2

Templates at Level 3

Comparison with Particle Filters
- This method is grid based.
- No need to render the model online.
- Like efficient search.
- Can always be used as a proposal process for a particle filter if need be.

Interpolation, MVRVM, ECCV 2006. Code available.

Energy being Optimized, link to graph cuts
- Combination of an edge term (quickly evaluated using chamfer matching) and an interior term (quickly evaluated using integral images).
- Note that the possible templates are a bit like cuts that we put down; one could think of this whole process as a constrained search for the best graph cut.
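As a rough illustration of why an interior term is cheap with integral images, the sketch below builds a summed-area table and reads off the sum over any axis-aligned rectangle with four lookups. The binary `skin` mask and the idea of scoring the interior as a rectangle sum are assumptions made for the example; the slide does not specify how the interior is decomposed.

```python
def integral_image(img):
    """Summed-area table with an extra leading row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, top, left, bottom, right):
    """Sum of img over rows top..bottom-1 and columns left..right-1."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


# Hypothetical per-pixel skin-colour mask (1 = skin-coloured).
skin = [
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 1, 1, 0],
]
ii = integral_image(skin)
total = box_sum(ii, 0, 0, 3, 4)   # whole image
inner = box_sum(ii, 1, 1, 3, 3)   # rows 1-2, columns 1-2
print(total, inner)
```

The table is built once per image, after which each template's rectangle sums cost a constant number of operations regardless of rectangle size.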

Likelihood: Edges. Figure labels: Input Image, Edge Detection, 3D Model, Projected Contours, Robust Edge Matching.

Chamfer Matching. Figure labels: Input image, Canny edges, Distance transform, Projected Contours.
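The pipeline named on this slide (edge map, distance transform, projected contour) can be sketched on a toy grid. This is an illustrative version only: the distance transform is brute force, the `truncate` argument is one assumed way of making the cost robust, and the edge map and contour are made up rather than produced by Canny or a 3D model.

```python
import math


def distance_transform(edges):
    """Brute-force Euclidean distance from each pixel to the nearest edge pixel."""
    pts = [(y, x) for y, row in enumerate(edges) for x, v in enumerate(row) if v]
    return [[min(math.hypot(y - ey, x - ex) for ey, ex in pts)
             for x in range(len(edges[0]))]
            for y in range(len(edges))]


def chamfer_cost(dt, contour, offset=(0, 0), truncate=None):
    """Mean distance-transform value under the shifted template contour."""
    oy, ox = offset
    total = 0.0
    for y, x in contour:
        d = dt[y + oy][x + ox]
        if truncate is not None:
            d = min(d, truncate)       # truncation limits the effect of outliers
        total += d
    return total / len(contour)


# Toy edge map: a vertical edge in column 3 of a 5x5 image.
edges = [[1 if x == 3 else 0 for x in range(5)] for _ in range(5)]
dt = distance_transform(edges)

# Hypothetical projected contour: a vertical three-pixel segment.
contour = [(0, 0), (1, 0), (2, 0)]

cost_on_edge = chamfer_cost(dt, contour, offset=(1, 3))
cost_off_edge = chamfer_cost(dt, contour, offset=(1, 1))
cost_truncated = chamfer_cost(dt, contour, offset=(1, 1), truncate=1.0)
print(cost_on_edge, cost_off_edge, cost_truncated)
```

The distance transform is computed once per image, so scoring each candidate template only requires summing values under its contour points.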

Likelihood: Colour. Figure labels: Input Image, Skin Colour Model, 3D Model, Projected Silhouette, Template Matching.

Template Matching =
- Template matching = constrained search for a cut/segmentation?
- Detection = Segmentation?

Objective. Image, Segmentation, Pose Estimate?? Aim to get a clean segmentation of a human…

MRF for Interactive Image Segmentation, Boykov and Jolly [ICCV 2001]. Energy_MRF consists of a unary likelihood term on the data (D) and pair-wise terms: a contrast term and a uniform prior (Potts model). Maximum-a-posteriori (MAP) solution: $x^* = \arg\min_x E(x)$.
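A toy version of such an energy may help fix ideas. The functional forms below are assumptions, since the slide's equation did not survive extraction: the contrast term is taken as `lam * exp(-beta * (I_p - I_q)**2)`, the Potts prior as a constant `potts` paid on every label change, label 1 is taken to mean foreground, and the image is a single row of four pixels so that the MAP labelling can be found by enumeration instead of a graph cut.

```python
import math
from itertools import product


def mrf_energy(labels, unary, image, lam=1.0, beta=0.5, potts=0.2):
    """Energy of a binary labelling of a 1-D row of pixels (toy setting).

    unary[i][l] is the cost of giving pixel i the label l.
    Neighbouring pixels with different labels pay contrast + Potts terms.
    """
    energy = sum(unary[i][labels[i]] for i in range(len(labels)))
    for i in range(len(labels) - 1):
        if labels[i] != labels[i + 1]:
            contrast = lam * math.exp(-beta * (image[i] - image[i + 1]) ** 2)
            energy += contrast + potts
    return energy


# Hypothetical data: two dark pixels followed by two bright pixels.
image = [10, 11, 50, 52]
unary = [(0.2, 1.5), (0.9, 1.0), (1.5, 0.3), (1.4, 0.2)]

# Exhaustive MAP search, feasible only because there are 2**4 labellings.
map_labels = min(product((0, 1), repeat=len(image)),
                 key=lambda x: mrf_energy(x, unary, image))
map_energy = mrf_energy(map_labels, unary, image)
print(map_labels, round(map_energy, 3))
```

The label boundary lands on the strong intensity edge because the contrast term makes a cut there almost free, while a cut between the two similar dark pixels would cost roughly 0.8.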

However…
- This energy formulation rarely provides realistic (target-like) results.

ObjCut (yesterday). Pose-specific MRF; figure labels: pixels, labels, pose parameters, unary potential, pairwise potential, prior (Potts model).

Layer 2, Layer 1, transformations $\Theta_1$ with $P(\Theta_1) = 0.9$, cow instance. Do we really need accurate models?

- The segmentation boundary can be extracted from edges.
- A rough 3D shape prior is enough for region disambiguation.

Energy of the Pose-specific MRF. Terms of the energy to be minimized: unary term, shape prior, pairwise potential, Potts model. But what should be the value of θ?

The different terms of the MRF. Figure panels: original image; likelihood of being foreground given a foreground histogram; Grimson-Stauffer segmentation; shape prior model; shape prior (distance transform); likelihood of being foreground given all the terms; resulting graph-cuts segmentation.

Can segment multiple views simultaneously

Solve via gradient descent
- Comparable to level set methods
- Could use other approaches (e.g. ObjCut)
- Need a graph cut per function evaluation

Formulating the Pose Inference Problem

But… to compute the MAP of E(x) w.r.t. the pose, the unary terms change at EACH iteration and the maxflow has to be recomputed! However, Kohli and Torr showed how dynamic graph cuts can be used to efficiently find MAP solutions for MRFs that change minimally from one time instant to the next: Dynamic Graph Cuts (ICCV 2005).

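Dynamic Graph Cuts. Problem P_A is solved to give solution S_A. A similar problem P_B could be solved from scratch to give S_B (a computationally expensive operation); instead, the differences between A and B give a simpler problem P_B*, whose solution gives S_B (a cheaper operation).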

Dynamic Image Segmentation. Figure panels: image, flows in n-edges, segmentation obtained.

Our Algorithm. The first segmentation problem (graph G_a) is solved by maximum flow, giving the MAP solution and the residual graph G_r. For the second segmentation problem (graph G_b), the difference between G_a and G_b is used to turn G_r into the updated residual graph G′.

Energy Minimization using Graph cuts. Graph construction for binary random variables: energy $E_{\text{MRF}}(a_1,a_2)$, nodes $a_1$ and $a_2$, terminals Source (0) and Sink (1).

Energy Minimization using Graph cuts. $E_{\text{MRF}}(a_1,a_2) = 2a_1$. t-edges (unary terms).

Energy Minimization using Graph cuts. $E_{\text{MRF}}(a_1,a_2) = 2a_1 + 5\bar{a}_1$.

Energy Minimization using Graph cuts. $E_{\text{MRF}}(a_1,a_2) = 2a_1 + 5\bar{a}_1 + 9a_2 + 4\bar{a}_2$.

Energy Minimization using Graph cuts. $E_{\text{MRF}}(a_1,a_2) = 2a_1 + 5\bar{a}_1 + 9a_2 + 4\bar{a}_2 + 2a_1\bar{a}_2$. n-edges (pair-wise term).

Energy Minimization using Graph cuts. $E_{\text{MRF}}(a_1,a_2) = 2a_1 + 5\bar{a}_1 + 9a_2 + 4\bar{a}_2 + 2a_1\bar{a}_2 + \bar{a}_1 a_2$.

Energy Minimization using Graph cuts. $E_{\text{MRF}}(a_1,a_2) = 2a_1 + 5\bar{a}_1 + 9a_2 + 4\bar{a}_2 + 2a_1\bar{a}_2 + \bar{a}_1 a_2$. For $a_1 = 1$, $a_2 = 1$: $E_{\text{MRF}}(1,1) = 11$, cost of st-cut = 11.

Energy Minimization using Graph cuts. $E_{\text{MRF}}(a_1,a_2) = 2a_1 + 5\bar{a}_1 + 9a_2 + 4\bar{a}_2 + 2a_1\bar{a}_2 + \bar{a}_1 a_2$. For $a_1 = 1$, $a_2 = 0$: $E_{\text{MRF}}(1,0) = 8$, cost of st-cut = 8.
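The two evaluations above can be checked, and the remaining two configurations filled in, by enumerating the energy directly. This is only a sanity check of the slide's arithmetic; the function name `e_mrf` is made up for the example.

```python
from itertools import product


def e_mrf(a1, a2):
    """The slide's two-variable energy; n1 and n2 are the complements of a1, a2."""
    n1, n2 = 1 - a1, 1 - a2
    return 2 * a1 + 5 * n1 + 9 * a2 + 4 * n2 + 2 * a1 * n2 + n1 * a2


energies = {(a1, a2): e_mrf(a1, a2) for a1, a2 in product((0, 1), repeat=2)}
best = min(energies, key=energies.get)

for config, value in sorted(energies.items()):
    print(config, value)
print("minimum:", best, energies[best])
```

Among the four configurations, (1, 0) has the lowest energy, so the cut of cost 8 shown on the last slide is the minimum cut for this graph.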

Energy Minimization using Graph cuts
Most probable (MAP) configuration ≡ minimum cost st-cut.
- With negative edge weights, st-mincut is in general an NP-hard problem.
- With non-negative edge weights, it is solvable in polynomial time; this case corresponds to sub-modular (regular) energy functions.

The Max-flow Problem
- Edge capacity and flow balance constraints.
Computing the st-mincut from max-flow algorithms
Notation:
- Residual capacity (edge capacity − current flow)
- Augmenting path
Simple augmenting-path-based algorithms:
- Repeatedly find augmenting paths and push flow.
- Saturated edges constitute the st-mincut [Ford-Fulkerson theorem].
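A small augmenting-path solver makes this concrete. The sketch below uses Edmonds-Karp (shortest augmenting paths by breadth-first search), which is one standard member of this family and not necessarily the algorithm used in the talk. The edge directions are an assumed construction for the two-variable energy $2a_1 + 5\bar{a}_1 + 9a_2 + 4\bar{a}_2 + 2a_1\bar{a}_2 + \bar{a}_1 a_2$ with source = label 0 and sink = label 1: a term paid when a node takes label 1 becomes an edge from the source to that node, and so on.

```python
from collections import deque


def max_flow(capacity, s, t):
    """Edmonds-Karp. Returns (flow value, residual capacities)."""
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u, nbrs in capacity.items():
        for v in nbrs:
            residual.setdefault(v, {}).setdefault(u, 0)   # reverse edges
    flow = 0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow, residual          # no augmenting path is left
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(residual[u][v] for u, v in path)   # bottleneck capacity
        for u, v in path:
            residual[u][v] -= push
            residual[v][u] += push
        flow += push


def source_side(residual, s):
    """Nodes still reachable from s through edges with residual capacity."""
    seen, stack = {s}, [s]
    while stack:
        u = stack.pop()
        for v, cap in residual[u].items():
            if cap > 0 and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


capacity = {
    "s": {"a1": 2, "a2": 9},     # paid when the node takes label 1 (sink side)
    "a1": {"t": 5, "a2": 1},     # 5: a1 takes label 0; 1: a1 = 0 and a2 = 1
    "a2": {"t": 4, "a1": 2},     # 4: a2 takes label 0; 2: a1 = 1 and a2 = 0
}
flow, residual = max_flow(capacity, "s", "t")
reachable = source_side(residual, "s")
labels = {v: 0 if v in reachable else 1 for v in ("a1", "a2")}
print(flow, labels)
```

The flow value equals the minimum energy of this example, and the labelling read off the source side of the cut is the minimizing configuration.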

Reparametrization. Key observation: adding a constant to both the t-edges of a node does not change the edges constituting the st-mincut. In the figure the t-edges of $a_2$ become $9 + \alpha$ and $4 + \alpha$. $E(a_1,a_2) = 2a_1 + 5\bar{a}_1 + 9a_2 + 4\bar{a}_2 + 2a_1\bar{a}_2 + \bar{a}_1 a_2$ and $E^*(a_1,a_2) = E(a_1,a_2) + \alpha(a_2 + \bar{a}_2) = E(a_1,a_2) + \alpha$, since $a_2 + \bar{a}_2 = 1$.
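The key observation is easy to confirm numerically on the slide's own energy: adding α to both t-edges of $a_2$ shifts every configuration's energy by exactly α and leaves the minimizer unchanged. The value α = 3 below is arbitrary.

```python
from itertools import product


def energy(a1, a2, alpha=0):
    """Slide energy with alpha added to both t-edges of a2."""
    n1, n2 = 1 - a1, 1 - a2
    return (2 * a1 + 5 * n1
            + (9 + alpha) * a2 + (4 + alpha) * n2
            + 2 * a1 * n2 + n1 * a2)


configs = list(product((0, 1), repeat=2))
alpha = 3

shifts = {c: energy(*c, alpha=alpha) - energy(*c) for c in configs}
argmin_before = min(configs, key=lambda c: energy(*c))
argmin_after = min(configs, key=lambda c: energy(*c, alpha=alpha))
print(shifts, argmin_before, argmin_after)
```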

Reparametrization, second type (other type of reparametrization). Figure edge labels: 9 + α, 4, α, 2 + α, 1 − α. Both types maintain the solution and add a constant α to the energy. All reparametrizations of the graph are sums of these two types.

Graph Re-parameterization. Original graph G with terminals s and t and nodes x_i and x_j; edge labels are flow/residual capacity: 0/9, 0/7, 0/5, 0/2, 0/4, 0/1.

Graph Re-parameterization. Computing the maxflow on the original graph G gives the residual graph G_r with edge labels 0/12, 5/2, 3/2, 1/0, 2/0, 4/0; the edges cut form the st-mincut.

Update t-edge Capacities. Residual graph G_r with edge labels (flow/residual capacity) 0/12, 5/2, 3/2, 1/0, 2/0, 4/0.

Update t-edge Capacities. Residual graph G_r: 0/12, 5/2, 3/2, 1/0, 2/0, 4/0. A t-edge capacity changes from 7 to 4.

Update t-edge Capacities. Updated residual graph G′: 0/12, 5/−1, 3/2, 1/0, 2/0, 4/0. The capacity changes from 7 to 4, so the edge capacity constraint is violated (flow > capacity). Excess flow e = flow − new capacity = 5 − 4 = 1.

Update t-edge Capacities. Updated residual graph G′: 0/12, 5/−1, 3/2, 1/0, 2/0, 4/0. The capacity changes from 7 to 4, so the edge capacity constraint is violated (flow > capacity). Excess flow e = flow − new capacity = 5 − 4 = 1. Add e to both t-edges connected to node i.

Update t-edge Capacities. Updated residual graph G′: 0/12, 5/0, 3/2, 1/0, 2/1, 4/0. The capacity changes from 7 to 4, so the edge capacity constraint is violated (flow > capacity). Excess flow e = flow − new capacity = 5 − 4 = 1. Adding e to both t-edges connected to node i restores a valid flow.
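The t-edge update walked through on these slides can be sketched as a few lines of bookkeeping. The data layout is hypothetical (each node stores the flow and capacity of its two t-edges, keyed `"s"` and `"t"`), and which of the two t-edges carries the 5/2 label in the figure is an assumption; only the arithmetic follows the slide.

```python
def residual(edge):
    """Residual capacity = capacity - current flow."""
    return edge["cap"] - edge["flow"]


def update_t_edge(node, which, new_capacity):
    """Change one t-edge capacity, repairing any capacity violation.

    If the existing flow exceeds the new capacity, the excess is added to
    the capacities of both t-edges of the node (a reparametrization that
    shifts the energy by a constant). Returns the excess that was added.
    """
    edge = node[which]
    edge["cap"] = new_capacity
    excess = max(edge["flow"] - edge["cap"], 0)
    if excess:
        for t_edge in node.values():
            t_edge["cap"] += excess
    return excess


# Node i as on the slide: labels 5/2 and 2/0 (flow/residual capacity).
node_i = {"s": {"flow": 5, "cap": 7}, "t": {"flow": 2, "cap": 2}}

added = update_t_edge(node_i, "s", new_capacity=4)
labels = {k: (e["flow"], residual(e)) for k, e in node_i.items()}
print(added, labels)
```

After the update the two t-edges read 5/0 and 2/1, matching the final slide of the sequence, and no flow had to be recomputed.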

Update n-edge Capacities. Residual graph G_r: 0/12, 5/2, 3/2, 1/0, 2/0, 4/0. An n-edge capacity changes from 5 to 2.

Update n-edge Capacities. Updated residual graph G′: 0/12, 5/2, 3/−1, 1/0, 2/0, 4/0. The capacity changes from 5 to 2, so the edge capacity constraint is violated.

Update n-edge Capacities. Updated residual graph G′: 0/12, 5/2, 3/−1, 1/0, 2/0, 4/0. The capacity changes from 5 to 2, so the edge capacity constraint is violated. Reduce the flow to satisfy the constraint.

Update n-edge Capacities. Updated residual graph G′: 0/11, 5/2, 2/0, 1/0, 2/0, 4/0. Reducing the flow to satisfy the constraint causes a flow imbalance: an excess at one node and a deficiency at the other.

Update n-edge Capacities. Updated residual graph G′: 0/11, 5/2, 2/0, 1/0, 2/0, 4/0. Reducing the flow causes a flow imbalance (excess and deficiency). Push the excess flow to/from the terminals, creating capacity by adding α = excess to both t-edges.

Update n-edge Capacities. Updated residual graph G′: 0/11, 5/3, 2/0, 3/0, 4/1. The capacity changes from 5 to 2, so the edge capacity constraint is violated. Reducing the flow to satisfy the constraint causes a flow imbalance. Push the excess flow to the terminals, creating capacity by adding α = excess to both t-edges.

Complexity analysis of MRF Update Operations

MRF energy operation          Graph operation                                      Complexity
modifying a unary term        updating a t-edge capacity                           O(1)
modifying a pair-wise term    updating an n-edge capacity                          O(1)
adding a latent variable      adding a node                                        O(1)
deleting a latent variable    setting the capacities of all edges of a node to 0   O(k)*

* requires k edge update operations, where k is the degree of the node

Improving the Algorithm
Finding augmenting paths is time consuming.
Dual-tree maxflow algorithm [Boykov & Kolmogorov, PAMI 2004]:
- Reuses search trees after each augmentation.
- Empirically shown to be substantially faster.
Our idea:
- Reuse search trees from the previous graph cut computation.
- This saves the search tree creation time [O(#edges)].
- Search trees have to be modified to make them consistent with the new graphs.
- Constrain the search for augmenting paths: new paths must contain at least one updated edge.

Reusing Search Trees. c′ = measure of change in the energy. Running time: dynamic algorithm (c′ + re-create search tree); improved dynamic algorithm (c′). Video segmentation example: duplicate image frames (no time is needed).

Dynamic Graph Cut vs Active Cuts
- Our method: flow recycling
- AC: cut recycling
- Both methods: tree recycling

Experimental Analysis. Compared results with the best static algorithm, the dual-tree algorithm [Boykov & Kolmogorov, PAMI 2004]. Applications: interactive image segmentation; image segmentation in videos.

Interactive image segmentation (update unary terms). With user segmentation cues: static 175 msec. With additional segmentation cues: static 175 msec, dynamic 80 msec, dynamic (optimized) 15 msec.

Experimental Analysis. Image segmentation in videos (unary & pairwise terms); image resolution: 720x576. Graph cuts (static): 220 msec; dynamic graph cuts (optimized): 50 msec.

Experimental Analysis. Image segmentation in videos (unary & pairwise terms); image resolution: 720x576. Graph cuts (static): 177 msec; dynamic graph cuts (optimized): 60 msec.

Experimental Analysis. Running time of the dynamic algorithm on an MRF consisting of $2 \times 10^5$ latent variables connected in a 4-neighborhood.

Other uses
- Can be used to compute uncertainty in graph cuts via max-marginals.
- Max-marginals can be used for parameter learning in MRFs.

Inference in Graphical Models
Graphical model topology:
- Tree: belief propagation and variants give the exact solution and true marginals/min-marginals.
- Graph with cycles: belief propagation and variants give an approximate solution and approximate marginals/min-marginals.
Graph cuts (no marginals/min-marginals):
- Class 1: max-flow computation, exact.
- Class 2: alpha expansions, approximate solution with bounds.
- Class 3: local minima (with no bounds).

Inference in Graphical Models. Min-marginal energies ($\psi$):
- Minimize the joint energy over all other variables.
- Related to max-marginals as: $\mu_j = (1/z)\exp(-\psi_j)$.
- Can be used to compute confidence as: $\sigma_j = \mu_j / \sum_a \mu_a = \exp(-\psi_j) / \sum_a \exp(-\psi_a)$.
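As a worked example of these formulas, the sketch below computes the min-marginals of $a_1$ for the two-variable energy used elsewhere in the talk, $2a_1 + 5\bar{a}_1 + 9a_2 + 4\bar{a}_2 + 2a_1\bar{a}_2 + \bar{a}_1 a_2$, and turns them into confidences. The min-marginals are found by enumeration here purely for illustration; the function names are made up.

```python
import math
from itertools import product


def e_mrf(a1, a2):
    n1, n2 = 1 - a1, 1 - a2
    return 2 * a1 + 5 * n1 + 9 * a2 + 4 * n2 + 2 * a1 * n2 + n1 * a2


def min_marginal(var, value):
    """psi: minimum energy over all configurations with variable `var` fixed."""
    return min(e_mrf(*c) for c in product((0, 1), repeat=2) if c[var] == value)


def confidence(psis):
    """sigma_j = exp(-psi_j) / sum_a exp(-psi_a), computed stably."""
    m = min(psis)
    weights = [math.exp(-(p - m)) for p in psis]
    z = sum(weights)
    return [w / z for w in weights]


psi_a1 = [min_marginal(0, 0), min_marginal(0, 1)]
sigma_a1 = confidence(psi_a1)
print(psi_a1, [round(s, 4) for s in sigma_a1])
```

Subtracting the smallest min-marginal before exponentiating leaves the ratios unchanged and avoids underflow when energies are large; the normalizer z cancels, as in the slide's formula.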

Energy Projections and Graph Construction. $E_{\text{MRF}}(a_1,a_2) = 2a_1 + 5\bar{a}_1 + 9a_2 + 4\bar{a}_2 + 2a_1\bar{a}_2 + \bar{a}_1 a_2 + K\bar{a}_2$. A high unary term (t-edge) can be used to constrain the solution of the energy to be the solution of the energy projection. Alternative construction; figure labels: Sink (0), Source (1), t-edge capacities ∞ and K.
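The effect of the extra term can be checked by enumeration: with K large, any configuration with $a_2 = 0$ becomes prohibitively expensive, so minimizing the augmented energy gives the same value as minimizing the original energy with $a_2$ fixed to 1. The finite value K = 10**6 stands in for the ∞ in the figure and is an assumption of this sketch.

```python
from itertools import product

K = 10 ** 6   # large constant standing in for an infinite t-edge capacity


def e_mrf(a1, a2):
    n1, n2 = 1 - a1, 1 - a2
    return 2 * a1 + 5 * n1 + 9 * a2 + 4 * n2 + 2 * a1 * n2 + n1 * a2


def e_projected(a1, a2):
    """Original energy plus the penalty K * (1 - a2), paid when a2 = 0."""
    return e_mrf(a1, a2) + K * (1 - a2)


configs = list(product((0, 1), repeat=2))
constrained_argmin = min(configs, key=lambda c: e_projected(*c))
constrained_min = e_projected(*constrained_argmin)

# The same quantity computed directly: fix a2 = 1 and minimize over a1.
direct_min = min(e_mrf(a1, 1) for a1 in (0, 1))
print(constrained_argmin, constrained_min, direct_min)
```

Because the constraint is expressed as a change to a single unary term, it corresponds to one t-edge update rather than a rebuild of the graph.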

Segmentation Comparison: Grimson-Stauffer, Bathia04, our method.

Face Detector and ObjCut

Segmentation

Conclusion
- Combining pose inference and segmentation is worth investigating.
- Tracking = Detection
- Detection = Segmentation
- Tracking = Segmentation
- Segmentation = SFM??