2 Image Understanding
A long-standing goal of computer vision is understanding:
–Objects and visual patterns
–Context
–State / actions of objects
–Relations between objects
–Physical layout
–Etc.
A picture is worth a thousand words…
3 Natural Language Understanding
Very far from being solved
Even NL parsing (syntax) is problematic
Ambiguities require high-level (semantic) knowledge
5 Image Parsing Framework
Generic framework: from image I to scene interpretation S
Low-level tasks: edge detection, segmentation
High-level tasks: classification, object recognition
6 Inference
Top-down (generative): Constellation, Star-Model, etc.
–+ Consistent solutions
–- Slow
Bottom-up (discriminative): SVM, Boosting, Neural Nets, etc.
–+ Fast
–- Possibly inconsistent
"Image Parsing" combines both approaches
7 Coming up next…
Define a (monstrous) generative model for image parsing
How to perform s-l-o-w inference on such models (MCMC)
How to accelerate inference using bottom-up cues (DDMCMC)
8 Image Parsing Generative Model
–No. of regions K (uniform prior)
–Region shapes L_i and types ζ_i
–Region parameters Θ_i
10 Generic Regions
Three intensity models:
–Constant up to Gaussian noise
–Gray-level histogram
–Quadratic form (shading)
11 Faces
Use a PCA model (Eigenfaces)
Estimate covariance Σ and principal components
12 Text
Region shapes: spline templates
Allow affine transformations
Allow small deformations of the control points
Shading intensity model
13 Problem Formulation
Now we can compute p(I|S)p(S) ∝ p(S|I)
We'd like to optimize p(S|I) over the space of parse graphs
14 Optimizing P(S|I)
How about gradient methods?
–Hybrid state space: continuous & discrete
–Enormous number of local maxima
How about BP?
–Cannot compare 3 faces to 4 letters
15 Optimizing P(S|I) is not easy…
Hybrid state space: continuous & discrete
Enormous number of local maxima
Graphical model structure is not pre-determined
Rules out gradient methods
Rules out belief propagation
16 Optimize by Sampling!
Monte Carlo principle: use random samples to optimize!
–Say we're given N samples S_1,…,S_N from P(S|I)
–Given S_i, it is easy to compute P(S_i|I)
–Choose the best S_i!
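The sample-then-pick-best idea above can be sketched in a few lines. `sample_posterior` and `score` are hypothetical stand-ins: a sampler producing candidate states and an easy-to-evaluate unnormalized P(S|I); the toy usage draws integer "states" uniformly rather than from a real posterior.

```python
import random

def optimize_by_sampling(sample_posterior, score, n=1000):
    """Draw n candidate states and keep the one with the highest score.

    sample_posterior: draws one candidate state S_i
    score: evaluates P(S_i|I) up to a constant (easy given S_i)
    """
    best_state, best_score = None, float("-inf")
    for _ in range(n):
        s = sample_posterior()
        p = score(s)
        if p > best_score:
            best_state, best_score = s, p
    return best_state

# Toy usage: states are integers 0..9, score peaks at the "true" state 7.
random.seed(0)
best = optimize_by_sampling(lambda: random.randint(0, 9),
                            lambda s: -(s - 7) ** 2)
print(best)  # 7
```

The catch, which the following slides address, is obtaining the samples S_1,…,S_N from a complex P(S|I) in the first place.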
17 Detour: Sampling Methods
How to sample from a (very) complex probability space
Sampling algorithms
Why is the Markov chain in Monte Carlo?
19 Markov Chain
A sequence of random variables X_1, X_2, …
Markov property: p(X_{t+1}|X_1,…,X_t) = p(X_{t+1}|X_t)
Transition kernel K(X_{t+1}|X_t)
Given the present, the future is independent of the past
20 Markov Chain – cont.
Under certain conditions the MC converges to a unique distribution
Stationary distribution: the first (left) eigenvector of K, i.e., pK = p
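The eigenvector claim is easy to verify numerically. The 3-state row-stochastic matrix K below is a made-up example: iterating the chain (power iteration) and extracting the eigenvector of K for eigenvalue 1 give the same stationary distribution.

```python
import numpy as np

# A hypothetical 3-state transition matrix K; each row sums to 1.
K = np.array([[0.5, 0.4, 0.1],
              [0.2, 0.6, 0.2],
              [0.1, 0.3, 0.6]])

# Power iteration: repeatedly applying K drives any initial
# distribution toward the stationary one.
pi = np.ones(3) / 3
for _ in range(200):
    pi = pi @ K

# Cross-check: the left eigenvector of K with eigenvalue 1
# (i.e., the leading eigenvector of K^T), normalized to sum to 1.
vals, vecs = np.linalg.eig(K.T)
lead = np.real(vecs[:, np.argmax(np.real(vals))])
lead = lead / lead.sum()

print(np.allclose(pi, lead))  # True
```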
21 Markov Chain Monte Carlo
Reminder: had we wanted a sample from p(x), we could run the MC to stationarity and take the value of X_t
How to make p(x) the stationary distribution of the MC?
How to guarantee convergence?
22 Markov Chain Convergence
Irreducibility:
–The walk can reach any state starting from any state
Aperiodicity:
–The chain does not cycle, so the limiting distribution cannot depend on t
23 Detailed Balance
How to make p(x) stationary?
Detailed balance: p(x) K(x*|x) = p(x*) K(x|x*) (forward step = backward step)
Summing over x: Σ_x p(x) K(x*|x) = p(x*) Σ_x K(x|x*) = p(x*), since transition probabilities sum to 1, independent of x*
Written as a matrix product: pK = p
Detailed balance is a sufficient condition for the MC to converge to p(x), the same distribution p(.) for every step
24 Kernel Selection
Metropolis-Hastings kernel:
–Proposal q(x*|x): where to go next
–Acceptance α(x*|x) = min{1, [p(x*) q(x|x*)] / [p(x) q(x*|x)]}: should we go?
The MH kernel provides detailed balance
Among the ten most influential algorithms in science and engineering
26 Can we use any q(.)?
1. Easy to sample from: we sample from q(.) instead of p(.)
27 Can we use any q(.)?
2. Supports p(x): q must be positive wherever p(x) is positive
28 Can we use any q(.)?
3. Explores p(x) wisely:
–Too narrow q(.): q(x*|x) ~ N(x, 0.1)
–Too wide q(.): q(x*|x) ~ N(0, 20)
29 Can we use any q(.)?
1. Easy to sample from: we sample from q(.) instead of p(.)
2. Supports p(x)
3. Explores p(x) wisely:
–q(.) too narrow -> slow exploration
–q(.) too wide -> low acceptance
The best q(.) is p(.) itself, but we can't sample p(.) directly.
30 Combining Kernels
Suppose we have kernels K_1,…,K_n, each satisfying detailed balance with the same p(x)
Then the mixture K = Σ_i λ_i K_i (λ_i ≥ 0, Σ_i λ_i = 1) also satisfies detailed balance.
31 Combining MH Kernels
The same applies to Metropolis-Hastings kernels:
–Combine MH kernels with different proposals
–The MC will still converge to p(x)
32 Example Revisited
Proposal distribution: q(x*|x)
Acceptance: α = min{1, [p(x*) q(x|x*)] / [p(x) q(x*|x)]}
Given x, it is easy to compute p(x) up to a constant
The normalization factor cancels out in the ratio
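A minimal random-walk Metropolis sketch of the acceptance rule above. The proposal is a symmetric Gaussian step, so the q terms cancel; the target (an unnormalized standard normal) is a toy stand-in, not the image-parsing posterior, and only log p up to a constant is needed.

```python
import math
import random

def metropolis_hastings(log_p, x0, n_steps=5000, step=1.0, rng=random):
    """Random-walk Metropolis: proposal q(x*|x) = N(x, step^2) is
    symmetric, so acceptance reduces to min(1, p(x*)/p(x))."""
    x, samples = x0, []
    for _ in range(n_steps):
        x_star = x + rng.gauss(0.0, step)
        # Accept with probability min(1, p(x*)/p(x));
        # the normalization of p cancels in the ratio.
        if math.log(rng.random() + 1e-300) < log_p(x_star) - log_p(x):
            x = x_star
        samples.append(x)
    return samples

# Toy target: unnormalized standard normal, log p(x) = -x^2 / 2.
random.seed(1)
samples = metropolis_hastings(lambda x: -0.5 * x * x, x0=5.0)
mean = sum(samples[1000:]) / len(samples[1000:])
```

After discarding a burn-in, the sample mean should sit near 0 and the sample variance near 1, matching the target.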
34 MAP Estimation
Simulated annealing: converge to p(x)^{1/T}
–Explore less, exploit more!
As T -> 0, the density is peaked at the global maxima
35 Annealing – example
As T -> 0, the density is peaked at the global maxima
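The tempering idea can be sketched by running the random-walk sampler on p(x)^{1/T} while lowering T. The two-mode target below is invented for illustration (its global maximum is near x = 2); the schedule and step size are arbitrary choices, not tuned values from the paper.

```python
import math
import random

def anneal(log_p, x0, temps, steps_per_temp=500, step=1.0):
    """Sample from the tempered target p(x)^(1/T) at each T in turn;
    as T -> 0 the tempered density peaks at the global maximum."""
    x = x0
    for T in temps:
        for _ in range(steps_per_temp):
            x_star = x + random.gauss(0.0, step)
            # MH acceptance for the tempered target: ratio^(1/T).
            if math.log(random.random() + 1e-300) < (log_p(x_star) - log_p(x)) / T:
                x = x_star
    return x

def log_p(x):
    # Two modes near x = -2 and x = +2; the +0.2x tilt makes +2 global.
    return -0.25 * (x * x - 4.0) ** 2 + 0.2 * x

random.seed(2)
x_map = anneal(log_p, x0=-3.0, temps=[4.0, 2.0, 1.0, 0.5, 0.1])
# x_map ends tightly concentrated near one of the modes (x ~ +/-2),
# typically the global one.
```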
36 Dimensionality Variation in Our Space
Model selection:
–Varying number of regions
–Varying types of explanations per region
Cannot directly compare densities of different states!
37 Pairwise Common Measure
Jump across dimensions
38 Reversible Jumps
Common measure:
–Sample extensions u and u* s.t. dim(u) + dim(x) = dim(u*) + dim(x*)
–Use the common dimension for comparison via invertible deterministic functions h and h'
–Explicitly allow reversible jumps
39 MCMC Summary
Sample p(x) using a Markov chain
Proposal q(x*|x):
–Supports p(x)
–Guides the sampling
Detailed balance:
–MH kernel ensures convergence to p(x)
Reversible jumps:
–Comparing across models and dimensions
40 MCMC – Take home message
If you want to make a new sample, you should first learn how to propose.
Acceptance is random.
Eventually you'll get trapped in endless chains until you become stationary.
Some say it is better to do reversible jumps between models.
41 Back to Image Parsing
A state is a parse tree
Moves between possible parses of the image:
–Varying number of regions
–Different region types: text, face, and generic
–Varying number of parameters
42 MCMC Moves
Birth / death of a face / text region
Split / merge of a generic region
Model switching for a region
Region boundary evolution
44 Moves -> Kernel
–Text birth / text death -> text sub-kernel
–Face birth / face death -> face sub-kernel
–Split region / merge region, model switching, boundary evolution -> generic sub-kernel
Dimensionality changes: must allow reversible jumps
45 Using Bottom-up Cues
So far we haven't specified the proposal probabilities q(.)
If q(.) is uninformed of the image, convergence can be painfully slow
Solution: use the image to propose moves (e.g., the face birth kernel)
46 Data-Driven MCMC
Define proposal probabilities q(x*|x; I)
The proposal probabilities depend on discriminative tests:
–Face detection
–Text detection
–Edge detection
–Parameter clustering
A generative model with discriminative proposals
47 Face/Text Detection
Bottom-up cues: AdaBoost
–Normally a hard classification; estimate a posterior instead
–Run on sliding windows at several scales
48 Edge Map
Canny edge detection at several scales
Only these edges are candidates for split / merge
49 Parameter Clustering
Estimate likely parameter settings in the image
Cluster using mean-shift
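A minimal 1-D sketch of mean-shift mode seeking, to show the mechanism only: each point repeatedly moves to the Gaussian-kernel-weighted mean of its neighbors until it settles on a mode. The data and bandwidth are invented; the paper's actual clustering runs in the multi-dimensional region-parameter space.

```python
import math

def mean_shift_1d(points, bandwidth=1.0, iters=50):
    """Move each point to the kernel-weighted mean of all points,
    iterating until the points collapse onto density modes."""
    modes = list(points)
    for _ in range(iters):
        new_modes = []
        for m in modes:
            w = [math.exp(-0.5 * ((m - p) / bandwidth) ** 2) for p in points]
            new_modes.append(sum(wi * p for wi, p in zip(w, points)) / sum(w))
        modes = new_modes
    return modes

# Hypothetical parameter samples drawn from two clusters, near 0 and near 10.
pts = [0.0, 0.2, -0.1, 10.0, 9.8, 10.3]
modes = mean_shift_1d(pts, bandwidth=1.0)
# The first three points converge to one mode, the last three to another.
```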
50 How to Propose?
q(S*|S, I) should approximate p(S*|I)
Choose one sub-kernel at random (e.g., create face)
Use bottom-up cues to generate candidate proposals S_1, S_2, …
Weight each proposal according to p(S_i|I)
Sample from the resulting discrete distribution
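The weight-normalize-sample step above can be sketched as follows. The candidate "face boxes" and their detector scores are hypothetical stand-ins for bottom-up proposals and their p(S_i|I) weights; the function returns the chosen candidate and its proposal probability under the discrete q.

```python
import random

def propose(candidates, weight, rng=random):
    """Data-driven proposal: weight each bottom-up candidate, normalize
    the weights to a discrete distribution, and sample one candidate.
    Returns (choice, its proposal probability q)."""
    w = [weight(c) for c in candidates]
    total = sum(w)
    r = rng.random() * total
    acc = 0.0
    for c, wi in zip(candidates, w):
        acc += wi
        if r <= acc:
            return c, wi / total
    return candidates[-1], w[-1] / total  # guard against rounding

# Hypothetical: three candidate face boxes scored by a detector posterior.
random.seed(3)
boxes = ["box_a", "box_b", "box_c"]
scores = {"box_a": 0.7, "box_b": 0.2, "box_c": 0.1}
choice, q = propose(boxes, scores.get)
```

Over many draws, "box_a" is proposed roughly 70% of the time, so well-scored candidates dominate the proposals without ever being chosen deterministically.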
51 Generic Region – Split/Merge
Split/merge according to the edge map
Dimensionality change: must be reversible (S -> S')
52 Generic Region – Split/Merge
Splitting region k into i, j: S_k -> S_ij
Proposals are weighted
Normalize the weights to probabilities
Sample
53 Generic Region – Split/Merge
Splitting k into i, j: S -> S'
Proposals are weighted by region probability and parameter probability
54 Face Sub-Kernel
Adding a face: S -> S'
Take the AdaBoost proposals
Compute weights w_i = P(S'|I) / P(S|I)
Normalize the weights to probabilities
Sample
Reversible kernel: paired add-face / remove-face moves
55 Accept / Reject
We have the proposal q(S'|S; I)
Check the Metropolis-Hastings acceptance
56 Full Diagram
Discriminative (bottom-up, from the input image): text detection, face detection, edge detection, parameter clustering
Generative (top-down):
–Text sub-kernel: text birth / text death
–Face sub-kernel: face birth / face death
–Generic sub-kernel: split region / merge region, model switching, boundary evolution
66 Summary
Image parsing: decomposition into constituent visual patterns
Top-down generative model for parse graphs
Optimization using DDMCMC:
–MCMC
–Discriminative bottom-up proposals
67 References
Zhuowen Tu, Xiangrong Chen, Alan L. Yuille, Song-Chun Zhu. Image Parsing: Unifying Segmentation, Detection, and Recognition. International Journal of Computer Vision, 2005.
Z. Tu and S.C. Zhu. Image Segmentation by Data-Driven Markov Chain Monte Carlo. IEEE Trans. Pattern Analysis and Machine Intelligence, 2002.
Zhuowen Tu, Xiangrong Chen, A.L. Yuille and S.C. Zhu. Image Parsing: Unifying Segmentation, Detection, and Recognition. IEEE International Conference on Computer Vision, 2003.
C. Andrieu, N. de Freitas, A. Doucet and M.I. Jordan. An Introduction to MCMC for Machine Learning. Machine Learning, vol. 50, pp. 5-43, 2003.
69 Summary
MCMC:
–A method for sampling from very complex distributions
–The Metropolis-Hastings kernel guarantees convergence to the desired distribution
DDMCMC:
–Speeds up MCMC convergence using discriminative cues
–A unifying framework for top-down, bottom-up, discriminative, and generative methods
70 Example
Compute the posterior for a simple GMM:
–Given one x, which component of the mixture generated it?
–Exhaustive search works here; what about a larger space?
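The exhaustive computation for this toy case is just Bayes' rule over the components: P(k|x) ∝ w_k N(x; μ_k, σ_k), normalized over all k. The two-component mixture below is an invented example.

```python
import math

def component_posterior(x, weights, means, sigmas):
    """Exhaustively compute P(k|x) for a 1-D Gaussian mixture:
    P(k|x) is proportional to w_k * N(x; mu_k, sigma_k)."""
    def normal_pdf(x, mu, s):
        return math.exp(-0.5 * ((x - mu) / s) ** 2) / (s * math.sqrt(2 * math.pi))
    joint = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sigmas)]
    z = sum(joint)  # normalization: sum over all components
    return [j / z for j in joint]

# Two equal-weight components; x = 1.9 is far more likely under the
# component centered at +2 than the one centered at -2.
post = component_posterior(1.9, [0.5, 0.5], [-2.0, 2.0], [1.0, 1.0])
```

Enumerating every component is fine here, but the number of parse graphs of an image is astronomically larger, which is why sampling is needed instead.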
72 Model Selection Example
Curve fitting:
–Line: ax + by + c = 0
–2nd-order polynomial: ax^2 + bxy + cy^2 + dx + ey + f = 0
–…
73 Reversible Jumps
To calculate the acceptance, we need the reverse term q(x|x*)
For every jump x -> x* we need a corresponding reverse jump x* -> x: a reversible jump
The acceptance now also includes the Jacobian of the dimension-matching function h
74 Binarization
Extracting text boundaries
Adaptive thresholding
75 What's So Special About Text?
The information lies in the boundary:
–AdaBoost suggests the region
–Adaptive binarization refines the boundary
76 Union of Model Subspaces
How can we compare densities across dimensions? (model selection)
77 Parameter Clustering
Each cluster in the parameter set induces a saliency map (e.g., shading, gray level)
78 Generic Region – Split/Merge
Splitting k into i, j, or merging i, j into k
Proposals are weighted by: region affinity, shape prior, parameter clustering, current region probability, and current parameter probability
79 Switching a Node's Attributes
No dimensionality change
Weight the proposals by the posterior ratio, as in the face sub-kernel
80 Boundary Evolution Kernel
Does not change dimensionality
For two adjacent regions, the boundary motion is driven by:
–Log-likelihood ratio
–Changes in area
–Boundary curvature
–Deviation from control points (text)
–Brownian noise