
1 Image Parsing: Unifying Segmentation, Detection, and Recognition
Shai Bagon, Oren Boiman

2 Image Understanding
A long-standing goal of Computer Vision. Consists of understanding:
– Objects and visual patterns
– Context
– State / actions of objects
– Relations between objects
– Physical layout
– Etc.
A picture is worth a thousand words…

3 Natural Language Understanding
– Very far from being solved
– Even NL parsing (syntax) is problematic
– Ambiguities require high-level (semantic) knowledge

4 Image Parsing
Decomposition into constituent visual patterns:
– Edge detection
– Segmentation
– Object recognition

5 Image Parsing Framework
A generic framework mapping an image I to a scene description S, spanning low-level tasks (edge detection, segmentation) and high-level tasks (classification, object recognition).

6 Inference
– Top-down (generative): constellation, star-model, etc. + Consistent solutions, – slow.
– Bottom-up (discriminative): SVM, boosting, neural nets, etc. + Fast, – possibly inconsistent.
“Image Parsing” combines both approaches.

7 Coming up next…
– Define a (monstrous) generative model for image parsing
– How to perform s-l-o-w inference on such models (MCMC)
– How to accelerate inference using bottom-up cues (DDMCMC)

8 Image Parsing Generative Model
– No. of regions K
– Region shapes L_i and types ζ_i
– Region parameters Θ_i
– Uniform priors
The model relates the scene description S to the image I.


10 Generic Regions
Three intensity models:
– Constant up to Gaussian noise
– Gray-level histogram
– Quadratic form

11 Faces
– Use a PCA model (eigenfaces)
– Estimate the covariance Σ and principal components

12 Text region shapes
– Use spline templates
– Allow affine transformations
– Allow small deformations of the control points
– Shading intensity model

13 Problem Formulation
Now we can compute the joint p(S, I) = p(S) p(I|S). We'd like to optimize the posterior p(S|I) over the space of parse graphs.

14 Optimizing P(S|I)
How about gradient methods?
– Hybrid state space: continuous & discrete
– Enormous number of local maxima
How about BP?
– Cannot compare 3 faces to 4 letters

15 Optimizing P(S|I) is not easy…
– Hybrid state space: continuous & discrete – rules out gradient methods
– Enormous number of local maxima
– Graphical model structure is not pre-determined – rules out belief propagation

16 Optimize by Sampling!
Monte Carlo principle:
– Use random samples to optimize!
– Let's say we're given N samples S_1, …, S_N from P(S|I)
– Given S_i it is easy to compute P(S_i|I)
– Choose the best S_i!
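To make the principle concrete, here is a minimal Python sketch (our illustration, not the authors' code); samples and log_posterior are hypothetical stand-ins for parse samples and log P(S|I):

```python
import numpy as np

def pick_best(samples, log_posterior):
    """Return the sample S_i maximizing log P(S_i | I)."""
    scores = [log_posterior(s) for s in samples]   # easy given each S_i
    return samples[int(np.argmax(scores))]

# Toy 1-D "parse space": posterior peaked at 3.0
log_post = lambda s: -0.5 * (s - 3.0) ** 2
samples = np.random.normal(3.0, 1.0, size=1000)   # pretend these came from P(S|I)
print(pick_best(list(samples), log_post))          # close to 3.0
```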

17 Detour: Sampling methods
– How to sample from a (very) complex probability space
– The sampling algorithm
– Why is the Markov chain in Monte Carlo?

18 Example
Sample from a target distribution p(x) known only up to its normalization factor.

19 Markov Chain
– A sequence of random variables X_1, X_2, …
– Markov property: p(X_{t+1} | X_1, …, X_t) = p(X_{t+1} | X_t)
– Transition kernel K(X_{t+1} | X_t)
– Given the present, the future is independent of the past
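A small sketch of the Markov property in code, with an assumed 2-state transition matrix K (not from the slides): the next state is drawn using only the current one.

```python
import numpy as np

# K[i, j] = P(X_{t+1} = j | X_t = i); rows sum to 1
K = np.array([[0.9, 0.1],
              [0.5, 0.5]])

def simulate(K, x0, steps, seed=0):
    rng = np.random.default_rng(seed)
    x, path = x0, [x0]
    for _ in range(steps):
        x = rng.choice(len(K), p=K[x])  # depends only on the present state x
        path.append(int(x))
    return path

print(simulate(K, 0, 10))
```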

20 Markov Chain – cont.
– Under certain conditions the MC converges to a unique distribution
– Stationary distribution: the first (left) eigenvector of the transition matrix K, i.e. pK = p
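A sketch of that claim for the toy chain above: the stationary p solves pK = p, i.e. it is the eigenvector of K-transpose with eigenvalue 1.

```python
import numpy as np

K = np.array([[0.9, 0.1],
              [0.5, 0.5]])

vals, vecs = np.linalg.eig(K.T)                 # left eigenvectors of K
p = np.real(vecs[:, np.argmax(np.real(vals))])  # eigenvector for eigenvalue 1
p /= p.sum()                                    # normalize to probabilities
print(p)                                        # [0.8333..., 0.1666...]
assert np.allclose(p @ K, p)                    # pK = p: stationary
```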

21 Markov Chain Monte Carlo
Reminder: had we wanted a sample from p(x), we could run the chain and take the value of X_t after convergence.
– How to make our p(x) the stationary distribution of the MC?
– How to guarantee convergence?

22 Markov Chain convergence
– Irreducibility: the walk can reach any state starting from any state
– Aperiodicity: the stationary distribution cannot depend on t

23 Detailed Balance
p(x) K(x*|x) = p(x*) K(x|x*)
A sufficient condition for the chain to converge to p(x). How it makes p(x) stationary: summing the forward step over x,
Σ_x p(x) K(x*|x) = p(x*) Σ_x K(x|x*) = p(x*)
since the backward-step probabilities sum to 1, independent of x*. Written as a matrix product: pK = p, the same distribution p(.).
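A quick numeric check of detailed balance for the same toy chain, using its stationary distribution (a sketch, not from the slides):

```python
import numpy as np

K = np.array([[0.9, 0.1],
              [0.5, 0.5]])
p = np.array([5/6, 1/6])          # stationary distribution of K

for x in range(2):
    for x_star in range(2):
        forward = p[x] * K[x, x_star]          # forward step
        backward = p[x_star] * K[x_star, x]    # backward step
        assert np.isclose(forward, backward)
print("detailed balance holds")
```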

24 Kernel Selection
Detailed balance constrains the choice of kernel. The Metropolis-Hastings kernel:
– Proposal q(x*|x): where to go next
– Acceptance: should we go
The MH kernel provides detailed balance by construction, and is among the ten most influential algorithms in science and engineering.

25 Metropolis-Hastings
– Sample x* ~ q(x*|x_t)
– Compute the acceptance probability:
  A(x_t, x*) = min(1, [p(x*) q(x_t|x*)] / [p(x_t) q(x*|x_t)])
– If rand() < A: accept, x_{t+1} = x*; otherwise x_{t+1} = x_t
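A minimal Metropolis-Hastings sampler following these three steps (a sketch under our own choices: a bimodal unnormalized target and a Gaussian random-walk proposal, for which the q-ratio cancels):

```python
import numpy as np

def p_unnorm(x):
    # target known only up to normalization: two Gaussian bumps
    return np.exp(-0.5 * (x - 2) ** 2) + 0.5 * np.exp(-0.5 * (x + 2) ** 2)

def metropolis_hastings(n_steps, sigma=1.0, seed=0):
    rng = np.random.default_rng(seed)
    x, chain = 0.0, []
    for _ in range(n_steps):
        x_star = rng.normal(x, sigma)                 # 1. sample x* ~ q(x*|x_t)
        a = min(1.0, p_unnorm(x_star) / p_unnorm(x))  # 2. acceptance (symmetric q)
        if rng.random() < a:                          # 3. if rand() < A, accept
            x = x_star
        chain.append(x)
    return np.array(chain)

samples = metropolis_hastings(20000)
print(samples.mean())   # between the two modes, pulled toward the heavier one at +2
```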

26 Can we use any q(.)?
1. Easy to sample from: we sample from q(.) instead of p(.)

27 Can we use any q(.)?
2. Supports p(x): q(x) > 0 wherever p(x) > 0. [Figure: p(x) and q(x)]

28 Can we use any q(.)?
3. Explores p(x) wisely:
– Too narrow q(.): q(x*|x) ~ N(x, 0.1)
– Too wide q(.): q(x*|x) ~ N(0, 20)
[Figure: p(x) and q(x)]

29 Can we use any q(.)?
1. Easy to sample from: we sample from q(.) instead of p(.)
2. Supports p(x)
3. Explores p(x) wisely: q(.) too narrow -> slow exploration; q(.) too wide -> low acceptance
The best q(.) is p(.) itself – but we can't sample p(.) directly.
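A sketch of this trade-off (illustrative target and widths, not the slides' figures): measuring the acceptance rate of a random walk against a standard normal target.

```python
import numpy as np

def acceptance_rate(sigma, n=20000, seed=1):
    rng = np.random.default_rng(seed)
    p = lambda x: np.exp(-0.5 * x * x)   # standard normal, unnormalized
    x, accepted = 0.0, 0
    for _ in range(n):
        x_star = rng.normal(x, sigma)
        if rng.random() < min(1.0, p(x_star) / p(x)):
            x, accepted = x_star, accepted + 1
    return accepted / n

print(acceptance_rate(0.1))   # very high: tiny steps, slow exploration
print(acceptance_rate(20.0))  # very low: most proposals land far in the tails
```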

30 Combining Kernels
Suppose we have kernels K_1, …, K_n, each satisfying detailed balance with the same stationary distribution p(x). Then the mixture K = Σ_i λ_i K_i (λ_i ≥ 0, Σ_i λ_i = 1) also satisfies detailed balance.

31 Combining MH Kernels
The same applies to Metropolis-Hastings kernels: combining MH kernels with different proposals, the MC will still converge to p(x).
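A sketch of such a combination (our illustration): mixing a local random-walk sub-kernel with an independence sub-kernel, each targeting the same p(x).

```python
import numpy as np

def mixed_kernel_step(x, p, rng):
    """One step of a 50/50 mixture of two MH sub-kernels targeting p."""
    if rng.random() < 0.5:
        x_star = rng.normal(x, 0.5)        # local random walk (symmetric)
        q_ratio = 1.0
    else:
        x_star = rng.normal(0.0, 3.0)      # independence proposal
        q = lambda z: np.exp(-0.5 * (z / 3.0) ** 2)
        q_ratio = q(x) / q(x_star)         # q(x_t|x*) / q(x*|x_t)
    a = min(1.0, p(x_star) / p(x) * q_ratio)
    return x_star if rng.random() < a else x

rng = np.random.default_rng(0)
p = lambda x: np.exp(-0.5 * (x - 1.0) ** 2)   # target centered at 1
x, xs = 0.0, []
for _ in range(10000):
    x = mixed_kernel_step(x, p, rng)
    xs.append(x)
print(np.mean(xs))   # ≈ 1.0: the mixture still converges to p
```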

32 Example Revisited
– Proposal distribution q(x*|x)
– Acceptance: given x it is easy to compute p(x) up to normalization, and the normalization factor cancels out in the acceptance ratio.

33 Example – cont.

34 MAP Estimation
MCMC converges to p(x), but we want its maximum. Simulated annealing: sample from p(x)^{1/T} while lowering the temperature T – explore less, exploit more! As T → 0, the density is peaked at the global maxima.
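A sketch of annealing on the same kind of bimodal target (cooling schedule and step size are our illustrative choices):

```python
import numpy as np

def log_p(x):   # bimodal: global maximum near x = 2
    return np.log(np.exp(-0.5 * (x - 2) ** 2) + 0.5 * np.exp(-0.5 * (x + 2) ** 2))

def anneal(n_steps=20000, seed=0):
    rng = np.random.default_rng(seed)
    x = -2.0                                 # start in the wrong mode
    for t in range(n_steps):
        T = max(0.01, 1.0 - t / n_steps)     # linear cooling schedule
        x_star = rng.normal(x, 0.5)
        a = min(1.0, np.exp((log_p(x_star) - log_p(x)) / T))  # MH on p^{1/T}
        if rng.random() < a:
            x = x_star
    return x

print(anneal())   # ≈ 2.0: ends at the global maximum
```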

35 Annealing – example
As T → 0 the density is peaked at the global maxima.

36 Dimensionality variation in our space
– Model selection: varying number of regions, varying types of explanations per region
– Cannot directly compare densities of states with different dimensions!

37 Pair-wise common measure
To jump across dimensions, define a common measure for each pair of models.

38 Reversible Jumps
– Common measure: sample extensions u and u* s.t. dim(u) + dim(x) = dim(u*) + dim(x*)
– Compare in the common dimension using invertible deterministic functions h and h': (x*, u*) = h(x, u)
– Explicitly allow reversible jumps between x and x*
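A toy reversible-jump sketch (our illustration, not the paper's kernels): the state lives in either a 1-D or a 2-D model, and jumping up samples an extension u so the dimensions match; the map h(x, u) = (x, u) has Jacobian 1 and the q(u) factor cancels analytically.

```python
import numpy as np

# Target over the union of spaces: model 1 has weight W1 and one N(0,1)
# parameter; model 2 has weight W2 and two independent N(0,1) parameters.
W1, W2 = 1.0, 2.0

def rj_step(state, rng):
    k, theta = state
    if k == 1:
        u = rng.normal()   # dimension-matching extension u ~ N(0,1)
        # q(u) cancels against the new coordinate's N(0,1) target factor,
        # and |Jacobian| = 1, so only the model-weight ratio remains.
        new, log_a = (2, np.array([theta[0], u])), np.log(W2 / W1)
    else:
        new, log_a = (1, theta[:1]), np.log(W1 / W2)  # reverse jump: drop u
    return new if np.log(rng.random()) < log_a else state

rng = np.random.default_rng(0)
state, in_model_2 = (1, np.array([0.0])), 0
for _ in range(20000):
    state = rj_step(state, rng)
    in_model_2 += int(state[0] == 2)
print(in_model_2 / 20000)   # ≈ W2 / (W1 + W2) = 2/3
```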

39 MCMC Summary
– Sample p(x) using a Markov chain
– Proposal q(x*|x): supports p(x), guides the sampling
– Detailed balance: the MH kernel ensures convergence to p(x)
– Reversible jumps: comparing across models and dimensions

40 MCMC – Take-home message
If you want to make a new sample, you should first learn how to propose. Acceptance is random. Eventually you'll get trapped in endless chains until you become stationary. Some say it is better to do reversible jumps between models.

41 Back to image parsing
– A state is a parse tree
– Moves between possible parses of the image
– Varying number of regions
– Different region types: text, face, and generic
– Varying number of parameters

42 MCMC Moves
– Birth / death of a face / text region
– Split / merge of a generic region
– Model switching for a region
– Region boundary evolution


44 Moves -> Kernels
– Text birth / text death -> text sub-kernel
– Face birth / face death -> face sub-kernel
– Split region / merge region -> generic sub-kernel
– Model switching and boundary evolution keep the dimension fixed
Moves that change dimensionality must allow a reversible jump.

45 Using bottom-up cues
– So far we haven't specified the proposal probabilities q(.)
– If q(.) is uninformed of the image, convergence can be painfully slow
– Solution: use the image to propose moves (e.g., the face-birth kernel proposes faces where the image suggests them)

46 Data Driven MCMC
Define proposal probabilities q(x*|x; I). The proposal probabilities depend on discriminative tests:
– Face detection
– Text detection
– Edge detection
– Parameter clustering
A generative model with discriminative proposals.

47 Face/Text Detection
Bottom-up cues via AdaBoost:
– AdaBoost gives a hard classification – estimate a posterior probability instead
– Run on sliding windows at several scales

48 Edge Map
– Canny edge detection at several scales
– Only these edges are used for split / merge proposals
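A sketch of a multi-scale edge map, assuming OpenCV (cv2) is available; the blur scales, thresholds, and input filename are illustrative, not the paper's settings.

```python
import cv2
import numpy as np

def multiscale_edges(gray, sigmas=(1.0, 2.0, 4.0)):
    """Union of Canny edge maps computed at several blur scales."""
    edges = np.zeros_like(gray)
    for sigma in sigmas:
        blurred = cv2.GaussianBlur(gray, (0, 0), sigma)   # scale = blur level
        edges = cv2.bitwise_or(edges, cv2.Canny(blurred, 50, 150))
    return edges   # only these edge pixels seed split/merge proposals

gray = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE)   # hypothetical input
edge_map = multiscale_edges(gray)
```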

49 Parameter clustering
– Estimate likely parameter settings in the image
– Cluster using mean-shift
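A sketch of parameter clustering with scikit-learn's mean-shift on synthetic data, e.g. pooling locally estimated gray-level means into a few candidate parameter settings:

```python
import numpy as np
from sklearn.cluster import MeanShift

rng = np.random.default_rng(0)
# pretend these are gray-level parameters estimated from small image patches
params = np.concatenate([rng.normal(50, 3, 100),
                         rng.normal(180, 3, 100)]).reshape(-1, 1)

ms = MeanShift(bandwidth=10.0)
ms.fit(params)
print(ms.cluster_centers_)   # two clusters near 50 and 180: candidate Θ values
```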

50 How to propose?
– q(S*|S, I) should approximate p(S*|I)
– Choose one sub-kernel at random (e.g., create face)
– Use bottom-up cues to generate proposals S_1, S_2, …
– Weight each proposal according to p(S_i|I)
– Sample from the resulting discrete distribution (see the sketch below)
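A minimal sketch of that proposal step (candidates and the scoring function are hypothetical stand-ins):

```python
import numpy as np

def propose(candidates, log_posterior, rng):
    """Weight bottom-up candidates by posterior and sample one."""
    log_w = np.array([log_posterior(c) for c in candidates])
    w = np.exp(log_w - log_w.max())    # stabilized unnormalized weights
    probs = w / w.sum()                # normalize to a discrete distribution
    i = rng.choice(len(candidates), p=probs)
    return candidates[i], probs[i]     # the proposal and its q-probability

rng = np.random.default_rng(0)
cand, q_prob = propose([0.5, 1.0, 2.0],
                       lambda s: -0.5 * (s - 1.0) ** 2,  # toy log p(S_i|I)
                       rng)
print(cand, q_prob)
```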

51 Generic region – split/merge
– Split/merge according to the edge map
– Dimensionality change – must be reversible (S <-> S')

52 Generic region – split/merge
– Splitting k into i, j: S_k -> S_ij
– Proposals are weighted
– Normalize the weights to probabilities
– Sample

53 Generic region – split/merge
Splitting k into i, j (S -> S'). Suggestions are weighted by the region probability and the parameter probability.

54 Faces sub-kernel
Adding a face: S -> S'
– Take the AdaBoost proposals
– Compute weights w_i = P(S'|I) / P(S|I)
– Normalize the weights to probabilities and sample
– Reversible kernel: paired add-face / remove-face moves

55 Accept / Reject
We have the proposal q(S'|S; I); check the Metropolis-Hastings acceptance probability.

56 Full diagram
Generative side – the MCMC moves grouped into sub-kernels: text birth/death (text sub-kernel), face birth/death (face sub-kernel), split/merge region (generic sub-kernel), model switching, and boundary evolution.
Discriminative side – text detection, face detection, edge detection, and parameter clustering, all computed from the input image, drive the proposals.

57–61 Results [figures]

62 Limitations
– Scaling to a large number of objects: algorithm design complexity, convergence speed, dealing with complex objects
– Good synthesis / detection, but not-so-good segmentation

63–65 Extensions [figures]

66 Summary
– Image parsing: decomposition into constituent visual patterns
– Top-down generative model for parse graphs
– Optimization using DDMCMC: MCMC with discriminative bottom-up proposals

67 References
– Zhuowen Tu, Xiangrong Chen, Alan L. Yuille, and Song-Chun Zhu. Image Parsing: Unifying Segmentation, Detection, and Recognition. International Journal of Computer Vision, 2005.
– Zhuowen Tu and Song-Chun Zhu. Image Segmentation by Data-Driven Markov Chain Monte Carlo. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2002.
– Zhuowen Tu, Xiangrong Chen, Alan L. Yuille, and Song-Chun Zhu. Image Parsing: Unifying Segmentation, Detection, and Recognition. IEEE International Conference on Computer Vision, 2003.
– Christophe Andrieu, Nando de Freitas, Arnaud Doucet, and Michael I. Jordan. An Introduction to MCMC for Machine Learning. Machine Learning, vol. 50, pp. 5–43, Jan.–Feb. 2003.

68 Backups

69 Summary
MCMC:
– A method for sampling from very complex distributions
– The Metropolis-Hastings kernel guarantees convergence to the desired distribution
DDMCMC:
– Speeds up MCMC convergence using discriminative cues
– A unifying framework for top-down, bottom-up, discriminative, and generative methods

70 Example
Compute the posterior for a simple GMM:
– Given one observation X, which component of the mixture generated it?
– Exhaustive search works here – but what about a larger space?
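For this toy case the exhaustive computation is a one-liner of Bayes' rule (mixture settings are illustrative):

```python
import numpy as np
from scipy.stats import norm

weights = np.array([0.3, 0.7])                     # prior P(component)
means, stds = np.array([-2.0, 3.0]), np.array([1.0, 1.0])

x = 1.0                                            # one observation X
lik = norm.pdf(x, means, stds)                     # P(X | component)
posterior = weights * lik / np.sum(weights * lik)  # Bayes' rule, exhaustive
print(posterior)                                   # P(component | X), sums to 1
```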

71 Example revisited

72 Model selection example
Curve fitting:
– Line: ax + by + c = 0
– 2nd-order polynomial: ax² + bxy + cy² + dx + ey + f = 0
– …

73 Reversible Jumps
To calculate the acceptance we need the reverse term: for the jump (x, u) -> (x*, u*) = h(x, u) we need the matching reverse jump h'(x*, u*) = (x, u) – a reversible jump. The acceptance is now
α = min(1, [p(x*) q(u*)] / [p(x) q(u)] · |∂(x*, u*) / ∂(x, u)|)

74 Binarization
– Extracting text boundaries
– Adaptive thresholding

75 What's so special about text?
The information lies in the boundary:
– AdaBoost suggests the region
– Adaptive binarization refines the boundary

76 Union of model subspaces
How can we compare densities across dimensions? Model selection over a union of model subspaces.

77 Parameter clustering
Each cluster in the parameter set induces a saliency map. [Figures: shading, gray level]

78 Generic region – split/merge
Splitting k into i, j, or merging i, j into k. Suggestions are weighted by: region affinity, shape prior, parameter clustering, the current region probability, and the current parameter probability.

79 Switching a node's attributes
– No dimensionality change
– Proposals weighted by the ratio of posterior probabilities

80 Boundary Evolution Kernel
Does not change dimensionality. For two adjacent regions, the boundary motion is driven by:
– Log-likelihood ratio
– Changes in area
– Boundary curvature
– Deviation from control points (text)
– Brownian noise

