
1. Image Parsing: Unifying Segmentation, Detection, and Recognition
Shai Bagon, Oren Boiman

2. Image Understanding
A long-standing goal of Computer Vision. Consists of understanding:
–Objects and visual patterns
–Context
–State / actions of objects
–Relations between objects
–Physical layout
–Etc.
A picture is worth a thousand words…

3. Natural Language Understanding
Very far from being solved. Even NL parsing (syntax) is problematic: ambiguities require high-level (semantic) knowledge.

4. Image Parsing
Decomposition into constituent visual patterns:
–Edge detection
–Segmentation
–Object recognition

5. Image Parsing Framework
A generic framework relating the image I to its parse S, spanning low-level tasks (edge detection, segmentation) and high-level tasks (object recognition, classification).

6. Inference
–Top-down (generative): Constellation, Star-Model, etc. (+ consistent solutions, – slow)
–Bottom-up (discriminative): SVM, Boosting, Neural Nets, etc. (+ fast, – possibly inconsistent)
“Image Parsing” combines both approaches.

7. Coming up next…
–Define a (monstrous) generative model for Image Parsing
–How to perform s-l-o-w inference on such models (MCMC)
–How to accelerate inference using bottom-up cues (DDMCMC)

8. Image Parsing Generative Model
–No. of regions K
–Region shapes L_i and types ζ_i
–Region parameters Θ_i


10. Generic Regions
–Constant up to Gaussian noise
–Gray-level histogram
–Quadratic form

11. Faces
Use a PCA model (eigenfaces): estimate the covariance Σ and the principal components.

12. Text Region Shapes
–Use spline templates
–Allow affine transformations
–Allow small deformations of the control points
–Shading intensity model

13. Problem Formulation
Now we can compute the posterior p(S|I) (up to normalization). We’d like to optimize it over the space of parse graphs.

14. Optimizing P(S|I)
How about gradient methods?
–Hybrid state space: continuous & discrete
–Enormous number of local maxima
How about Belief Propagation (BP)?
–Cannot compare 3 faces to 4 letters

15. Optimizing P(S|I) is not easy…
–Hybrid state space (continuous & discrete) and an enormous number of local maxima rule out gradient methods
–The graphical model structure is not pre-determined, which rules out Belief Propagation

16. Optimize by Sampling!
The Monte Carlo principle: use random samples to optimize!
–Say we’re given N samples S_1, …, S_N from P(S|I)
–Given S_i it is easy to compute P(S_i|I)
–Choose the best S_i!

17. Detour: Sampling Methods
–How to sample from a (very) complex probability space
–A sampling algorithm
–Why is Markov Chained in Monte Carlo?

18. Example
Sample from a given target distribution p(x).

19. Markov Chain
A sequence of random variables with the Markov property: given the present, the future is independent of the past. Transitions are governed by a kernel K.

20. Markov Chain – cont.
Under certain conditions the MC converges to a unique distribution: the stationary distribution, the first eigenvector of K.
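As a concrete illustration of the “first eigenvector of K” remark, here is a minimal NumPy sketch; the 3-state transition matrix is made up for the example:

```python
import numpy as np

# Hypothetical 3-state transition matrix K; row i holds the probabilities
# of moving from state i to each state (rows sum to 1).
K = np.array([[0.5, 0.4, 0.1],
              [0.2, 0.6, 0.2],
              [0.1, 0.3, 0.6]])

# The stationary distribution pi satisfies pi K = pi: it is the left
# eigenvector of K (i.e. an eigenvector of K^T) with eigenvalue 1.
vals, vecs = np.linalg.eig(K.T)
pi = np.real(vecs[:, np.argmax(np.real(vals))])
pi = pi / pi.sum()   # normalize (also fixes the arbitrary sign)

# Sanity check: iterating the chain converges to the same distribution.
p = np.array([1.0, 0.0, 0.0])
for _ in range(1000):
    p = p @ K
assert np.allclose(p, pi)
```

Because all entries of K are positive, this chain is irreducible and aperiodic, so it converges from any starting distribution.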

21. Markov Chain Monte Carlo
Reminder: to obtain a sample from the target distribution, take the value of X_t once the chain has converged.
–How to make our target the stationary distribution of the MC?
–How to guarantee convergence?

22. Markov Chain Convergence
–Irreducibility: the walk can reach any state starting from any state
–Non-periodicity: the stationary distribution cannot depend on t

23. Detailed Balance
p(x) is a stationary distribution of kernel K if the forward step balances the backward step:
p(x) K(x*|x) = p(x*) K(x|x*)
Summing over x, the left side becomes the matrix product pK, while the right side reduces to p(x*) (the kernel sums to 1, independently of x*). So detailed balance is a sufficient condition for the chain to converge to p(x).

24. Kernel Selection
Detailed balance constrains the choice of kernel. The Metropolis-Hastings kernel:
–Proposal: where to go next
–Acceptance: should we go
The MH kernel provides detailed balance. It is among the ten most influential algorithms in science and engineering.

25. Metropolis-Hastings
–Sample x* ~ q(x*|x_t)
–Compute the acceptance probability α = min(1, [p(x*) q(x_t|x*)] / [p(x_t) q(x*|x_t)])
–If rand() < α, accept: x_{t+1} = x*; otherwise x_{t+1} = x_t
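The three steps above can be sketched in a few lines of Python. This is a generic random-walk Metropolis sampler with a symmetric proposal (so the q terms cancel), not the paper’s image-parsing kernel:

```python
import math
import random

def metropolis_hastings(log_p, proposal, x0, n_steps=10_000):
    """Generic random-walk Metropolis-Hastings sketch.

    log_p    : unnormalized log-density of the target p(x)
    proposal : draws x* given x_t (assumed symmetric, so q cancels)
    """
    x = x0
    samples = []
    for _ in range(n_steps):
        x_star = proposal(x)
        # alpha = min(1, p(x*) / p(x_t)); the normalization cancels.
        alpha = math.exp(min(0.0, log_p(x_star) - log_p(x)))
        if random.random() < alpha:
            x = x_star          # accept the proposed move
        samples.append(x)       # on rejection, x_t is repeated
    return samples

# Toy target: a standard normal, sampled via a Gaussian random walk.
random.seed(0)
chain = metropolis_hastings(
    log_p=lambda x: -0.5 * x * x,
    proposal=lambda x: x + random.gauss(0.0, 1.0),
    x0=0.0,
)
```

Note that only the ratio p(x*)/p(x_t) is needed, which is exactly why MCMC works when p(x) can be evaluated only up to a normalization constant.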

26. Can we use any q(.)?
1. It must be easy to sample from: we sample from q(.) instead of p(.).

27. Can we use any q(.)?
2. It must support p(x): its support must cover wherever p(x) > 0.

28. Can we use any q(.)?
3. It must explore p(x) wisely:
–Too narrow q(.): q(x*|x) ~ N(x, 0.1)
–Too wide q(.): q(x*|x) ~ N(0, 20)

29. Can we use any q(.)?
1. Easy to sample from: we sample from q(.) instead of p(.)
2. Supports p(x)
3. Explores p(x) wisely:
–q(.) too narrow → slow exploration
–q(.) too wide → low acceptance
The best q(.) is p(.) itself – but we can’t sample p(.) directly.

30. Combining Kernels
Suppose we have kernels K_1, …, K_n, each satisfying detailed balance with the same p(x). Then a convex combination Σ_i λ_i K_i (with Σ_i λ_i = 1) also satisfies detailed balance.
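A quick numerical check of this claim, using two hypothetical 2-state kernels that each satisfy detailed balance with the same stationary distribution (all numbers made up for the example):

```python
import numpy as np

# Target stationary distribution over two states.
pi = np.array([0.25, 0.75])

def db_kernel(a, b):
    """Build a 2-state kernel with off-diagonal probabilities a, b chosen
    so that pi[0] * K[0,1] == pi[1] * K[1,0] (detailed balance)."""
    K = np.array([[1 - a, a],
                  [b, 1 - b]])
    assert np.isclose(pi[0] * K[0, 1], pi[1] * K[1, 0])
    return K

K1 = db_kernel(0.3, 0.1)   # 0.25 * 0.3 == 0.75 * 0.1
K2 = db_kernel(0.6, 0.2)   # 0.25 * 0.6 == 0.75 * 0.2

# A mixture of the two kernels also satisfies detailed balance with pi:
# the matrix pi_i * K_ij is symmetric.
K = 0.5 * K1 + 0.5 * K2
assert np.allclose(pi[:, None] * K, (pi[:, None] * K).T)
```

Detailed balance is linear in K, which is why any convex combination of such kernels inherits it.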

31. Combining MH Kernels
The same applies to Metropolis-Hastings kernels: combining MH kernels with different proposals still yields an MC that converges to p(x).

32. Example Revisited
–Proposal distribution q
–Acceptance: given x it is easy to compute p(x) up to normalization, and the normalization factor cancels out in the acceptance ratio.

33. Example – cont.

34. MAP Estimation
The chain converges to p(x). Simulated annealing: explore less – exploit more! As the temperature T → 0, the annealed density (∝ p(x)^{1/T}) becomes peaked at the global maxima.

35. Annealing – example
As T → 0, the density becomes peaked at the global maxima.
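A minimal simulated-annealing sketch of the “explore less – exploit more” idea, on a made-up one-dimensional energy; the geometric cooling schedule is an illustrative assumption, not taken from the paper:

```python
import math
import random

def simulated_anneal(energy, proposal, x0, t0=1.0, t_min=1e-3, cool=0.999):
    """Metropolis sampling from p_T(x) ∝ exp(-energy(x) / T) while slowly
    lowering T, so the chain concentrates on low-energy states."""
    x, t = x0, t0
    best, best_e = x0, energy(x0)
    while t > t_min:
        x_star = proposal(x)
        d_e = energy(x_star) - energy(x)
        # Always accept downhill moves; uphill with probability exp(-dE/T).
        if d_e <= 0 or random.random() < math.exp(-d_e / t):
            x = x_star
            if energy(x) < best_e:
                best, best_e = x, energy(x)
        t *= cool   # geometric cooling schedule
    return best

# Toy energy with its minimum at x = 3, started far away.
random.seed(0)
x_hat = simulated_anneal(
    energy=lambda x: (x - 3.0) ** 2,
    proposal=lambda x: x + random.gauss(0.0, 0.5),
    x0=-5.0,
)
```

At high T nearly every move is accepted (exploration); as T shrinks, uphill moves become rare and the chain exploits the best mode it has found.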

36. Model Selection
Dimensionality varies in our space:
–Varying number of regions
–Varying types of explanations per region
We cannot directly compare densities of different states!

37. Pair-wise Common Measure
Needed in order to jump across dimensions.

38. Reversible Jumps
Common measure:
–Sample extensions u and u* such that dim(u) + dim(x) = dim(u*) + dim(x*)
–Use the common dimension for comparison, via invertible deterministic functions h and h’
–Explicitly allow reversible jumps x ↔ x*
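The dimension-matching idea above can be written compactly; a sketch in standard reversible-jump notation, where h and h’ are the invertible maps mentioned on the slide:

```latex
% Augment both states so the mapped spaces have equal dimension:
\dim(x) + \dim(u) \;=\; \dim(x^*) + \dim(u^*),
% and connect them by a deterministic, invertible pair of maps:
(x^*, u^*) = h(x, u), \qquad (x, u) = h'(x^*, u^*), \qquad h' = h^{-1}.
```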

39. MCMC Summary
–Sample p(x) using a Markov chain
–Proposal q(x*|x): supports p(x) and guides the sampling
–Detailed balance: the MH kernel ensures convergence to p(x)
–Reversible jumps: comparing across models and dimensions

40. MCMC – Take-home message
If you want to make a new sample,
you should first learn how to propose.
Acceptance is random.
Eventually you’ll get trapped in endless chains
until you become stationary.
Some say it is better to do reversible jumps between models.

41. Back to Image Parsing
–A state is a parse tree
–Moves between possible parses of the image
–Varying number of regions
–Different region types: text, face, and generic
–Varying number of parameters

42. MCMC Moves
–Birth / death of a face / text
–Split / merge of a generic region
–Model switching for a region
–Region boundary evolution

43. Moves → Kernel
Each of the moves above (birth / death, split / merge, model switching, boundary evolution) is implemented as an MCMC kernel.

44. Moves → Kernel
–Text sub-kernel: text birth / text death
–Face sub-kernel: face birth / face death
–Generic sub-kernel: split region / merge region
–Model switching
–Boundary evolution
Dimensionality change: must allow reversible jumps.

45. Using Bottom-Up Cues
So far we haven’t stated the proposal probabilities q(.). If q(.) is uninformed of the image, convergence can be painfully slow. Solution: use the image to propose moves (e.g., the face-birth kernel).

46. Data-Driven MCMC
Define proposal probabilities q(x*|x; I). The proposal probabilities depend on discriminative tests:
–Face detection
–Text detection
–Edge detection
–Parameter clustering
A generative model with discriminative proposals.

47. Face / Text Detection
Bottom-up cues from AdaBoost:
–AdaBoost gives a hard classification – estimate a posterior instead
–Run on sliding windows at several scales

48. Edge Map
–Canny edge detection at several scales
–Only these edges are used for split / merge proposals

49. Parameter Clustering
–Estimate likely parameter settings in the image
–Cluster using Mean-Shift

50. How to Propose?
q(S*|S, I) should approximate p(S*|I):
–Choose one sub-kernel at random (e.g., create face)
–Use bottom-up cues to generate proposals S_1, S_2, …
–Weight each proposal according to p(S_i|I)
–Sample from the resulting discrete distribution
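The weight–normalize–sample step above can be sketched as follows; the function name and the scalar `posterior` callback are illustrative, not from the paper:

```python
import random

def propose_from_bottom_up(candidates, posterior, rng=random):
    """Weight bottom-up candidate states S_i by posterior(S_i) ~ p(S_i | I),
    normalize into a discrete distribution, and sample one proposal."""
    weights = [posterior(s) for s in candidates]
    total = sum(weights)
    probs = [w / total for w in weights]
    # Inverse-CDF sampling from the discrete distribution.
    r, acc = rng.random(), 0.0
    for s, p in zip(candidates, probs):
        acc += p
        if r < acc:
            return s, p
    return candidates[-1], probs[-1]   # guard against rounding

# Toy usage: three candidate parses with unnormalized posterior scores.
random.seed(0)
s, p = propose_from_bottom_up([1, 2, 3], posterior=lambda s: float(s))
```

Because the weights are normalized per proposal set, the posterior again only needs to be known up to a constant.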

51. Generic Region – Split / Merge
–Split / merge according to the edge map
–Dimensionality change: must be reversible (S ↔ S’)

52. Generic Region – Split / Merge
Splitting k into i, j: S_k → S_ij
–Proposals are weighted
–Normalize weights to probabilities
–Sample

53. Generic Region – Split / Merge
Splitting k into i, j: S → S’. Proposals are weighted by the region probability and the parameters probability.

54. Faces Sub-Kernel
Adding a face: S → S’
–Take the AdaBoost proposals
–Compute weights w_i = P(S’|I) / P(S|I)
–Normalize the weights to a probability distribution
–Sample
A reversible kernel: add / remove face.

55. Accept / Reject
We have the proposal q(S’|S; I). Check the Metropolis-Hastings acceptance.

56. Full Diagram
Discriminative side (from the input image): text detection, face detection, edge detection, parameter clustering.
Generative side: text sub-kernel (text birth / death), face sub-kernel (face birth / death), generic sub-kernel (split / merge region), model switching, boundary evolution.

57–61. Results (image slides)

62. Limitations
–Scaling to a large number of objects: algorithm design complexity, convergence speed, dealing with complex objects
–Good synthesis / detection, but not-so-good segmentation

63–65. Extensions (image slides)

66. Summary
Image Parsing – decomposition into constituent visual patterns:
–Top-down generative model for parse graphs
–Optimization using DDMCMC: MCMC with discriminative bottom-up proposals

67. References
–Zhuowen Tu, Xiangrong Chen, Alan L. Yuille, Song-Chun Zhu. Image Parsing: Unifying Segmentation, Detection, and Recognition. International Journal of Computer Vision, 2005.
–Z. Tu and S.-C. Zhu. Image Segmentation by Data-Driven Markov Chain Monte Carlo. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2002.
–Zhuowen Tu, Xiangrong Chen, A. L. Yuille, S.-C. Zhu. Image Parsing: Unifying Segmentation, Detection, and Recognition. IEEE International Conference on Computer Vision, 2003.
–C. Andrieu, N. de Freitas, A. Doucet, M. I. Jordan. An Introduction to MCMC for Machine Learning. Machine Learning, vol. 50, Jan.–Feb. 2003.

68. Backups

69. Summary
MCMC:
–A method for sampling from very complex distributions
–The Metropolis-Hastings kernel guarantees convergence to the desired distribution
DDMCMC:
–Speeds up MCMC convergence using discriminative cues
–A unifying framework for top-down, bottom-up, discriminative, and generative methods

70. Example
Compute the posterior for a simple GMM:
–Given one X, which component of the mixture generated it?
–Exhaustive search works here – but what if the space were larger?

71. Example revisited

72. Model Selection Example
Curve fitting:
–Line: ax + by + c = 0
–2nd-order polynomial: ax² + bxy + cy² + dx + ey + f = 0
–…

73. Reversible Jumps
In order to calculate the acceptance, we need the reverse term. For a reversible jump, both the forward and reverse moves are expressed in the common, dimension-matched space, and the acceptance ratio is formed there.

74. Binarization
–Extracting text boundaries
–Adaptive thresholding

75. What’s so special about Text?
The information lies in the boundary:
–AdaBoost suggests the region
–Adaptive binarization refines the boundary

76. Union of Model Subspaces
How can we compare densities across dimensions? Model selection over a union of model subspaces.

77. Parameter Clustering
Each cluster in the parameter set induces a saliency map (e.g., shading, gray level).

78. Generic Region – Split / Merge
Splitting k into i, j, or merging i, j into k. Proposals are weighted by:
–Region affinity
–Shape prior
–Parameter clustering
–Current region probability
–Current parameters probability

79. Switching a Node’s Attributes
–No dimensionality change
–The proposals are weighted

80. Boundary Evolution Kernel
Does not change dimensionality. For two adjacent regions, the evolution is driven by:
–Log-likelihood ratio
–Changes in area
–Boundary curvature
–Deviation from control points (text)
–Brownian noise
