Grammar of Image. Zhaoyin Jia, 03-30-2009.

1 Grammar of Image Zhaoyin Jia, 03-30-2009

2 Problems
• Enormous amount of vision knowledge
• Computational complexity
• Semantic gap
…… Classification, Recognition

3 Task of image parsing

4 Objectives in this paper
• Framework for vision
• And-Or Graph
• Algorithm for this framework
• Top-down/bottom-up computation
• Generalization from small samples
• Use Monte Carlo simulation to synthesize more configurations
• Fill the semantic gap

5 Grammar
• Language: co-occurrence of s is more than chance
• Image: Parallel; T-junction
CONSTANTINOPLE

6 Formulation of grammar
• Start symbol: S
• Non-terminal nodes: V_N
• Production rules: R
• Terminal nodes: V_T

8 Formulation of grammar
• Start symbol: S
• Non-terminal nodes: V_N
• Production rules: R
• Terminal nodes: V_T
Example rules: S → NP VP, VP → VP PP, VP → V NP, ……
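
A minimal sketch of the four-part formulation above, assuming a toy English grammar. The vocabulary (`she`, `eats`, `fish`), the `Grammar` class, and the helpers `derive` and `scripted` are illustrative names that do not come from the slides.

```python
from dataclasses import dataclass


@dataclass
class Grammar:
    start: str          # start symbol S
    nonterminals: set   # V_N
    terminals: set      # V_T
    rules: dict         # R: non-terminal -> list of right-hand sides


# Hypothetical toy grammar, chosen only for illustration.
TOY = Grammar(
    start="S",
    nonterminals={"S", "NP", "VP", "V"},
    terminals={"she", "eats", "fish"},
    rules={
        "S": [("NP", "VP")],
        "VP": [("V", "NP")],
        "NP": [("she",), ("fish",)],
        "V": [("eats",)],
    },
)


def derive(grammar, symbol, pick):
    """Expand `symbol` into terminals; `pick(symbol, options)` selects a rule."""
    if symbol in grammar.terminals:
        return [symbol]
    words = []
    for child in pick(symbol, grammar.rules[symbol]):
        words.extend(derive(grammar, child, pick))
    return words


def scripted(choices):
    """Return a `pick` function that consumes `choices` whenever a rule is ambiguous."""
    it = iter(choices)

    def pick(symbol, options):
        return options[next(it)] if len(options) > 1 else options[0]

    return pick


# The first NP uses option 0 ("she"), the second NP uses option 1 ("fish").
sentence = derive(TOY, TOY.start, scripted([0, 1]))
```

Every sentence of the language is obtained by starting at `S` and applying rules until only terminals remain; the choice made at each ambiguous non-terminal is what a parser has to recover.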

11 Image grammar
• Start symbol: S
• Production rules
• Non-terminal nodes: V_N
• Terminal nodes: V_T

12 Overlapping parts/Ambiguity

13 Overlapping parts/Ambiguity
• Similar color, occlusion, etc.

14 Stochastic Context Free Grammar
• For each V_N, we have production rules:
• with a probability associated with each one:
• Probability of parsing tree:
• Probability of sentence:
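
The formulas on this slide did not survive in the transcript. As a hedged sketch, the standard SCFG definitions are assumed here: the probabilities of the rules sharing a left-hand side sum to 1, the probability of a parse tree is the product of the probabilities of the rules it uses, and the probability of a sentence is the sum over all of its parse trees. The rule set and the numbers in `RULE_PROB` are invented for illustration.

```python
from math import prod

# Hypothetical rule probabilities; rules with the same left-hand side sum to 1.
RULE_PROB = {
    ("S", ("NP", "VP")): 1.0,
    ("VP", ("V", "NP")): 0.6,
    ("VP", ("VP", "PP")): 0.4,
    ("NP", ("NP", "PP")): 0.2,
    ("NP", ("she",)): 0.3,
    ("NP", ("fish",)): 0.3,
    ("NP", ("forks",)): 0.2,
    ("PP", ("P", "NP")): 1.0,
    ("V", ("eats",)): 1.0,
    ("P", ("with",)): 1.0,
}


def tree_prob(tree):
    """Probability of a parse tree: the product of the rule probabilities it uses."""
    if isinstance(tree, str):          # a terminal contributes no rule
        return 1.0
    symbol, children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    return RULE_PROB[(symbol, rhs)] * prod(tree_prob(c) for c in children)


def np_(word):
    return ("NP", [word])


PP = ("PP", [("P", ["with"]), np_("forks")])

# Two parse trees of "she eats fish with forks": the PP attaches to the VP or to the NP.
VP_ATTACH = ("S", [np_("she"),
                   ("VP", [("VP", [("V", ["eats"]), np_("fish")]), PP])])
NP_ATTACH = ("S", [np_("she"),
                   ("VP", [("V", ["eats"]), ("NP", [np_("fish"), PP])])])

# Probability of the sentence: sum over all of its parse trees.
sentence_prob = tree_prob(VP_ATTACH) + tree_prob(NP_ATTACH)
```

The ambiguous sentence illustrates why the sentence probability is a sum: both attachments are valid derivations, and the grammar only ranks them.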

15 Stochastic Grammar with Context
• From left to right: bi-gram model (Markov chain); a sentence with n words:
• Non-local relations: tree model
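
A short sketch of the bi-gram (first-order Markov chain) model, under the usual assumption that the probability of an n-word sentence factors into the probability of the first word times the product of the transitions between consecutive words. The three-sentence corpus and the function names are hypothetical.

```python
from collections import Counter


def train_bigram(corpus):
    """Estimate start and transition probabilities by counting."""
    starts = Counter(s[0] for s in corpus)
    pairs = Counter((a, b) for s in corpus for a, b in zip(s, s[1:]))
    prev = Counter(a for s in corpus for a in s[:-1])
    p_start = {w: c / len(corpus) for w, c in starts.items()}
    p_next = {(a, b): c / prev[a] for (a, b), c in pairs.items()}
    return p_start, p_next


def sentence_prob(sentence, p_start, p_next):
    """p(w_1) times the product of p(w_{i+1} | w_i) along the sentence."""
    p = p_start.get(sentence[0], 0.0)
    for a, b in zip(sentence, sentence[1:]):
        p *= p_next.get((a, b), 0.0)
    return p


CORPUS = [
    ["she", "eats", "fish"],
    ["she", "eats", "rice"],
    ["he", "eats", "fish"],
]
P_START, P_NEXT = train_bigram(CORPUS)

# "he eats rice" never occurs in the corpus, yet the chain assigns it nonzero probability.
unseen = sentence_prob(["he", "eats", "rice"], P_START, P_NEXT)
```

The chain only sees adjacent words, which is why the slide turns to a tree model for non-local relations.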

16 New issues in Image Grammar
• Loss of “left to right” order: region adjacency graph
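
To make the region adjacency graph concrete, here is a small sketch that builds one from a segmentation label map using 4-connectivity. The label map and the function name `region_adjacency` are assumptions made for illustration; the slides do not specify how the graph is constructed.

```python
def region_adjacency(labels):
    """Map each region label to the set of labels it touches (4-connectivity)."""
    graph = {}
    rows, cols = len(labels), len(labels[0])
    for r in range(rows):
        for c in range(cols):
            a = labels[r][c]
            graph.setdefault(a, set())
            # Looking right and down is enough to visit every neighbouring pair once.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols and labels[rr][cc] != a:
                    b = labels[rr][cc]
                    graph[a].add(b)
                    graph.setdefault(b, set()).add(a)
    return graph


# Hypothetical 3x3 segmentation of a small scene.
LABELS = [
    ["sky", "sky", "sky"],
    ["tree", "sky", "house"],
    ["grass", "grass", "grass"],
]
rag = region_adjacency(LABELS)
```

Unlike words in a sentence, the regions have no natural linear order; the graph records only which regions touch.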

17 New issues in Image Grammar
• Scaling changes which terminals appear in the parse tree

18 New issues in Image Grammar
• Switch between texture and structure

19 Building the image grammar
• Visual vocabulary: primitives, sketch graph, textons…
• Relations and configurations: co-occurrence, attached, hinged, supported, occluded…
• And-Or graph representation embedding the image grammar
• Learning/testing the parse graph: find the possible inference

20 Database
• Lotus Hill Institute Dataset
• 636,748 images, 3,927,130 physical objects
• A few hundred are free
Benjamin Yao, Xiong Yang, and Song-Chun Zhu, “Introduction to a large scale general purpose ground truth dataset: methodology, annotation tool, and benchmarks,” EMMCVPR, 2007
http://www.imageparsing.com/

21 Free Data
• 6 categories, 145 subsets: Manmade Object 75, Nature Object 40, Objects in Scene 6, Transportation 9, UCLA Aerial Image 5, UIUC Sport Activity 10
• Outline & segmentation of the object
http://yoshi.cs.ucla.edu/yao/data/

22 Free Data
• 6 categories, 145 subsets: Manmade Object 75, Nature Object 40, Objects in Scene 6, Transportation 9, UCLA Aerial Image 5, UIUC Sport Activity 10
• Segmentation of a scene (street)
http://yoshi.cs.ucla.edu/yao/data/

23 Free Data
• 6 categories, 145 subsets: Manmade Object 75, Nature Object 40, Objects in Scene 6, Transportation 9, UCLA Aerial Image 5, UIUC Sport Activity 10
• Physical parts of the object
http://yoshi.cs.ucla.edu/yao/data/

24 Visual Vocabulary
• The “Lego Land”
• Language

25 Visual Vocabulary
• : function of image primitives
  : a) geometric transformation b) appearance
• : bonds between primitives

26 Visual Vocabulary
• Sketch and Texture
S. C. Zhu, Y. N. Wu, and D. B. Mumford, “Minimax entropy principle and its applications to texture modeling,” Neural Computation, vol. 9, no. 8, pp. 1627–1660, November 1997

27 Primal sketch model
Input image, sketch graph, texture pixels
C. E. Guo, S. C. Zhu, and Y. N. Wu, “Primal sketch: Integrating texture and structure,” in Proceedings of International Conference on Computer Vision, 2003.

28 Primal sketch model
C. E. Guo, S. C. Zhu, and Y. N. Wu, “Primal sketch: Integrating texture and structure,” in Proceedings of International Conference on Computer Vision, 2003.

29 High level visual vocabulary
• Cloth: collar, left/right sleeves, hands
H. Chen, Z. J. Xu, Z. Q. Liu, and S. C. Zhu, “Composite templates for cloth modeling and sketching,” in Proceedings of IEEE Conference on Pattern Recognition and Computer Vision, New York, June 2006

30 Relations and configurations
• Definition of relation: bonds, relations, structure, compatibility
• Three types of relations
  • Bonds and connections
  • Joints and junctions
  • Object interactions/semantics
• Definition of configurations:

31 Relations
• Bonds and connections: connect primitives into bigger graphs; intensity/color compatibility

32 Relations
• Joints and junctions

33 Relations
• Object interactions

34 Configuration
• Spatial layout of entities at a certain level
Primal sketch – parts – object – scene

35 Reconfigurable graphs
• Treat bonds as random variables: address nodes

36 Inference of the configuration
• Have the primal sketch of the image
• Detect the ‘T-junction’
• Simulated annealing to infer the Gestalt Law
R. X. Gao and S. C. Zhu, “From primal sketch to 2.1D sketch,” Technical Report, Lotus Hill Institute, 2006
Red dot: connect region. Black line: known edge. Green line: inferred connection.
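
The slide names simulated annealing but gives no details, so the following is only a generic sketch of the technique applied to binary bond variables (connect or do not connect two T-junction terminators). The candidate bonds in `BONDS`, their costs, the clash penalty, and the cooling schedule are all invented for illustration and are not taken from the cited report.

```python
import math
import random

# Hypothetical candidate connections: (terminator a, terminator b, cost if switched on).
BONDS = [("t1", "t2", -1.0), ("t2", "t3", -0.4), ("t3", "t4", -1.0), ("t1", "t4", 0.5)]


def energy(state, clash=2.0):
    """Sum of active bond costs plus a penalty for each terminator used more than once."""
    total, uses = 0.0, {}
    for on, (a, b, cost) in zip(state, BONDS):
        if on:
            total += cost
            for t in (a, b):
                uses[t] = uses.get(t, 0) + 1
    return total + clash * sum(1 for n in uses.values() if n > 1)


def flip_one(state):
    """All states that differ from `state` in exactly one bond variable."""
    return [state[:i] + (1 - state[i],) + state[i + 1:] for i in range(len(state))]


def anneal(energy, state, neighbours, steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Metropolis acceptance with geometric cooling; returns the best state visited."""
    rng = random.Random(seed)
    current, e_cur = state, energy(state)
    best, e_best = current, e_cur
    t = t0
    for _ in range(steps):
        cand = rng.choice(neighbours(current))
        e_cand = energy(cand)
        # Always accept improvements; accept worse moves with probability exp(-dE / t).
        if e_cand <= e_cur or rng.random() < math.exp((e_cur - e_cand) / t):
            current, e_cur = cand, e_cand
            if e_cur < e_best:
                best, e_best = current, e_cur
        t *= cooling
    return best, e_best


best_state, best_energy = anneal(energy, (0, 0, 0, 0), flip_one)
```

Treating each bond as a random variable, as the previous slide suggests, turns the grouping problem into a search over on/off assignments, which is the setting annealing is designed for.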

37 Reconfigurable graphs
Ru-Xin Gao, Tian-Fu Wu, Song-Chun Zhu, and Nong Sang, “Bayesian Inference for Layer Representation with Mixed Markov Random Field”
Source image, T-junction, inferred connection, layer extraction

38 Reconfigurable graphs R. X. Gao and S. C. Zhu, “From primal sketch to 2.1D sketch,” Technical Report, Lotus Hill Institute, 2006

39 And-Or Graph
• Parse graph of the image; pt: parse tree of vocabulary; E: relations
• Inferring the parse graph:
Z. J. Xu, L. Lin, T. F. Wu, and S. C. Zhu, “Recursive top-down/bottom-up algorithm for object recognition,” Technical Report, Lotus Hill Research Institute, 2007.

40 And-Or Graph
• Contains all the valid parse graphs
• And-node, Or-node, leaf-node
• Relations between children of an And-node
• Parse tree: assigning labels to Or-nodes
Z. J. Xu, L. Lin, T. F. Wu, and S. C. Zhu, “Recursive top-down/bottom-up algorithm for object recognition,” Technical Report, Lotus Hill Research Institute, 2007.

41 And-Or Graph
• Definition:
• image primitives
• relations at all levels
• : probability model defined on the And-Or graph
• : valid configuration of terminal nodes
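
A compact sketch of the And-Or graph idea from the two slides above: And-nodes compose all of their children, Or-nodes switch to exactly one child, and each assignment of the Or-node switches yields one parse tree whose leaves form a configuration. The "clock" vocabulary, the branch frequencies, and the function names are hypothetical; relations between the children of And-nodes are left out to keep the example short.

```python
import random

# Hypothetical And-Or graph: ("and", children), ("or", [(child, frequency), ...]), ("leaf",).
AOG = {
    "clock": ("and", ["frame", "hands"]),
    "frame": ("or", [("round", 0.6), ("square", 0.4)]),
    "hands": ("or", [("two_hands", 0.7), ("three_hands", 0.3)]),
    "round": ("leaf",),
    "square": ("leaf",),
    "two_hands": ("leaf",),
    "three_hands": ("leaf",),
}


def configurations(aog, node):
    """Yield (leaves, probability) for every parse tree rooted at `node`."""
    kind, *rest = aog[node]
    if kind == "leaf":
        yield (node,), 1.0
    elif kind == "or":                     # pick exactly one child
        for child, p in rest[0]:
            for leaves, q in configurations(aog, child):
                yield leaves, p * q
    else:                                  # "and": combine every child
        partial = [((), 1.0)]
        for child in rest[0]:
            partial = [(l1 + l2, p1 * p2)
                       for l1, p1 in partial
                       for l2, p2 in configurations(aog, child)]
        yield from partial


def sample(aog, node, rng):
    """Draw one configuration by sampling the switch variable at each Or-node."""
    kind, *rest = aog[node]
    if kind == "leaf":
        return (node,)
    if kind == "or":
        children, weights = zip(*rest[0])
        return sample(aog, rng.choices(children, weights=weights)[0], rng)
    return tuple(leaf for child in rest[0] for leaf in sample(aog, child, rng))


configs = dict(configurations(AOG, "clock"))
drawn = sample(AOG, "clock", random.Random(0))
```

Two Or-nodes with two branches each already give four configurations, which hints at how a small graph can generalize from a few training samples by synthesizing unseen combinations.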

42 Stochastic Model on And-Or graph
• Terminal (leaf) node:
• And-Or node:
• Set of links:
• Switch variable at Or-node:
• Attributes of primitives:

43 Stochastic Model on And-Or graph
• Terminal (leaf) node:
• And-Or node:
• Set of links:
• Switch variable at Or-node:
• Attributes of primitives:
SCFG: weighs the frequency at the children of Or-nodes

44 Stochastic Model on And-Or graph
• Terminal (leaf) node:
• And-Or node:
• Set of links:
• Switch variable at Or-node:
• Attributes of primitives:
Weighs the local compatibility of primitives (geometric and appearance)

45 Stochastic Model on And-Or graph
• Terminal (leaf) node:
• And-Or node:
• Set of links:
• Switch variable at Or-node:
• Attributes of primitives:
Spatial and appearance relations between primitives (parts or objects)
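
The equations on these four slides were not preserved. One plausible way to write a model with the three annotated components is a Gibbs distribution over parse graphs with three energy terms, sketched below. All symbols here are assumptions: $pg$ is a parse graph, $\omega(v)$ the switch variable at Or-node $v$, $\alpha(\cdot)$ the attributes of a primitive, $T(pg)$ the terminal nodes, $E(pg)$ the set of links, and $\lambda$ the potential functions.

```latex
p(pg) = \frac{1}{Z} \exp\{-\mathcal{E}(pg)\},
\qquad
\mathcal{E}(pg) =
  \underbrace{\sum_{v \in V^{\mathrm{or}}(pg)} \lambda_v\big(\omega(v)\big)}_{\text{SCFG: branch frequencies at Or-nodes}}
+ \underbrace{\sum_{t \in T(pg)} \lambda_t\big(\alpha(t)\big)}_{\text{local compatibility of primitives}}
+ \underbrace{\sum_{(i,j) \in E(pg)} \lambda_{ij}\big(\alpha(v_i), \alpha(v_j)\big)}_{\text{spatial and appearance relations}}
```

Dropping the second and third terms leaves a plain SCFG, which matches the annotation on slide 43.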

46 Learning And-Or Graph
• Learning the vocabulary
• Learning the relation set R, given
• Learning the parameters, given R and

47 Learning And-Or Graph
• Learning the vocabulary, and hierarchic And-Or Graph
• Learning the relation set R, given
• Learning the parameters, given R and
Discussed in the paper

48 Learning And-Or Graph
• Learning and pursuing relation set R:
• Start from the Stochastic Context Free Grammar (a)
• Learn the relations that maximally reduce the KL divergence to the observation (b-e)
Observation: Learning model:
J. Porway, Z. Y. Yao, and S. C. Zhu, “Learning an And–Or graph for modeling and recognizing object categories,” Technical Report, Department of Statistics, 2007
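
A hedged sketch of one pursuit step: for each candidate relation, compare the histogram of its statistic in the observed data with the histogram produced by the current model, and add the relation with the largest KL divergence, since matching it would reduce the gap the most. The candidate names, the histograms, and the direction of the divergence (observed relative to model) are assumptions for illustration.

```python
from math import log


def kl(p, q, eps=1e-12):
    """KL divergence between two discrete histograms defined on the same bins."""
    return sum(pi * log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


def pick_relation(observed, model):
    """Return the candidate whose observed and model histograms differ the most."""
    gains = {name: kl(observed[name], model[name]) for name in observed}
    return max(gains, key=gains.get), gains


# Hypothetical statistics: the current model (an SCFG with no relations) is uninformative.
OBSERVED = {
    "relative_position": [0.8, 0.1, 0.1],
    "relative_scale": [0.4, 0.3, 0.3],
}
MODEL = {
    "relative_position": [1 / 3, 1 / 3, 1 / 3],
    "relative_scale": [1 / 3, 1 / 3, 1 / 3],
}

best, gains = pick_relation(OBSERVED, MODEL)
```

In a full pursuit loop the model would be refit after each added relation and the histograms recomputed from fresh samples before the next selection.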

49 Learning And-Or Graph
• Learning graph parameter
• Approximating to
• Similar to texture synthesis
S. C. Zhu, Y. N. Wu, and D. B. Mumford, “Minimax entropy principle and its applications to texture modeling,” Neural Computation, vol. 9, no. 8, pp. 1627–1660, November 1997

50 Case I: Rectangle
• Nodes: Rectangle
• Two vanishing points, four edge directions
• Rules:
F. Han and S. C. Zhu, “Bottom-up/top-down image parsing by attribute graph grammar,” Proceedings of International Conference on Computer Vision, Beijing, China, 2005.

51 Case I: Rectangle
• Get the primal sketch of the scene
• Find the ‘strong’ rectangles (bottom-up, red)
• Weigh (score) different hypotheses (top-down, blue)
• Weight is the compatibility of the image with the proposed rectangle (primal sketch)
• Accept the best one
• Repeat the previous 3 steps until all the weights are small (negative)
F. Han and S. C. Zhu, “Bottom-up/top-down image parsing by attribute graph grammar,” Proceedings of International Conference on Computer Vision, Beijing, China, 2005.
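
The loop described above can be sketched as a greedy procedure: score every hypothesis, accept the best, rescore the rest given what is already explained, and stop when no remaining weight is positive. The weight used here (newly explained sketch edges minus a fixed cost) is only a stand-in for the image-compatibility score of the cited paper, and the hypothesis names and edge ids are made up.

```python
def greedy_parse(hypotheses, cost=1.5):
    """hypotheses maps a rectangle name to the set of sketch-edge ids it would explain."""
    explained, accepted = set(), []
    remaining = dict(hypotheses)
    while remaining:
        # Top-down step: rescore each hypothesis given what is already explained.
        weights = {name: len(edges - explained) - cost
                   for name, edges in remaining.items()}
        best = max(weights, key=weights.get)
        if weights[best] <= 0:          # every remaining weight is small: stop
            break
        accepted.append(best)           # accept the best hypothesis
        explained |= remaining.pop(best)
    return accepted


# Hypothetical bottom-up proposals.
PROPOSALS = {
    "A": {1, 2, 3, 4},
    "B": {3, 4, 5},
    "C": {5, 6, 7, 8},
    "D": {1, 2},
}
parsed = greedy_parse(PROPOSALS)
```

Hypotheses `B` and `D` start with positive weights but are rejected once `A` and `C` have explained their edges, which is the "explaining away" effect that rescoring after every acceptance is meant to capture.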

52 Case I: Rectangle
• Inference process

53 Case I: Rectangle
F. Han and S. C. Zhu, “Bottom-up/top-down image parsing by attribute graph grammar,” Proceedings of International Conference on Computer Vision, Beijing, China, 2005.

54 Case II: Human Cloth
• Use And-Or graph to generate a matching model
• Vocabulary (training dataset)
Matching using the And-Or Graph

55 Case II: Human Cloth
• The And-Or Graph
• Novel Configuration
H. Chen, Z. J. Xu, Z. Q. Liu, and S. C. Zhu, “Composite templates for cloth modeling and sketching,” in Proceedings of IEEE Conference on Pattern Recognition and Computer Vision, New York, June 2006.

56 Case II: Human Cloth
• Inference process
Localize the face, then estimate the parts of the body
Bottom-up: a coarse matching of the parts
Top-down: refine the matching using the relations
H. Chen, Z. J. Xu, Z. Q. Liu, and S. C. Zhu, “Composite templates for cloth modeling and sketching,” in Proceedings of IEEE Conference on Pattern Recognition and Computer Vision, New York, June 2006.

57 Case II: Human Cloth
• Inference result
H. Chen, Z. J. Xu, Z. Q. Liu, and S. C. Zhu, “Composite templates for cloth modeling and sketching,” in Proceedings of IEEE Conference on Pattern Recognition and Computer Vision, New York, June 2006.

58 Case II: Human Cloth
• Inference result
Hands are not exactly the same: find the best match in the dataset
H. Chen, Z. J. Xu, Z. Q. Liu, and S. C. Zhu, “Composite templates for cloth modeling and sketching,” in Proceedings of IEEE Conference on Pattern Recognition and Computer Vision, New York, June 2006.

59 Case III: Recognition
Z. J. Xu, L. Lin, T. F. Wu, and S. C. Zhu, “Recursive top-down/bottom-up algorithm for object recognition,” Technical Report, Lotus Hill Research Institute, 2007.

60 Conclusion
• Enormous amount of vision knowledge: (And-Or graph) ……

61 Conclusion
• Computational complexity:
• Remains open for scheduling the bottom-up/top-down procedure
• Semantic gap
• Learning the And-Or Graph
• Learning the vocabulary, and its attributes
After all, we are not supposed to define so many things: ideal vision words: what we have now:

62 Thank you Zhaoyin Jia

