1 Graphical Models in Vision. Alan L. Yuille. UCLA. Dept. Statistics

2 The Purpose of Vision. “To Know What is Where by Looking”. Aristotle (384-322 BC). Information processing: receive a signal carried by light rays and decode the information it contains. Vision appears deceptively simple, but there is more to Vision than meets the Eye.

3 Ames Room

4 Perspective.

5 What are Humans Ideal for? Clearly humans are not good at determining the size of objects in images – at least for these types of stimuli. But they are good at determining context and taking contextual cues into account – i.e. using perspective cues to estimate depth and making adjustments. What reasoning/statistical tasks are humans ideal for?

6 Brightness of Patterns: Adelson (MIT)

7 Visual Illusions The perceived brightness of a surface, or the perceived length of a line, depends on context, not on basic measurements such as the number of photons that reach the eye or the length of the line in the image.

8 Vision is ill-posed. Vision is ill-posed – the data in the retina is not sufficient to unambiguously determine the visual scene. Vision is possible because we have prior knowledge about visual scenes. Even simple perception is an act of creation.

9 Perception as Inference Helmholtz. 1821-1894. “Perception as Unconscious Inference”.

10 Ball in a Box. (D. Kersten)

11 How Hard is Vision? The Human Brain devotes an enormous amount of resources to vision. (I) Optic nerve is the biggest nerve in the body. (II) Roughly half of the neurons in the cortex are involved in vision (van Essen). If intelligence is proportional to neural activity, then vision requires more intelligence than mathematics or chess.

12 Vision and the Brain

13 Half the Cortex does Vision

14 Vision and Artificial Intelligence The hardness of vision became clearer when the Artificial Intelligence community tried to design computer programs to do vision. In the 1960s, AI researchers thought that vision was “low-level” and easy. Prof. Marvin Minsky (a pioneer of AI) asked a student to solve vision as a summer project.

15 Chess and Face Detection Artificial Intelligence Community preferred Chess to Vision. By the mid-90’s Chess programs could beat the world champion Kasparov. But computers could not find faces in images.

16 Man and Machine. David Marr (1945-1980). Three Levels of explanation: 1. Computational Level / Information Processing. 2. Algorithmic Level. 3. Hardware: neurons versus silicon chips. Claim: Man and Machine are similar at Level 1.

17 Vision: Decoding Images

18 Vision as Probabilistic Inference Represent the World by S. Represent the Image by I. Goal: decode I and infer S. Model image formation by a likelihood function (generative model) P(I|S). Model our knowledge of the world by a prior P(S).

19 Bayes Theorem Bayes’ Theorem states that we should infer the world S from the image I by P(S|I) = P(I|S)P(S)/P(I). Rev. T. Bayes (1702-1761).

20 Bayes to Infer S from I P(I|S) is the likelihood function; P(S) is the prior.
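
A minimal sketch of this decoding rule, using a toy two-hypothesis problem; the scene labels, likelihoods, and prior below are invented for illustration, not values from the slides:

```python
# Sketch of P(S|I) = P(I|S) P(S) / P(I) over a discrete set of scenes.
# All probabilities below are made up for illustration.

def posterior(likelihood, prior):
    """Return P(S|I) for every scene S, given P(I|S) and P(S)."""
    joint = {s: likelihood[s] * prior[s] for s in prior}   # P(I|S) P(S)
    evidence = sum(joint.values())                         # P(I) = sum_S P(I|S) P(S)
    return {s: joint[s] / evidence for s in joint}

likelihood = {"face": 0.30, "background": 0.05}   # P(I|S): how well each scene explains I
prior      = {"face": 0.01, "background": 0.99}   # P(S): faces are rare in images

post = posterior(likelihood, prior)
print(post)                                       # posterior probabilities
print(max(post, key=post.get))                    # MAP decoding of the scene
```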

21 Ambiguity and Complexity of Images. Similar objects give rise to very different images. Different objects can cause similar images.

22 Ideal Observers The Image of a cylinder is consistent with multiple objects and viewpoints. The likelihood is ambiguous (concave or convex). The prior resolves the ambiguity by biasing towards convex objects viewed from above.
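
The convex/concave ambiguity can be made concrete with a tiny numerical sketch (the numbers are invented): when the likelihood barely distinguishes the two interpretations, the posterior simply follows the prior bias towards convex objects viewed from above.

```python
# Illustrative ideal-observer calculation; the probabilities are made up.
likelihood = {"convex": 0.5, "concave": 0.5}   # P(I|S): the image itself is ambiguous
prior      = {"convex": 0.7, "concave": 0.3}   # P(S): bias towards convex, viewed from above

evidence = sum(likelihood[s] * prior[s] for s in prior)             # P(I)
posterior = {s: likelihood[s] * prior[s] / evidence for s in prior}
print(posterior)   # {'convex': 0.7, 'concave': 0.3} -> the prior resolves the ambiguity
```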

23 Influence Graphs and Visual Tasks

24 A Simple Taxonomy of Graphs.

25 Examples of Vision Tasks Visual Inference: (1) Estimating Shape. (2) Segmenting Images. (3) Detecting Faces. (4) Detecting and Reading Text. (5) Parsing the full image – detect and recognize all objects in the image, understand the viewed scene.

26 Segmentation (Level Sets)
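
As a rough illustration (not the full level-set machinery: no curvature regularisation and no re-initialisation), the sketch below evolves a level-set function on a synthetic image using only a two-region, Chan–Vese-style competition term; the test image, step size, and iteration count are arbitrary choices.

```python
import numpy as np

# Synthetic test image: bright square on a dark, noisy background.
rng = np.random.default_rng(0)
I = rng.normal(0.2, 0.05, (64, 64))
I[20:44, 20:44] += 0.6

# Level-set function: positive inside an initial circle, negative outside.
Y, X = np.mgrid[0:64, 0:64]
phi = (20.0 - np.sqrt((X - 32.0)**2 + (Y - 32.0)**2)) / 20.0

dt = 0.5
for _ in range(300):
    inside = phi > 0
    c1 = I[inside].mean()                  # mean intensity inside the contour
    c2 = I[~inside].mean()                 # mean intensity outside the contour
    force = -(I - c1)**2 + (I - c2)**2     # region-competition term only
    phi += dt * force

segmentation = phi > 0                     # final foreground mask
print("foreground pixels:", int(segmentation.sum()))   # roughly 24*24 = 576
```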

27

28 Analysis by Synthesis Invert the generation process to parse the image. Probabilistic Grammars for image generation (week 2).
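
A toy sketch of the analysis-by-synthesis idea (the 1-D "image", generative model, and noise level are invented for illustration): propose candidate scene descriptions, synthesize the image each would produce, and keep the one that best explains the observation under the likelihood.

```python
import numpy as np

def render(position, width=5, length=40):
    """Toy generative model: synthesize a 1-D image of a bar at `position`."""
    image = np.zeros(length)
    image[position:position + width] = 1.0
    return image

rng = np.random.default_rng(1)
true_position = 12
observed = render(true_position) + rng.normal(0.0, 0.3, 40)   # noisy observation

def log_posterior(position, I, sigma=0.3):
    # Gaussian log-likelihood of I given the synthesized image, plus a uniform prior.
    return -np.sum((I - render(position))**2) / (2.0 * sigma**2)

best = max(range(0, 36), key=lambda p: log_posterior(p, observed))
print("inferred position:", best, "| true position:", true_position)
```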

29 Probabilistic Grammars for Images (I) Images are generated by composing visual patterns. (II) Parse an image by decomposing it into patterns.
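
The generation direction can be sketched with a toy stochastic grammar (the production rules, probabilities, and pattern shapes are invented, not the grammar referred to on the slide): sampling from the grammar composes patterns into an image, and parsing would invert the process to recover the derivation.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy grammar (illustrative only):
#   SCENE  -> OBJECT SCENE  (prob 0.6)  |  empty  (prob 0.4)
#   OBJECT -> SQUARE (prob 0.5)  |  BAR (prob 0.5)

def sample_object(canvas):
    r, c = rng.integers(0, 24, size=2)
    if rng.random() < 0.5:
        canvas[r:r + 8, c:c + 8] = 1.0       # draw a SQUARE pattern
        return ("SQUARE", int(r), int(c))
    canvas[r:r + 2, c:c + 16] = 1.0          # draw a BAR pattern
    return ("BAR", int(r), int(c))

def sample_scene():
    canvas = np.zeros((32, 32))
    derivation = []
    while rng.random() < 0.6:                # SCENE -> OBJECT SCENE
        derivation.append(sample_object(canvas))
    return canvas, derivation                # the image and its parse

image, parse = sample_scene()
print("sampled parse:", parse)
```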

30 Generative Models for Patterns Examples of images synthesized from generative models (MCMC).
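
A minimal sketch of MCMC synthesis from a generative model, assuming a simple Ising-style MRF prior rather than the richer models behind the slide's examples; the grid size, coupling strength, and number of sweeps are arbitrary choices.

```python
import numpy as np

# Gibbs sampling from P(I) ∝ exp(beta * sum over neighbouring pixels of I_s * I_t),
# a toy binary-texture model; the parameters below are illustrative.
rng = np.random.default_rng(3)
N, beta, sweeps = 32, 0.8, 100
I = rng.choice([-1, 1], size=(N, N))

for _ in range(sweeps):
    for r in range(N):
        for c in range(N):
            s = (I[(r - 1) % N, c] + I[(r + 1) % N, c]        # sum of the four
                 + I[r, (c - 1) % N] + I[r, (c + 1) % N])      # neighbours (wrap-around)
            p_plus = 1.0 / (1.0 + np.exp(-2.0 * beta * s))     # P(I_rc = +1 | neighbours)
            I[r, c] = 1 if rng.random() < p_plus else -1

print("fraction of +1 pixels:", float((I == 1).mean()))        # synthesized texture statistic
```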

31

32

33 Shape Inference

34 Face and Text Detection.

35 Text Detection

36 Towards Full Image Parsing The image genome project (Zhu). Attempt to determine the grammar for images by interactive parsing of images. Thereby learn the statistical regularities of images – the priors and the representations.

37 Parse graph with horizontal relations

38 Example: street scene

39

40 Database

41 Back to the Brain Top level: compare human performance to Ideal Observers. Explain human perceptual biases (visual illusions) as strategies that are “statistically effective”.

42 Brain Architecture The Bayesian models have interesting analogies to the brain: generative models and analysis by synthesis. Is this consistent with top-down processing? (Kersten’s talk next week).

43 Conclusion Vision is unconscious inference. The Bayesian approach leads to vision as analysis by synthesis -- inverting the image generation process. This requires “sophisticated” priors about the statistics of natural images. This can be formulated mathematically in terms of Probabilistic Grammars for image formation. These grammars can be learnt by analysing the “sophisticated” statistics of natural images.

