© 2009 Robert Hecht-Nielsen. All rights reserved. 1 Andrew Smith University of California, San Diego 10.14.09 Building a Visual Hierarchy.


1 Building a Visual Hierarchy
Andrew Smith, University of California, San Diego, 10.14.09

2 Outline
- Building a visual hierarchy
- Learning layer by layer
- Inference: filling in a missing segment of an image
- Examples
- Applications/products and future work

3 Choosing an appropriate problem
- We want to:
  - Model human visual processes.
  - Understand vision in terms of Confabulation Theory.
  - Build practical applications.
  - Lay the basis for much deeper research.
- Answer:
  - Build an image modeling system.
  - Represent images in terms of textural components (low statistical order).
  - Represent images as symbolic (discrete) tuples.

4 Machine Vision vs. Biological Vision
- Machine vision
  - Pixels: a local representation.
  - Orthogonal.
- Biological vision
  - Filter/feature responses.
  - Massively overcomplete and non-orthogonal.

5 Confabulation & vision (Pixels → Modules & Symbols)
- Features (symbols) in a layer of the hierarchy develop from the input patterns that the layer commonly sees.
- Knowledge links are simple conditional probabilities between symbols:
  - p(α | β), where α and β are symbols in connected modules.
- All knowledge can therefore be learned by simple co-occurrence counting:
  - p(α | β) = C(α, β) / C(β)
- Confabulation operation:
  - Given evidence α, β, γ, δ, find the answer ε that maximizes p(α | ε) p(β | ε) p(γ | ε) p(δ | ε).
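The counting and maximization described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the author's code: the function names (`learn`, `link`, `confabulate`), the toy symbols, and the choice to keep one count table per evidence slot are all assumptions made for the example.

```python
from collections import Counter, defaultdict

def learn(observations):
    """Count co-occurrences from (evidence, answer) pairs.

    `evidence` is a tuple of symbols, one per input module (hypothetical layout).
    """
    joint = defaultdict(Counter)   # joint[k][(a, e)] = C(a, e) for evidence slot k
    answer_counts = Counter()      # answer_counts[e] = C(e)
    for evidence, answer in observations:
        answer_counts[answer] += 1
        for k, a in enumerate(evidence):
            joint[k][(a, answer)] += 1
    return joint, answer_counts

def link(joint, answer_counts, k, a, e):
    """Knowledge link p(a | e) = C(a, e) / C(e) for evidence slot k."""
    return joint[k][(a, e)] / answer_counts[e]

def confabulate(joint, answer_counts, evidence):
    """Return the answer e that maximizes the product of p(a_k | e)."""
    def score(e):
        product = 1.0
        for k, a in enumerate(evidence):
            product *= link(joint, answer_counts, k, a, e)
        return product
    return max(answer_counts, key=score)

# Toy training data: two evidence modules, two possible answers.
observations = [
    (("x", "y"), "A"),
    (("x", "y"), "A"),
    (("x", "z"), "B"),
    (("w", "z"), "B"),
]
joint, answer_counts = learn(observations)
print(confabulate(joint, answer_counts, ("x", "z")))
```

With real data the product would usually be accumulated as a sum of logarithms to avoid underflow; the plain product is kept here to mirror the slide.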

6 Building a vision hierarchy
- Can no longer use SSE to evaluate the model [SSE maximizes p(ε | α, β, γ)].
- Instead, make use of the generative model:
  - Always be able to generate a plausible image.

7 Data set
- 4,300 natural images, 1.5 Mpix each (BW).

8 Vision Hierarchy – level “0”
- We know the first transformation from neuroscience research: simple cells approximate Gabor filters.
- 5 scales, 16 orientations (odd + even).
- Parameters picked to closely resemble feline simple cells.
- The same approach is used elsewhere in the lab [Minnett et al.].
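A filter bank of this shape can be sketched as follows. The slide does not give the actual parameters, so everything numeric here is an assumption: 16 filters per scale are read as 8 orientations times an even (cosine) and odd (sine) phase, scales are spaced half an octave apart, and the envelope is isotropic. The names `gabor_kernel` and `gabor_bank` are hypothetical.

```python
import math

def gabor_kernel(size, wavelength, theta, phase, sigma):
    """Return a size x size Gabor kernel as a list of rows (size must be odd).

    phase = 0 gives the even (cosine) filter, phase = pi/2 the odd (sine) filter.
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the carrier runs along direction theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr * xr + yr * yr) / (2.0 * sigma * sigma))
            carrier = math.cos(2.0 * math.pi * xr / wavelength - phase)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel

def gabor_bank(n_scales=5, n_orientations=8, base_wavelength=4.0):
    """Build (scale_index, kernel) pairs: n_orientations * 2 filters per scale."""
    bank = []
    for s in range(n_scales):
        wavelength = base_wavelength * 2.0 ** (s / 2.0)  # assumed half-octave spacing
        sigma = 0.5 * wavelength                         # assumed bandwidth
        size = 2 * int(3 * sigma) + 1                    # cover about 3 sigma each side
        for o in range(n_orientations):
            theta = math.pi * o / n_orientations
            for phase in (0.0, math.pi / 2.0):           # even, then odd
                bank.append((s, gabor_kernel(size, wavelength, theta, phase, sigma)))
    return bank

bank = gabor_bank()
print(len(bank), "filters,", len(bank) // 5, "per scale")
```

Convolving an image with every kernel at every pixel would give the "full convolution" representation discussed on the following slides.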

9 Vision Hierarchy – level “0”
- Does the full convolution preserve the information in images (when inverted by least squares)? Very closely.

10 Vision Hierarchy – level “0”
- We can do even better by super-sampling an image before encoding.

11 Vision Hierarchy – level “0”
Supersampling RMSE:
  1x: 0.0202
  2x: 0.0081
  3x: 0.0051
  4x: 0.0044
  5x: 0.0038

12 Inverting Gabor Representations
- Studied by Daugman.
  - Simple cells (found in the 1950s) re-represent “pixel” data; they were first characterized by Daugman as Gabor logons in the 1980s.
  - Attempted to answer “How much information is lost?”
  - “Not much!” Able to completely reconstruct images (i.e., what we have just seen in the previous few slides).
- Frame analysis can show:
  - When complete inversion is possible (this can be proven mathematically).
  - The optimal linear inverse.
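The frame-analysis statement can be written out as a standard least-squares formulation. The notation is assumed here and does not come from the slides: F is the matrix whose rows are the sampled Gabor filters, x is the image as a vector, c is the vector of filter responses, and A, B are the frame bounds.

```latex
c = F x, \qquad
A \,\|x\|^2 \;\le\; \|F x\|^2 \;\le\; B \,\|x\|^2
\quad \text{for all } x,\ \text{with } 0 < A \le B < \infty .

\hat{x} \;=\; \arg\min_{z} \|F z - c\|^2
\;=\; (F^{\top} F)^{-1} F^{\top} c \;=\; F^{+} c .
```

When the lower bound A is strictly positive, F has full column rank, F^T F is invertible, and the least-squares estimate recovers x exactly from noise-free responses. The pseudo-inverse F^+ is then the optimal linear inverse in the least-squares sense.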

13 Vision Hierarchy – level 1
- We now have a simple-cell-like representation.
- How to create a symbolic representation (“complex cells”)?
- Apply a principle of Confabulation Theory: collect common sets of inputs from simple cells, similar to a vector quantizer.
- Keep the 5 scales separate (quantize 16 dimensions, not 80).

14 Vision Hierarchy – level 1
- To create actual symbols, we use a vector quantizer.
  - Trade-offs (set by the quantizer threshold):
    - Number of symbols
    - Preservation of information
    - Probability accuracy
- Solution: use an angular distance metric (dot product).
  - Keep only symbols that occurred in the training set more than 200 times, to get an accurate p(α).
  - After training, ~95% of samples should be within threshold of at least one symbol.
  - Pick a threshold so that images can be plausibly generated.
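One way to realize a thresholded quantizer with an angular metric is sketched below. The slides do not say how codewords are chosen, so the online "add a new codeword when nothing is close enough" rule is an assumption, as are the function names and the toy 2-D data (the real vectors would have 16 dimensions and the count cut-off would be 200).

```python
import math

def normalize(v):
    """Scale v to unit length; return None for the zero vector."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else None

def best_match(codebook, u):
    """Return (index, cosine) of the codeword with the largest dot product."""
    best_i, best_cos = None, -1.0
    for i, c in enumerate(codebook):
        cos = sum(a * b for a, b in zip(c, u))
        if cos > best_cos:
            best_i, best_cos = i, cos
    return best_i, best_cos

def train_quantizer(samples, cos_threshold, min_count):
    """Grow a codebook online, then keep codewords seen more than min_count times."""
    codebook, counts = [], []
    for v in samples:
        u = normalize(v)
        if u is None:
            continue
        i, cos = best_match(codebook, u)
        if i is not None and cos >= cos_threshold:
            counts[i] += 1
        else:
            codebook.append(u)
            counts.append(1)
    return [c for c, n in zip(codebook, counts) if n > min_count]

def quantize(codebook, v, cos_threshold):
    """Return the symbol index for v, or None if no codeword is within threshold."""
    u = normalize(v)
    if u is None:
        return None
    i, cos = best_match(codebook, u)
    return i if i is not None and cos >= cos_threshold else None

def coverage(codebook, samples, cos_threshold):
    """Fraction of samples that fall within threshold of at least one symbol."""
    hits = sum(quantize(codebook, v, cos_threshold) is not None for v in samples)
    return hits / len(samples)

samples = [[1, 0]] * 3 + [[0, 1]] * 3 + [[-1, 0]]
codebook = train_quantizer(samples, cos_threshold=0.9, min_count=2)
print(len(codebook), coverage(codebook, samples, cos_threshold=0.9))
```

Because every vector is normalized before comparison, this metric depends only on direction and discards magnitude.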

15 Vision Hierarchy – level 1
- Oops! Ignoring wavelet magnitude makes all “texture features” equally prominent.

16 Vision Hierarchy – level 1
- The symbolic representation can generate plausible images.
- A theory of animal vision that actually demonstrates that animals can see!

17 Vision Hierarchy – level 1
- ~8,000 symbols are learned for each of the 5 scales.
- Complex local features develop (unlike PCA re-representations and ICA representations).

18 Vision Hierarchy – level 1
- Now the image is re-represented as 5 “planes” of symbols.

19 Knowledge links
- Learn which symbols may be next to which symbols (conditional probabilities).
- Learn which symbols may be over/under which symbols.
- Go out to ‘radius’ 7.
- Consistent with cortical representation of knowledge.
- Very large set of knowledge (10s of GB).
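Learning the lateral links amounts to counting which symbol pairs occur at each relative offset. The sketch below covers a single symbol plane only; links between scales ("over/under") would need an extra scale index in the keys. The data layout and the names `learn_lateral_links` and `link_probability` are assumptions for illustration.

```python
from collections import Counter

def learn_lateral_links(plane, radius):
    """Count symbol pairs at every offset within `radius` on one symbol plane.

    plane: 2-D list of symbol ids.
    Returns pair_counts[(dy, dx, source, target)] and
            source_counts[(dy, dx, source)].
    """
    pair_counts, source_counts = Counter(), Counter()
    height, width = len(plane), len(plane[0])
    for y in range(height):
        for x in range(width):
            source = plane[y][x]
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    if (dy or dx) and 0 <= yy < height and 0 <= xx < width:
                        pair_counts[(dy, dx, source, plane[yy][xx])] += 1
                        source_counts[(dy, dx, source)] += 1
    return pair_counts, source_counts

def link_probability(pair_counts, source_counts, dy, dx, source, target):
    """p(target at offset (dy, dx) | source) from the co-occurrence counts."""
    n = source_counts[(dy, dx, source)]
    return pair_counts[(dy, dx, source, target)] / n if n else 0.0

plane = [[0, 1],
         [0, 1]]
pair_counts, source_counts = learn_lateral_links(plane, radius=1)
print(link_probability(pair_counts, source_counts, 0, 1, 0, 1))
```

Normalizing by a per-offset source count, rather than the overall symbol count, avoids underestimating probabilities for symbols that sit near the image border. With about 8,000 symbols per plane and a radius of 7, the number of possible (offset, source, target) keys grows very quickly, which is in line with the slide's remark about the size of the knowledge base.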

20 Texture modeling – (inference)
- What if a portion of our image symbol representation is damaged?
  - Blind spot
  - CCD defect
  - Brain lesion
- We can use confabulation (generation) to infer a plausible replacement.

21 Texture modeling – Inference 1
- Fill in the missing region by confabulating from lateral and different-scale neighbors (radius 5).
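A toy version of this fill-in step is sketched below, on one symbol plane with lateral neighbors only. Several choices are assumptions and not taken from the slides: missing cells are marked `None`, they are filled greedily in raster order, scores are sums of log probabilities, and zero probabilities are floored at a small constant. The function names are hypothetical.

```python
from collections import Counter
from math import log

def learn_links(plane, radius):
    """Count (offset, source, target) co-occurrences on one symbol plane."""
    pairs, sources = Counter(), Counter()
    h, w = len(plane), len(plane[0])
    for y in range(h):
        for x in range(w):
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    if (dy or dx) and 0 <= yy < h and 0 <= xx < w:
                        pairs[(dy, dx, plane[y][x], plane[yy][xx])] += 1
                        sources[(dy, dx, plane[y][x])] += 1
    return pairs, sources

def score(grid, pairs, sources, y, x, candidate, radius, floor):
    """Sum of log p(neighbor | candidate) over the known neighbors of (y, x)."""
    h, w = len(grid), len(grid[0])
    total = 0.0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            yy, xx = y + dy, x + dx
            if (dy or dx) and 0 <= yy < h and 0 <= xx < w:
                neighbor = grid[yy][xx]
                if neighbor is None:
                    continue  # unknown neighbors contribute no evidence
                n = sources[(dy, dx, candidate)]
                p = pairs[(dy, dx, candidate, neighbor)] / n if n else 0.0
                total += log(max(p, floor))
    return total

def fill_missing(plane, pairs, sources, symbols, radius, floor=1e-6):
    """Return a copy of `plane` with each None cell replaced by the best symbol."""
    grid = [row[:] for row in plane]
    for y in range(len(grid)):
        for x in range(len(grid[0])):
            if grid[y][x] is None:
                grid[y][x] = max(
                    symbols,
                    key=lambda e: score(grid, pairs, sources, y, x, e, radius, floor),
                )
    return grid

# Toy example: vertical stripes with a 2x2 hole in the middle.
truth = [[0, 1, 0, 1, 0, 1] for _ in range(6)]
pairs, sources = learn_links(truth, radius=1)
damaged = [row[:] for row in truth]
for y in (2, 3):
    for x in (2, 3):
        damaged[y][x] = None
restored = fill_missing(damaged, pairs, sources, symbols=[0, 1], radius=1)
print(restored == truth)
```

Here the links are learned from the same image that is later damaged, which keeps the example short; a real system would learn them from a separate training set.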

22 Texture modeling

23 Texture modeling

24 Texture modeling

25 More Examples 1/7 (find the replacements)

26 More Examples 1/7 (replacement locations)

27 More Examples 2/7 (find the replacements)

28 More Examples 2/7 (replacement locations)

29 More Examples 3/7 (find the replacements)

30 More Examples 3/7 (replacement locations)

31 More Examples 4/7 (find the replacements)

32 More Examples 4/7 (replacement locations)

33 More Examples 5/7 (find the replacements)

34 More Examples 5/7 (replacement locations)

35 More Examples 6/7 (find the replacements)

36 More Examples 6/7 (replacement locations)

37 Texture modeling
- Conclusions
  - This visual hierarchy does an excellent job of capturing an image up to a certain order of complexity.
  - Given this visual hierarchy and its learned knowledge links, missing regions could plausibly be filled in. This could be a reasonable explanation for what animals do.
- Preparing for publication (IEEE Transactions on Image Processing), with help from Professor Serge Belongie (CSE).
- Last hurdle to graduation!

38 Texture modeling – Inference 2
- Super-resolution:
  - If we have a low-resolution image, can we confabulate (generate) a high-resolution version?
  - “Space out” the symbols, and confabulate values for the new neighbors.
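The "spacing out" step might look like the sketch below: known symbols are placed on a coarser lattice of a larger grid, and the new positions are left empty for confabulation to fill. The `None` marker, the integer `factor`, and the function name `space_out` are assumptions for illustration.

```python
def space_out(plane, factor=2):
    """Place each symbol of `plane` on a grid `factor` times larger.

    Original symbols land at coordinates (y * factor, x * factor);
    every other cell is None, to be inferred from its neighbors.
    """
    height, width = len(plane), len(plane[0])
    out = [[None] * (width * factor) for _ in range(height * factor)]
    for y in range(height):
        for x in range(width):
            out[y * factor][x * factor] = plane[y][x]
    return out

low_res = [[1, 2],
           [3, 4]]
for row in space_out(low_res):
    print(row)
```

The empty cells would then be treated like a damaged region and filled from the known neighbors, using knowledge links learned at the target resolution.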

39 Texture modeling

40 Texture modeling

41 Texture modeling
- Super-resolution: conclusions
  - Having learned the statistics of natural images, this hierarchy can use its generative properties to confabulate (generate) plausible high-resolution versions of its input.

42 References

43 Applications
- DVD → HD “upconversion” (exists in current DVD players)
  - Intelligent Pixel Creation (super-resolution)
  - Intelligent Frame Interpolation (increasing frame rates)
- Imagine an online ONR service available to all US govt. agencies...
  - Generating high-resolution images from damaged, low-resolution ones (in specific contexts).
  - Analyzing surveillance data.
- Low-resolution video → High-resolution image

44 The next level…
- Level 2 symbol hierarchy:
  - Collect commonly recurring regions of level 1 symbols.
  - Symbols at Level 2 will fit together like puzzle pieces.
- Thank you!
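Collecting commonly recurring regions could start from a simple patch count, as sketched below. The slide describes future work and gives no details, so the square patch shape, the `min_count` cut-off, and the name `frequent_regions` are all assumptions.

```python
from collections import Counter

def frequent_regions(plane, size, min_count):
    """Count every size x size patch of level-1 symbols; keep the frequent ones."""
    counts = Counter()
    height, width = len(plane), len(plane[0])
    for y in range(height - size + 1):
        for x in range(width - size + 1):
            patch = tuple(tuple(plane[y + i][x:x + size]) for i in range(size))
            counts[patch] += 1
    return {patch: n for patch, n in counts.items() if n >= min_count}

plane = [[0, 1, 0, 1],
         [0, 1, 0, 1],
         [0, 1, 0, 1]]
regions = frequent_regions(plane, size=2, min_count=2)
for patch, n in regions.items():
    print(patch, n)
```

Each surviving patch would be a candidate level 2 symbol.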

