Computer Vision, Part 2: Object recognition and scene “understanding”

1 Computer Vision, Part 2: Object recognition and scene “understanding”

2 What makes object recognition a hard task for computers?

3 HMAX. Riesenhuber, M. & Poggio, T. (1999), “Hierarchical Models of Object Recognition in Cortex”; Serre, T., Wolf, L., Bileschi, S., Riesenhuber, M., and Poggio, T. (2006), “Robust Object Recognition with Cortex-Like Mechanisms”. HMAX: a hierarchical neural-network model of object recognition. Meant to model human vision at the level of the “immediate recognition” capabilities of the ventral visual pathway, independent of attention or other top-down processes. Also called the “Standard Model” (because it incorporates the “standard model” of visual cortex). Inspired by the earlier “Neocognitron” model of Fukushima (1980).

4 General ideas behind the model: “Immediate” visual processing is feedforward and hierarchical: low levels detect simple features, which are combined hierarchically into increasingly complex features to be detected. Layers of the hierarchy alternate between “sensitivity” (to detecting features) and “invariance” (to position, scale, orientation). Size of receptive fields increases along the hierarchy. Degree of invariance increases along the hierarchy.

5 The HMAX model for object recognition (Riesenhuber, Poggio, Serre, et al.)

12 The HMAX model for object recognition (Riesenhuber, Poggio, Serre, et al.). Image (gray-scale) → S1 layer: edge detectors → C1 layer: max over local S1 units → S2 layer: prototypes (small image patches) → C2 layer: max activation over each prototype → Classification layer: object or image classification. Layers alternate between “specificity” and “invariance” over position, scale, orientation. The job of HMAX is to produce a higher-level representation of an image that will be useful for classification.

13 Image (gray-scale) → S1 layer: edge detectors, 4 orientations, 16 scales

14 One S1 receptive field (figure), and so on for each of the 16 scales.
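The slides describe S1 only as edge detectors at 4 orientations and 16 scales. A minimal sketch of such a filter bank follows, assuming Gabor filters as the edge detectors. The odd filter sizes 7 to 37, the aspect ratio `gamma`, and the `sigma` and `lam` scaling rules are illustrative assumptions, not values taken from the slides.

```python
import math

def gabor_kernel(size, theta, gamma=0.3):
    """Return a size x size Gabor kernel (list of rows) at orientation theta (radians)."""
    sigma = 0.4 * size   # assumed: envelope width grows with filter size
    lam = 0.5 * size     # assumed: wavelength grows with filter size
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the filter responds to edges at angle theta.
            x0 = x * math.cos(theta) + y * math.sin(theta)
            y0 = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(x0 * x0 + gamma * gamma * y0 * y0) / (2 * sigma * sigma))
            row.append(envelope * math.cos(2 * math.pi * x0 / lam))
        kernel.append(row)
    return kernel

ORIENTATIONS = [k * math.pi / 4 for k in range(4)]  # 0, 45, 90, 135 degrees
SIZES = list(range(7, 39, 2))                       # 16 odd sizes (assumed values)

def s1_filter_bank():
    """One kernel per (size, orientation index): 16 scales x 4 orientations."""
    return {
        (size, k): gabor_kernel(size, theta)
        for size in SIZES
        for k, theta in enumerate(ORIENTATIONS)
    }
```

Convolving the gray-scale image with each kernel would give one S1 response map per scale and orientation.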

15 Image (gray-scale) → S1 layer: edge detectors, 4 orientations, 16 scales → MAX → C1 layer: max activation over local S1 units (local position, scale), 4 orientations, 8 scales
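The C1 step above can be sketched as a max over a small spatial neighbourhood and over two adjacent S1 scales, which is one way 16 scales could reduce to 8. This is a simplified illustration: the non-overlapping `pool` x `pool` grid and the pairing of adjacent scales are assumptions, and both input maps are assumed to have the same shape and orientation.

```python
def c1_pool(map_a, map_b, pool=2):
    """Max over a pool x pool neighbourhood and over two adjacent-scale S1 maps."""
    rows, cols = len(map_a), len(map_a[0])
    pooled = []
    for r in range(0, rows, pool):
        out_row = []
        for c in range(0, cols, pool):
            # Collect every S1 value in this neighbourhood from both scales.
            values = [
                m[i][j]
                for m in (map_a, map_b)
                for i in range(r, min(r + pool, rows))
                for j in range(c, min(c + pool, cols))
            ]
            out_row.append(max(values))
        pooled.append(out_row)
    return pooled
```

Because only the maximum survives, a feature that shifts slightly in position or scale still produces the same C1 value, which is the source of the invariance.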

18 C1 layer: max activation over local S1 units (local position, scale), 4 orientations, 8 scales → S2 layer: calculate similarity to prototype (radial basis function), 4 orientations, 8 scales … Prototypes: ~1000, chosen from an image collection and translated to C1 features. S2 unit: calculate similarity to prototype for each “pooled” position in the C1 layer. Similarity is measured with a radial basis function.
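The formula for the radial basis function is not reproduced in this transcript. A minimal sketch, assuming the common Gaussian form r = exp(-beta * ||X - P||^2), where X is a patch of C1 values, P is a stored prototype, and `beta` is a hypothetical sharpness parameter:

```python
import math

def s2_response(patch, prototype, beta=1.0):
    """Gaussian RBF similarity: exp(-beta * squared Euclidean distance)."""
    sq_dist = sum((x - p) ** 2 for x, p in zip(patch, prototype))
    return math.exp(-beta * sq_dist)

def s2_map(patches, prototype, beta=1.0):
    """Similarity of one prototype to the C1 patch at every pooled position."""
    return [s2_response(patch, prototype, beta) for patch in patches]
```

The response is 1 when the C1 patch matches the prototype exactly and falls toward 0 as the patch moves away from it.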

19 S2 layer: calculate similarity to prototype (radial basis function), 4 orientations, 8 scales … → C2 layer: max activation over position, orientation, scale. Each S2 prototype map (S2 1, S2 2, …) → MAX (1 value each).
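The C2 step reduces all S2 responses for a prototype to a single number. A minimal sketch, assuming the S2 responses are stored as one list of 2D maps per prototype (this layout is an assumption made for illustration):

```python
def c2_features(s2_maps):
    """s2_maps: one entry per prototype, each a list of 2D response maps.

    Returns one value per prototype: the maximum response over all maps
    and all positions.
    """
    features = []
    for maps_for_prototype in s2_maps:
        best = max(v for band in maps_for_prototype for row in band for v in row)
        features.append(best)
    return features
```

With about 1000 prototypes this yields a feature vector of about 1000 values per image, regardless of image size.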

20 C2 layer: max over position, orientation, scale [… .32] → Support Vector Machine classification (e.g., dog / not dog)
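The classification step can be sketched as a decision function applied to the C2 feature vector. The slides do not say which kernel the SVM uses, so a linear SVM is assumed here, and `weights` and `bias` stand in for parameters that training would produce.

```python
def svm_decision(features, weights, bias):
    """Signed score of a linear SVM: w . x + b."""
    return sum(w * x for w, x in zip(weights, features)) + bias

def classify(features, weights, bias):
    """Positive scores fall on the 'dog' side of the separating hyperplane."""
    return "dog" if svm_decision(features, weights, bias) > 0 else "not dog"
```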

21 Streetscenes “scene understanding” system (Bileschi, 2006). Uses HMAX + SVM to identify object classes: car, pedestrian, bicycle, building, tree.

22 How Streetscenes works (Bileschi, 2006): 1. Densely tile the image with windows of different sizes. 2. C1 and C2 features are computed in each window. 3. The features in each window are given as input to each of five trained support vector machines. 4. If any returns a classification with a score above a learned threshold, that object is said to be “detected”. …
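The four steps above can be sketched as a sliding-window loop. This is an illustration under stated assumptions, not Bileschi's implementation: square windows, the `stride_fraction` tiling rule, and the `extract_features`, `classifiers`, and `thresholds` arguments are all hypothetical stand-ins for the HMAX feature computation, the five trained SVMs, and their learned thresholds.

```python
def tile_windows(width, height, sizes, stride_fraction=0.5):
    """Yield (x, y, size) for square windows densely tiling a width x height image."""
    for size in sizes:
        stride = max(1, int(size * stride_fraction))
        for y in range(0, height - size + 1, stride):
            for x in range(0, width - size + 1, stride):
                yield (x, y, size)

def detect(image_size, sizes, extract_features, classifiers, thresholds):
    """Return (label, window, score) for each window scoring above the label's threshold."""
    width, height = image_size
    detections = []
    for window in tile_windows(width, height, sizes):
        features = extract_features(window)          # step 2: C1/C2 features
        for label, score_fn in classifiers.items():  # step 3: one SVM per class
            score = score_fn(features)
            if score > thresholds[label]:            # step 4: learned threshold
                detections.append((label, window, score))
    return detections
```

A call might pass `classifiers` as a dict with one scoring function for each of car, pedestrian, bicycle, building, and tree.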

23 Object detection (here, “car”) with HMAX model (Bileschi, 2006)

24 Sample of results from HMAX model (Serre et al., 2006)

