Biologically Motivated Computer Vision

Presentation on theme: "Biologically Motivated Computer Vision"— Presentation transcript:

1 Biologically Motivated Computer Vision
Digital Image Processing Sumitha Balasuriya Department of Computing Science, University of Glasgow

2 General Vision Problem
Machine vision has been very successful in finding solutions to specific, well-constrained problems such as optical character recognition or fingerprint recognition. In fact, machine vision has surpassed human vision in many such closed-domain tasks. However, it is only in biology that we find systems that can handle unconstrained, diverse vision problems. How can a biological or machine system that captures only two-dimensional visual information from a view of a cluttered field even attempt to reason about and function in the environment? An accurate, detailed spatial model of the environment is difficult to compute, and the whole problem of scene analysis is ill-posed. A problem is well-posed if (1) a solution exists, (2) the solution is unique, and (3) the solution depends continuously on the initial data (the stability property).

3 Ill-posed problem? Several possible solutions exist
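A small illustration of why recovering a scene from a single 2D view violates the uniqueness condition. This is only a sketch under an assumed pinhole camera model; the function name `project` and the example points are hypothetical and not taken from the slides.

```python
def project(point, focal_length=1.0):
    """Pinhole projection of a 3D point (X, Y, Z) onto the 2D image plane."""
    x, y, z = point
    return (focal_length * x / z, focal_length * y / z)


# Two different 3D points on the same viewing ray (assumed example values).
near = (1.0, 2.0, 4.0)
far = (2.0, 4.0, 8.0)  # twice as far away along the same ray

# Both land on the same image point, so the image alone cannot tell them apart.
same_image_point = project(near) == project(far)
print(project(near), project(far), same_image_point)
```

Depth is lost in the projection, so extra constraints (prior knowledge, other modalities) are needed to pick one of the possible solutions.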

4 The general vision problem isn’t really solved in biology …
For example, I can't build an accurate spatial world model of the scene I look at ... Biological systems have evolved to process visual data to extract just enough information to perform the reasoning for everyday tasks that are part of survival. Visual information is combined with higher-level knowledge and other sensory modalities, which constrain the reasoning in the solution space and finally make vision possible.

5 Visual cortex and a bit more …
Direct feedback projections to V1 originate from: V2 (complex features); V3 (orientation, motion, depth); V4 (colour, attention); MT (motion); MST (motion); FEF (saccades, spatial memory); LIP (saccade planning); IT (recognition). [Diagram: lower visual cortex receiving feedback from higher cortical areas: frontal cortex → V2, V4, FEF, IT → V1; face → features → V1]

6 Held and Hein, 1963
Newborn kittens were placed in a carousel: one moved actively, the other was passively towed along. Both received the same stimulation, but only the actively moving kitten received visual stimulation that resulted from its own movements. Only the active kitten developed sensory-motor coordination.

7 Conventional Computer Vision Architecture
[Diagram: Input → Feature Extraction → Classification, Recognition, Disparity → Action / Output]

8 The Future - Biologically Motivated Computer Vision Architecture
[Diagram: hierarchical processing rising from the input to more abstract features / symbols (e.g. square, triangle), with feedforward, feedback and lateral processing, plus other modalities. Optical illusions example: is there a square, triangle or circle?]

9 Biologically Motivated Computer Vision Architectures in action
Simple colour cues. Foveated sensors. Also: Learnt arm control, Learn how to act on objects

10 Biologically Inspired features
Machine vision and biological vision systems process similar information (visual scenes) and perform similar tasks (recognition, targeting). Not surprisingly, the optimal features extracted by many machine vision systems look surprisingly like those found in biology. But first ...

11 Why bother with feature extraction?
Why not use the actual image/video itself for reasoning/analysis? INVARIANCE! The information we extract (i.e. the features) from the ‘entity’ must be insensitive to changes. The extracted features might be invariant to rotation and scaling of objects in images, lighting conditions, partial occlusions
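A toy sketch of what invariance buys us, assuming a tiny greyscale image stored as nested lists. The helper names (`rotate90`, `raw_pixels`, `intensity_histogram`) are hypothetical, and an intensity histogram is just one simple invariant feature chosen for illustration, not a feature the slides propose.

```python
from collections import Counter


def rotate90(image):
    """Rotate a 2D list-of-lists image by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def raw_pixels(image):
    """Flatten the image row by row: the 'actual image' used as a feature."""
    return [p for row in image for p in row]


def intensity_histogram(image):
    """Count how often each intensity occurs, ignoring pixel positions."""
    return Counter(raw_pixels(image))


image = [[0, 0, 9],
         [0, 5, 9],
         [0, 0, 9]]
rotated = rotate90(image)

print(raw_pixels(image) == raw_pixels(rotated))                    # False
print(intensity_histogram(image) == intensity_histogram(rotated))  # True
```

The raw pixel vector changes under rotation while the histogram does not, but the histogram also discards all spatial layout, which is the invariance versus descriptiveness trade-off.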

12 What features should we extract?
Depends ... Modality (video / image / audio ...). Task (e.g. topic categorisation / face recognition / audio compression). Dimensionality reduction / sparsification. Invariance vs descriptiveness: if the features are too descriptive, they can't generalise to new examples; if they generalise too much, everything looks just about the same. As the feature we extract becomes more complex / descriptive, it also becomes less invariant to even minor changes in the entity that we are measuring.

13 Human visual pathway Inspiration for feature extraction methodology
Receptive field: the area in the FOV in which stimulation leads to a response in the neuron. Retinal ganglion receptive fields are circularly symmetric. Cortical simple cell receptive fields are orientated (similar to a Gabor filter).

14 Gabor filter
A function f(t) can be decomposed into cosine (even) and sine (odd) functions. These are good for defining periodic structures, but they are not localised. There is an uncertainty relation between a signal's specificity in time and in frequency. Dennis Gabor defined a family of signals that optimises this trade-off, which enables us to extract local features. Daugman (1995) defined a 2D filter based on the above, which was called a Gabor filter. These filters resemble cortical simple cells.
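For reference, the time-frequency uncertainty relation and the kind of signal that meets it with equality can be written as below. The notation is an assumption made here: Δt and Δf are the root-mean-square duration and bandwidth of the signal, and t_0, f_0 and σ are the centre time, centre frequency and envelope width.

```latex
\Delta t \,\Delta f \;\ge\; \frac{1}{4\pi},
\qquad
\psi(t) = \exp\!\left(-\frac{(t - t_0)^2}{2\sigma^2}\right)
          \exp\!\left(j\,2\pi f_0 t\right)
```

The Gaussian-windowed complex sinusoid ψ(t) attains the lower bound; its real part is the even (cosine) component and its imaginary part is the odd (sine) component.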

15 Gabor filter
Localise the sine and cosine functions using a Gaussian envelope with standard deviation σ (assuming a symmetric Gaussian envelope). In the Fourier domain the Gabor is a Gaussian centred about the central frequency (U, V). The orientation of the Gabor in the spatial domain is θ = arctan(V / U). The envelope multiplied by the modulating cosine gives the even symmetric cosine Gabor wavelet; multiplied by the modulating sine it gives the odd symmetric sine Gabor wavelet.
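A minimal sketch of the even and odd Gabor wavelets with a symmetric Gaussian envelope. Assumptions made here, not stated on the slide: the envelope is exp(-(x² + y²) / (2σ²)), the carrier phase is 2π(Ux + Vy) with (U, V) in cycles per pixel, the kernel size is odd, and no normalisation is applied. The function name `gabor_kernels` is hypothetical.

```python
import math


def gabor_kernels(size, sigma, u, v):
    """Return (even, odd) Gabor kernels as size x size nested lists.

    size  : odd kernel width in pixels (assumed odd so there is a centre pixel)
    sigma : standard deviation of the symmetric Gaussian envelope, in pixels
    u, v  : central frequency (U, V) in cycles per pixel
    """
    half = size // 2
    even, odd = [], []
    for y in range(-half, half + 1):
        even_row, odd_row = [], []
        for x in range(-half, half + 1):
            envelope = math.exp(-(x * x + y * y) / (2.0 * sigma ** 2))
            phase = 2.0 * math.pi * (u * x + v * y)
            even_row.append(envelope * math.cos(phase))  # cosine: even symmetric
            odd_row.append(envelope * math.sin(phase))   # sine: odd symmetric
        even.append(even_row)
        odd.append(odd_row)
    return even, odd


even, odd = gabor_kernels(size=9, sigma=2.0, u=0.25, v=0.0)
orientation = math.atan2(0.0, 0.25)  # orientation of the wave, arctan(V / U)
```

With V = 0 the wave propagates along the x axis, so `orientation` is zero and the kernels vary only through the envelope along y.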

16 Spatial Frequency Bandwidth
The spectral (Fourier) bandwidth is measured at the half-power point. The bandwidth depends on the symmetric Gaussian envelope's sigma. A large sigma results in a narrow bandwidth, as the Gabor filter then filters almost exactly at its central frequency. Also, due to the uncertainty relation, a narrow frequency bandwidth results in reduced spatial localisation by the filter. [Figure: spatial filter profiles and frequency responses for wide-bandwidth and narrow-bandwidth even symmetric cosine and odd symmetric sine Gabor wavelets]
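A hedged worked example of how sigma sets the bandwidth. Assuming the envelope exp(-x² / (2σ²)), its spectrum is proportional to exp(-2π²σ²(f - f0)²), so the power falls to half its peak at |f - f0| = sqrt(ln 2) / (2πσ). The function names below are hypothetical, and other definitions of "half power" or of the envelope would change the constant.

```python
import math


def half_power_bandwidth(sigma):
    """Full width (cycles per pixel) of the power spectrum at half its peak."""
    return math.sqrt(math.log(2.0)) / (math.pi * sigma)


def relative_power(sigma, offset):
    """Power at `offset` from the central frequency, relative to the peak."""
    return math.exp(-4.0 * math.pi ** 2 * sigma ** 2 * offset ** 2)


narrow = half_power_bandwidth(sigma=4.0)  # large sigma: narrow bandwidth
wide = half_power_bandwidth(sigma=1.0)    # small sigma: wide bandwidth
print(narrow, wide, relative_power(4.0, narrow / 2.0))
```

The bandwidth is inversely proportional to sigma, which is the uncertainty trade-off in miniature: a filter that is sharper in frequency is broader in space.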

17 Gabor filter with asymmetric Gaussian
However, the Gabor's Gaussian envelope need not be circularly symmetric! An elliptical spatial Gaussian envelope lets us control orientation bandwidth. A better formulation uses an asymmetric Gaussian envelope, expressed in the spatial domain and in the spectral (Fourier) domain along the direction of wave propagation, with parameters: fo = central frequency; θ = angle; γ = sigma in the direction of propagation; η = sigma perpendicular to the direction of propagation.
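A sketch of a Gabor with an elliptical envelope, evaluated at a single point. It assumes γ and η are plain standard deviations (as the parameter list above describes them), a carrier exp(j 2π fo x') along the rotated axis x', and no normalisation; the exact formulation on the original slide may differ. The function name `asymmetric_gabor` is hypothetical.

```python
import cmath
import math


def asymmetric_gabor(x, y, f0, theta, gamma, eta):
    """Complex Gabor value at (x, y) with an elliptical Gaussian envelope.

    f0    : central frequency in cycles per pixel
    theta : direction of wave propagation in radians
    gamma : sigma along the direction of propagation
    eta   : sigma perpendicular to the direction of propagation
    """
    # Rotate coordinates so that xr runs along the direction of propagation.
    xr = x * math.cos(theta) + y * math.sin(theta)
    yr = -x * math.sin(theta) + y * math.cos(theta)
    envelope = math.exp(-(xr ** 2 / (2.0 * gamma ** 2)
                          + yr ** 2 / (2.0 * eta ** 2)))
    carrier = cmath.exp(2j * math.pi * f0 * xr)
    return envelope * carrier  # real part: even Gabor, imaginary part: odd Gabor


centre = asymmetric_gabor(0.0, 0.0, f0=0.25, theta=0.0, gamma=2.0, eta=4.0)
along = asymmetric_gabor(1.0, 0.0, f0=0.25, theta=0.0, gamma=2.0, eta=4.0)
across = asymmetric_gabor(0.0, 3.0, f0=0.25, theta=0.0, gamma=2.0, eta=4.0)
```

Making η larger than γ stretches the envelope perpendicular to the wave, which narrows the orientation bandwidth without changing the spatial frequency bandwidth.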

18 Bandwidth of Gabor with asymmetric Gaussian
Half-power points are found along the direction of wave propagation and perpendicular to the direction of wave propagation. These give the spatial bandwidth in the direction of wave propagation and the spatial bandwidth perpendicular to wave propagation.

19 Orientation Bandwidth
Orientation bandwidth is related to the number of orientations we want to extract. The half-power points of the filters should coincide in the spectral domain. The bandwidth follows from the number k of orientated filters in the filter bank and the redundancy in orientation sampling, using the arc length l = rθ for small θ. [Figure: (u, v) frequency plane showing the half-power contour, spatial frequency bandwidth and orientation bandwidth Δθ of a filter centred at ωo]
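One possible reading of the requirement that half-power points coincide, sketched under explicit assumptions: the k filters are spaced evenly over 180 degrees, so neighbours are π / k apart, and the small-angle arc length l = rθ at radius ωo approximates the spectral extent each filter must cover perpendicular to the wave direction. No redundancy factor is modelled, and the function name `orientation_bank` is hypothetical.

```python
import math


def orientation_bank(k, centre_frequency):
    """Evenly spaced orientations for k filters, with the implied arc length.

    Returns (angles, spacing, arc_length), where spacing is the angular
    separation of neighbouring filters and arc_length = r * theta with
    r = centre_frequency and theta = spacing (small-angle approximation).
    """
    spacing = math.pi / k
    angles = [i * spacing for i in range(k)]
    arc_length = centre_frequency * spacing
    return angles, spacing, arc_length


angles, spacing, arc_length = orientation_bank(k=4, centre_frequency=0.25)
print([round(math.degrees(a)) for a in angles], spacing, arc_length)
```

With more orientations the spacing shrinks, so each filter needs a narrower orientation bandwidth and therefore a spatial envelope that is more elongated perpendicular to the wave.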

20 Orientation Bandwidth
[Figure: filter bank shown in the spatial domain and in the frequency domain (u, v), with half-power contours, spatial frequency bandwidth and orientation bandwidth Δθ about ωo]

21 Only using the even symmetric component in the filter bank
Hypercolumn: experiments by Hubel and Wiesel (1962, 1968). A set of orientation-selective units over a common patch of the FOV, organised as a vertical column in the visual cortex. In a computational system, use the information in the hypercolumn for higher-level reasoning. Only the even symmetric component of the filter bank is used to build the feature vector.
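A minimal sketch of a hypercolumn feature vector: the responses of k even symmetric Gabor filters, at evenly spaced orientations, to one image patch. The function names, the symmetric envelope, the parameter defaults and the synthetic grating are all assumptions made for illustration.

```python
import math


def even_gabor(size, sigma, f0, theta):
    """Even symmetric (cosine) Gabor kernel as a size x size nested list."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            xr = x * math.cos(theta) + y * math.sin(theta)
            envelope = math.exp(-(x * x + y * y) / (2.0 * sigma ** 2))
            row.append(envelope * math.cos(2.0 * math.pi * f0 * xr))
        kernel.append(row)
    return kernel


def hypercolumn(patch, k, sigma=2.0, f0=0.25):
    """Responses of k orientated even Gabor filters to one square patch."""
    size = len(patch)
    responses = []
    for i in range(k):
        kernel = even_gabor(size, sigma, f0, i * math.pi / k)
        responses.append(sum(kernel[r][c] * patch[r][c]
                             for r in range(size) for c in range(size)))
    return responses


# Synthetic stimulus: a cosine grating whose intensity varies along x only.
size = 9
patch = [[math.cos(2.0 * math.pi * 0.25 * (c - size // 2)) for c in range(size)]
         for _ in range(size)]
features = hypercolumn(patch, k=4)
preferred = max(range(len(features)), key=lambda i: features[i])
print(features, preferred)
```

The unit whose orientation matches the grating responds most strongly, so the position of the peak in the vector encodes the stimulus orientation.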

22 Properties of the hypercolumn feature vector
Invariance to rotation of the stimulation in the image plane. [Figure: stimulation, even symmetric detector, and the resulting hypercolumn responses]

23 Cycle to canonical orientation
Invariance to rotation of the stimulation in the image plane: cycle the responses in the feature vector.
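Cycling the responses can be sketched as a circular shift of the feature vector. Taking the strongest response as the canonical orientation is an assumption made here for illustration; the function name `cycle_to_canonical` and the example values are hypothetical.

```python
def cycle_to_canonical(responses):
    """Circularly shift the vector so that the strongest response comes first."""
    peak = max(range(len(responses)), key=lambda i: responses[i])
    return responses[peak:] + responses[:peak]


# The same stimulus seen at two rotations, one orientation step apart.
original = [0.1, 0.9, 0.3, 0.2]
rotated_stimulus = [0.2, 0.1, 0.9, 0.3]

print(cycle_to_canonical(original))
print(cycle_to_canonical(rotated_stimulus))
```

Both vectors map to the same canonical form, so a rotation by a multiple of the orientation spacing no longer changes the feature vector.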

24 Properties of the hypercolumn feature vector
Invariance to scaling (i.e. spatial frequency). [Figure: stimulation and responses across central frequency]

25 Scale Invariant Feature Transform
Pandemonium model (Selfridge, 1959!). Build ever more complex / abstract features along the hierarchy. Aggregate hypercolumn feature vectors into a complex feature.

26 SIFT features
Rotate the hypercolumn features to the canonical orientation of a large support region. Rotate the descriptor to the canonical orientation of the large support region. [Diagram: hypercolumn features combined into a complex feature vector]

27 Recognition
Extract SIFT features at corner locations (Harris corner detector) and at scale-space peaks. [Figure: training and recognition examples]

28 Recap Biologically motivated computer vision architecture
Feedforward, feedback and lateral processing in the architecture. Hierarchical processing. Feature extraction provides information about entities that is (somewhat!) invariant to changes. Gabor filter. Hypercolumn feature vector. SIFT features.

29 The End
