Presentation on theme: "Biologically Motivated Computer Vision"— Presentation transcript:
1 Biologically Motivated Computer Vision. Digital Image Processing. Sumitha Balasuriya, Department of Computing Science, University of Glasgow
2 General Vision Problem. Machine vision has been very successful in finding solutions to specific, well-constrained problems such as optical character recognition or fingerprint recognition. In fact, machine vision has surpassed human vision in many such closed-domain tasks. However, it is only in biology that we find systems that can handle unconstrained, diverse vision problems. How can a biological or machine system that captures only two-dimensional visual information from a view of a cluttered field even attempt to reason about and function in its environment? An accurate, detailed spatial model of the environment is difficult to compute, and the whole problem of scene analysis is ill-posed. A problem is well-posed if (1) a solution exists, (2) the solution is unique, and (3) the solution depends continuously on the initial data (the stability property).
3 Ill-posed problem? Several possible solutions exist.
4 The general vision problem isn’t really solved in biology … For example, I can't build an accurate spatial world model of the scene I look at ... Biological systems have evolved to process visual data to extract just enough information to perform the reasoning needed for the everyday tasks that are part of survival. Visual information is combined with higher-level knowledge and other sensory modalities, which constrain the reasoning in the solution space and finally make vision possible.
5 Visual cortex and a bit more … Direct feedback projections to V1 originate from: V2 (complex features), V3 (orientation, motion, depth), V4 (colour, attention), MT (motion), MST (motion), FEF (saccades, spatial memory), LIP (saccade planning), IT (recognition). [Figure labels: lower visual cortex; feedback from higher cortical areas; frontal cortex; V2, V4, FEF, IT; V1; face features]
6 Held and Hein, 1963. Newborn kittens placed in a carousel: one active, the other passively towed along. Both receive the same stimulation. The actively moving kitten receives visual stimulation that results from its own movements. Only the active kitten develops sensory-motor coordination.
8 The Future - Biologically Motivated Computer Vision Architecture. Optical illusions: is there a square, triangle or circle? [Figure labels: input; feedforward processing; lateral processing; feedback processing; hierarchical processing; more abstract features / symbols (square, triangle); other modalities]
9 Biologically Motivated Computer Vision Architectures in action. Simple colour cues. Foveated sensors. Also: learnt arm control, learning how to act on objects.
10 Biologically Inspired features. Machine vision and biological vision systems process similar information (visual scenes) and perform similar tasks (recognition, targeting). Not surprisingly, the optimal features extracted by many machine vision systems look surprisingly like those found in biology. But first ….
11 Why bother with feature extraction? Why not use the actual image/video itself for reasoning/analysis? INVARIANCE! The information we extract (i.e. the features) from the ‘entity’ must be insensitive to changes. The extracted features might be invariant to rotation and scaling of objects in images, lighting conditions, and partial occlusions.
12 What features should we extract? Depends …. Modality (video/image/audio …). Task (e.g. topic categorisation / face recognition / audio compression). Dimensionality reduction / sparsification. Invariance vs descriptiveness: if the features are too descriptive they can’t generalise to new examples; if they generalise too much, everything looks just about the same. As the feature we extract becomes more complex/descriptive, it also becomes less invariant to even minor changes in the entity that we are measuring.
13 Human visual pathway. Inspiration for feature extraction methodology. Receptive field: the area in the field of view (FOV) in which stimulation leads to a response in the neuron. Circularly symmetric retinal ganglion receptive fields. Orientated simple cell cortical receptive fields (similar to a Gabor filter).
14 Gabor filter. A function f(t) can be decomposed into cosine (even) and sine (odd) functions. These are good for defining periodic structures but are not localised. There is an uncertainty relation between a signal’s specificity in time and in frequency. Dennis Gabor defined a family of signals that optimises this trade-off, which enables us to extract local features. Daugman (1985) defined a 2D filter based on the above, which was called a Gabor filter. These filters resemble cortical simple cells.
15 Gabor filter. Localise the sine and cosine functions using a Gaussian envelope (here assuming a symmetric Gaussian envelope with standard deviation σ). In the Fourier domain the Gabor is a Gaussian centred about the central frequency (U, V). The orientation of the Gabor in the spatial domain is [equation not preserved in the transcript]. [Figure labels: Gaussian envelope; modulating cosine; modulating sine; even symmetric cosine Gabor wavelet; odd symmetric sine Gabor wavelet; frequency axes u, v]
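The construction on this slide can be sketched in a few lines of NumPy. This is a minimal illustration, not the slide's own formula (which did not survive in the transcript): the function name `gabor_pair`, the unnormalised envelope exp(-(x² + y²) / 2σ²) and the units (cycles per pixel) are all assumptions made for the example. Under those assumptions the wave travels along the direction of (U, V), so the spatial orientation can be read off as arctan2(V, U).

```python
import numpy as np

def gabor_pair(size, sigma, U, V):
    """Return (even, odd) Gabor kernels on a size x size grid (size should be odd).

    sigma : standard deviation of the circular Gaussian envelope, in pixels
    U, V  : central frequency in cycles per pixel along x and y
    All names and the lack of normalisation are illustrative assumptions.
    """
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))  # symmetric Gaussian
    phase = 2.0 * np.pi * (U * x + V * y)                 # plane wave at (U, V)
    even = envelope * np.cos(phase)  # even symmetric cosine Gabor
    odd = envelope * np.sin(phase)   # odd symmetric sine Gabor
    return even, odd

even, odd = gabor_pair(size=31, sigma=4.0, U=0.1, V=0.0)
orientation = np.arctan2(0.0, 0.1)  # direction of wave propagation, here 0 rad
```

The even kernel is unchanged by a 180 degree rotation about its centre, while the odd kernel changes sign, which is what the labels "even symmetric" and "odd symmetric" refer to.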
16 Spatial Frequency Bandwidth. Spectral (Fourier) bandwidth at the half-power point. The bandwidth depends on the sigma of the symmetric Gaussian envelope. A large sigma results in a narrow bandwidth, as the Gabor filter then filters almost exactly at its central frequency. Also, due to the uncertainty relation, a narrow frequency bandwidth results in reduced spatial localisation by the filter. [Figure labels: frequency; spatial filter profile; wide bandwidth; narrow bandwidth; even symmetric cosine Gabor wavelet; odd symmetric sine Gabor wavelet]
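The inverse relation between sigma and bandwidth can be checked numerically. The sketch below assumes a 1D complex Gabor with envelope exp(-x² / 2σ²) and takes "half power" to mean the points where the squared spectral magnitude falls to half its peak; under those assumptions the full width works out to sqrt(ln 2) / (πσ). The function names and the test frequency `f0` are hypothetical.

```python
import numpy as np

def half_power_bandwidth(sigma):
    """Analytic full width (cycles per sample) of the power spectrum at half peak,
    assuming a Gaussian envelope exp(-x^2 / (2 sigma^2))."""
    return np.sqrt(np.log(2.0)) / (np.pi * sigma)

def measured_bandwidth(sigma, f0=0.2, n=4096):
    """Estimate the same width from the FFT of a sampled 1D complex Gabor."""
    x = np.arange(n) - n // 2
    g = np.exp(-x**2 / (2.0 * sigma**2)) * np.exp(2j * np.pi * f0 * x)
    power = np.abs(np.fft.fft(g))**2
    freqs = np.fft.fftfreq(n)
    above = freqs[power >= 0.5 * power.max()]  # bins at or above half power
    return above.max() - above.min()

narrow_envelope = half_power_bandwidth(4.0)   # small sigma: wide bandwidth
wide_envelope = half_power_bandwidth(16.0)    # large sigma: narrow bandwidth
```

Quadrupling sigma divides the bandwidth by four, at the cost of a spatial footprint four times as wide, which is the uncertainty trade-off described on the slide.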
17 Gabor filter with asymmetric Gaussian. However, the Gabor’s Gaussian envelope need not be circularly symmetric! An elliptical spatial Gaussian envelope lets us control the orientation bandwidth. Better formulation for an asymmetric Gaussian envelope, given in the spatial domain and in the spectral (Fourier) domain along the direction of wave propagation [equations not preserved in the transcript], where fo = central frequency, θ = angle, γ = sigma in the direction of propagation, η = sigma perpendicular to the direction of propagation.
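Since the slide's equations were lost, the following is only a sketch of one commonly used parameterisation of a Gabor with an elliptical envelope, using the slide's symbols (`f0` for fo, `theta`, `gamma`, `eta`). The specific envelope exp(-f0² (x'²/γ² + y'²/η²)) and the normalisation f0² / (πγη) are assumptions and may differ from what the slide showed; `elliptical_gabor` is a hypothetical name.

```python
import numpy as np

def elliptical_gabor(size, f0, theta, gamma, eta):
    """Complex Gabor kernel with an elliptical Gaussian envelope (size should be odd).

    f0    : central frequency in cycles per pixel
    theta : direction of wave propagation in radians
    gamma : envelope extent along the direction of propagation
    eta   : envelope extent perpendicular to it
    The envelope scaling and normalisation are assumed, not taken from the slide.
    """
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    xr = x * np.cos(theta) + y * np.sin(theta)    # along wave propagation
    yr = -x * np.sin(theta) + y * np.cos(theta)   # perpendicular to it
    envelope = np.exp(-(f0**2) * (xr**2 / gamma**2 + yr**2 / eta**2))
    carrier = np.exp(2j * np.pi * f0 * xr)
    return (f0**2 / (np.pi * gamma * eta)) * envelope * carrier

kernel = elliptical_gabor(size=65, f0=0.125, theta=0.0, gamma=1.0, eta=2.0)
spectrum = np.abs(np.fft.fft2(kernel))
peak = np.unravel_index(np.argmax(spectrum), spectrum.shape)
```

With `theta = 0` the spectrum peaks on the horizontal frequency axis near `f0`. Making `eta` larger than `gamma` stretches the envelope across the wave, which narrows the spectrum in the perpendicular direction and so tightens the orientation bandwidth.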
18 Bandwidth of Gabor with asymmetric Gaussian. Half-power points along the direction of wave propagation and perpendicular to the direction of wave propagation [equations not preserved in the transcript]. Spatial bandwidth in the direction of wave propagation and perpendicular to wave propagation [equations not preserved in the transcript].
19 Orientation Bandwidth. Orientation bandwidth is related to the number of orientations we want to extract. The half-power points of the filters should coincide in the spectral domain. If the filter bank consists of k orientated filters, and redundancy in orientation sampling [equation not preserved in the transcript]. Arc length l = rθ for small θ. [Figure labels: half power; spatial frequency bandwidth; orientation bandwidth Δθ; central frequency ωo; frequency axes u, v]
20 Orientation Bandwidth. [Figure panels: spatial domain; frequency domain; filter bank. Labels: half power; spatial frequency bandwidth; orientation bandwidth Δθ; central frequency ωo; frequency axes u, v]
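A filter bank of k orientated filters might be assembled as follows. The even spacing of π/k over the half circle [0, π) is an assumption (the slide's own expression, including its redundancy term, was not preserved), and the helper names and default parameters are hypothetical. Using the small-angle relation l = rθ, neighbouring filters centred at radius `f0` in the frequency plane are then separated by an arc of roughly `f0 * spacing`.

```python
import numpy as np

def orientation_angles(k):
    """k orientations evenly spaced over the half circle [0, pi) (assumed spacing)."""
    return np.arange(k) * np.pi / k

def even_gabor(size, sigma, f0, theta):
    """Even symmetric Gabor kernel with a circular Gaussian envelope."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    along = x * np.cos(theta) + y * np.sin(theta)  # coordinate along the wave
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    return envelope * np.cos(2.0 * np.pi * f0 * along)

def filter_bank(k, size=31, sigma=4.0, f0=0.1):
    """Stack of k even Gabor kernels, one per orientation, shape (k, size, size)."""
    return np.stack([even_gabor(size, sigma, f0, t) for t in orientation_angles(k)])

angles = orientation_angles(4)
bank = filter_bank(4)
spacing = np.pi / 4            # angular step between neighbouring filters
arc_length = 0.1 * spacing     # l = r * theta with r = f0 = 0.1
```

Only half the circle is sampled because an even symmetric filter at angle θ is identical to the one at θ + π.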
21 Hypercolumn. Experiments by Hubel and Wiesel (1962, 1968). A set of orientation-selective units over a common patch of the FOV, organised as a vertical column in the visual cortex. In a computational system, use the information in the hypercolumn for higher-level reasoning. Only using the even symmetric component in the filter bank. [Figure label: feature vector]
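As a rough computational analogue, a hypercolumn feature vector can be formed by applying every orientated even filter to the same image patch and collecting the responses. The sketch below assumes a square, odd-sized patch, a circular Gaussian envelope and k evenly spaced orientations; `hypercolumn`, `even_gabor` and the default parameters are illustrative choices rather than the lecture's implementation.

```python
import numpy as np

def even_gabor(size, sigma, f0, theta):
    """Even symmetric Gabor kernel with a circular Gaussian envelope."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    along = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    return envelope * np.cos(2.0 * np.pi * f0 * along)

def hypercolumn(patch, k=4, sigma=4.0, f0=0.1):
    """Responses of k orientated even Gabor units to one square, odd-sized patch."""
    size = patch.shape[0]
    thetas = np.arange(k) * np.pi / k
    return np.array([np.sum(patch * even_gabor(size, sigma, f0, t)) for t in thetas])

# Synthetic stimulus: a cosine grating whose wave travels along the y axis.
half = 15
y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
grating = np.cos(2.0 * np.pi * 0.1 * y)
features = hypercolumn(grating)
preferred = int(np.argmax(features))  # index of the most strongly driven unit
```

For this stimulus the unit tuned to θ = π/2 (index 2 of 4) responds most strongly, since its carrier matches the grating.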
22 Properties of the hypercolumn feature vector. Invariance to rotation in the image plane. [Figure labels: stimulation; hypercolumn responses; even symmetric detector]
23 Cycle to canonical orientation. Invariance to rotation in the image plane. Cycle the responses in the feature vector. [Figure label: stimulation]
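Cycling to a canonical orientation can be illustrated with a circular shift. The sketch assumes that the canonical orientation is the one with the largest response and that an in-plane rotation of the stimulus by a multiple of the orientation step simply shifts the responses cyclically; `to_canonical` is a hypothetical helper name and the response values are made up for the example.

```python
import numpy as np

def to_canonical(features):
    """Circularly shift the vector so the strongest orientation response comes first."""
    features = np.asarray(features, dtype=float)
    return np.roll(features, -int(np.argmax(features)))

responses = np.array([0.1, 0.9, 0.3, 0.2])  # made-up hypercolumn responses
rotated = np.roll(responses, 2)             # same stimulus rotated by two steps

canonical_a = to_canonical(responses)
canonical_b = to_canonical(rotated)
```

Both vectors map to the same canonical form, which is what makes the descriptor insensitive to rotation in the image plane. Ties between equally strong responses and rotations that fall between orientation steps are not handled here.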
24 Properties of the hypercolumn feature vector. Invariance to scaling (i.e. spatial frequency). [Figure labels: stimulation; central frequency]
25 Scale Invariant Feature Transform. Pandemonium model (Selfridge, 1959!). Build ever more complex / abstract features along the hierarchy. Aggregate hypercolumn feature vectors into a complex feature.
26 SIFT features. Rotate the hypercolumn features to the canonical orientation of the large support region. Rotate the descriptor to the canonical orientation of the large support region. [Figure labels: complex feature vector; hypercolumn features]
27 Recognition. Extract SIFT features at corner locations (Harris corner detector) and at scale-space peaks. [Figure panels: training; recognition]
28 Recap. Biologically motivated computer vision architecture. Feedforward, feedback and lateral processing in the architecture. Hierarchical processing. Feature extraction provides information about entities which is (somewhat!) invariant to changes. Gabor filter. Hypercolumn feature vector. SIFT features.