Neural Models of Visual Attention John K. Tsotsos Center for Vision Research York University, Toronto, Canada Marc Pomplun Department of Computer Science.


1 Neural Models of Visual Attention John K. Tsotsos Center for Vision Research York University, Toronto, Canada Marc Pomplun Department of Computer Science University of Massachusetts at Boston

2 Theories/Models
Müller (1873)
Exner (1894)
Wundt (1902)
Pillsbury (1908)
Broadbent 1958 (Early Selection)
Deutsch, Deutsch & Norman 1963/68 (Late Selection)
Treisman 1964
Milner 1974 *
Grossberg 1976+ (Adaptive Resonance Theory) *
Treisman & Gelade 1980 (Feature Integration Theory)
von der Malsburg 1981+ (Correlation Theory) *
Crick 1984 *
Koch and Ullman 1985
Anderson and Van Essen 1987 (Shifter Circuits) *
Sandon 1989 ‡
Wolfe et al. 1989+ (Guided Search 1.0, 2.0, 3.0)
Phaf, Van der Heijden, Hudson 1990 (SLAM)
Tsotsos et al. 1990+ (Selective Tuning) * ‡
Mozer 1991 (MORSEL)
Ahmad 1991 (VISIT) *
Olshausen, Anderson & Van Essen 1993 * ‡
Niebur, Koch et al. 1993+ *
Desimone & Duncan 1995 (Biased Competition) *
Postma 1995 (SCAN) * ‡
Schneider 1995 (VAM) *
LaBerge 1995 *
Itti & Koch 1998 ‡
Cave et al. 1999 (FeatureGate)
The number of models that address the neurobiology of visual attention is small (* in the list). The number that have real computational tests on actual images is even smaller (‡ in the list). However, many relevant ideas have appeared in psychological models. A selected historical perspective on the ideas important to the modelling task appears in the following slides.

3 Issues
Models of visual attention need to include solutions to or exhibit observed neurobiological/psychophysical performance for:
- computational complexity of visual processes
- information routing through the processing hierarchy
- attentional control
- time course of attentive modulation
- single cell attentive modulation
- attentive modulation in (apparently) all visual areas
- suppressive surround effects
- serial/"parallel" visual search performance
- binding of features to objects

4 Format of Overview
Not all models are included, only those that have historical importance or that claim neuro-psycho relevance.
Due to space and time limits, each model is described only with:
1. key references
2. key ideas
3. neurobiological relationship (where possible)
(√ has supporting evidence; X does not have supporting evidence; ? open question)
Note that this can only be regarded as a partial review!

5 Koch and Ullman 1985
Koch, C., Ullman, S. (1985). Shifts in selective visual attention: Towards the underlying neural circuitry, Human Neurobiology 4, 219-227.
Key ideas:
- saliency map (Treisman’s map) ?
- winner-take-all competition √ (Findlay 1996, Lee et al. 1999)
- WTA selects items to route to central representation X
- inhibition of return for shifts ?
- time to move attention is logarithmic in the distance between stimuli X (Krose & Julesz 1989)
- no single cell modulations X
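The selection loop listed above (saliency map, winner-take-all, inhibition of return) can be sketched roughly as follows. This is only an illustrative sketch, not the circuit from the 1985 paper: the function name `attend`, the square suppression window and the `ior_radius` parameter are assumptions made for the example, and the winner is found with a plain maximum rather than a neural WTA network.

```python
def attend(saliency, n_shifts, ior_radius=1):
    """Return the sequence of attended (row, col) locations.

    saliency: 2D list of floats (a hypothetical saliency map).
    ior_radius: half-width of the square region suppressed after each
                selection (an assumption; the slide gives no shape or size).
    """
    s = [row[:] for row in saliency]  # work on a copy of the map
    rows, cols = len(s), len(s[0])
    visited = []
    for _ in range(n_shifts):
        # Winner-take-all, reduced here to picking the maximum location.
        r, c = max(((i, j) for i in range(rows) for j in range(cols)),
                   key=lambda p: s[p[0]][p[1]])
        visited.append((r, c))
        # Inhibition of return: suppress the winner's neighbourhood so the
        # next competition is forced to select a different location.
        for i in range(max(0, r - ior_radius), min(rows, r + ior_radius + 1)):
            for j in range(max(0, c - ior_radius), min(cols, c + ior_radius + 1)):
                s[i][j] = float("-inf")
    return visited


demo_map = [
    [0.1, 0.2, 0.1, 0.0, 0.0],
    [0.2, 0.9, 0.1, 0.0, 0.3],
    [0.1, 0.1, 0.0, 0.0, 0.7],
]
scanpath = attend(demo_map, n_shifts=2)
print(scanpath)
```

With the small demo map, attention goes first to the strongest peak and then, once that neighbourhood is suppressed, to the next strongest location outside it.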

6 Anderson and Van Essen 1987 Shifter Circuits
Anderson, C., Van Essen, D. (1987). Shifter Circuits: a computational strategy for dynamic aspects of visual processing, Proc. Natl. Academy Sci. USA 84: 6297-6301.
Key ideas:
- information routing is accomplished by simple shifting circuits starting in the LGN and input layers of primate visual area V1 X
- realignment is based on the preservation of spatial relationships
- stages linked by diverging excitatory inputs
- direction of shift by inhibitory neurons that selectively suppress sets of ascending inputs
- stages are grouped into small and large scale shifts
- control comes from pulvinar ?
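A one-dimensional toy version of the staged shifting idea might look like the sketch below. It is an assumption-laden illustration rather than the published circuit: each stage lets an output unit receive diverging inputs from three candidate positions, a control value in {-1, 0, +1} stands in for the inhibitory neurons that suppress all but one set of inputs, and the step sizes (9, 3, 1) and the names `shift_stage`, `stage_controls` and `route` are hypothetical.

```python
def shift_stage(signal, step, control):
    """One shifter stage on a 1D signal.

    Output unit i has diverging inputs from i - step, i and i + step.
    control (-1, 0 or +1) selects which input set survives; the other
    two are treated as suppressed. Positions off the edge read as 0.
    """
    n = len(signal)
    out = []
    for i in range(n):
        src = i - control * step  # control = +1 moves the pattern right
        out.append(signal[src] if 0 <= src < n else 0)
    return out


def stage_controls(total_shift, steps=(9, 3, 1)):
    """Split a total shift into per-stage controls, large steps first."""
    remaining = total_shift
    controls = []
    for step in steps:
        c = max(-1, min(1, round(remaining / step)))
        controls.append(c)
        remaining -= c * step
    if remaining != 0:
        raise ValueError("shift not reachable with these stages")
    return controls


def route(signal, total_shift, steps=(9, 3, 1)):
    """Apply the large-scale and small-scale stages in sequence."""
    out = list(signal)
    for step, c in zip(steps, stage_controls(total_shift, steps)):
        out = shift_stage(out, step, c)
    return out


signal = [0] * 32
signal[10:13] = [1, 2, 3]
shifted = route(signal, 5)
print(stage_controls(5), shifted[15:18])
```

Because every stage moves the whole pattern by the same amount, spatial relationships within the pattern are preserved, which is the realignment property named in the key ideas. Patterns pushed past the array edge by an intermediate stage are lost in this simplified version.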

7

8 Tsotsos 1990+ Selective Tuning Model
Tsotsos, J.K. (1990). Analyzing Vision at the Complexity Level, Behavioral and Brain Sciences 13-3, p423 - 445.
Tsotsos, J.K. (1993). An Inhibitory Beam for Attentional Selection, in Spatial Vision in Humans and Robots, ed. by L. Harris and M. Jenkin, p313 - 331, Cambridge University Press.
Tsotsos, J.K., Culhane, S., Wai, W., Lai, Y., Davis, N., Nuflo, F. (1995). Modeling visual attention via selective tuning, Artificial Intelligence 78(1-2), p507 - 547.
Tsotsos, J.K. (1995). Towards a Computational Model of Visual Attention, in Early Vision and Beyond, ed. by T. Papathomas, C. Chubb, A. Gorea, E. Kowler, MIT Press/Bradford Books, p207 - 218.
Tsotsos, J.K., Culhane, S., Cutzu, F., From Theoretical Foundations to a Hierarchical Circuit for Selective Attention, Visual Attention and Cortical Circuits, ed. by J. Braun, C. Koch & J. Davis, MIT Press (in press).

9 Figure labels: neuron ‘sees’ this receptive field; subject ‘attends’ to single item
Key ideas:
- attention modulates neurons to earliest levels; wherever there is a many-to-one mapping √
- signal interference controlled by surround inhibition throughout processing network
- task knowledge biases computations throughout processing network
- inhibition of connections not units √ Hernandez-Peon, Scherrer, Jouvet (1956)
- attentional control is local, distributed and internal
- competition is based on WTA (different form than previous models)
- pyramid representation with reciprocal convergence and divergence √ Salin & Bullier (1995)

10 The basic idea (BBS 1990): not the same as von der Malsburg. Only connections leading to interference are inhibited; other unattended ones are left alone.

11 Figure labels: processing pyramid; inhibited pathways; pass pathways; unit of interest at top; input
√ Caputo & Guerra 1998; Bahcall & Kowler 1999; Vanduffel, Tootell, Orban 2000; Smith et al. 2000
√ Kastner, De Weerd, Desimone, Ungerleider 1998

12 Hierarchical Winner-Take-All Simulation
- top-down, coarse-to-fine WTA hierarchy for incremental selection and localization
- unselected connections are inhibited
- WTA achieved through local gating networks
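The top-down, coarse-to-fine selection can be illustrated with a small 1D pyramid. The sketch below makes several simplifying assumptions that are not in the slides: each unit pools exactly `fan_in` units of the layer below, pooling is a plain maximum, and each local competition is a simple argmax rather than an iterative gating network. The names `build_pyramid` and `top_down_wta` are hypothetical.

```python
def build_pyramid(inputs, fan_in=2):
    """Bottom-up pass: each unit takes the max over its receptive field.

    Max pooling is an assumption made for the sketch.
    Returns layers[0] (input) up to layers[-1] (top).
    """
    layers = [list(inputs)]
    while len(layers[-1]) > 1:
        prev = layers[-1]
        layers.append([max(prev[i:i + fan_in])
                       for i in range(0, len(prev), fan_in)])
    return layers


def top_down_wta(layers, fan_in=2):
    """Top-down pass: run a local WTA inside each winner's receptive field.

    Returns (path, inhibited): the winning unit index per layer from top
    to input, and the (layer, unit) pairs whose connection into the
    winner above is treated as inhibited. Units outside the winner's
    receptive field are left alone.
    """
    top = layers[-1]
    winner = max(range(len(top)), key=top.__getitem__)
    path, inhibited = [winner], []
    for level in range(len(layers) - 2, -1, -1):
        lo = winner * fan_in
        hi = min(lo + fan_in, len(layers[level]))
        field = range(lo, hi)  # receptive field of the winner above
        winner = max(field, key=layers[level].__getitem__)
        inhibited.extend((level, u) for u in field if u != winner)
        path.append(winner)
    return path, inhibited


layers = build_pyramid([0.1, 0.4, 0.2, 0.9, 0.3, 0.0, 0.5, 0.6])
path, inhibited = top_down_wta(layers)
print(path, inhibited)
```

Each competition looks only at the units feeding the current winner, so the search narrows layer by layer until a single input location is localized; here that is input unit 3, which holds the strongest response.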

13 Selection Circuits
Figure legend: unit and connection in the interpretive network; unit and connection in the gating network; unit and connection in the top-down bias network; layer +1; layer -1; layer I

14 Search for Blue Regions

15 Predictions from 1990 paper:
- attention in all visual areas, down to earliest
- competition can be biased by task
- inhibition of unselected connections within beam
- inhibitory surround impairs perception around attended item
- distractor effects depend on distractor-target separation

16 Olshausen, Anderson & Van Essen 1993
Olshausen, B., et al. (1993). A neurobiological model of visual attention and invariant pattern recognition based on dynamic routing of information, J. of Neuroscience, 13(11):4700-4719.
Key ideas:
- implementation of shifter circuits
- forms position and scale invariant representations at the output layer X
- control neurons, originating in the pulvinar, dynamically modify synaptic weights of intracortical connections to achieve routing ?
- the topography of the selected portion of the visual field is preserved
- uses Koch & Ullman mechanism (luminance saliency only) for selection
- associative recognition at output layer

17 Olshausen seeks to achieve translation-rotation invariant recognition; only the attended item reaches the output layer

18 Itti 1998
Itti, L., Koch, C., Niebur, E. (1998). A model for saliency-based visual attention for rapid scene analysis, IEEE Trans. Pattern Analysis and Machine Intelligence 20, 1254-1259.
Key ideas:
- a newer implementation of Koch and Ullman’s scheme
- fast and parallel pre-attentive extraction of visual features across 50 spatial maps (for orientation, intensity and color, at six spatial scales)
- features are computed using linear filtering and center-surround structures
- these features form a saliency map ?
- Winner-Take-All neural network to select the most conspicuous image location
- inhibition-of-return mechanism to generate attentional shifts
- saliency map topographically encodes for the local conspicuity in the visual scene, and controls where the focus of attention is currently deployed
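The center-surround step mentioned in the key ideas can be sketched on a single intensity channel as below. This is a deliberately reduced stand-in, not the published model: box averages replace the linear filters and multi-scale maps, only one center and one surround size are used, and the names `box_mean` and `center_surround` and both radius parameters are assumptions made for the example.

```python
def box_mean(img, r, c, radius):
    """Mean intensity in a square window around (r, c), clipped at edges."""
    rows, cols = len(img), len(img[0])
    vals = [img[i][j]
            for i in range(max(0, r - radius), min(rows, r + radius + 1))
            for j in range(max(0, c - radius), min(cols, c + radius + 1))]
    return sum(vals) / len(vals)


def center_surround(img, center_radius=0, surround_radius=2):
    """Absolute difference between a fine (center) and coarse (surround) mean.

    Locations that differ from their neighbourhood get a high response;
    uniform regions get a response of zero.
    """
    return [[abs(box_mean(img, r, c, center_radius)
                 - box_mean(img, r, c, surround_radius))
             for c in range(len(img[0]))]
            for r in range(len(img))]


# A dark 5x5 image with one bright pixel in the middle.
image = [[0.0] * 5 for _ in range(5)]
image[2][2] = 1.0
conspicuity = center_surround(image)
peak = max(((r, c) for r in range(5) for c in range(5)),
           key=lambda p: conspicuity[p[0]][p[1]])
print(peak)
```

In a fuller pipeline, several such maps (one per feature and scale) would be normalized and combined into the saliency map on which winner-take-all selection and inhibition of return operate.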

19

20

21 Conclusions
Several ideas have endured:
- Winner-Take-All for selection (competition)
- Hierarchies
- Inhibition of return to force serial search
- Some kind of ‘gating’ process
- Inhibitory surrounds
However, modeling seems to be still in its early days.
Progress will depend on whether modelers and experimenters can work together.

