2 Attention in Computer Vision Mica Arie-Nachimson and Michal Kiwkowitz May 22, 2005 Advanced Topics in Computer Vision Weizmann Institute of Science

3 Problem Definition – Search Order. Vision applications apply “expensive” algorithms (e.g. recognition) to image patches; the patches are mostly selected naïvely, and the selection of patches determines the number of calls to the “expensive” algorithm. [Figure: patches fed one by one to an object-recognition module, which answers NO.]

4 Problem Definition – Search Order. A more sophisticated selection of patches would imply fewer calls to the “expensive” algorithm. Attention is used to focus efficiently on incoming data (better use of limited processing capacity). [Figure: patches fed to an object-recognition module, which answers NO/YES.]

5 Problem Definition – Search Order. [Figure: candidate image patches visited by the object-recognition module in order 1–6.]

6 Outline What is Attention Attention in Object Recognition Saliency Model Feature Integration Theory Saliency Algorithm Saliency & Object Recognition Comparison Inner Scene Similarity Model Biological motivation Difficulty of Search Tasks Algorithms FLNN VSLE

7 Outline What is Attention Attention in Object Recognition Saliency Model Feature Integration Theory Saliency Algorithm Saliency & Object Recognition Comparison Inner Scene Similarity Model Biological motivation Difficulty of Search Tasks Algorithms FLNN VSLE

8 Attention Attention implies allocating resources, perceptual or cognitive, to some things at the expense of not allocating them to something else.

9 What is Attention You are sitting in class listening to a lecture. Two people behind you are talking. –Can you hear the lecture? One of them mentions the name of a friend of yours. –How did you know?

10 Attention in Other Applications Face Detection (feature selection) Video Analysis (temporal block selection) Robot Navigation (select locations) …

11 Attention is Directed by: Bottom-up: from small to large units of meaning; rapid; task-independent.

12 Attention is Directed by: Top-down: use higher levels (context, expectation) to process incoming information (guess); slower; task-dependent. http://www.rybak-et-al.net/nisms.html

13 Outline What is Attention Attention in Object Recognition Saliency Model Feature Integration Theory Saliency Algorithm Saliency & Object Recognition Comparison Inner Scene Similarity Model Biological motivation Difficulty of Search Tasks Algorithms FLNN VSLE

14 Attention – when is information selected (filtered)? Early selection (Broadbent, 1958); cocktail party phenomenon (Moray, 1959); late selection (Treisman, 1960): attenuation. All information is sent to perceptual systems for processing; some is selected for complete processing; some is more likely to be selected. WHICH?

15 Parallel Search: Is there a green O? A. Treisman, G. Gelade, 1980

16 Conjunction Search: Is there a green N? A. Treisman, G. Gelade, 1980

17 Results A. Treisman, G. Gelade, 1980

18 Conjunction Search + A. Treisman, G. Gelade, 1980

19 Color map Orientation map A. Treisman, G. Gelade, 1980

20 Color map / Orientation map. A. Treisman, G. Gelade, 1980

21 Conjunction Search + A. Treisman, G. Gelade, 1980

22 Primitives: intensity, orientation, color, curvature, line ends, movement. [Figure: a pop-out array illustrating each primitive feature.]

23 Feature Integration Theory – two stages. Pre-attention: parallel processing, low-level features, fast, parallel search. Attention: serial processing, localized focus, slower, conjunctive search. How is the focus found & shifted? A. Treisman, G. Gelade, 1980

24 Outline What is Attention Attention in Object Recognition Saliency Model Feature Integration Theory Saliency Algorithm Saliency & Object Recognition Comparison Inner Scene Similarity Model Biological motivation Difficulty of Search Tasks Algorithms FLNN VSLE

25 Shifts in Attention. “Shifts in selective visual attention: towards the underlying neural circuitry”, Christof Koch and Shimon Ullman, 1985. [Figure: feature maps (orientation, color, curvature, line end, movement) combined into a central representation; a saliency map directs attention.]

26 “A model of saliency-based visual attention for rapid scene analysis”, Laurent Itti, Christof Koch, and Ernst Niebur, 1998. Salient: stands out. Example: the telephone and the road sign have high saliency.

27 from C. Koch L. Itti, C. Koch, and E. Niebur, 1998

28 Intensity L. Itti, C. Koch, and E. Niebur, 1998 Cells in the retina

29 Intensity: create spatial scales 0–8 using Gaussian pyramids. L. Itti, C. Koch, and E. Niebur, 1998

30 Intensity: center-surround difference operator. Sensitive to local spatial discontinuities; the principal computation in the retina & primary visual cortex; subtract the coarse scale from the fine scale. L. Itti, C. Koch, and E. Niebur, 1998

31 Toy Example. [Figure: a fine-level map and a coarse-level map from the Gaussian pyramid; the coarse level is interpolated up to the fine scale and subtracted point by point.]

32 Toy Example. [Figure: second example of interpolating the coarse level to the fine scale and subtracting it point by point.]

33 Intensity: compute I(c,s) = |I(c) ⊖ I(s)| for center scales c ∈ {2,3,4} and surround scales s = c+δ, δ ∈ {3,4}, giving 6 intensity maps (⊖: interpolate the coarser map to the finer scale and subtract point by point). Different center-surround ratios give multiscale feature extraction. L. Itti, C. Koch, and E. Niebur, 1998
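
A minimal sketch of the pyramid and center-surround steps described on slides 29–33, assuming OpenCV and NumPy; the function names and the choice of pyrDown as the pyramid filter are ours, not from the paper:

```python
import cv2
import numpy as np

def gaussian_pyramid(img, levels=9):
    """Dyadic Gaussian pyramid, scales 0..8."""
    pyr = [img.astype(np.float32)]
    for _ in range(levels - 1):
        pyr.append(cv2.pyrDown(pyr[-1]))
    return pyr

def center_surround(pyr, c, s):
    """|I(c) - interp(I(s))|: interpolate the coarse map to the fine scale, subtract point by point."""
    fine = pyr[c]
    coarse = cv2.resize(pyr[s], (fine.shape[1], fine.shape[0]), interpolation=cv2.INTER_LINEAR)
    return np.abs(fine - coarse)

def intensity_maps(image):
    """The 6 intensity maps: centers c in {2,3,4}, surrounds s = c + delta, delta in {3,4}."""
    intensity = image.mean(axis=2) if image.ndim == 3 else image.astype(np.float32)
    pyr = gaussian_pyramid(intensity)
    return [center_surround(pyr, c, c + d) for c in (2, 3, 4) for d in (3, 4)]
```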

34 Color: same c and s as with intensity, applied to red/green and blue/yellow opponency maps, giving 12 color maps. Kandel et al. (2000), Principles of Neural Science, McGraw-Hill/Appleton & Lange. L. Itti, C. Koch, and E. Niebur, 1998

35 Color (details): same c and s as with intensity, giving 12 color maps. Kandel et al. (2000), Principles of Neural Science, McGraw-Hill/Appleton & Lange. L. Itti, C. Koch, and E. Niebur, 1998

36 Orientation: same c and s as with intensity, with 4 orientations, giving 24 orientation maps. From the Visual System presentation by S. Ullman. L. Itti, C. Koch, and E. Niebur, 1998

37 Orientation – Gabor pyramids. Reprinted from “Shiftable Multiscale Transforms,” by Simoncelli et al., IEEE Transactions on Information Theory, 1992, copyright 1992, IEEE.

38 from C. Koch L. Itti, C. Koch, and E. Niebur, 1998

39 Normalization Operator. L. Itti, C. Koch, and E. Niebur, 1998

40 Normalization: normalize the map to a fixed range; find the global maximum M; compute the average m̄ over all other points; multiply the map by (M - m̄)². L. Itti, C. Koch, and E. Niebur, 1998
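
A sketch of the normalization operator N(.) exactly as described on this slide (the paper averages over the other local maxima rather than over all other points; we follow the slide's simpler wording):

```python
import numpy as np

def normalize_map(m, fixed_max=1.0):
    """N(.): scale the map to [0, fixed_max], find the global maximum M and the
    average m_bar of all other values, then multiply the map by (M - m_bar)**2,
    promoting maps with one strong peak over maps with many comparable peaks."""
    m = m.astype(np.float64)
    m -= m.min()
    if m.max() > 0:
        m *= fixed_max / m.max()
    M = m.max()
    m_bar = (m.sum() - M) / max(m.size - 1, 1)   # average over all other points
    return m * (M - m_bar) ** 2
```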

41 Saliency Map L. Itti, C. Koch, and E. Niebur, 1998

42 Conspicuity Maps

43 Algorithm – up to now: 1. Extract feature maps. 2. Compute center-surround maps (42 in total): intensity I (6), color C (12), orientation O (24). 3. Combine each channel into a conspicuity map. 4. Compute saliency by summing and normalizing the conspicuity maps.
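
Steps 3–4 as a sketch, reusing the hypothetical normalize_map above; the paper also resizes all maps to a common scale before summation, which this sketch assumes has already been done:

```python
def saliency_map(intensity_maps, color_maps, orientation_maps):
    """Combine each channel's feature maps into a conspicuity map, then average
    the normalized conspicuity maps into the final saliency map."""
    def conspicuity(maps):
        return sum(normalize_map(m) for m in maps)   # across-scale addition (maps assumed same size)
    i_bar = normalize_map(conspicuity(intensity_maps))
    c_bar = normalize_map(conspicuity(color_maps))
    o_bar = normalize_map(conspicuity(orientation_maps))
    return (i_bar + c_bar + o_bar) / 3.0
```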

44 Laurent Itti, Christof Koch, and Ernst Niebur, 1998

45 Winner-take-all selection of the FOA (Focus Of Attention), implemented with leaky integrate-and-fire neurons, plus “inhibition of return”. L. Itti, C. Koch, and E. Niebur, 1998
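
A sketch of the winner-take-all scan with inhibition of return, using a plain argmax in place of the leaky integrate-and-fire dynamics; the fixation count and inhibition radius are assumptions:

```python
import numpy as np

def scan_fixations(saliency, n_fixations=5, ior_radius=20):
    """Repeatedly pick the most salient location (the FOA), then zero out a disc
    around it so attention shifts to the next most salient location."""
    s = saliency.astype(np.float64).copy()
    yy, xx = np.mgrid[0:s.shape[0], 0:s.shape[1]]
    fixations = []
    for _ in range(n_fixations):
        y, x = np.unravel_index(np.argmax(s), s.shape)
        fixations.append((y, x))
        s[(yy - y) ** 2 + (xx - x) ** 2 <= ior_radius ** 2] = 0   # inhibition of return
    return fixations
```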

46 Results FOA shifts: 30-70 ms Inhibition: 500-900 ms Inhibition of return ends L. Itti, C. Koch, and E. Niebur, 1998

47 Results: comparison with Spatial Frequency Content (SFC), Reinagel & Zador, 1997. [Figure columns: image, SFC, saliency, output.] L. Itti, C. Koch, and E. Niebur, 1998

48 Results. [Figure panels (a)–(d): image, SFC, saliency, output.] L. Itti, C. Koch, and E. Niebur, 1998. Spatial Frequency Content, Reinagel & Zador, 1997

49 Outline What is Attention Attention in Object Recognition Saliency Model Feature Integration Theory Saliency Algorithm Saliency & Object Recognition Comparison Inner Scene Similarity Model Biological motivation Difficulty of Search Tasks Algorithms FLNN VSLE

50 Attention & Object Recognition. “Is bottom-up attention useful for object recognition?”, U. Rutishauser, D. Walther, C. Koch and P. Perona, 2004. Computer recognition typically works on segmented, labeled data; human recognition handles cluttered scenes with non-labeled objects; attention bridges the gap.

51 Object Recognition: the saliency model selects a region by growing a region in the strongest feature map, which is then passed to object recognition (Lowe). U. Rutishauser, D. Walther, C. Koch and P. Perona, 2004

52 Attention & Object Recognition. Added selection of an image region: 1. Find the feature map with the strongest contribution. 2. Segment the “winning” map. 3. Create a mask M that modulates contrast in the original image. U. Rutishauser, D. Walther, C. Koch and P. Perona, 2004
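
A sketch of step 3, assuming a mask M with values in [0, 1] obtained from segmenting the winning map; the blending rule is our assumption, not necessarily the one used in the paper:

```python
import numpy as np

def modulate_contrast(image, mask):
    """Keep full contrast inside the attended region (mask ~ 1) and pull the rest
    of the image toward its mean (mask ~ 0) before passing it to recognition."""
    img = image.astype(np.float64)
    m = np.clip(mask, 0.0, 1.0)
    if img.ndim == 3:                 # broadcast the mask over color channels
        m = m[..., None]
    return img.mean() + m * (img - img.mean())
```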

53 Attention & Object Recognition Learning inventories – “grocery cart problem” U. Rutishauser, D. Walther, C. Koch and P. Perona, 2004 Real world scenes 1 image for training (15 fixations) 2-5 images for testing (20 fixations)

54 [Figure: salient patches from the training and testing images are passed to object recognition and matched.]

55 “Grocery Cart” Problem. [Figure: training, testing 1 and testing 2 images.] U. Rutishauser, D. Walther, C. Koch and P. Perona, 2004

56 “Grocery Cart” Problem. Downsides: bias of human photography; small image set. Solution: a robot as acquisition tool. U. Rutishauser, D. Walther, C. Koch and P. Perona, 2004

57 Robot - Landmark Learning Objective – how many objects are found and classified correctly? Navigation – simple obstacle avoiding algorithm using infrared sensors U. Rutishauser, D. Walther, C. Koch and P. Perona, 2004

58 Landmark Learning Algorithm: 1. Extract the most salient location. 2. Does the patch have at least 3 key points? If not, go back to 1. 3. Test the patch against all known object models: match – increase the object count; no match – learn it as a new object. U. Rutishauser, D. Walther, C. Koch and P. Perona, 2004
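
A sketch of this loop; saliency_region, extract_keypoints and match_model are hypothetical stand-ins for the attention, keypoint-extraction and matching stages:

```python
def landmark_learning(frames, saliency_region, extract_keypoints, match_model):
    """For each frame: take the most salient patch, discard it if it has fewer
    than 3 key points, otherwise match it against the known models, incrementing
    a count on a match or learning a new model otherwise."""
    models, counts = [], []
    for frame in frames:
        patch = saliency_region(frame)          # 1. most salient location
        keys = extract_keypoints(patch)
        if len(keys) < 3:                       # 2. too few key points: back to 1
            continue
        idx = match_model(keys, models)         # 3. test against all known models
        if idx is not None:
            counts[idx] += 1                    # match: increase object count
        else:
            models.append(keys)                 # no match: learn as a new object
            counts.append(1)
    return models, counts
```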

59 [Figure: flowchart of the landmark-learning loop; patches with fewer than 3 key points are discarded before object recognition.]

60 Landmark Learning With Attention U. Rutishauser, D. Walther, C. Koch and P. Perona, 2004

61 Landmark Learning With Random Selection U. Rutishauser, D. Walther, C. Koch and P. Perona, 2004

62 Landmark Learning - Results U. Rutishauser, D. Walther, C. Koch and P. Perona, 2004

63 Saliency Based Object Recognition Biologically motivated Uses bottom-up, allows combining top-down information Segmentation –Cluttered scenes –Unlabeled objects –Multiple objects in single image Static priority map U. Rutishauser, D. Walther, C. Koch and P. Perona, 2004

64 Outline What is Attention Attention in Object Recognition Saliency Model Feature Integration Theory Saliency Algorithm Saliency & Object Recognition Comparison Inner Scene Similarity Model Biological motivation Difficulty of Search Tasks Algorithms FLNN VSLE

65 Comparison. “Comparing attention operators for learning landmarks”, R. Sim, S. Polifroni, G. Dudek, June 2003. Other attention operators for low-level features.

66 Comparison: edge density, radial symmetry, smallest eigenvalue, Caltech saliency. [Figure: example output of each operator.] R. Sim, S. Polifroni, G. Dudek, June 2003

67 Comparison Landmark learning Training – learn landmarks knowing camera pose Testing - determine pose of camera according to landmarks (pose estimation) R. Sim, S. Polifroni, G. Dudek, June 2003

68 Comparison – Results: all operators are better than random; radial symmetry gives the worst results; the Caltech operator performs similarly to the edge and eigenvalue operators BUT is more complex to implement and needs more computing time, so it is a less preferred candidate in practice. R. Sim, S. Polifroni, G. Dudek, June 2003

69 Outline What is Attention Attention in Object Recognition Saliency Model Feature Integration Theory Saliency Algorithm Saliency & Object Recognition Comparison Inner Scene Similarity Model Biological motivation Difficulty of Search Tasks Algorithms FLNN VSLE

70 The Problem. [Figure: candidate image patches visited by the object-recognition module in order 1–6.]

71 Outline What is Attention Attention in Object Recognition Saliency Model Feature Integration Theory Saliency Algorithm Saliency & Object Recognition Comparison Inner Scene Similarity Model Biological motivation Difficulty of Search Tasks Algorithms FLNN VSLE

72 Biological Motivation An alternative approach: continuous search difficulty Based on similarity: –Between Targets and Non-Targets in the scene –Between Non-Targets and Non-Targets in the scene Similar structural units do not need separate treatment Structural units similar to a possible target get high priority Duncan & Humphreys [89]

73 Biological Motivation. [Figure: search difficulty as a function of target-nontarget similarity and nontarget-nontarget similarity.] Duncan & Humphreys [89]

74 Biological Motivation Explains pop-out vs. serial search phenomenon Non-targets: Target: Duncan & Humphreys [89]

75 Biological Motivation Explains pop-out vs. serial search phenomenon Non-targets: Target: Duncan & Humphreys [89]

76 Biological Motivation: explains the pop-out vs. serial search phenomenon. [Figure: the two example search tasks placed on the search-difficulty surface according to their target-nontarget and nontarget-nontarget similarities.] Duncan & Humphreys [89]

77 Using Inner-scene Similarities: every candidate is characterized by a vector of n attributes in an n-dimensional metric space. A candidate is a point in the space; some distance function d is associated with the space. Avraham & Lindenbaum [04], Avraham & Lindenbaum [05]

78 Using Inner-scene Similarities – Example: one feature only (object area); d is the regular Euclidean distance. [Figure: candidates placed in the feature space.]
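
A tiny illustration of this representation (the area values are made up): each candidate is a point in the feature space, and the pairwise distances are all the later search algorithms need:

```python
import numpy as np

# Each candidate is a vector of n attributes; here n = 1 (object area).
candidates = np.array([[12.0], [11.5], [46.0], [12.3], [45.2]])

# Pairwise Euclidean distance matrix d(x_i, x_j).
d = np.linalg.norm(candidates[:, None, :] - candidates[None, :, :], axis=-1)
print(d.round(1))
```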

79 Outline What is Attention Attention in Object Recognition Saliency Model Feature Integration Theory Saliency Algorithm Saliency & Object Recognition Comparison Inner Scene Similarity Model Biological motivation Difficulty of Search Tasks Algorithms FLNN VSLE

80 Difficulty of Search: the difficulty measure is the number of queries until the first target is found. Two main factors: the distance between targets and non-targets, and the distance between non-targets and non-targets. [Figure: feature space.]

81 Difficulty of Search – Cover. [Figure: the feature space covered by circles.] c: the number of circles in the cover.

82 Difficulty of Search: c will be our measure of the search difficulty, but we need some constraint on the circles’ size! c: the number of circles.

83 Difficulty of Search. d_t: the max-min target distance.

84 Difficulty of Search. d_t-cover: a cover of the candidates by circles of diameter d_t.

85 Difficulty of Search. Minimum d_t-cover: c is the number of circles (of diameter d_t) in the minimal d_t-cover.

86 Difficulty of Search. [Figure: example feature space covered by circles of diameter d_t; here c = 7.]

87 Difficulty of Search. [Figure: insects example in the feature space; c = 3.]

88 Difficulty of Search. Example: easy search, c = 2.

89 Difficulty of Search. Example: hard search, c = number of candidates.

90 Difficulty of Search – define the difficulty using c. Lower bound: every search algorithm needs c calls to the oracle before finding the first target, in the worst case. Upper bound: there is an algorithm that needs at most c calls to the oracle to find the first target, for all search tasks.

91 Difficulty of Search – lower bound: every search algorithm needs c calls to the oracle before finding the first target, in the worst case. [Figure: a worst-case example with circles numbered 1–5.]

92 Difficulty of Search – upper bound: there is an algorithm that needs at most c calls to the oracle to find the first target, for all search tasks: FLNN (Farthest Labeled Nearest Neighbor).

93 Outline What is Attention Attention in Object Recognition Saliency Model Feature Integration Theory Saliency Algorithm Saliency & Object Recognition Comparison Inner Scene Similarity Model Biological motivation Difficulty of Search Tasks Algorithms FLNN VSLE

94 Efficient Algorithms – FLNN (Farthest Labeled Nearest Neighbor). [Figure: the first five queries on the example feature space.] c is a tight bound!
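
A sketch of FLNN under our reading of the slides: repeatedly query the unlabeled candidate whose nearest labeled (already-queried) candidate is farthest away; oracle(i) is a hypothetical stand-in for the expensive recognizer:

```python
import numpy as np

def flnn(points, oracle):
    """Farthest Labeled Nearest Neighbor search for the first target.
    points: (n, dim) array of candidate feature vectors."""
    n = len(points)
    unlabeled = set(range(n))
    nearest_labeled = np.full(n, np.inf)        # distance to the nearest queried candidate
    order = []
    while unlabeled:
        i = max(unlabeled, key=lambda j: nearest_labeled[j])   # farthest from all labels
        order.append(i)
        if oracle(i):                           # expensive check: is candidate i a target?
            return i, order
        unlabeled.remove(i)
        d = np.linalg.norm(points - points[i], axis=1)
        nearest_labeled = np.minimum(nearest_labeled, d)
    return None, order
```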

95 Difficulty of Search – how do we compute c? Need to know d_t; compute the minimal d_t-cover; count the number of circles. [Figure: example with c = 7.]

96 Difficulty of Search – how do we compute c? Need to know d_t: but to know the exact d_t we need to know all the targets and non-targets, and that’s what we’re looking for… Compute the minimal d_t-cover: computing the minimal d_t-cover is NP-complete! Count the number of circles = c: ok, that’s easy…

97 Difficulty of Search – upper & lower bounds on c. Upper bounds: the number of candidates; if d_t is known to be larger than some d_0, the cover size can be approximated. Lower bounds: the FLNN worst case; if d_t is known to be larger than some d_0, the cover size can be approximated.
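
When d_t is only known to exceed some d_0, the cover size can be approximated greedily; a sketch (whether circles are measured by radius or diameter is glossed over here):

```python
import numpy as np

def greedy_cover_size(points, d0):
    """Greedy approximation of a d0-cover: repeatedly pick an uncovered point
    and remove everything within d0 of it; the number of picks bounds c."""
    remaining = list(range(len(points)))
    c = 0
    while remaining:
        center = points[remaining[0]]
        remaining = [i for i in remaining
                     if np.linalg.norm(points[i] - center) > d0]
        c += 1
    return c
```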

98 Outline What is Attention Attention in Object Recognition Saliency Model Feature Integration Theory Saliency Algorithm Saliency & Object Recognition Comparison Inner Scene Similarity Model Biological motivation Difficulty of Search Tasks Algorithms FLNN VSLE

99 Efficient Algorithms – improving FLNN. What’s wrong with FLNN? It relates only to the nearest known neighbor; it finds only the first target efficiently; it cannot be easily extended to include top-down information.

100 Efficient Algorithms – VSLE (Visual Search using Linear Estimation). Each candidate has a probability of being a target; query the candidate with the highest probability; update the other candidates’ probabilities according to the known results: every known target/non-target affects the other candidates, the closer the stronger. If we know the results for candidates 1,…,m, the remaining candidates’ labels are estimated from them: a dynamic priority map.
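
A much-simplified sketch of the VSLE idea: a dynamic priority map in which each query result pulls the probabilities of nearby candidates toward it. The Gaussian weighting and sigma are our assumptions; the actual method uses the linear estimation of slides 115–116:

```python
import numpy as np

def vsle(points, oracle, n_queries=10, prior=0.5, sigma=1.0):
    """Query the most probable candidate, then propagate its label to the others
    with an influence that decays with distance in the feature space."""
    n = len(points)
    prob = np.full(n, prior)
    known = {}
    order = []
    for _ in range(min(n_queries, n)):
        i = max((j for j in range(n) if j not in known), key=lambda j: prob[j])
        order.append(i)
        label = 1.0 if oracle(i) else 0.0       # expensive recognizer call
        known[i] = label
        prob[i] = label
        for j in range(n):
            if j in known:
                continue
            w = np.exp(-np.linalg.norm(points[j] - points[i]) ** 2 / (2 * sigma ** 2))
            prob[j] = (1 - w) * prob[j] + w * label   # closer candidates move more
    return order, prob
```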

101 VSLE (Visual Search using Linear Estimation). [Figure: each candidate annotated with its current probability of being a target.]

102 Efficient Algorithms – VSLE (Visual Search using Linear Estimation). [Figure: after querying a candidate, the probabilities of nearby candidates are updated according to the result.]

103 Combining Top-Down Information Simply specify the initial probabilities to match previous known data Add known target objects to the space. This will alter the probabilities accordingly and speed up search Efficient Algorithms

104 Experiment 1: COIL-100 Efficient Algorithms Columbia Object Image Library [96]

105 Experiment 1: COIL-100. Features: 1st, 2nd and 3rd Gaussian derivatives (9 basis filters) at 5 scales, 9×5 = 45 features; Euclidean distance. Rao & Ballard [95]
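
A sketch of a 45-dimensional descriptor consistent with this slide (1st–3rd order Gaussian derivatives, 9 basis filters, 5 scales); the exact basis set, the scales and the sampling of each response at the patch center are our assumptions:

```python
import numpy as np
from scipy import ndimage

def gaussian_derivative_features(patch, scales=(1, 2, 4, 8, 16)):
    """9 Gaussian-derivative basis filters (orders 1-3) at 5 scales = 45 features."""
    orders = [(0, 1), (1, 0),                       # 1st order: 2 filters
              (0, 2), (1, 1), (2, 0),               # 2nd order: 3 filters
              (0, 3), (1, 2), (2, 1), (3, 0)]       # 3rd order: 4 filters
    cy, cx = patch.shape[0] // 2, patch.shape[1] // 2
    feats = [ndimage.gaussian_filter(patch.astype(float), sigma=s, order=o)[cy, cx]
             for s in scales for o in orders]
    return np.array(feats)                          # shape (45,)
```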

106 Experiment 1: COIL-100. [Figure: number of queries needed for test sets of 10 cars and 10 cups.]

107 Experiment 2: hand-segmented images. Every large segment is a candidate; 24 candidates, 4 targets. Berkeley hand-segmented DB, Martin, Fowlkes, Tal & Malik [01]

108 Experiment 2: hand-segmented images. Features: color histograms, separated into 8 bins each, 64 features in total; Euclidean distance.

109 Experiment 3: automatic color segmentation Automatic color segmented image for face detection Efficient Algorithms

110 Experiment 3: color segmentation. 146 candidates; 4 features: segment size and the mean values of red, green and blue; Euclidean distance. [Figure: number of queries.]

111 Combining top-down information: add known targets to the space. [Figure: number of queries with and without additional known targets.]

112 Summary: saliency model vs. similarity model. Saliency model: biologically motivated; uses bottom-up, allows combining top-down information; requires segmentation; static priority map. Similarity model: biologically motivated; uses bottom-up, allows combining top-down information; no segmentation; dynamic priority map; measures the search difficulty.

113 Summary What is attention Aid object recognition tasks by choosing the area of interest Two approaches: saliency model and similarity model –Biological motivation –Algorithms

114 Thank You!

115 Linearly Estimating l(x_k). A linear estimation for l(x_k) is formed from the known labels; the coefficients are chosen to minimize the (expected squared) error, and solving a set of equations gives the estimation.

116 Linearly Estimating l(x_k). Estimation: a weighted combination of the vector of known labels, with weights computed from a system of equations (i, j = 1,…,m); R and r depend only on the distances and are computed in advance, once.
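
The formulas on slides 115–116 were images and did not survive extraction; a standard linear least-squares reconstruction consistent with the slide text (notation assumed) is:

```latex
\hat{l}(x_k) \;=\; \sum_{i=1}^{m} a_i\, l(x_i) \;=\; \mathbf{a}^{\top}\mathbf{l},
\qquad
\mathbf{a} \;=\; \arg\min_{\mathbf{a}}\; E\!\left[\bigl(\hat{l}(x_k) - l(x_k)\bigr)^{2}\right]
\;=\; R^{-1}\mathbf{r}
```

where l = (l(x_1),…,l(x_m)) is the vector of known labels, the entries R_ij (i, j = 1,…,m) depend only on the distances d(x_i, x_j) between known candidates, and r_i depends only on the distance d(x_k, x_i); R and r therefore depend only on the distances and can be computed in advance, once.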

