Attention and Selection: Part 2. The increasingly important role played by objects in studies of visual attention Miller’s ‘Magic Number 7’ has continued.

Attention and Selection: Part 2

The increasingly important role played by objects in studies of visual attention
Miller’s ‘Magic Number 7’ has continued to haunt us even beyond studies of short-term memory (STM). There is a limitation in visual information processing that goes beyond the limitations of acuity and of channel capacity: the perceptual system is limited in what it can individuate and in how many of these individuals it can deal with at one time. The capacity to individuate is different from memory capacity and discrimination capacity. This notion of individuating and of individuals may be related to Miller’s “chunks”, but it has a special role in vision, which we will explore later. A chunk is a relatively ill-defined notion in general, whereas the units of visual attention are better thought of as objects.

Experimental evidence for attentional selection of objects
Single-object advantage: pairs of judgments are faster when both apply to the same perceived object
Entire objects acquire enhanced sensitivity from focal attention to a part of the object
Single-object advantage occurs even with generalized “objects” defined in feature space
Simultanagnosia and hemispatial neglect show object-based effects
Attention moves with moving objects:
  IOR
  Object files
  MOT

Single-object superiority even when the shapes are controlled

More controls for the Baylis study… (Baylis, 1994) Controls for separability, convexity, area…

Attention spreads over perceived objects
Using a priming method, Egly, Driver & Rafal (1994) showed that the effect of a prime spreads more readily to other parts of the same visual object than to equally distant parts of different objects.
[Figure: four cueing displays. Panel labels: “Spreads to B and not C” (two panels), “Spreads to C and not B” (two panels)]

There is also evidence from neuropsychology that is consistent with the object-based view
Neglect
Balint and simultanagnosic patients

Visual neglect syndrome is object-based
When a right neglect patient is shown a dumbbell that rotates, the patient continues to neglect the object that had been on the right, even though it is now on the left (Behrmann & Tipper, 1999).

Simultanagnosic (Balint Syndrome) patients only attend to one object at a time
Simultanagnosic patients cannot judge the relative length of two lines, but they can tell that a figure made by connecting the ends of the lines is not a rectangle but a trapezoid (Holmes & Horrax, 1919).

Balint patients can only attend to one object at a time even if the objects are overlapping (Luria, 1959)

Objecthood endures over time
Picking out objects is an example of the parsing of a scene into things that are likely to be physical objects. But the same must occur in time: temporal parsing entails solving the correspondence problem.
Several studies have shown that what counts as an object (as the same object) endures over time and over changes in location. Certain forms of changes in location, as well as disappearances in time, preserve objecthood.
This gives what we have been calling a “visual object” a real physical-object character and partly justifies our calling it an “object”.

The “Ternus Configuration” to demonstrate the early visual effect of objecthood
Short time delays result in “element motion”, in which the middle object persists as the “same object” and does not appear to move, so the end objects appear to move.

Long time delays result in “group motion”, in which the middle object does not persist but is perceived as a new object each time it reappears.

Inhibition of return (IOR) appears to be object-based (as well as to some extent location-based)
The original study used static objects. Then Tipper, Driver & Weaver (1991) showed that IOR moves with the inhibited object.

IOR appears to be object-based (it travels with the object that was attended)

Objects endure despite changes in location; and they carry their history with them!
Object File Theory of Kahneman & Treisman
Letters are faster to read if they appear in the same box where they appeared initially. Priming travels with the object.
According to the theory, when an object first appears, a file is created for it, and the properties of the object are encoded and subsequently accessed through this object file.
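The object-file idea can be read as a small data structure. The following is only a minimal sketch under stated assumptions: the names `ObjectFile`, `preview`, `move` and `naming_is_primed` are hypothetical, and the code illustrates the logic of the reviewing paradigm (priming follows the object, not the place) rather than any model published by Kahneman and Treisman.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectFile:
    """Hypothetical record opened when an object first appears."""
    location: tuple
    properties: dict = field(default_factory=dict)


def preview(location, letter):
    # A file is created for the box and the previewed letter is stored in it.
    obj_file = ObjectFile(location=location)
    obj_file.properties["letter"] = letter
    return obj_file


def move(obj_file, new_location):
    # The object changes place; the file (and its history) goes with it.
    obj_file.location = new_location


def naming_is_primed(obj_file, target_letter):
    # Naming is assumed to be faster when the target matches what is
    # already stored in the file of the object it appears in.
    return obj_file.properties.get("letter") == target_letter


box_a = preview((0, 0), "K")
box_b = preview((5, 0), "T")
move(box_a, (0, 5))
move(box_b, (5, 5))

same_object = naming_is_primed(box_a, "K")       # target appears in the box that showed K
different_object = naming_is_primed(box_b, "K")  # target appears in the other box
```

Because the stored letter is reached through the file rather than through a location, moving the box does not change the outcome of the lookup.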

Demo of Object File Experiment

Object File 1 Object File 2

The limitation of individuation and selection
There are obviously limitations on the input side of vision that depend on the acuity of the sensors and the range of physical properties to which they respond. But there is a limitation beyond that of acuity: the perceptual system is limited in what it can individuate and in how many of these individuals it can deal with at one time. The capacity to individuate is different from the capacity to discriminate.
There is reason to think that individuating is a separate and distinct process in early vision.

Picking out is different from discriminating: Pick out the third contour from the left

Individuating as a distinct process
Individuating has its own psychometric function: the minimum distance for individuating is much larger than for discriminating.
It may be that in vision our attention is limited in the number of things we can individuate and simultaneously access (more on this later). But how do you determine what counts as a “thing”?
Individuating is a prerequisite for recognition of patterns and other properties defined among a number of individual parts.
  An example of how we can easily detect patterns if they are defined over a small enough number of parts is subitizing.
Another area where the concept of an individual has become important is in cognitive development, where it is clear that babies are sensitive to the numerosity of individual things in a way that is distinct from their perceptual abilities but is limited in its capacity.

Pick out 3 dots and keep track of them
In a field of identical elements you can select a number of them and move your attention among them (e.g., “move one up” or “move 2 right” etc.) so long as at no time do you have to hold on to more than 4 dots.

Pick out 3 dots I will cue and keep track of them
After you pick out the 3 cued dots, I’ll ask you to move your attention from the center one. Describe the new relation among the three dots.
In a field of identical elements you can select several of them and move your attention among them (e.g., “move one up” or “move 2 right” etc.) so long as at no time do you have to hold on to more than 4 dots (Intriligator & Cavanagh, 2001).

Individuals and patterns
Vision does not recognize patterns by applying templates, since the size, shape, retinal location, orientation, and other properties must be abstracted away.
A pattern is encoded over time (and often over saccades), so the visual system must keep track of the individual parts and merge descriptions of the same part at different times and stages of encoding.
Therefore, in order to recognize a pattern, the visual system must pick out individual parts and bind them to the representation being constructed.
Examples include what Ullman called “visual routines”.

Are there collinear items (n>3)?

Several objects must be picked out at once in making relational judgments
The same is true for other relational judgments like inside or on-the-same-contour… etc. We must pick out the relevant individual objects first.
Respond: Inside-same contour? On-same contour?

When items cannot be individuated, predicates over them cannot be evaluated
Do these figures contain one or two distinct curves?
Individuating these curves requires a “curve tracing” operation, so Number_of_curves(C1, C2, …) takes time proportional to the length of the shortest curve.
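To make the cost of curve tracing concrete, here is a minimal sketch that counts curves in a pixel set by tracing each one point by point. It is an ordinary connected-component trace and an assumption of this example, not a claim about the operation early vision actually uses; the function name `count_curves` and the 8-neighbour rule are hypothetical choices. The sketch traces every curve completely, so it shows only that the work grows with curve length, not the specific “shortest curve” claim in the slide.

```python
def count_curves(pixels):
    """Count distinct curves in a set of (x, y) pixels by tracing each one.

    Returns (number_of_curves, trace_steps). Two pixels are treated as part
    of the same curve when they are 8-neighbours (an assumption of this sketch).
    """
    remaining = set(pixels)
    curves = 0
    steps = 0
    while remaining:
        # Start a new trace from any pixel not yet visited.
        curves += 1
        frontier = [remaining.pop()]
        while frontier:
            x, y = frontier.pop()
            steps += 1  # one unit of work per traced pixel
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    neighbour = (x + dx, y + dy)
                    if neighbour in remaining:
                        remaining.remove(neighbour)
                        frontier.append(neighbour)
    return curves, steps


# One unbroken horizontal line versus the same line with a gap in the middle.
one_curve = [(x, 0) for x in range(10)]
two_curves = [(x, 0) for x in range(4)] + [(x, 0) for x in range(6, 10)]

result_one = count_curves(one_curve)
result_two = count_curves(two_curves)
```

The answer to “one curve or two?” is not available until the trace has run, which is the point of the slide: the predicate cannot be evaluated before the curves are individuated.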

The figure on the left is one continuous curve, the one on the right is two distinct curves – as shown in color.

Subitizing vs Counting
How many squares are there?
Concentric squares cannot be subitized because individuating them requires the serial operation of curve tracing.
Subitizing indexed objects is fast, accurate and (relatively) independent of how many items there are. But a prerequisite for subitizing is being able to pick out the relevant individuals. Only the squares on the right can be subitized, because picking out concentric items requires serial attention.

Signature subitizing phenomena only appear when objects are automatically individuated and indexed
Trick, L. M., & Pylyshyn, Z. W. (1994). Why are small and large numbers enumerated differently? A limited capacity preattentive stage in vision. Psychological Review, 101(1), 80-102.

Example of subitizing popout and non-popout features (Count Pink vs. Count Online)

A different view of the role of attention
What’s the equivalent of “chunks” in vision (visual chunks?)
Attention as the “glue” that allows properties that occur together to be represented as conjoined
Experiments showing the special difficulty that vision has in detecting conjunctions of several properties have provided a basis for understanding an important problem in visual analysis.

Encoding conjunctions of properties
Experiments showing the special difficulty that vision has in detecting conjunctions of several properties have provided a basis for understanding an important problem in visual analysis.

How are conjunctions of features detected?
Read the vertical line of digits in the following display.
Under these conditions conjunction errors are very frequent.

Rapid visual search (Treisman) Find the following simple figure in the next slide:

This case is easy – and the time is independent of how many nontargets there are – because there is only one red item. This is called a ‘popout’ search.

This case is also easy – and the time is independent of how many nontargets there are – because there is only one right-leaning item. This is also a ‘popout’ search.

Rapid visual search (conjunction) Find the following simple figure in the next slide:

Find the unique item in this slide

Serial vs parallel search?
Finding an element that differs from all others in a scene by a single feature – called a single-feature search – is fast, error-free and almost independent of how many nontargets there are.
Finding an object that differs from all others by a conjunction of two or more features (and that shares at least one feature with each object in the scene) – called a conjunction search – is usually slow and error-prone, and gets worse the more nontargets there are in the scene*.
These results suggest that in order to find a conjunction, which requires solving the binding problem, attention has to be scanned serially over all the objects.
* This way of putting it simplifies things. Under certain conditions the serial-parallel distinction breaks down.
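The contrast between the two kinds of search can be sketched as a toy timing model. This is an illustration only: the function name and the two timing parameters (`base_ms`, `per_item_ms`) are made-up assumptions, not measured values, and the conjunction case assumes a serial self-terminating scan with the target present, in which on average (n + 1) / 2 items are inspected before the target is found.

```python
def expected_search_time(set_size, search_type, base_ms=400.0, per_item_ms=50.0):
    """Toy model of mean search time (ms) when the target is present.

    base_ms and per_item_ms are illustrative assumptions, not data.
    """
    if search_type == "feature":
        # Popout: time does not depend on the number of nontargets.
        return base_ms
    if search_type == "conjunction":
        # Serial self-terminating scan: on average (n + 1) / 2 items
        # are inspected before the target is reached.
        return base_ms + per_item_ms * (set_size + 1) / 2
    raise ValueError("search_type must be 'feature' or 'conjunction'")


set_sizes = [5, 9, 29]
feature_times = [expected_search_time(n, "feature") for n in set_sizes]
conjunction_times = [expected_search_time(n, "conjunction") for n in set_sizes]
```

In this model the feature-search function is flat across set sizes and the conjunction-search function rises linearly. As the footnote on the slide notes, real data do not always divide this cleanly.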

Single-Feature vs Conjunction-feature search

Treisman’s Attention as Glue Hypothesis
Another answer to what attention is for
The purpose of visual attention is to bind properties together in order to recognize objects.
  This is called the “binding problem” or the “many properties problem”, and it is of considerable interest to philosophers as well as vision scientists.
  We can recognize not only the presence of “squareness” and “redness” in our field of view, but we can also distinguish between different ways they may be conjoined.

Break

The “Binding Problem”
This sort of model of early vision does not work: it does not provide information that would allow the “cognitive demons” to solve the binding problem. Information about grouping (conjunction) of properties is lost.
[Figure: the “Pandemonium” model]
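The loss of conjunction information can be shown with a tiny example. In this sketch (the function name `feature_demons` and the two displays are hypothetical), each feature detector reports only whether its feature is present somewhere in the display, which is a simplifying assumption about a Pandemonium-style architecture.

```python
def feature_demons(display):
    """Report which features are present anywhere in the display,
    ignoring which object each feature belongs to."""
    return {value for obj in display for value in obj.values()}


# Two displays that differ only in how colour and shape are conjoined.
display_1 = [
    {"colour": "red", "shape": "square"},
    {"colour": "green", "shape": "circle"},
]
display_2 = [
    {"colour": "green", "shape": "square"},
    {"colour": "red", "shape": "circle"},
]

report_1 = feature_demons(display_1)
report_2 = feature_demons(display_2)

# The displays are different, yet the detectors' reports are identical.
displays_differ = display_1 != display_2
reports_identical = report_1 == report_2
```

Anything downstream that sees only these reports cannot tell a red square from a green one, which is what is meant by saying that information about the grouping of properties is lost.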

How is the binding problem solved?
Here is the most common view. To determine whether properties P and Q are conjoined, you detect P, encode its location, then check whether Q is found at that location.
A more detailed model is Treisman’s Feature Integration Theory. It postulates separate maps for each feature type, as well as a master map that allows attention to direct the search for features.
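The conjunction-by-location check described above can be sketched in a few lines. This is a hedged illustration, not an implementation of Feature Integration Theory: the names `build_feature_maps` and `conjoined` are hypothetical, and locations are treated as single points, which is exactly the kind of simplification the slides go on to question.

```python
def build_feature_maps(display):
    """Build one map per feature: feature -> set of locations where it occurs.

    display is a list of (location, features) pairs; point locations are
    an assumption of this sketch.
    """
    maps = {}
    for location, features in display:
        for feature in features:
            maps.setdefault(feature, set()).add(location)
    return maps


def conjoined(maps, p, q):
    # Detect P, take each location where it occurs, and check whether Q
    # is found at that same location.
    return any(location in maps.get(q, set()) for location in maps.get(p, set()))


display = [
    ((1, 1), {"red", "square"}),
    ((4, 2), {"green", "circle"}),
]
maps = build_feature_maps(display)

red_square = conjoined(maps, "red", "square")
red_circle = conjoined(maps, "red", "circle")
```

The check works here only because each object has been reduced to one shared point in advance; deciding which point or region to use is the hard part.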

The role of attention to location in Treisman’s Feature Integration Theory

Problems with solving the binding problem in terms of co-location of features
In order for conjunction-by-location to work, one has to have determined the location of each property.
  A punctate location will not do, since properties have extension and the extension is relevant (cf. encode INSIDE).
  A region will not do unless one knows the boundaries of the region.
In either case one has to have identified the relevant object before one can use its location.
It’s the object that has the properties in question that determines whether the properties are conjoined.
Properties are conjoined just in case they are properties of the same object.

But in encoding properties, early vision can’t just bind them together according to their spatial co-occurrence – even their co-occurrence within the same region. That’s because the relevant region depends on the object. So the selection and binding must be according to the objects that have those properties.

The problem is even worse when the relation between items is “Inside”

If co-location of properties will not give us a way of solving the binding problem, what will?
It’s not being at the same location that binds properties together, it’s being properties of the same object.
This is why we need object-based selection and why the object-based attention literature is relevant …

An alternative view of how we solve the binding problem
If we assume that only properties of selected objects are encoded and that these are stored in object files associated with each object, then properties that belong to the same object are stored in the same object file, which is why they get bound together.
  This automatically solves the binding problem!
  This is the view exemplified by both FINST Theory (1989) and Object File Theory (1992), to be described next.
The assumption that only properties of selected objects are encoded raises the question of what happens to properties of the other objects or properties in a display (more on this later).
The logical answer is that they are not encoded and therefore not available to conceptualization and cognition.
But this is counter-intuitive!

The FINST theory
Why do we need Indexes?
Some background on nonconceptual selection

We need to be able to pick out individual visual objects directly – without mediation of concepts
We need to make nonconceptual contact with the world through perception in order to stop the regress of concepts being defined in terms of other concepts, which are defined in terms of still other concepts.
  What must you be able to do to decide that object O falls under concept C?
  This is sometimes called the symbol grounding problem.
The current proposal is that nonconceptual selection of individual objects is the primitive basis for all conceptualization and predication.
  My argument for nonconceptual selection of token objects as the primitive operation is primarily empirical.
  I begin with the problem of incremental construction of visual representations.

Incremental construction of visual representations and the correspondence problem
A personal experience: drawing geometry diagrams and reasoning from the diagram
This problem arises because the visual representation is constructed incrementally over time.
But visual representations are always constructed over time.
  Amodal completion (Kanizsa)

Begin by drawing a line….

Now draw a second line….

And draw a third line….

Notice what you have so far…. (noticings are local – you encode what you attend to) There is an intersection of two lines… But which of the two lines you drew are they? There is no way to indicate which individual things are seen again unless there is a way to refer to individual things

Look around some more to see what is there ….
Here is another intersection of two lines… Is it the same intersection as the one seen earlier?
To be able to tell without a reference to individuals you would have to encode unique properties of the individual lines. Which properties should you encode?
[Figure labels: L3, L6]

Example of a geometrical figure used in solving a problem in plane geometry: not all of it is seen or noticed at once – coding is incremental.
Consider what happens when vertices are encountered while the figure is scanned. When are two such encounters encounters of the very same vertex?

When a new property of a vertex is noticed, which part of the current representation should be updated? When should a new vertex-representation be added? Answering these questions requires keeping track of individual distal objects. We proposed the mechanism of visual indexes (FINSTs) for this function.

Keeping track by encoding unique properties of individual items will not work in general
A description cannot keep picking out the same individual when the individual is changing its properties unpredictably, even if the description is continually updated.
A perceptual representation is always built up over time, so you would need a way to retrieve and update the previous representation of a particular token element when new properties of that token element are noticed.
Some writers have postulated a “marking” process for counting or computing relational predicates. But where is the “mark” placed? It can’t be placed in the representation, because its purpose is to keep track of things in the world.
People can pick out several individual items even if the items are in a field of identical items – e.g., pick out a dot in a uniform field of dots (examples later).
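The difference between picking out an item by description and picking it out directly can be sketched with a loose programming analogy. The assumption here, stated plainly, is that a Python object reference stands in for a visual index; the class `Dot` and the function `find_by_description` are hypothetical, and nothing in the sketch is meant as a model of how FINSTs are implemented.

```python
class Dot:
    """A token item in the scene; its properties may change at any time."""

    def __init__(self, colour, position):
        self.colour = colour
        self.position = position


def find_by_description(scene, colour):
    # Selection by description: return every item that fits the description.
    return [dot for dot in scene if dot.colour == colour]


# A field of identical items: the description "black dot" fits all of them.
scene = [Dot("black", (0, 0)), Dot("black", (1, 0)), Dot("black", (2, 0))]
matches_before = find_by_description(scene, "black")

# An index (here, a plain reference) picks out one token without describing it.
index = scene[1]

# The indexed item now changes both its colour and its location.
scene[1].colour = "white"
scene[1].position = (5, 5)

# The old description no longer finds it, but the index still refers to it
# and gives access to its current properties.
old_description_finds_it = index in find_by_description(scene, "black")
still_same_token = index is scene[1]
```

The description was ambiguous before the change and wrong after it, while the reference picked out the same token throughout.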

* Footnote
Notice that in the previous example it would not help if you labeled the diagram as you drew it. Why not?
In order to refer to something by its label, say “L1”, you would have to be able to think “X is the thing labeled L1”, which requires that X be able to pick out that particular thing. But picking out a particular thing is the original problem!
Another way to use the label would be if you could think “This is line L1”. But of course you couldn’t have that thought unless you had a way to think “this”! (See the Perry quote.)
Being able to think “this” is another way to view the very problem under discussion. You need an independent way to pick out and refer to an individual visual object – even if it is labeled! (You also need to do this for several individuals simultaneously – this1, this2, … thisn – but more on that later.)

John Perry gives the following example of how behavior can depend on the realization that a particular object in a description and a particular token thing one sees are one and the same thing.
“The author of the book Hiker’s Guide to the Desolation Wilderness stands in the wilderness beside Gilmore Lake, looking at the Mt. Tallac trail as it leaves the lake and climbs the mountain. He desires to leave the wilderness. He believes that the best way out from Gilmore Lake is to follow the Mt. Tallac trail up the mountain … But he doesn’t move. He is lost. He is not sure whether he is standing beside Gilmore Lake, looking at Mt. Tallac, or beside Clyde Lake, looking at the Maggie peaks. Then he begins to move along the Mt. Tallac trail. If asked, he would have to explain the crucial change in his beliefs in this way: ‘I came to believe that this is the Mt. Tallac trail and that is Gilmore Lake’.” (Perry, 1979, p. 4)
The point is that while it’s true that in explaining why people do what they do we need to appeal to what they believe and to their goals, that’s not enough. We also need to appeal to the demonstrative content of their thoughts, to how the thought connects to the world.
