High-level vision
High-level vision: Models of object recognition; Top-down influences; Navigation/Movement
Last time... We spent a lot of time focusing on lines: how you get them, why you would want them, and so on. We need to move from lines to objects. How do you recognize an object from an organization of lines? How does perception connect to memory?
Models of object recognition: Template; Feature; “New wave” of feature models (3D features)
Template model
Template problems: size, orientation, need too many templates
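To make the size and orientation problems concrete, here is a minimal sketch of a template matcher. It is not taken from the lecture: the 4x4 grid for the letter L and the function names `match_score`, `rotate_cw`, and `scale_up` are all hypothetical, chosen only for illustration.

```python
# Toy template matcher: the input is "recognized" only when it lines up
# with the stored template cell by cell. All names here are hypothetical.

TEMPLATE_L = [
    "X...",
    "X...",
    "X...",
    "XXXX",
]

def match_score(image, template):
    """Fraction of cells that agree between image and template."""
    same_shape = len(image) == len(template) and all(
        len(r) == len(t) for r, t in zip(image, template)
    )
    if not same_shape:
        return 0.0  # a different size cannot even be laid over the template
    cells = [a == b for r, t in zip(image, template) for a, b in zip(r, t)]
    return sum(cells) / len(cells)

def rotate_cw(image):
    """Rotate a grid of strings 90 degrees clockwise."""
    return ["".join(row[c] for row in reversed(image))
            for c in range(len(image[0]))]

def scale_up(image, k=2):
    """Enlarge a grid by repeating every cell k times in both directions."""
    return ["".join(ch * k for ch in row) for row in image for _ in range(k)]

same = match_score(TEMPLATE_L, TEMPLATE_L)               # identical input
rotated = match_score(rotate_cw(TEMPLATE_L), TEMPLATE_L)  # same letter, turned
bigger = match_score(scale_up(TEMPLATE_L), TEMPLATE_L)    # same letter, larger
print(same, rotated, bigger)
```

The same letter matches perfectly only in its stored size and orientation. Covering every size and orientation would take a separate template for each, which is the "too many templates" problem.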
Feature model--pandemonium
Feature models Good: visual input does seem to be decomposed into features. Good: physiological evidence about simple features from Hubel & Wiesel. Problems: orientation, missing features, natural objects.
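A rough sketch of a pandemonium-style feature model may help here. It assumes feature demons that report what they detect, cognitive demons that "shout" according to how well the report fits their letter, and a decision demon that picks the loudest. The feature lists and the scoring rule are illustrative assumptions, not the lecture's.

```python
# Toy pandemonium: each cognitive demon shouts louder the more of its
# features are reported and quieter for features it does not expect.
# Feature lists and the scoring rule are illustrative assumptions.

LETTER_FEATURES = {
    "A": {"left_oblique", "right_oblique", "horizontal_bar"},
    "H": {"left_vertical", "right_vertical", "horizontal_bar"},
    "L": {"left_vertical", "bottom_bar"},
    "E": {"left_vertical", "top_bar", "horizontal_bar", "bottom_bar"},
}

def shout(letter, detected):
    """Loudness of one cognitive demon given the detected features."""
    wanted = LETTER_FEATURES[letter]
    hits = len(wanted & detected)
    false_alarms = len(detected - wanted)
    return (hits - false_alarms) / len(wanted)

def decide(detected):
    """Decision demon: return the loudest letter and all shout levels."""
    shouts = {letter: shout(letter, detected) for letter in LETTER_FEATURES}
    return max(shouts, key=shouts.get), shouts

clean_h, _ = decide({"left_vertical", "right_vertical", "horizontal_bar"})
# An E whose top and middle bars went undetected:
damaged_e, _ = decide({"left_vertical", "bottom_bar"})
print(clean_h, damaged_e)
```

With all features present the H demon wins. When the E loses two of its bars, the L demon shouts loudest, which is one way to see the "missing features" problem.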
Natural objects: What are the features of a dog? (Figure labels: nose, ear, front leg, tail, back leg.)
Principle from Gestalt psychology: good continuation (figure labels: B, D)
Good continuation can be used to find the parts of objects
“New wave” of feature models: these models use three-dimensional features.
Biederman’s geon model Geons: simple objects. You usually only need to see the edges of a geon. Geons have properties that are invariant to rotations.
Experiment Which can be better identified at a very brief exposure?
Problems with geons: Do geons really represent all shapes? How are relationships among geons coded?
An alternative: Local viewpoints Note that you can identify objects from many different orientations. Templates & feature models couldn’t account for this; geons can. BUT how good are you really at doing this?
Alternative: Some researchers have suggested that object representations are NOT viewpoint independent. Rather, we store views of objects the way we see them. HUH? Isn’t that the template theory? The difference is that you do some “fixing” (size, rotation) of the image to fit the template.
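As a minimal sketch of what "fixing" the image could mean, the code below stores a single view and rotates the input in 90-degree steps until it fits. The stored grid, the step size, and the names `STORED_VIEW`, `rotate_cw`, and `recognize` are assumptions made for illustration; the lecture does not specify a mechanism.

```python
# Toy "local view" matcher: one stored view plus a normalization step
# that rotates the input until it fits. All names are hypothetical.

STORED_VIEW = [
    "X...",
    "X...",
    "X...",
    "XXXX",
]

def rotate_cw(image):
    """Rotate a grid of strings 90 degrees clockwise."""
    return ["".join(row[c] for row in reversed(image))
            for c in range(len(image[0]))]

def recognize(image, stored=STORED_VIEW):
    """Return how many 90-degree rotations were needed to fit the stored
    view, or None if no rotation makes the input fit."""
    candidate = list(image)
    for steps in range(4):
        if candidate == stored:
            return steps
        candidate = rotate_cw(candidate)
    return None

upright = recognize(STORED_VIEW)
turned = recognize(rotate_cw(STORED_VIEW))
unknown = recognize(["XXXX"] * 4)
print(upright, turned, unknown)
```

The sketch gets away with trying every rotation blindly. It does not say how a perceiver would know which rotation to apply before knowing what the object is.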
Tarr’s local view experiments
Tarr Results
Problems with local view: Chicken & egg: how do you know how to rotate the image before you can identify it? What is stored is clearly not literal pictures... but what is it? How are what you see and what is in memory matched?
Chicken & egg problem... Bottom-up processing refers to beginning with relatively raw, unprocessed sensory information and building towards more conceptual representations. Top-down processing refers to conceptual knowledge influencing the processing or interpretation of lower-level perceptual information.
NOTE--we’ve been acting as though all processing were bottom up.
Example: ambiguous figures
Example
Example:
It appears that in top-down processing you use conceptual information to generate hypotheses about what the stimulus might be, then test these hypotheses against the stimulus.
More formal work: watch for the object to appear
The Parsing Paradox If perceptual organization is a matter of mapping sensations onto structural schema, which happens first: interpreting the whole or interpreting the parts? How can someone recognize a face until he has first recognized the eyes, nose, mouth and ears? Then again, how can you recognize the parts until you know that they are part of a face? --Stephen Palmer
Question: do you process the top-down information at the same time as the bottom-up info? You’ll see a circle--try to identify the object that appears in the circle.
The result: people are better at identifying the object when the scene makes sense than when it’s jumbled.
How is this possible? Word superiority effect It is easier to recognize a letter in a word than in isolation.
Identify the letter that will appear in the circle
TAKE
WOLP
Word superiority effect: People are faster and more reliable at identifying a letter when it’s part of a word than when it’s part of a non-word. Isn’t there the same chicken-egg problem? Don’t you need to know the letters to identify the word? So then how is the word helping to identify the letters (which you already know)?
Model of word identification
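One way out of the chicken-egg problem is to let partial letter information activate candidate words, which then feed support back to the letters they contain. The sketch below is a toy version of that idea and is not necessarily the model pictured on this slide. The tiny lexicon, the evidence values, and the names `LEXICON` and `letter_scores` are all assumptions.

```python
# Toy word-to-letter feedback: a degraded letter gets extra support when
# the surrounding letters fit a known word that contains it.
# The lexicon and all numbers are hypothetical.

LEXICON = {"TAKE", "TAPE", "TALE", "WOLF", "WORD"}

def letter_scores(evidence, context, position, feedback=0.25):
    """Combine bottom-up evidence with top-down feedback from words.

    evidence: dict mapping candidate letters to bottom-up support
    context:  the display, with '?' marking the degraded letter
    position: index of the degraded letter within context
    """
    scores = dict(evidence)
    for word in LEXICON:
        if len(word) != len(context):
            continue
        fits = all(c == "?" or c == w for c, w in zip(context, word))
        if fits and word[position] in scores:
            scores[word[position]] += feedback  # the word backs its own letter
    return scores

# The same degraded letter, equally consistent with K and R:
evidence = {"K": 0.5, "R": 0.5}
in_word = letter_scores(evidence, "TA?E", 2)     # TAKE is a word
in_nonword = letter_scores(evidence, "WO?P", 2)  # no word fits WO?P
print(in_word, in_nonword)
```

With identical bottom-up evidence, K pulls ahead only in the word context, because TAKE is in the lexicon and nothing fits WO?P. The word does not need fully identified letters to help; partial evidence from the other positions is enough to start the feedback.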
Navigation vs. Object identification There is increasing evidence that spatial information that helps us get around is independent of the information that helps us identify objects.
Mishkin & Ungerleider
Mishkin & Ungerleider
Mishkin & Ungerleider: Spatial vs. Object