Object Comprehension: It’s more than it looks. Bob McMurray Dept. of Brain and Cognitive Sciences University of Rochester
What is object comprehension? “The most plausible picture, then, is one of a patchwork of mechanism, including Indexing… ….cooperating in such a way as to yield a total object-apprehending apparatus…” Let’s take this further….
In order to perform in this small set of infant experiments, this patchwork of mechanisms (or subfaculties) must include: visual memory; attention; scene segregation (into objects); object individuation/indexing; 2D -> 3D transformation; object category membership (is it a bird, a plane or Superman?); object localization; naïve physics (how do objects relate to each other?); expectation generation (either conceptual or perceptual); goal generation.
Some of these are classically thought of as conceptual, others as perceptual. Does it make sense to talk about where a single object comprehension faculty lies on this continuum? If the system were modular, we could discuss whether individual cognitive subfaculties are conceptual or not. If the system were interactive, it may be hard to discuss even this.
Feedforward and modular: Cognition = F(perception). Easy to draw a line between higher and lower levels of processing. [Diagram labels: Perception, Cognition]
Feedforward and modular: Cognition = F(perception); Perception = F2(cognition); Cognition = F(F2(F(F2(…(cognition)…)))). Perception is influenced by conceptual processes. Is it still purely “perceptual”? [Diagram labels: Perception, Cognition]
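The feedback loop on this slide can be made concrete with a toy numerical sketch. Everything in it is an illustrative assumption rather than anything from the talk: `f` and `f2` are hypothetical linear stand-ins for F and F2, percepts and thoughts are single numbers, and `f2` is also handed the raw stimulus so that the loop has something to anchor to. The only point is that once feedback is present, the settled percept is no longer the same as the bottom-up input.

```python
def f(perception):
    """Hypothetical F: cognition as a function of perception."""
    return 0.5 * perception


def f2(stimulus, cognition):
    """Hypothetical F2: perception as raw stimulus plus top-down feedback."""
    return stimulus + 0.4 * cognition


def settle(stimulus, max_steps=100, tol=1e-12):
    """Alternate F2 and F until cognition stops changing."""
    perception = stimulus          # first pass: purely bottom-up percept
    cognition = f(perception)      # feedforward-only estimate
    for _ in range(max_steps):
        perception = f2(stimulus, cognition)   # feedback reshapes the percept
        new_cognition = f(perception)
        converged = abs(new_cognition - cognition) < tol
        cognition = new_cognition
        if converged:
            break
    return perception, cognition


feedforward_only = f(1.0)
percept, thought = settle(1.0)
print(feedforward_only, percept, thought)
```

With these made-up coefficients the settled percept ends up larger than the raw stimulus, which is one way of picturing why a percept shaped by feedback is hard to label as purely perceptual.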
Plenty of evidence that perceptual/cognitive subfaculties are independent. Object Indexing is Independent of Object Categorization: Xu & Carey (1996). A duck emerges from an occluder and returns. A ball emerges and returns. 10-month-olds expect 1 object. [Diagram label: Occluder]
Object Indexing is Independent of Object Categorization: Simons & Levin (1998): 50% of adults don’t detect when a conversational partner has been switched (behind an occluder). What/Where pathways in visual cortex suggest a dissociation between spatial location and object identity.
Evidence for interaction is good too. Object identity and segregation: in speech, people segment words better than nonwords (or foreign words). Visual attention and language: visual search is impaired or improved by the order of items in spoken instructions (Spivey, Tyler and Eberhard, 2001). 2D -> 3D and object identity: perception of possible objects establishes depth for occluded objects.
Leslie and Scholl argue for pure indexing as a perceptual process but ignore other cognitive subfaculties necessary for object comprehension. Their model can’t account for all the data because it doesn’t cover all of the processes necessary for object comprehension.
Carey and Xu argue that object comprehension has more in common with conceptual representations because it: Guides Volitional Actions; Encodes Physical Knowledge.
Guides Volitional Actions: Object representations must be stored in memory. Information stored in memory must be generally available (promiscuous) to guide action, to be compared with other available representations, etc.
Guides Volitional Actions: Bernal argues that storage in a memory register does not entail informational promiscuity. Perceptual tasks (like color comparisons) require memory but don’t engage central processes. But any meta-perceptual judgment will engage higher processes. Even purely perceptual tasks can engage conceptual thoughts. There is no way to disconnect perception and cognition: when one happens, the other does (even with little information).
Guides Volitional Actions: But Bernal is right that storage in a memory register does not entail informational promiscuity. “Memory” exists at many levels of visual processing. For example, saccadic blindness requires 20-50 ms of visual memory to maintain a coherent scene; the visuo-spatial sketchpad lasts a few seconds. There is no reason to think that there is necessarily one memory register.
Guides Volitional Actions: For Carey and Xu, the issue is guiding action, not informational promiscuity. But even in our color experiment a volitional action (a button press) is generated! Could a purely perceptual object comprehension feed into a system that tracks goals and plans actions?
Guides Volitional Actions: Could a non-promiscuous representation guide action? Possibly for simple actions: turtles can accurately localize an itch and scratch with their brains removed (Gammon & Stein, 2000). Infants reaching for objects? Could a purely perceptual object comprehension feed into a system that tracks goals and plans actions? [Diagram labels: Perception; Object Comprehension; Thoughts, Goals, Concepts; Action Planning; Action]
Guides Volitional Actions: Volitional action, however, is not sufficient to mark object comprehension as conceptual. Could a purely perceptual object comprehension output to a system that tracks goals and plans actions? [Diagram labels: Perception; Cognition; Object Comprehension; Thoughts, Goals, Action Plans; Action]
Guides Volitional Actions: What if indexing is perceptual and categorization is conceptual? What exactly is guiding the actions? “Object comprehension” is too vague… Computational models are necessary. [Diagram labels: Perception; Object Indexing; Object Categorization; More Perceptual; More Conceptual; Thoughts, Goals, Action Planning]
Encodes Physical Knowledge: Bernal is right that naïve physics could be encoded by low-level perceptual mechanisms. Lawful behavior of objects (e.g. falling when released, not passing through one another) creates statistical regularities in the input. An infant sees someone hit their head against a wall 100 times. 100 of those times, the head does not pass through the wall…
Encodes Physical Knowledge: Such statistical regularities can easily be learned by perceptual systems. This knowledge is yet another component in a complex system. Is it independent of other mechanisms? Which mechanisms is it dependent on (gets input from)?
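A minimal sketch of how a regularity like the head-and-wall example could be picked up by counting alone, with no conceptual machinery. The estimator used here (Laplace's rule of succession) and the function name `estimate_probability` are assumptions chosen for illustration; the talk does not commit to any particular learning rule.

```python
def estimate_probability(successes, trials):
    """Laplace's rule of succession: (successes + 1) / (trials + 2)."""
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    return (successes + 1) / (trials + 2)


# 100 observed collisions; the wall stopped the head every single time.
p_solid = estimate_probability(100, 100)
print(f"Estimated P(wall stops head) = {p_solid:.3f}")
```

The estimate gets close to 1 but never reaches it, so a learner of this kind stays open to a surprising observation, which is the sort of violated expectation that looking-time experiments rely on.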
I agree with the conclusion that the mechanisms involved in explaining object perception need not be conceptual. This is due to the complexity and interactivity of the system, not to a lack of promiscuity (which might be seen as a lunge towards modularity). Carey and Xu wrongly assume things like action and physical knowledge must make use of conceptual representations.
What is Object Comprehension? “Everyone in the debate holds that this capacity falls somewhere between the poles of thought and perception.” What exactly does “between” mean? The poles of thought and perception: a metatheoretical construct? Hard to answer this question given the elusive, distributed computational nature of object comprehension. A processing model (perception -> thought)? Even harder: known feedback connections and parallel computations make it hard to attach these labels even to subfaculties.
What is Object Comprehension? Bernal’s approach: an “objective” list of criteria for conceptual thought. Great approach, but the criteria are vague and don’t match intuitions/data on concepthood. The list: independence of here and now; unit carving; amodal or cross-modal representation; with a dash of modularity.
Independence of here and now: Persistence, memory, abstractness… This invariance exists at all levels of cognition. During eye movements, we are momentarily blind; the system must form an invariant representation of the scene that persists across eye movements. Speech is perceived in a speaker-invariant way (in the form of words), ignoring pitch, timbre, rate, etc. We recognize common objects with reference to viewpoint-invariant mental concepts. Do we want to argue that these are conceptual, not perceptual, processes?
Unit Carving: “While the output of perceptual processes is continuous, thought carves the world into units.” The output of many perceptual processes is discrete units (not continuous ones): word recognition, object recognition. Intermediate steps of perceptual processes are also often discrete: phonemes, geons (à la Biederman); even object indexing is discrete.
Unit Carving: “While the output of perceptual processes is continuous, thought carves the world into units.” Many conceptual processes must work on continuous information: judgments of size (will the body fit in the trunk?); judgments of quantity (is there enough sugar in the pie filling?).
Amodal or Cross-modal Representation: Although many concepts certainly are amodal, many are not: loud, green, lukewarm, stinky. There are many cross-modal connections very early in perceptual processes: parts of V1 (early cortical visual processing) seem to respond to visual stimulation relative to eye position; V3 responds to the same stimulation relative to head position; the inferior colliculus computes location based on inter-aural time difference and visual information.
A dash of modularity: Assumption: conceptual processes are central; perceptual processes are modular. Not well supported: cross-modal semantic priming seems conceptual but shows automaticity; speech perception is perceptual but shows immediate interaction with pragmatic (reference-establishing) processing and the lexicon.
Does Modularity improve Efficiency? Are interactive processes slower? In parallel distributed systems, sometimes: if a single hypothesis (among many) is heavily favored by one input source, competition from others can slow down processing. [Diagram: inputs A, B, B, A, A, B, B, A; caption: “I can’t hear anything!”]
Does Modularity improve Efficiency? Hard to hear the correct one through all that noise. Modularity can help by encapsulating reliable input/output pairs. [Diagram: inputs A, B, B, A, A, B, B, A; caption: “Much Better…”]
Does Modularity improve Efficiency? Sometimes interactive processes are faster… If no one input is really definitive, having lots of correlated inputs can help. Perception seems more likely to have lots of weak, correlated cues than singly definitive cues. [Diagram: inputs A, B, B, B, B, B, B, A]
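The claim that many weak cues can jointly be decisive can be illustrated with a small evidence-combination sketch. It rests on simplifying assumptions that are not in the talk: only two hypotheses (A and B) with equal priors, and cues treated as conditionally independent given the hypothesis, so that their log-odds add. Real correlated cues would violate that independence to some degree and contribute less than this toy version suggests. The function name `posterior_b` is hypothetical.

```python
import math


def posterior_b(cue_probs):
    """Posterior for hypothesis B given cues, each expressed as the
    probability of B that the cue would support on its own.

    Assumes two hypotheses, equal priors, and conditionally independent
    cues, so log-odds simply add. An illustrative simplification only.
    """
    log_odds = sum(math.log(p / (1.0 - p)) for p in cue_probs)
    return 1.0 / (1.0 + math.exp(-log_odds))


one_weak_cue = posterior_b([0.6])        # a single cue barely favoring B
six_weak_cues = posterior_b([0.6] * 6)   # six such cues pooled together
print(round(one_weak_cue, 3), round(six_weak_cues, 3))
```

No single cue here is anywhere near definitive, yet pooling them yields a much more confident answer than any one alone, which is the situation the slide suggests is typical of perception.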
Conclusions (if you could call them that): Object comprehension is a patchwork of processes. Hard to nail down a single faculty. Perhaps the question of whether it is perceptual or conceptual is ill-formed? Need a more definitive set of criteria, preferably empirically testable ones.