Neural Processing in the Ventral Visual Pathway

Presentation on theme: "Neural Processing in the Ventral Visual Pathway"— Presentation transcript:

1 Neural Processing in the Ventral Visual Pathway
CS332 Visual Information Processing
Neural Processing in the Ventral Visual Pathway
- Today's topic is neural processing in the ventral visual pathway, the set of brain areas shown in color, sometimes referred to as the “what pathway”. Visual processing along this pathway contributes to functions such as object and face recognition.
- A second pathway, the dorsal visual pathway, projects to one of the upper lobes of the brain and is sometimes referred to as the “where pathway”. It focuses on inferring the location of objects in space relative to the viewer and on the analysis of motion, and it contributes to functions such as visually guided motor behavior. The two pathways communicate with each other.
- Earlier in the course we learned about the electrical signals generated by single neurons and how to record them with microelectrodes. Today we learn about fMRI, which lets us measure neural activity across the whole brain and is an invaluable tool for exploring the neural processing underlying tasks such as face recognition.
- We start with a quick review of processing in the early visual areas, from the retina to V1, then hop to the areas in yellow, known as IT cortex, and focus on regions within IT that appear to be dedicated to the analysis of faces.
- Eventually we turn to a model of neural processing in the ventral stream proposed by Tommy Poggio and colleagues, inspired by studies of brain mechanisms.

2 Receptive field
Hubel & Wiesel found 3 cell classes in V1: simple, complex, hypercomplex
- Light reflected from surfaces in the scene projects through the lens of the eye onto the retina, a thin sheet of cells at the back of the eye. Light is first registered by photoreceptors and then processed by multiple layers of neurons in the retina. The result is passed along the optic nerve, through a structure called the lateral geniculate nucleus, to the first visual area in the brain, the primary visual cortex or V1.
- The response of an individual neuron in this pathway is affected only by light stimulation within a limited area of the visual field, called its receptive field. Cells respond differently to light stimulation at different locations within the receptive field.
- We described the output of retinal ganglion cells as a weighted sum of the intensity values within the receptive field, which mimics a convolution of the image. The pattern of responses approximates convolution with an operator whose shape is roughly a Laplacian of Gaussian.
- The image of Tommy Poggio visualizes the result of retinal processing: it preserves information about changes of intensity, which capture important information about the structure of the image.
- The result of retinal processing eventually reaches the visual cortex at the back of the head, where we can probe the activity of cells in V1 in response to visual input. In early work, Hubel & Wiesel measured the behavior of neurons in the monkey and identified 3 classes of neurons with different response patterns: simple, complex, and hypercomplex.
Neural processing enhances intensity changes in the image projected onto the retina.
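The weighted-sum description of retinal ganglion cell output can be sketched in a few lines of plain Python. This is only an illustration, not the course's code: the kernel size, `sigma`, the zero-sum shift of the sampled kernel, and the synthetic step-edge image are all assumptions chosen to keep the example small.

```python
import math

def log_kernel(sigma, radius):
    """Sample a Laplacian of Gaussian on a (2*radius+1)^2 grid, shifted to sum to zero."""
    kernel = []
    for y in range(-radius, radius + 1):
        row = []
        for x in range(-radius, radius + 1):
            r2 = x * x + y * y
            # LoG up to a constant factor: negative center, positive surround.
            row.append((r2 - 2 * sigma**2) / sigma**4 * math.exp(-r2 / (2 * sigma**2)))
        kernel.append(row)
    # Assumption: subtract the mean so that uniform regions give exactly zero response.
    mean = sum(sum(row) for row in kernel) / (2 * radius + 1) ** 2
    return [[w - mean for w in row] for row in kernel]

def receptive_field_responses(image, kernel):
    """Weighted sum of intensities under the kernel at every fully covered position.

    The kernel is symmetric, so this sliding weighted sum equals a convolution.
    """
    kh, kw = len(kernel), len(kernel[0])
    responses = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            row.append(sum(kernel[a][b] * image[i + a][j + b]
                           for a in range(kh) for b in range(kw)))
        responses.append(row)
    return responses

# Synthetic image: dark on the left (0.0), light on the right (1.0).
image = [[0.0] * 6 + [1.0] * 6 for _ in range(9)]
responses = receptive_field_responses(image, log_kernel(sigma=1.0, radius=2))
middle_row = responses[2]
print([round(r, 3) for r in middle_row])
```

In this sketch the response is zero where the image is uniform and changes sign across the step edge, which is the sense in which the operator preserves information about intensity changes.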

3 V1 simple & complex cells
Simple cells respond best to edges or bars of a particular position, orientation, and sign of contrast (tuning curve).
- Simple cells respond best to edges or bars at a particular position within the receptive field, with a particular orientation and sign of contrast (e.g. a dark or light bar, or an edge that is light on the left and dark on the right). A cell responds most to a bar at its preferred orientation, and the response drops as the feature is turned away from that orientation. Measuring the response of a simple cell at many orientations around the preferred one gives a tuning curve that typically has a Gaussian shape.
- Given the image of a lion, the figures depict roughly where simple cells would be most active if they were selective to one of these four orientations.
- Complex cells also respond to edges and bars of light at a particular orientation, but typically have receptive fields about twice as large as those of simple cells. Reminder: the incoming image is processed at much higher resolution at the center of the eye, where photoreceptors and neurons in subsequent layers are much more densely packed and have smaller receptive fields. Receptive field size increases with distance from the center of the visual field, so the image is sampled more coarsely in the periphery. When we say complex cells have larger receptive fields, we mean within the same region of the visual field.
- Complex cells are also more tolerant to position. For example, a cell may give a strong response to a bright bar over a range of positions within its receptive field.
- In early work, Hubel & Wiesel proposed that simple cells provide the inputs to complex cells. A complex cell with a larger receptive field could pool the responses of many simple cells that each respond to a vertical bar at a different location within the larger area, allowing the complex cell to respond to the bar anywhere in its receptive field (keep this point in the back of your mind).
Complex cells have larger receptive fields and are more tolerant to position.
A complex cell may “pool” inputs from many simple cells within its receptive field.
Kreiman, 2013
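The tuning curve and the pooling idea can be illustrated with a toy model. Everything specific here is an assumption made for the sketch: the 20-degree tuning width, the discrete bar positions, the all-or-nothing position selectivity of the simple cell, and the use of a max as the pooling rule.

```python
import math

def simple_cell_response(bar_orientation, bar_position,
                         preferred_orientation, preferred_position,
                         tuning_width=20.0):
    """Gaussian orientation tuning for a bar at the cell's own position, else no response."""
    if bar_position != preferred_position:
        return 0.0
    # Orientation repeats every 180 degrees, so wrap the difference into [-90, 90).
    diff = (bar_orientation - preferred_orientation + 90.0) % 180.0 - 90.0
    return math.exp(-diff**2 / (2 * tuning_width**2))

def complex_cell_response(bar_orientation, bar_position,
                          preferred_orientation, pooled_positions):
    """Pool (here: take the max) over simple cells tuned to the same orientation
    at different positions inside the larger receptive field."""
    return max(simple_cell_response(bar_orientation, bar_position,
                                    preferred_orientation, p)
               for p in pooled_positions)

# Tuning curve of one simple cell that prefers a vertical (90 degree) bar at position 3.
tuning_curve = {angle: simple_cell_response(angle, 3, 90.0, 3)
                for angle in range(0, 180, 10)}

# A complex cell pooling simple cells at positions 0..5 responds wherever the bar is.
complex_responses = [complex_cell_response(90.0, pos, 90.0, range(6))
                     for pos in range(6)]
```

The simple cell responds only at position 3 and falls off symmetrically around 90 degrees, while the complex cell gives the same response at every pooled position.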

4 Ventral visual pathway
- We have touched on the response properties of neurons in V1. From there, visual information is processed by higher cortical areas that include V2, V4, and three areas of IT cortex. We first highlight some general changes in response properties as we progress to higher areas.
- The first point concerns the timing of neural responses. After a change in the visual image, it takes about 50 ms for neurons in V1 to respond, and each successive area adds roughly another 10 ms. Within this short time frame of about 100 ms, we can recognize objects in the scene, at least at the level of object category (face, car, dog, etc.).
- The numbers in the boxes are hard to see, but the total number of neurons in each area decreases as we move up the pathway, although there are still many millions of neurons in IT. The size of receptive fields increases. We will see in a moment that neurons become selective to more complex spatial patterns and more tolerant to changes in properties such as position, size, and pose.
- We already saw that complex cells in V1 are more tolerant to the position of an edge or bar in the receptive field. In higher areas we find neurons that respond to particular kinds of objects, such as faces, regardless of their size or pose relative to the viewer.
- The picture oversimplifies the connections between areas. In addition to the flow of information up the pathway, there are very extensive connections in the reverse direction, as well as connections with many other areas of the brain.
Progressing to higher areas along the ventral pathway: response latency increases; receptive field size increases; neurons become selective to more complex spatial patterns; neural responses become more invariant to changes in position, scale, pose, etc.
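The latency figures in the notes add up as a short piece of arithmetic. The sketch below assumes a strictly serial chain of the six areas mentioned (V1, V2, V4, and three IT areas); the IT labels are placeholders, since the notes do not name them.

```python
V1_LATENCY_MS = 50   # approximate V1 response latency, from the notes
STEP_MS = 10         # approximate extra latency per successive area, from the notes

# Placeholder labels: the notes only say "V2, V4, three areas of IT cortex".
areas = ["V1", "V2", "V4", "IT (first)", "IT (second)", "IT (third)"]

# Assumption: each area adds one fixed step on top of the previous one.
latencies = {area: V1_LATENCY_MS + STEP_MS * i for i, area in enumerate(areas)}

for area, ms in latencies.items():
    print(f"{area:12s} ~{ms} ms")
```

Under these assumptions the last IT area responds at about 100 ms, matching the time frame quoted for rapid object categorization.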

5 Face selective cells in IT cortex
- We hop over V2 and V4 to IT cortex, the areas in yellow in the earlier picture (this picture of the brain is flipped horizontally), and consider the responses of some neurons in the region shown in darker blue. These are sample results from an early study by Bob Desimone, Charles Gross, and others, who were among the first to explore this area systematically using single cell recording in monkeys.
- The two rows show the pattern of responses for two different cells, plotted as the firing rate of the cell over time (spikes per second, i.e. frequency). During the time period highlighted in blue, the monkey was viewing the image below the trace. How would you characterize the behavior of the cell in the top row? It responds to images that preserve the essence of a face (not jumbled, not a hand), including human and monkey faces, natural photos and cartoons. The bottom row? It responds to the profile view.
- The figure from the assigned review paper by Tsao & Livingstone collects data from several early studies (the dates are off; the studies are not really from the early 1900s). Each dot marks where face selective cells were found. The dots hint at the existence of clusters rather than a random distribution. Later studies explored this question more explicitly using a different technique, fMRI.
Locations of face selective cells in IT, from single cell recordings
Desimone et al., 1984
Tsao & Livingstone, 2008

6 MRI vs. fMRI
functional Magnetic Resonance Imaging (fMRI): low spatial resolution (~1 mm), many images (~ every 2 sec for 5 mins) → best spatial resolution available for measuring neural activity noninvasively in the whole human brain
- The website nancysbraintalks.mit.edu has lots of great videos from Nancy Kanwisher, including a number about fMRI: what it is, how to design an fMRI experiment, and what kinds of things we learn about the brain using this technique. Highly recommended if you want to learn more; here we give only a very quick introduction.
- You may be familiar with structural MRI, or may even have had an MRI done. It uses a strong magnetic field and radio waves to create detailed images of organs and tissues in the body at high spatial resolution.
- fMRI uses MRI technology to measure activity across the whole brain in an indirect way. Increased neural activity leads to increased blood flow in the region, which changes the oxygenation of hemoglobin and increases the MRI signal. This change happens on a slow time scale: it takes a few seconds for an increase in neural activity to affect blood flow.
- The brain is scanned in slices, in one of three different directions, with images captured about every 2 sec. The result is a 3D volume of measurements of brain activity at relatively low spatial resolution. Voxels can get down to ~1 mm on a side, but each voxel still spans hundreds of thousands of neurons.
increased neural activity → increased local blood flow → change in oxygenation of hemoglobin → increase in MRI signal → Blood Oxygenation Level Dependent (BOLD) signal is an indirect measure of neural activity
raw data: ~30,000 3D “voxels” (each voxel: hundreds of thousands of neurons)

7 fMRI experiment
[Figure: timeline of functional images acquired about every 2 s over ~5 min, alternating between condition 1 and condition 2, and the resulting activation map]
- The absolute MRI signal at a particular moment is not meaningful, because many factors can affect blood flow. A typical fMRI experiment therefore measures the difference in signal between two conditions experienced fairly close together in time.
- For example, in an experiment on object recognition, the subject might view images of faces in one condition and images of other objects in another condition, possibly with blank periods spent viewing a noise pattern. We then ask whether there are regions of the brain where the fMRI signal during face viewing is much greater than the signal measured when other objects are viewed.
- Results are typically displayed as an activation map, with color encoding the places that show large differences in activity between the two conditions. In this case the yellow areas have the largest difference. The map is shown superimposed on a structural MRI image of the brain, which has higher spatial resolution.
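The two-condition comparison can be sketched on synthetic data. This is a deliberately simplified illustration: real analyses model the slow blood-flow response and use proper statistics, whereas this sketch just subtracts per-voxel means. The voxel count, noise level, block length, effect size, and threshold are all made-up assumptions.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the synthetic data is reproducible

N_SCANS = 150          # ~5 minutes at one scan every ~2 s
N_VOXELS = 8           # tiny stand-in for the ~30,000 voxels of a real scan
FACE_VOXELS = {2, 5}   # hypothetical voxels that respond more to faces

# Block design: 10 scans of condition 1 (faces), then 10 of condition 2 (objects), ...
labels = [1 if (t // 10) % 2 == 0 else 2 for t in range(N_SCANS)]

# Synthetic signal: arbitrary baseline plus noise, plus a boost for face voxels
# during the face condition.
volumes = []
for t in range(N_SCANS):
    scan = []
    for v in range(N_VOXELS):
        signal = 100.0 + random.gauss(0.0, 0.5)
        if labels[t] == 1 and v in FACE_VOXELS:
            signal += 2.0
        scan.append(signal)
    volumes.append(scan)

def activation_map(volumes, labels, threshold):
    """Per-voxel mean signal in condition 1 minus condition 2, then thresholded."""
    diffs = []
    for v in range(len(volumes[0])):
        cond1 = mean(scan[v] for scan, lab in zip(volumes, labels) if lab == 1)
        cond2 = mean(scan[v] for scan, lab in zip(volumes, labels) if lab == 2)
        diffs.append(cond1 - cond2)
    return diffs, [d > threshold for d in diffs]

diffs, active = activation_map(volumes, labels, threshold=1.0)
print([round(d, 2) for d in diffs])
print(active)
```

Only the difference between conditions is used; the arbitrary baseline of 100.0 cancels out, which mirrors the point that the absolute signal is not meaningful on its own.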

8 Fusiform Face Area (FFA) in the human brain
- In an early fMRI study with human subjects, Nancy Kanwisher and her colleagues discovered an area of the brain that appears to be specialized for face perception, called the fusiform face area. It consists of multiple small regions on both sides of the brain, seen here superimposed on MRI slices in three different orientations, and it occurs in similar locations in the brains of different people.

9 Face patches in macaque IT cortex
- fMRI studies in monkeys reveal similar regions in IT cortex. This work is from an early study by Doris Tsao, Winrich Freiwald, and others. The 3D picture shows an inflated brain. Monkeys viewed images of faces and other objects (examples shown here), and the regions in yellow were much more active when the monkey was viewing faces than when it was viewing other objects. A set of face patches, or large clusters of neurons, emerges. The pattern of face patches is repeatable across monkeys and analogous to the regions found in humans.
- fMRI has coarse spatial and temporal resolution: a single voxel covers hundreds of thousands of neurons, and individual neurons within a patch may have different response patterns. How can we probe in more detail?
Tsao, Freiwald, Tootell, Livingstone, 2006

10 Targeting neurons in middle face patch using single cell recording
Tsao et al. 2006
- In this study, the researchers first used fMRI to locate the face patches, then moved the monkey out of the MR machine and recorded signals from single neurons in those regions, targeting the face areas spatially with electrodes. They recorded, for example, the response to individual images of faces versus other objects.
- The example shows the response of a single neuron to faces and to each of the other photos (96 photos altogether, grouped in blocks, with 16 different examples of faces, etc., rather than a single image per category). The signal strength is scaled in the graph so that the highest response is 1.0.
- The same responses can be visualized with colored bars that encode signal strength: the highest response is red, no response is white, and an inhibited response would appear in blue.
- Using this representation, the researchers visualized the responses of many cells in the middle face patch. Each row corresponds to a different cell, with all cell responses stacked on top of one another. The immediate impression is that many cells respond much more strongly to face images than to the others.
Average response of population
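The scaling and stacking steps described above can be sketched as follows. The firing rates are invented for illustration and are not data from the study; the six-image layout is likewise a toy stand-in for the 96 photos, and the sketch assumes every cell has a positive peak response.

```python
def normalize_to_peak(responses):
    """Scale one cell's responses so that its highest response is 1.0."""
    peak = max(responses)
    return [r / peak for r in responses]

# Hypothetical firing rates (spikes/s) of three cells for six images.
# The first three images are faces, the last three are other objects.
is_face = [True, True, True, False, False, False]
cells = [
    [40.0, 36.0, 32.0, 4.0, 8.0, 2.0],
    [20.0, 25.0, 15.0, 5.0, 0.0, 5.0],
    [10.0, 8.0, 9.0, 6.0, 7.0, 5.0],
]

# One row per cell, as in the stacked population plot.
normalized = [normalize_to_peak(cell) for cell in cells]

# Average response of the population to each image (column-wise mean).
population_average = [sum(column) / len(column) for column in zip(*normalized)]

face_mean = sum(a for a, f in zip(population_average, is_face) if f) / is_face.count(True)
other_mean = sum(a for a, f in zip(population_average, is_face) if not f) / is_face.count(False)
print(round(face_mean, 3), round(other_mean, 3))
```

Normalizing each cell separately keeps a cell with a high overall firing rate from dominating the population average.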

11 The face patch network
combined microstimulation & fMRI to measure connectivity of face patches
used single cell recording to probe viewpoint dependence of neural responses
[Figure labels: AL, AM]
- To summarize two other aspects of this and another study: one explored the connectivity between neurons in different face patches. The researchers stimulated cells in one face patch with electrical current and measured the subsequent activation of cells in other patches with fMRI. The resulting map of connectivity between areas looks something like this: the patches are highly interconnected.
- They also conducted more single cell recordings, using 25 different faces at 8 orientations (4 individuals are shown here, rotated in the horizontal and vertical directions), and determined which subset of images yields a significant response. The arrow shows the main progression from earlier to higher levels of the pathway. Most neurons in the earlier face patches show viewpoint dependence, conveyed by the high contrast picture of a profile face (different cells are selective for different poses of the face). Later patches contain cells with mirror symmetry (e.g. responding to a profile face pointing in either direction). In the highest, latest face patches, cells are selective for the particular identity of a face, independent of its orientation relative to the viewer.
- This suggests a network of areas that appears to be dedicated to the processing of faces. Within the network, responses become more invariant to the pose of the face while becoming more specialized for a particular identity.

12 Other observations…
intact faces yield larger neural responses than scrambled or inverted faces
composite face effect: greater response for aligned vs. misaligned faces
IT neurons: response to whole face = sum of responses to parts
some face areas show large increase in neural responses when natural face movements are added, e.g. facial expressions
[Figure: human fMRI studies, dorsal pathway and ventral pathway]
Bernstein & Yovel, 2015

13 … HMAX model of recognition
model of processing in ventral stream for rapid object categorization ( ms)
important property to capture: invariance to position, size, orientation
- S stages detect kinds of features, or patterns of multiple features. The first S stage detects edges at particular positions, sizes, and orientations (simple cells in V1).
- C stages achieve some invariance to position and size by pooling over simple cells at different positions and sizes: the C cell outputs the max of its S responses.
- In the diagram, solid lines are weighted sums (templates) and dashed lines are max pooling.
simple cells: detect edges at different positions, orientations, scales
complex cells: more invariant to feature position and size
early stages are “hard-wired”
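The alternation of a template-matching S stage and a max-pooling C stage can be sketched in one dimension. This is a toy illustration rather than the model itself, which works on 2D images with oriented filters at several scales; the two-tap edge template, the pool size, and the test signals are assumptions.

```python
def s_layer(signal, template):
    """S stage: weighted sum of the input under the template at every position."""
    n = len(template)
    return [sum(w * signal[i + k] for k, w in enumerate(template))
            for i in range(len(signal) - n + 1)]

def c_layer(s_responses, pool_size):
    """C stage: max over non-overlapping groups of neighbouring S units."""
    return [max(s_responses[i:i + pool_size])
            for i in range(0, len(s_responses), pool_size)]

# Hard-wired template for a dark-to-light edge (an assumption for this sketch).
edge_template = [-1, 1]

# The same edge at two nearby positions.
signal_a = [0, 0, 0, 1, 1, 1, 1, 1, 1]
signal_b = [0, 0, 0, 0, 1, 1, 1, 1, 1]

s_a = s_layer(signal_a, edge_template)
s_b = s_layer(signal_b, edge_template)
c_a = c_layer(s_a, pool_size=4)
c_b = c_layer(s_b, pool_size=4)

print(s_a, s_b)  # S responses differ when the edge moves
print(c_a, c_b)  # C responses stay the same within the pooling range
```

Shifting the edge by one position changes which S unit fires, but both positions fall in the same pooling group, so the C output is unchanged. That is the position tolerance that max pooling is meant to provide.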

14 HMAX model of recognition, cont’d
S2: composite feature cells, C2: complex composite cells
S2 units: combinations of C1 units at different orientations with weights that are learned
- S2 units detect combinations of edges of different orientations, using the edge representations from C1, which are approximately invariant to position and size.
C2 units: same selectivity as S2 units with more tolerance to position & size (pool S2 units of same selectivity but different positions and sizes)
- C2 units are selective for similar patterns but are again more invariant to position and size.
S2b, S3, S4: combine more complex features with weights that are learned
C2b, C3: pool inputs with MAX
- At higher levels, the wiring and weights between units at different layers are learned.

15 HMAX model of recognition, cont’d
learning of wiring and weights for top-level object classification by supervised learning
wiring and weights between C and S units at early levels are also learned, e.g. C1 → S2, S2b; C2 → S3
unsupervised learning of feature combinations that appear most often in natural images
- (1) Unsupervised learning of the wiring and weights between units from C to S layers: from lots of natural images, the model learns the combinations of features that appear most frequently (a dictionary of common image patterns).
- (2) Supervised learning of categories at the top.
good match to neural responses, V1 → IT
“neural tuning size” (number of C1 inputs to each S2 unit) accounts for holistic effects (composite-face, face-inversion, whole-part)
good performance on natural images

