11. System level organization and coupled networks
Fundamentals of Computational Neuroscience, T. P. Trappenberg, 2002. Lecture Notes on Brain and Computation Byoung-Tak Zhang Biointelligence Laboratory School of Computer Science and Engineering Graduate Programs in Cognitive Science, Brain Science and Bioinformatics Brain-Mind-Behavior Concentration Program Seoul National University This material is available online at
Outline
11.1 System level anatomy of the brain
11.2 Modular mapping networks
11.3 Coupled attractor networks
11.4 Working memory
11.5 Attentive vision
11.6 An interconnecting workspace hypothesis
11.1 System level anatomy of the brain
The brain is more than just a big neural network with completely interconnected neurons
Combine the basic networks: associative and competitive networks
Global architectures reflect the large-scale organization of the brain
Modular networks result from combining the basic networks
They display some structure within their architecture, as opposed to completely interconnected networks
11.1.1 Large-scale anatomical and functional organization in the brain
Fig Example of a map of connectivities between cortical areas involved in visual processing.
11.1.2 Advantages of modular organizations
Modular specialization is used in the brain
The functional significance of modular specialization in visual processing: the cortex uses local inhibition to sharpen various visual attributes (color, edges, or orientations)
Separate attentional amplification of separate features
Advantages for learning speed, generalization ability, representational capability, and task realization
Modular mapping networks
Modular attractor networks
11.2 Modular mapping networks 11.2.1 Mixture of experts
Fig An example of a type of modular mapping network called mixture of experts. Each expert, the gating network, and the integration network are usually mapping networks. The input layer of the integration network is composed of sigma-pi nodes, as the output of the gating network weights (modulates) the output of the expert networks to form the inputs of the integration network.
Divide-and-conquer Fig (A) Illustration of the absolute function f(x) = |x| that is difficult to approximate with a single mapping network. (B) A modular mapping network in the form of a mixture of experts that can represent the absolute function accurately.
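A minimal numpy sketch of this divide-and-conquer idea, assuming two hand-built linear experts (one per branch of |x|) and a sigmoid gating function that switches between them around x = 0; the function names and the gating steepness are illustrative, not from the text:

```python
import numpy as np

def expert_neg(x):
    return -x          # linear expert handling the x < 0 branch

def expert_pos(x):
    return x           # linear expert handling the x >= 0 branch

def gate(x):
    # soft gating: a steep sigmoid switches between experts around x = 0
    g = 1.0 / (1.0 + np.exp(-10.0 * x))
    return np.array([1.0 - g, g])

def mixture(x):
    # gating output weights the expert outputs, as in the mixture of experts
    g = gate(x)
    return g[0] * expert_neg(x) + g[1] * expert_pos(x)

for x in [-2.0, -0.5, 0.5, 2.0]:
    print(x, mixture(x))  # approximates |x|
```

Away from x = 0 the gate saturates and the mixture reproduces the absolute function, which a single smooth mapping network approximates only poorly near the kink.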
11.2.3 Training modular structures
Training such networks to solve specific tasks in a flexible manner
Training the experts alone has two components:
To assign each expert to a particular task
To train each expert on its designated task
Training the gating network raises the credit-assignment problem
In biological systems, a task-assignment phase and an expert-training phase are not separated
11.2.4 The ‘what-and-where’ task
The ventral visual pathway: ‘what’
The dorsal visual pathway: ‘where’
Performing object recognition and determining the location of objects in a modular network
Retina of 5 x 5 cells
Objects of 3 x 3 patterns
26 input channels: 25 for retina inputs, 1 for task specification
18 output nodes: 9 for object patterns, 9 for location
36 hidden nodes
Back-propagation learning
Fig. Example of the ‘what-and-where’ tasks. (A) 5 x 5 model retina with a 3 x 3 image of an object as an example.
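A sketch of the forward pass through a network with the stated dimensions (26 inputs, 36 hidden, 18 outputs); the random weights, sigmoid nonlinearity, and function names are illustrative assumptions, and the back-propagation training loop is omitted:

```python
import numpy as np

rng = np.random.default_rng(0)

# Layer sizes from the slide: 25 retina inputs + 1 task cue -> 36 hidden -> 18 outputs
N_IN, N_HID, N_OUT = 26, 36, 18
W1 = rng.normal(0, 0.1, (N_HID, N_IN))   # input-to-hidden weights
W2 = rng.normal(0, 0.1, (N_OUT, N_HID))  # hidden-to-output weights

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(retina, task_cue):
    """retina: 5x5 binary image; task_cue: 0 ('what') or 1 ('where')."""
    x = np.concatenate([retina.ravel(), [task_cue]])
    h = sigmoid(W1 @ x)
    y = sigmoid(W2 @ h)
    return y[:9], y[9:]   # first 9 nodes code object identity, last 9 its location

retina = np.zeros((5, 5))
retina[1:4, 1:4] = 1.0    # a 3x3 object placed on the model retina
what, where = forward(retina, task_cue=0.0)
```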
11.2.5 Temporal and spatial cross-talk
Conflicting training information
Temporal cross-talk: if we train the network first entirely on the ‘what’ task, it will quickly adapt to reasonable performance on that task; the representations in the hidden layer will then change during a subsequent learning period on the ‘where’ task, which is likely to conflict with the representations necessary for the ‘what’ task
Spatial cross-talk: training sets with conflicting training patterns can interfere within one training example due to the distributed nature of the representations
The division of the tasks into separate networks abolishes both problematic cross-talk conflicts
Task decomposition
Modular networks can learn task decomposition
Solution of Jacobs, Jordan, and Barto: use a gating network that increases the strength to the expert network that most improved the output of the system
The back-propagated error signal is modulated by the gating weights, so the module that contributed most to the answer adapts most to the new example, producing specialized experts
Solution of Jacobs and Jordan: assign a physical location to the nodes in a single mapping network and use a distance-dependent term in the objective (or error) function (11.1), which leads to a weight decay favoring short connections
Fig. (B) Connection weights between hidden and output nodes in a single mapping network. Positive weights are shown by solid squares, while open squares symbolize negative values. (C) Connection weights between hidden and output nodes in a single mapping network when trained with a bias toward short connections.
Product of experts
The output can be interpreted as the probability that an input vector has a certain feature if the summed outputs of the expert networks are normalized
Modular networks can easily be generalized to allow similar interpretations
View the mixture of experts as a collection of experts whose weighted opinion is averaged to determine the probability of the feature value; a product of experts instead multiplies the probabilities, as for independent events
Advantages of a product of experts relative to the weighted mean calculated by a mixture of experts:
A large probability assigned by one expert can be largely suppressed by low probabilities assigned by other experts
A large probability is only indicated if there is some agreement between the experts
This allows individual experts to assign unreasonably large probabilities to some events as long as other experts represent such events more accurately with low probabilities
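A toy comparison of the two combination rules; normalizing the product over a binary feature and the specific probabilities are illustrative assumptions, not from the text:

```python
import numpy as np

def mixture_of_experts(p, g):
    """Weighted mean of expert probabilities p under gating weights g (sum to 1)."""
    return float(np.dot(g, p))

def product_of_experts(p):
    """Renormalized product of expert probabilities, treating the experts as
    independent assessors of the same binary feature."""
    on = np.prod(p)          # all experts vote for the feature being present
    off = np.prod(1.0 - p)   # all experts vote for it being absent
    return float(on / (on + off))

# One expert is fairly confident (0.9) but another effectively vetoes it (0.01)
p = np.array([0.9, 0.01, 0.5])
g = np.ones(3) / 3.0
```

Here the mixture averages the opinions to about 0.47, while the product lets the dissenting expert suppress the confident one to well below 0.1, illustrating the veto property described above.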
11.3 Coupled attractor networks
Fig Coupled (or subdivided) recurrent neural networks. The nodes in this example are divided into two groups (the nodes of each group are indicated with different shadings). There are connections within the nodes of each group and between nodes of different groups.
11.3.1 Imprinted and composite patterns
One single point attractor network versus two smaller point attractor networks
Imprint all objects into this system
Two independent feature vectors representing m possible feature values for each feature of an object build m2 possible objects
With the Hebbian rule the storage capacity is αc ≈ 0.138 (see Chapter 8), so a network with 1000 nodes can store around 138 patterns
A network with two such independent subnetworks, where N is the total number of nodes, could store
m2 = (αc N/2)2 (11.2)
composite objects, while the number of patterns that can be stored in a single network is only
αc N (11.3)
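The capacity arithmetic can be checked directly, assuming the Hopfield-type value αc ≈ 0.138 from Chapter 8:

```python
ALPHA_C = 0.138   # approximate storage capacity per node of a Hebbian attractor net
N = 1000

single = ALPHA_C * N              # patterns storable in one network of N nodes
per_module = ALPHA_C * (N // 2)   # patterns per independent half-sized module
composite = per_module ** 2       # distinct two-feature objects from combinations

print(single, per_module, composite)   # roughly 138, 69, and 4761
```

Two half-sized modules thus represent far more composite objects (about 4761) than the single network stores patterns (about 138), at the cost of the modules being independent.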
11.3.2 Signal-to-noise analysis
The behavior of coupled attractor networks can be studied with a signal-to-noise analysis
Consider an overall network of N nodes divided into modules, each having the same number of nodes N′
The weights are trained with the Hebbian rule (11.4)
Modulate the weight values between the modules with a factor g
Define a new weight matrix with components w̃ij = gij wij (11.5, 11.6), where the modulation matrix g has entries gij = 1 for nodes within the same module and gij = g between modules (11.7)
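A sketch of the modulated weight matrix as described above, assuming a standard Hebbian outer-product rule with 1/N normalization; the helper names and pattern statistics are hypothetical:

```python
import numpy as np

def modulation_matrix(n_modules, n_prime, g):
    """Entries are 1 within a module and g between modules, as in the text."""
    N = n_modules * n_prime
    G = np.full((N, N), g)
    for k in range(n_modules):
        s = slice(k * n_prime, (k + 1) * n_prime)
        G[s, s] = 1.0   # intra-module block
    return G

def hebbian_weights(patterns):
    """Hebbian rule w_ij = (1/N) sum_mu xi_i^mu xi_j^mu (normalization assumed)."""
    N = patterns.shape[1]
    return patterns.T @ patterns / N

rng = np.random.default_rng(1)
xi = rng.choice([-1.0, 1.0], size=(5, 20))   # 5 random binary patterns, N = 20
W = modulation_matrix(2, 10, g=0.3) * hebbian_weights(xi)   # elementwise product
```

Setting g = 1 recovers the fully connected network, while g = 0 decouples the modules completely; the analysis in the text explores the regime in between.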
Imprinted pattern
To evaluate the stability of an imprinted pattern, separate the signal term from the noise terms (11.8, 11.9)
Simplifying the formulas yields the capacity bound (11.10, 11.11)
Fig. Coupled attractor neural networks: results from signal-to-noise analysis. (A) Dependence of the load capacity of the imprinted patterns on the relative intermodule coupling strength g for different numbers of modules m.
Composite pattern (1)
All kinds of combinations of patterns in different modules
All the subpatterns in the modules are chosen to be the first training pattern except for the module to which the node under consideration belongs (11.12, 11.13)
Fig. (B) Bounds on the relative intermodule coupling strength g. For g values greater than the upper curve the imprinted patterns are stable. For g less than the lower curve the composite patterns are stable. In the narrow band in between we can adjust the system to have several composite and some imprinted patterns stable. This band gets narrower as the number of modules m increases and vanishes for networks with many modules. (11.14, 11.15)
11.3.4 Composite pattern (2)
The reverse case: evaluate the stability of a composite pattern (11.16–11.19)
11.4 Working memory
A specific hypothesis on the implementation of working memory in the brain
Working memory is a construct that is hard to define precisely
A workspace that provides the necessary information to solve complex tasks:
Language comprehension
Mental arithmetic
Reasoning for problem-solving and decision-making
A specific form of short-term memory (STM)
11.4.1 Distributed model of working memory
A conceptual model of working memory based on a modular structure
How the interaction of these modules enables the brain to utilize the different kinds of information
Fig. A modular system of short-, intermediate-, and long-term memory, which are associated with functionalities of the prefrontal cortex (PFC), the hippocampus and related areas (HCMP), and the perceptual and motor cortex (PMC), respectively.
11.4.2 Limited capacity of working memory
‘31, 27, 4, 18’
‘62, 97, 12, 73, 27, 54, 8’
The limited capacity of working memory: the ‘magical number 7±2’
The very limited capacity of working memory is puzzling
It correlates with classical measurements of IQ factors
A larger storage capacity should make us fitter to survive
The search for the reasons behind the limited capacity of working memory is prominent in cognitive neuroscience
11.4.3 Spurious synchronization
The reasons behind the limited capacity of working memory
Fig. (A) The percentage of correct responses (recall ability) of human subjects in a sequential comparison task of two images with different numbers of objects Nobj (solid squares). The dotted lines illustrate examples of the functional form suggested by the synchronization hypothesis. The dashed line corresponds to the results with PSS(2) = 0.04, where PSS is the probability of spurious synchronization. (B) Illustration of the spurious synchronization hypothesis. The features of each object are represented by different spike trains, so that the number of synchronous spikes within a certain resolution increases with an increasing number of objects.
11.4.4 Quantification of the spurious synchronization hypothesis
Each additional feature of one object would in any case be synchronized with the other features of that object, so only the number of objects with different spike trains matters
The number of pairs among Nobj spike trains:
Npair = Nobj(Nobj − 1)/2 (11.20)
The probability of spurious synchronization between at least two spike trains in a set of Nobj spike trains (patterns):
PSS(Nobj) = 1 − (1 − PSS(2))^Npair (11.21)
A functional expectation of the percentage of correct recall:
%correct = 100 [1 − PSS(Nobj)] (11.22)
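Assuming independent pairwise synchronization with PSS(2) = 0.04 as shown in the figure, the predicted recall curve can be computed as:

```python
def p_spurious(n_obj, p2=0.04):
    """Probability that at least one of the n_obj*(n_obj-1)/2 spike-train pairs
    synchronizes spuriously, assuming independent pairs; p2 = P_SS(2) is the
    value quoted in the figure caption."""
    n_pairs = n_obj * (n_obj - 1) // 2
    return 1.0 - (1.0 - p2) ** n_pairs

def percent_correct(n_obj, p2=0.04):
    # recall is assumed correct whenever no spurious synchronization occurs
    return 100.0 * (1.0 - p_spurious(n_obj, p2))

for n in range(1, 8):
    print(n, round(percent_correct(n), 1))
```

Because the number of pairs grows quadratically with the number of objects, recall falls off steeply, consistent with a capacity limit of only a handful of items.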
11.4.5 Other hypotheses on the capacity limit
Lisman and Idiart: the limit might be solely due to the limited ability of reverberating neural activity to support short-term memory; the representations of different objects are kept in different high-frequency subcycles of the low-frequency oscillations found in the brain
Nelson Cowan: based on the limits of an attentional system, thought to be a necessary ingredient in working memory models
11.5 Attentive vision 11.5.1 Object recognition versus visual search
Fig Illustration of a visual search and an object recognition task. Each task demands a different strategy in exploring the visual scene.
The overall model
Fig. Outline of a model of visual processing in primates to simulate visual search and object recognition. The main parts of the model are inspired by structural features of the cortical areas thought to be central in these processes. These include early visual areas (labeled ‘V1-V4’) that represent the content of the visual field topographically with basic features, the inferior-temporal cortex (labeled ‘IT’) that is known to be central for object recognition, and the posterior parietal cortex (labeled ‘PP’) that is strongly associated with spatial attention.
11.5.3 Object representation in early visual areas
The first part of the model, labeled ‘V1-V4’, represents early visual areas: the striate cortex (V1) and adjacent visual areas
The primary visual area V1 is the main cortical area receiving visual input from the LGN of the thalamus, the major target of the optic nerves from the eyes
Neuronal responses to visual input from the eyes can be described by Gabor functions
The principal role: the decomposition of the visual field into features such as orientation, color, and motion
From the modeling point of view, the feature representation in this part of the model is topographic; features are represented in modules that correspond to the location of the object in the visual field
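A minimal Gabor-patch implementation as a model of a V1 simple-cell receptive field; the parameter values and function name are illustrative defaults, not taken from the text:

```python
import numpy as np

def gabor(size=15, wavelength=5.0, theta=0.0, sigma=3.0, phase=0.0):
    """2-D Gabor patch: a sinusoidal grating under a Gaussian envelope,
    a common model of V1 simple-cell receptive fields."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    x_t = x * np.cos(theta) + y * np.sin(theta)   # rotate into preferred orientation
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    carrier = np.cos(2.0 * np.pi * x_t / wavelength + phase)
    return envelope * carrier

# Response of a model simple cell: dot product of the filter with an image patch
kernel = gabor(theta=np.pi / 4)   # a filter tuned to 45-degree orientation
```

Varying theta, wavelength, and phase yields a bank of filters that decomposes the visual field into local orientation and spatial-frequency features, as described above.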
11.5.4 Translation-invariant object recognition with ANN
The representation of the visual field in ‘V1-V4’ in this model feeds into the part labeled ‘IT’, the inferior-temporal cortex, which is involved in object recognition
The connections between ‘V1-V4’ and ‘IT’ are trained with Hebbian learning
Translation-invariant object recognition: the point attractor network can ‘recognize’ trained objects in test trials at all locations in the visual field
Cortical magnification
11.5.5 Size of the receptive field
The size of the receptive field of inferior-temporal neurons depends on the content of the visual field and the specifics of the task
Fig. (A) Example of the average firing rate from recordings of a neuron in the inferior-temporal cortex of a monkey in response to an effective stimulus located at various degrees away from the direction of gaze. (B) Simulation results of a model with the essential components of the model shown in Fig. The correlation is thereby a measure of overlap between the activity of ‘IT’ nodes and the representation of the target object that was used during training.
11.5.6 Attentional bias in visual search and object recognition
Simulated by supplying an object-bias input to the attractor network in ‘IT’ (top-down information)
The additional input of an object bias to ‘IT’ can speed up the recognition process in ‘IT’
The object bias also supports the recognition of the input from ‘V1-V4’ that corresponds to the target object
Parallel conclusions hold for an object recognition task in which top-down input to a specific location in ‘PP’ is given: it enhances the neural activity in ‘V1-V4’ for the features of the object located at the corresponding location
11.5.7 Parallel versus serial search
Fig. Numerical experiments in which the model simulated a visual search task for a target object (the letter ‘E’) in a visual scene with distractors. (A) In one experiment the distractors consist of the letter ‘X’, which is visually very different from the target letter. The activity of a ‘PP’ node that corresponds to the target location increases in these experiments independently of the number of distractors, implying parallel search. (B) The second experiment was done with distractors (the letter ‘F’) that were visually similar to the target letter. The reaction times, as measured from ‘PP’ nodes, depend linearly on the number of objects, a feature that is also characteristic of serial search. Both modes are, however, present in the same ‘parallel architecture’.
11.6 An interconnecting workspace hypothesis
The global workspace
Fig. Illustration of the workspace hypothesis. Two computational spaces can be distinguished: the subnetworks with localized and specific computational specialization, and an interconnecting network that is the platform of the global workspace.
11.6.2 Demonstration of the global workspace in the Stroop task
Fig. (A) In a Stroop task a word for a color, written in a color that can differ from the meaning of the word, is shown to a subject who is asked to perform either a word-naming or a color-naming task. (B) Global workspace model that is able to reproduce several experimental findings in the Stroop task.
Conclusion
Fundamental examples of modular networks as complex information-processing systems
Expert and gating networks
Storage of imprinted and composite patterns in coupled attractor networks with different numbers of modules
Working memory with modular networks
Attentive vision: object recognition and visual search
Workspace hypothesis