Sparse Neural Systems: The Ersatz Brain gets Thinner James A. Anderson Department of Cognitive and Linguistic Sciences Brown University Providence, Rhode Island



Speculation Alert! Rampant speculation follows.

Biological Models The human brain is composed of on the order of 10^10 neurons, connected together by at least 10^14 neural connections. (Probably underestimates.) Biological neurons and their connections are extremely complex electrochemical structures. The more realistic the neuron approximation, the smaller the network that can be modeled. There is good evidence that for cerebral cortex a bigger brain is a better brain. Projects that model realistic neurons are of scientific interest, but they are not large enough to model or simulate interesting cognition.

Neural Networks. The most successful brain inspired models are neural networks. They are built from simple approximations of biological neurons: nonlinear integration of many weighted inputs. Throw out all the other biological detail.

Neural Network Systems Units with these approximations can build systems that can be made large, can be analyzed, can be simulated, and can display complex cognitive behavior. Neural networks have been used to model important aspects of human cognition.

Most neural nets assume full connectivity between layers. A fully connected neural net uses lots of connections! A Fully Connected Network

Sparse Connectivity The brain is sparsely connected. (Unlike most neural nets.) A neuron in cortex may have on the order of 100,000 synapses. There are more than 10^10 neurons in the brain. Fractional connectivity is therefore very low: 0.001%. Implications: Connections are expensive biologically since they take up space, use energy, and are hard to wire up correctly. Therefore, connections are valuable. The pattern of connection is under tight control. Short local connections are cheaper than long ones. Our approximation makes extensive use of local connections for computation.
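The fractional-connectivity figure follows from simple arithmetic on the order-of-magnitude numbers above (a quick sketch; as noted, the counts themselves are probably underestimates):

```python
# Back-of-the-envelope check of cortical fractional connectivity.
synapses_per_neuron = 1e5  # ~100,000 synapses per cortical neuron
neurons_in_brain = 1e10    # more than 10^10 neurons

# Fraction of all other neurons that any one neuron could contact:
fraction = synapses_per_neuron / neurons_in_brain
print(f"fractional connectivity: {fraction:.3%}")  # 0.001%
```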

Few active units represent an event. “In recent years a combination of experimental, computational, and theoretical studies have pointed to the existence of a common underlying principle involved in sensory information processing, namely that information is represented by a relatively small number of simultaneously active neurons out of a large population, commonly referred to as ‘sparse coding.’” Bruno Olshausen and David Field (2004, p. 481). Sparse Coding

There are numerous advantages to sparse coding. Sparse coding provides increased storage capacity in associative memories and is easy to work with computationally; we will make use of these properties. Sparse coding also “makes structure in natural signals explicit” and is energy efficient. Best of all: it seems to exist! Higher levels (further from sensory inputs) show sparser coding than lower levels. Advantages of Sparse Coding
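The storage-capacity advantage can be illustrated indirectly with a small sketch (pattern sizes and sparseness level are arbitrary choices, not from the original): interference between patterns stored by outer-product learning grows with their overlap, and random sparse patterns overlap far less, on average, than dense ones.

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials = 100, 200

def sparse_pattern(k=3):
    """A sparsely coded pattern: k active units out of n, unit length."""
    v = np.zeros(n)
    v[rng.choice(n, size=k, replace=False)] = 1.0
    return v / np.linalg.norm(v)

def dense_pattern():
    """A dense random pattern, unit length."""
    v = rng.standard_normal(n)
    return v / np.linalg.norm(v)

# Crosstalk between stored associations grows with pattern overlap;
# sparse patterns are, on average, much closer to orthogonal.
sparse_overlap = np.mean([abs(sparse_pattern() @ sparse_pattern())
                          for _ in range(trials)])
dense_overlap = np.mean([abs(dense_pattern() @ dense_pattern())
                         for _ in range(trials)])
```

With these numbers the mean overlap of sparse pairs is a few hundredths, several times smaller than for dense pairs, which is why more sparse associations fit in the same memory before interference becomes a problem.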

See if we can make a learning system that starts from the assumption of both sparse connectivity and sparse coding. If we use simple neural net units it doesn’t work so well. But if we use our Network of Networks approximation, it works better and makes some interesting predictions. Sparse Connectivity + Sparse Coding

The simplest sparse system has a single active unit connecting to a single active unit. If the potential connection does exist, simple outer-product Hebb learning can learn it easily. Not very interesting. The Simplest Connection
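The simplest case above can be written out in a few lines (vector sizes, unit positions, and the learning rate are illustrative): outer-product Hebb learning between one active input unit and one active output unit modifies exactly one weight, and presenting the input retrieves the output.

```python
import numpy as np

n = 8
a = np.zeros(n); a[2] = 1.0   # one active input unit
b = np.zeros(n); b[5] = 1.0   # one active output unit

eta = 0.5
W = eta * np.outer(b, a)      # Hebb outer product: only W[5, 2] is nonzero

out = W @ a                   # recall: presenting a retrieves (scaled) b
```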

A useful notion in sparse systems is the idea of a path. A path connects a sparsely coded input unit with a sparsely coded output unit. Paths have strengths just as connections do. Strengths are based on the entire path, from input to output, which may involve intermediate connections. It is easy for Hebb synaptic learning to learn paths. Paths

One of many problems. Suppose there is a common portion of a path for two single active unit associations, a with d (a>b>c>d) and e with f (e>b>c>f). We cannot weaken or strengthen the common part of the path (b>c) because it is used for multiple associations. Common Parts of a Path

Some speculations: If independent paths are desirable, an initial construction bias would be to make available as many potential paths as possible. In a fully connected system, adding more units than are contained in the input and output layers would be redundant: they would add no additional processing power. Obviously not so in sparse systems! Fact: there is a huge expansion in the number of units going from retina to thalamus to cortex. In V1, a million input fibers drive 200 million V1 neurons. Make Many, Many Paths!

Network of Networks Approximation Single units do not work so well in sparse systems. Let us use our Network of Networks approximation and see if we can do better. Network of Networks: the basic computing units are not neurons, but small (~10^4 neurons) attractor networks. Basic Network of Networks architecture: a two-dimensional array of modules, locally connected to neighbors.

Received wisdom has it that neurons are the basic computational units of the brain. The Ersatz Brain Project is based on a different assumption. The Network of Networks model was developed in collaboration with Jeff Sutton (Harvard Medical School, now NSBRI). Cerebral cortex contains intermediate level structure, between neurons and an entire cortical region. Examples of intermediate structure are cortical columns of various sizes (mini-, plain, and hyper) Intermediate level brain structures are hard to study experimentally because they require recording from many cells simultaneously. The Ersatz Brain Approximation: The Network of Networks.

Cortical Columns: Minicolumns “The basic unit of cortical operation is the minicolumn … It contains of the order of 80–100 neurons except in the primate striate cortex, where the number is more than doubled. The minicolumn measures of the order of 40–50 µm in transverse diameter, separated from adjacent minicolumns by vertical, cell-sparse zones … The minicolumn is produced by the iterative division of a small number of progenitor cells in the neuroepithelium.” (Mountcastle, p. 2) VB Mountcastle (2003). Introduction [to a special issue of Cerebral Cortex on columns]. Cerebral Cortex, 13, 2-4. Figure: Nissl stain of cortex in planum temporale.

Columns: Functional Groupings of minicolumns seem to form the physiologically observed functional columns. The best known example is orientation columns in V1. They are significantly bigger than minicolumns, typically around 0.3–0.5 mm. Mountcastle’s summation: “Cortical columns are formed by the binding together of many minicolumns by common input and short range horizontal connections. … The number of minicolumns per column varies … between 50 and 80. Long range intracortical projections link columns with similar functional properties.” (p. 3) Cells in a column ~ (80)(100) = 8000

Interactions between Modules Modules [columns?] look a little like neural net units. But interactions between modules are vector, not scalar! We gain greater path selectivity this way. Interactions between modules are described by state interaction matrices instead of simple scalar weights.

Columnar identity is maintained in both forward and backward projections “The anatomical column acts as a functionally tuned unit and point of information collation from laterally offset regions and feedback pathways.” (p. 12) “… feedback projections from extra-striate cortex target the clusters of neurons that provide feedforward projections to the same extra-striate site. ….” (p. 22). Lund, Angelucci and Bressloff (2003). Cerebral Cortex, 12, Columns and Their Connections

Return to the simplest situation for layers: Modules a and b can display two orthogonal patterns, A and C on a and B and D on b. The same pathways can learn to associate A with B and C with D. Path selectivity can overcome the limitations of scalar systems. Paths are both upward and downward. Sparse Network of Networks
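This path selectivity can be shown with a minimal numerical sketch (the particular pattern vectors are illustrative): a single state interaction matrix on the a-to-b pathway stores both associations, and the orthogonality of A and C keeps them from interfering.

```python
import numpy as np

# Two orthogonal patterns on module a, two patterns on module b.
A = np.array([1.0,  1.0, -1.0, -1.0])
C = np.array([1.0, -1.0,  1.0, -1.0])   # orthogonal to A
B = np.array([1.0, -1.0, -1.0,  1.0])
D = np.array([1.0,  1.0,  1.0,  1.0])

# One interaction matrix on the single a -> b pathway stores both pairs.
M = np.outer(B, A) / (A @ A) + np.outer(D, C) / (C @ C)

# Presenting A retrieves B; presenting C retrieves D; no crosstalk,
# which a single scalar weight between a and b could never do.
```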

Consider the common path situation again. We want to associate patterns on two paths, a-b-c-d and e-b-c-f, with link b-c in common. Parts of the path are physically common but they can be functionally separated if they use different patterns. Pattern information propagating forwards and backwards can sharpen and strengthen specific paths without interfering with the strengths of other paths. Common Paths Revisited

Just stringing together simple associators works. Each coupling matrix learns the outer product of the receiving module's pattern with the sending module's pattern. For module b: change in coupling from a, Δ(S_ab) = η b a^T; change in coupling from c, Δ(T_cb) = η b c^T. For module c: change in coupling from d, Δ(U_dc) = η c d^T; change in coupling from b, Δ(T_bc) = η c b^T. Likewise for module d: Δ(U_cd) = η d c^T. If pattern a is presented at layer 1, then: pattern on d = (U_cd)(T_bc)(S_ab) a = η^3 (d c^T)(c b^T)(b a^T) a = (constant) d. Associative Learning along a Path
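The path calculation can be checked numerically (the pattern dimension and η = 1 are illustrative choices): Hebb outer-product learning on each forward link of the path a > b > c > d, then presenting a at the input reproduces d, scaled by a constant.

```python
import numpy as np

rng = np.random.default_rng(0)

def unit(n):
    """Random unit-length pattern vector."""
    v = rng.standard_normal(n)
    return v / np.linalg.norm(v)

a, b, c, d = unit(16), unit(16), unit(16), unit(16)
eta = 1.0

# Outer-product Hebb learning on each link of the path a -> b -> c -> d.
S = eta * np.outer(b, a)   # a to b
T = eta * np.outer(c, b)   # b to c
U = eta * np.outer(d, c)   # c to d

# U T S a = eta^3 (d c^T)(c b^T)(b a^T) a = (constant) d;
# with unit patterns and eta = 1 the constant is exactly 1.
out = U @ T @ S @ a
```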

Because information propagates backward and forward, closed loops are possible and likely. This has been tried before: Hebb cell assemblies were self-exciting neural loops that corresponded to cognitive entities, for example, concepts. Hebb's cell assemblies are hard to make work because they use scalar interconnected units. But module assemblies can become a powerful feature of the sparse approach, because we have more selective connections. See if we can integrate relatively dense local connections with relatively sparse projections to and from other layers to form module assemblies. Module Assemblies

Biological Evidence: Columnar Organization in IT Tanaka (2003) suggests a columnar organization of different response classes in primate inferotemporal cortex. There seems to be some internal structure in these regions: for example, spatial representation of orientation of the image in the column.

IT Response Clusters: Imaging Tanaka (2003) used intrinsic optical imaging of cortex. With a video camera trained on exposed cortex, cell activity can be picked up, at a resolution at least a factor of ten higher than fMRI. The size of a response region is around the size of functional columns seen elsewhere: a few hundred microns.

Columns: Inferotemporal Cortex Responses of a region of IT to complex images involve discrete columns. The response to a picture of a fire extinguisher shows how regions of activity are determined. Boundaries are where the activity falls by a half. Note: some spots are roughly equally spaced.

Active IT Regions for a Complex Stimulus Note the large number of roughly equally distant spots (2 mm) for a familiar complex image.

Intralayer connections are sufficiently dense so that active modules a little distance apart can become associatively linked. Recurrent collaterals of cortical pyramidal cells form relatively dense projections around a pyramidal cell. The extent of lateral spread of recurrent collaterals in cortex seems to be over a circle of roughly 3 mm diameter. If we assume that a column is roughly a third of a mm across, then: there are roughly 10 columns in a square mm; a 3 mm diameter circle has an area of roughly 10 square mm; so a column projects locally to about 100 other columns. Intralayer Connections
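The local-projection estimate is simple arithmetic; computed without the rounding, the product comes out near 64, i.e. on the order of 100 columns as the text says:

```python
import math

# Order-of-magnitude check of the local-projection estimate.
columns_per_mm = 3                     # a column is roughly 1/3 mm across
columns_per_mm2 = columns_per_mm ** 2  # ~9, i.e. roughly 10 per square mm

diameter_mm = 3.0                      # lateral spread of recurrent collaterals
area_mm2 = math.pi * (diameter_mm / 2) ** 2  # ~7, i.e. roughly 10 square mm

local_targets = columns_per_mm2 * area_mm2   # ~64: on the order of 100 columns
```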

If the modules are simultaneously active the pairwise associations forming the loop abcda are learned. The path closes on itself. Consider a. After traversing the linked path a>b>c>d>a, the pattern arriving at a around the loop is a constant times the pattern on a. If the constant is positive there is the potential for positive feedback if the total loop gain is greater than one. Loops

Loops can be kept separate even with common modules. If the b pattern is different in the two loops, there is no problem: the selectivity of links will keep activities separate, and activity from one loop will not spread into the other (unlike Hebb cell assemblies). Loops with Common Modules If b is identical in the two loops, b is ambiguous. There is no a priori reason to activate Loop 1, Loop 2, or both. Selective loop activation is still possible, though it requires additional assumptions to accomplish.

More complex connection patterns are possible. Richer interconnection patterns might have all connections learned. Ambiguous module b will receive input from d as well as a and c. A larger context would allow better loop disambiguation by increasing the coupling strength of modules. Richly Connected Loops

Putting It All Together: Sparse interlayer connections and dense intralayer connections work together. Once a coupled module assembly is formed, it can be linked to by other layers. The system now becomes a dynamic, adaptive computational architecture that is both workable and interesting. Working Together

Two Parts … Suppose we have two such assemblies that co-occur frequently. Parts of an object say …

As learning continues: Groups of module assemblies bind together through Hebb associative learning. The small assemblies can act as the “sub-symbolic” substrate of cognition, and the larger assemblies as symbols and concepts. Note the many new interconnections. Make a Whole!

Conclusion (1) The binding process looks like compositionality. The virtues of compositionality are well known. It is a powerful and flexible way to build cognitive information processing systems. Complex mental and cognitive objects can be built from previously constructed, statistically well-designed pieces.

Conclusion (2) We are suggesting here a possible model for the dynamics and learning in a compositional-like system. It is built from constraints derived from connectivity, learning, and dynamics, not as a way to do optimal information processing. Perhaps this property of cognitive systems is more like a splendid bug fix than a well-chosen computational strategy. Sparseness is an idea worth pursuing. It may be a way to organize and teach a cognitive computer.