Associative-Memory Networks

Input: Pattern (often noisy/corrupted)
Output: Corresponding pattern (complete / relatively noise-free)

Process
1. Load input pattern onto core group of highly-interconnected neurons.
2. Run core neurons until they reach a steady state.
3. Read output off of the states of the core neurons.

[Figure: example of a noisy input pattern and the corresponding cleaned-up output pattern.]

Associative Network Types

1. Auto-associative: X = Y
   * Recognize noisy versions of a pattern
2. Hetero-associative Bidirectional: X <> Y
   * Iterative correction of input and output
   * BAM = Bidirectional Associative Memory

Associative Network Types (2)

3. Hetero-associative Input Correcting: X <> Y
   * Input clique is auto-associative => repairs input patterns
4. Hetero-associative Output Correcting: X <> Y
   * Output clique is auto-associative => repairs output patterns

Hebb's Rule: Connection Weights ~ Correlations

"When one cell repeatedly assists in firing another, the axon of the first cell develops synaptic knobs (or enlarges them if they already exist) in contact with the soma of the second cell." (Hebb, 1949)

In an associative neural net, if we compare two pattern components (e.g. pixels) within many patterns and find that they are frequently in:
a) the same state, then the arc weight between their NN nodes should be positive
b) different states, then the arc weight between their NN nodes should be negative

Matrix Memory: The weights must store the average correlations between all pattern components across all patterns. A net presented with a partial pattern can then use the correlations to recreate the entire pattern.

Correlated Field Components

Each component is a small portion of the pattern field (e.g. a pixel). In the associative neural network, each node represents one field component. For every pair of components, their values are compared in each of several patterns. Set the weight on the arc between the NN nodes for the two components ~ their average correlation.

[Figure: components a and b compared across several patterns; their average correlation becomes the weight w_ab on the arc between nodes a and b.]

Quantifying Hebb's Rule

Compare two nodes to calculate a weight change that reflects the state correlation:

  Auto-association:    Δw_{i,j} ~ x_i · x_j
  Hetero-association:  Δw_{i,o} ~ x_i · y_o

* When the two components are the same (different), increase (decrease) the weight.

Hebbian Principle: If all the input patterns are known prior to retrieval time, then init the weights as:

  Auto-association:    w_{i,j} = (1/P) Σ_k x_{k,i} x_{k,j}
  Hetero-association:  w_{i,o} = (1/P) Σ_k x_{k,i} y_{k,o}

where i = input component, o = output component, and k ranges over the P patterns. Ideally, the weights will record the average correlations across all patterns: Weights = Average Correlations.
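A minimal NumPy sketch of this batch initialization, assuming bipolar (+1/-1) patterns stored as rows; the function name and the zeroed diagonal in the auto-associative case are illustrative choices, not from the slides.

import numpy as np

def hebbian_weights(inputs, outputs=None):
    # inputs:  (P, n) array of bipolar (+1/-1) input patterns, one per row.
    # outputs: (P, m) array of output patterns for hetero-association;
    #          None means auto-association (outputs = inputs).
    X = np.asarray(inputs, dtype=float)
    Y = X if outputs is None else np.asarray(outputs, dtype=float)
    P = X.shape[0]
    W = X.T @ Y / P                  # w[i, j] = average over patterns of x_k,i * y_k,j
    if outputs is None:
        np.fill_diagonal(W, 0.0)     # drop self-connections in the auto-associative case
    return W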

Matrix Representation

Let X = matrix of input patterns, where each ROW is a pattern, so x_{k,i} = the ith bit of the kth pattern. Let Y = matrix of output patterns, where each ROW is a pattern, so y_{k,j} = the jth bit of the kth pattern.

Then the avg correlation between input bit i and output bit j across all patterns is:

  w_{i,j} = (1/P) (x_{1,i} y_{1,j} + x_{2,i} y_{2,j} + … + x_{P,i} y_{P,j})

To calculate all weights at once:
  Hetero-assoc: W = X^T Y
  Auto-assoc:   W = X^T X

[Figure: X^T laid out against Y; entry (i, j) of the product is the dot product of column i of X (bit i across patterns P1..Pp) with column j of Y.]
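As a quick check of this matrix form, the sketch below uses three made-up bipolar pattern pairs (not the slides' own patterns) and confirms that entry (i, j) of X^T Y matches the averaged sum written above.

import numpy as np

X = np.array([[ 1, -1,  1, -1],      # hypothetical input patterns, one per ROW
              [ 1,  1, -1, -1],
              [-1,  1,  1, -1]], dtype=float)
Y = np.array([[ 1, -1],              # hypothetical output patterns, one per ROW
              [-1,  1],
              [ 1,  1]], dtype=float)
P = X.shape[0]
W = X.T @ Y / P                      # hetero-associative weights
i, j = 2, 0
manual = sum(X[k, i] * Y[k, j] for k in range(P)) / P   # (1/P) * sum_k x_k,i * y_k,j
assert np.isclose(W[i, j], manual)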

Auto-Associative Memory

* 1 node per pattern unit
* Fully connected: clique
* Weights = avg correlations across all patterns of the corresponding units

[Figure, in three panels: 1. Auto-associative patterns to remember; 2. Distributed storage of all patterns; 3. Retrieval. Node value legend: dark (blue) with x => +1, dark (red) without x => -1, light (green) => 0.]

Hetero-Associative Memory

* 1 node per pattern unit for X & Y
* Full inter-layer connection
* Weights = avg correlations across all patterns of the corresponding units

[Figure, in three panels: 1. Hetero-associative patterns (pairs) to remember; 2. Distributed storage of all patterns; 3. Retrieval.]

Hopfield Networks

* Auto-association network
* Fully-connected (clique) with symmetric weights
* State of node = f(inputs)
* Weight values based on Hebbian principle
* Performance: Must iterate a bit to converge on a pattern, but generally much less computation than in back-propagation networks.

Discrete node update rule: each node takes the sign of its weighted input sum plus its external input value,

  x_k <- sign( Σ_j w_{k,j} x_j + I_k )

[Figure: noisy input pattern and the output pattern reached after many iterations.]
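A minimal sketch of one synchronous sweep of this update rule, assuming a bipolar state vector and a symmetric weight matrix with zero diagonal; the name hopfield_step is illustrative.

import numpy as np

def hopfield_step(W, x, external=None):
    # One synchronous update: every node takes the sign of its weighted input sum
    # (plus any external input I_k). Ties at exactly zero are broken toward +1 here.
    net = W @ x if external is None else W @ x + external
    return np.where(net >= 0, 1.0, -1.0)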

Hopfield Network Example

1. Patterns to Remember: p1, p2, p3 (shown in the slide's figure).
2. Hebbian Weight Init: Avg correlations across the 3 patterns; each weight w_{i,j} is the average of the products of bits i and j, giving values of ±1/3.
3. Build Network: arcs labelled [-] carry weight -1/3, arcs labelled [+] carry weight 1/3.
4. Enter Test Pattern.

[Figure: the per-pattern correlation tables, the averaged weight matrix, the resulting 4-node network, and the test pattern.]

Hopfield Network Example (2)

5. Synchronous Iteration (update all nodes at once)

[Table: for each of nodes 1-4, the values arriving from the input layer (±1/3), their sum, and the node's output from the discrete output rule sign(sum). The network settles into the stable state p1.]

Using Matrices

Goal: Set weights such that an input vector V_i yields itself when multiplied by the weights W.

Let X = [V_1, V_2, ..., V_p], where p = # input vectors (i.e., patterns). So Y = X, and the Hebbian weight calculation is:

  W = X^T Y = X^T X

The common index in the product is the pattern #, so X^T X is a correlation sum; e.g., for p = 3 patterns:

  w_{2,4} = w_{4,2} = x_{1,2} x_{1,4} + x_{2,2} x_{2,4} + x_{3,2} x_{3,4}   (the transpose turns x^T_{2,k} into x_{k,2})

[Figure: X and X^T written out for the example's three 4-bit patterns; the dot product of row 2 of X^T with column 4 of X gives w_{2,4}.]

Matrices (2)

The upper and lower triangles of the product matrix represent the 6 weights, w_{i,j} = w_{j,i}.

Scale the weights by dividing by p (i.e., averaging). Picton (ANN book) subtracts p from each. Either method is fine, as long as we apply the appropriate thresholds to the output values. This produces the same weights as in the non-matrix description.

Testing with input = (…): multiplying by W, scaling* by p = 3, and using 0 as a threshold gives (2/3 2/3 2/3 -2/3) => (1 1 1 -1).

* For illustrative purposes, it's easier to scale by p at the end instead of scaling the entire weight matrix W prior to testing.
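A small sketch of this test step with hypothetical stored patterns (the slide's concrete vectors are not reproduced here); it forms W = X^T X, zeroes the diagonal, and scales by p at the end before thresholding at 0, as the footnote suggests. Note that with 3 patterns in only 4 nodes the memory is overloaded, so the recalled vector may be spurious, which matches the later slides.

import numpy as np

X = np.array([[ 1,  1,  1, -1],      # hypothetical stored patterns, one per ROW
              [ 1, -1,  1,  1],
              [-1,  1,  1, -1]], dtype=float)
p = X.shape[0]
W = X.T @ X                          # unscaled correlation sums
np.fill_diagonal(W, 0.0)             # equivalent to subtracting p from each diagonal entry

test = np.array([1, 1, 1, 1], dtype=float)
scaled = (test @ W) / p              # scale by p at the end, as in the footnote
recalled = np.where(scaled >= 0, 1, -1)
print(scaled, recalled)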

Hopfield Network Example (3)

4b. Enter Another Test Pattern
5b. Synchronous Iteration

[Table: incoming values (±1/3) and sign(sum) outputs for nodes 1-4 under this test pattern.]

The input pattern is stable, but not one of the original patterns. Attractors in node-state space can be whole patterns, parts of patterns, or other combinations: Spurious Outputs.

Hopfield Network Example (4)

4c. Enter Another Test Pattern
5c. Asynchronous Iteration (one randomly-chosen node at a time)

[Figure: the network state after updating node 3, then node 4, then node 2, with the ±1/3 weights on the arcs. The final state is stable & spurious.]

Asynchronous updating is central to Hopfield's (1982) original model.
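A runnable sketch of asynchronous retrieval in the same spirit, using a slightly larger hypothetical net (16 nodes, 2 random stored patterns) so that recall usually succeeds; the sizes, seed, and flipped bits are arbitrary illustrative choices.

import numpy as np

rng = np.random.default_rng(0)
patterns = rng.choice([-1.0, 1.0], size=(2, 16))   # two hypothetical stored patterns
W = patterns.T @ patterns / len(patterns)
np.fill_diagonal(W, 0.0)

x = patterns[0].copy()
x[[2, 7, 11]] *= -1                  # start from a noisy copy of the first pattern

for _ in range(200):                 # one randomly-chosen node at a time
    k = rng.integers(len(x))
    x[k] = 1.0 if W[k] @ x >= 0 else -1.0

print(np.array_equal(x, patterns[0]))   # often True for a lightly loaded net; spurious states remain possible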

Hopfield Network Example (5)

4d. Enter Another Test Pattern
5d. Asynchronous Iteration

[Figure: the network state after updating node 3, then node 4, then node 2. The final state is the stable pattern p3.]

Hopfield Network Example (6)

4e. Enter Same Test Pattern
5e. Asynchronous Iteration (but in a different order)

[Figure: updating node 2 first, then node 3 or 4 (no change). The final state is stable & spurious.]

Associative Retrieval = Search

Back-propagation: Search in the space of weight vectors to minimize output error.
Associative memory retrieval: Search in the space of node values to minimize conflicts between a) node-value pairs and average correlations (weights), and b) node values and their initial values.

Input patterns are local (sometimes global) minima, but many spurious patterns are also minima. There is a high dependence upon the initial pattern and the update sequence (if asynchronous).

[Figure: node-state space with the stored patterns p1, p2, p3 as minima.]

Energy Function

Basic Idea: The energy of the associative memory should be low when pairs of node values mirror the average correlations (i.e. weights) on the arcs that connect the node pair, and when current node values equal their initial values (from the test pattern):

  E = -(1/2) Σ_k Σ_j w_{k,j} x_j x_k - Σ_k I_k x_k

When pairs match correlations, w_{k,j} x_j x_k > 0; when current values match input values, I_k x_k > 0; both cases drive E down.

Gradient Descent: A little math shows that asynchronous updates using the discrete rule x_k <- sign(Σ_j w_{k,j} x_j + I_k) yield a gradient-descent search along the energy landscape for the E defined above.
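A minimal sketch of this energy function plus a numerical check that one asynchronous sign-update never increases it; the random symmetric weights and external inputs are stand-ins, not the example network from the earlier slides.

import numpy as np

def energy(W, x, I):
    # E = -(1/2) * sum_k sum_j w_k,j x_j x_k  -  sum_k I_k x_k
    return -0.5 * x @ W @ x - I @ x

rng = np.random.default_rng(1)
n = 8
A = rng.normal(size=(n, n))
W = (A + A.T) / 2.0                  # symmetric weights
np.fill_diagonal(W, 0.0)             # no self-connections
I = rng.normal(size=n)
x = rng.choice([-1.0, 1.0], size=n)

before = energy(W, x, I)
k = rng.integers(n)
x[k] = 1.0 if W[k] @ x + I[k] >= 0 else -1.0   # one asynchronous update of the discrete rule
assert energy(W, x, I) <= before + 1e-12       # the energy never goes up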

Storage Capacity of Hopfield Networks

Capacity = the relationship between the # patterns that can be stored & retrieved without error and the size of the network: capacity = # patterns / # nodes, or # patterns / # weights.

Use the following definition of 100% correct retrieval: when any of the stored patterns is entered completely (no noise), then that same pattern is returned by the network; i.e. the pattern is a stable attractor. A detailed proof shows that a Hopfield network of N nodes can achieve 100% correct retrieval on P patterns if:

  P < N / (4 ln N)

[Table: N versus the corresponding maximum P.]

In general, as more patterns are added to a network, the avg correlations will be less likely to match the correlations in any particular pattern. Hence, the likelihood of retrieval error will increase. => The key to perfect recall is selective ignorance!!
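A quick way to get a feel for this bound is to evaluate it for a few network sizes; the sizes below are illustrative choices, not the slide's own table.

import math

for n in (100, 1000, 10000):         # illustrative sizes
    print(f"N = {n:>6}:  P < {n / (4 * math.log(n)):.1f}")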

Stochastic Hopfield Networks

The node state is stochastically determined by the sum of its inputs: the node fires with a probability that grows with its net input (typically a sigmoid of the net input divided by a temperature). For these networks, effective retrieval is obtained when P < 0.138 N, which is an improvement over standard Hopfield nets.

Boltzmann Machines: Similar to Hopfield nets but with hidden layers. State changes occur either:
a. deterministically, when the change lowers the energy (ΔE < 0), or
b. stochastically, with probability e^(-ΔE/t), where t is a decreasing temperature variable and ΔE is the expected change in energy if the change is made.

The non-determinism allows the system to "jiggle" out of local minima.
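A hedged sketch of one stochastic node update with a temperature parameter; the sigmoid form below is a common parameterization rather than the slide's exact formula, and the function name is illustrative.

import numpy as np

def stochastic_update(W, x, I, k, t, rng):
    # Node k fires (+1) with a probability that grows with its net input and
    # sharpens as the temperature t decreases (a common sigmoid form).
    net = W[k] @ x + I[k]
    p_fire = 1.0 / (1.0 + np.exp(-net / t))
    x[k] = 1.0 if rng.random() < p_fire else -1.0
    return x

As t shrinks toward zero this approaches the deterministic sign rule, which is how a decreasing temperature schedule first lets the state jiggle out of poor minima and then settle into one.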

Hopfield Nets in the Brain??

The cerebral cortex is full of recurrent connections, and there is solid evidence for Hebbian synapse modification there. Hence, the cerebrum is believed to function as an associative memory. Flip-flop figures indicate distributed Hopfield-type coding, since we cannot hold both perceptions simultaneously (binding problem).

The Necker Cube

Which face is closer to the viewer? BCGF or ADHE?

[Figure: a Necker cube with vertices A-H, and a network of propositions such as Closer(A,B), Closer(H,G), Closer(C,D), Closer(G,H), Convex(A), Showing(G), Convex(G), Hidden(G), linked by excitatory and inhibitory connections.]

Only one side of the (neural) network can be active at a time. Steven Pinker (1997), "How the Mind Works", pg. 107.

Things to Remember

Auto-associative -vs- hetero-associative
  – Wide variety of net topologies
  – All use Hebbian learning => weights ~ avg correlations

One-shot -vs- iterative retrieval
  – Iterative gives much better error correction.

Asynchronous -vs- synchronous state updates
  – Synchronous updates can easily lead to oscillation.
  – Asynchronous updates can quickly find a local optimum (attractor); the update order can determine which attractor is reached.

Pattern retrieval = search in node-state space
  – Spurious patterns are hard to avoid, since many are attractors also.
  – Stochasticity helps jiggle out of local minima.
  – Memory load increase => recall error increase.

Associative -vs- feed-forward nets
  – Assoc: many-to-1 mapping; feed-forward: many-to-many mapping
  – Backprop is resource-intensive, while Hopfield iterative update is O(n).
  – Gradient descent on an error -vs- energy landscape: backprop => arc-weight space; Hopfield => node-state space