Published by Grace Martinez. Modified over 2 years ago.

1
Bioinspired Computing, Lecture 16: Associative Memories with Artificial Neural Networks
Netta Cohen

2
Last time: Recurrent neural nets
– Biologically realistic architecture
– Dynamic interactive behaviour
– Natural learning protocols
Today: Attractor neural nets
– Biologically-inspired associative memories
– Also… steps away from biologically realistic model
– Unsupervised learning
– Applications

3
Recurrent Nets: Pros & Cons
Pros
– Biologically-realistic architecture/performance
– Complex self-sustained activity
– Distributed representations
– Dynamic interactions with environment
– Powerful computation
– Noise tolerance
– Graceful degradation
Cons
– Hard to formalise in information processing terms
– Hard to visualise activity
– Hard to train, with no guarantee of convergence
– No guaranteed solution
Attractor neural nets are a special case of recurrent nets.

4
Training Methods for Recurrent Networks
Genetic algorithms
– Usually in a controller
– Fitness evaluated on the controller, not the network
– Requires sophisticated genome
Backpropagation through time (BPTT)
– Understand principle

7
Training Methods for Recurrent Networks
– Jets and Sharks: interesting dynamic behaviour, but weights set by hand
– Looking for an automatic, grounded way of training networks
– Hopfield (today)!
– Elman (also today … turn a feedforward net into something recurrent)

8
Associative Memory
The imprinting and recollection of memories is an important component of what we do & how we process information. If we were to model these processes, here are a few conditions we might want to include in our model:
– Store and reliably recall multiple independent memories.
– Given only partial input, recall complete information, or
– Given noisy input, recall noise-free prototype information.
– Learn new memories in a biologically realistic manner.
– Recall memories fast enough (before next input is received).
– Once recalled, maintain attention or memory long enough (for information processing & transmission elsewhere in brain).

9
Attractor Neural Nets
In RNNs, the state of the system is dictated both by internal dynamics & the system response to inputs from the environment. In some cases, trajectories in state-space can be guaranteed to lead to one of several stable states (fixed points, cycles or generic attractors).
[Figure: cross-section of energy landscape over state space]

10
Attractor Neural Nets (cont.)
Dynamical systems such as RNNs could serve as models of associative memory if it were possible to encode each memory in a specific stable state or attractor. In 1982, John Hopfield realised that by imposing a couple of restrictions on the architecture of the nets, he could guarantee the existence of attractors, such that every initial condition would necessarily evolve to a stable solution, where it would stay. This is tantamount to the requirement that the above picture be described in terms of an energy landscape.

11
Attractor Neural Nets: Architecture
– No self-connectedness: $w_{ii} = 0$
– All connections are symmetric: $w_{ij} = w_{ji}$
Gerard Toulouse has called Hopfield's use of symmetric connections "a clever step backwards from biological realism". The cleverness arises from the existence of an energy function.*
The existence of an energy function provides us with:
– A formalism of the process of memory storage and recall
– A tool to visualise the activity (both learning and recall)
– A straightforward way to train the net
– Once trained, a guaranteed solution (recall of the correct memory).
* Hertz, Krogh & Palmer, Introduction to the theory of neural computation (1990).

12
How Does It Work?
– Nodes are modelled by conventional binary McCulloch-Pitts (MP) neurons.
– Each neuron serves both as an input and output unit. (There are no hidden units.)
– States are given by the pattern of activity of the neurons (e.g. 101 for a network with three neurons). The number of neurons sets the maximum length for a bit-string of memory.
– Different patterns can be simultaneously stored in the network. The number of independent patterns that can be remembered is less than or equal to the number of nodes.
– Memory recall corresponds to a trajectory taking the system from some initial state (input) to the local energy minimum (closest association).
– Each step along the (recall) trajectory results in the same or a lower energy. Since energy is bounded from below, a solution is guaranteed for every input.

13
A working example
[Figure: weight matrix $w_{ij}$ (rows: neuron i, columns: neuron j) and the states $x_i$ at Input (t=0), t=1, t=2, t=3, …; threshold = 0]
Exercise: repeat this example with an initial input of [ ].

14
More general examples
– Stability: The stable pattern reached in the working example represents a fixed point in the dynamics. While stable solutions are guaranteed, not all stable solutions are fixed point solutions.
– State update rule: This example used a synchronous updating method. Asynchronous (sequential or random) updating methods can also be implemented.

15
Asynchronous random updating
– Pick any node
– Update it
– Repeat
Example difference between asynchronous/synchronous: Hopfield network with two nodes, W = 1, initial state -1, 1
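The two-node example can be traced by hand or with a few lines of code. The sketch below assumes that "W = 1" means $w_{12} = w_{21} = 1$ with no self-connections, and that a zero input leaves a node unchanged; the function names are illustrative only.

```python
def sign_update(total_input, old_state):
    """Sign rule with threshold 0; a zero input leaves the state unchanged."""
    if total_input > 0:
        return 1
    if total_input < 0:
        return -1
    return old_state

# Assumed reading of "W = 1": symmetric coupling, no self-connections.
W = [[0, 1],
     [1, 0]]

def synchronous_step(state):
    """Update both nodes at once, each using only the previous state."""
    return [sign_update(sum(W[i][j] * state[j] for j in range(2)), state[i])
            for i in range(2)]

def asynchronous_step(state, i):
    """Update node i only; the other node keeps its value."""
    new_state = list(state)
    new_state[i] = sign_update(sum(W[i][j] * state[j] for j in range(2)), state[i])
    return new_state

start = [-1, 1]

# Synchronous updating: the two nodes keep swapping values (a cycle of length 2).
sync_trajectory = [start]
for _ in range(4):
    sync_trajectory.append(synchronous_step(sync_trajectory[-1]))

# Asynchronous updating: one update is enough to land on a fixed point.
async_first_node = asynchronous_step(start, 0)   # node 0 aligns with node 1
async_second_node = asynchronous_step(start, 1)  # node 1 aligns with node 0

print(sync_trajectory)
print(async_first_node, async_second_node)
```

Under these assumptions the synchronous trajectory alternates between [-1, 1] and [1, -1] and never settles, while either asynchronous choice reaches a fixed point ([1, 1] or [-1, -1]) after a single update.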

16
0,1 or -1,1? No essential difference!
Suppose we use weights $w_{ij}$ for nodes that take the values -1, 1.
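One way to see the equivalence, given here as a sketch rather than as the slide's own derivation: write each $\pm 1$ state as $x_j = 2v_j - 1$ with $v_j \in \{0, 1\}$ (the symbol $v_j$ is introduced only for this illustration). The input to node $i$ then becomes

$$\sum_j w_{ij} x_j = \sum_j 2 w_{ij} v_j - \sum_j w_{ij},$$

so the condition $\sum_j w_{ij} x_j > 0$ is the same as

$$\sum_j (2 w_{ij})\, v_j > \theta_i, \qquad \theta_i = \sum_j w_{ij}.$$

A $\pm 1$ network with weights $w_{ij}$ and threshold 0 therefore behaves like a 0/1 network with weights $2w_{ij}$ and thresholds $\theta_i$.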

17
Trajectories in energy landscape
Where does energy come in? The formalism we need to answer this question comes from physics (spin glasses) and requires slight modifications to our notation:
1) Neuron coding: [0 1] → [-1 1]
2) Threshold now becomes a sign function:
$$x = \operatorname{sign}(\text{input}) = \begin{cases} +1 & \text{if input} > 0 \\ \text{old } x & \text{if input} = 0 \\ -1 & \text{if input} < 0 \end{cases}$$
3) Asynchronous update

18
From energies to MP neurons
Define the energy of node i as
$$E_i = -x_i \, \text{input}_i$$
where the input to node i is the weighted sum over all neurons:
$$\text{input}_i = \sum_j w_{ij} x_j$$
This is called the mean field approximation: the magnetic field at each node (each spin) corresponds to a weighted average over all the fields generated by all other spins. When a specific node senses this field, it wants to align with the mean field, thus reducing its energy. This is the update rule:
$$x_i \leftarrow \operatorname{sign}(\text{input}_i)$$
We have just re-discovered that the MP update rule exactly corresponds to magnetic field alignment in Spin Glasses!

19
Asynchronous random updating
– Easy to show that updating minimizes energy
– Example given …
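The claim can be checked numerically. The sketch below is not the example from the lecture: it assumes the standard total energy $E = -\sum_{i<j} w_{ij} x_i x_j$ (half the sum of the per-node energies $E_i$), random integer weights chosen only for illustration, and the sign rule with threshold 0. With symmetric weights and $w_{ii} = 0$, changing node $i$ changes the energy by $-(x_i^{\text{new}} - x_i^{\text{old}})\,\text{input}_i \le 0$.

```python
import random

def energy(weights, state):
    """Total energy E = -sum over pairs i<j of w_ij * x_i * x_j."""
    n = len(state)
    return -sum(weights[i][j] * state[i] * state[j]
                for i in range(n) for j in range(i + 1, n))

def random_symmetric_weights(n, rng):
    """Integer weights with w_ij = w_ji and w_ii = 0 (integers keep the arithmetic exact)."""
    weights = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            weights[i][j] = weights[j][i] = rng.randint(-3, 3)
    return weights

def async_update(weights, state, i):
    """Align node i with its input; a zero input leaves it unchanged."""
    total_input = sum(weights[i][j] * state[j] for j in range(len(state)))
    if total_input > 0:
        state[i] = 1
    elif total_input < 0:
        state[i] = -1

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 10
weights = random_symmetric_weights(n, rng)
state = [rng.choice([-1, 1]) for _ in range(n)]

# Record the energy after every single-node update.
energies = [energy(weights, state)]
for _ in range(200):
    async_update(weights, state, rng.randrange(n))  # pick any node, update it, repeat
    energies.append(energy(weights, state))

never_increases = all(later <= earlier
                      for earlier, later in zip(energies, energies[1:]))
print(never_increases)
```

The same check fails in general if the weights are made asymmetric or if all nodes are updated synchronously, which is why both restrictions matter.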

20
Attractor Neural Nets
The restrictions imposed on the recurrent net are now:
– No self-connectedness: $w_{ii} = 0$
– All connections are symmetric: $w_{ij} = w_{ji}$
Gerard Toulouse has called Hopfield's use of symmetric connections "a clever step backward from biological realism". The cleverness arises from the existence of an energy function.*
The existence of an energy function provides us with:
– A formalism of the process of memory storage and recall
– A tool to visualise the activity (both learning and recall)
– A straightforward way to train the net
– Once trained, a guaranteed solution (recall of the correct memory).

21
Training the Net
We need to find the set of weights that encodes a single pattern p of length N bits as a minimum energy solution. The minimum energy is obtained when the output of node i exactly matches the inputs to that node. Plug in a guess for w so that this condition holds.
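A standard guess that satisfies this condition, offered here as an assumption rather than as the slide's own formula, is the outer-product choice (the $1/N$ normalisation is one common convention):

$$w_{ij} = \frac{1}{N}\, p_i p_j \quad (i \neq j), \qquad w_{ii} = 0.$$

Since $p_j^2 = 1$ for $\pm 1$ coding, the input to node $i$ when the network sits in state $p$ is

$$\text{input}_i = \sum_{j \neq i} w_{ij}\, p_j = \frac{1}{N}\, p_i \sum_{j \neq i} p_j^2 = \frac{N-1}{N}\, p_i,$$

so $\operatorname{sign}(\text{input}_i) = p_i$ for every node and the pattern is a fixed point of the update rule.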

22
Training the Net (cont.)
Now generalising for M memories or patterns (Generalised Hebb Rule):
This weight assignment is remarkably reminiscent of Hebbian learning: If two nodes are spiking at the same time, then the weight connecting them is strengthened. Here anti-correlated nodes result in negative (inhibitory) weights.
For gradual learning, only patterns introduced repeatedly will result in the formation of new memories; noise will be ignored.
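A minimal sketch of storage and recall follows. It assumes the common form of the generalised Hebb rule, $w_{ij} = \frac{1}{N}\sum_{\mu} p_i^{\mu} p_j^{\mu}$ with $w_{ii} = 0$, which may differ in normalisation from the formula on the slide; the two patterns and the function names are made up for illustration.

```python
def train_hebbian(patterns):
    """Assumed rule: w_ij = (1/N) * sum over patterns of p_i * p_j, with w_ii = 0."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += p[i] * p[j] / n
    return weights

def recall(weights, start, max_sweeps=20):
    """Sequential asynchronous updates until a full sweep changes nothing."""
    state = list(start)
    n = len(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            total_input = sum(weights[i][j] * state[j] for j in range(n))
            if total_input > 0:
                new_value = 1
            elif total_input < 0:
                new_value = -1
            else:
                new_value = state[i]  # zero input: keep the old state
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break
    return state

# Two illustrative 8-bit patterns in -1/+1 coding (chosen to be orthogonal).
pattern_a = [1, 1, 1, 1, -1, -1, -1, -1]
pattern_b = [1, -1, 1, -1, 1, -1, 1, -1]
weights = train_hebbian([pattern_a, pattern_b])

# Corrupt one bit of pattern_a and let the network relax.
noisy = list(pattern_a)
noisy[0] = -noisy[0]
recalled = recall(weights, noisy)
print(recalled == pattern_a)  # prints True
```

Both stored patterns are fixed points of this small network, and the one-bit error is corrected on the first sweep. With more patterns, or patterns that overlap strongly, recall can fail; this is the capacity question.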

23
Does it work?
(Applet source no longer available.)
This applet demonstrates:
– The distributed representation
– The ability to perform powerful computation
– High storage capacity (7 100-bit patterns in 100 neurons)
– High fidelity and noise tolerance
– Graceful degradation (for more memories)
Eliminated features:
– Dynamic inputs in training
– Intrinsic background activity

24
Adding extra training patterns
– For a single training pattern it is clear that the pattern is an energy minimum (why?)
– For two training patterns, each of them is still an energy minimum, slightly perturbed by the other (explicit example)
– General condition: p << N
– Consequence: resistance to damage
– No localisation of memory: holographic

25
Storage Capacity
How many memories can be stored in the network? To store M memories, each of length N bits, in a network of N neurons, we first ask how many stable patterns can be reached.
In 1987, McEliece et al. derived an upper limit for the number of memories that can be stored accurately: $M = N / (2 \log N)$.
E.g. for N = 100 neurons, M = 11 distinct memories, each 100 bits long, can be faithfully stored and recalled. To write out these 11 distinct memories would take 1100 bits!
In general, the coding efficiency of the network can be summarised as $2 \log N$ neurons per pattern (each N bits long). This enormous capacity is paid for by a potentially lengthy recall process.
McEliece et al., (1987) IEEE Trans. Inf. Theor. IT-33:
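The arithmetic behind the N = 100 example can be reproduced as below. The only assumption is that "log" is the natural logarithm, which is the reading that gives M ≈ 11; the function name is illustrative.

```python
import math

def mceliece_capacity(n_neurons):
    """Upper limit M = N / (2 ln N) on accurately stored patterns."""
    return n_neurons / (2 * math.log(n_neurons))

capacity = mceliece_capacity(100)
print(round(capacity, 2))     # prints 10.86, i.e. about 11 patterns
print(round(capacity) * 100)  # prints 1100, the bits needed to write them out
```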

26
Applications
Hopfield nets have obvious applications for any problem that can be posed in terms of optimisation, in the sense of maximising or minimising some function that can be likened to an energy function.
– The distance matching problem: given points, match pairs with the shortest total length.
– The travelling salesman problem: given points, find the shortest path.

27
What about the brain (pros)?
Hopfield nets maintain some very attractive features from recurrent net architectures. However, the imposition of symmetric weights was a conscious move away from biological realism and toward engineering-like reliability. In contrast, Hopfield nets seem more biologically realistic in disallowing self-connected neurons.
Hebbian-like learning is also a great appeal of Hopfield nets, capturing several important principles:
(1) unsupervised learning
(2) natural synaptic plasticity
(3) no necessary distinction between training & testing
(4) robustness to details of training procedure

28
What about the brain (cons)?
While we now have dynamics in training and in recall, we might still ask: are these dynamics realistic in the brain?
1) In the memory recall stage, we consider inputs one at a time, waiting for the association to be made before proceeding to the next pattern. Is this how the brain works?
2) The aspiration of every Hopfield net is to arrive at a stable solution. Is this a realistic representation of association, or of cognition in general, in the brain? In other words, do we represent solutions to problems by relaxing to stable states of activity in the brain, or does the brain represent solutions according to very different, dynamic paradigms that handle continuous inputs and actively resist falling into the abyss of equilibrium?

29
What about the brain? (cont.)
How memories are implanted in our brains remains an exciting research question. While Hopfield nets no longer participate in this discourse, their formative role in shaping our intuition about associative memories remains admirable.

30
Next time…
Final lecture about neural networks (for the time being)
Reading
– John Hopfield (1982) Neural Networks and Physical Systems with Emergent Collective Computational Abilities, Proc. Nat. Acad. Sci. 79:
– A highly accessible introduction to the subject, incl. both non-technical and technical approaches, can be found at:
– Some food for thought: a popular article on CNN: Researchers: It's easy to plant false memories, CNN.com, Feb 16,
