Presentation: Unsupervised learning
Summary from last week We explained what local minima are, and described ways of escaping them. We investigated how the backpropagation algorithm can be improved by changing various parameters and re-training.
Unsupervised learning Supervised learning = 'teacher' presents input patterns and desired target result. Unsupervised learning = input patterns but no 'teaching signal'. Self organisation = showing patterns to be classified, network produces own output representation.
Three properties required Value of output used as measure of similarity between input pattern and pattern stored in neuron. Competitive learning strategy selects neuron with largest response. Method of reinforcing largest response.
Self-organising maps (SOMs) Inspiration from biology: in auditory pathway, nerve cells arranged in relation to frequency response (tonotopic organisation). Kohonen took inspiration from this to produce self-organising maps (SOMs). In a SOM, units located physically next to one another will respond to input vectors that are ‘similar’.
SOMs Useful, as difficult for humans to visualise data with > 3 dimensions. High-dimensional input vectors 'projected down' onto 2-D map in a way that maintains natural order of similarity. SOM is 2-D array of neurons, all inputs arriving at all neurons (See Fig.).
SOMs Initially each neuron has own set of (random) weights. When input arrives neuron with pattern of weights most similar to input gives largest response.
SOMs Positive excitatory feedback between SOM unit and nearest neighbours. Causes all the units in ‘neighbourhood’ of winner unit to learn. As distance from winning unit increases degree of excitation falls until it becomes inhibition. Bubble of activity (neighbourhood) around unit with largest net input (Mexican-Hat function, See Fig.).
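The slides show the Mexican-Hat profile only as a figure. A minimal sketch of one common closed form (the Ricker function) follows; the specific formula and the `sigma` width parameter are assumptions, not taken from the slides.

```python
import math

def mexican_hat(distance, sigma=1.0):
    """Lateral interaction as a function of grid distance from the winner.

    Assumed Ricker form: excitatory (positive) close to the winner,
    zero at distance == sigma, inhibitory (negative) further out,
    fading back toward zero at large distances.
    """
    r2 = (distance / sigma) ** 2
    return (1.0 - r2) * math.exp(-r2 / 2.0)

# Excitation falls with distance and turns into inhibition beyond sigma.
for d in range(5):
    print(d, round(mexican_hat(d, sigma=2.0), 3))
```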
SOMs Initially each weight set to random number. Euclidean distance D used to find difference between input vector and weights of each SOM unit (D = square root of the sum of the squared differences): $D_j = \sqrt{\sum_i (x_i - w_{ij})^2}$
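The distance described above can be sketched in a few lines. The function name `euclidean_distance` and the list-based vector representation are illustrative choices, not part of the slides.

```python
import math

def euclidean_distance(x, w):
    """Square root of the sum of squared differences between input x and weights w."""
    if len(x) != len(w):
        raise ValueError("input and weight vectors must have the same length")
    return math.sqrt(sum((xi - wi) ** 2 for xi, wi in zip(x, w)))

# A unit whose weights equal the input has distance 0, i.e. the best possible match.
print(euclidean_distance([3.0, 4.0], [0.0, 0.0]))
```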
SOMs For a 2-dimensional problem, the distance calculated in each neuron j is: $D_j = \sqrt{(x_1 - w_{1j})^2 + (x_2 - w_{2j})^2}$
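As a worked instance with made-up numbers (not from the slides), take input x = (0.6, 0.8) and a neuron j with weights (0.3, 0.4):

```latex
D_j = \sqrt{(0.6 - 0.3)^2 + (0.8 - 0.4)^2}
    = \sqrt{0.09 + 0.16}
    = \sqrt{0.25}
    = 0.5
```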
Input vector simultaneously compared to all units in network; the one with lowest D is winner. Weights of all units in neighbourhood around winning unit are updated. As learning proceeds, size of neighbourhood is diminished until it contains only a single unit. If winner is ‘c’, neighbourhood Nc defined by Mexican Hat function around ‘c’ (see Fig.).
SOMs Weights of units are adjusted using: $\Delta w_{ij} = k (x_i - w_{ij}) Y_j$, where $Y_j$ comes from the Mexican Hat function (controlled by $N_c$).
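Winner selection and the shrinking neighbourhood can be sketched as below. The dictionary layout (grid position to weight vector) and the square-shaped neighbourhood are assumptions made for illustration; the slides define the neighbourhood only through the figure.

```python
def find_winner(x, weights):
    """Return the grid position of the unit with the lowest distance to x.

    weights maps (row, col) -> weight vector. Squared distance is used
    because it has the same minimum as the Euclidean distance D.
    """
    def squared_distance(pos):
        return sum((xi - wi) ** 2 for xi, wi in zip(x, weights[pos]))
    return min(weights, key=squared_distance)

def neighbourhood(winner, radius, rows, cols):
    """All grid positions within `radius` steps of the winner (assumed square shape)."""
    r0, c0 = winner
    return [(r, c) for r in range(rows) for c in range(cols)
            if max(abs(r - r0), abs(c - c0)) <= radius]

weights = {(0, 0): [0.0, 0.0], (0, 1): [1.0, 1.0],
           (1, 0): [0.5, 0.5], (1, 1): [0.9, 0.1]}
c = find_winner([0.9, 0.9], weights)
print(c, neighbourhood(c, 1, rows=2, cols=2))
```

With radius 0 the neighbourhood contains only the winner, which corresponds to the end state described above.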
SOMs k is a value which changes over time (high at start of training, low later on). If unit lies within the neighbourhood of winning unit its weight changed by difference between its weight vector and vector x multiplied by time factor k and function Yj. Each weight vector being updated rotates slightly toward input vector x.
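A minimal sketch of this update for a single unit, where each weight moves by k(xi - wij)Yj. The function name and the list representation are illustrative; `y` stands for the unit's Yj value from the neighbourhood function.

```python
def update_weights(w, x, k, y):
    """Move weight vector w toward input x by the fraction k * y.

    k is the time-dependent learning factor; y is the neighbourhood
    value Yj for this unit (0 leaves the unit unchanged).
    """
    return [wi + k * (xi - wi) * y for wi, xi in zip(w, x)]

# With k = 0.5 and y = 1 the weights move halfway toward the input.
print(update_weights([2.0, 4.0], [4.0, 0.0], k=0.5, y=1.0))
```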
Two distinct phases in training Initial ordering phase: units find correct topological order (might take 1000 iterations, where k decreases from 0.9 to 0.01 and Nc decreases from ½ diameter of the network to 1 unit). Final convergence phase: accuracy of weights improves (k may decrease from 0.01 to 0 while Nc stays at 1 unit). This phase could be 10 to 100 times longer, depending on desired accuracy.
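The two phases can be sketched as parameter schedules. The slides give only start and end values, so the linear decay, the grid diameter of 10 and the 10000-step convergence phase are assumptions chosen for illustration.

```python
def linear_decay(start, end, step, total_steps):
    """Linearly interpolate from start (step 0) to end (last step)."""
    fraction = step / (total_steps - 1) if total_steps > 1 else 1.0
    return start + (end - start) * fraction

def ordering_schedule(step, total_steps=1000, grid_diameter=10):
    """Ordering phase: k goes 0.9 -> 0.01, Nc goes half the diameter -> 1 unit."""
    k = linear_decay(0.9, 0.01, step, total_steps)
    nc = max(1, round(linear_decay(grid_diameter / 2, 1, step, total_steps)))
    return k, nc

def convergence_schedule(step, total_steps=10000):
    """Convergence phase: k goes 0.01 -> 0 while Nc stays at 1 unit."""
    return linear_decay(0.01, 0.0, step, total_steps), 1

print(ordering_schedule(0), ordering_schedule(999))
print(convergence_schedule(0), convergence_schedule(9999))
```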
Examples In notes: 2-D array of elements arranged in square to map rectangular 2-D coordinate space onto array where units learn to recognise their relative positions in two-dimensional space. Mapping world poverty (shown on video). Credit card fraud detection.
SOMs Possible to identify which regions belong to which class by showing network known patterns and seeing which areas become active.
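One way to sketch this labelling step: present patterns of known class to a trained map and record which unit wins for each. The majority-vote rule and the names used here are assumptions; the slides do not specify how ties or mixed regions are handled.

```python
from collections import Counter, defaultdict

def calibrate(weights, labelled_patterns):
    """Label each unit with the class that most often makes it the winner.

    weights maps (row, col) -> weight vector; labelled_patterns is a list
    of (input_vector, class_label) pairs. Units that never win get no label.
    """
    votes = defaultdict(Counter)
    for x, label in labelled_patterns:
        winner = min(weights, key=lambda pos: sum(
            (xi - wi) ** 2 for xi, wi in zip(x, weights[pos])))
        votes[winner][label] += 1
    return {pos: counts.most_common(1)[0][0] for pos, counts in votes.items()}

trained = {(0, 0): [0.0, 0.0], (0, 1): [1.0, 1.0]}
known = [([0.1, 0.0], "A"), ([0.0, 0.2], "A"), ([0.9, 1.0], "B")]
print(calibrate(trained, known))
```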
Feature map classifier Has an additional layer(s) of units that form output layer, can be trained by several methods (including backpropagation) to produce particular output given particular pattern of activation on SOM (see Fig.).
Neural phonetic typewriter (1986) Can transcribe speech into written text from unlimited (Finnish) vocabulary in real time. Accuracy 92-97%. 2-D array of units trained using 15-D inputs from pre-processed speech. Units in the 2-dimensional array are allowed to organise themselves in response to the input vectors. After training SOM calibrated using spectra of phonemes as inputs.
Neural phonetic typewriter (1986) Path across network results in phonetic transcription of the word. This used as input to rule-based system to be compared with known words.
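A heavily simplified sketch of turning a path of winning units into a phoneme string: look up each winner's calibrated label and merge consecutive repeats. This is an illustration only, not the original system; the unit labels and the path below are hypothetical, and merging repeats would also collapse genuinely long sounds, which a real system would need to handle.

```python
from itertools import groupby

def transcribe(winner_path, unit_labels):
    """Map a time-ordered path of winning units to a phoneme string.

    Unlabelled units are skipped; runs of the same label are merged,
    since one phoneme usually spans several consecutive time frames.
    """
    labels = [unit_labels[pos] for pos in winner_path if pos in unit_labels]
    return "".join(label for label, _ in groupby(labels))

# Hypothetical calibration result and winner path, for illustration only.
unit_labels = {(0, 0): "k", (0, 1): "a", (1, 1): "t"}
path = [(0, 0), (0, 0), (0, 1), (0, 1), (0, 1), (1, 1)]
print(transcribe(path, unit_labels))
```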
Summary Defined unsupervised learning, where no external ‘teacher’ is present. Discussed a self-organising neural network called a Self-Organising Map (SOM). SOM uses unsupervised learning to arrange the patterns stored in its neurons so that similar patterns are close to each other on the map and dissimilar patterns are far apart.