Machine Learning Neural Networks (3). Understanding Supervised and Unsupervised Learning.


1 Machine Learning Neural Networks (3)

2 Understanding Supervised and Unsupervised Learning

3 Two possible Solutions…

4 Supervised Learning Supervised learning is based on a labelled training set: the class of each piece of data in the training set is known, and the class labels are pre-determined and provided during the training phase.

5 Unsupervised Learning In supervised learning, an external teacher improves network performance by comparing desired and actual outputs and modifying the synaptic weights accordingly. However, most of the learning that takes place in our brains is completely unsupervised. This type of learning aims at the most efficient representation of the input space, regardless of any output space.

6 Supervised vs Unsupervised
Supervised learning. Tasks performed: classification, pattern recognition. NN models: Perceptron, feed-forward NN. Question answered: "What is the class of this data point?"
Unsupervised learning. Task performed: clustering. NN model: Self-Organizing Map. Questions answered: "What groupings exist in this data?" "How is each data point related to the data set as a whole?"

7 Unsupervised Learning
Input: a set of patterns P from an n-dimensional space S, with little or no information about their classification, evaluation, interesting features, etc. The network must learn these by itself! :)
Tasks:
– Clustering: group patterns based on similarity.
– Vector Quantization: fully divide S into a small set of regions (defined by codebook vectors) that also helps cluster P.
– Feature Extraction: reduce the dimensionality of S by removing unimportant features (i.e. those that do not help in clustering P).

8 Unsupervised learning The network must discover for itself patterns, features, regularities, correlations or categories in the input data and code them in its output. The units and connections must self-organize based on the stimuli they receive. Note that unsupervised learning is useful when there is redundancy in the input data: redundancy provides knowledge.

9 Unsupervised Neural Networks – Kohonen Learning Also known as the Self-Organizing Map. It learns a categorization of the input space. Neurons are connected into a 1-D or 2-D lattice, and each neuron represents a point in N-dimensional pattern space, defined by N weights. During training, the neurons move around to try to fit the data; changing the position of one neuron in data space influences the positions of its neighbours via the lattice connections.

10 Unsupervised Learning Applications of unsupervised learning include clustering, vector quantization, data compression, and feature extraction.

11 Self Organizing Map – Network Structure All inputs are connected by weights to each neuron, and the size of the neighbourhood changes as the network learns. The aim is to map similar inputs (sets of values) to similar neuron positions; data is clustered because it is mapped to the same node or group of nodes.

12 Self-organising maps (SOMs) Inspiration from biology: in the auditory pathway, nerve cells are arranged in relation to their frequency response (tonotopic organisation). Kohonen took inspiration from this to produce self-organising maps (SOMs). In a SOM, units located physically next to one another respond to input vectors that are 'similar'.

13 SOMs SOMs are useful because it is difficult for humans to visualise data with more than 3 dimensions. High-dimensional input vectors are 'projected down' onto a 2-D map in a way that maintains the natural ordering of similarity. A SOM is a 2-D array of neurons, with all inputs arriving at all neurons.

14 SOMs Initially each neuron has its own set of (random) weights. When an input arrives, the neuron whose pattern of weights is most similar to the input gives the largest response.

15 SOMs There is positive excitatory feedback between a SOM unit and its nearest neighbours. This causes all the units in the 'neighbourhood' of the winning unit to learn. As the distance from the winning unit increases, the degree of excitation falls until it becomes inhibition, producing a bubble of activity (the neighbourhood) around the unit with the largest net input (the Mexican-hat function).

16 SOMs Initially each weight is set to a random number. The Euclidean distance D is used to measure the difference between an input vector and the weights of a SOM unit (the square root of the sum of the squared differences): D = √( Σ_i (x_i − w_i)² )

17 SOMs For a 2-dimensional problem, the distance calculated in each neuron j is: D_j = √( (x_1 − w_1j)² + (x_2 − w_2j)² )

18 SOM The input vector is simultaneously compared to all elements in the network; the one with the lowest D is the winner. The weights of all units in the neighbourhood around the winning unit are then updated. If the winner is 'c', the neighbourhood is defined by a Mexican-hat function centred on 'c'.
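As an illustration of the winner search described above, here is a minimal sketch assuming NumPy arrays; the function and variable names are made up for illustration, not taken from the slides:

```python
import numpy as np

def find_winner(x, weights):
    """Return the index of the SOM unit whose weights are closest to input x.

    x       : input vector, shape (n_inputs,)
    weights : weight matrix, shape (n_units, n_inputs), one row per SOM unit
    """
    # Euclidean distance D for every unit: square root of the sum of squared differences
    d = np.sqrt(((weights - x) ** 2).sum(axis=1))
    # The unit with the lowest D is the winner
    return int(np.argmin(d))
```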

19 SOMs The weights of the units are adjusted using: Δw_ij = k (x_i − w_ij) Y_j, where Y_j comes from the Mexican-hat function and k is a value which changes over time (high at the start of training, low later on).

20 SOM Algorithm
1. Initialization: weights are set to unique random values.
2. Sampling: draw an input sample x and present it to the network.
3. Similarity matching: the winning neuron i is the neuron whose weight vector best matches the input vector: i = argmin_j ‖x − w_j‖.

21 SOM Algorithm
4. Updating: adjust the weights of the winning neuron so that they better match the input, and also adjust the weights of the neighbouring neurons: Δw_j = η h_ij (x − w_j), where h_ij is the neighbourhood function, which gets smaller over time.
Result: the neurons provide a good approximation of the input space, and neighbouring neurons come to correspond to similar input patterns.
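A minimal sketch of one training step for the algorithm above, assuming the lattice is stored as an array of unit coordinates; a Gaussian neighbourhood is used here as a stand-in for the Mexican-hat function, and all names and shapes are assumptions:

```python
import numpy as np

def som_step(x, weights, positions, eta, sigma):
    """One SOM update: find the winner, then pull its neighbourhood toward x.

    x         : input vector, shape (n_inputs,)
    weights   : unit weight vectors, shape (n_units, n_inputs)
    positions : lattice coordinates of each unit, shape (n_units, 2)
    eta       : learning rate (k in the slides), decays over time
    sigma     : neighbourhood radius (Nc), decays over time
    """
    # Similarity matching: i = argmin_j ||x - w_j||
    i = np.argmin(np.linalg.norm(weights - x, axis=1))
    # Neighbourhood function h_ij, here a Gaussian over lattice distance to the winner
    lattice_dist2 = ((positions - positions[i]) ** 2).sum(axis=1)
    h = np.exp(-lattice_dist2 / (2.0 * sigma ** 2))
    # Updating: delta_w_j = eta * h_ij * (x - w_j)
    weights += eta * h[:, None] * (x - weights)
    return weights
```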

22 Two distinct phases in training Initial ordering phase: units find the correct topological order. This might take around 1000 iterations, during which k decreases from 0.9 to 0.01 and Nc decreases from half the diameter of the network to 1 unit. Final convergence phase: the accuracy of the weights improves. k may decrease from 0.01 to 0 while Nc stays at 1 unit; this phase can be 10 to 100 times longer, depending on the desired accuracy.
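The two phases can be written as simple parameter schedules. The endpoint values below are the ones quoted on the slide; the linear decay and the default iteration counts are assumptions for illustration:

```python
def training_schedule(n_order=1000, n_converge=10000, half_diameter=10.0):
    """Yield (k, Nc) pairs: first the ordering phase, then the convergence phase."""
    # Ordering phase: k decays 0.9 -> 0.01, Nc decays from half the network diameter to 1
    for t in range(n_order):
        frac = t / n_order
        yield 0.9 + frac * (0.01 - 0.9), max(1.0, half_diameter * (1.0 - frac))
    # Convergence phase (often 10-100x longer): k decays 0.01 -> 0, Nc stays at 1 unit
    for t in range(n_converge):
        yield 0.01 * (1.0 - t / n_converge), 1.0
```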

23 WEBSOM All the words of a document are mapped onto the word-category map, and a histogram of the "hits" on it is formed. The largest experiments have used a word-category map of 315 neurons with 270 inputs each and a document map of 104,040 neurons with 315 inputs each. The self-organizing semantic map is 15x21 neurons; interrelated words that have similar contexts appear close to each other on the map. WEBSOM thus produces self-organizing maps of document collections.

24 WEBSOM

25 Clustering Data

26 K-Means Clustering
K-Means(k, data):
  Randomly choose k cluster centre locations (centroids).
  Loop until convergence:
    Assign each point to the cluster of the closest centroid.
    Re-estimate the cluster centroids based on the data assigned to each.
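A direct sketch of this pseudocode in NumPy; the function name, the fixed iteration cap, and the convergence test are assumptions:

```python
import numpy as np

def k_means(k, data, n_iter=100, seed=0):
    """Plain batch K-Means; data has shape (n_points, n_dims)."""
    rng = np.random.default_rng(seed)
    # Randomly choose k cluster centre locations (centroids)
    centroids = data[rng.choice(len(data), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each point to the cluster of the closest centroid
        dists = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Re-estimate the cluster centroids based on the data assigned to each
        new_centroids = np.array([
            data[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):  # convergence: centroids stopped moving
            break
        centroids = new_centroids
    return centroids, labels
```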

29 K-Means Animation Example generated by Andrew Moore using Dan Pelleg's super-duper fast K-means system: Dan Pelleg and Andrew Moore. Accelerating Exact k-means Algorithms with Geometric Reasoning. Proc. Conference on Knowledge Discovery in Databases 1999.

30 K-means Clustering
– Initialize the K weight vectors, e.g. to randomly chosen examples. Each weight vector represents a cluster.
– Assign each input example x to the cluster c(x) with the nearest corresponding weight vector: c(x) = argmin_j ‖x − w_j‖.
– Update the weights: move the weight vector of the winning cluster toward x.
– Increment n by 1 and repeat until no noticeable changes of the weight vectors occur.
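The incremental reading of this slide (assign one example, nudge the winning weight vector, repeat) could look like the sketch below; the 1/n learning-rate schedule is an assumption, not something stated on the slide:

```python
import numpy as np

def online_kmeans_step(x, weights, n):
    """One online K-means update for a single example x.

    x       : input example, shape (n_dims,)
    weights : cluster weight vectors, shape (K, n_dims)
    n       : iteration counter (>= 1), used for the decaying learning rate
    """
    # c(x): the cluster whose weight vector is nearest to x
    c = np.argmin(np.linalg.norm(weights - x, axis=1))
    eta = 1.0 / n                          # assumed schedule; decays over time
    weights[c] += eta * (x - weights[c])   # move the winning weight vector toward x
    return weights
```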

31 Simple competitive learning

32 Issues
How many clusters?
– User-given parameter K.
– Use model selection criteria (e.g. the Bayesian Information Criterion) with a penalisation term that accounts for model complexity. See e.g. X-means: http://www2.cs.cmu.edu/~dpelleg/kmeans.html
What similarity measure?
– Euclidean distance
– Correlation coefficient
– Ad-hoc similarity measure
How to assess the quality of a clustering?
– Compact and well-separated clusters are better; many different quality measures have been introduced. See e.g. C. H. Chou, M. C. Su and E. Lai, "A New Cluster Validity Measure and Its Application to Image Compression," Pattern Analysis and Applications, vol. 7, no. 2, pp. 205-220, 2004.
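One way to make the "how many clusters?" criterion concrete is sketched below, in the spirit of X-means: run K-means for a range of K and keep the K with the lowest BIC-style score. The least-squares BIC form used here is a rough stand-in rather than the exact X-means criterion, and it reuses the k_means sketch from slide 26:

```python
import numpy as np

def choose_k(data, k_values):
    """Pick K by a BIC-like score: n*log(SSE/n) + p*log(n), with p = K * n_dims parameters."""
    n, d = data.shape
    best_k, best_score = None, np.inf
    for k in k_values:
        centroids, labels = k_means(k, data)             # k_means from the earlier sketch
        sse = ((data - centroids[labels]) ** 2).sum()    # within-cluster scatter
        score = n * np.log(sse / n) + k * d * np.log(n)  # fit term + complexity penalty
        if score < best_score:
            best_k, best_score = k, score
    return best_k
```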

