Chapter 5 Unsupervised learning


Introduction. Unsupervised learning: training samples contain only input patterns; no desired output is given (teacher-less). The network learns to form classes/clusters of sample patterns according to similarities among them, so patterns in a cluster have similar features. There is no prior knowledge of which features are important for classification, or of how many classes there are.

Introduction. NN models to be covered: competitive networks and competitive learning, winner-takes-all (WTA), Maxnet, Hamming net, counterpropagation nets, Adaptive Resonance Theory, and self-organizing maps (SOM). Applications: clustering, vector quantization, feature extraction, dimensionality reduction, and optimization.

NN Based on Competition. Competition is important for NN: competition between neurons has been observed in biological nervous systems, and competition is important in solving many problems. To classify an input pattern into one of m classes, the ideal case is that exactly one class node has output 1 and all others have output 0; in practice, more than one class node often has non-zero output. (Figure: input nodes x_1, ..., x_n feeding classification nodes C_1, ..., C_m.) If these class nodes compete with each other, eventually only one will win and all others will lose (winner-takes-all). The winner represents the computed classification of the input.

Winner-takes-all (WTA): among all competing nodes, only one will win and all others will lose. We mainly deal with single-winner WTA, but multiple-winner WTA is possible (and useful in some applications). The easiest way to realize WTA is to have an external, central arbitrator (a program) decide the winner by comparing the current outputs of the competitors (breaking ties arbitrarily). This is biologically unsound (no such external arbitrator exists in biological nervous systems).
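The external-arbitrator approach is easy to express directly in code. The sketch below (Python/NumPy, not from the slides) simply compares the competitors' current outputs and returns one winner; the tie-breaking rule (lowest index) is an arbitrary choice.

```python
import numpy as np

def winner_take_all(outputs):
    """External, central arbitrator: return the index of the single winning node.

    Ties are broken arbitrarily (np.argmax keeps the lowest index), as the slide
    allows. Computationally convenient, but biologically unsound.
    """
    return int(np.argmax(np.asarray(outputs, dtype=float)))

# Example: outputs of four competing class nodes for one input pattern
print(winner_take_all([0.2, 0.7, 0.7, 0.1]))   # -> 1 (tie with node 2 broken arbitrarily)
```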

Ways to realize competition in NN. Lateral inhibition (Maxnet, Mexican hat): the output of each node feeds to the others through inhibitory connections (with negative weights). Resource competition: the output of node k is distributed to nodes i and j in proportion to w_ik and w_jk, as well as to x_i and x_j; nodes also have self-decay; this is biologically sound. (Figures: lateral inhibition between x_i and x_j; resource competition from x_k to x_i and x_j.)

Fixed-weight Competitive Nets: Maxnet. Lateral inhibition between competitors: each node excites itself and inhibits every other competitor with weight -ε; in the standard Maxnet formulation the activations are updated as a_j(t+1) = max(0, a_j(t) - ε Σ_{k≠j} a_k(t)). Notes: competition is an iterative process that runs until the net stabilizes (at most one node with positive activation); ε should satisfy 0 < ε < 1/m, where m is the # of competitors: if ε is too small, it takes too long to converge; if ε is too big, it may suppress the entire network (no winner).

Fixed-weight Competitive Nets Example θ = 1, ε = 1/5 = 0.2 x(0) = (0.5 0.9 1 0.9 0.9 ) initial input x(1) = (0 0.24 0.36 0.24 0.24 ) x(2) = (0 0.072 0.216 0.072 0.072) x(3) = (0 0 0.1728 0 0 ) x(4) = (0 0 0.1728 0 0 ) = x(3) stabilized
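The following Python/NumPy sketch (not from the slides) implements the update rule assumed above and reproduces this example; the values match the table up to floating-point rounding.

```python
import numpy as np

def maxnet(x, eps, max_iters=100):
    """Iterate Maxnet lateral inhibition until the activations stop changing.

    Assumed update: x_j <- max(0, x_j - eps * (sum of the other activations)).
    """
    x = np.asarray(x, dtype=float)
    for _ in range(max_iters):
        new_x = np.maximum(x - eps * (x.sum() - x), 0.0)
        if np.allclose(new_x, x):
            break
        x = new_x
    return x

# theta = 1, eps = 1/5 = 0.2, initial input as in the example above
print(maxnet([0.5, 0.9, 1.0, 0.9, 0.9], eps=0.2))
# -> approximately [0, 0, 0.1728, 0, 0]: node 3 wins, all others are suppressed
```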

Mexican Hat. Architecture: for a given node, close neighbors are cooperative (mutually excitatory, w > 0), farther-away neighbors are competitive (mutually inhibitory, w < 0), and too-far-away neighbors are irrelevant (w = 0). This needs a definition of distance (neighborhood): one-dimensional, ordering by index (1, 2, ..., n); or two-dimensional, a lattice.
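A small illustration of this weight pattern in one dimension (Python/NumPy; the radii and weight values are illustrative assumptions, not taken from the slides):

```python
import numpy as np

def mexican_hat_weights(n, r_excite=1, r_inhibit=3, w_pos=0.6, w_neg=-0.4):
    """Build a 1-D Mexican-hat weight matrix over n nodes ordered by index.

    Neighbors within r_excite of a node get a positive (cooperative) weight,
    those between r_excite and r_inhibit a negative (competitive) weight,
    and anything farther away gets weight 0 (irrelevant).
    """
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            d = abs(i - j)                 # one-dimensional distance by index
            if d <= r_excite:
                W[i, j] = w_pos
            elif d <= r_inhibit:
                W[i, j] = w_neg
    return W

print(mexican_hat_weights(7))
```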

Equilibrium: negative input = positive input for all nodes. The winner has the highest activation; its cooperative neighbors also have positive activations; its competitive neighbors have negative (or zero) activations.

Hamming Network. The Hamming distance d of two vectors of dimension n is the number of bits in disagreement. In bipolar (±1) representation, the dot product counts agreements minus disagreements, so x · y = n - 2d, i.e., d = (n - x · y) / 2.

Hamming Network. The net computes -d between an input i and each of the P stored vectors i_1, ..., i_P of dimension n: n input nodes and P output nodes, one for each stored vector i_p, whose output = -d(i, i_p). Weights and bias (one choice consistent with the bipolar relation above): the weight vector of output node p is w_p = i_p / 2 and its bias is -n / 2. Output of the net: o_p = i_p · i / 2 - n / 2 = -d(i, i_p).
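A sketch of this computation in Python/NumPy, using the slide's convention that each output equals the negative Hamming distance. The stored vectors and the input below are made up for illustration; they are not the vectors from the example on the next slide.

```python
import numpy as np

def hamming_net(stored, x):
    """Hamming-net similarity layer: output node p computes -d(x, i_p).

    stored: P bipolar (+1/-1) vectors of dimension n, one per row.
    With weights i_p/2 and bias -n/2, node p outputs (i_p . x)/2 - n/2 = -d(x, i_p).
    """
    stored = np.asarray(stored, dtype=float)
    x = np.asarray(x, dtype=float)
    n = stored.shape[1]
    W = stored / 2.0           # one weight vector per stored pattern
    bias = -n / 2.0
    return W @ x + bias        # vector of -d(x, i_p), p = 1..P

# Hypothetical stored vectors and input (not the slide's example data)
stored = [[ 1,  1, -1, -1,  1, -1],
          [-1,  1,  1, -1, -1,  1],
          [ 1, -1,  1,  1, -1, -1]]
x = [1, 1, -1, -1, 1, 1]
out = hamming_net(stored, x)
print(out)                     # -> [-1. -3. -5.], the negative Hamming distances
print(int(np.argmax(out)))     # -> 0: closest stored vector (the WTA / Maxnet layer's job)
```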

Example: three stored vectors and an input vector (shown on the slide) with distances (4, 3, 2), so the output vector is (-4, -3, -2). If we want the vector with the smallest distance to i to win, put a Maxnet on top of the Hamming net (for WTA). We then have an associative memory: an input pattern recalls the stored vector that is closest to it (more on AM later).

Simple Competitive Learning. Unsupervised learning. Goal: learn to form classes/clusters of exemplars/sample patterns according to the similarities among these exemplars; patterns in a cluster would have similar features. There is no prior knowledge of which features are important for classification, or of how many classes there are. Architecture: output nodes Y_1, ..., Y_m represent the m classes; they are competitors (WTA is realized either by an external procedure or by lateral inhibition as in Maxnet).

Training: train the network so that the weight vector w_j associated with the jth output node becomes the representative vector of a class of input patterns. Initially all weights are randomly assigned. Two-phase unsupervised learning. Competing phase: apply an input vector i_l randomly chosen from the sample set, compute the output of all output nodes, and determine the winner among them (the winner is not given in the training samples, so this is unsupervised). Rewarding phase: the winner is rewarded by updating its weights to be closer to i_l (weights associated with all other output nodes are not updated: a kind of WTA). Repeat the two phases many times (gradually reducing the learning rate) until all weights have stabilized.
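The two-phase procedure can be sketched as a short training loop. The code below (Python/NumPy, not from the book) uses Euclidean distance for the competing phase and the Method 1 update from the next slide for the rewarding phase; the learning-rate schedule and the toy data are illustrative assumptions.

```python
import numpy as np

def competitive_learning(samples, m, eta=0.5, decay=0.99, epochs=50, seed=0):
    """Two-phase (compete / reward) unsupervised training of m output nodes.

    Competing phase: the winner is the node whose weight vector is closest
    to the input. Rewarding phase: only the winner moves toward the input
    (Method 1 update, w_j <- w_j + eta * (i_l - w_j)); eta decays toward zero.
    """
    rng = np.random.default_rng(seed)
    samples = np.asarray(samples, dtype=float)
    w = rng.normal(size=(m, samples.shape[1]))            # random initial weights
    for _ in range(epochs):
        for i_l in rng.permutation(samples):               # random presentation order
            j = np.argmin(np.linalg.norm(w - i_l, axis=1)) # competing phase (WTA)
            w[j] += eta * (i_l - w[j])                     # rewarding phase
        eta *= decay                                       # gradually reduce learning rate
    return w

# Two obvious clusters; with luck (no stuck node, see a later slide),
# one weight vector ends up near each cluster center.
samples = [[0.0, 0.1], [0.1, 0.0], [0.9, 1.0], [1.0, 0.9]]
print(competitive_learning(samples, m=2))
```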

Weight update. Method 1: w_j ← w_j + η(i_l - w_j). Method 2: w_j ← w_j + η i_l. In each method, w_j is moved closer to i_l. Normalize the weight vector to unit length after it is updated; sample input vectors are also normalized. (Figure: the winner's weight vector w_j moving toward the input i_l under the two updates.)
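Both update rules are one line each. The sketch below (Python/NumPy, with made-up vectors) shows that Method 1 interpolates between w_j and i_l, while Method 2 takes a step along i_l and then renormalizes to unit length.

```python
import numpy as np

def update_method_1(w_j, i_l, eta):
    """Method 1: move w_j a fraction eta of the way toward i_l."""
    return w_j + eta * (i_l - w_j)

def update_method_2(w_j, i_l, eta):
    """Method 2: add eta * i_l, then renormalize to unit length.

    Assumes sample input vectors are also normalized, as the slide requires.
    """
    w_new = w_j + eta * i_l
    return w_new / np.linalg.norm(w_new)

w_j = np.array([1.0, 0.0])
i_l = np.array([0.0, 1.0])
print(update_method_1(w_j, i_l, eta=0.5))   # -> [0.5 0.5]
print(update_method_2(w_j, i_l, eta=0.5))   # -> approximately [0.894 0.447]
```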

w_j moves toward the center of a cluster of sample vectors after repeated weight updates. Example: node j wins for three training samples i_1, i_2, and i_3. Starting from the initial weight vector w_j(0), after being successively trained by i_1, i_2, and i_3 the weight vector changes to w_j(1), w_j(2), and w_j(3). (Figure: w_j(0) through w_j(3) moving into the cluster formed by i_1, i_2, i_3.)

Examples. A simple example of competitive learning (pp. 168-170): 6 vectors of dimension 3 in 3 classes (6 input nodes, 3 output nodes). Weight matrices (given on the slide): Node A for class {i_2, i_4, i_5}; Node B for class {i_3}; Node C for class {i_1, i_6}.

Comments. Ideally, when learning stops, each w_j is close to the centroid of a group/cluster of sample input vectors. To stabilize w_j, the learning rate may be reduced slowly toward zero during learning. Number of output nodes: too few, and several clusters may be combined into one class; too many, and the result is over-classification; the ART model (later) allows output nodes to be added/removed dynamically. Initial w_j: learning results depend on the initial weights (node positions); they can be set from training samples known to be in distinct classes (provided such information is available) or at random (bad choices may cause anomalies). Results also depend on the sequence of sample presentation.

Example: w_1 will always win no matter which class the sample is from, and w_2 is stuck and will not participate in learning. To unstick it, let output nodes have some "conscience": temporarily shut off nodes which have had a very high winning rate (it is hard to determine what rate should be considered "very high"). (Figure: w_1 and w_2 relative to the sample clusters.)
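One way to implement such a conscience is to penalize nodes by their winning frequency. The sketch below (Python/NumPy; the penalty form and bias_factor are illustrative assumptions, not from the slides) handicaps nodes that win more than their fair share, so stuck nodes eventually get a chance to win and start learning.

```python
import numpy as np

def conscience_winner(w, i_l, win_counts, bias_factor=10.0):
    """Choose a winner while penalizing nodes that have won too often.

    A node's distance score is increased in proportion to how far its winning
    frequency exceeds the fair share 1/m, so a node that always wins is
    eventually handicapped in the competition.
    """
    w = np.asarray(w, dtype=float)
    m = len(w)
    total = max(win_counts.sum(), 1)
    win_freq = win_counts / total
    penalty = bias_factor * (win_freq - 1.0 / m)     # > 0 for nodes that over-win
    scores = np.linalg.norm(w - np.asarray(i_l, dtype=float), axis=1) + penalty
    return int(np.argmin(scores))

# Node 0 has won every time so far; node 1 now wins even though node 0 is
# much closer to the input.
w = np.array([[0.5, 0.5], [5.0, 5.0]])
print(conscience_winner(w, i_l=[0.4, 0.6], win_counts=np.array([20, 0])))   # -> 1
```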

Example: results depend on the sequence of sample presentation. (Figure: w_1 and w_2 end in different final positions when the same samples are presented in different orders.) Solution: initialize the w_j to randomly selected input vectors i_l that are far away from each other.