381 Self-Organizing Map: Learning without Examples


2 Learning Objectives 1. Introduction 2. Hamming Net and MAXNET 3. Clustering 4. Feature Map 5. Self-organizing Feature Map 6. Conclusion

3 Introduction 1. Learning without examples 2. Data are presented to the system one at a time 3. The data are mapped into a one- or higher-dimensional space 4. Competitive learning

4 Hamming Distance(1/5) 1. Define the Hamming distance HD between two binary codes A and B of the same length as the number of places in which $a_i$ and $b_i$ differ. 2. Ex: A: [ ] B: [ ] => HD(A, B) = 3

5 Hamming Distance(2/5)

6 Hamming Distance(3/5)

7 Hamming Distance(4/5) 3. Alternatively, HD can be defined as the smallest number of edges that must be traversed between the two codes on the binary hypercube. 4. If A and B are bipolar binary vectors (components $\pm 1$) of length n, then their scalar product is $A^t B = (n - HD(A,B)) - HD(A,B) = n - 2\,HD(A,B)$.

8 Hamming Distance(5/5)
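A minimal sketch of the relation above between the scalar product of two bipolar vectors and their Hamming distance; the two codes used here are made up for illustration.

```python
import numpy as np

# Illustrative bipolar codes (values chosen only for this sketch).
A = np.array([1, -1, 1, 1, -1, -1, 1])
B = np.array([1, 1, -1, 1, -1, 1, 1])

n = len(A)
hd = int(np.sum(A != B))     # Hamming distance: positions where A and B differ
dot = int(A @ B)             # scalar product of the two bipolar vectors

print(hd, dot, n - 2 * hd)   # the scalar product equals n - 2*HD(A, B)
```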

9 Hamming Net and MAXNET(1/30) 1. For a two-layer classifier of bipolar binary vectors with p classes, there are p output neurons; the strongest response of a neuron indicates that it has the minimum HD to the input, and it identifies the category that neuron represents.

10 Hamming Net and MAXNET(2/30)

11 Hamming Net and MAXNET(3/30) 2. The MAXNET operates on the output values of the Hamming network, suppressing all but the strongest one. 3. For the Hamming net in the next figure we have: input vector X; p classes => p output neurons; output vector $Y = [y_1, \ldots, y_p]$.

12 Hamming Net and MAXNET(4/30)

13 Hamming Net and MAXNET(5/30) 4. For each output neuron m, m = 1, …, p, let $W_m = [w_{m1}, w_{m2}, \ldots, w_{mn}]^t$ be the weights between the input X and that output neuron. 5. Also assume that for each class m there is a prototype vector $S^{(m)}$ serving as the standard to be matched.

14 Hamming Net and MAXNET(6/30) 6. For classifying p classes, the m'th class should be selected if and only if $X = S^{(m)}$, which happens only when $W_m = S^{(m)}$. The matching scores of the classifier are $X^t S^{(1)}, X^t S^{(2)}, \ldots, X^t S^{(m)}, \ldots, X^t S^{(p)}$. So when $X = S^{(m)}$, the m'th score equals n and the other scores are smaller than n.

15 Hamming Net and MAXNET(7/30) 7. $X^t S^{(m)} = (n - HD(X, S^{(m)})) - HD(X, S^{(m)})$, therefore $\tfrac{1}{2} X^t S^{(m)} = \tfrac{n}{2} - HD(X, S^{(m)})$. So the weight matrix is $W_H = \tfrac{1}{2} S$.

16 Hamming Net and MAXNET(8/30) 8. By giving a fixed bias n/2 to each node, $net_m = \tfrac{1}{2} X^t S^{(m)} + \tfrac{n}{2}$ for m = 1, 2, …, p, or equivalently $net_m = n - HD(X, S^{(m)})$.

17 Hamming Net and MAXNET(9/30) 9. To scale the outputs down from the range 0–n to 0–1, one can apply the transfer function $f(net_m) = net_m / n$ for m = 1, 2, …, p.

18 Hamming Net and MAXNET(10/30)

19 Hamming Net and MAXNET(11/30) 10. So the node with the highest output is the one whose prototype among $S^{(1)}, \ldots, S^{(p)}$ has the smallest HD to the input; i.e., $f(net_m) = 1$ for that node (exact match) and $f(net_m) < 1$ for the other nodes.
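A minimal sketch of the Hamming-net matching layer defined above ($W_H = S/2$, bias n/2, scaling by 1/n); the prototype matrix and the distorted input below are invented for illustration.

```python
import numpy as np

def hamming_net(x, prototypes):
    """Matching-score layer of a Hamming net.

    x          : bipolar input vector of length n
    prototypes : p x n matrix whose rows are the class prototypes S^(m)
    Returns the scaled scores f(net_m) = net_m / n = 1 - HD(x, S^(m)) / n.
    """
    n = len(x)
    wh = 0.5 * prototypes          # weight matrix W_H = S / 2
    net = wh @ x + n / 2.0         # net_m = (1/2) x^t S^(m) + n/2 = n - HD(x, S^(m))
    return net / n                 # scale from 0..n down to 0..1

# Toy prototypes and a distorted input (values invented for this sketch).
S = np.array([[ 1,  1, -1, -1],
              [-1,  1,  1, -1],
              [ 1, -1,  1,  1]])
x = np.array([1, 1, -1, 1])
print(hamming_net(x, S))           # the largest score marks the closest class
```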

20 Hamming Net and MAXNET(12/30) 11. MAXNET is employed as a second layer only for the cases where an enhancement of the initially dominant response of the m'th node is required; i.e., the purpose of MAXNET is to drive the maximum of $\{y_1, \ldots, y_p\}$ toward 1 and the others to 0.

21 Hamming Net and MAXNET(13/30)

22 Hamming Net and MAXNET(14/30) 12. To achieve this, one can let the MAXNET weights be $w_{ii} = 1$ and $w_{ij} = -\varepsilon$ for $j \ne i$, where i = 1, …, p and j = 1, …, p, since each initial $y_i$ (the output of the Hamming net) is bounded by $0 \le y_i \le 1$ for i = 1, …, p.

23 Hamming Net and MAXNET(15/30) 13. So $\varepsilon$ is bounded by $0 < \varepsilon < 1/p$, where $\varepsilon$ is the lateral interaction coefficient.

24 Hamming Net and MAXNET(16/30)

25 Hamming Net and MAXNET(17/30) 14. So the transfer function is $f(net) = net$ for $net \ge 0$ and $f(net) = 0$ for $net < 0$, applied recurrently as $y_i(k+1) = f\big(y_i(k) - \varepsilon \sum_{j \ne i} y_j(k)\big)$.

26 Hamming Net and MAXNET(18/30)
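A minimal sketch of the MAXNET recurrence described above, using the ramp transfer function and a lateral coefficient below 1/p; the starting scores are illustrative (they match the toy Hamming-net output in the previous sketch).

```python
import numpy as np

def maxnet(y0, eps, max_iter=50):
    """Iterate the MAXNET recurrence until a single node stays positive.

    y0  : initial scores from the Hamming net, each in [0, 1]
    eps : lateral interaction coefficient, 0 < eps < 1/p
    """
    y = np.asarray(y0, dtype=float)
    for _ in range(max_iter):
        net = y - eps * (y.sum() - y)      # net_i = y_i - eps * sum_{j != i} y_j
        y_new = np.maximum(net, 0.0)       # ramp transfer function f(net)
        if np.count_nonzero(y_new) <= 1:   # only the winner remains
            return y_new
        y = y_new
    return y

# p = 3 classes, eps = 0.2 < 1/3, illustrative starting scores.
print(maxnet([0.75, 0.25, 0.5], eps=0.2))
```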

27 Hamming Net and MAXNET(19/30) Ex: To build a Hamming net for classifying the characters C, I, T, let $S^{(1)}$ = [ ]$^t$, $S^{(2)}$ = [ ]$^t$, $S^{(3)}$ = [ ]$^t$.

28 Hamming Net and MAXNET(20/30)

29 Hamming Net and MAXNET(21/30)

30 Hamming Net and MAXNET(22/30)

31 Hamming Net and MAXNET(23/30) Input to MAXNET: select $\varepsilon = 0.2 < 1/3\ (= 1/p)$.

32 Hamming Net and MAXNET(24/30)

33 Hamming Net and MAXNET(25/30): iteration k=0

34 Hamming Net and MAXNET(27/30): iteration k=1

35 Hamming Net and MAXNET(28/30): iteration k=2

36 Hamming Net and MAXNET(29/30): iteration k=3

37 Hamming Net and MAXNET(30/30) Summary: the Hamming network only tells which class the input most likely belongs to; it does not restore the distorted pattern.

38 Clustering—Unsupervised Learning(1/20) 1. Introduction: a. to categorize or cluster data; b. grouping similar objects and separating dissimilar ones. 2. Assume the pattern set $\{X_1, X_2, \ldots, X_N\}$ is submitted; the decision function required to identify possible clusters is determined by similarity rules.

39 Clustering—Unsupervised Learning(2/20)

40 Clustering—Unsupervised Learning(3/20)

41 Clustering—Unsupervised Learning(4/20) 3. Winner-Take-All learning: assume the input vectors are to be classified into one of a specified number p of categories according to the clusters detected in the training set $\{X_1, X_2, \ldots, X_N\}$.

42 Clustering—Unsupervised Learning(5/20)

43 Clustering—Unsupervised Learning(6/20) Kohonen network: $Y = f(WX)$ for input vectors of size n.

44 Clustering—Unsupervised Learning(7/20) Prior to learning, normalization of all weight vectors is required: $\hat{W}_i = W_i / \|W_i\|$.

45 Clustering—Unsupervised Learning(8/20) The meaning of training is to find, among all $\hat{W}_i$, the vector $\hat{W}_m$ such that $\|X - \hat{W}_m\| = \min_i \|X - \hat{W}_i\|$; i.e., the m'th neuron, with weight vector $\hat{W}_m$, is the closest approximation of the current X.

46 Clustering—Unsupervised Learning(9/20)

47 Clustering—Unsupervised Learning(10/20) So the m'th neuron is the winning neuron; it has the largest value of $net_i$, i = 1, …, p.

48 Clustering—Unsupervised Learning(11/20) Thus $W_m$ should be adjusted so that $\|X - W_m\|$ is reduced. $\|X - W_m\|$ can be reduced by following the negative gradient: $\nabla_{W_m} \|X - W_m\|^2 = -2(X - W_m)$, i.e., by increasing $W_m$ in the $(X - W_m)$ direction.

49 Clustering—Unsupervised Learning(12/20) Since each X contributes only a single step, we take only a fraction of $(X - W_m)$; thus for the winning neuron m: $\Delta W_m = \alpha (X - W_m)$ with $0.1 < \alpha < 0.7$, and for the other neurons: $\Delta W_i = 0$.

50 Clustering—Unsupervised Learning(13/20) In general: $W_m^{new} = W_m^{old} + \alpha (X - W_m^{old})$ for the winner, and $W_i^{new} = W_i^{old}$ for $i \ne m$.
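A minimal sketch of the winner-take-all update above, assuming unit-normalized inputs and weights and a fixed learning rate; the data, number of neurons, and rate value are illustrative.

```python
import numpy as np

def winner_take_all_step(W, x, alpha=0.3):
    """One winner-take-all (Kohonen-layer) update.

    W     : p x n matrix of weight vectors (rows assumed unit length)
    x     : input vector of length n (assumed unit length)
    alpha : learning rate, typically 0.1 < alpha < 0.7
    """
    net = W @ x                       # net_i = W_i . x; the largest is closest in angle to x
    m = int(np.argmax(net))           # index of the winning neuron
    W[m] += alpha * (x - W[m])        # rotate only the winner toward x
    W[m] /= np.linalg.norm(W[m])      # re-normalize before the next step
    return m

# Illustrative use: cluster 200 random unit vectors into p = 3 categories.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
X /= np.linalg.norm(X, axis=1, keepdims=True)
W = rng.normal(size=(3, 2))
W /= np.linalg.norm(W, axis=1, keepdims=True)
for x in X:
    winner_take_all_step(W, x)
print(W)                              # trained unit-length weight vectors, one per detected group
```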

51 Clustering—Unsupervised Learning(15/20) 4. Geometrical interpretation: with the input vector normalized, the winner is the weight vector closest in angle to X on the unit sphere.

52 Clustering—Unsupervised Learning(16/20)

53 Clustering—Unsupervised Learning(17/20) So the adjusted weight vector $W_m'$ is created.

54 Clustering—Unsupervised Learning(18/20) Thus the weight adjustment is mainly a rotation of the weight vector toward the input vector without a significant length change. $W_m'$ is no longer of unit length, so before the next training stage $W_m'$ must be normalized again. Each trained weight vector points toward the center of gravity of its cluster on the unit sphere.

55 Clustering—Unsupervised Learning(19/20) 5. In the case where some patterns are of known class, use α > 0 for the correct node and α < 0 otherwise. This accelerates the learning process significantly.

56 Clustering—Unsupervised Learning(20/20) 6. Another modification is to adjust the weights of both winners and losers: leaky competitive learning. 7. Recall $Y = f(WX)$. 8. Initialization of weights: choose weights randomly from U(0,1), or use the convex combination method.
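A minimal sketch of the leaky variant mentioned in point 6, under the common assumption that losers are updated with a much smaller rate than the winner; both rates and the update form are illustrative rather than values taken from the slides.

```python
import numpy as np

def leaky_competitive_step(W, x, alpha_win=0.3, alpha_lose=0.01):
    """One leaky competitive-learning update: the losers also move, but far less.

    W : p x n matrix of unit-length weight vectors; x : unit-length input vector.
    """
    m = int(np.argmax(W @ x))             # winning neuron, as in plain winner-take-all
    for i in range(len(W)):
        rate = alpha_win if i == m else alpha_lose
        W[i] += rate * (x - W[i])         # every neuron leaks toward x
        W[i] /= np.linalg.norm(W[i])      # keep the weight vectors unit length
    return m
```

Letting the losers move slightly helps neurons that never win drift toward the data instead of staying stuck at their initial positions.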

57 Feature Map(1/6) 1. Transform from a high-dimensional pattern space to a low-dimensional feature space. 2. Feature extraction falls into two categories, natural structure and no natural structure, depending on whether the pattern is similar to human perception or not.

58 Feature Map(2/6)

59 Feature Map(3/6) 3. Another important aspect is to represent the features as naturally as possible. 4. A self-organizing neural array can map features from the pattern space.

60 Feature Map(4/6) 5. Use a one-dimensional or two-dimensional array of neurons. 6. Example: X: input vector; i: neuron i; $w_{ij}$: weight from input $x_j$ to neuron i; $y_i$: output of neuron i.

61 Feature Map(5/6)

62 Feature Map(6/6) 7. So define the output as shown in the figure. 8. Thus, (b) is the result of (a).

63 Self-organizing Feature Map(1/6) 1. One-dimensional mapping: for a set of patterns $X_i$, i = 1, 2, …, use a linear array of neurons whose winning indices preserve the ordering $i_1 > i_2 > i_3 > \ldots$ of the corresponding inputs.

64 Self-organizing Feature Map(2/6) 2. Thus, this learning is to find the best-matching neuron cell, which activates its spatial neighbors to react to the same input. 3. That is, to find $W_c$ such that $\|X - W_c\| = \min_i \|X - W_i\|$.

65 Self-organizing Feature Map(3/6) 4. In case of two winners, choose the lower i. 5. If c is the winning neuron, define $N_c$ as the neighborhood around c; $N_c$ keeps changing as learning proceeds. 6. Then the update rule is $W_i(t+1) = W_i(t) + \alpha(t)\,(X - W_i(t))$ for $i \in N_c$, and $W_i(t+1) = W_i(t)$ otherwise.

66 Self-organizing Feature Map(4/6) 7. $\alpha(t)$ is a decreasing function of t (e.g., $\alpha(t) = \alpha_0 (1 - t/T)$), where t is the current training iteration and T is the total number of training steps to be done. 8. $\alpha$ starts from $\alpha_0$ and is decreased until it reaches the value 0.

67 Self-organizing Feature Map(5/6) 9. If a rectangular neighborhood is used, then $N_c$ contains the neurons with $c - d < x < c + d$ and $c - d < y < c + d$, and as learning continues d is decreased from $d_0$ to 1.

68 Self-organizing Feature Map - Example 10. Example: a. Input patterns are chosen at random from a uniform distribution. b. The data employed in the experiment comprised 500 points distributed uniformly over the bipolar square [−1, 1] × [−1, 1]. c. The points thus describe a geometrically square topology. d. Thus, the initial weights can be plotted as in the next figure. e. A "-" connection line connects two adjacent neurons in the competitive layer.
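A minimal sketch of the SOM training loop from the preceding slides, run on 500 points drawn uniformly from the bipolar square of this example; the 8 x 8 grid size, the number of passes, and the linear schedules for the learning rate and the neighborhood width are illustrative assumptions, not values from the slides.

```python
import numpy as np

rng = np.random.default_rng(1)

# 500 training points distributed uniformly over the bipolar square [-1, 1] x [-1, 1].
X = rng.uniform(-1.0, 1.0, size=(500, 2))

rows, cols = 8, 8                                  # small 2-D competitive layer (assumed size)
W = rng.uniform(-0.1, 0.1, size=(rows, cols, 2))   # initial weights clustered near the origin

T = 20                  # total number of passes over the data
alpha0, d0 = 0.5, 4     # initial learning rate and neighborhood half-width

for t in range(T):
    alpha = alpha0 * (1 - t / T)                   # alpha decreases toward 0
    d = max(1, round(d0 * (1 - t / T)))            # rectangular neighborhood shrinks to 1
    for x in X:
        # Winning neuron c: the grid unit whose weight vector is closest to x
        dist = np.linalg.norm(W - x, axis=2)
        cx, cy = np.unravel_index(np.argmin(dist), dist.shape)
        # Update every neuron inside the rectangular neighborhood N_c
        for i in range(max(0, cx - d), min(rows, cx + d + 1)):
            for j in range(max(0, cy - d), min(cols, cy + d + 1)):
                W[i, j] += alpha * (x - W[i, j])

# Plotting W with "-" lines between grid-adjacent neurons reproduces the kind of
# unfolding maps shown in the figures that follow.
```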

69 Self-organizing Feature Map - Example

70 Self-organizing Feature Map - Example

71 Self-organizing Feature Map - Example

72 Self-organizing Feature Map - Example