Self-Organizing Maps (SOM): Unsupervised Learning

Self-Organizing Maps. T. Kohonen (1995), Self-Organizing Maps. T. Kohonen, Dr. Eng., Emeritus Professor of the Academy of Finland. His research areas are the theory of self-organization, associative memories, neural networks, and pattern recognition, in which he has published over 300 research papers and four monographs.

SOM – What is it?
- The most popular ANN algorithm in the unsupervised learning category
- Converts relationships between high-dimensional data items into simple geometric relationships on a low-dimensional display
- Compresses information while preserving the most important topological and metric relationships of the primary data items
- Data visualization, feature extraction, pattern classification, adaptive control of robots, etc.

Vector quantization (VQ): a signal-approximation method that forms an approximation to the probability density function p(x) of a stochastic variable x using a finite number of so-called codebook vectors (reference vectors / basis vectors) w_i, i = 1, 2, …, k. Finding the closest reference vector w_c: c = arg min_i { ||x − w_i|| }, where ||x − w_i|| is the Euclidean norm. (Figure: reference vectors w_i and their Voronoi sets.)

VQ: Optimization. Average expected square of the quantization error:
E = ∫ ||x − w_c||² p(x) dx
For every x, with occurrence probability given via p(x), we calculate how well some w_c would approximate x, and then integrate over all x to get the total error.
Gradient-descent method: dw_i/dt = α δ_ci (x − w_i), where δ_ci is the Kronecker delta (= 1 for c = i, 0 otherwise).
Gradient descent is used to find those w_c for which the error is minimal.
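A minimal Python/NumPy sketch of this procedure, combining the nearest-codebook search from the previous slide with the stochastic gradient-descent update above; the toy data, codebook size and fixed learning rate are illustrative assumptions, not values from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 2-D samples of the stochastic variable x.
X = rng.normal(size=(1000, 2))

k = 8                                                     # number of codebook vectors (a free choice)
W = X[rng.choice(len(X), size=k, replace=False)].copy()   # initialize codebook from data samples
alpha = 0.05                                              # fixed learning rate for this sketch

for x in X:
    # c = arg min_i ||x - w_i||  (closest reference vector, as on the previous slide)
    c = np.argmin(np.linalg.norm(x - W, axis=1))
    # Kronecker delta: only the winning codebook vector w_c is moved towards x
    W[c] += alpha * (x - W[c])
```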

SOM: Feed-forward network. (Figure: input vector X connected to the map units through the weights W.)

SOM: Components. Inputs: x; weights: w. In this example each input X = (R, G, B) is a colour vector, of which we have six, and we use 16 codebook vectors (the number is a free choice).

SOM: Algorithm (a code sketch of the full loop follows the learning-rate slide below)
1. Initialize map (weights)
2. Select a sample (input)
3. Determine neighbors
4. Change weights
5. Repeat from 2 for a finite number of steps

SOM: Possible weight initialization methods
- Random initialization
- Using initial samples
- Ordering

SOM: Determining neighbors. (Figures: hexagonal grid and rectangular grid topologies.)

SOM: Gaussian neighborhood function
h_ci = exp( − ||r_c − r_i||² / (2 σ_t²) )
where r_c and r_i are the grid positions of units c and i, and σ_t is the (time-dependent) neighborhood radius.

SOM: Neighborhood functions

SOM: Learning rule
Gradient-descent method for VQ: dw_i/dt = α δ_ci (x − w_i), where δ_ci is the Kronecker delta (= 1 for c = i, 0 otherwise)
SOM learning rule: dw_i/dt = α_t h_ci (x − w_i), 0 < α_t < 1, where h_ci is the neighbourhood function

SOM: Learning-rate function
Linear: α_t = α_0 (1 − t/T)
Power series: α_t = α_0 (α_T / α_0)^(t/T)
Inverse-of-time: α_t = a / (t + b)
α_0 – initial learning rate, α_T – final learning rate, a, b – constants. (Figure: α_t versus time in steps.)
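Putting the preceding slides together (initialization, Gaussian neighborhood, learning rule, decaying learning rate and radius), a minimal SOM training loop might look like the following Python/NumPy sketch; the map size, schedules and toy data are illustrative assumptions, not values from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((500, 3))                        # toy inputs, e.g. (R, G, B) colour vectors

rows, cols, dim = 10, 10, 3                     # 10x10 rectangular map (a free choice)
W = rng.random((rows * cols, dim))              # 1. initialize weights randomly
grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)

T = 2000                                        # total number of training steps
a0, aT = 0.5, 0.01                              # initial / final learning rate
s0, sT = max(rows, cols) / 2.0, 0.5             # initial / final neighborhood radius

for t in range(T):
    x = X[rng.integers(len(X))]                           # 2. select a sample
    c = np.argmin(np.linalg.norm(x - W, axis=1))          #    best matching unit
    alpha = a0 * (aT / a0) ** (t / T)                     #    power-series learning rate
    sigma = s0 * (sT / s0) ** (t / T)                     #    shrinking neighborhood radius
    d2 = np.sum((grid - grid[c]) ** 2, axis=1)            # 3. squared grid distances to the winner
    h = np.exp(-d2 / (2 * sigma ** 2))                    #    Gaussian neighborhood h_ci
    W += alpha * h[:, None] * (x - W)                     # 4. dw_i/dt = alpha_t h_ci (x - w_i)
# 5. after T steps the rows of W are the trained codebook vectors
```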

SOM: Weight development ex. 1. (Figure: input space X and weight map W; eight inputs, 40x40 codebook vectors.) Neighborhood relationships are usually preserved (+). The final structure depends on the initial condition and cannot be predicted (−).

SOM: Weight development. (Figure: the weights w_i plotted over time, in steps.)

SOM: Weight development ex. 2. (Figure: input space X and weight map W; 40x40 codebook vectors.)

SOM: Examples of maps. Good: all neighbors meet. Bad: some neighbors stay apart. Bad cases could be avoided by non-random initialization. (Figure: one good and several bad example maps.)

SOM: Weight development ex. 3. (Figure: input space X and weight map W.)

SOM: Calculating goodness of fit. Average distance to neighboring cells:
d_j = (1/r) Σ_{i=1..r} ||w_j − w_i||
where i = 1…r, r is the number of neighboring cells, and j = 1…N, N is the number of reference vectors w. The "amount of grey" measures how well neighbors meet: the less grey, the better.
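A sketch of this fit measure for a rectangular map, reusing the trained weight array W and the rows/cols layout from the earlier training sketch; using the four grid neighbors is an illustrative assumption:

```python
import numpy as np

def average_neighbor_distance(W, rows, cols):
    """d_j = (1/r) * sum over the r grid neighbors i of ||w_j - w_i||, for every unit j."""
    Wg = W.reshape(rows, cols, -1)
    D = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            neighbors = [(r + dr, c + dc)
                         for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
                         if 0 <= r + dr < rows and 0 <= c + dc < cols]
            D[r, c] = np.mean([np.linalg.norm(Wg[r, c] - Wg[i, j]) for i, j in neighbors])
    return D  # shown as a grey-level image: the smaller d_j (less grey), the better the fit
```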

SOM: Examples of grey-level maps. (Figures: a worse and a better map.)

SOM: Classification
1) Use the input vectors X
2) Train the SOM to obtain the map W
3) Take an example
4) Look in the SOM map for the unit closest to the example
5) That unit's region is your cluster for classification
(Figure: some more examples.)
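A minimal sketch of steps 3-5, reusing the trained W from the training sketch; the per-unit cluster labels (unit_labels) are assumed to have been assigned by inspecting the map, and are not part of the slides:

```python
import numpy as np

def classify(example, W, unit_labels):
    """Find the map unit closest to the example and return its cluster label."""
    c = np.argmin(np.linalg.norm(example - W, axis=1))  # closest codebook vector
    return unit_labels[c]                               # cluster/label attached to that unit
```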

Biological SOM model
SOM learning rule: dw_i/dt = α_t h_ci (x − w_i), 0 < α_t < 1, h_ci – neighbourhood function
Biological SOM equation: dw_i/dt = α_t h_ci (x − w_i w_i^T x), 0 < α_t ≤ 1, h_ci – neighbourhood function

Variants of SOM
- Neuron-specific learning rates and neighborhood sizes
- Adaptive or flexible neighborhood definitions
- Growing map structures

“Batch map”
1. Initialize weights (e.g. with the first K training samples, where K is the number of weights)
2. For each map unit i, collect a list of copies of all those training samples x whose nearest reference vector belongs to the topological neighborhood set N_i of unit i
3. Update the weights by taking the weighted average of the respective list
4. Repeat from 2 a few times
Learning equation (cf. the K-means algorithm): w_i = Σ_j h_ic(j) x_j / Σ_j h_ic(j)
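A sketch of one batch-map pass implementing the learning equation above, assuming the W and grid arrays from the training sketch and a Gaussian h_ic in place of a hard neighborhood set (the radius is an illustrative choice):

```python
import numpy as np

def batch_map_step(X, W, grid, sigma=1.0):
    """One batch pass: w_i = sum_j h_ic(j) x_j / sum_j h_ic(j)."""
    # nearest reference vector c(j) for every training sample x_j
    c = np.argmin(np.linalg.norm(X[:, None, :] - W[None, :, :], axis=2), axis=1)
    d2 = np.sum((grid[:, None, :] - grid[None, :, :]) ** 2, axis=2)  # squared grid distances
    H = np.exp(-d2 / (2 * sigma ** 2))      # Gaussian weighting over the topological neighborhood
    Hj = H[:, c]                            # h_ic(j): weight of sample j for unit i
    return (Hj @ X) / Hj.sum(axis=1, keepdims=True)  # weighted average per unit
```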

LVQ-SOM
dw_i/dt = α_t h_ci (x − w_i), if x and w_i belong to the same class
dw_i/dt = −α_t h_ci (x − w_i), if x and w_i belong to different classes
0 < α_t < 1 is the learning rate and decreases monotonically with time
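A sketch of the LVQ-SOM update for a single labelled sample, assuming class labels are stored per map unit alongside the weights; the function and variable names are illustrative, not from the slides:

```python
import numpy as np

def lvq_som_step(x, x_label, W, unit_labels, grid, alpha=0.05, sigma=1.0):
    """Move same-class units towards x and different-class units away from it."""
    c = np.argmin(np.linalg.norm(x - W, axis=1))                 # best matching unit
    h = np.exp(-np.sum((grid - grid[c]) ** 2, axis=1) / (2 * sigma ** 2))
    sign = np.where(unit_labels == x_label, 1.0, -1.0)           # +alpha_t same class, -alpha_t otherwise
    W += (alpha * sign * h)[:, None] * (x - W)
    return W
```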

Orientation maps using SOM. (Figures: orientation map in the visual cortex of a monkey; a combined ocular dominance and orientation map; the stimuli. The input consists of rotationally symmetric stimuli; Brockmann et al., 1997.)

Image analysis. Learned basis vectors from natural images; the sampling vectors consisted of 15 by 15 pixels (Kohonen, 1995).

Place cells developed by SOM. Because SOMs have lateral connections, one gets a spatially ordered set of place fields (PFs), which is biologically unrealistic.

Place cells developed by VQ. In VQ we do not have lateral connections, so one gets no ordering, which is biologically more realistic.

Learning perception-action cycles

Features. (Figure: the input features a and x.)

Input and output signals. Inputs: a and x; output: steering (s). (Figure.)

Learning procedure. (Figure: network with an input layer receiving a and x, an associative layer, and an output layer producing the steering s_a.)

Learning procedure
For training we used 1000 data samples containing input-output pairs: a(t), x(t) -> s(t).
Initialization: we initialize the weights and values of the SOM from our data set: w_a(k) = a(j), w_x(k) = x(j) and s_a(k) = s(j), where k = 1…250 and j denotes the indices of random samples from the data set.
Learning:
1. Select a random sample and present it to the network: X(i) = {a(i), x(i)}
2. Find the best matching unit by c = arg min_k ||X(i) − W(k)||
3. Update the weights W and the values of the associated output s_a by dw_k/dt = α_t h_ci (X(i) − w_k) and ds_a(k)/dt = α_t h_ci (s(i) − s_a(k))
4. Repeat for a finite number of steps
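A sketch of the described procedure with 250 map units; the random placeholder data, the 1-D arrangement of units and the fixed learning rate and radius are illustrative assumptions, not part of the original setup:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical recorded driving data: inputs a(t), x(t) and the human steering s(t).
A, Xpos, S = rng.random(1000), rng.random(1000), rng.random(1000)
data = np.column_stack([A, Xpos])                     # X(i) = {a(i), x(i)}

K = 250                                               # number of map units, as on the slide
idx = rng.choice(len(data), size=K, replace=False)    # initialize from random data samples
W, s_a = data[idx].copy(), S[idx].copy()
grid = np.arange(K, dtype=float)[:, None]             # assume a 1-D chain of units

T, alpha, sigma = 5000, 0.1, 5.0                      # fixed schedules for this sketch
for t in range(T):
    i = rng.integers(len(data))                                    # 1. random sample
    c = np.argmin(np.linalg.norm(data[i] - W, axis=1))             # 2. best matching unit
    h = np.exp(-np.sum((grid - grid[c]) ** 2, axis=1) / (2 * sigma ** 2))
    W += alpha * h[:, None] * (data[i] - W)                        # 3a. update input weights
    s_a += alpha * h * (S[i] - s_a)                                # 3b. update associated steering
```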

Generalization and smoothing. (Figures: the training data a, x with steering s, and the learned weights with the learned steering s_a.)

Learned steering actions. (Figure: steering over time in steps, real (s) versus learnt (s_a).) With 250 neurons we were able to approximate the human steering behavior relatively well.

Autonomous driving robot