1 Unsupervised models and clustering

2 Table of contents: Introduction; The unsupervised learning framework; Self-Organizing Maps (SOMs)

3 Introduction – 1 In order to efficiently mimic the nervous system, it is necessary to get an idea of the nature of the biological processes that actually take place in the brain. The only reasonable assumption is that they are driven by mechanisms aimed at optimizing the target they have to pursue. Each cognitive process evolves from an initial situation, associated with a stimulus, up to a terminal state in which a response is produced, which is the result of the process itself. It is intuitive that, in this evolution, a sort of information transfer takes place.

4 Introduction – 2 In fact, it is just the initial stimulus that provides the information necessary to obtain the desired response. Such information must be transmitted and preserved until the conclusion of the process. A reasonable way of interpreting the processes that take place in the nervous system is therefore to consider them as transfers of information that follow a "maximum conservation" criterion.

5 The unsupervised learning framework – 1 Sometimes, in real-world applications, we do not have sufficient knowledge of the problem to be solved to rely on the supervised learning paradigm: we do not know the solution to the problem, not even sampled on a set of examples, having at our disposal only data to be analyzed and interpreted, without an a priori "guideline" or specific information on them. A classic problem of this class is to highlight some groups of data with common characteristics within a "disordered and miscellaneous" set. This problem can be addressed by using a self-organizing neural network, capable of learning data without a supervisor, which provides solutions at specific points in the weight space.

6 The unsupervised learning framework – 2 Learning without supervision means aggregating "similar" examples in topologically neighboring regions. While in the supervised learning protocol the variation of the synaptic connections is done in an attempt to minimize the error with respect to the information provided by the supervisor, in this case the learning procedure is guided by "similarity" criteria inherently present within the data.

7 The unsupervised learning framework – 3 An unsupervised learning example: the Hebb rule (Hebb, 1949): if two connected neurons are active at the same time, the force (or the weight) of their connection is increased. The connection weight changes by an amount proportional to the product of the activation values of the two neurons with the weight itself, thus tending to "strengthen" the strongest links and to "weaken" the weakest ones. This rule is more realistic, from the biological point of view, than Back-Propagation.
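The Hebb rule above can be sketched in a few lines; this is only an illustrative implementation (the function name and learning rate `eta` are assumptions, not part of the original model):

```python
# Minimal sketch of a Hebbian weight update: the connection weight
# grows in proportion to the product of the two activation values,
# so frequently co-active neurons develop stronger connections.

def hebb_update(weight, pre_activation, post_activation, eta=0.1):
    """Return the updated connection weight under the plain Hebb rule."""
    return weight + eta * pre_activation * post_activation

w = 0.5
# Two co-active neurons: the connection is strengthened.
w = hebb_update(w, pre_activation=1.0, post_activation=1.0)
# If either neuron is inactive, the weight is unchanged.
w = hebb_update(w, pre_activation=0.0, post_activation=1.0)
```

Note that this plain form only ever strengthens connections; the "weakening" of the weakest links mentioned above requires the weight-dependent variant described in the slide.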

8 The unsupervised learning framework – 4 Summing up: given a set of examples, we want to find something interesting. The experience is given by the collected "examples"; the problem is that of "finding something interesting", without any information other than the data itself; the performance depends on how "interesting" the obtained result is.

9 The unsupervised learning framework – 5 (figure)

10 Selforganizing networks  1 In the central nervous system, the ganglion cells, which constitute the output stage of the retina, are organized according to receptive fields, sensitive to particular stimuli In the auditory system cortex, neurons and fibers are anatomically arranged in an orderly manner with respect to the acoustic frequencies Studies on brain maps during the prenatal stage, i.e. in the absence of sensory stimuli, have demonstrated that, in the lower levels, the neural structure is genetically predetermined, while at higher levels the organization matures and emerges as a result of the learning from examples process 10

11 Self-organizing networks – 2 Trivially, it can be observed that when children learn to read, they mark the letters one by one and "think" while reading; adults, however, recognize whole words and sentences at a glance. This shows that there is a kind of cerebral self-organization that develops the brain's power to classify and easily recognize some "common" patterns, which is also confirmed by the difficulty of reading a text upside down or containing attached words.

12 Self-organizing networks – 3 Studied by Grossberg, Von Malsburg, Rumelhart, Zipser and Kohonen since the mid '80s, self-organizing neural networks are particularly indicated in those applications where there is no a priori knowledge on how to process certain information, so that the network is expected to automatically "discover" interesting concepts within the "data population". Both the stimulus and the network response are characterized by a given number of components, describing points in appropriate spaces.

13 Self-organizing networks – 4 The learning process can be interpreted as a geometric transformation from the input to the output space. The output space has a smaller size than the input space, since the stimulus normally contains the information required to activate many simultaneous processes and, compared to a single one, it is redundant. This means that, during self-organization, an operation of redundancy reduction is always present. In both the input and the output space, typical regions are evinced, to which the information is associated.

14 Self-organizing networks – 5 The natural mechanism that controls the information transfer must then identify those regions that are important for the cognitive process, and make sure that they correspond in the transformation. Therefore, in the cognitive process, a data clustering operation is carried out, realized with the acquisition of experience. Biological evidence, based on the functioning of the nervous system, can be attached to the two operations of redundancy reduction and clustering.

15 Self-organizing networks – 6 Mapping from the input space to the space defined by the neurons of the self-organizing network (figure).

16 Self-organizing networks – 7 The main use of self-organizing networks is to perform data analysis, in order to highlight regularities among data or to classify them (they are particularly used for shape classification, in image and signal recognition). Self-organizing neural networks are suitable for the solution of different problems with respect to supervised learning approaches: they can be used for classification tasks, such as handwritten character recognition or the identification of radar signal clusters; they can implement clustering, partitioning data sets into collections of classes, or clusters, containing similar patterns; they can perform data compression.

17 SelfOrganizing Maps  1 Biological motivation The Kohonen network is modeled on the basis of a characteristic behaviour of neurons in laminar nervous tissues: such neurons are activated, under the action of a stimulus, in groups characterized by the presence or the absence of activity, defining as “activity” the emission of a number of pulses (per unit time) that exceeds a certain threshold The spatial demarcation between the two groups is clear, so that we talk about the formation of “activation bubbles”, obtained starting from a situation of undifferentiated low activity of all the neurons in the network 17

18 Self-Organizing Maps – 2 In the case of biological neurons, those that are physically closer to active neurons have strong, excitatory connections, while the peripheral ones have inhibitory connections. In the cerebral cortex, there are projections of sensory stimuli onto specific networks of cortical neurons. The sensorimotor neurons constitute a distorted map of the body surface (the extent of each region is proportional to the sensitivity of the corresponding area, not to its size). However, adjacent parts of the cortex correspond to adjacent parts of the body surface.

19 19 SelfOrganizing Maps  3

20 Self-Organizing Maps – 4 Kohonen modelled this feature by restricting the variation of the weights only to those neurons that are close to a selected neuron. SOM (T. Kohonen, 1981): "sensorial" maps, constituted by a single layer of neurons, where the units specialize themselves to respond to different stimuli, in such a way that: different inputs activate different (far away) units; topologically neighboring units are activated by similar inputs.

21 Self-Organizing Maps – 5 Architecture: a single layer of neurons n_i, i = 1,…, w·h (where w is the width and h the height of the map). Each input X = [x_0, x_1,…, x_{N-1}] is connected with all the neurons; to each connection, a weight w_ij is associated. Activation function: f_i = 1/d(W_i, X), where d is a distance. Lateral, untrainable connections.
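The activation f_i = 1/d(W_i, X) can be sketched as follows; this is only an illustrative fragment (function names and the small epsilon guard against division by zero are assumptions):

```python
import math

# Each neuron i holds a weight vector W_i; its activation is the inverse
# of the Euclidean distance d(W_i, X), so the neuron whose weights are
# closest to the input responds most strongly.

def distance(w, x):
    """Euclidean distance between a weight vector and an input vector."""
    return math.sqrt(sum((wi - xi) ** 2 for wi, xi in zip(w, x)))

def winner(weights, x, eps=1e-12):
    """Index of the neuron with maximum activation f_i = 1 / d(W_i, X)."""
    activations = [1.0 / (distance(w, x) + eps) for w in weights]
    return max(range(len(weights)), key=lambda i: activations[i])

weights = [[0.0, 0.0], [1.0, 1.0], [0.2, 0.1]]
print(winner(weights, [0.25, 0.15]))  # neuron 2 has the closest weights
```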

22 Self-Organizing Maps – 6 The weights of each neuron are modified: in an excitatory sense, in proportion to the value of its activation and to those of the neurons belonging to its neighborhood, and also in proportion to the distance from them; in an inhibitory sense, in proportion to the value of the activation of the neurons outside the neighborhood, and also in proportion to the distance from them. Therefore, when the same input is proposed to the network again: neurons that originally had a high activation value, and their neighbors, will show an even greater activation; neurons that responded little will react even less.

23 If “welldistributed” data are presented to the network, in an iterative fashion, each neuron specializes itself in responding to inputs of a certain type Moreover, neighboring neurons respond to “similar” stimuli projecting, in practice, the input space on the layer of neurons Data dimenionality reduction, from N (input size) to m (map size, usually 23) Each data is represented by the coordinate of the unit on which it is projected, that is the one that has the maximum activation, i.e. the one whose weight is more similar (closer) to the data itself 23 SelfOrganizing Maps  7

24 Self-Organizing Maps – 8 Lateral interaction: the output layer neurons are connected with a "neighborhood" of neurons, according to a system of lateral interaction known as the "Mexican hat". These connection weights are not subject to learning: they are fixed, excitatory near each neuron and inhibitory in its periphery.

25 Self-Organizing Maps – 9 In the output layer, a unique neuron must be the "winner" (showing the maximum activation value) for each pattern presented to the network. The winner neuron identifies the class to which the input belongs. The "Mexican hat" connections tend to favour the formation of bubbles of activation, which identify similar inputs.

26 Self-Organizing Maps – 10 In practice, the input space is partitioned into n subspaces, where n is the number of neurons in the layer; each subspace s_i of S = {X_k} is defined by: s_i = {X_j such that d(X_j, W_i) = min_t d(X_j, W_t)} (a Voronoi tessellation).
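The Voronoi tessellation above can be computed directly from the definition; this is an illustrative sketch (function and variable names are assumptions):

```python
import math

# Each input X_j is assigned to the subspace s_i of the neuron whose
# weight vector W_i is the nearest, exactly as in the definition above.

def voronoi_partition(inputs, weights):
    """Map each neuron index i to the list of inputs falling in s_i."""
    parts = {i: [] for i in range(len(weights))}
    for x in inputs:
        i = min(range(len(weights)),
                key=lambda t: math.dist(weights[t], x))
        parts[i].append(x)
    return parts

weights = [[0.0, 0.0], [1.0, 1.0]]
inputs = [[0.1, 0.2], [0.9, 0.8], [0.4, 0.4]]
parts = voronoi_partition(inputs, weights)
# [0.4, 0.4] is closer to [0.0, 0.0], so it falls in subspace s_0.
```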

27 Self-Organizing Maps – 11 Simplifications of the model to implement the training algorithm: weights are changed only in the neighborhood of the neuron that has the maximum activation (this type of training is also known as competitive learning); we consider only the excitatory lateral interactions within a limited neighborhood of the winner neuron. Note: changing the weights in the excitatory sense means making them closer and closer to the inputs; changing the inhibitory weights causes the opposite behaviour.

28 Self-Organizing Maps – 12 Repeat: for each input x_i belonging to the training set: 1. find the winner neuron n_j; 2. update the weights of the winner neuron and of the neurons belonging to its neighborhood, as: w_j(t+1) = w_j(t) + a·(x_i − w_j(t)) (a = learning rate, a small positive constant < 1); then decrease the learning rate: a(k+1) = a(k)·(1 − g) (g a small positive constant < 1); until the network comes to a stable equilibrium point.
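The training loop above can be sketched as follows for a one-dimensional map; this is a minimal illustration, not the exact original algorithm (the Gaussian neighborhood function, all parameter values, and the function names are assumptions):

```python
import math
import random

# Sketch of the SOM training loop: find the winner, pull the winner and
# its neighbors toward the input, and decay the learning rate as
# a(k+1) = a(k) * (1 - g), as in the update rule above.

def train_som(data, n_neurons=10, epochs=50, alpha=0.5, g=0.05, radius=2.0):
    random.seed(0)  # fixed seed, for reproducibility of this sketch
    dim = len(data[0])
    weights = [[random.random() for _ in range(dim)] for _ in range(n_neurons)]
    for _ in range(epochs):
        for x in data:
            # 1. find the winner neuron (closest weight vector)
            j = min(range(n_neurons), key=lambda i: math.dist(weights[i], x))
            # 2. update the winner and its neighborhood on the 1-D map
            for i in range(n_neurons):
                h = math.exp(-((i - j) ** 2) / (2 * radius ** 2))
                weights[i] = [w + alpha * h * (xk - w)
                              for w, xk in zip(weights[i], x)]
        alpha *= (1 - g)  # learning-rate decay
    return weights

# Two well-separated clusters: after training, some neurons settle near each.
data = [[0.1, 0.1], [0.9, 0.9], [0.15, 0.05], [0.85, 0.95]]
weights = train_som(data)
```

The decaying learning rate, together with the shrinking influence of the neighborhood term, is what lets the map first organize globally and then fine-tune locally.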

29 Self-Organizing Maps – 13 The role of learning is to rotate the synaptic weight vector to align it with the input vector; in this way, the winner neuron becomes even more sensitized to the recognition of a particular input pattern.

30 Self-Organizing Maps – 14 The SOM learning algorithm is very simple (it does not require any derivative calculation): the neuron j*, with the weight vector closest to the input pattern, is selected; that neuron "recalls" the input vector and changes its weights in order to align them with the input. The weight vectors of the neighboring neurons of j* are also updated: the network tries to create regions consisting of a large set of values around the inputs with which it learns; those vectors that are spatially close to the training values will be correctly classified even if the network has never seen them before (generalization capability).

31 Self-Organizing Maps – 15 The purpose of a SOM is to have neighboring winner neurons for similar inputs, so that each activation bubble represents a class, containing all the input data with common characteristics. The neighborhood region can be chosen as a square, a circle or a hexagon around the winner neuron. The neighborhood must be chosen large at the beginning, and should then slowly shrink during learning.

32 Self-Organizing Maps – 16 Summing up: SOMs are able to realize data clustering, i.e. they can identify, in the input space, a set of partitions induced by the similarities/differences among the data. Each partition is represented by a prototype (centroid), defined by the weights of the corresponding neuron. SOMs produce an unsupervised clustering: they do not exploit any a priori information about the classes (number and type). After training, it is possible to label (classify) data, based on the partition of the input space to which they belong.

33 SOMs and Bioinformatics Applications include: gene clustering from microarray data; prediction of microRNA targets (miRNAs are small non-coding RNAs that regulate transcriptional processes by binding the target gene mRNA); human tissue classification; DNA motif identification for extracting binding sites in DNA sequences; prediction of the behavior of synthetic peptides in inhibiting the adhesion of human tumor cells to extracellular proteins; …

34 SOMs and structured data Since additional structural information is often available in possible applications of self-organizing maps (SOMs), a transfer of standard unsupervised learning methods to sequences and more complex graph structures would be valuable. Nevertheless, the SOM constitutes a metric-based approach, and therefore it can be applied directly to structured data if a data comparison is defined and a notion of adaptation within the data space can be found.

