EE141 1 Self-organization and error correction Janusz A. Starzyk


EE141 1 Self-organization and error correction Janusz A. Starzyk. Based on the courses Computational Intelligence and Cognitive Neuroscience and Embodied Intelligence, taught by Prof. Randall O'Reilly, University of Colorado, and Prof. Włodzisław Duch, Uniwersytet Mikołaja Kopernika.

EE141 2 Learning: types 1. How should an ideal learning system look? 2. How does a human being learn? Detectors (neurons) can change their local parameters, but we want to achieve a change in the functioning of the entire information-processing network. We will consider two types of learning, each requiring different mechanisms:  Learning an internal model of the environment (spontaneous).  Learning a task set for the network (supervised).  A combination of both.

EE141 3 Learning operations One output neuron can't learn much. Operation = sensorimotor transformation, perception-action. Stimulation and selection of the correct operation, interpretation, expectations, plan… What types of learning does this allow us to explain? What types of learning require additional mechanisms?

EE141 4 Simulation Select self_org.proj.gz in Chapter 4. 5x5 inputs, 20 hidden neurons, kWTA; the network will learn interesting features.

EE141 5 Simulation Choose Self_org.proj from Ch4. The 5x5 input has either a single horizontal or vertical line (10 samples) or a combination of 2 lines (45 samples). Learning is possible only for the individual lines. Miracle: Hebbian learning + kWTA is sufficient for the network to form correct internal representations.

EE141 6 Simulation 4x5 = 20 hidden neurons, kWTA. After training (30 epochs presenting all line pairs), selective units responding to single lines appear (2 units for 2 lines), giving a combinatorial representation! Initially the responses to inputs are random, but winners quickly appear. Some units (5) remain inactive, and they can be used to learn new inputs. Self-organization, but no topological representation, since neighbors respond to different features. 10 unique representations for single-line inputs – all correct.
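The kWTA (k-winners-take-all) constraint used in this simulation can be illustrated with a minimal sketch. This is a simplified hard-threshold variant (the simulator uses a more elaborate average-based version); the function name and the example values are illustrative assumptions, not the simulator's code.

```python
import numpy as np

def kwta(activations, k):
    """Hard k-winners-take-all: keep the k most active units, zero the rest.
    A simplified stand-in for the average-based kWTA used in the simulator."""
    act = np.asarray(activations, dtype=float)
    out = np.zeros_like(act)
    winners = np.argsort(act)[-k:]   # indices of the k largest activations
    out[winners] = act[winners]
    return out

# Example: 6 hidden units, only the 2 strongest are allowed to stay active.
sparse = kwta([0.1, 0.9, 0.3, 0.7, 0.2, 0.05], k=2)
```

Combined with Hebbian learning, this competition forces different units to specialize on different input lines.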

EE141 7 Sensorimotor maps Self-organization is modeled in many ways; simple models are helpful in explaining qualitative features of topographic maps. Fig. from: P.S. Churchland, T.J. Sejnowski, The Computational Brain. MIT Press, 1992

EE141 8 Motor and somatosensory maps This is a very simplified picture; in reality most neurons are multimodal – neurons in the motor cortex react to sensory, auditory, and visual impulses (mirror neurons) – many specialized circuits of perception-action-naming.

EE141 9 Finger representation: plasticity [Figure: cortical representations of the hand and face, before and after stimulation] Sensory fields in the cortex expand after stimulation – local dominances resulting from activation. Plasticity of cortical areas to sensory-motor representations.

EE141 10 Simplest models SOM or SOFM (Self-Organizing Feature Map) – one of the most popular models. How can topographical maps be created in the brain? Local neural connections create strongly interacting groups; interactions weaken over greater distances and become inhibitory for more distant groups. History: von der Malsburg and Willshaw (1976), competitive learning, Hebbian learning with "Mexican hat" potential, mainly visual system. Amari (1980) – layered models of neural tissue. Kohonen (1981) – simplification without inhibition; only two essential variables: competition and cooperation.

EE141 11 SOM: idea Data: vectors X^T = (X_1, ..., X_d) from a d-dimensional space. A grid of nodes, with a local processor (neuron) in each node. Local processor #j has d adaptive parameters W^(j). Goal: adjust the W^(j) parameters to model the clusters in the data space X.

EE141 12 Training SOM Fritzke's algorithm: Growing Neural Gas (GNG). Demonstrations of competitive GNG learning in Java are available online.

EE141 13 SOM algorithm: competition Nodes should calculate the similarity of the input data to their parameters. The input vector X is compared to the node parameters W. Similar = minimal distance or maximal scalar product. Competition: find the node j = c whose W is most similar to X. Node c is the winner, and it will learn to become even more similar to X; hence this is a “competitive learning” procedure. Brain: those neurons that react to a signal become active and learn.

EE141 14 SOM algorithm: cooperation Cooperation: nodes on the grid close to the winner c should behave similarly. Define the “neighborhood function” h(j, c, t) = h_0(t) exp(−||r_j − r_c||² / σ_c(t)²), where: t – iteration number (or time); r_c – position of the winning node c (in physical space, usually 2D); ||r_j − r_c|| – distance from the winning node, scaled by σ_c(t); h_0(t) – slowly decreasing multiplicative factor. The neighborhood function determines how strongly the parameters of the winning node and the nodes in its neighborhood are changed, making them more similar to the data X.

EE141 15 SOM algorithm: dynamics Adaptation rule: take the winner node c and the nodes in its neighborhood O(c), and change their parameters to make them more similar to the data X: W^(i)(t+1) = W^(i)(t) + h(i, c, t) (X − W^(i)(t)). Randomly select a new sample vector X, and repeat. Decrease h_0(t) slowly until no further changes occur. Result:  W^(i) ≈ the centers of local clusters in the X feature space  Nodes in the neighborhood point to adjacent areas in X space
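The competition–cooperation–adaptation loop of the SOM can be sketched in a few lines. This is a minimal illustration under stated assumptions: a Gaussian neighborhood, exponentially decaying h_0(t) and σ(t), and arbitrary default grid sizes and schedules; the function name and parameter values are not from the original course code.

```python
import numpy as np

def train_som(data, grid_w=10, grid_h=10, epochs=20,
              h0_i=0.5, h0_f=0.05, sigma_i=3.0, sigma_f=0.5, seed=0):
    """Minimal Kohonen SOM: competition (winner search), cooperation
    (Gaussian neighborhood on the grid), and adaptation toward X."""
    rng = np.random.default_rng(seed)
    d = data.shape[1]
    W = rng.random((grid_w * grid_h, d))          # node parameters W^(j)
    # Physical 2-D positions r_j of the grid nodes.
    pos = np.array([(i, j) for i in range(grid_w) for j in range(grid_h)], float)
    t_max = epochs * len(data)
    t = 0
    for _ in range(epochs):
        for x in rng.permutation(data):
            frac = t / t_max
            h0 = h0_i * (h0_f / h0_i) ** frac          # decaying learning step
            sigma = sigma_i * (sigma_f / sigma_i) ** frac   # shrinking neighborhood
            c = int(np.argmin(((W - x) ** 2).sum(axis=1)))  # competition: winner
            dist2 = ((pos - pos[c]) ** 2).sum(axis=1)       # grid distance to winner
            h = h0 * np.exp(-dist2 / (2.0 * sigma ** 2))    # cooperation
            W += h[:, None] * (x - W)                       # adaptation toward X
            t += 1
    return W
```

After training, each row of W approximates the center of a local cluster, and neighboring grid nodes map to adjacent regions of the data space.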

EE141 16 Maps and distortions Initial distortions may slowly disappear or may become frozen... giving the user a completely distorted view of reality.

EE141 17 Demonstrations with the help of GNG Growing Self-Organizing Networks demo. Parameters of the SOM program: t – iterations; ε(t) = ε_i (ε_f/ε_i)^(t/t_max) specifies the learning step; σ(t) = σ_i (σ_f/σ_i)^(t/t_max) specifies the size of the neighborhood. Maps 1x30 show the formation of Peano curves. We can try to reconstruct Penfield's maps.

EE141 18 Mapping kWTA CPCA Hebbian learning finds relationships between inputs and outputs. Example: pat_assoc.proj.gz in Chapter 5, described in section 5.2. Simulations for 3 tasks, from easy to impossible.

EE141 19 Derivative-based Hebbian learning Hebb's rule Δw_kj = ε (x_k − w_kj) y_j will be replaced by derivative-based learning, based on the time-domain correlation of firing between neurons. This can be implemented in many ways:  For signal-normalization purposes, assume that the maximum rate of change between two consecutive time frames is 1.  Represent the derivative of the signal x(t) by dx(t). Assume that the neuron responds to signal changes instead of signal activation.

EE141 20 Derivative-based Hebbian learning Define the product of derivatives pd_kj(t) = dx_k(t) · dy_j(t). Derivative-based weight adjustment is then calculated as follows: feedforward weights are adjusted as Δw_kj = ε (pd_kj(t) − w_kj) |pd_kj(t)| and feedback weights are adjusted as Δw_jk = ε (pd_kj(t) − w_jk) |pd_kj(t)|. This adjustment gives symmetrical feedforward and feedback weights. [Figure: timing diagram of x_k(t), y_j(t), and pd_kj(t)]

EE141 21 Derivative-based Hebbian learning Asymmetrical weights can be obtained by using products of time-shifted derivative values pd_kj(+) = dx_k(t) · dy_j(t+1) and pd_kj(−) = dx_k(t) · dy_j(t−1). Derivative-based weight adjustment is then calculated as follows: feedforward weights are adjusted as Δw_kj = ε (pd_kj(+) − w_kj) |pd_kj(+)| and feedback weights are adjusted as Δw_jk = ε (pd_kj(−) − w_jk) |pd_kj(−)|. [Figure: neurons x_k and y_j with feedforward weight w_kj and feedback weight w_jk]

EE141 22 Derivative-based Hebbian learning Feedforward weights are adjusted as Δw_kj = ε (pd_kj(+) − w_kj) |pd_kj(+)|. [Figure: timing diagram of x_k(t), y_j(t), y_j(t+1), and pd_kj(+)]

EE141 23 Derivative-based Hebbian learning Feedback weights are adjusted as Δw_jk = ε (pd_kj(−) − w_jk) |pd_kj(−)|. [Figure: timing diagram of x_k(t), y_j(t), y_j(t−1), and pd_kj(−)]
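The asymmetric derivative-based updates can be sketched for a single pair of neurons k → j. This is a minimal scalar sketch; the function name and example values are assumptions, and the derivatives are taken as already normalized to |d·| ≤ 1 as the slides require.

```python
def derivative_hebbian_step(dx, dy_prev, dy_next, w_ff, w_fb, eps=0.1):
    """One asymmetric derivative-based Hebbian update for neurons k -> j.
    dx: derivative dx_k(t); dy_prev, dy_next: derivatives dy_j(t-1), dy_j(t+1).
    Feedforward uses pd_kj(+) = dx_k(t)*dy_j(t+1),
    feedback uses pd_kj(-) = dx_k(t)*dy_j(t-1)."""
    pd_plus = dx * dy_next
    pd_minus = dx * dy_prev
    w_ff += eps * (pd_plus - w_ff) * abs(pd_plus)    # Δw_kj
    w_fb += eps * (pd_minus - w_fb) * abs(pd_minus)  # Δw_jk
    return w_ff, w_fb
```

Because pd_kj(+) and pd_kj(−) generally differ, the resulting feedforward and feedback weights become asymmetric, unlike in the symmetric case of slide 20.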

EE141 24 Task learning Unfortunately, Hebbian learning won't suffice to learn arbitrary relationships between inputs and outputs. This can be done by learning based on error correction. Where do the goals come from? From the "teacher," or from confronting the predictions of the internal model.

EE141 25 The Delta rule Idea: weights w_ik should be revised so that they change strongly for large errors and do not change if there is no error, so Δw_ik ~ (t_k − o_k) s_i. The change is also proportional to the size of the activation of input s_i. Phase + is the presentation of the goal; phase − is the result of the network. This is the delta rule.
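The delta rule above can be sketched for one input–output layer pair. A minimal sketch under stated assumptions: linear outputs o = Wᵀs (the text does not fix the output function), and illustrative names and values.

```python
import numpy as np

def delta_rule_step(W, s, target, eps=0.1):
    """One delta-rule update: Δw_ik = eps * (t_k - o_k) * s_i,
    assuming linear outputs o = W^T s for simplicity."""
    o = W.T @ s                        # phase -: the network's own output
    W += eps * np.outer(s, target - o) # push output toward the phase + goal
    return W, o
```

Repeated over many input/target pairs, the update drives o_k toward t_k; when the error t_k − o_k is zero, the weights stop changing, exactly as the slide requires.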

EE141 26 Credit Assignment Credit/blame assignment: Δw_ik = ε (t_k − o_k) s_i. The error is local, for pattern k. If a large error has formed and the output o_k is significantly smaller than expected, then input neurons with large activations will make the error even larger. If the output o_k is significantly larger than expected, then input neurons with large activations will decrease it significantly. E.g., input s_i is the number of calories in different foods, and the output is a moderate body weight; if it is too big then we must decrease the weights of high-calorie foods, if it is too small then we must increase them. Representations created by an error-minimization process are the result of the best assignment of credit to many units, and not of the greatest correlation (as in Hebbian models).

EE141 27 Limiting weights We don't want the weights to grow without limit or to take negative values. This is consistent with biological constraints, which separate inhibitory and excitatory neurons and impose upper weight limits. The weight-change mechanism below, based on the delta rule, ensures that both restrictions are fulfilled: Δw_ik = δ_ik (1 − w_ik) if δ_ik > 0, Δw_ik = δ_ik w_ik if δ_ik < 0, where δ_ik is the weight change resulting from error propagation. This equation limits the weight values to the 0–1 range. The upper limit is biologically justified by the maximum amount of neurotransmitter that can be released and the maximum density of the synapses. [Figure: weight change Δw_ik as a function of the weight w_ik over the 0–1 range]
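The soft weight bounding described above can be sketched directly. A minimal sketch: the function name is an assumption, and δ is the raw weight change computed elsewhere (e.g., by the delta rule).

```python
def bounded_update(w, delta):
    """Soft weight bounding: scale positive changes by (1 - w) and
    negative changes by w, so w always stays within [0, 1]."""
    if delta > 0:
        return w + delta * (1.0 - w)   # approaches 1 but never exceeds it
    return w + delta * w               # approaches 0 but never goes negative
```

As w nears 1, positive changes shrink toward zero; as w nears 0, negative changes shrink toward zero, so the weight asymptotically respects both bounds.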

EE141 28 Task learning We want: Hebbian learning and learning using error correction, hidden units, and biologically justified models. The combination of error correction and correlations can be aligned with what we know about LTP/LTD: Δw_ij = ε ( [x_i y_j]+ − [x_i y_j]− ). Hebbian networks model states of the world, but not perception-action. Error correction can learn a mapping. Unfortunately, the delta rule is only good for output units, not hidden units, because it has to be given a goal. Backpropagation of errors can teach hidden units, but there is no good biological justification for this method…
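The combined rule Δw_ij = ε([x_i y_j]+ − [x_i y_j]−) contrasts correlations from the plus (goal) phase and the minus (expectation) phase. A minimal matrix-form sketch, with illustrative names; the phase activations are assumed to be supplied by the rest of the network.

```python
import numpy as np

def contrastive_update(W, x_plus, y_plus, x_minus, y_minus, eps=0.1):
    """Contrastive Hebbian-style update: strengthen correlations seen in the
    plus (goal) phase, weaken those seen in the minus (expectation) phase:
    Δw_ij = eps * (x_i+ y_j+ - x_i- y_j-)."""
    W += eps * (np.outer(x_plus, y_plus) - np.outer(x_minus, y_minus))
    return W
```

When the two phases agree, the update vanishes; when they differ, the change is an error-correction signal expressed purely through local pre- and post-synaptic activities, which is what makes it compatible with LTP/LTD.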

EE141 29 Simulations Select pat_assoc.proj.gz in Chapt. 5. Description: Chapt. 5.2. The delta rule can learn difficult mappings, at least theoretically...