Neural Networks Part 4
Dan Simon, Cleveland State University

Outline
1. Learning Vector Quantization (LVQ)
2. The Optimal Interpolative Net (OINet)

Learning Vector Quantization (LVQ)
Invented by Teuvo Kohonen in 1981.
Same architecture as the Kohonen Self-Organizing Map, but with supervised learning.
[Figure: single-layer network with inputs x_1, …, x_n, weights w_ik, and outputs y_1, …, y_m]

LVQ Notation:
x = [x_1, …, x_n] = training vector
T(x) = target; the class or category to which x belongs
w_k = weight vector of the k-th output unit = [w_1k, …, w_nk]
a = learning rate

LVQ Algorithm:
Initialize the reference vectors (that is, vectors which represent prototype inputs for each class)
while not (termination criterion)
    for each training vector x
        k_0 = argmin_k ||x – w_k||
        if k_0 = T(x) then
            w_k0 ← w_k0 + a(x – w_k0)
        else
            w_k0 ← w_k0 – a(x – w_k0)
        end if
    end for
end while
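A minimal NumPy sketch of this training loop (my own illustration, not the course's LVQ1.m; the function name and defaults are assumptions):

```python
import numpy as np

def lvq_train(X, targets, W, classes, a=0.1, epochs=20):
    """Train LVQ reference vectors.

    X        : (q, n) array of training vectors
    targets  : length-q array of class labels, targets[i] = T(x_i)
    W        : (m, n) array of initial reference vectors
    classes  : length-m array, classes[k] = class represented by W[k]
    a        : learning rate
    """
    W = W.astype(float).copy()
    for _ in range(epochs):
        for x, t in zip(X, targets):
            k0 = np.argmin(np.linalg.norm(x - W, axis=1))  # closest reference vector
            if classes[k0] == t:
                W[k0] += a * (x - W[k0])   # correct class: move toward x
            else:
                W[k0] -= a * (x - W[k0])   # wrong class: move away from x
    return W
```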

LVQ Example
[Figure: reference vectors w_1, w_2, w_3, training input x, and the difference vector x – w_2]
We have three input classes. Training input x is closest to w_2.
If x ∈ class 2, then w_2 ← w_2 + a(x – w_2); that is, move w_2 towards x.
If x ∉ class 2, then w_2 ← w_2 – a(x – w_2); that is, move w_2 away from x.

LVQ reference vector initialization:
1. Use a random selection of training vectors, one from each class.
2. Use randomly-generated weight vectors.
3. Use a clustering method (e.g., the Kohonen SOM).

LVQ Example: LVQ1.m
Training data:
(1, 1, 0, 0) → Class 1
(0, 0, 0, 1) → Class 2
(0, 0, 1, 1) → Class 2
(1, 0, 0, 0) → Class 1
(0, 1, 1, 0) → Class 2
Final weight vectors:
(1.04, 0.57, -0.04, 0.00)
(0.00, 0.30, 0.62, 0.70)
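Continuing the lvq_train sketch above with the training data from this slide (the initial reference vectors, learning rate, and epoch count are my assumptions, so the result will only approximate the final weights reported here):

```python
X = np.array([[1, 1, 0, 0],
              [0, 0, 0, 1],
              [0, 0, 1, 1],
              [1, 0, 0, 0],
              [0, 1, 1, 0]], dtype=float)
targets = np.array([1, 2, 2, 1, 2])

# One reference vector per class, initialized from one training vector of each class
W0 = np.array([X[0], X[1]])
classes = np.array([1, 2])

W = lvq_train(X, targets, W0, classes, a=0.1, epochs=100)
print(W)  # the reference vectors should end up near the final weights on the slide
```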

LVQ Example: LVQ2.m
Training data from Fausett. Four initial weight vectors are at the corners of the training data.
Final classification results on the training data, and final weight vectors: 14 classification errors after 20 iterations.

LVQ Example: LVQ3.m
Training data from Fausett. The initial weight vectors are randomly chosen, with random classes. In practice it would be better to use our training data to assign the classes of the initial weight vectors.
Final classification results on the training data, and final weight vectors: four classification errors after 600 iterations.

LVQ Extensions:
The graphical illustration of LVQ gives us some ideas for algorithmic modifications.
Always move the correct vector towards x, and move the closest p vectors that are incorrect away from x.
Move incorrect vectors away from x only if they are within a distance threshold.
Popular modifications are called LVQ2, LVQ2.1, and LVQ3 (not to be confused with the names of our Matlab programs).
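A minimal sketch of the first modification above (move the nearest correct vector toward x; push up to p nearby incorrect vectors away), reusing the notation of lvq_train; this is my own illustration, not LVQ2, LVQ2.1, or LVQ3 as Kohonen defined them:

```python
def lvq_update_extended(x, t, W, classes, a=0.1, p=1, d_max=np.inf):
    """One extended LVQ update for a single training vector x with target t.

    Moves the nearest correct reference vector toward x, and moves up to p
    incorrect reference vectors away from x, but only if they lie within
    the distance threshold d_max.
    """
    d = np.linalg.norm(x - W, axis=1)
    correct = np.where(classes == t)[0]
    wrong = np.where(classes != t)[0]

    # Move the nearest correct vector toward x
    if correct.size:
        k_c = correct[np.argmin(d[correct])]
        W[k_c] += a * (x - W[k_c])

    # Move the p nearest incorrect vectors (within d_max) away from x
    wrong_sorted = wrong[np.argsort(d[wrong])]
    for k in wrong_sorted[:p]:
        if d[k] <= d_max:
            W[k] -= a * (x - W[k])
    return W
```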

LVQ Applications to Control Systems:
Most LVQ applications involve classification, but any classification algorithm can be adapted for control.
Switching control: switch between control algorithms based on the system features (input type, system parameters, objectives, failure type, …).
Training rules for a fuzzy controller: if input 1 is A_i and input 2 is B_k, then the output is C_ik; LVQ can be used to classify x and y.
User intent recognition: for example, a brain-machine interface (BMI) can recognize what the user is trying to do.

The optimal interpolative net (OINet) – March 1992
Pattern classification with M classes; if x ∈ class k, then y_i = δ_ki (Kronecker delta).
The network grows during training, but only as large as needed.
[Figure: N inputs x_1, …, x_N; q hidden neurons with input weight vectors v_1, …, v_q; output weights w_ik; M outputs y_1, …, y_M]
v_i = weight vector to the i-th hidden neuron; v_i is N-dimensional.
v_i = prototype; {v_i} ⊂ {x_i}.

Suppose we have q training samples: y(x_i) = y_i for i ∈ {1, …, q}. Exact interpolation then amounts to solving GW = Y for the output weights, where G_ik = ρ(||x_i – x_k||) and Y is the matrix of target outputs.
Note: if x_i = x_k for some i ≠ k, then G is singular.
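A small NumPy sketch of this exact-interpolation step, assuming a Gaussian basis function ρ (the kernel choice and all names here are my own, not from the slides):

```python
import numpy as np

def rho(u, v, sigma=1.0):
    """Assumed Gaussian basis function rho(||u - v||)."""
    return np.exp(-np.linalg.norm(u - v) ** 2 / (2 * sigma ** 2))

def exact_interpolation_weights(X, Y):
    """Solve G W = Y with G_ik = rho(||x_i - x_k||).

    X : (q, N) training inputs, Y : (q, M) one-hot targets.
    Fails (G singular) if two training inputs coincide.
    """
    q = X.shape[0]
    G = np.array([[rho(X[i], X[k]) for k in range(q)] for i in range(q)])
    return np.linalg.solve(G, Y)   # W is (q, M)
```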

The OINet works by selecting a subset of {x_i} to use as input weights. These are the prototype vectors {v_i}, i = 1, …, p.
It also chooses a subset of {x_i} over which to optimize the output weights W. These are the subprototypes {z_i}, i = 1, …, l.
Include x_i in {z_i} only if it is needed to correctly classify x_i.
Include x_i in {v_i} only if G is not ill-conditioned, and only if it decreases the total classification error.
Use l inputs for training and p hidden neurons, with G_ik = ρ(||v_i – z_k||).

Suppose we have trained the network for a certain l and p, and all training inputs considered so far have been correctly classified. We then consider x_i, the next input in the training set. Is x_i correctly classified by the existing OINet? If so, everything is fine, and we move on to the next training input. If not, then we need to add x_i to the subprototype set {z_i} and obtain a new set of output weights W. We also consider adding x_i to the prototype set {v_i}, but only if it does not make G ill-conditioned, and only if it reduces the error by enough.

Suppose we have trained the network for a certain l and p, and x_i, the next input in the training set, is not correctly classified. We need to add x_i to the subprototype set {z_i} and retrain W. This is going to get expensive if we have lots of data, and if we have to perform a new matrix inversion every time we add a subprototype.
Note: Equation numbers from here on refer to those in Sin & deFigueiredo.

We only need scalar division to compute the new inverse, because we already know the old inverse! (Eq. 10)
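The equation itself is not reproduced in this transcript, but one standard identity consistent with the claim is the Sherman-Morrison rank-one update: if adding a subprototype appends a row g to G, then the inverse of G^T G can be obtained from the old inverse with a single scalar division. A NumPy check of that identity (my reconstruction and notation, not necessarily the paper's Eq. 10):

```python
import numpy as np

def rank_one_inverse_update(GtG_inv, g):
    """Given (G^T G)^{-1} and a new row g appended to G, return the new inverse.

    Sherman-Morrison: (A + g g^T)^{-1} = A^{-1} - (A^{-1} g)(A^{-1} g)^T / (1 + g^T A^{-1} g)
    Only a scalar division is needed; no fresh matrix inversion.
    """
    Ag = GtG_inv @ g
    return GtG_inv - np.outer(Ag, Ag) / (1.0 + g @ Ag)

# Quick numerical check against a direct inversion
rng = np.random.default_rng(0)
G = rng.normal(size=(6, 3))
g = rng.normal(size=3)
old_inv = np.linalg.inv(G.T @ G)
new_inv = rank_one_inverse_update(old_inv, g)
G_new = np.vstack([G, g])
assert np.allclose(new_inv, np.linalg.inv(G_new.T @ G_new))
```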

We can implement this equation directly, but we can also find a recursive solution.

We have a recursive equation for the new weight matrix.

Now we have to decide whether we should add x_i to the prototype set {v_i}. We will do this if it does not make G ill-conditioned, and if it reduces the error by enough. I wonder if we can think of something clever to avoid the new matrix inversion …
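One candidate for such a trick, offered as my own sketch rather than the paper's derivation: when a prototype is added, G gains a column, so G^T G grows by one bordering row and column, and its new inverse follows from the old one using only a scalar division by the Schur complement (a small Schur complement signals near-singularity, matching the ill-conditioning test on a later slide). A NumPy check of that identity:

```python
import numpy as np

def bordered_inverse(A_inv, b, c):
    """Inverse of [[A, b], [b^T, c]] given A^{-1}, using only a scalar division.

    The scalar Schur complement delta = c - b^T A^{-1} b must be nonzero;
    a small delta signals that the bordered matrix is ill-conditioned.
    """
    Ab = A_inv @ b
    delta = c - b @ Ab                      # scalar Schur complement
    top_left = A_inv + np.outer(Ab, Ab) / delta
    top_right = -Ab / delta
    return np.block([[top_left, top_right[:, None]],
                     [top_right[None, :], np.array([[1.0 / delta]])]])

# Numerical check: append a column to G and compare with direct inversion
rng = np.random.default_rng(1)
G = rng.normal(size=(8, 3))
g_new_col = rng.normal(size=8)
A = G.T @ G
b = G.T @ g_new_col
c = g_new_col @ g_new_col
M = np.block([[A, b[:, None]], [b[None, :], np.array([[c]])]])
assert np.allclose(bordered_inverse(np.linalg.inv(A), b, c), np.linalg.inv(M))
```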


Homework: Derive this.

We already have everything on the right side, so we can derive the new inverse with only scalar division (no additional matrix inversion).
A small Δ indicates ill-conditioning, so don't use x_i as a prototype if Δ < δ_1 (a threshold).
Even if Δ > δ_1, don't use x_i as a prototype if the error only decreases by a small amount, because it won't be worth the extra network complexity.
Before we check Δe, let's see if we can find a recursive formula for W …


Again, we have a recursive equation for the new weight matrix.

Back to computing Δe (see Eq. 38). Suppose Ax = b, with dim(x) < dim(b), so the system is over-determined.
Least squares: x̂ = (A^T A)^(-1) A^T b, with error e = ||A x̂ – b||^2.
Now suppose that we add another column to A and another element to x. We have more degrees of freedom in x, so the approximation error should decrease.
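A quick NumPy illustration of this claim with random data (purely my own example): appending a column to A can only decrease the least-squares residual.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(10, 3))
b = rng.normal(size=10)
a_new = rng.normal(size=(10, 1))

def ls_error(A, b):
    """Squared residual of the least-squares solution of A x = b."""
    x_hat, *_ = np.linalg.lstsq(A, b, rcond=None)
    return np.sum((A @ x_hat - b) ** 2)

e1 = ls_error(A, b)
e2 = ls_error(np.hstack([A, a_new]), b)
print(e1, e2)   # e2 <= e1: more degrees of freedom, smaller error
```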

Matrix inversion lemma: (A + UCV)^(-1) = A^(-1) - A^(-1) U (C^(-1) + V A^(-1) U)^(-1) V A^(-1). But notice:


We have a very simple formula for the error decrease due to adding a prototype. Don't add the prototype unless Δe / e_1 > δ_2 (a threshold).

The OINet Algorithm
Training data {x_i} → {y_i}, i ∈ {1, …, q}.
Initialize the prototype set: V = {x_1}, so the number of prototypes is p = |V| = 1.
Initialize the subprototype set: Z = {x_1}, so the number of subprototypes is l = |Z| = 1.
[Figure: the initial network has N inputs, one hidden neuron with input weights v_11, …, v_N1, and output weights w_11, …, w_1M]
What are the dimensions of these quantities?

Begin outer loop: loop until all training patterns are correctly classified.
n = q – 1
Re-index {x_2, …, x_q} and {y_2, …, y_q} from 1 to n.
For i = 1 to n (training data loop):
Send x_i through the network. If it is correctly classified, continue the loop.
Otherwise, begin subprototype addition:
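A structural Python sketch of this training loop, consistent with the initialization on the previous slide. It is a sketch only: the Gaussian kernel, the thresholds delta1 and delta2, and the direct least-squares refits (in place of the paper's recursive updates) are my assumptions.

```python
import numpy as np

def rho(u, v, sigma=1.0):
    # Assumed Gaussian basis function; the kernel choice here is my own
    return np.exp(-np.linalg.norm(u - v) ** 2 / (2 * sigma ** 2))

def fit_W(V, Z, T, sigma=1.0):
    # Least-squares output weights: G W ~= T, with G_ki = rho(||z_k - v_i||)
    G = np.array([[rho(z, v, sigma) for v in V] for z in Z])   # l x p
    W, *_ = np.linalg.lstsq(G, T, rcond=None)
    err = np.sum((G @ W - T) ** 2)
    return W, err

def classify(x, V, W, sigma=1.0):
    h = np.array([rho(x, v, sigma) for v in V])                # hidden-layer output
    return np.argmax(h @ W)

def oinet_train(X, Y, sigma=1.0, delta1=1e-8, delta2=1e-3, max_sweeps=100):
    """Grow subprototypes and prototypes until all patterns are classified.

    X : (q, N) inputs, Y : (q, M) one-hot targets.
    Direct least-squares refits replace the recursive updates of the paper.
    """
    V = [X[0]]                          # prototypes (hidden-layer weight vectors)
    Z = [X[0]]                          # subprototypes (training equations kept)
    T = [Y[0]]                          # targets for the subprototypes
    W, err = fit_W(V, Z, np.array(T), sigma)
    for _ in range(max_sweeps):         # outer loop (capped for safety)
        changed = False
        for x, y in zip(X[1:], Y[1:]):  # training data loop
            if classify(x, V, W, sigma) == np.argmax(y):
                continue                # already correct: move on
            changed = True
            Z.append(x); T.append(y)    # add x as a subprototype, refit W
            W, err = fit_W(V, Z, np.array(T), sigma)
            # Consider x as a prototype: check conditioning and error decrease
            G_try = np.array([[rho(z, v, sigma) for v in V + [x]] for z in Z])
            if np.linalg.cond(G_try.T @ G_try) < 1.0 / delta1:
                W_try, err_try = fit_W(V + [x], Z, np.array(T), sigma)
                if err > 0 and (err - err_try) / err > delta2:
                    V.append(x)
                    W, err = W_try, err_try
        if not changed:
            break
    return V, W
```

This version trades the recursive inverse updates derived above for clarity; every change to Z or V simply refits W from scratch.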


Homework: Implement the OINet using FPGA technology for classifying subatomic particles using experimental data. You may need to build your own particle accelerator to collect data. Be careful not to create any black holes.
Find the typo in Sin and deFigueiredo's original OINet paper.

References
L. Fausett, Fundamentals of Neural Networks, Prentice Hall, 1994.
S. Sin and R. deFigueiredo, "An evolution-oriented learning algorithm for the optimal interpolative net," IEEE Transactions on Neural Networks, vol. 3, no. 2, pp. 315–323, March 1992.