1 Neural Networks, Part 3. Dan Simon, Cleveland State University

2 Outline
1. Sugeno RBF Neurofuzzy Networks
2. Cardiology Application
3. Hopfield Networks
4. Kohonen Self-Organizing Maps
5. Adaptive Neuro-Fuzzy Inference Systems (ANFIS)

3 Sugeno RBF
Sugeno fuzzy system with p fuzzy rules and a scalar output. The defuzzified output (centroid defuzzification), with the summation taken over all p fuzzy rules, is
$z(x) = \frac{\sum_{i=1}^{p} w_i z_i(x)}{\sum_{i=1}^{p} w_i}$
where $w_i$ is the firing strength of the i-th rule (Chapter 4). Suppose we use product inference. Then, for a two-input system,
$w_i = \mu_{i1}(x_1)\,\mu_{i2}(x_2)$

4 Suppose the outputs are singletons (zero-order Sugeno system). Then $z_i(x) = z_i$ and
$z(x) = \frac{\sum_{i=1}^{p} w_i z_i}{\sum_{i=1}^{p} w_i}$

5 Suppose the input MFs are Gaussian. Then
$w_i = \prod_{j} \exp\!\left(-\frac{(x_j - c_{ij})^2}{2\sigma_{ij}^2}\right)$
Recall the RBF network: $y = \sum_i w_i \,\phi(x, c_i) = \sum_i w_i \,\phi(\|x - c_i\|)$, where $\phi(\cdot)$ is a basis function and $\{c_i\}$ are the RBF centers.

6 [Network diagram: inputs x_1, …, x_m feed hidden units μ_1(x), …, μ_p(x), whose outputs are combined with weights w_1, …, w_p to produce the output y.]
We started with a Sugeno fuzzy system and ended up with an RBF network that has input-dependent weights w_i. This is a neuro-fuzzy network.

7 Adjustable parameters: c_ik and σ_ik (p × m each) and z_i (p of them), for a total of p(2m + 1), where m = input dimension and p = number of middle-layer neurons (fuzzy rules). Train with gradient descent or BBO.
Chen and Linkens example: y = x_2 sin(x_1) + x_1 cos(x_2). NeuroFuzzy.zip / BBO.m, p = 4.
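As a concrete illustration, here is a minimal Python sketch of the zero-order Sugeno RBF forward pass, assuming Gaussian membership functions, product inference, and centroid defuzzification as on the preceding slides. The function and variable names (sugeno_rbf, c, sigma, z) are illustrative only; this is not the code in NeuroFuzzy.zip / BBO.m.

```python
import numpy as np

def sugeno_rbf(x, c, sigma, z):
    """Zero-order Sugeno RBF network output for one input vector x.

    x     : (m,)   input vector
    c     : (p, m) Gaussian MF centers, one row per rule
    sigma : (p, m) Gaussian MF spreads
    z     : (p,)   singleton rule outputs
    Total adjustable parameters: p*(2m + 1), as on the slide.
    """
    # Product inference: each rule's firing strength is the product of
    # its Gaussian membership grades over the m inputs.
    w = np.exp(-((x - c) ** 2) / (2.0 * sigma ** 2)).prod(axis=1)
    # Centroid defuzzification: weighted average of the singleton outputs.
    return np.dot(w, z) / np.sum(w)

# Toy usage on the Chen and Linkens target y = x2*sin(x1) + x1*cos(x2),
# with p = 4 randomly placed rules (untrained, so the fit is poor until
# the parameters are optimized by gradient descent or BBO).
rng = np.random.default_rng(0)
p, m = 4, 2
c, sigma, z = rng.uniform(-3, 3, (p, m)), np.ones((p, m)), rng.normal(size=p)
x = np.array([1.0, 2.0])
print(sugeno_rbf(x, c, sigma, z))
```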

8 [Figure: target surface vs. neurofuzzy approximation.] 6,000 BBO generations, RMS error = 0.6. We can also use gradient descent training.

9 Neurofuzzy Diagnosis of Heart Disease
Cardiovascular disease is the leading cause of death in the western world: over 800,000 deaths per year in the United States, and one in five Americans has cardiovascular disease.
Cardiomyopathy: weakening of the heart muscle. It can be inherited or acquired (unknown cause).
Biochemical considerations indicate that cardiomyopathy will affect the P wave of an ECG.

10 Neurofuzzy Diagnosis of Heart Disease
Cardiologists tell us that the primary indicators include: P wave duration, P wave amplitude, P wave energy, and P wave inflection point. This gives us a neurofuzzy system with four inputs.

11 Neurofuzzy Diagnosis of Heart Disease
ECG data collection: data collected for 24 hours; average P wave data (duration, inflection, energy, amplitude) calculated each minute; 37 cardiomyopathy patients and 18 control patients.

12 Neurofuzzy Diagnosis of Heart Disease
[Figure: normalized P wave features with 1-σ bars.] The data is complex due to its time-varying nature.

13 Neurofuzzy Diagnosis of Heart Disease
BBO training error and correct classification rate (CCR), in percent, as a function of the number of middle-layer neurons p. What about statistical significance?

p | Training Error (best / mean) | Training CCR % (best / mean) | Testing CCR % (best / mean)
2 | 0.85 / 0.88 | 76 / 72 | 66 / 58
3 | 0.77 / 0.84 | 82 / 77 | 75 / 62
4 | 0.78 / 0.83 | 84 / 77 | 65 / 55
5 | 0.78 / 0.83 | 82 / 76 | 63 / 58

14 Neurofuzzy Diagnosis of Heart Disease
Training error and correct classification rate (CCR), in percent, for different mutation rates using BBO (p = 3).

Mutation rate (%) | Training Error (best / mean) | Training CCR % (best / mean) | Testing CCR % (best / mean)
0.1 | 0.79 / 0.85 | 81 / 76 | 71 / 61
0.2 | 0.82 / 0.86 | 80 / 75 | 72 / 59
0.5 | 0.77 / 0.85 | 82 / 76 | 69 / 62
1.0 | 0.80 / 0.85 | 80 / 74 | 67 / 57
2.0 | 0.83 / 0.86 | 79 / 74 | 69 / 62
5.0 | 0.82 / 0.87 | 81 / 74 | 68 / 58
10.0 | 0.80 / 0.87 | 78 / 73 | 65 / 59

15 Neurofuzzy Diagnosis of Heart Disease
[Figure: typical BBO training and test results.]

16 Neurofuzzy Diagnosis of Heart Disease
[Figure: percent correct vs. patient number.] Success varies from one patient to the next. Does demographic information need to be included in the classifier?

17 The Discrete Hopfield Net
John Hopfield, molecular biologist, 1982, Proceedings of the National Academy of Sciences.
Autoassociative network: recall a stored pattern similar to the input pattern.
Number of neurons = pattern dimension.
Fully connected network, except w_ii = 0.
Symmetric connections: w_ik = w_ki.
Stability proof.

18 The Discrete Hopfield Net
The neuron signals comprise an output pattern. The neuron signals are initially set equal to some input pattern, and the network converges to the nearest stored pattern.
Example: store [1, 0, 1], [1, 1, 0], and [0, 0, 1]. Input [0.9, 0.4, 0.6]; the network converges to [1, 0, 1].

19 Store P binary patterns, each with n dimensions: s(p) = [s_1(p), …, s_n(p)], p = 1, …, P, using the outer-product (Hebbian) rule
$w_{ik} = \sum_{p=1}^{P} (2 s_i(p) - 1)(2 s_k(p) - 1)$ for $i \neq k$, with $w_{ii} = 0$.
Suppose the neuron signals are given by y = [s_1(q), …, s_n(q)]. When these signals are updated by the network, the net input to neuron i becomes
$\sum_k w_{ik} s_k(q) = \sum_{p=1}^{P} (2 s_i(p) - 1) \left[ \sum_{k \neq i} s_k(q)\,(2 s_k(p) - 1) \right]$

20 Recall s_i ∈ {0, 1}. Therefore the average value of the term in brackets is 0, unless q = p, in which case the average value is n/2. Therefore we adjust the neuron signals, one neuron at a time, as
$y_i = \begin{cases} 1 & \text{if } \sum_k w_{ik} y_k > \theta_i \\ y_i & \text{if } \sum_k w_{ik} y_k = \theta_i \\ 0 & \text{if } \sum_k w_{ik} y_k < \theta_i \end{cases}$
where θ_i = threshold. This results in s(p) being a stable network pattern. (We still have not proven convergence.)

21 Binary Hopfield Net Example: two patterns, p = 1 and p = 2, so P = 2.
s(1) = [1, 0, 1, 0], s(2) = [1, 1, 1, 1]

22 Input y = [1, 0, 1, 1], which is close to s(2) = [1, 1, 1, 1]. Threshold θ_i = 1. The network converges to s(2).
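Here is a minimal sketch of this two-pattern example, assuming the standard bipolar outer-product (Hebbian) storage rule from the previous slides and asynchronous updates with threshold θ_i = 1. The helper names are illustrative; this is not the original course code.

```python
import numpy as np

def store(patterns):
    """Hebbian storage: W = sum over patterns of outer products of the
    bipolar versions (2s - 1), with zero diagonal (w_ii = 0)."""
    n = len(patterns[0])
    W = np.zeros((n, n))
    for s in patterns:
        b = 2 * np.array(s) - 1          # map {0,1} -> {-1,+1}
        W += np.outer(b, b)
    np.fill_diagonal(W, 0)               # no self-connections
    return W

def recall(W, y, theta=1.0, sweeps=10):
    """Asynchronous updates, one neuron at a time; y_i is left unchanged
    when the net input equals the threshold."""
    y = np.array(y, dtype=float)
    for _ in range(sweeps):
        for i in range(len(y)):
            net = W[i] @ y
            if net > theta:
                y[i] = 1.0
            elif net < theta:
                y[i] = 0.0
    return y

W = store([[1, 0, 1, 0], [1, 1, 1, 1]])
print(recall(W, [1, 0, 1, 1]))            # converges to s(2) = [1 1 1 1]
```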

23 Recall s(1) = [1, 0, 1, 0], s(2) = [1, 1, 1, 1]. Is s(1) stable? Is s(2) stable? Are any other patterns stable?
Storage capacity: P ≈ 0.15 n (experimental); P ≈ n / (2 log_2 n).

24 Hopfield Net Stability: Consider the "energy" function
$E = -\frac{1}{2} \sum_{i} \sum_{k \neq i} w_{ik} y_i y_k + \sum_i \theta_i y_i$
Is E bounded? How does E change when y_i changes?

25 Recall our activation function. If y_i = 1, it will decrease to 0 if Σ_k w_ik y_k < θ_i; this gives a negative change in E (see previous page). If y_i = 0, it will increase to 1 if Σ_k w_ik y_k > θ_i; this also gives a negative change in E (see previous page). So we have a bounded E that never increases: E is a Lyapunov function.
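The following is a quick numerical check of this Lyapunov argument, using the energy function reconstructed above and the weight matrix produced by the two-pattern example; it assumes the same asynchronous update rule and prints E after every single-neuron update, showing that E never increases.

```python
import numpy as np

def energy(W, y, theta):
    # E = -1/2 * sum_{i != k} w_ik y_i y_k + sum_i theta_i y_i
    # (the diagonal of W is zero, so the full quadratic form is fine)
    return -0.5 * y @ W @ y + theta @ y

# Weight matrix for the stored patterns [1,0,1,0] and [1,1,1,1]
W = np.array([[0., 0., 2., 0.],
              [0., 0., 0., 2.],
              [2., 0., 0., 0.],
              [0., 2., 0., 0.]])
theta = np.ones(4)
y = np.array([1., 0., 1., 1.])
print(energy(W, y, theta))
for i in range(4):                        # one asynchronous sweep
    net = W[i] @ y
    if net > theta[i]:
        y[i] = 1.0
    elif net < theta[i]:
        y[i] = 0.0
    print(energy(W, y, theta))            # non-increasing at every step
```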

26 Control Applications of Hopfield Nets
If we have a set of optimal control trajectories, and noise drives us away from the optimal trajectory, the Hopfield net can find the closest optimal trajectory.
Transform a linear-quadratic optimal control performance index into the form of the Hopfield network energy function, then use the Hopfield network dynamics to minimize the energy.

27 Kohonen Self-Organizing Map
Clustering; associative memory: given a set of input vectors {x}, find a mapping from the input vectors onto a grid of models {m} (cluster centers). Nearby models are similar. Useful for visualizing vector distributions.
Teuvo Kohonen, engineer, Finland, 1982. Unsupervised learning.

28 All the weights from the n input dimensions to a given point in output space correspond to a cluster point.
Note: the inputs are not multiplied by the weights, unless the inputs are normalized; in that case max_k [x · w(k)] gives the cluster point that is closest to x, because the dot product of normalized x and w(k) is the cosine of the angle between them.

29 Kohonen SOM Learning Algorithm
We are given a set of input vectors {x}, each of dimension n. Choose the maximum number of clusters m. Random weight initialization {w_ik}, i ∈ [1, n], k ∈ [1, m]; note that w_ik is the weight from x_i to cluster unit k.
Iterate over each input training sample x:
Find the winning unit k such that D(k) ≤ D(k') for all k', where D(k) = Σ_i (x_i − w_ik)² is the squared distance from x to w_k.
Update the winner. Scalar form: w_ik ← w_ik + α(x_i − w_ik) for i ∈ [1, n]. n-dimensional vector form: w_k ← w_k + α(x − w_k).
α = some function that decreases with the distance between x and w_k, and decreases with time (number of iterations). This update equation moves the w_k vector closer to x.

30 Kohonen SOM Example
Cluster [1, 1, 0, 0]; [0, 0, 0, 1]; [1, 0, 0, 0]; [0, 0, 1, 1]. Maximum number of clusters m = 2.
α(t) = (0.6)(0.95)^t, where t = iteration number (coarse clustering to start, fine-tuning later).
Random initialization: w_1 = [0.2, 0.6, 0.5, 0.9]^T, w_2 = [0.8, 0.4, 0.7, 0.3]^T.
First vector: D(1) = 1.86, D(2) = 0.98, so w_2 ← w_2 + 0.6(x − w_2) = [0.92, 0.76, 0.28, 0.12]^T.
Second vector: D(1) = 0.66, D(2) = 2.28, so w_1 ← w_1 + 0.6(x − w_1) = [0.08, 0.24, 0.20, 0.96]^T.

31 Third vector: D(1) = 1.87, D(2) = 0.68, so w_2 ← w_2 + 0.6(x − w_2) = [0.97, 0.30, 0.11, 0.05]^T.
Fourth vector: D(1) = 0.71, D(2) = 2.72, so w_1 ← w_1 + 0.6(x − w_1) = [0.03, 0.10, 0.68, 0.98]^T.
This is the end of the first iteration (epoch). Adjust α for the next iteration. Each cluster point (weight column) converges to about the average of the two sample inputs that are closest to it. Kohonen.m
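A minimal Python sketch of this worked example follows (the slides reference Kohonen.m, which this does not reproduce). It assumes D(k) is the squared Euclidean distance between x and weight column w_k, a single winning unit with no neighborhood updates, and α decaying as (0.6)(0.95)^t per epoch.

```python
import numpy as np

# Four training vectors (rows) and m = 2 cluster units.
X = np.array([[1, 1, 0, 0],
              [0, 0, 0, 1],
              [1, 0, 0, 0],
              [0, 0, 1, 1]], dtype=float)
# Initial weight columns w_1 and w_2, as recovered from the slide's
# first-epoch updates.
W = np.array([[0.2, 0.8],
              [0.6, 0.4],
              [0.5, 0.7],
              [0.9, 0.3]])

alpha = 0.6
for t in range(100):                              # epochs
    for x in X:
        D = ((W - x[:, None]) ** 2).sum(axis=0)   # squared distances
        k = np.argmin(D)                          # winning cluster unit
        W[:, k] += alpha * (x - W[:, k])          # move winner toward x
    alpha *= 0.95                                 # coarse first, fine-tune later
print(W.round(2))
# Each column converges to roughly the average of the two inputs nearest it.
```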

32 Control Applications of Kohonen Networks
Fault accommodation: suppose we have a family of controllers, one controller for each fault condition. When a fault occurs, classify it into the correct fault class to choose the controller. This idea can also apply to operating modes, reference input types, user intent, etc.
Missing sensor data: the Kohonen network can fill in the most likely values of missing sensor data.

33 Adaptive Neuro-Fuzzy Inference Systems (ANFIS)
Originally called adaptive network-based fuzzy inference systems. Roger Jang, 1993 (Zadeh's student).

34 Figure 12.1(b) in Jang's book: two-input, single-output ANFIS.
Layer 1: fuzzy system; outputs = membership grades
Layer 2: product
Layer 3: normalization
Layer 4: Sugeno fuzzy system
Layer 5: sum

35 Layer 1 outputs: μ_A1(x), μ_A2(x), μ_B1(y), μ_B2(y).
Layer 2 outputs: w_1 = μ_A1(x) μ_B1(y), w_2 = μ_A2(x) μ_B2(y) (or any other T-norm).
Layer 3 outputs: $\bar{w}_i = w_i / (w_1 + w_2)$, i = 1, 2.
Layer 4 outputs: $\bar{w}_i f_i = \bar{w}_i (p_i x + q_i y + r_i)$, i = 1, 2.
Layer 5 output: $f = \bar{w}_1 f_1 + \bar{w}_2 f_2$
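Here is a sketch of the five-layer forward pass just described, for the two-input, two-rule ANFIS of Figure 12.1(b). Gaussian membership functions and the specific premise/consequent parameter values are assumptions chosen for illustration, not values from the course materials.

```python
import numpy as np

def gauss(v, c, s):
    """Layer 1: Gaussian membership grade (assumed MF shape)."""
    return np.exp(-((v - c) ** 2) / (2 * s ** 2))

def anfis_forward(x, y, prem, cons):
    """prem: ((cA1,sA1),(cA2,sA2),(cB1,sB1),(cB2,sB2)); cons: [(p,q,r)] per rule."""
    (cA1, sA1), (cA2, sA2), (cB1, sB1), (cB2, sB2) = prem
    # Layer 1: membership grades
    muA = [gauss(x, cA1, sA1), gauss(x, cA2, sA2)]
    muB = [gauss(y, cB1, sB1), gauss(y, cB2, sB2)]
    # Layer 2: product T-norm -> rule firing strengths
    w = np.array([muA[0] * muB[0], muA[1] * muB[1]])
    # Layer 3: normalization
    wbar = w / w.sum()
    # Layer 4: first-order Sugeno consequents, scaled by normalized strengths
    f = np.array([p * x + q * y + r for (p, q, r) in cons])
    # Layer 5: sum
    return np.dot(wbar, f)

prem = ((0.0, 1.0), (2.0, 1.0), (0.0, 1.0), (2.0, 1.0))   # assumed MF parameters
cons = [(1.0, 1.0, 0.0), (0.5, -1.0, 2.0)]                # assumed consequents
print(anfis_forward(1.0, 1.5, prem, cons))
```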

36 So ANFIS is a Sugeno fuzzy system with a neural network architecture. It can be trained with neural network methods (e.g., backpropagation).
Consequent parameters = p_i, q_i, and r_i. The output is linear with respect to these parameters, so we can optimize the consequent parameters using least-squares. This is called the forward pass.
A first-order Sugeno system with n inputs and m fuzzy Sugeno partitions per input has (n + 1)m^n linear parameters (3m^n for the two-input case above).
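The following is a sketch of this forward-pass least-squares step: with the premise parameters held fixed, each training sample yields normalized firing strengths, and the ANFIS output is linear in (p_i, q_i, r_i), so all consequent parameters can be found with one linear least-squares solve. The toy data, membership parameters, and helper code are assumptions for illustration; they do not reproduce the MATLAB ANFIS implementation.

```python
import numpy as np

# Training data: 50 samples of (x, y) -> target t (toy data for illustration).
rng = np.random.default_rng(1)
XY = rng.uniform(0, 2, (50, 2))
t = np.sin(XY[:, 0]) + XY[:, 1]

# Fixed premise parameters: Gaussian MFs A1, A2 on x and B1, B2 on y.
cA, sA = np.array([0.0, 2.0]), np.array([1.0, 1.0])
cB, sB = np.array([0.0, 2.0]), np.array([1.0, 1.0])

rows = []
for x, y in XY:
    muA = np.exp(-((x - cA) ** 2) / (2 * sA ** 2))
    muB = np.exp(-((y - cB) ** 2) / (2 * sB ** 2))
    w = np.array([muA[0] * muB[0], muA[1] * muB[1]])   # rule firing strengths
    wbar = w / w.sum()                                 # layer-3 normalization
    # Output = wbar1*(p1 x + q1 y + r1) + wbar2*(p2 x + q2 y + r2):
    # linear in the 6 consequent parameters.
    rows.append([wbar[0]*x, wbar[0]*y, wbar[0], wbar[1]*x, wbar[1]*y, wbar[1]])

A = np.array(rows)
theta, *_ = np.linalg.lstsq(A, t, rcond=None)          # [p1 q1 r1 p2 q2 r2]
print(theta)
```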

37 Premise parameters = parameters of the fuzzy sets A_1, A_2, B_1, B_2, etc. The ANFIS output is nonlinear with respect to these parameters, so gradient descent is used to optimize the output with respect to them. This is called the backward pass.
A premise fuzzy system with n inputs, q fuzzy partitions per input, and k parameters per MF has kqn nonlinear parameters.

38 References
M. Chen and D. Linkens, "A systematic neuro-fuzzy modelling framework with application to material property prediction," IEEE Transactions on Systems, Man, and Cybernetics, Part B, vol. 31, no. 5, pp. 781-790, 2001
M. Ovreiu and D. Simon, "Biogeography-based optimization of neuro-fuzzy system parameters for diagnosis of cardiac disease," Genetic and Evolutionary Computation Conference, Portland, Oregon, pp. 1235-1242, July 2010
J. Hopfield, "Neural networks and physical systems with emergent collective computational abilities," Proceedings of the National Academy of Sciences, vol. 79, pp. 2554-2558, 1982
P. Simpson, Artificial Neural Systems, Pergamon Press, 1990
L. Fausett, Fundamentals of Neural Networks, Prentice Hall, 1994
www.scholarpedia.org/article/Kohonen_network
J.-S. Jang, C.-T. Sun, and E. Mizutani, Neuro-Fuzzy and Soft Computing, Prentice Hall, 1997

