Artificial Neural Networks Brian Talecki CSC 8520 Villanova University

ANN - Artificial Neural Network A set of algebraic equations and functions which determine the best output for a given set of inputs. An artificial neural network is modeled on a very simplified version of the human neuron, the cell that makes up the human nervous system. Although the brain operates at about one millionth the speed of modern computers, it can outperform them on many tasks because of the parallel processing structure of the nervous system.

Human Nerve Cell (picture from G5AIAI Introduction to AI by Graham Kendall)

At the synapse, the nerve cell releases chemical compounds called neurotransmitters, which excite or inhibit a chemical/electrical discharge in the neighboring nerve cells. The summation of the responses of the adjacent neurons elicits the appropriate response in the neuron.

Brief History of ANN
McCulloch and Pitts (1943) designed the first neural network.
Hebb (1949) developed the first learning rule: if two neurons are active at the same time, the strength of the connection between them should be increased.
Rosenblatt (1958) introduced the concept of a perceptron, which performed pattern recognition.
Widrow and Hoff (1960) introduced the ADALINE (ADAptive Linear Element). Its training rule was based on the Least-Mean-Squares learning rule, which minimizes the error between the computed output and the desired output.
Minsky and Papert (1969) showed that the perceptron can only recognize classes that are separated by linear boundaries, ushering in the "neural net winter."
Kohonen and Anderson independently developed neural networks that acted like memories.
Werbos (1974) developed the concept of back propagation of an error to train the weights of a neural network.
McClelland and Rumelhart (1986) published the paper on the back propagation algorithm: the "rebirth of neural networks."
Today they are used everywhere a decision can be made.
Source: G5AIAI - Introduction to Artificial Intelligence, Graham Kendall

Basic Neural Network
Inputs – normally a vector p of measured parameters
Bias b – may or may not be added
f() – transfer or activation function
Output = f( Wᵀp + b )
[Diagram: the inputs are weighted by W, summed together with the bias b, and passed through f() to produce the output.]
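As a minimal sketch (not from the slides), this formula can be evaluated directly; the weight vector, input vector, bias, and choice of activation below are made-up illustration values.

```python
import numpy as np

def neuron_output(w, p, b, f):
    """Single neuron: apply the activation f to the weighted sum w'p + b."""
    return f(np.dot(w, p) + b)

# Made-up illustration values (not from the slides).
w = np.array([0.5, -1.2, 0.3])   # weight vector W
p = np.array([1.0, 2.0, 0.5])    # input vector p
b = 0.1                          # bias
print(neuron_output(w, p, b, np.tanh))  # any activation function can be passed in
```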

Activation Functions Source: Supervised Neural Network Introduction CISC 873. Data Mining Yabin Meng

Log Sigmoidal Function Source: Artificial Neural Networks Colin P. Fahey

Hard Limit Function
[Plot: y = 0 for x < 0 and y = 1.0 for x >= 0]

Log Sigmoid and Derivative Source : The Scientist and Engineer’s Guide to Digital Signal Processing by Steven Smith

Derivative of the Log Sigmoidal Function
s(x) = (1 + e^(-x))^(-1)
s'(x) = -(1 + e^(-x))^(-2) · (-e^(-x))
      = e^(-x) / (1 + e^(-x))^2
      = ( e^(-x) / (1 + e^(-x)) ) · ( 1 / (1 + e^(-x)) )
      = ( (1 + e^(-x) - 1) / (1 + e^(-x)) ) · ( 1 / (1 + e^(-x)) )
      = ( 1 - 1/(1 + e^(-x)) ) · ( 1 / (1 + e^(-x)) )
s'(x) = (1 - s(x)) · s(x)
The derivative is important for the back error propagation algorithm used to train multilayer neural networks.
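A quick numerical check of this identity, added here as a sketch: the closed form s(x)(1 - s(x)) should agree with a finite-difference estimate of the derivative.

```python
import math

def logsig(x):
    return 1.0 / (1.0 + math.exp(-x))

def logsig_deriv(x):
    s = logsig(x)
    return s * (1.0 - s)          # s'(x) = s(x)(1 - s(x))

h = 1e-6
for x in (-2.0, 0.0, 1.5):
    numeric = (logsig(x + h) - logsig(x - h)) / (2 * h)   # finite-difference estimate
    print(x, logsig_deriv(x), numeric)                    # the two values agree
```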

Example: Single Neuron
Given: W = 1.3, p = 2.0, b = 3.0
Wp + b = 1.3(2.0) + 3.0 = 5.6
Linear: f(5.6) = 5.6
Hard limit: f(5.6) = 1.0
Log sigmoidal: f(5.6) = 1/(1 + exp(-5.6)) = 1/(1.0037) ≈ 0.9963
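The same worked example in code (a sketch reproducing the numbers above):

```python
import math

W, p, b = 1.3, 2.0, 3.0
n = W * p + b                          # 1.3*2.0 + 3.0 = 5.6

linear  = n                            # 5.6
hardlim = 1.0 if n >= 0 else 0.0       # 1.0
logsig  = 1.0 / (1.0 + math.exp(-n))   # approximately 0.9963
print(linear, hardlim, logsig)
```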

Simple Neural Network
One neuron with a linear activation function => a straight line.
Recall the equation of a straight line: y = mx + b, where m is the slope (the weight) and b is the y-intercept (the bias).
[Diagram: in the (p1, p2) plane the line is the decision boundary; points with mp1 + b >= p2 fall on one side and points with mp1 + b < p2 on the other, labeled "Good" and "Bad".]
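A minimal sketch of using the line as a decision boundary; the slope and intercept here are arbitrary illustration values, and which side counts as the favorable class depends on the problem.

```python
def on_or_below(p1, p2, m=1.0, b=0.5):
    """True when m*p1 + b >= p2, i.e. the point (p1, p2) lies on or below the line p2 = m*p1 + b."""
    return m * p1 + b >= p2

print(on_or_below(1.0, 0.8))   # True:  1.0*1.0 + 0.5 >= 0.8
print(on_or_below(0.2, 2.0))   # False: 0.2 + 0.5 < 2.0
```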

Perceptron Learning
Extend our simple perceptron to two inputs, p1 and p2, and a hard limit activation function.
o = f( Wᵀp + b )
W is the weight matrix, p is the input vector, and o is our scalar output.
[Diagram: p1 and p2 are weighted by W1 and W2, summed with the bias, and passed through the hard limit function f() to give the output o.]

Rules of Matrix Math
Addition and subtraction (element-wise), multiplication by a scalar, the transpose, and matrix multiplication.
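These operations map directly onto NumPy; the matrices below are illustration values.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

print(A + B)    # element-wise addition
print(A - B)    # element-wise subtraction
print(3 * A)    # multiplication by a scalar
print(A.T)      # transpose
print(A @ B)    # matrix multiplication (rows of A times columns of B)
```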

Data Points for the AND Function
Truth table:
p1  p2 | o
 0   0 | 0
 0   1 | 0
 1   0 | 0
 1   1 | 1
Input vectors and targets: q1 = (0, 0), o1 = 0; q2 = (0, 1), o2 = 0; q3 = (1, 0), o3 = 0; q4 = (1, 1), o4 = 1.

Weight Vector and the Decision Boundary
The weight vector W has a magnitude and a direction. The decision boundary is the line where Wᵀp = b, i.e. Wᵀp - b = 0; on one side of the boundary Wᵀp > b, and on the other Wᵀp < b. As we adjust the weights and biases of the neural network, we change the magnitude and direction of the weight vector, that is, the slope and intercept of the decision boundary.

Perceptron Learning Rule
Adjusting the weights of the perceptron.
Perceptron error: the difference between the desired and derived outputs, e = desired - derived.
When e = 1:  W_new = W_old + p
When e = -1: W_new = W_old - p
When e = 0:  W_new = W_old
Simplifying:
W_new = W_old + λ·e·p
b_new = b_old + e
where λ is the learning rate (λ = 1 for the perceptron).
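One update step of this rule as code (a sketch; the hardlim helper and variable names are mine, and the learning rate is also applied to the bias, which is equivalent for λ = 1):

```python
import numpy as np

def hardlim(n):
    """Hard limit activation: 1 if n >= 0, else 0."""
    return 1.0 if n >= 0 else 0.0

def perceptron_step(W, b, p, target, lam=1.0):
    """One perceptron update: e = target - output, then W += lam*e*p, b += lam*e."""
    e = target - hardlim(np.dot(W, p) + b)
    return W + lam * e * p, b + lam * e, e

# Example: a misclassified point moves the boundary (illustration values).
W, b, e = perceptron_step(np.array([1.0, 1.0]), -1.0, np.array([0.0, 1.0]), 0.0)
print(W, b, e)   # e = -1, so the weights move away from p and the bias decreases
```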

AND Function Example
Start with W1 = 1, W2 = 1, and b = -1.
For each input vector p, compute Wᵀp + b, apply the hard limit function to get the output a, and form the error e = t - a. When e = 0 the weights are left unchanged (N/C); otherwise apply W_new = W_old + e·p and b_new = b_old + e.
Repeating passes over the four AND data points, the weights and bias are corrected each time a point is misclassified. After a few passes every point gives e = 0 (no change) and training is done. The resulting network is o = hardlim(2·p1 + 1·p2 - 3), which realizes the AND function.
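The whole training run can be scripted (a sketch, assuming the four inputs are presented in truth-table order on each pass; with that ordering the run converges to W1 = 2, W2 = 1, b = -3, the same network shown above):

```python
import numpy as np

def hardlim(n):
    return 1.0 if n >= 0 else 0.0

# AND truth table: inputs (p1, p2) and target t
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

W = np.array([1.0, 1.0])   # starting weights from the example above
b = -1.0                   # starting bias from the example above

for _ in range(20):                        # passes over the four points
    changed = False
    for p, t in data:
        p = np.array(p, dtype=float)
        e = t - hardlim(W @ p + b)         # perceptron error
        if e != 0:
            W, b, changed = W + e * p, b + e, True
    if not changed:                        # every point classified correctly: done
        break

print(W, b)    # with this ordering: W = [2. 1.], b = -3.0
```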

XOR Function
Truth table:
x  y | z
0  0 | 0
0  1 | 1
1  0 | 1
1  1 | 0
z = (x AND NOT y) OR (NOT x AND y)
No single decision boundary can separate the favorable and unfavorable outcomes, so we will need a more complicated neural net to realize this function.
[Circuit diagram of the XOR function]

XOR Function – Multilayer Perceptron
[Diagram: inputs x and y feed two hidden neurons f1() through weights W1–W4 with biases b11 and b12; the hidden outputs feed the output neuron f() through weights W5 and W6 with bias b2.]
z = f( W5·f1(W1·x + W4·y + b11) + W6·f1(W2·x + W3·y + b12) + b2 )
The weights of the neural net are independent of each other, so we can compute the partial derivative of z with respect to each weight of the network, i.e. ∂z/∂W1, ∂z/∂W2, ∂z/∂W3, ∂z/∂W4, ∂z/∂W5, ∂z/∂W6.
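One way to see that this structure can realize XOR is to pick the weights by hand (a sketch, not weights from the slides): one hidden unit computes OR, the other NAND, and the output unit ANDs them. Hard-limit activations are used here for clarity; training with back propagation would use a differentiable activation such as the log sigmoid.

```python
def hardlim(n):
    return 1.0 if n >= 0 else 0.0

def xor_net(x, y):
    # Hidden layer: h1 acts as OR, h2 acts as NAND (hand-picked weights and biases).
    h1 = hardlim(1.0 * x + 1.0 * y - 0.5)
    h2 = hardlim(-1.0 * x - 1.0 * y + 1.5)
    # Output layer: AND of the two hidden outputs.
    return hardlim(1.0 * h1 + 1.0 * h2 - 1.5)

for x in (0, 1):
    for y in (0, 1):
        print(x, y, xor_net(x, y))   # prints the XOR truth table
```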

Back Propagation Diagram Neural Networks and Logistic Regression by Lucila Ohno-Machado Decision Systems Group, Brigham and Women’s Hospital, Department of Radiology

Back Propagation Algorithm
This algorithm for training artificial neural networks (ANNs) depends on two basic ideas:
a) reduce the sum squared error (SSE) to an acceptable value;
b) have reliable data with which to train the network under supervision.
Simple case: a single-input, no-bias neural net.
[Diagram: input x feeds weight W1 to give n1, which passes through f1 to give a1; a1 feeds weight W2 to give n2, which passes through f2 to give the output z. T = desired output.]

BP Equations
n1 = W1·x
a1 = f1(n1) = f1(W1·x)
n2 = W2·a1 = W2·f1(n1) = W2·f1(W1·x)
z = f2(n2) = f2(W2·f1(W1·x))
SSE = ½(z - T)²
Let's now take the partial derivatives:
∂SSE/∂W2 = (z - T)·∂(z - T)/∂W2 = (z - T)·∂z/∂W2 = (z - T)·∂f2(n2)/∂W2
Chain rule: ∂f2(n2)/∂W2 = (∂f2(n2)/∂n2)·(∂n2/∂W2) = (∂f2(n2)/∂n2)·a1
∂SSE/∂W2 = (z - T)·(∂f2(n2)/∂n2)·a1
Define λ as our learning rate (0 < λ < 1, typically λ = 0.2).
Compute our new weight:
W2(k+1) = W2(k) - λ·(∂SSE/∂W2) = W2(k) - λ·(z - T)·(∂f2(n2)/∂n2)·a1

Sigmoid function: ∂f2(n2)/∂n2 = f2(n2)·(1 - f2(n2)) = z(1 - z)
Therefore: W2(k+1) = W2(k) - λ·(z - T)·z(1 - z)·a1
Analysis for W1:
n1 = W1·x,  a1 = f1(W1·x),  n2 = W2·f1(n1) = W2·f1(W1·x)
∂SSE/∂W1 = (z - T)·∂(z - T)/∂W1 = (z - T)·∂z/∂W1 = (z - T)·∂f2(n2)/∂W1
Chain rule: ∂f2(n2)/∂W1 = (∂f2(n2)/∂n2)·(∂n2/∂W1)
∂n2/∂W1 = W2·(∂f1(n1)/∂W1) = W2·(∂f1(n1)/∂n1)·(∂n1/∂W1) = W2·(∂f1(n1)/∂n1)·x   (chain rule again)
∂SSE/∂W1 = (z - T)·(∂f2(n2)/∂n2)·W2·(∂f1(n1)/∂n1)·x
W1(k+1) = W1(k) - λ·(z - T)·(∂f2(n2)/∂n2)·W2·(∂f1(n1)/∂n1)·x
With sigmoid activations, ∂f2(n2)/∂n2 = z(1 - z) and ∂f1(n1)/∂n1 = a1(1 - a1).
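These update formulas translate directly into a training loop. The sketch below assumes log-sigmoid activations for both f1 and f2 and uses made-up values for the input x, target T, and starting weights; only the update equations themselves come from the derivation above.

```python
import math

def logsig(n):
    return 1.0 / (1.0 + math.exp(-n))

x, T = 1.0, 0.8        # training input and desired output (illustration values)
W1, W2 = 0.3, 0.7      # starting weights (illustration values)
lam = 0.2              # learning rate, as suggested above

for _ in range(2000):
    # Forward pass
    a1 = logsig(W1 * x)            # a1 = f1(n1)
    z  = logsig(W2 * a1)           # z  = f2(n2)
    # Backward pass, using s'(n) = s(n)(1 - s(n))
    dSSE_dW2 = (z - T) * z * (1 - z) * a1
    dSSE_dW1 = (z - T) * z * (1 - z) * W2 * a1 * (1 - a1) * x
    # Gradient-descent updates
    W2 -= lam * dSSE_dW2
    W1 -= lam * dSSE_dW1

print(W1, W2, logsig(W2 * logsig(W1 * x)))   # the output approaches T = 0.8
```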

Gradient Descent
[Plot of error versus training time, showing descent into a local minimum versus the global minimum.]
Source: Neural Networks and Logistic Regression by Lucila Ohno-Machado, Decision Systems Group, Brigham and Women's Hospital, Department of Radiology

2-D Diagram of Gradient Descent Source : Back Propagation algorithm by Olena Lobunets workshops03-04/Workshop4/Workshop%204.ppt

Learning by Example
Training algorithm: back propagation of errors using gradient descent training.
Colors:
– Red: current weights
– Orange: updated weights
– Black boxes: inputs and outputs to a neuron
– Blue: sensitivities at each layer
Source: A Brief Overview of Neural Networks, Rohit Dua, Samuel A. Mulder, Steve E. Watkins, and Donald C. Wunsch, campus.umr.edu/smartengineering/EducationalResources/Neural_Net.ppt

First Pass
Error = 0.3492
G3 = (1)(0.3492) = 0.3492
G2 = (0.6508)(1 - 0.6508)(0.3492)(0.5) = 0.0397
G1 = (0.6225)(1 - 0.6225)(0.0397)(0.5)(2) = 0.0093
Gradient of a neuron: G = slope of the transfer function × [Σ{(weight of the neuron to the next neuron) × (gradient of the next neuron)}]
Gradient of the output neuron = slope of the transfer function × error

Weight Update 1
New weight = old weight + {(learning rate)(gradient)(prior output)}
W3: 0.5 + (0.5)(0.3492)(0.6508) = 0.6136
W2: 0.5 + (0.5)(0.0397)(0.6225) = 0.5124
W1: 0.5 + (0.5)(0.0093)(1) ≈ 0.5047
Source: A Brief Overview of Neural Networks, Rohit Dua, Samuel A. Mulder, Steve E. Watkins, and Donald C. Wunsch, campus.umr.edu/smartengineering/EducationalResources/Neural_Net.ppt

Second Pass
Error = 0.1967
G3 = (1)(0.1967) = 0.1967
G2 = (0.6545)(1 - 0.6545)(0.1967)(0.6136) = 0.0273
G1 = (0.6236)(1 - 0.6236)(0.5124)(0.0273)(2) = 0.0066
Source: A Brief Overview of Neural Networks, Rohit Dua, Samuel A. Mulder, Steve E. Watkins, and Donald C. Wunsch, campus.umr.edu/smartengineering/EducationalResources/Neural_Net.ppt

Weight Update 2
New weight = old weight + {(learning rate)(gradient)(prior output)}
W3: 0.6136 + (0.5)(0.1967)(0.6545) ≈ 0.678
W2: 0.5124 + (0.5)(0.0273)(0.6236) ≈ 0.521
W1: 0.5047 + (0.5)(0.0066)(1) ≈ 0.508
Source: A Brief Overview of Neural Networks, Rohit Dua, Samuel A. Mulder, Steve E. Watkins, and Donald C. Wunsch, campus.umr.edu/smartengineering/EducationalResources/Neural_Net.ppt

Third Pass Source : A Brief Overview of Neural Networks Rohit Dua, Samuel A. Mulder, Steve E. Watkins, and Donald C. Wunsch campus.umr.edu/smartengineering/ EducationalResources/Neural_Net.ppt

Weight Update Summary
W1: weights from the input to the input layer
W2: weights from the input layer to the hidden layer
W3: weights from the hidden layer to the output layer
Source: A Brief Overview of Neural Networks, Rohit Dua, Samuel A. Mulder, Steve E. Watkins, and Donald C. Wunsch, campus.umr.edu/smartengineering/EducationalResources/Neural_Net.ppt

ECG Interpretation Neural Networks and Logistic Regression by Lucila Ohno-Machado Decision Systems Group, Brigham and Women’s Hospital, Department of Radiology

Other Applications of ANN
Lip Reading Using Artificial Neural Network. Ahmad Khoshnevis, Sridhar Lavu, Bahar Sadeghi and Yolanda Tsang, ELEC502 Course Project. www-dsp.rice.edu/~lavu/research/doc/502lavu.ps
AI Techniques in Power Electronics and Drives. Dr. Marcelo G. Simões, Colorado School of Mines. egweb.mines.edu/msimoes/tutorial
Car Classification with Neural Networks. Koichi Sato & Sangho Park. hercules.ece.utexas.edu/course/ee380l/1999sp/present/carclass.ppt
Face Detection and Neural Networks. Todd Wittman
A Neural Network for Detecting and Diagnosing Tornadic Circulations. V Lakshmanan, Gregory Stumpf, Arthur Witt

Bibliography
A Brief Overview of Neural Networks. Rohit Dua, Samuel A. Mulder, Steve E. Watkins, and Donald C. Wunsch. campus.umr.edu/smartengineering/EducationalResources/Neural_Net.ppt
Neural Networks and Logistic Regression. Lucila Ohno-Machado, Decision Systems Group, Brigham and Women's Hospital, Department of Radiology. dsg.harvard.edu/courses/hst951/ppt/hst951_0320.ppt
G5AIAI Introduction to AI. Graham Kendall, School of Computer Science and IT, University of Nottingham.
The Scientist and Engineer's Guide to Digital Signal Processing. Steven W. Smith, Ph.D. California Technical Publishing.
Neural Network Design. Martin Hagan, Howard B. Demuth, and Mark Beale. Campus Publishing Services, Boulder, Colorado.
ECE 8412 lecture notes. Dr. Anthony Zygmont, Department of Electrical Engineering, Villanova University, January 2003.
Supervised Neural Network Introduction, CISC 873 Data Mining. Yabin Meng.