Chapter 6: Neural Networks


Hopfield Networks
Hopfield [1982]: a theory of memory (p. 490) and a model of content-addressable memory (p. 491).
Key properties: distributed representation; distributed, asynchronous control; content-addressable memory; fault tolerance.
Figure 18.1, p. 490: black units are active, white units are inactive.

Hopfield Networks
Units are connected to each other with weighted, symmetric connections.
A positive weight indicates that the two units tend to activate each other.
A negative weight allows an active unit to deactivate a neighboring unit.

Parallel relaxation algorithm
The network operates as follows: a random unit is chosen.
If any of its neighbors are active, the unit computes the sum of the weights on the connections to those active neighbors.
If the sum is positive, the unit becomes active; otherwise it becomes inactive.
Another random unit is chosen, and the process repeats until the network reaches a stable state (i.e., until no unit can change state).
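A minimal Python sketch of this relaxation loop (the sweep over units in random order is a practical stand-in for repeatedly picking single random units until nothing can change):

import random

def parallel_relaxation(weights, state, max_sweeps=1000):
    # weights[i][j]: symmetric connection weight between units i and j (0 if unconnected)
    # state[i]: 1 if unit i is active, 0 if inactive
    n = len(state)
    for _ in range(max_sweeps):
        changed = False
        for i in random.sample(range(n), n):          # visit units in a random order
            active = [j for j in range(n) if j != i and state[j] == 1]
            if not active:                            # no active neighbors: leave the unit alone
                continue
            total = sum(weights[i][j] for j in active)
            new_state = 1 if total > 0 else 0
            if new_state != state[i]:
                state[i] = new_state
                changed = True
        if not changed:                               # stable: no unit changed in a full sweep
            return state
    return state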

Hopfield Networks
Figure 18.1, p. 490: a Hopfield network. A black (active) unit with a positive connection will attempt to activate the units connected to it.
Figure 18.2, p. 491: four stable states, which store the patterns. Given any set of weights and any initial state, the parallel relaxation algorithm will settle into one of these four states.

Hopfield Networks
Figure 18.3, p. 491: a model of content-addressable memory. The activities of the units are set to correspond to a partial pattern.
To retrieve a pattern, we present a portion of it; the network then settles into the stable state that best matches the partial pattern.
This illustrates the local minima: the nearest stable state.
Figure 18.4, p. 492: what a Hopfield network computes as it moves from one state to another.

Hopfield Networks
Problem: sometimes the network cannot find the global solution because it gets stuck in a local minimum, since the units settle into stable states via a completely distributed algorithm.
For example, in Figure 18.4, if the network reaches stable state A, then no single unit is willing to change its state in order to move uphill, so the network will never reach the globally optimal state B.

Perceptron
A perceptron (Rosenblatt, 1962) models a neuron by taking a weighted sum of its inputs and outputting 1 if the sum is greater than some adjustable threshold value (otherwise it outputs 0).
Figures 18.5-18.7, pp. 493-494: threshold functions. Figure 18.8: an intelligent system.
g(x) = sum of wi xi for i = 1 to n
output(x) = 1 if g(x) > 0, otherwise 0
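The threshold unit above, written directly in Python. The adjustable threshold is folded in as a bias weight w0 paired with a constant input of 1, matching the g(x) expansion used on the next slide:

def perceptron_output(weights, x):
    # weights[0] acts as the (negated) threshold, paired with a constant input of 1;
    # weights[1:] are w1..wn for the inputs x1..xn
    g = sum(w * xi for w, xi in zip(weights, [1.0] + list(x)))
    return 1 if g > 0 else 0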

Perceptron
In the two-input case, the decision boundary is where g(x) = 0:
g(x) = w0 + w1x1 + w2x2 = 0, so x2 = -(w1/w2)x1 - (w0/w2), the equation of a line.
The location of the line is determined by the weights w0, w1, and w2.
If an input vector lies on one side of the line, the perceptron outputs 1; if it lies on the other side, it outputs 0.
Decision surface: a line that correctly separates the training instances corresponds to a perfectly functioning perceptron. See Figure 18.9, p. 496.
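A quick numeric check of this geometry, with illustrative weights w0 = -1, w1 = 1, w2 = 1 (not taken from the text), so that the boundary is the line x2 = -x1 + 1:

w0, w1, w2 = -1.0, 1.0, 1.0                 # illustrative weights: the line is x2 = -x1 + 1
for x1, x2 in [(1.0, 1.0), (0.0, 0.0)]:
    g = w0 + w1 * x1 + w2 * x2
    print((x1, x2), 1 if g > 0 else 0)      # (1, 1) lies above the line -> 1; (0, 0) below -> 0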

Decision surface
The absolute value of g(x) tells how far a given input vector x lies from the decision surface, so it tells us how good a set of weights is.
Let w be the weight vector (w0, w1, ..., wn).
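To make the distance claim precise (a standard result, not spelled out on the slide), the perpendicular distance from x to the decision surface g(x) = 0 is

d(x) = \frac{|g(x)|}{\lVert (w_1, \ldots, w_n) \rVert}

so, for weights of a fixed length, a larger |g(x)| means x lies further from the surface.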

Multilayer perceptron
Figure 18.10, p. 497: adjusting the weights by gradient descent (hill-climbing downhill on the error).
See the Fixed-Increment Perceptron Learning algorithm.
Figure 18.11, p. 499: a perceptron learning to solve a classification problem, shown at K = 10, K = 100, and K = 635.
Figure 18.12, p. 500: XOR is not linearly separable, so we need a multilayer perceptron to solve the XOR problem.
See Figure 18.13, p. 500, for the case x1 = 1 and x2 = 1.
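A sketch of the fixed-increment rule the slide cites, assuming 0/1 targets and an increment of 1; the book's exact presentation may differ in detail:

def fixed_increment_training(samples, n_inputs, epochs=100):
    # Fixed-increment perceptron learning: add the input vector to the weights
    # on a false negative, subtract it on a false positive, do nothing when correct.
    w = [0.0] * (n_inputs + 1)                      # w[0] is the bias weight
    for _ in range(epochs):
        errors = 0
        for x, target in samples:                   # target is 0 or 1
            xv = [1.0] + list(x)
            out = 1 if sum(wi * xi for wi, xi in zip(w, xv)) > 0 else 0
            if out != target:
                sign = 1 if target == 1 else -1
                w = [wi + sign * xi for wi, xi in zip(w, xv)]
                errors += 1
        if errors == 0:                             # converged (linearly separable data)
            break
    return w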

Backpropagation Algorithm
Parker (1985); Rumelhart et al. (1986).
A fully connected, feedforward, multilayer network: Figure 18.14, p. 502.
Fast, resistant to damage, and learns efficiently: see Figure 18.15, p. 503.
Used for classification problems.
Uses a sigmoid (S-shaped) activation function, which produces a real value between 0 and 1 as output: see Figure 18.16, p. 503.
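The S-shaped function referred to here is the logistic sigmoid; in Python:

import math

def sigmoid(z):
    # maps any real input to a value in (0, 1)
    return 1.0 / (1.0 + math.exp(-z))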

Backpropagation Algorithm
Figure 18.14, p. 502.
Start with a random set of weights. The network adjusts its weights each time it sees an input-output pair.
Each pair requires two stages:
1) A forward pass: present a sample input to the network and let activations flow until they reach the output layer.
2) A backward pass: the network's actual output (from the forward pass) is compared with the target output, and error estimates are computed for the output units.

Backpropagation Algorithm
The weights connected to the output units can be adjusted in order to reduce these errors.
We can then use the error estimates of the output units to derive error estimates for the units in the hidden layers.
Finally, errors are propagated back to the connections stemming from the input units.

Backpropagation Algorithm (pp. 504-506)
Initialize the weights to random values between -0.1 and 0.1, initialize the activations of the thresholding units, set the learning rate, and choose an input-output pair.
oj = the network's actual output value (the value the network computes).
yj = the target output (the true value in the training data).
Adjust the weights between the hidden layer and the output layer (w2ij), then adjust the weights between the input layer and the hidden layer (w1ij).
Layer structure: input layer (units xi) -- weights w1ij --> hidden layer (units hj) -- weights w2ij --> output layer (units oj).
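A compact sketch of one such update, using standard sigmoid backpropagation on the squared error. Bias weights are omitted and the learning-rate value is an illustrative choice, since the slide leaves it blank:

import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def backprop_step(x, y, w1, w2, rate=0.3):
    # Notation follows the slide: x inputs, h hidden activations, o outputs, y targets,
    # w1[j][i] input->hidden weights, w2[k][j] hidden->output weights.
    # forward pass
    h = [sigmoid(sum(w1[j][i] * x[i] for i in range(len(x)))) for j in range(len(w1))]
    o = [sigmoid(sum(w2[k][j] * h[j] for j in range(len(h)))) for k in range(len(w2))]
    # backward pass: error terms for output units, then for hidden units
    d_out = [o[k] * (1 - o[k]) * (y[k] - o[k]) for k in range(len(o))]
    d_hid = [h[j] * (1 - h[j]) * sum(d_out[k] * w2[k][j] for k in range(len(o)))
             for j in range(len(h))]
    # weight updates: hidden->output first, then input->hidden
    for k in range(len(o)):
        for j in range(len(h)):
            w2[k][j] += rate * d_out[k] * h[j]
    for j in range(len(h)):
        for i in range(len(x)):
            w1[j][i] += rate * d_hid[j] * x[i]
    return o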

Backpropagation Algorithm
Backpropagation updates its weights after seeing each input-output pair. After it has seen all the input-output pairs (and adjusted its weights that many times), one epoch has been completed.
Training for more epochs generally improves the network's performance.
We can speed up learning by using a momentum term: see the equation on p. 506.
Perceptron convergence theorem (Rosenblatt, 1962): guarantees that the perceptron will find a solution if one exists.
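In its usual form (the exact equation is on p. 506), the momentum term adds a fraction of the previous weight change to the current one:

\Delta w_{ij}(t) = \beta \, \delta_j \, x_i + \alpha \, \Delta w_{ij}(t-1)

where beta is the learning rate, delta_j is the error term of the receiving unit, x_i is the activation along the connection, and alpha (e.g. 0.9) is the momentum coefficient.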

Backpropagation Algorithm: Generalization
Figure 18.17, p. 508.
A good network should be capable of storing the entire training set, with a setting of the weights that describes the mapping generally, for all cases, not just the individual input-output pairs.

Reinforcement Learning
Uses a punishment and reward system (as with animals):
1) The network is presented with a sample input from the training set.
2) The network computes what it thinks should be the sample output.
3) The network is supplied with a real-valued judgment by a teacher: a positive value indicates good performance, a negative value indicates bad performance.
4) The network adjusts its weights, and the process repeats.
The network tries to receive positive values, i.e., to achieve good performance. Compare this with supervised learning, where the teacher supplies the correct outputs themselves.

Unsupervised Learning
No feedback for its outputs; no teacher is required.
Given a set of input data, the network is allowed to discover regularities and relations between the different parts of the input (feature discovery).
Figure 18.18, p. 511: data for unsupervised learning, with three types of animal: 1) mammals, 2) reptiles, 3) birds.

Unsupervised Learning
We need to make sure that only one of the three output units becomes active for any given input.
See Figure 18.19, p. 512: a competitive learning network, which uses winner-take-all behavior.

Unsupervised Learning
The simple competitive learning algorithm (pp. 512-513):
1) Present an input vector.
2) Calculate the initial activation for each output unit.
3) Let the output units compete until only one is active.
4) Adjust the weights on the input lines that lead to the single active output unit: increase the weights on connections between the active output unit and the active input units. (This makes it more likely that the output unit will be active the next time the pattern is presented.)
5) Repeat steps 1 to 4 for all input patterns, for many epochs.
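A minimal Python sketch of these five steps, using one common form of the winner-take-all update (the winner's weights are shifted toward the active input lines while their sum stays normalized); the book's exact rule may differ:

import random

def competitive_learning(patterns, n_outputs, rate=0.2, epochs=50):
    # patterns: list of 0/1 input vectors; assumes each pattern has at least one active line.
    n_inputs = len(patterns[0])
    w = [[random.random() for _ in range(n_inputs)] for _ in range(n_outputs)]
    for row in w:                                    # normalize each unit's weights to sum to 1
        s = sum(row)
        for i in range(n_inputs):
            row[i] /= s
    for _ in range(epochs):
        for x in patterns:
            # steps 1-3: compute activations, then the winner takes all
            acts = [sum(w[k][i] * x[i] for i in range(n_inputs)) for k in range(n_outputs)]
            winner = acts.index(max(acts))
            # step 4: move the winner's weights toward the active input lines
            n_active = sum(x)
            for i in range(n_inputs):
                w[winner][i] += rate * (x[i] / n_active - w[winner][i])
    return w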

Recurrent Networks
Jordan (1986): used in temporal AI tasks such as planning and natural language processing.
We need more than a single output vector; we need a series of output vectors.
Figure 18.22, p. 518: a Jordan network.
Figure 18.23, p. 519: a recurrent network with a mental model.
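A minimal sketch of one time step of a Jordan-style network: the previous output, held in state units, is fed back into the hidden layer alongside the input. The weight layout (w_in, w_state, w_out) is an illustrative assumption, not taken from Figure 18.22:

import math

def jordan_step(x, state, w_in, w_state, w_out):
    # w_in[j]: weights from the inputs to hidden unit j
    # w_state[j]: weights from the state units (previous output) to hidden unit j
    # w_out[k]: weights from the hidden units to output unit k
    sig = lambda z: 1.0 / (1.0 + math.exp(-z))
    h = [sig(sum(wi * xi for wi, xi in zip(row_in, x)) +
             sum(ws * si for ws, si in zip(row_st, state)))
         for row_in, row_st in zip(w_in, w_state)]
    o = [sig(sum(wo * hj for wo, hj in zip(row, h))) for row in w_out]
    return o, list(o)        # the output, and a copy of it as the next step's state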

The End