
1 ECE 517: Reinforcement Learning in Artificial Intelligence
Lecture 13: Artificial Neural Networks – Introduction, Feedforward Neural Networks
Dr. Itamar Arel
College of Engineering, Electrical Engineering and Computer Science Department, The University of Tennessee
Fall 2012 – October 30, 2012

2 ECE 517 - Reinforcement Learning
Final projects - logistics
Projects can be done individually or in pairs. Students are encouraged to propose a topic. Please email me your top three choices for a project along with a preferred date for your presentation.
Presentation dates: Nov. 27, 29 and Dec. 4
Format: 17 min presentation + 3 min Q&A (~7 min for background and motivation, ~10 min for description of your work and conclusions)
Written report due: Friday, Dec. 7. Format similar to the project report.

3 Final projects - topics
Tetris player using RL (and NN)
Curiosity based TD learning*
States vs. Rewards in RL
Human reinforcement learning
Reinforcement Learning of Local Shape in the Game of Go
Where do rewards come from?
Efficient Skill Learning using Abstraction Selection
AIBO Playing on a PC using RL*
AIBO learning to walk within a maze*
Study of value function definitions for TD learning*

4 Outline
Introduction
Brain vs. Computers
The Perceptron
Multilayer Perceptrons (MLP)
Feedforward Neural Networks and Backpropagation

5 Pigeons as art experts (Watanabe et al. 1995)
Experiment: a pigeon was placed in a closed box and presented with paintings by two different artists (e.g. Chagall / Van Gogh). The pigeon was rewarded for pecking when presented with a particular artist (e.g. Van Gogh). Pigeons were able to discriminate between Van Gogh and Chagall with 95% accuracy (when presented with pictures they had been trained on).

6 Pictures by different artists

7 Interesting results
Discrimination was still 85% successful for previously unseen paintings by the same artists. Conclusions from the experiment: pigeons do not simply memorise the pictures; they can extract and recognise patterns (e.g. artistic 'style'); and they generalise from what they have already seen to make predictions. This is what neural networks (biological and artificial) are good at, unlike conventional computers, and it provided further justification for the use of ANNs.
"Computers are incredibly fast, accurate, and stupid. Human beings are incredibly slow, inaccurate, and brilliant. Together they are powerful beyond imagination." – quote attributed to Albert Einstein

8 The "Von Neumann" architecture vs. Neural Networks
Von Neumann: memory for programs and data; CPU for math and logic; control unit to steer program flow. Follows rules; the solution can (and must) be formally specified; cannot generalize; not error tolerant.
Neural net: learns from data; the rules it learns are not visible; able to generalize; copes well with noise.

9 Biological Neuron
Input builds up on the receptors (dendrites). The cell has an input threshold; when that threshold is breached, an activation is fired down the axon. Synapses (i.e. the weights) sit at the dendrite (input) interfaces.

10 Connectionism
Connectionist techniques (a.k.a. neural networks) are inspired by the strong interconnectedness of the human brain. Neural networks are loosely modeled after the biological processes involved in cognition:
1. Information processing involves many simple processing elements called neurons.
2. Signals are transmitted between neurons using connecting links.
3. Each link has a weight that modulates (or controls) the strength of its signal.
4. Each neuron applies an activation function to the input that it receives from other neurons. This function determines its output.
Links with positive weights are called excitatory links; links with negative weights are called inhibitory links.

11 Some definitions
A Neural Network is an interconnected assembly of simple processing elements, units or nodes. The long-term memory of the network is stored in the inter-unit connection strengths, or weights, obtained by a process of adaptation to, or learning from, a set of training patterns. It is a biologically inspired learning mechanism.

12 Brain vs. Computer
The brain's performance tends to degrade gracefully under partial damage. In contrast, most programs and engineered systems are brittle: if you remove some arbitrary parts, very likely the whole will cease to function. The brain also performs massively parallel computations extremely efficiently; for example, complex visual perception occurs within less than 100 ms, that is, about 10 processing steps!

13 Dimensions of Neural Networks
There are various types of neurons, various network architectures, various learning algorithms, and various applications. We'll focus mainly on supervised-learning-based networks. The architecture of a neural network is closely linked with the learning algorithm used to train it.

14 ANNs – The basics
ANNs incorporate the two fundamental components of biological neural nets: neurons (computational nodes) and synapses (weights, or memory storage devices).

15 Neuron vs. Node

16 The Artificial Neuron
[Figure: input signals x1, x2, ..., xm are scaled by synaptic weights w1, w2, ..., wm and combined by a summing function together with a bias b; the resulting local field v passes through an activation function to produce the output y]
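The structure in the figure can be sketched in a few lines of Python. This is a minimal illustration, not code from the lecture; the input, weight, and bias values are arbitrary, chosen so the local field v works out to exactly 0.

```python
import math

# Minimal artificial neuron: weighted sum plus bias (the local field v),
# passed through an activation function to produce the output y.
def neuron(x, w, b, activation):
    v = sum(wi * xi for wi, xi in zip(w, x)) + b  # local field v
    return activation(v)                          # output y

step = lambda v: 1.0 if v >= 0 else 0.0           # threshold activation
sigmoid = lambda v: 1.0 / (1.0 + math.exp(-v))    # squashing activation

x = [0.5, -1.0, 2.0]   # input signals x1..x3 (arbitrary)
w = [0.5, 0.25, -0.25] # synaptic weights w1..w3 (arbitrary)
b = 0.5                # bias

print(neuron(x, w, b, step))     # 1.0 (v = 0, threshold met)
print(neuron(x, w, b, sigmoid))  # 0.5 (sigmoid of v = 0)
```

The same `neuron` function serves as a threshold unit or a smooth unit depending only on which activation function is passed in.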

17 Bias as an extra input
The bias is an external parameter of the neuron. It can be modeled by adding an extra (fixed-valued) input x0 = +1 with weight w0.
[Figure: inputs x0 = +1, x1, ..., xm with synaptic weights w0, w1, ..., wm feed the summing function; the local field v passes through the activation function to produce the output y]
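The equivalence the slide describes is easy to check numerically. A small sketch with arbitrary values (chosen as exact binary fractions so the floating-point comparison is exact):

```python
# Bias-as-extra-input trick: appending a fixed input x0 = +1 with
# weight w0 = b yields the same local field as an explicit bias term.
x = [0.5, -1.0]
w = [0.5, 0.25]
b = 0.75

v_explicit = sum(wi * xi for wi, xi in zip(w, x)) + b          # explicit bias
v_extra = sum(wi * xi for wi, xi in zip([b] + w, [1.0] + x))   # x0 = +1, w0 = b
print(v_explicit == v_extra)  # True: both give v = 0.75
```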

18 Face recognition example
A network achieves 90% accuracy learning head pose and recognizing 1-of-20 faces.

19 The XOR problem
A single-layer (linear) neural network cannot solve the XOR problem.
Input -> Output
00 -> 0
01 -> 1
10 -> 1
11 -> 0
To see why this is true, we can try to express the problem as a linear equation aX + bY = Z:
a*0 + b*0 = 0
a*0 + b*1 = 1 -> b = 1
a*1 + b*0 = 1 -> a = 1
a*1 + b*1 = 0 -> a = -b
The constraints are contradictory (a = 1 and b = 1, but a = -b), so no choice of a and b works.

20 The XOR problem (cont.)
But by adding a third input bit, the problem can be resolved.
Input -> Output
000 -> 0
010 -> 1
100 -> 1
111 -> 0
Once again, we express the problem as a linear equation aX + bY + cZ = W:
a*0 + b*0 + c*0 = 0
a*0 + b*1 + c*0 = 1 -> b = 1
a*1 + b*0 + c*0 = 1 -> a = 1
a*1 + b*1 + c*1 = 0 -> a + b + c = 0 -> 1 + 1 + c = 0 -> c = -2
So the equation X + Y - 2Z = W solves the problem. (Note that in the table above, the third bit Z equals X AND Y.)
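The derived equation can be checked directly against the four rows of the truth table above:

```python
# Verify that W = X + Y - 2Z reproduces the augmented XOR truth table,
# where the third input bit Z equals X AND Y.
rows = [((0, 0, 0), 0), ((0, 1, 0), 1), ((1, 0, 0), 1), ((1, 1, 1), 0)]
for (X, Y, Z), W in rows:
    assert X + Y - 2 * Z == W
print("X + Y - 2Z = W holds for all four rows")
```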

21 A Multilayer Network for the XOR function
[Figure: a two-layer network of threshold units implementing XOR; the thresholds are shown on the nodes]
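One way to realize such a network, consistent with the X + Y - 2Z = W equation from the previous slide: a hidden threshold unit computes Z = AND(X, Y), and the output unit thresholds X + Y - 2Z. The specific thresholds (1.5 and 0.5) are one common choice, not necessarily the ones in the figure.

```python
# Two-layer threshold network for XOR.
def step(v):
    return 1 if v >= 0 else 0

def xor_net(x, y):
    z = step(x + y - 1.5)             # hidden unit: AND(x, y)
    return step(x + y - 2 * z - 0.5)  # output unit: x + y - 2z thresholded

for x, y in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, y, "->", xor_net(x, y))  # prints 0, 1, 1, 0
```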

22 Hidden Units
Hidden units are a layer of nodes situated between the input nodes and the output nodes. They allow a network to learn non-linear functions by representing combinations of the input features. Given too many hidden units, however, a net will simply memorize the input patterns; given too few, the network may not be able to represent all of the necessary generalizations.

23 Backpropagation Networks
Backpropagation networks are among the most popular and widely used neural networks because they are relatively simple yet powerful. Backpropagation was one of the first general techniques developed to train multilayer networks, which do not have many of the inherent limitations of the earlier, single-layer neural nets criticized by Minsky and Papert. Backpropagation networks use a gradient descent method to minimize the total squared error of the output. A backpropagation net is a multilayer, feedforward network trained by backpropagating the errors using the generalized delta rule.

24 The idea behind (error) backpropagation learning
Feedforward training of input patterns: each input node receives a signal, which is broadcast to all of the hidden units; each hidden unit computes its activation, which is broadcast to all of the output nodes.
Backpropagation of errors: each output node compares its activation with the desired output; based on this difference, the error is propagated back to all previous nodes.
Adjustment of weights: the weights of all links are computed simultaneously, based on the errors that were propagated backwards.
A network trained this way is the Multilayer Perceptron (MLP).

25 Activation functions
An activation function transforms a neuron's input into its output. Features of activation functions: a squashing effect is required, which prevents accelerating growth of activation levels through the network; and it should be simple and easy to calculate.
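Two common squashing functions illustrate both properties. The "easy to calculate" point is especially true of their derivatives, which backpropagation needs: the sigmoid's derivative can be written in terms of its own output. (A small sketch; the sample points are arbitrary.)

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def sigmoid_prime(v):
    y = sigmoid(v)
    return y * (1.0 - y)            # derivative expressed via the output y

def tanh_prime(v):
    return 1.0 - math.tanh(v) ** 2  # likewise for tanh

for v in (-10.0, 0.0, 10.0):
    # Outputs stay bounded (squashed) even for large |v|:
    # sigmoid in (0, 1), tanh in (-1, 1).
    print(v, round(sigmoid(v), 4), round(math.tanh(v), 4))
```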

26 Backpropagation Learning
We want to train a multilayer feedforward network by gradient descent to approximate an unknown function, based on some training data consisting of pairs (x, d). The vector x represents a pattern of input to the network, and the vector d the corresponding target (desired output). BP is a gradient-descent based scheme. The overall gradient with respect to the entire training set is just the sum of the gradients for each pattern, so we will describe how to compute the gradient for just a single training pattern. We number the units, and denote the weight from unit j to unit i by w_ij.
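For reference, the per-pattern computation that the following slides develop graphically is the standard generalized delta rule, stated here in its textbook form using the slide's convention that w_ij is the weight from unit j to unit i. Here f is the activation function, v_i the local field of unit i, y_i its output, and η the learning rate (a symbol the slides do not name):

```latex
E = \frac{1}{2}\sum_i (d_i - y_i)^2
\qquad
\delta_i =
\begin{cases}
(d_i - y_i)\, f'(v_i) & \text{for output units} \\[4pt]
f'(v_i) \displaystyle\sum_k \delta_k\, w_{ki} & \text{for hidden units}
\end{cases}
\qquad
\Delta w_{ij} = \eta\, \delta_i\, y_j
```

The hidden-unit sum runs over the units k that unit i feeds into, which is exactly the "each neuron forwards error information to the neurons feeding it" step on the back-propagation slides below.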

27 BP – Forward Pass at Layer 1

28 BP – Forward Pass at Layer 2

29 BP – Forward Pass at Layer 3
The last layer produces the network's output. We can now derive an error (the difference between the output and the target).

30 BP – Back-propagation of error – output layer
We have an error with respect to the target (z). This error signal will be propagated back towards the input layer (layer 1). Each neuron forwards error information to the neurons feeding it from the previous layer.

31 BP – Back-propagation of error towards the hidden layer

32 BP – Back-propagation of error towards the input layer

33 BP – Illustration of Weight Update
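The whole forward pass / backward pass / weight update cycle of the preceding slides can be sketched end to end on a tiny network. This is an illustrative example, not code from the lecture: the 2-2-1 shape, the initial weights, and the learning rate η = 0.5 are all arbitrary choices, and the update follows the generalized delta rule with a sigmoid activation (whose derivative is y(1 - y)).

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

# One training pattern (x, d) for a 2-2-1 MLP.
x = [1.0, 0.0]                    # input pattern
d = 1.0                           # desired output
W1 = [[0.5, -0.3], [0.2, 0.8]]    # W1[i][j]: weight from input j to hidden i
b1 = [0.1, -0.1]                  # hidden biases
W2 = [0.4, -0.6]                  # weights from hidden i to the output
b2 = 0.05                         # output bias
eta = 0.5                         # learning rate

def forward():
    h = [sigmoid(sum(W1[i][j] * x[j] for j in range(2)) + b1[i]) for i in range(2)]
    y = sigmoid(sum(W2[i] * h[i] for i in range(2)) + b2)
    return h, y

# Forward pass and squared error before the update.
h, y = forward()
err_before = 0.5 * (d - y) ** 2

# Backward pass: deltas for the output and hidden layers.
delta_out = (d - y) * y * (1.0 - y)
delta_hid = [h[i] * (1.0 - h[i]) * delta_out * W2[i] for i in range(2)]

# Weight update: delta_w_ij = eta * delta_i * (output of unit j).
for i in range(2):
    W2[i] += eta * delta_out * h[i]
    for j in range(2):
        W1[i][j] += eta * delta_hid[i] * x[j]
b2 += eta * delta_out
b1 = [b1[i] + eta * delta_hid[i] for i in range(2)]

# The squared error on this pattern decreases after one gradient step.
_, y_new = forward()
err_after = 0.5 * (d - y_new) ** 2
print(err_after < err_before)  # True
```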

