# Carla P. Gomes CS4700 CS 4700: Foundations of Artificial Intelligence Prof. Carla P. Gomes Module: Neural Networks: Concepts (Reading:

Carla P. Gomes CS4700 CS 4700: Foundations of Artificial Intelligence Prof. Carla P. Gomes gomes@cs.cornell.edu Module: Neural Networks: Concepts (Reading: Chapter 20.5)

Basic Concepts

A neural network maps a set of inputs to a set of outputs; the number of inputs and outputs is variable. The network itself is composed of an arbitrary number of nodes (units), connected by links, with an arbitrary topology. A link from unit $i$ to unit $j$ serves to propagate the activation $a_i$ from $i$ to $j$, and it has a weight $W_{i,j}$.

What can a neural network do?

- Compute a known function / approximate an unknown function
- Pattern recognition / signal processing
- Learn to do any of the above

Different types of nodes

An Artificial Neuron (Node or Unit): A Mathematical Abstraction

An artificial neuron (node, unit, or processing unit $i$) is a processing element that produces an output based on a function of its inputs.

- Input edges, each with a weight (positive or negative; the weights change over time through learning).
- Input function $in_i$: the weighted sum of the unit's inputs, including the fixed input $a_0$, i.e., $in_i = \sum_{j=0}^{n} W_{j,i}\, a_j$.
- Activation function $g$: applied to the input function (typically non-linear).
- Output: $a_i = g(in_i) = g\left(\sum_{j=0}^{n} W_{j,i}\, a_j\right)$.
- Output edges, each with a weight (positive or negative; the weights change over time through learning).

Note: the fixed input ($a_0 = -1$) and the bias weight $W_{0,i}$ are conventional; some authors instead use, e.g., $a_0 = 1$ and $-W_{0,i}$.
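The unit described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not code from the course: the function name `unit_output`, its parameters, and the choice of a sigmoid for `g` are assumptions made for the example, and the fixed input is taken to be $a_0 = -1$.

```python
import math

def sigmoid(x):
    """Logistic function, used here as an example activation g."""
    return 1.0 / (1.0 + math.exp(-x))

def unit_output(weights, activations, bias_weight=0.0, a0=-1.0, g=sigmoid):
    """Return a_i = g(in_i), with in_i = W_0i * a_0 + sum_j W_ji * a_j.

    `weights` and `activations` hold W_ji and a_j for j = 1..n;
    the fixed input a_0 and its bias weight are passed separately (assumed layout).
    """
    in_i = bias_weight * a0 + sum(w * a for w, a in zip(weights, activations))
    return g(in_i)

# Two inputs with unit weights and bias weight 1: in_i = -1 + 1 + 0 = 0.
example = unit_output([1.0, 1.0], [1.0, 0.0], bias_weight=1.0)
```

Passing `g` as a parameter keeps the weighted-sum step separate from the choice of activation function.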

Activation Functions

- (a) Threshold activation function: a step function or threshold function (outputs 1 when the input is positive; 0 otherwise).
- (b) Sigmoid (or logistic) activation function, $g(x) = 1/(1 + e^{-x})$ (key advantage: differentiable).
- (c) Sign function: +1 if the input is positive, otherwise -1.

These functions have a threshold (either hard or soft) at zero. Changing the bias weight $W_{0,i}$ moves the threshold location.
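As a rough sketch, the three activation functions can be written as follows. The function names are chosen for this example only. The derivative helper relies on the standard identity $g'(x) = g(x)(1 - g(x))$ for the logistic function, which is what makes the sigmoid convenient when differentiability is needed.

```python
import math

def threshold(x):
    """Step function: 1 when the input is positive, 0 otherwise."""
    return 1 if x > 0 else 0

def sigmoid(x):
    """Logistic function: a soft threshold at zero."""
    return 1.0 / (1.0 + math.exp(-x))

def sign(x):
    """Sign function: +1 when the input is positive, otherwise -1."""
    return 1 if x > 0 else -1

def sigmoid_derivative(x):
    """Derivative of the logistic function, g'(x) = g(x) * (1 - g(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)

def fires(x, bias_weight):
    """Threshold unit with one input of weight 1 and fixed input a_0 = -1
    (an assumed convention): the step moves from 0 to bias_weight."""
    return threshold(x - bias_weight)
```

For instance, `fires(0.4, 0.5)` is 0 while `fires(0.6, 0.5)` is 1, which illustrates how the bias weight relocates the threshold.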

Threshold Activation Function

Input edges, each with a weight (positive or negative; the weights change over time through learning). The step function is shown for $\theta_i = 0$ and for $\theta_i = t$, where $\theta_i$ is the threshold value associated with unit $i$.

Implementing Boolean Functions

Units with a threshold activation function can act as logic gates; we can use these units to compute Boolean functions of their inputs. A threshold unit is activated when $\sum_{j=1}^{n} W_{j,i}\, a_j > W_{0,i}$.

Boolean AND

| input x1 | input x2 | output |
|---|---|---|
| 0 | 0 | 0 |
| 0 | 1 | 0 |
| 1 | 0 | 0 |
| 1 | 1 | 1 |

Weights: $w_1 = 1$, $w_2 = 1$, $w_0 = 1.5$. The threshold unit is activated when $x_1 + x_2 > 1.5$.

Boolean OR

| input x1 | input x2 | output |
|---|---|---|
| 0 | 0 | 0 |
| 0 | 1 | 1 |
| 1 | 0 | 1 |
| 1 | 1 | 1 |

Weights: $w_1 = 1$, $w_2 = 1$, $w_0 = 0.5$. The threshold unit is activated when $x_1 + x_2 > 0.5$.

Inverter

| input x1 | output |
|---|---|
| 0 | 1 |
| 1 | 0 |

Weights: $w_1 = -1$, $w_0 = -0.5$. The threshold unit is activated when $-x_1 > -0.5$.

So, units with a threshold activation function can act as logic gates given the appropriate input and bias weights.
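The three gates can be checked with a short sketch. The helper name `threshold_unit` is hypothetical. The AND and OR weights are taken from the slides, while the inverter's bias weight of -0.5 is an assumption (the usual textbook value), since that number is damaged in the transcript. The unit is assumed to fire when the weighted sum of its inputs exceeds the bias weight $w_0$.

```python
def threshold_unit(inputs, weights, w0):
    """Return 1 when sum_j w_j * x_j > w0, otherwise 0.

    Equivalent to a step activation with fixed input a_0 = -1 and bias weight w0.
    """
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > w0 else 0

def AND(x1, x2):
    return threshold_unit([x1, x2], [1, 1], 1.5)

def OR(x1, x2):
    return threshold_unit([x1, x2], [1, 1], 0.5)

def NOT(x1):
    # w0 = -0.5 is an assumed value; any w0 in [-1, 0) gives the same truth table.
    return threshold_unit([x1], [-1], -0.5)

# Enumerate the truth tables for the two-input gates.
and_table = [AND(a, b) for a in (0, 1) for b in (0, 1)]
or_table = [OR(a, b) for a in (0, 1) for b in (0, 1)]
```

Only the bias weight differs between AND and OR, which shows how moving the threshold changes the function that the same unit computes.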

Network Structures

Acyclic or feed-forward networks (our focus):

- Activation flows from the input layer to the output layer.
- Single-layer perceptrons
- Multi-layer perceptrons
- Feed-forward networks implement functions and have no internal state (only weights).

Recurrent networks:

- Feed their outputs back into their own inputs.
- The network is a dynamical system (stable state, oscillations, chaotic behavior).
- The response of the network depends on its initial state.
- Can support short-term memory.
- More difficult to understand.

Recurrent Networks

Recurrent networks can capture internal state (activation keeps going around), which allows more complex agents. The brain cannot be just a feed-forward network: it has many feedback connections and cycles, so the brain is a recurrent network!

Two key examples: Hopfield networks and Boltzmann machines.

Hopfield Networks

A Hopfield neural network is typically used for pattern recognition. Hopfield networks have symmetric weights ($W_{i,j} = W_{j,i}$), and unit outputs are 0/1 only.

The weights are trained to obtain an associative memory: e.g., template patterns are stored as multiple stable states, and given a new input pattern, the network converges to one of the exemplar patterns.

It can be proven that an $N$-unit Hopfield net can learn up to $0.138N$ patterns reliably.

Note: there is no explicit storage; everything is in the weights!

Hopfield Networks

The user trains the network with a set of black-and-white templates (input units: 100 pixels; output units: 100 pixels). For each template, each neuron in the network (corresponding to one pixel) learns to turn itself on or off based on the current output of every other neuron in the network. After training, the network can be provided with an arbitrary input pattern, and it may converge to an output pattern resembling whichever template most closely matches this input pattern.

http://www.cbu.edu/~pong/ai/hopfield/hopfieldapplet.html
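A very small associative memory along these lines might look like the sketch below. The slides do not say how the weights are trained, so the Hebbian outer-product rule used here is an assumption, as are the internal ±1 encoding of the 0/1 outputs, the fixed update order, the 8-unit patterns, and the function names `train` and `recall`.

```python
def to_bipolar(pattern):
    """Map 0/1 outputs to -1/+1 for the weight computation (assumed encoding)."""
    return [2 * v - 1 for v in pattern]

def to_binary(state):
    """Map -1/+1 states back to 0/1 outputs."""
    return [(v + 1) // 2 for v in state]

def train(patterns):
    """Hebbian outer-product rule (an assumption): symmetric weights, zero diagonal."""
    n = len(patterns[0])
    W = [[0] * n for _ in range(n)]
    for p in patterns:
        b = to_bipolar(p)
        for i in range(n):
            for j in range(n):
                if i != j:
                    W[i][j] += b[i] * b[j]
    return W

def recall(W, pattern, max_sweeps=100):
    """Update units one at a time, in index order, until no unit changes."""
    s = to_bipolar(pattern)
    n = len(s)
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            h = sum(W[i][j] * s[j] for j in range(n))
            new_state = 1 if h >= 0 else -1
            if new_state != s[i]:
                s[i] = new_state
                changed = True
        if not changed:
            break
    return to_binary(s)

templates = [[1, 1, 1, 1, 0, 0, 0, 0], [1, 0, 1, 0, 1, 0, 1, 0]]
W = train(templates)
noisy = [0, 1, 1, 1, 0, 0, 0, 0]  # first template with one pixel flipped
restored = recall(W, noisy)
```

Nothing in `W` stores a template explicitly; the patterns exist only as stable states of the update rule, which is the point made on the previous slide.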

Hopfield Networks (applet demonstration, http://www.cbu.edu/~pong/ai/hopfield/hopfieldapplet.html)

[Figure: a given input pattern and the pattern the network converges to after around 500 iterations.]

Boltzmann Machines

A generalization of Hopfield networks:

- Hidden neurons: Boltzmann machines have hidden units.
- Neuron update: stochastic activation functions.

Both Hopfield and Boltzmann networks can solve optimization problems (similar to Monte Carlo methods). We will not cover these networks.

Feed-forward Network: Represents a Function of Its Input

Architecture: two input units, two hidden units, and one output unit (bias unit omitted for simplicity). Each unit receives input only from units in the immediately preceding layer. Note: the input layer in general does not include computing units.

Given an input vector $x = (x_1, x_2)$, the activations of the input units are set to the values of the input vector, i.e., $(a_1, a_2) = (x_1, x_2)$, and the network computes the output activation by applying $g$ to weighted sums, layer by layer.

A feed-forward network computes a parameterized family of functions $h_W(x)$; the weights are the parameters of the function. By adjusting the weights we get different functions: that is how learning is done in neural networks!
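The computation for this small network can be sketched as follows. The unit numbering (inputs 1 and 2, hidden units 3 and 4, output unit 5) and the use of a sigmoid for $g$ are assumptions for the example; the bias unit is omitted, as on the slide.

```python
import math

def g(x):
    """Sigmoid activation (assumed choice of g)."""
    return 1.0 / (1.0 + math.exp(-x))

def h(W, x):
    """Compute h_W(x) for a 2-2-1 feed-forward network.

    W maps a link (from_unit, to_unit) to its weight.
    """
    a1, a2 = x                                   # input units copy the input vector
    a3 = g(W[(1, 3)] * a1 + W[(2, 3)] * a2)      # hidden unit 3
    a4 = g(W[(1, 4)] * a1 + W[(2, 4)] * a2)      # hidden unit 4
    return g(W[(3, 5)] * a3 + W[(4, 5)] * a4)    # output unit 5

links = [(1, 3), (2, 3), (1, 4), (2, 4), (3, 5), (4, 5)]
zero_weights = {link: 0.0 for link in links}
other_weights = dict(zero_weights)
other_weights[(3, 5)] = 1.0
other_weights[(4, 5)] = 1.0

out_zero = h(zero_weights, (0.0, 0.0))    # every weighted sum is 0, so g(0)
out_other = h(other_weights, (0.0, 0.0))  # hidden units give 0.5 each, so g(1.0)
```

The same input produces different outputs under the two weight settings, which is the sense in which the weights parameterize a family of functions.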

Feed-forward Network (contd.)

A neural network can be used for classification or regression.

- For Boolean classification with continuous outputs (e.g., with sigmoid units): typically a single output unit is used (a value > 0.5 indicates one class).
- For k-way classification: one could divide the single output unit's range into k portions, but typically k separate output units are used, with the value of each one representing the relative likelihood of that class given the current input.
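Both decision rules are simple to state in code. In this sketch the function names are hypothetical, and picking the output unit with the largest value is one plausible reading of "relative likelihood", not something the slides spell out.

```python
def classify_boolean(output, cutoff=0.5):
    """Single sigmoid output unit: a value above the cutoff indicates one class."""
    return output > cutoff

def classify_k_way(outputs):
    """k output units: return the index of the unit with the largest value."""
    return max(range(len(outputs)), key=lambda k: outputs[k])

is_positive = classify_boolean(0.7)
predicted_class = classify_k_way([0.1, 0.7, 0.2])
```

With separate output units, each class has its own weights into the output layer instead of sharing sub-ranges of one unit's output.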

Large IBM investment in the next generation of neural nets

"IBM plans 'brain-like' computers", by Jason Palmer, science and technology reporter, BBC News (page last updated at 14:52 GMT, Friday, 21 November 2008): IBM has announced it will lead a US government-funded collaboration to make electronic circuits that mimic brains.

http://news.bbc.co.uk/2/hi/science/nature/7740484.stm
