CS 551/651: Search and “Through the Lens” Lecture 13

Assign 1 Grading
Sign up for a slot to demo to TA
– Sunday upon return from break
– Monday upon return from break

Papers to read during break
Spacetime Constraints
Evolved Virtual Creatures
Neuroanimator

Single-layer networks
Training
Training samples are used to tune the network weights
– Input / output pairs
The network generates an output based on the input (and weights)
The network’s output is compared to the correct output
The error in the output is used to adapt the weights
Repeat the process to minimize errors
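
As a concrete illustration of this loop, here is a minimal sketch of training a single sigmoid unit on a toy dataset; the AND data, weight initialization, and learning rate are illustrative assumptions rather than anything from the lecture.

# Minimal sketch of single-layer training: one sigmoid unit, delta-rule updates.
import math, random

def g(z):                      # sigmoid activation function
    return 1.0 / (1.0 + math.exp(-z))

# Toy training samples: input/output pairs (here, the AND function).
samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

random.seed(0)
w = [random.uniform(-0.5, 0.5) for _ in range(3)]   # two inputs plus a bias weight
alpha = 0.5                                          # learning rate

for epoch in range(2000):
    for (x1, x2), y in samples:
        x = (x1, x2, 1.0)                            # append constant bias input
        in_ = sum(wi * xi for wi, xi in zip(w, x))   # weighted sum of inputs
        out = g(in_)                                 # network output for this sample
        err = y - out                                # compare to the correct output
        gprime = out * (1.0 - out)                   # derivative of the sigmoid at in_
        # Adapt each weight in proportion to its contribution to the error.
        w = [wi + alpha * err * gprime * xi for wi, xi in zip(w, x)]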

Consider error in single-layer neural networks
Sum of squared errors (across training data)
For one sample:
How can we minimize the error?
Set derivative equal to zero (like in Calc 101)
– Solve for weights that make derivative == 0
Is that error affected by each of the weights in the weight vector?
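
In standard notation, the per-sample squared error being referred to, for a network h_w with input x and correct output y, is

E = (1/2) (y − h_w(x))²

and the training error is the sum of these terms over all samples.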

Minimizing the error
What is the derivative?
The gradient,
– Composed of the partial derivatives of the error with respect to each weight

Computing the partial
Remember the Chain Rule:
For a network, h_w, with inputs x and correct output y

Computing the partial
g() = the activation function

Computing the partial
g′() = derivative of the activation function
Chain rule again
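
Carried out for a single unit with weighted input in = Σ_k w_k x_k, the chain rule gives the partial for weight w_j in the standard form:

∂E/∂w_j = ∂/∂w_j [ (1/2)(y − h_w(x))² ]
        = −(y − h_w(x)) · g′(in) · ∂in/∂w_j
        = −(y − h_w(x)) · g′(in) · x_j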

Minimizing the error
Gradient descent
Learning rate
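
Written out, the standard gradient-descent update for each weight, with learning rate α, is

w_j ← w_j − α · ∂E/∂w_j = w_j + α · (y − h_w(x)) · g′(in) · x_j

Each pass over the training data nudges every weight a small step downhill on the error surface.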

Why are modification rules more complicated in multilayer networks?
We can calculate the error of the output neuron by comparing to training data
We could use the previous update rule to adjust W_3,5 and W_4,5 to correct that error
But how do W_1,3, W_1,4, W_2,3, W_2,4 adjust?

Backprop at the output layer
Output layer error is computed as in the single-layer case and weights are updated in the same fashion
Let Err_i be the i-th component of the error vector y − h_W
– Let Δ_i be defined from Err_i as written out below
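
In the standard formulation, the scaled output-layer error and the corresponding weight update are

Δ_i = Err_i · g′(in_i)
W_{j,i} ← W_{j,i} + α · a_j · Δ_i        (a_j is the activation of hidden node j)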

Backprop in the hidden layer
Each hidden node is responsible for some fraction of the error Δ_i in each of the output nodes to which it is connected
Δ_i is divided among all hidden nodes that connect to output i according to their strengths
Error at hidden node j:

Backprop in the hidden layer
Error is:
Correction is:
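
In the standard formulation, the hidden-layer error and its correction are

Error at hidden node j:   Δ_j = g′(in_j) · Σ_i W_{j,i} · Δ_i
Correction:               W_{k,j} ← W_{k,j} + α · a_k · Δ_j        (a_k is the activation feeding node j)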

Summary of backprop
1. Compute the Δ value for the output units using the observed error
2. Starting with the output layer, repeat the following for each layer until done
– Propagate Δ values back to the previous layer
– Update the weights between the two layers
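
To make the two-step summary concrete, here is a small sketch of back-propagation for a single hidden layer using the update rules above; the 2-3-1 network shape, bias handling, and XOR toy data are illustrative assumptions, not details from the lecture.

# Sketch of back-propagation: 2 inputs -> 3 hidden sigmoid units -> 1 sigmoid output,
# with bias inputs, trained on the XOR toy dataset.
import math, random

def g(z):  return 1.0 / (1.0 + math.exp(-z))   # sigmoid activation
def gp(a): return a * (1.0 - a)                # g'(in), written in terms of a = g(in)

random.seed(0)
W1 = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(3)]  # (2 inputs + bias) -> 3 hidden
W2 = [random.uniform(-1, 1) for _ in range(4)]                      # (3 hidden + bias) -> output
alpha = 0.5
samples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

for epoch in range(10000):
    for x, y in samples:
        a_in = (x[0], x[1], 1.0)                                     # inputs plus constant bias
        hidden = [g(sum(W1[j][k] * a_in[k] for k in range(3))) for j in range(3)]
        a_hid = hidden + [1.0]                                       # hidden activations plus bias
        out = g(sum(W2[j] * a_hid[j] for j in range(4)))
        # 1. Compute the delta value for the output unit from the observed error.
        delta_out = (y - out) * gp(out)
        # 2. Propagate delta back to the hidden layer, then update both weight layers.
        delta_hid = [gp(hidden[j]) * W2[j] * delta_out for j in range(3)]
        for j in range(4):
            W2[j] += alpha * a_hid[j] * delta_out
        for j in range(3):
            for k in range(3):
                W1[j][k] += alpha * a_in[k] * delta_hid[j]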

Some general artificial neural network (ANN) info
The entire network is a function g(inputs) = outputs
– These functions frequently have sigmoids in them
– These functions are frequently differentiable
– These functions have coefficients (weights)
Backpropagation networks are simply ways to tune the coefficients of a function so it produces desired output

Function approximation
Consider fitting a line to data
– Coefficients: slope and y-intercept
– Training data: some samples
– Use least-squares fit
This is what an ANN does
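
For comparison with the ANN case, the least-squares line fit has a closed-form solution; a short sketch follows, where the sample data is made up purely for illustration.

# Least-squares line fit: find slope m and intercept b minimizing the sum of squared errors.
import numpy as np

xs = np.array([0.0, 1.0, 2.0, 3.0, 4.0])          # sample inputs (illustrative data)
ys = np.array([0.9, 3.1, 5.2, 6.8, 9.1])          # sample outputs

A = np.column_stack([xs, np.ones_like(xs)])       # design matrix [x, 1]
(m, b), *_ = np.linalg.lstsq(A, ys, rcond=None)   # solve min || A [m, b]^T - y ||^2
print(f"fit: y = {m:.3f} x + {b:.3f}")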

Function approximation
A function of two inputs…
Fit a smooth curve to the available data
– Quadratic
– Cubic
– n-th order
– ANN!

Curve fitting
A neural network should be able to generate the input/output pairs from the training data
You’d like for it to be smooth (and well-behaved) in the voids between the training data
There are risks of overfitting the data

When using ANNs
Sometimes the output layer feeds back into the input layer – recurrent neural networks
The backpropagation will tune the weights
You determine the topology
– Different topologies have different training outcomes (consider overfitting)
– Sometimes a genetic algorithm is used to explore the space of neural network topologies

Through The Lens Camera Control

Controlling virtual camera

Lagrange multipliers
“Lagrange Multipliers without Permanent Scarring”
Dan Klein (now at Stanford)

More complicated example
Maximize a paraboloid subject to a unit circle constraint
Any solution to the maximization problem must sit on x² + y² = 1
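
Stated as a constrained optimization, and picking a concrete paraboloid purely for illustration (not necessarily the one used in the lecture):

maximize    f(x, y) = x² + 2y²          (illustrative choice)
subject to  g(x, y) = x² + y² − 1 = 0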

The central theme of Lagrange Multipliers At the solution points, the isocurve (a.k.a. level curve or contour) of the function to be maximized is tangent to the constraint curve

Tangent Curves
Tangent curves == parallel normals
Create the Lagrangian
Solve for where the gradient = 0 to capture parallel normals, and g(x) must equal 0
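
Concretely, the Lagrangian and the conditions to solve are, worked here for the illustrative f(x, y) = x² + 2y² from above:

L(x, y, λ) = f(x, y) − λ · g(x, y)
∇L = 0  ⇔  ∇f = λ∇g  and  g(x, y) = 0

For f = x² + 2y² and g = x² + y² − 1:
2x = 2λx,   4y = 2λy,   x² + y² = 1
⇒ (x, y) = (±1, 0) with f = 1 (λ = 1), or (x, y) = (0, ±1) with f = 2 (λ = 2)
⇒ the constrained maximum is f = 2 at (0, ±1)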

Go to board for more development