
Neural Networks Marco Loog

Previously in ‘Statistical Methods’...
- Agents can handle uncertainty by using the methods of probability and decision theory
- But first they must learn their probabilistic theories of the world from experience...

Previously in ‘Statistical Methods’... Key Concepts
- Data: evidence, i.e., instantiation of one or more random variables describing the domain
- Hypotheses: probabilistic theories of how the domain works

Previously in ‘Statistical Methods’... Outline
- Bayesian learning
- Maximum a posteriori and maximum likelihood learning
- Instance-based learning
- Neural networks...

Outline
- Some slides from last week...
- Network structure
- Perceptrons
- Multilayer Feed-Forward Neural Networks
- Learning Networks?

Neural Networks and Games

So First... Neural Networks
- According to Robert Hecht-Nielsen, a neural network is simply “a computing system made up of a number of simple, highly interconnected processing elements, which process information by their dynamic state response to external inputs”
- Simply... we skip the biology for now and provide the bare basics

Network Structure
- Input units
- Hidden units
- Output units

Network Structure
- Feed-forward networks
- Recurrent networks: feedback from output units to the input

Feed-Forward Network
- A feed-forward network = a parameterized family of nonlinear functions
- g is the activation function
- The Ws are the weights to be adapted, i.e., adapting them is the learning
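To make the “parameterized family of nonlinear functions” concrete, here is a minimal sketch in Python/NumPy. The layer sizes, the sigmoid choice for g, and the function names are illustrative assumptions of mine, not the lecture's notation.

```python
import numpy as np

def g(a):
    """Activation function; a sigmoid is assumed here."""
    return 1.0 / (1.0 + np.exp(-a))

def feed_forward(x, W1, W2):
    """Single-hidden-layer feed-forward network: output = g(W2 @ g(W1 @ x))."""
    hidden = g(W1 @ x)      # hidden-unit activations
    return g(W2 @ hidden)   # output-unit activations

# Example with 2 inputs, 3 hidden units, 1 output; W1 and W2 are the parameters
# that learning would adapt (random here, purely for illustration).
rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 2))
W2 = rng.normal(size=(1, 3))
print(feed_forward(np.array([0.5, -1.0]), W1, W2))
```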

Activation Functions
- Often have the form of a step function [a threshold] or a sigmoid
- N.B. thresholding = a ‘degenerate’ sigmoid
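As a small illustration (my own code, not from the slides): a step function and a sigmoid, where making the sigmoid steeper shows the “degenerate sigmoid” remark in action.

```python
import numpy as np

def step(a, threshold=0.0):
    """Hard threshold: fire (1) when the weighted input reaches the threshold."""
    return np.where(a >= threshold, 1.0, 0.0)

def sigmoid(a, slope=1.0):
    """Logistic sigmoid; as the slope grows it approaches the step function."""
    return 1.0 / (1.0 + np.exp(-slope * a))

a = np.linspace(-2.0, 2.0, 5)
print(step(a))               # [0. 0. 1. 1. 1.]
print(sigmoid(a, slope=50))  # near 0/1 everywhere except around a = 0
```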

Perceptrons
- Single-layer neural network
- Expressiveness: a perceptron with g = step function can learn AND, OR, NOT, and majority, but not XOR
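A quick illustration of the expressiveness claim; the weights and thresholds below are my own illustrative choices. A single threshold unit realizes AND and OR, while no choice of weights and threshold realizes XOR, because XOR is not linearly separable.

```python
def threshold_unit(x1, x2, w1, w2, bias):
    """Step-function perceptron on two inputs."""
    return 1 if w1 * x1 + w2 * x2 + bias >= 0 else 0

# Illustrative weights (an assumption, not taken from the slides)
AND = lambda x1, x2: threshold_unit(x1, x2, 1.0, 1.0, -1.5)
OR = lambda x1, x2: threshold_unit(x1, x2, 1.0, 1.0, -0.5)

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "AND:", AND(x1, x2), "OR:", OR(x1, x2))
```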

Learning in Sigmoid Perceptrons
- The idea is to adjust the weights so as to minimize some measure of error on the training set
- Learning is optimization of the weights
- This can be done using general optimization routines for continuous spaces

Learning in Sigmoid Perceptrons
- The idea is to adjust the weights so as to minimize some measure of error on the training set
- The error measure most often used for NNs is the sum of squared errors

Learning in Sigmoid Perceptrons
- The error measure most often used for NNs is the sum of squared errors
- Perform the optimization search by gradient descent
- Weight update rule [scaled by the learning rate]
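The slide's update-rule equation did not survive the transcript. As a hedged reconstruction, the usual sum-of-squared-errors gradient step for a sigmoid perceptron is w_j ← w_j + η · (y − h(x)) · h(x)(1 − h(x)) · x_j, with η the learning rate. Below is a minimal sketch; the function names, data, and learning rate are my own choices.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def train_sigmoid_perceptron(X, y, lr=0.5, epochs=2000, seed=0):
    """Gradient descent on the sum of squared errors, one example at a time."""
    rng = np.random.default_rng(seed)
    X = np.hstack([X, np.ones((len(X), 1))])  # append a constant bias input
    w = rng.normal(scale=0.1, size=X.shape[1])
    for _ in range(epochs):
        for x, target in zip(X, y):
            h = sigmoid(w @ x)
            # w_j <- w_j + lr * (target - h) * h * (1 - h) * x_j
            w += lr * (target - h) * h * (1 - h) * x
    return w

# Example: OR is linearly separable, so the outputs approach 0, 1, 1, 1
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0.0, 1.0, 1.0, 1.0])
w = train_sigmoid_perceptron(X, y)
print(np.round(sigmoid(np.hstack([X, np.ones((4, 1))]) @ w), 2))
```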

Simple Comparison

Some Remarks
- The [thresholded] perceptron learning rule converges to a consistent function for any linearly separable data set
- The [sigmoid] perceptron output can be interpreted as a conditional probability
- An interpretation in terms of maximum likelihood [ML] estimation is also possible

Multilayer Feed-Forward NN
- Network with hidden units
- Adding hidden layers enlarges the hypothesis space
- Most common: a single hidden layer

Expressiveness
- 2-input perceptron
- 2-input single-hidden-layer neural network [by ‘adding’ perceptron outputs]

Expressiveness
- With a single, sufficiently large, hidden layer it is possible to approximate any continuous function
- With two layers, discontinuous functions can be approximated as well
- For particular networks it is hard to say what exactly can be represented

Learning in Multilayer NN
- Back-propagation is used to perform the weight updates in the network
- Similar to perceptron learning
- Major difference: the error at the output is clear, but how do we measure the error at the nodes in the hidden layers?
- Additionally, we should deal with multiple outputs

Learning in Multilayer NN
- At the output layer the weight-update rule is similar to the perceptron rule [but now for multiple outputs i]
- Idea of back-propagation: every hidden unit contributes some fraction to the error of the output node to which it connects

Learning in Multilayer NN
- [...] contributes some fraction to the error of the output node to which it connects
- Thus the errors are divided according to connection strength [i.e., the weights]
- Update rule: [see the reconstruction below]
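The update-rule equations are missing from the transcript. A standard formulation (an assumption about what the slide showed) is: at each output i, Δ_i = (y_i − a_i) · g'(in_i); at each hidden unit j, Δ_j = g'(in_j) · Σ_i W_{j,i} Δ_i, so each hidden unit's share of the output error is weighted by its outgoing connections; weights are then moved by η · (activation of the sending unit) · Δ of the receiving unit. A runnable sketch with sigmoid units follows; the layer sizes, data, and learning rate are illustrative.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def backprop_step(x, y, W1, W2, lr=0.5):
    """One stochastic back-propagation update for a single-hidden-layer network."""
    h = sigmoid(W1 @ x)      # hidden activations
    h_b = np.append(h, 1.0)  # plus a constant bias input to the output layer
    out = sigmoid(W2 @ h_b)  # output activations
    delta_out = (y - out) * out * (1 - out)
    # Hidden errors: each output delta is shared out according to the connection weights
    delta_hid = (W2[:, :-1].T @ delta_out) * h * (1 - h)
    W2 += lr * np.outer(delta_out, h_b)
    W1 += lr * np.outer(delta_hid, x)
    return W1, W2

# Example: XOR (which a single perceptron cannot represent), 3 hidden units.
# Outputs typically move toward 0, 1, 1, 0, depending on the random initialization.
rng = np.random.default_rng(1)
W1, W2 = rng.normal(size=(3, 3)), rng.normal(size=(1, 4))
data = [([0, 0, 1], [0]), ([0, 1, 1], [1]), ([1, 0, 1], [1]), ([1, 1, 1], [0])]
for _ in range(5000):
    for x, y in data:
        W1, W2 = backprop_step(np.array(x, float), np.array(y, float), W1, W2)
for x, _ in data:
    print(np.round(sigmoid(W2 @ np.append(sigmoid(W1 @ np.array(x, float)), 1.0)), 2))
```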

E.g., training curve for 100 restaurant examples: exact fit

Learning NN Structures?
- How to find the best network structure?
- Too big results in ‘lookup table’ behavior / overtraining
- Too small results in ‘undertraining’ / not exploiting the full expressiveness
- Possibility: try different structures and validate using, for example, cross-validation [a sketch follows below]
- But which different structures to consider?
- Start with a fully connected network and remove nodes: optimal brain damage
- Grow larger networks [from smaller ones], e.g., tiling and NEAT
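As one concrete way to do the “try different structures and cross-validate” step, here is a sketch using scikit-learn. The dataset, the candidate hidden-layer sizes, and the hyperparameters are arbitrary choices of mine, not from the lecture.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

# Toy two-class problem; compare networks of increasing hidden-layer size
X, y = make_moons(n_samples=400, noise=0.25, random_state=0)
for n_hidden in (1, 2, 5, 20, 100):
    net = MLPClassifier(hidden_layer_sizes=(n_hidden,), max_iter=2000, random_state=0)
    scores = cross_val_score(net, X, y, cv=5)   # 5-fold cross-validation accuracy
    print(n_hidden, "hidden units:", round(scores.mean(), 3))
```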

Learning NN Structures: a topic for a later lecture?

Finally, Some Remarks
- NN = a possibly complex nonlinear function with many parameters that have to be tuned
- Problems: slow convergence, local minima
- Back-propagation was explained, but other optimization schemes are possible
- A perceptron can handle linearly separable functions
- A multilayer NN can represent any kind of function
- It is hard to come up with the optimal network
- Learning rate, initial weights, etc. have to be set
- NN: not much magic there... “Keine Hekserei, nur Behändigkeit!” [“No sorcery, just dexterity!”]

And with that Disappointing Message... We take a break...