Download presentation

Presentation is loading. Please wait.

Published byAlexandra Lord Modified over 4 years ago

1
INTRODUCTION TO Machine Learning ETHEM ALPAYDIN © The MIT Press, 2004 alpaydin@boun.edu.tr http://www.cmpe.boun.edu.tr/~ethem/i2ml Lecture Slides for

2
CHAPTER 11: Multilayer Perceptrons

3
Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 3 Neural Networks Networks of processing units (neurons) with connections (synapses) between them Large number of neurons: 10 10 Large connectitivity: 10 5 Parallel processing Distributed computation/memory Robust to noise, failures

4
Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 4 Understanding the Brain Levels of analysis (Marr, 1982) 1. Computational theory 2. Representation and algorithm 3. Hardware implementation Reverse engineering: From hardware to theory Parallel processing: SIMD vs MIMD Neural net: SIMD with modifiable local memory Learning: Update by training/experience

5
Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 5 Perceptron (Rosenblatt, 1962)

6
Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 6 What a Perceptron Does Regression: y=wx+w 0 Classification: y=1(wx+w 0 >0) w w0w0 y x x 0 =+1 w w0w0 y x s w0w0 y x

7
Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 7 K Outputs Classification : Regression :

8
Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 8 Training Online (instances seen one by one) vs batch (whole sample) learning: No need to store the whole sample Problem may change in time Wear and degradation in system components Stochastic gradient-descent: Update after a single pattern Generic update rule (LMS rule):

9
Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 9 Training a Perceptron: Regression Regression (Linear output):

10
Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 10 Classification Single sigmoid output K>2 softmax outputs

11
Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 11 Learning Boolean AND

12
Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 12 XOR No w 0, w 1, w 2 satisfy: (Minsky and Papert, 1969)

13
Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 13 Multilayer Perceptrons (Rumelhart et al., 1986)

14
Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 14 x 1 XOR x 2 = (x 1 AND ~x 2 ) OR (~x 1 AND x 2 )

15
Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 15 Backpropagation

16
Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 16 Regression Forward Backward x

17
Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 17 Regression with Multiple Outputs zhzh v ih yiyi xjxj w hj

18
Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 18

19
Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 19

20
Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 20 whx+w0whx+w0 zhzh vhzhvhzh

21
Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 21 Two-Class Discrimination One sigmoid output y t for P(C 1 |x t ) and P(C 2 |x t ) ≡ 1-y t

22
Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 22 K>2 Classes

23
Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 23 Multiple Hidden Layers MLP with one hidden layer is a universal approximator (Hornik et al., 1989), but using multiple layers may lead to simpler networks

24
Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 24 Improving Convergence Momentum Adaptive learning rate

25
Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 25 Overfitting/Overtraining Number of weights: H (d+1)+(H+1)K

26
Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 26

27
Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 27 Structured MLP (Le Cun et al, 1989)

28
Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 28 Weight Sharing

29
Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 29 Hints Invariance to translation, rotation, size Virtual examples Augmented error: E’=E+ λ h E h If x’ and x are the “same”: E h =[g(x| θ )- g(x’| θ )] 2 Approximation hint: (Abu-Mostafa, 1995)

30
Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 30 Tuning the Network Size Destructive Weight decay: Constructive Growing networks (Ash, 1989) (Fahlman and Lebiere, 1989)

31
Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 31 Bayesian Learning Consider weights w i as random vars, prior p(w i ) Weight decay, ridge regression, regularization cost=data-misfit + λ complexity

32
Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 32 Dimensionality Reduction

33
Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 33

34
Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 34 Learning Time Applications: Sequence recognition: Speech recognition Sequence reproduction: Time-series prediction Sequence association Network architectures Time-delay networks (Waibel et al., 1989) Recurrent networks (Rumelhart et al., 1986)

35
Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 35 Time-Delay Neural Networks

36
Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 36 Recurrent Networks

37
Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1) 37 Unfolding in Time

Similar presentations

© 2019 SlidePlayer.com Inc.

All rights reserved.

To make this website work, we log user data and share it with processors. To use this website, you must agree to our Privacy Policy, including cookie policy.

Ads by Google