
1 Supervised learning 1. Early learning algorithms 2. First-order gradient methods 3. Second-order gradient methods

2 Early learning algorithms Designed for single-layer neural networks; generally more limited in their applicability. Some of them are: Perceptron learning, LMS or Widrow-Hoff learning, and Grossberg learning.

3 Perceptron learning 1. Randomly initialize all the network weights. 2. Apply inputs and find outputs (feedforward). 3. Compute the errors. 4. Update each weight using the perceptron rule. 5. Repeat steps 2 to 4 until the errors reach a satisfactory level.
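A minimal sketch of these steps in Python, assuming a single-layer perceptron with a hard-limit activation and the standard rule w_new = w_old + e*p (variable names are illustrative, not taken from the slides):

import numpy as np

def train_perceptron(P, T, epochs=100):
    """P: inputs, shape (n_samples, n_features); T: targets in {0, 1}."""
    rng = np.random.default_rng(0)
    w = rng.normal(size=P.shape[1])              # 1. randomly initialize the weights
    b = rng.normal()
    for _ in range(epochs):
        total_error = 0
        for p, t in zip(P, T):
            a = 1.0 if w @ p + b >= 0 else 0.0   # 2. feedforward with hard-limit activation
            e = t - a                            # 3. compute the error
            w += e * p                           # 4. perceptron rule: w <- w + e*p
            b += e
            total_error += abs(e)
        if total_error == 0:                     # 5. stop once the errors are satisfactory
            break
    return w, b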

4 Performance Optimization Gradient-based methods

5 Basic Optimization Algorithm

6 Steepest Descent (first order Taylor expansion)
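As a rough illustration (not reproduced from the slides), the steepest descent update obtained from the first-order Taylor expansion is x_{k+1} = x_k - alpha * g_k, where g_k is the gradient at x_k and alpha is the learning rate; a minimal numeric sketch:

import numpy as np

def steepest_descent(grad, x0, alpha=0.1, iters=50):
    """Iterate x_{k+1} = x_k - alpha * grad(x_k)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        x = x - alpha * grad(x)
    return x

# Example: minimize F(x) = x1^2 + 2*x2^2, whose gradient is [2*x1, 4*x2]
x_min = steepest_descent(lambda x: np.array([2 * x[0], 4 * x[1]]), x0=[1.0, 1.0])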

7 Example

8 Plot

9 LMS or Widrow-Hoff learning First, introduce the ADALINE (ADAptive LInear NEuron) network.

10 LMS or Widrow-Hoff learning, or Delta Rule The ADALINE network has the same basic structure as the perceptron network, but uses a linear activation function.

11 Approximate Steepest Descent

12 Approximate Gradient Calculation

13 LMS Algorithm This algorithm is derived from the steepest descent algorithm, using the approximate gradient computed from the current example.
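A minimal LMS (Widrow-Hoff) sketch, assuming the standard update w <- w + 2*alpha*e*p based on the approximate (instantaneous) gradient of the squared error; names and defaults are illustrative:

import numpy as np

def lms_train(P, T, alpha=0.01, epochs=20):
    """ADALINE with linear output a = w.p + b, trained by the LMS rule."""
    w = np.zeros(P.shape[1])
    b = 0.0
    for _ in range(epochs):
        for p, t in zip(P, T):
            a = w @ p + b            # linear (ADALINE) output
            e = t - a                # error for this single example
            w += 2 * alpha * e * p   # approximate steepest descent step
            b += 2 * alpha * e
    return w, b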

14 Multiple-Neuron Case

15 Difference between perceptron learning and LMS learning DERIVATIVE: the linear activation function has a derivative, but the sign function (bipolar or unipolar) does not.

16 Grossberg learning (associative learning) Sometimes known as instar and outstar training. Updating rule: the target term can be the desired input values (instar training; example: clustering) or the desired output values (outstar training), depending on the network structure. Grossberg network (see Hagan for more details).
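A hedged sketch of an instar-style update, following the common formulation w_new = w_old + alpha*a*(p - w_old) in which the input vector p plays the role of the target; treat the details as an assumption, since the rule itself is not reproduced in this transcript:

import numpy as np

def instar_update(w, p, a, alpha=0.1):
    """Move the weight vector toward the input p, but only when the neuron output a is nonzero."""
    return w + alpha * a * (p - w)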

17 First-order gradient method: Backpropagation

18 Multilayer Perceptron R – S1 – S2 – S3 Network
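For reference, a forward pass through an R – S1 – S2 – S3 network; the layer transfer functions here (sigmoid hidden layers, linear output layer) are an illustrative assumption, not necessarily the ones used on the slides:

import numpy as np

def logsig(n):
    return 1.0 / (1.0 + np.exp(-n))

def forward(p, weights, biases):
    """weights/biases: one (W, b) pair per layer; computes a^m = f^m(W^m a^{m-1} + b^m)."""
    a = p
    for i, (W, b) in enumerate(zip(weights, biases)):
        n = W @ a + b
        a = n if i == len(weights) - 1 else logsig(n)  # linear output layer, sigmoid hidden layers
    return a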

19 Example

20 Elementary Decision Boundaries

21

22 Total Network

23 Function Approximation Example

24 Nominal Response

25 Parameter Variations

26 Multilayer Network

27 Performance Index

28 Chain Rule

29 Gradient Calculation

30 Steepest Descent

31 Jacobian Matrix

32 Backpropagation (Sensitivities)

33 Initialization (Last Layer)

34 Summary
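For reference, in Hagan's standard notation (which this presentation appears to follow; the exact slide equations are not reproduced in this transcript), the summary equations are typically:

Forward: $a^0 = p$, $\quad a^{m+1} = f^{m+1}(W^{m+1} a^m + b^{m+1})$, $\quad a = a^M$.
Sensitivities: $s^M = -2\,\dot{F}^M(n^M)\,(t - a)$, $\quad s^m = \dot{F}^m(n^m)\,(W^{m+1})^T s^{m+1}$ for $m = M-1, \dots, 1$.
Update: $W^m \leftarrow W^m - \alpha\, s^m (a^{m-1})^T$, $\quad b^m \leftarrow b^m - \alpha\, s^m$.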

35 Back-propagation training algorithm Backprop adjusts the weights of the NN in order to minimize the network's total mean squared error. Network activation: forward step. Error propagation: backward step.
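A compact sketch of one training iteration (forward step, backward step, weight update) for a two-layer network with a sigmoid hidden layer and a linear output layer; this is an illustrative assumption, not the exact code behind the slides:

import numpy as np

def logsig(n):
    return 1.0 / (1.0 + np.exp(-n))

def backprop_step(p, t, W1, b1, W2, b2, alpha=0.1):
    # forward step (network activation)
    n1 = W1 @ p + b1
    a1 = logsig(n1)
    a2 = W2 @ a1 + b2                      # linear output layer
    # backward step (error propagation via sensitivities)
    e = t - a2
    s2 = -2.0 * e                          # last-layer sensitivity (linear transfer: derivative = 1)
    s1 = (a1 * (1.0 - a1)) * (W2.T @ s2)   # logsig derivative is a*(1-a)
    # approximate steepest descent weight update
    W2 -= alpha * np.outer(s2, a1); b2 -= alpha * s2
    W1 -= alpha * np.outer(s1, p);  b1 -= alpha * s1
    return W1, b1, W2, b2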

36 Example: Function Approximation

37 Network

38 Initial Conditions

39 Forward Propagation

40 Transfer Function Derivatives
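The derivatives used in backpropagation can be written in terms of the layer outputs (these are the standard identities, not copied from the slides): logsig has f'(n) = a(1 - a), tansig has f'(n) = 1 - a^2, and purelin has f'(n) = 1. A tiny sketch:

def logsig_deriv(a):   # a = logsig(n)
    return a * (1.0 - a)

def tansig_deriv(a):   # a = tanh(n)
    return 1.0 - a ** 2

def purelin_deriv(a):  # linear transfer function
    return 1.0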

41 Backpropagation

42 Weight Update

43 Choice of Architecture

44 Choice of Network Architecture

45 Convergence Global minimum (left), local minimum (right)

46 Generalization

47 Disadvantages of the BP algorithm Slow convergence speed; sensitivity to initial conditions; can become trapped in local minima; instability if the learning rate is too large. Note: despite the above disadvantages, it is widely used in the control community, and there are numerous extensions that improve the BP algorithm.

48 Improved BP algorithms (first-order gradient methods) 1. BP with momentum (see the sketch below) 2. Delta-bar-delta 3. Decoupled momentum 4. RProp 5. Adaptive BP 6. Trinary BP 7. BP with adaptive gain 8. Extended BP
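As an example of the first improvement listed above, BP with momentum adds a fraction of the previous weight change to the current one; the sketch below uses a common formulation (gamma is the momentum coefficient), which may differ in detail from the one on the slides:

def momentum_update(w, grad, prev_delta, alpha=0.1, gamma=0.9):
    """delta_k = gamma * delta_{k-1} - (1 - gamma) * alpha * grad; returns (w_new, delta_k)."""
    delta = gamma * prev_delta - (1.0 - gamma) * alpha * grad
    return w + delta, delta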

