1 Artificial Neural Networks ML 4.6-4.9 Paul Scheible

2 BackPropagation Algorithm
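Only the slide title survives here; the slide presumably showed the weight-update rules from the chapter. As a reference point, below is a minimal sketch of one stochastic-gradient-descent step of backpropagation for a two-layer network of sigmoid units, in the style of the chapter's algorithm. The shapes, names, and learning rate are illustrative assumptions, and bias terms are omitted for brevity.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def backprop_step(x, t, W_h, W_o, eta=0.05):
        """One stochastic-gradient-descent update for a two-layer sigmoid network.

        x   : input vector
        t   : target output vector
        W_h : hidden-layer weights, shape (n_hidden, n_inputs), updated in place
        W_o : output-layer weights, shape (n_outputs, n_hidden), updated in place
        """
        # Forward pass.
        h = sigmoid(W_h @ x)          # hidden-unit activations
        o = sigmoid(W_o @ h)          # output-unit activations

        # Backward pass: delta terms use the sigmoid derivative o*(1-o).
        delta_o = o * (1 - o) * (t - o)            # output-unit error terms
        delta_h = h * (1 - h) * (W_o.T @ delta_o)  # hidden-unit error terms

        # Gradient-descent weight updates.
        W_o += eta * np.outer(delta_o, h)
        W_h += eta * np.outer(delta_h, x)
        return W_h, W_o

Iterating this step over the training examples until the error is acceptably low is the basic algorithm.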

3 Convergence to Local Minima
- Performs well in many practical problems
- Local minima are less troubling than one might think
  - A high-dimensional weight space is unlikely to have a local minimum in every dimension at once
  - Any dimension without a local minimum provides an escape route
  - Starting with small initial weights tends to avoid local minima

4 Convergence to Local Minima
- Still no known method for predicting when local minima will present a problem; common heuristics:
  - Use momentum (see the sketch below)
  - Use stochastic gradient descent
  - Train several networks on the same data from different random initial weights
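A minimal sketch of the momentum heuristic, assuming the gradient has already been computed elsewhere: the update keeps a fraction α of the previous weight change, which helps the search roll through small local minima and across flat regions of the error surface. The function name and default values are illustrative.

    import numpy as np

    def momentum_update(W, grad, prev_delta, eta=0.3, alpha=0.3):
        """Gradient-descent step with momentum:
        Delta w(n) = -eta * dE/dw + alpha * Delta w(n-1)."""
        delta = -eta * grad + alpha * prev_delta
        return W + delta, delta   # updated weights and the step just taken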

5 Representational Power
- Boolean functions: every Boolean function can be represented exactly by a network with one hidden layer (see the XOR sketch below)
- Continuous functions: every bounded continuous function can be approximated arbitrarily well with one hidden layer of sigmoid units
- Arbitrary functions: any function can be approximated to arbitrary accuracy with two hidden layers
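To make the Boolean claim concrete: XOR, which no single-layer perceptron can represent, needs only two hidden threshold units. The weights below are hand-chosen for illustration rather than learned.

    import numpy as np

    def step(z):
        return (z > 0).astype(int)   # threshold unit

    def xor_net(x1, x2):
        """XOR via one hidden layer: h1 = OR(x1,x2), h2 = AND(x1,x2),
        output = h1 AND NOT h2."""
        x = np.array([x1, x2])
        h = step(np.array([[1, 1], [1, 1]]) @ x - np.array([0.5, 1.5]))
        return step(np.array([1, -1]) @ h - 0.5)

    for a in (0, 1):
        for b in (0, 1):
            print(a, b, xor_net(a, b))   # prints the XOR truth table: 0 1 1 0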

6 Other Aspects
- Hypothesis space: continuous (every assignment of real-valued weights is a hypothesis)
- Inductive bias: smooth interpolation between data points
- Able to discover its own internal (hidden-layer) representations
- Avoiding overfitting
  - Weight decay (see the sketch below)
  - Cross-validation: keep the weights that give the lowest error on a held-out validation set
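One standard form of weight decay, sketched below: shrink every weight slightly toward zero on each update, which is equivalent to adding a penalty proportional to the squared weight magnitudes to the error function. The decay factor is an illustrative assumption.

    def weight_decay_update(W, grad, eta=0.3, decay=1e-4):
        """Weight decay: pull each weight toward zero, then take the
        ordinary gradient step. Equivalent to penalizing the error with
        a gamma * sum(w^2) term."""
        W = (1.0 - decay) * W    # decay term shrinks weights toward zero
        W = W - eta * grad       # usual gradient-descent step
        return W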

7 Face Recognition Example

8 Problem
- Identify the orientation of a face (left, right, straight ahead, upward) from an image
- Training set of 640 images at 120x128 pixels, with grey-scale values from 0 to 255
- Images vary in background, clothing, facial expression, and eyewear (sunglasses)

9 Design Choices
- Input encoding
  - Image reduced from 120x128 to 30x32 pixels
  - Each coarse pixel is the average of the corresponding block of fine pixels
  - Pixel values scaled from the range 0-255 down to 0-1
- Output encoding
  - Four output units, one per face orientation (1-of-4 encoding)
  - The network's prediction is the orientation whose unit is most active
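A sketch of the input encoding, assuming (as the ratio 120/30 = 128/32 = 4 suggests) that each coarse pixel averages a 4x4 block of the original image; numpy and the function name are my own choices.

    import numpy as np

    def encode_image(img):
        """Reduce a 120x128 grey-scale image to the 30x32 network input:
        average each 4x4 block of pixels, then scale 0-255 down to 0-1."""
        assert img.shape == (120, 128)
        coarse = img.reshape(30, 4, 32, 4).mean(axis=(1, 3))  # 4x4 block means
        return coarse / 255.0                                 # scale to [0, 1]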

10 Design Choices
- Network graph structure
  - Acyclic (feedforward)
  - Two layers of sigmoid units
  - Hidden units
    - 3 hidden units: about five minutes to train (chosen)
    - 30 hidden units: about one hour to train
- Learning rate η: 0.3
- Momentum α: 0.3

11 Design Choices
- Full gradient descent (weights updated after summing the gradient over all training examples)
- Output-unit weights initialized to small random values
- Input-unit weights initialized to zero, which makes the learned hidden-unit weights easier to visualize
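A sketch of that initialization for the 960-input (30x32), 3-hidden, 4-output network; the ±0.05 range for the random output weights and the RNG seed are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)            # fixed seed, for reproducibility
    n_in, n_hidden, n_out = 30 * 32, 3, 4     # 960 inputs, 3 hidden, 4 outputs

    W_hidden = np.zeros((n_hidden, n_in))                    # input weights: zero
    W_output = rng.uniform(-0.05, 0.05, (n_out, n_hidden))   # small random values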

12 Advanced Topics

13 Alternative Error Functions
- Add a penalty term for the total weight magnitude (see the equation below)
  - Biases learning toward small weight vectors
  - Equivalent to weight decay
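In Mitchell's formulation, the penalized error adds a term proportional to the sum of squared weights to the usual sum of squared output errors:

    E(\vec{w}) \equiv \frac{1}{2} \sum_{d \in D} \sum_{k \in outputs} (t_{kd} - o_{kd})^2
                    + \gamma \sum_{i,j} w_{ji}^2

Gradient descent on this E shrinks each weight by a factor proportional to γ on every iteration, which is exactly the weight-decay update sketched earlier.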

14 Alternative Error Functions
- Add a term for the error in the slope (derivative) of the output, not just its value
  - Requires knowledge of the target function's derivatives at the training points
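Following Mitchell's treatment, the slope term compares the derivatives of target and output with respect to each input component, weighted by a constant μ:

    E(\vec{w}) \equiv \frac{1}{2} \sum_{d \in D} \sum_{k \in outputs}
        \left[ (t_{kd} - o_{kd})^2
             + \mu \sum_{j \in inputs}
               \left( \frac{\partial t_{kd}}{\partial x_{jd}}
                    - \frac{\partial o_{kd}}{\partial x_{jd}} \right)^2 \right]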

15 Alternative Error Functions
- Minimize cross entropy instead of squared error, appropriate when the output is to be interpreted as a probability (see the sketch below)
- Tie weights to one another according to some design constraint on the problem (weight sharing)
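A minimal sketch of the cross-entropy error for sigmoid outputs with 0/1 targets; the eps clamp is my own guard against log(0).

    import numpy as np

    def cross_entropy(t, o, eps=1e-12):
        """Cross-entropy error for outputs o in (0,1) and targets t in {0,1}:
        -(t*log o + (1-t)*log(1-o)), summed over examples."""
        o = np.clip(o, eps, 1 - eps)   # avoid log(0)
        return -np.sum(t * np.log(o) + (1 - t) * np.log(1 - o))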

16 Alternative Error Minimization Procedures
- Line search: having chosen a descent direction, search along that line for the step size that minimizes the error (see the sketch below)
- Conjugate gradient: a sequence of line searches in which each new direction is chosen so as not to undo the progress of the previous steps
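A crude backtracking sketch of the line-search idea, assuming an error function E and a search direction already in hand; real implementations use more careful one-dimensional minimization, and all names here (including grad_E in the usage note) are illustrative.

    def line_search(E, w, direction, step=1.0, shrink=0.5, tries=20):
        """Search along `direction` from `w` for a step that lowers E:
        start with a large step and halve it until the error decreases."""
        e0 = E(w)
        while tries > 0:
            w_new = w + step * direction
            if E(w_new) < e0:
                return w_new       # first improving point along the line
            step *= shrink         # step overshot; try a smaller one
            tries -= 1
        return w                   # no improving step found; stay put

    # Usage sketch: w = line_search(E, w, -grad_E(w))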

17 Recurrent Networks
- Directed cyclic graphs: some outputs feed back as inputs at the next time step
- Suited to learning recurrent (time-dependent) functions, e.g. predicting y(t + 1) from the current and earlier inputs x(t), x(t - 1), ...
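A minimal sketch of one time step of an Elman-style recurrent network, where the previous hidden state is fed back alongside the current input; the weight shapes and names are illustrative.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def recurrent_step(x_t, h_prev, W_x, W_h):
        """Hidden state at time t depends on the input at time t AND the
        hidden state at time t-1 -- the feedback edge that makes the
        network graph cyclic."""
        return sigmoid(W_x @ x_t + W_h @ h_prev)

    # Unrolled over a sequence, h carries context forward through time:
    #     for x_t in sequence:
    #         h = recurrent_step(x_t, h, W_x, W_h)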

18 Dynamic Modification of Network Structure
- Cascade-Correlation
  - Start with no hidden units: inputs connected directly to outputs
  - If the error is too great, add a hidden unit, train it, then hold its incoming weights constant and retrain the output weights
  - Keep adding hidden units until the error is acceptable
  - Can easily result in overfitting

19 Pruning
- Start with a deliberately complex network
- Prune nodes and connections that are not needed (see the sketch below)
  - Simple heuristic: remove weights that are close to zero
  - Better: remove the weights or nodes whose removal has the least effect on the output
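A sketch of the simple close-to-zero heuristic; the threshold is an illustrative assumption, and the better criterion on the slide would instead score each connection by its measured effect on the network's output or error.

    import numpy as np

    def prune_small_weights(W, threshold=1e-3):
        """Simplest pruning heuristic: zero out connections whose weights
        are already close to zero, effectively removing them."""
        mask = np.abs(W) >= threshold
        return W * mask, mask   # mask records which connections survive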
