Hand-written character recognition
Presentation on theme: "Hand-written character recognition"— Presentation transcript:

1 Hand-written character recognition
MNIST: a data set of hand-written digits
- 60,000 training samples
- 10,000 test samples
- Each sample consists of 28 x 28 = 784 pixels
Various techniques have been tried (failure rate on test samples):
- Linear classifier: %
- 2-layer BP net (300 hidden nodes): %
- 3-layer BP net ( hidden nodes): %
- Support vector machine (SVM): %
- Convolutional net: %
- 6-layer BP net (7500 hidden nodes): %

2 Hand-written character recognition
Our own experiment: BP learning with a 784-300-10 architecture
- Total # of weights: 784*300 + 300*10 = 238,200
- Total # of Δw computed for each epoch: ≈1.4*10^10
- Ran 1 month before it stopped
- Test error rate: 5.0%
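A quick check of the parameter arithmetic above, assuming the fully connected 784-300-10 net implied by the 238,200 weight total (biases not counted):

```python
# weights of a fully connected 784-300-10 net (input->hidden, hidden->output)
weights = 784 * 300 + 300 * 10
print(weights)            # 238200

# one Δw per weight per training sample, 60,000 samples per epoch
updates_per_epoch = weights * 60_000
print(updates_per_epoch)  # 14292000000, i.e. about 1.4 * 10^10
```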

3 (figure slide; no transcript text)

4 Risk-Averting Error Function
Mean Squared Error (MSE)
Risk-Averting Error (RAE)
James Ting-Ho Lo. Convexification for data fitting. Journal of Global Optimization, 46(2):307–315, February 2010.
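The two error functions on this slide were images and did not survive extraction; following the cited Lo (2010) paper, with ε_k(w) the fitting error on sample k of K, they can be written roughly as:

```latex
% MSE over K samples with per-sample error \epsilon_k(w)
Q(w) = \frac{1}{K} \sum_{k=1}^{K} \epsilon_k^2(w)

% Risk-averting error with risk-sensitivity index \lambda > 0
J_\lambda(w) = \sum_{k=1}^{K} \exp\!\left(\lambda\, \epsilon_k^2(w)\right)
```

The exponential weighting penalizes large individual errors far more heavily than the MSE does, which is what convexifies the fitting landscape for sufficiently large λ.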

5 Normalized Risk-Averting Error
Normalized Risk-Averting Error (NRAE) It can be simplified as
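The NRAE formula and its simplification were also images in the original slide; based on the RAE J_λ(w) from the previous slide and Lo (2010), they are roughly:

```latex
% NRAE: a normalized, log-domain version of the RAE
C_\lambda(w) = \frac{1}{\lambda} \ln\!\left( \frac{1}{K} J_\lambda(w) \right)
             = \frac{1}{\lambda} \ln\!\left( \frac{1}{K} \sum_{k=1}^{K}
               \exp\!\left(\lambda\, \epsilon_k^2(w)\right) \right)

% Simplified (numerically stable) log-sum-exp form, with
% \epsilon_{\max}^2 = \max_k \epsilon_k^2(w):
C_\lambda(w) = \epsilon_{\max}^2 + \frac{1}{\lambda}
               \ln\!\left( \frac{1}{K} \sum_{k=1}^{K}
               \exp\!\left(\lambda\,\big(\epsilon_k^2(w) - \epsilon_{\max}^2\big)\right) \right)
```

The second form avoids overflow of exp(λ ε_k²) when λ is large, which is why the NRAE can be evaluated where the raw RAE cannot.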

6 The Broyden-Fletcher-Goldfarb-Shanno (BFGS) Method
A quasi-Newton method for solving nonlinear optimization problems
Uses first-order gradient information to build an approximation to the Hessian (second-order derivative) matrix
Avoiding the calculation of the exact Hessian matrix significantly reduces the computational cost of the optimization

7 The Broyden-Fletcher-Goldfarb-Shanno (BFGS) Method
The BFGS Algorithm:
1. Generate an initial guess x_0 and an initial approximate inverse Hessian matrix H_0 (commonly H_0 = I).
2. Obtain a search direction p_k at step k by computing p_k = −H_k ∇f(x_k), where ∇f(x_k) is the gradient of the objective function evaluated at x_k.
3. Perform a line search to find an acceptable stepsize α_k in the direction p_k, then update x_{k+1} = x_k + α_k p_k.

8 The Broyden-Fletcher-Goldfarb-Shanno (BFGS) Method
4. Set s_k = x_{k+1} − x_k and y_k = ∇f(x_{k+1}) − ∇f(x_k).
5. Update the approximate inverse Hessian matrix by
H_{k+1} = (I − ρ_k s_k y_kᵀ) H_k (I − ρ_k y_k s_kᵀ) + ρ_k s_k s_kᵀ, where ρ_k = 1 / (y_kᵀ s_k).
Repeat steps 2-5 until x_k converges to the solution. Convergence can be checked by observing the norm of the gradient, ‖∇f(x_k)‖ ≤ ε.
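The steps above can be sketched in a few dozen lines. This is a minimal illustration on a hypothetical 2-D quadratic (not the talk's 784-dimensional network objective), with a simple backtracking line search standing in for whatever line search the authors used:

```python
def f(x):
    # toy objective with minimum at (1.0, -0.5) -- an illustrative stand-in
    return (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 0.5) ** 2

def grad(x):
    return [2.0 * (x[0] - 1.0), 4.0 * (x[1] + 0.5)]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def matvec(M, v):
    return [dot(row, v) for row in M]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bfgs(x, iters=100, tol=1e-10):
    n = len(x)
    H = [[float(i == j) for j in range(n)] for i in range(n)]  # step 1: H_0 = I
    g = grad(x)
    for _ in range(iters):
        if dot(g, g) ** 0.5 < tol:               # convergence check on ||grad f||
            break
        p = [-v for v in matvec(H, g)]           # step 2: p_k = -H_k grad
        a = 1.0                                  # step 3: backtracking line search
        while f([xi + a * pi for xi, pi in zip(x, p)]) > f(x) + 1e-4 * a * dot(g, p):
            a *= 0.5
        x_new = [xi + a * pi for xi, pi in zip(x, p)]
        g_new = grad(x_new)
        s = [b - c for b, c in zip(x_new, x)]    # step 4: s_k and y_k
        y = [b - c for b, c in zip(g_new, g)]
        ys = dot(y, s)
        if ys > 1e-12:                           # step 5: inverse-Hessian update
            rho = 1.0 / ys
            # A = I - rho * s y^T, then H <- A H A^T + rho * s s^T
            A = [[float(i == j) - rho * s[i] * y[j] for j in range(n)]
                 for i in range(n)]
            At = [[A[j][i] for j in range(n)] for i in range(n)]
            H = matmul(matmul(A, H), At)
            for i in range(n):
                for j in range(n):
                    H[i][j] += rho * s[i] * s[j]
        x, g = x_new, g_new
    return x

x_min = bfgs([0.0, 0.0])
print(x_min)  # close to [1.0, -0.5]
```

The curvature guard `ys > 1e-12` skips updates that would break positive definiteness of H, a standard safeguard when the line search does not enforce the Wolfe conditions.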

9 The Broyden-Fletcher-Goldfarb-Shanno (BFGS) Method
Limited-memory BFGS (L-BFGS) Method:
A variation of the BFGS method
Stores only a few vectors that represent the approximation of the inverse Hessian implicitly
Much lower memory requirement
Well suited for optimization problems with a large number of variables
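A minimal sketch of the idea, using the standard two-loop recursion to apply the implicit inverse Hessian from the m most recent (s, y) pairs. The toy objective and all names are illustrative assumptions, not taken from the talk:

```python
def f(x):
    # toy objective with minimum at (1.0, -0.5)
    return (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 0.5) ** 2

def grad(x):
    return [2.0 * (x[0] - 1.0), 4.0 * (x[1] + 0.5)]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def two_loop_direction(g, pairs):
    # computes -H_k * g implicitly from the stored (s, y) pairs
    q = list(g)
    alphas = []
    for s, y in reversed(pairs):             # newest pair first
        rho = 1.0 / dot(y, s)
        alpha = rho * dot(s, q)
        q = [qi - alpha * yi for qi, yi in zip(q, y)]
        alphas.append((rho, alpha))
    if pairs:                                # initial scaling H_0 = gamma * I
        s, y = pairs[-1]
        gamma = dot(s, y) / dot(y, y)
    else:
        gamma = 1.0
    r = [gamma * qi for qi in q]
    for (s, y), (rho, alpha) in zip(pairs, reversed(alphas)):  # oldest first
        beta = rho * dot(y, r)
        r = [ri + (alpha - beta) * si for ri, si in zip(r, s)]
    return [-ri for ri in r]

def lbfgs(x, m=5, iters=100, tol=1e-10):
    pairs = []                               # at most m (s, y) pairs, not an n x n matrix
    g = grad(x)
    for _ in range(iters):
        if dot(g, g) ** 0.5 < tol:
            break
        p = two_loop_direction(g, pairs)
        a = 1.0                              # backtracking line search
        while f([xi + a * pi for xi, pi in zip(x, p)]) > f(x) + 1e-4 * a * dot(g, p):
            a *= 0.5
        x_new = [xi + a * pi for xi, pi in zip(x, p)]
        g_new = grad(x_new)
        s = [b - c for b, c in zip(x_new, x)]
        y = [b - c for b, c in zip(g_new, g)]
        if dot(y, s) > 1e-12:
            pairs.append((s, y))
            if len(pairs) > m:               # discard the oldest pair
                pairs.pop(0)
        x, g = x_new, g_new
    return x

x_min = lbfgs([0.0, 0.0])
print(x_min)  # close to [1.0, -0.5]
```

For n variables, full BFGS stores an n x n matrix (n = 238,200 here would need roughly 450 GB at double precision), while L-BFGS stores only 2m vectors of length n, which is what makes it practical for networks of this size.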

10 References
J. T. Lo and D. Bassu. An adaptive method of training multilayer perceptrons. In Proceedings of the 2001 International Joint Conference on Neural Networks, volume 3, pages 2013–2018, July 2001.
James Ting-Ho Lo. Convexification for data fitting. Journal of Global Optimization, 46(2):307–315, February 2010.
BFGS:

11 A Notch Function

12 MSE vs. RAE

13 MSE vs. RAE

