1 On Fine-tuning Convolutional Neural Networks for Smartphone based Ocular Recognition
Ajita Rattani and Reza Derakhshani, Dept. of Computer Science and Electrical Engineering, University of Missouri-Kansas City

2 Introduction An increasing number of portable devices are equipped with capable RGB cameras. In this context, ocular biometrics has gained increasing attention from the research community. Capturing ocular biometrics is tantamount to taking a “selfie” of one's ocular region (Rattani and Derakhshani, 2016)

3 Contd. Ocular images acquired using smartphones exhibit substantial degradations due to adverse lighting conditions, specular reflections, and motion and defocus blur. Low spatial resolution and noise, owing to the optics and small sensor of front-facing mobile cameras, further degrade sample quality. This reduces the recognition accuracy of ocular recognition technology when integrated into smartphones

4 Examples of challenging ocular images captured using the front-facing camera of an iPhone 5s. Factors such as adverse lighting, specular reflections, and motion blur can be observed from left to right

5 Contd. ICIP Competition: substantially lower error rates were reported for deep learning schemes such as deeply coupled autoencoders (Raghavendra and Busch, 2016) and deep sparse filtering (Raja et al., 2016). Another study demonstrated high accuracies for a custom deep-learning CNN model for ocular recognition in the mobile environment (Ahuja et al., 2017). The efficacy of deep learning schemes is, however, very much dependent on the size of the training dataset

6 Contribution We re-purpose existing CNN models, namely VGG and InceptionNet. The pre-trained CNNs were fine-tuned using transfer learning, i.e., the transfer of knowledge from a source task to a target task. Transfer learning can be performed by transferring the learned feature layers from a pretrained CNN to initialize another task (Afridi et al., 2017)

7 Advantages Faster training times are observed than when training a new CNN. Fine-tuning may also be more effective than training the CNN from scratch. The last few layers of the pretrained CNN must be fine-tuned to learn the features specific to the new classification problem

8 Pre-trained CNNs on ImageNet Database
VGG: a stack of convolutional layers is followed by three fully-connected layers; the first two have 4096 channels each, and the third performs 1000-way classification. InceptionNet: a total of 9 Inception modules, which allow pooling and convolution operations with different filter sizes to be performed in parallel, followed by an average pooling layer and a 1000-way fully-connected layer with softmax

9 Fine-tuning All the layers were extracted up to the fully connected layer, followed by additional 1025- and 550-way (equal to the number of classes) fully-connected layers along with softmax. The learning ability of the transferred layers of these pre-trained networks was set to false, while the weight learning rate of the newly added fully-connected layers was set high
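The freeze-and-replace-the-head recipe above can be sketched with tf.keras (the slides report a Keras implementation; the ReLU activation, the 224x224 input size, and `weights=None` for a quick offline build are assumptions — in practice `weights="imagenet"` loads the transferred pre-trained filters):

```python
import tensorflow as tf

NUM_CLASSES = 550  # number of subject classes, per the slide

# Transferred layers: the VGG-16 convolutional stack without its original
# 1000-way ImageNet head. Pass weights="imagenet" to load the pre-trained
# filters; weights=None builds the same architecture without a download.
base = tf.keras.applications.VGG16(weights=None, include_top=False,
                                   input_shape=(224, 224, 3), pooling="avg")
base.trainable = False  # learning ability of transferred layers set to false

# Newly added fully-connected layers: 1025-way, then NUM_CLASSES-way softmax.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(1025, activation="relu"),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])

# A forward pass on a dummy batch yields NUM_CLASSES-way class probabilities.
probs = model(tf.zeros((1, 224, 224, 3)))
```

Only the two newly added Dense layers contribute trainable weights; everything transferred from VGG-16 stays fixed during fine-tuning.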

10 ROI considered for Ocular Recognition from VISOB dataset
The Keras implementation and its weight files are used.
Weight learning rate factor: 10
Optimizer: RMSprop
Input batch size: 128
Epochs: 30
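The hyperparameters listed above map onto Keras `compile`/`fit` calls roughly as below; a tiny stand-in classification head and random data (both assumptions, not the paper's setup) are used so the calls can run without the full VGG-16 and the VISOB images, and the slide's 10x weight-learning-rate factor on the new layers is not reproduced here:

```python
import numpy as np
import tensorflow as tf

# Hyperparameters from the slide.
BATCH_SIZE = 128
EPOCHS = 30
NUM_CLASSES = 550

# Stand-in for the fine-tuned network: a single softmax head on top of
# 512-dim pooled CNN features (in the real setup this is the fine-tuned
# VGG-16 or InceptionNet from the previous slides).
inputs = tf.keras.Input(shape=(512,))
outputs = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(inputs)
model = tf.keras.Model(inputs, outputs)

model.compile(optimizer=tf.keras.optimizers.RMSprop(),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Random features/labels just to drive the training loop.
x = np.random.rand(BATCH_SIZE, 512).astype("float32")
y = np.random.randint(0, NUM_CLASSES, size=BATCH_SIZE)
history = model.fit(x, y, batch_size=BATCH_SIZE, epochs=EPOCHS, verbose=0)
```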

11 Fine-tuned Model TPR [%] at FPR = 10^-4
Test results on fine-tuning pre-trained VGG-16 and InceptionNet for ocular recognition, in terms of TPR [%] at FPR = 10^-4, for three different ROIs (Left Eye, Right Eye, and Complete Strip) on Oppo and iPhone devices.
VGG-16: 99.6, 100.0, 87.0
InceptionNet: 99.5, 97.3, 65.0, 32.0
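The operating point reported in these tables, TPR at a fixed FPR, can be computed directly from raw genuine and impostor match scores; a minimal numpy sketch (the function name and the toy scores are illustrative, not from the paper):

```python
import numpy as np

def tpr_at_fpr(genuine_scores, impostor_scores, target_fpr=1e-4):
    """TPR when the accept threshold is chosen so that at most
    target_fpr of impostor comparisons are (falsely) accepted."""
    # Smallest threshold keeping the false-positive rate <= target_fpr.
    threshold = np.quantile(impostor_scores, 1.0 - target_fpr)
    return float(np.mean(genuine_scores > threshold))

# Perfectly separated toy scores: every genuine pair is accepted.
print(tpr_at_fpr(np.full(1000, 0.9), np.full(1000, 0.1)))  # → 1.0
```

At stricter operating points (e.g. 10^-4 vs. 10^-3), the threshold moves higher, so TPR can only stay the same or drop, which is why the two tables report different numbers for the same models.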

12 Results of the existing deep learning schemes and fine-tuned VGG-16 evaluated on the VISOB dataset in terms of TPR [%] at FPR of 10^-3:

Deep Learning Method                                      | Oppo Left | Oppo Right | iPhone Left | iPhone Right
Deeply Coupled Autoencoders (Raghavendra and Busch, 2016) | 93.32     | 93.05      | 91.00       | 91.44
Deep Sparse Filters (Raja et al., 2016)                   | 90.54     | 88.24      | 90.43       | 87.45
Custom CNN (Ahuja et al., 2017)                           | 99.55     | 99.59      | 99.72       | 99.74
Fine-tuned VGG-16                                         | 99.65     | 99.95      | 100.00      |

13 Conclusion VGG-16 fared better than InceptionNet for the ROIs consisting of the left and right eye. These pre-trained networks produced different results, which may be due to feature-representation differences emerging from their unique architectures. The choice of the source CNN impacts the performance of ocular recognition. This work was made possible in part by a gift from EyeVerify Inc.

14 Thank you

