1 On Fine-tuning Convolutional Neural Networks for Smartphone based Ocular Recognition
Ajita Rattani and Reza Derakhshani, Dept. of Computer Science and Electrical Engineering, University of Missouri-Kansas City

2 Introduction An increasing number of portable devices are equipped with capable RGB cameras. In this context, ocular biometrics has gained increasing attention from the research community. Capturing ocular biometrics is tantamount to taking a “selfie” of one's ocular region (Rattani and Derakhshani, 2016)

3 Contd. Ocular images acquired using smartphones exhibit substantial degradations due to adverse lighting conditions, specular reflections, and motion and defocus blur. Low spatial resolution and noise, owing to the optics and small sensor of front-facing mobile cameras, further degrade sample quality. This reduces the recognition accuracy of ocular recognition technology when integrated into smartphones

4 Examples of challenging ocular images captured using the front-facing camera of an iPhone 5s. Factors such as adverse lighting, specular reflections, and motion blur can be observed from left to right

5 Contd. ICIP Competition: substantially lower error rates were reported for deep learning schemes such as deeply coupled autoencoders (Raghavendra and Busch, 2016) and deep sparse filtering (Raja et al., 2016). Another study demonstrated high accuracies for a custom deep-learning CNN model for ocular recognition in the mobile environment (Ahuja et al., 2017). The efficacy of deep learning schemes is, however, very much dependent on the size of the training dataset

6 Contribution We re-purpose existing CNN models, namely VGG and InceptionNet. The pre-trained CNNs were fine-tuned using transfer learning, i.e., the transfer of knowledge from a source task to a target task. Transfer learning can be performed by transferring the learned feature layers from a pretrained CNN to initialize another task (Afridi et al., 2017)

7 Advantages Faster training times are observed than when training a new CNN. Fine-tuning may also be more effective than training the CNN from scratch. The last few layers of the pretrained CNN must be fine-tuned to learn the features specific to the new classification problem

8 Pre-trained CNNs on ImageNet Database
VGG: a stack of convolutional layers is followed by three fully-connected layers; the first two have 4096 channels each, and the third performs 1000-way classification. InceptionNet: a total of 9 Inception modules, which allow pooling and convolution operations with different filter sizes to be performed in parallel, followed by an average pooling layer and a 1000-way fully-connected layer with softmax

9 Fine-tuning All the layers were extracted up to the fully connected layer, followed by additional 1025- and 550-way (equal to the number of classes) fully-connected layers along with softmax. The learning ability of the transferred layers of these pre-trained networks was set to false, while the weight learning rate of the newly added fully-connected layers was set high
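The freeze-and-replace-the-head recipe above can be sketched with tf.keras (the slides report a Keras implementation; the ReLU activation, the 224x224 input size, and `weights=None` for a quick offline build are assumptions — in practice `weights="imagenet"` loads the transferred pre-trained filters):

```python
import tensorflow as tf

NUM_CLASSES = 550  # number of subject classes, per the slide

# Transferred layers: the VGG-16 convolutional stack without its original
# 1000-way ImageNet head. Pass weights="imagenet" to load the pre-trained
# filters; weights=None builds the same architecture without a download.
base = tf.keras.applications.VGG16(weights=None, include_top=False,
                                   input_shape=(224, 224, 3), pooling="avg")
base.trainable = False  # learning ability of transferred layers set to false

# Newly added fully-connected layers: 1025-way, then NUM_CLASSES-way softmax.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(1025, activation="relu"),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])

# A forward pass on a dummy batch yields NUM_CLASSES-way class probabilities.
probs = model(tf.zeros((1, 224, 224, 3)))
```

Only the two newly added Dense layers contribute trainable weights; everything transferred from VGG-16 stays fixed during fine-tuning.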

10 ROI considered for Ocular Recognition from VISOB dataset
The Keras implementation and its weight files are used.
Weight learning rate factor: 10
Optimizer: RMSprop
Input batch size: 128
Epochs: 30
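The hyperparameters listed above map onto Keras `compile`/`fit` calls roughly as below; a tiny stand-in classification head and random data (both assumptions, not the paper's setup) are used so the calls can run without the full VGG-16 and the VISOB images, and the slide's 10x weight-learning-rate factor on the new layers is not reproduced here:

```python
import numpy as np
import tensorflow as tf

# Hyperparameters from the slide.
BATCH_SIZE = 128
EPOCHS = 30
NUM_CLASSES = 550

# Stand-in for the fine-tuned network: a single softmax head on top of
# 512-dim pooled CNN features (in the real setup this is the fine-tuned
# VGG-16 or InceptionNet from the previous slides).
inputs = tf.keras.Input(shape=(512,))
outputs = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(inputs)
model = tf.keras.Model(inputs, outputs)

model.compile(optimizer=tf.keras.optimizers.RMSprop(),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Random features/labels just to drive the training loop.
x = np.random.rand(BATCH_SIZE, 512).astype("float32")
y = np.random.randint(0, NUM_CLASSES, size=BATCH_SIZE)
history = model.fit(x, y, batch_size=BATCH_SIZE, epochs=EPOCHS, verbose=0)
```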

11 Fine-tuned Model TPR [%] at FPR = 10^-4
Test results on fine-tuning pre-trained VGG-16 and InceptionNet for ocular recognition, in terms of TPR [%] at FPR = 10^-4, for three different ROIs (Left Eye, Right Eye, and Complete Strip) on Oppo and iPhone devices.
VGG-16: 99.6, 100.0, 87.0
InceptionNet: 99.5, 97.3, 65.0, 32.0
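The operating point reported in these tables, TPR at a fixed FPR, can be computed directly from raw genuine and impostor match scores; a minimal numpy sketch (the function name and the toy scores are illustrative, not from the paper):

```python
import numpy as np

def tpr_at_fpr(genuine_scores, impostor_scores, target_fpr=1e-4):
    """TPR when the accept threshold is chosen so that at most
    target_fpr of impostor comparisons are (falsely) accepted."""
    # Smallest threshold keeping the false-positive rate <= target_fpr.
    threshold = np.quantile(impostor_scores, 1.0 - target_fpr)
    return float(np.mean(genuine_scores > threshold))

# Perfectly separated toy scores: every genuine pair is accepted.
print(tpr_at_fpr(np.full(1000, 0.9), np.full(1000, 0.1)))  # → 1.0
```

At stricter operating points (e.g. 10^-4 vs. 10^-3), the threshold moves higher, so TPR can only stay the same or drop, which is why the two tables report different numbers for the same models.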

12 Results of the existing deep learning schemes and fine-tuned VGG-16 evaluated on the VISOB dataset in terms of TPR [%] at FPR of 10^-3:

Deep Learning Method                                      | Oppo Left | Oppo Right | iPhone Left | iPhone Right
Deeply Coupled Autoencoders (Raghavendra and Busch, 2016) | 93.32     | 93.05      | 91.00       | 91.44
Deep Sparse Filters (Raja et al., 2016)                   | 90.54     | 88.24      | 90.43       | 87.45
Custom CNN (Ahuja et al., 2017)                           | 99.55     | 99.59      | 99.72       | 99.74
Fine-tuned VGG-16                                         | 99.65     | 99.95      | 100.00      |

13 Conclusion VGG-16 fared better than InceptionNet for the ROIs consisting of the left and right eye. These pre-trained networks produced different results, which may be due to feature-representation differences emerging from their unique architectures. The choice of the source CNN impacts the performance of ocular recognition. This work was made possible in part by a gift from EyeVerify Inc.

14 Thank you

