Published by Brenda Watson. Modified over 6 years ago.
1
Keystroke Biometric Studies with Short Numeric Input on Smartphones
Michael J. Coakley Advisor: Dr. Charles Tappert
2
Abstract
A classification system, based on the Pace University classification system, was developed to evaluate keystroke biometric smartphone data. Three sets of features were evaluated: mechanical-keyboard-like features (comparable to the features available on mechanical keyboards), keystroke features available only on touchscreens, and combined mechanical and touchscreen features. Subsets of the touchscreen features were also evaluated to determine their relative biometric value.
3
Relevance of Study
Use of mobile devices continues to climb dramatically: there are more mobile phones than people on the planet. Improved technology and capacity mean that more sensitive data is stored on and accessed through mobile devices. Most devices are either secured with a short 4-character PIN or not secured at all. Government interest and support: Defense Advanced Research Projects Agency (DARPA), National Institute of Standards and Technology (NIST), National Science Foundation (NSF).
4
Related Work
Keystroke: Bakelman (dissertation at Pace University); Maxion & Killourhy (CMU); Maiorana; Trojahn & Ortmeier.
Touchscreen: Zheng; Kambourakis; Feng; Alariki.
5
Data Collection
6
Mobile Device Biometric System
Android BioKeyboard: a virtual keypad developed on the Android platform and used as the default keyboard on Android mobile devices. Text entry data is captured on the mobile devices, stored in an SQLite database, and transmitted from the devices to a centralized server.
Mechanical-keyboard-like keystroke features: standard timing data of key press and key release events.
Touchscreen keystroke features: data associated exclusively with the touchscreen (pressure, location, accelerometer, gyroscope).
7
Touchscreen Features
Recorded fields: user, session ID, and time the session began; screen DPI (dots per inch); action (press or release); time of the event in milliseconds; soft key name ("9", for example); screen orientation (holding the phone vertically or horizontally).
8
Touchscreen Features (continued)
Recorded fields: pressure of the key press; {x, y} coordinates of the center of the touch event; feature data extracted from other sensors (accelerometer, gyroscope). Feature types that are available but not included in this study: GPS.
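The press and release timestamps listed above are what the mechanical-keyboard-like features are built from. The slides do not enumerate the individual timing features, so the following is only a minimal sketch under assumptions: the `KeyEvent` record, the `timing_features` function, and the choice of hold ("dwell") and release-to-press ("flight") durations are hypothetical illustrations of commonly used keystroke timings, not the study's actual feature definitions.

```python
from dataclasses import dataclass

@dataclass
class KeyEvent:
    key: str       # soft key name, e.g. "9"
    action: str    # "press" or "release"
    time_ms: int   # time of the event in milliseconds

def timing_features(events):
    """Return (dwell_times, flight_times) in ms for a time-ordered event list.

    Assumes keystrokes do not overlap, so the i-th press pairs with
    the i-th release (an assumption made for this sketch only).
    """
    presses = [e for e in events if e.action == "press"]
    releases = [e for e in events if e.action == "release"]
    n = min(len(presses), len(releases))
    # Dwell: how long each key was held down.
    dwell = [releases[i].time_ms - presses[i].time_ms for i in range(n)]
    # Flight: gap between releasing one key and pressing the next.
    flight = [presses[i + 1].time_ms - releases[i].time_ms for i in range(n - 1)]
    return dwell, flight

events = [
    KeyEvent("9", "press", 0), KeyEvent("9", "release", 80),
    KeyEvent("1", "press", 200), KeyEvent("1", "release", 290),
]
dwell, flight = timing_features(events)
```

Touchscreen-only values such as pressure or {x, y} location could be attached to the same records as extra fields.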
9
System Process Overview
10
Data Collection
Devices: 5 identical Android LG-D820 Nexus 5 mobile devices, each with a virtual keypad capturing keystrokes.
Participants: 52, drawn from City of White Plains employees and Pace University students (NYC & PLV). Each entered a 10-digit string ( ) 30 times. 58,882 data records from the 52 distinct participants. 614 total keystroke mechanical and touchscreen features. Data collected in two sessions several weeks. 44% male, 55% female; average age 23; 86% right-handed.
11
Pace Biometric Classification System (PBCS)
Classification system created at Pace University. A vector-difference model transforms a multi-class problem into a two-class problem. The nearest-neighbor method is used for decisions in vector-difference space. Between-person and within-person distance matches determine who is authentic and who is not (imposter).
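The vector-difference idea can be illustrated with a short sketch. This is not the PBCS code: the function names, the use of absolute component-wise differences, and the single-nearest-neighbor decision rule are assumptions chosen to show how a many-user problem reduces to two classes (within-person versus between-person).

```python
from itertools import combinations

def abs_diff(a, b):
    """Component-wise absolute difference of two feature vectors."""
    return [abs(x - y) for x, y in zip(a, b)]

def difference_space(samples):
    """samples: dict mapping user -> list of feature vectors.

    Returns (within, between): difference vectors from pairs of samples
    by the same person and by different people, respectively.
    """
    labeled = [(user, vec) for user, vecs in samples.items() for vec in vecs]
    within, between = [], []
    for (u1, v1), (u2, v2) in combinations(labeled, 2):
        (within if u1 == u2 else between).append(abs_diff(v1, v2))
    return within, between

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def authenticate(enrolled, probe, within, between):
    """Accept if the query difference vector lies nearer to a
    within-person difference than to any between-person difference."""
    q = abs_diff(enrolled, probe)
    d_within = min(manhattan(q, w) for w in within)
    d_between = min(manhattan(q, b) for b in between)
    return d_within <= d_between

samples = {"a": [[1.0, 1.0], [1.1, 0.9]], "b": [[5.0, 5.0], [5.2, 4.9]]}
within, between = difference_space(samples)
accepted = authenticate([1.0, 1.0], [1.05, 1.0], within, between)
```

However many users are enrolled, the classifier only ever decides between the two difference classes, which is what makes the problem two-class.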
12
Phase 1 Experiments
Three feature sets of biometric data were processed by the Pace Biometric Classification System (PBCS): mechanical-keyboard-like keystroke features, touchscreen-only keystroke features, and combined mechanical and touchscreen features.
13
Phase 2 Experiments
The touchscreen keystroke features were divided into four sub-feature sets to determine their relative biometric value: pressure, location, accelerometer, and gyroscope. Each sub-feature file was processed through the Pace Biometric Classification System (PBCS).
14
Data Analysis
Each feature set was run through PBCS four times (two distance metrics and two validation methods). Distance metrics: Euclidean distance and Manhattan distance. Validation methods: Repeated Random Subsampling (RRS) and Leave-One-Out Cross-Validation (LOOCV).
Platform: hardware with 16 GB RAM, 8 cores (2 threads per core), and a 100 GB drive; OS: Linux; Pace classifier written in Python.
15
Distance Metrics
Both metrics are distance metrics in a normed vector space and are special cases of the Minkowski distance. Euclidean distance is the Minkowski distance with p = 2. Manhattan distance is the Minkowski distance with p = 1, sometimes called the city block distance.
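The relationship between the two metrics can be checked with a few lines of code. This is a generic sketch of the standard Minkowski formula, d_p(x, y) = (sum over i of |x_i - y_i|^p)^(1/p); the function name is an illustrative choice and is not taken from the Pace classifier.

```python
def minkowski(a, b, p):
    """Minkowski distance of order p between equal-length vectors a and b."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

x, y = [0.0, 0.0], [3.0, 4.0]
euclidean = minkowski(x, y, 2)   # straight-line distance
manhattan = minkowski(x, y, 1)   # city block distance
```

For these two points the Euclidean distance is 5 and the Manhattan distance is 7. With p = 1 a large difference in any one feature contributes linearly rather than quadratically, so a single outlying feature influences the total less than it does with p = 2.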
16
Two Validation Methods
Repeated Random Subsampling (RRS): a maximum between-class size of 10 and a maximum within-class size of 10 were used to select the number of samples prior to the vector-difference calculations; 30 iterations.
Leave-One-Out Cross-Validation (LOOCV): full dataset (no random sampling); n samples give n iterations, one for each sample.
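The two validation schemes differ mainly in how the data is split on each iteration, which the sketch below illustrates. The function names and the fixed seed are hypothetical, and exactly how the size caps of 10 are applied inside PBCS is not specified on the slide, so the subsampling helper simply caps the number of items drawn from a pool.

```python
import random

def leave_one_out(samples):
    """Yield (held_out, remaining) once per sample: n samples, n iterations."""
    for i in range(len(samples)):
        yield samples[i], samples[:i] + samples[i + 1:]

def random_subsamples(pool, max_size=10, iterations=30, seed=0):
    """Yield `iterations` random draws (without replacement) from `pool`,
    each holding at most `max_size` items."""
    rng = random.Random(seed)   # fixed seed only to make the sketch repeatable
    k = min(max_size, len(pool))
    for _ in range(iterations):
        yield rng.sample(pool, k)

folds = list(leave_one_out(["s1", "s2", "s3"]))
draws = list(random_subsamples(list(range(25))))
```

LOOCV uses every sample exactly once as the test item, while each RRS iteration sees only a capped random portion of the data.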
17
Performance Evaluation
Receiver Operating Characteristic (ROC) Curves and Equal Error Rate (EER)
The ROC curve plots the False Acceptance Rate (FAR) against the False Rejection Rate (FRR). The Equal Error Rate (EER) is the point where the two rates intersect (where FAR = FRR). EER is a single, easy-to-understand number often used in evaluating biometric systems. However, when deploying a biometric system, the ROC curve is more valuable.
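As a rough illustration of how an EER can be read off a set of match scores, the sketch below sweeps a distance threshold and reports the point where FAR and FRR are closest. It is a generic approximation, not the procedure used in the study: the function names are hypothetical, scores are assumed to be distances (smaller means more likely genuine), and the EER is taken as the average of FAR and FRR at the closest threshold rather than interpolated.

```python
def far_frr(genuine, impostor, threshold):
    """Accept a comparison when its distance is <= threshold."""
    far = sum(d <= threshold for d in impostor) / len(impostor)  # impostors accepted
    frr = sum(d > threshold for d in genuine) / len(genuine)     # genuine rejected
    return far, frr

def equal_error_rate(genuine, impostor):
    """Return (approximate EER, threshold) where |FAR - FRR| is smallest."""
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = far_frr(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]

genuine = [0.1, 0.2, 0.3, 0.6]    # same-person distances (made-up values)
impostor = [0.4, 0.5, 0.7, 0.8]   # different-person distances (made-up values)
eer, threshold = equal_error_rate(genuine, impostor)
```

The list of (FAR, FRR) pairs produced during the sweep is the ROC curve itself, which is why the curve carries more information for deployment than the single EER number.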
18
Phase 1 EER Results (preview)
On these data, the results indicate that the LOOCV validation method is better than RRS and that Manhattan distance is better than Euclidean. Receiver Operating Characteristic (ROC) curves follow.
19
Mechanical vs Touchscreen vs Combined ROC Curves: Euclidean Distance & RRS Validation
20
Mechanical vs Touchscreen vs Combined ROC Curves: Euclidean Distance & LOOCV Validation
21
Mechanical Keyboard Features Euclidean Distance: RRS versus LOOCV
EER = 20%
22
Touchscreen Features Euclidean Distance: RRS versus LOOCV
23
Mechanical and Touchscreen Features Euclidean Distance: RRS versus LOOCV
EER = 7.1%
24
Unexpected Issue!
The Equal Error Rate (EER) for the combined feature set (7.10%) was higher than the EER of the touchscreen set (4.9%). This could be explained by the proximity of the keystroke feature sets inflating the combined EER. We changed the distance parameter (p) and re-ran the data using Manhattan distance.
25
Mechanical vs Touchscreen vs Combined ROC Curves: Manhattan Distance & RRS Validation
26
Mechanical vs Touchscreen vs Combined ROC Curves: Manhattan Distance & LOOCV Validation
EER = 19.7%
27
Mechanical Keyboard Features Manhattan Distance: RRS versus LOOCV Validation
28
Touchscreen Features Manhattan Distance: RRS versus LOOCV Validation
29
Mechanical and Touchscreen Features Manhattan Distance: RRS versus LOOCV Validation
30
Phase 1 EER Results (review)
On these data, the results indicate that the LOOCV validation method is better than RRS and that Manhattan distance is better than Euclidean. Using Manhattan distance resolves the unexpected issue: the combined feature set is now better than the touchscreen set.
31
Phase 1 Conclusions
The study indicated that the Pace classifier can be extended to authenticate data associated with and extracted from mobile devices. Manhattan distance performed better than Euclidean distance. Leave-One-Out Cross-Validation (LOOCV) performed better than Repeated Random Subsampling (RRS).
32
Phase 1 Conclusions (continued)
The Equal Error Rate (EER) for mechanical keystroke biometrics alone (19.7%) was worse than those of Killourhy & Maxion (8.6%) and Bakelman (6.14%). This is possibly explained by the smaller device form factor as well as the "slickness" of the touchscreen. The touchscreen biometric EER (4.0%) was a significant improvement over pure mechanical keystroke biometrics. The combined biometric EER (3.9%) was a further improvement.
Note: The aforementioned studies by Killourhy & Maxion and Bakelman could not utilize the touchscreen feature sets, as physical keyboards cannot capture that data.
33
Phase 2 Results (preview)
Sensor subsets of the touchscreen features were further evaluated to determine their relative biometric value
34
ROC Curve – (RRS and Manhattan)
35
ROC Curve – (LOOCV and Manhattan)
36
Phase 2 Results (review)
Sensor subsets of the touchscreen features were further evaluated to determine their relative biometric value.
Conclusions: The gyroscope sensor has the highest biometric value; by itself, it is almost as good as all features together. The pressure sensor has the lowest biometric value.
37
Overall Conclusions
Touchscreen biometric data outperformed keystroke biometric data. The Manhattan distance metric outperformed the Euclidean distance metric. Leave-One-Out Cross-Validation (LOOCV) outperformed Repeated Random Subsampling (RRS).
38
Overall Conclusions (continued)
Touchscreen biometric data can return excellent results alone or in concert with keystroke biometric data. Of all the touchscreen sensor feature sets, the gyroscope features returned the best results and the pressure features returned the worst.
39
Limitations of Study
Android only. Brevity and specificity of the input string. Limitation of classification algorithms: k-nearest neighbor.
40
Future Work
Replication of this study on other mobile devices: iOS, Windows. Expansion of allowable key characters. Incorporation of additional sensor data: mobile devices have other sensors that were not utilized in our study. Expansion of the research to utilize and compare other classification algorithms, such as Support Vector Machines (SVM).
41
Thank You
42
Supplemental Slides
43
ROC Curve for Touchscreen Pressure Features - Euclidean
44
ROC Curve for Touchscreen Pressure Features - Manhattan
45
ROC Curve for Touchscreen Location Features - Euclidean
46
ROC Curve for Touchscreen Location Features - Manhattan
EER=17.9% EER=15.0%
47
ROC Curve for Touchscreen Accelerometer Features - Euclidean
48
ROC Curve for Touchscreen Accelerometer Features - Manhattan
49
ROC Curve for Touchscreen Gyroscope Features - Euclidean
50
ROC Curve for Touchscreen Gyroscope Features - Manhattan
51
Comparison – RRS & Euclidean
52
Comparison – LOOCV & Euclidean
EER=26.37% EER=18.25%
53
Comparison – RRS & Manhattan
EER=24.85%
54
Comparison – LOOCV & Manhattan