Mobile Sensor-Based Biometrics Using Common Daily Activities
Ken Yoneda, Gary M. Weiss (presenting)
Wireless Sensor Data Mining (WISDM) Lab, Fordham University
Mobile Sensor-Based Biometrics
Security is often achieved via passwords, tokens, keys, etc.
These have known problems (bad passwords, stolen keys)
A better way: mobile biometrics
Almost everyone has a smartphone
Some people have smartwatches
Both devices contain accelerometer and gyroscope sensors
These sensors measure motion
Idea: people move differently, so accelerometer and gyroscope sensor data can be used for biometrics
Activities for Motion-Based Biometrics
Motion-based biometrics typically uses only walking (gait)
Some researchers pick another activity (e.g., finger snapping)
We evaluate a large number (18) of diverse activities
This is a major contribution
We also evaluate 9 sensor combinations across 2 devices
Four individual sensors (accelerometer and gyroscope on both phone and watch)
Five fused-sensor combinations (Phone, Watch, Accel, Gyro, All)
This is a major contribution
Identification vs Authentication
The 18 Evaluated Activities
General Activities (non-hand oriented): Walking, Jogging, Stairs, Sitting, Standing, Kick Soccer Ball
General Activities (hand-oriented): Dribble Basketball, Catch with Tennis Ball, Typing, Writing, Clapping, Brush Teeth, Fold Clothes
Eating Activities (hand-oriented): Eat Pasta, Eat Soup, Eat a Sandwich, Eat Chips, Drink from Cup
Data Collection and Transformation
Data collected with Android smartphones and smartwatches
3 minutes of data collected per user per activity
51 users and 18 activities: about 45 hours of data
Most classification algorithms do not handle time-series data directly
Sliding-window approach: 10 s non-overlapping segments
Each example is formed by calculating 43 high-level features from one segment
Features include the average and standard deviation of the x-, y-, and z-axis sensor values
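The windowing step can be sketched as follows. This is a minimal illustration, not the lab's actual pipeline: the 20 Hz sampling rate, the synthetic readings, and the helper names `segment` and `axis_features` are assumptions, and only the per-axis average and standard deviation (6 of the 43 features) are computed because the slides do not list the rest.

```python
from statistics import mean, stdev

def segment(samples, rate_hz, window_s=10):
    """Split a list of (x, y, z) samples into non-overlapping windows."""
    size = rate_hz * window_s
    # Drop the trailing partial window so every segment spans a full window
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]

def axis_features(window):
    """Average and standard deviation per axis for one window (6 values)."""
    feats = []
    for axis in zip(*window):  # regroup the samples into x, y, z sequences
        feats.append(mean(axis))
        feats.append(stdev(axis))
    return feats

# Toy data: 25 s of synthetic readings at an assumed 20 Hz sampling rate
rate_hz = 20
samples = [(i % 5, (i % 3) * 0.5, 9.8) for i in range(25 * rate_hz)]

windows = segment(samples, rate_hz)
examples = [axis_features(w) for w in windows]
print(len(windows), len(examples[0]))
```

Each resulting feature vector is one training or test example, so a 3-minute recording would yield 18 examples under these assumptions.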
Classification Algorithms
Experiments use scikit-learn (Python module)
k-Nearest Neighbors
Decision Tree
Random Forest
Experiments use stratified 10-fold cross-validation
Random Forest consistently performs best
This presentation shows only Random Forest (RF) results
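A comparison of the three classifiers under stratified 10-fold cross-validation might look like the sketch below. The data are synthetic stand-ins (five hypothetical users with six features each), and the hyperparameters are illustrative defaults, not the settings used in the experiments.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n_users, per_user, n_features = 5, 20, 6

# Synthetic data: each hypothetical user gets a distinct feature offset
X = np.vstack([
    rng.normal(loc=user, scale=0.5, size=(per_user, n_features))
    for user in range(n_users)
])
y = np.repeat(np.arange(n_users), per_user)  # label = user identity

models = {
    "k-Nearest Neighbors": KNeighborsClassifier(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

# Stratification keeps each user's share of examples equal across folds
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = {
    name: cross_val_score(model, X, y, cv=cv).mean()
    for name, model in models.items()
}
for name, accuracy in scores.items():
    print(f"{name}: {accuracy:.3f}")
```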
Identification Accuracy (%) Using Random Forest
Activity | Single Sensor: Phone Accel, Phone Gyro, Watch Accel, Watch Gyro | Fused Sensor: Phone, Watch, Accel, Gyro, All | Avg.
Walking 96.1 94.7 75.1 67.0 96.8 78.9 96.5 95.3 97.4 88.6
Jogging 92.5 75.0 74.3 96.0 82.1 95.7 95.2 98.0 89.3
Stairs 90.8 81.2 52.4 39.2 92.7 58.7 92.6 80.9 95.1 76.0
Sitting 90.1 56.3 70.4 30.1 91.5 69.3 93.1 55.9 92.0 72.1
Standing 85.8 47.1 64.1 27.0 86.8 61.2 90.5 46.6 89.9 66.6
Typing 94.8 71.7 51.2 94.6 84.2 95.6 76.5 82.8
Teeth 92.2 69.5 70.0 93.7 76.1 74.5 95.4 80.3
Soup 94.3 56.5 74.1 50.4 95.8 76.6 96.3 66.9 96.6 78.6
Chips 93.3 56.8 62.6 38.7 93.2 62.4 66.3 94.9 73.8
Pasta 94.1 56.9 67.2 38.1 94.0 71.6 61.1
Drinking 93.9 57.4 63.9 41.3 93.8 65.3 60.6 74.0
Sandwich 92.9 62.8 61.9 37.6 62.1 95.9 68.5 74.4
Kicking 87.4 54.3 38.3 59.8 92.1 72.7 72.5
Catch 90.0 69.1 71.3 90.3 75.4 82.0 81.5
Dribbling 88.3 66.0 72.3 74.8 89.5 94.4 82.4
Writing 92.8 79.6 47.6 79.1 94.2 73.0
Clapping 72.8 83.4 73.9 85.3 86.1 96.7 87.0
Folding 90.7 65.8 60.0 38.8 63.0 93.6 67.3 68.7 49.8 73.2
Majority-Voting Strategy
Results on the prior slide are based on a single 10 s test example
This is overly restrictive
Our majority-voting strategy uses 5 examples (50 s of data) and assigns the most frequently predicted person
This yields much better results
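A minimal sketch of majority voting over per-window predictions follows. The function name `majority_vote`, the user names, and the choice to group consecutive windows are assumptions for illustration; the slides do not say how the five examples are grouped or how ties are broken.

```python
from collections import Counter

def majority_vote(predictions, group_size=5):
    """Collapse each run of `group_size` per-window predictions into one label."""
    voted = []
    for i in range(0, len(predictions) - group_size + 1, group_size):
        group = predictions[i:i + group_size]
        # most_common(1) returns the label with the highest count; on a tie,
        # the label seen first in the group wins (an arbitrary choice here)
        voted.append(Counter(group).most_common(1)[0][0])
    return voted

# Ten hypothetical per-window predictions (100 s of data at 10 s per window)
per_window = ["alice", "bob", "alice", "alice", "carol",
              "bob", "bob", "alice", "bob", "bob"]
voted = majority_vote(per_window)
print(voted)  # ['alice', 'bob']
```

Occasional misclassified windows are outvoted as long as most windows in a group are correct, which is why voting raises accuracy at the cost of needing 50 s of data per decision.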
Identification Accuracy (%) Using Random Forest (with Voting)
Activity | Single Sensor: Phone Accel, Phone Gyro, Watch Accel, Watch Gyro | Fused Sensor: Phone, Watch, Accel, Gyro, All | Avg.
Walking 100.0 94.1 80.4 90.2 96.1
Jogging 90.0 88.0 98.0 97.3
Stairs 70.0 43.8 96.0 75.0 91.7 84.9
Sitting 62.7 88.2 33.3 86.3 64.7 81.5
Standing 39.2 82.4 20.0 84.0 50.0 74.2
Typing 89.8 94.0 95.9 92.2
Teeth 91.9
Soup 66.7 62.0 80.0 87.2
Chips 76.0 41.2 84.2
Pasta 56.0 48.0 71.4
Drinking 58.8 60.8 80.8
Sandwich 68.0 38.0 82.0 73.5
Kicking 68.6 32.0 81.4
Catch 78.0 85.7 91.8 90.6
Dribbling
Writing 90.8
Clapping 96.5
Folding 76.5 78.4 98.8 74.4 85.8 55.8 98.9 88.3 99.7 83.2 99.6
Goal: Continuous Biometrics
Users are identified by their motion while performing normal daily tasks (unstructured)
We can only approximate this, since we have only 18 activities and an even distribution of each
The next set of results merges all 18 activities
Identification Accuracy (%) Using All 18 Activities (Random Forest, Voting)
Sensors Used | Without Label | Predicted Label | With Label
Phone Accel | 96.8 | 96.0 | 97.6
Phone Gyro | 61.6 | 63.1 | 65.1
Watch Accel | 76.0 | 75.4 | 77.3
Watch Gyro | 39.8 | 42.4 | 43.9
Phone | 97.0 | 96.2 | 97.5
Watch | 77.1 | 80.6 | 77.9
Accel | 99.2 | 98.9 | 99.3
Gyro | 72.3 | 72.9 | 73.0
All | 99.1
Average | 79.9 | 80.5 | 81.2
Summary of Identification Results
Best sensors for identification:
Phone and watch accelerometers (“Accel”) are best, followed closely by “All” four sensors (phone + watch sensors)
Gyroscope is generally not as useful as accelerometer
Best individual activities for identification without voting:
Walking and jogging are best
Clapping and typing are good
Using all activities:
Can do very well without activity labels (the label can be predicted)
Good step toward continuous biometrics
Authentication Experiments
Binary classification problem: “you” or “imposter”
One model per user (51 models for 51 users)
“Imposters” in the test set should not appear in the training set
Main evaluation metric is Equal Error Rate (EER)
EER balances two types of errors: false acceptance rate (FAR) and false rejection rate (FRR)
EER is the error rate at which FAR = FRR (found by varying the probability threshold for classification)
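The EER computation can be sketched as a threshold sweep. This is one common way to approximate the EER from a finite set of scores, not necessarily the procedure used in the experiments; the scores below are invented, and the accept rule (score at or above the threshold means "you") is an assumption.

```python
def far_frr(genuine, impostor, threshold):
    """FAR = share of impostors accepted; FRR = share of genuine users rejected."""
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr

def equal_error_rate(genuine, impostor):
    """Sweep candidate thresholds; return (eer, threshold) where FAR and FRR are closest."""
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = far_frr(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            # With finite data FAR and FRR may never match exactly,
            # so report their midpoint at the closest crossing
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]

# Hypothetical "you" probabilities from one user's model
genuine = [0.9, 0.8, 0.7, 0.6, 0.3]      # test examples from the true user
impostor = [0.1, 0.2, 0.4, 0.5, 0.65]    # test examples from unseen impostors
eer, threshold = equal_error_rate(genuine, impostor)
print(eer, threshold)  # 0.2 0.6
```

With one model per user, this would be computed per user and averaged to obtain a table entry; that aggregation step is also an assumption here.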
Authentication EER (%) without Voting (Random Forest)
Activity | Single Sensor: Phone Accel, Phone Gyro, Watch Accel, Watch Gyro | Fused Sensor: Phone, Watch, Accel, Gyro, All | Avg.
Walking 11.2 11.3 17.5 18.8 9.3 16.1 12.6 10.2 7.9 12.8
Jogging 11.5 13.2 18.1 19.3 10.3 15.1 13.8 9.8 13.6
Stairs 12.3 16.4 24.3 26.1 11.8 21.6 13.9 16.5 13.5 17.4
Sitting 26.3 21.8 33.4 22.3 10.7 27.2 13.0 20.1
Standing 14.7 26.0 22.6 33.3 15.6 23.0 11.9 27.9 15.4 21.2
Typing 19.4 16.8 26.2 18.0 10.4 19.0 8.7 15.7
Teeth 19.7 18.6 22.7 12.1 17.2 11.4 19.9 12.2 16.2
Soup 9.6 22.4 17.6 24.6 10.1 8.6 21.7 15.8
Chips 23.3 19.2 29.5 11.7 20.3 20.4
Pasta 12.4 18.4 28.8 14.4 10.9
Drinking 12.0 24.2 20.0 30.1 12.9
Sandwich 24.1 30.2 22.1 23.6 18.5
Kicking 12.5 26.7 21.1 16.7 14.0
Catch 10.8 20.6 20.8 13.4 16.0
Dribbling 18.9 21.0 12.7 17.9
Writing 13.3 15.3 27.1 15.9
Clapping 20.5 9.7 14.6 10.6 14.9
Folding 16.6 19.6 24.7 17.1 8.3 17.0 20.2 19.5 25.8
Authentication EER (%) Using a Single Activity with Voting (Random Forest)
Single Sensor: Phone Accel, Phone Gyro, Watch Accel, Watch Gyro | Fused Sensor: Phone, Watch, Accel, Gyro, All | Avg.
Walking 9.4 9.8 13.2 17.2 8.8 13.9 11.3 10.0 6.8 11.2
Jogging 7.8 10.8 16.2 15.2 9.7 12.7 9.0 8.3
Stairs 13.4 12.5 19.3 23.9 9.3 18.9 8.4 14.1 6.9
Sitting 10.4 23.7 14.5 32.1 17.0 21.1 10.2 16.4
Standing 12.1 22.1 16.7 31.6 10.9 21.5 7.7
Typing 15.4 13.0 20.7 8.9 14.0 8.6 13.3 12.3
Teeth 10.1 20.0 14.4 14.9 8.2 12.9
Soup 7.3 19.2 22.3 6.1 17.5 8.0
Chips 9.9 14.7 25.9 10.3 18.1 8.5
Pasta 14.3 26.6 18.5 19.6 5.4
Drinking 16.6 25.1 19.9 8.1 15.0
Sandwich 17.9 25.7 11.4 17.7
Kicking 10.6 19.4 21.0 24.1 11.0 18.8 15.8
Catch 16.3 15.5
Dribbling 16.1 11.8 11.5 13.5
Writing 8.7 15.7 10.7 21.3 9.2 11.6 16.0
Clapping 14.8 12.0
Folding 7.9 18.6 23.4 17.3 7.1 17.6 15.6 22.4 9.6 15.3
Biometric Rankings: Which Activities are Best Activity Authentication Practicality Total Rank Walking 2 1 3 Sitting 5 6 Clapping 8 9 Stairs 7 10 4 Pasta Folding 13 Typing 14 Writing 15 Soup Chips 16 Standing 17 11 Drinking Jogging 12 19 Sandwich Brushing Teeth Catch 24 Dribbling 25 Kicking 18 27
Conclusions
Both accelerometers (“Accel”) and all four sensors perform best
Gyroscope is generally not as good as accelerometer
Majority-voting strategy using 5 examples is effective
Good biometric identification and authentication performance is achievable with voting
Good performance is achievable even when activities are not labeled
Walking is the most effective biometric trait
Sitting and clapping are also viable biometric traits
Acknowledgements
Many WISDM Lab members assisted with data collection
This work is an expansion of earlier WISDM Lab studies: Jennifer Kwapisz (2010), Andrew Johnston (2015)
Additional Slides (if time permits)
Comparison of Algorithms
Average Identification Accuracy (%) Using “Accel” Sensor
Algorithm | Without Voting | With Voting
k-Neighbors | 77.8 | 88.2
Decision Tree | 91.8 | 98.4
Random Forest | 94.7 | 99.7
Identification Learning Curve
[Figure: learning curve for amount of training data per activity]
Authentication Learning Curve
[Figure: learning curve for amount of training data per activity]
Impact of Number of Examples Used for Voting
[Figure: voting performance by number of examples, for identification accuracy]