1
Template Security in Biometric Systems
Yagiz Sutcu
2
Outline
- Introduction
  - Biometrics and biometric systems
  - Template security and user privacy
  - Proposed solutions
- Feature Transformation for ECC-based Biometric Template Protection
  - Syndrome framework and its requirements for binary data
  - Binarization of minutiae-based fingerprint templates
- Protecting Biometric Templates with Secure Sketch
  - Secure sketch framework, issues and limitations
  - Quantization-based secure sketch
  - Randomization
  - Multi-factor setup
- Conclusions, discussions and future directions
3
Biometrics
Biometrics is the science and technology of measuring and statistically analyzing biological data. Adopted from: S. Prabhakar, S. Pankanti, and A. K. Jain, “Biometric Recognition: Security and Privacy Concerns”, IEEE SECURITY & PRIVACY, 2003.
- Universality: do all people have it?
- Distinctiveness: can people be distinguished based on an identifier?
- Permanence: how permanent is the identifier?
- Collectability: how well can the identifier be captured and quantified?
- Performance: speed and accuracy
- Acceptability: willingness of the people to use it
- Circumvention: how foolproof is it?
4
Biometrics
5
Biometric Systems S. Prabhakar, S. Pankanti, and A. K. Jain, “Biometric Recognition: Security and Privacy Concerns”, IEEE SECURITY & PRIVACY, 2003.
7
Issues, Challenges and Objectives
Biometric authentication is attractive:
- Closely related to the identity
- Cannot be forgotten
- Not easy to forge
- Has been used successfully for a long time
However…
- Cannot be exactly reproduced (intra-user variability, noise)
- Once compromised, cannot be revoked
- Entropy may not be sufficient
Objectives:
- Authentication/verification without storing the original biometric
- Robustness to noisy measurements
- Best possible tradeoff between security (how many bits must an attacker guess?) and accuracy/performance (what is the false-reject rate?)
8
Proposed Solutions – Transformation-Based Approaches
Employ a one-way transformation, e.g., quantization, thresholding, or random projections (a minimal sketch follows below).
Properties:
- Non-invertible or hard to invert
- Similarity preserving
- Cancelable
The technique depends highly on the biometric considered, and security is not easy to analyze.
Ratha’01&’07, Savvides’04, Ang’05, Teoh’06, etc.
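As a concrete illustration, here is a minimal random-projection transform in Python. All names and parameters are illustrative; this is a generic sketch of the idea, not the exact scheme of any of the papers cited above.

```python
import numpy as np

def cancelable_transform(features, user_token, out_bits=64):
    # The token seeds a user-specific random projection; reissuing the
    # token "cancels" the old template (new projection, new template).
    rng = np.random.default_rng(user_token)
    proj = rng.standard_normal((out_bits, features.size))
    projected = proj @ features               # similarity-preserving projection
    return (projected > 0).astype(np.uint8)   # thresholding makes inversion hard

template = cancelable_transform(np.random.randn(128), user_token=42)
```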
9
An Example: Cancelable Biometrics
A repeatable distortion is applied so that templates match only when the same distortion is applied to the same biometric. It can be applied at the sensor level, signal level, or feature level.
- Security: if compromised, issue a new distortion
- Reusability: different distortions for different databases
- Performance: what about false-accept and false-reject rates?
- Requires alignment
10
Proposed Solutions – Helper Data-Based Approaches
Generate user-specific helper data, e.g., a syndrome or secure sketch. The helper data and the generation method are public.
- A general ECC-based framework applicable to many biometrics
- Techniques may vary to optimize performance for different modalities
- Security analysis based on information theory is possible
Davida’98, Juels’99&’02, Dodis’04, Martinian’05, Draper’07, etc.
11
An example: “Fuzzy Commitment”
If the noisy biometric (X′) is close enough to the template (X), the decoder successfully corrects the error. The only information stored is the commitment X ⊕ C(K) and hash(K).
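A toy end-to-end sketch of the fuzzy commitment idea, assuming binary feature vectors and substituting a simple repetition code for the stronger error-correcting codes a real system would use; all names are illustrative.

```python
import hashlib
import secrets
import numpy as np

REP = 5  # toy repetition code; real systems use BCH/LDPC codes

def commit(x):
    # Enrollment: bind a random key K to biometric bits X (Juels-Wattenberg).
    k = np.array([secrets.randbits(1) for _ in range(len(x) // REP)], dtype=np.uint8)
    delta = np.repeat(k, REP) ^ x                  # stored helper data: C(K) XOR X
    return delta, hashlib.sha256(k.tobytes()).hexdigest()

def verify(x_noisy, delta, k_hash):
    # Verification: X' close to X => the decoder corrects the error, K is recovered.
    noisy_codeword = delta ^ x_noisy               # = C(K) XOR (X XOR X')
    k_est = (noisy_codeword.reshape(-1, REP).sum(axis=1) > REP // 2).astype(np.uint8)
    return hashlib.sha256(k_est.tobytes()).hexdigest() == k_hash

x = np.random.randint(0, 2, 100, dtype=np.uint8)
delta, k_hash = commit(x)
x_noisy = x.copy(); x_noisy[::25] ^= 1             # flip a few spread-out bits
print(verify(x_noisy, delta, k_hash))              # True while errors stay correctable
```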
12
An Example for Helper Data-based Biometric Template Protection: Syndrome Coding
… About securing biometric templates using error correction codes. … Joint work
13
Syndrome Coding Framework
- Encode the enrollment biometric X into a syndrome S; store S and the hash of X
- Decode with the probe: a noisy biometric reading Y (the output of the “fingerprint channel”)
- Authenticate only if the hash of the estimate matches the stored hash
S cannot be decompressed by itself and is therefore secure. In combination with a noisy second reading Y, the original X can be recovered using a Slepian-Wolf decoder. The hash of the estimate is compared with the stored hash to permit access.
14
System Implementation
Enrollment: fingerprint → alignment and minutiae extraction → binary feature vector extraction → syndrome encoding → syndrome database.
Verification: probe fingerprint → alignment and minutiae extraction → binary feature vector extraction → syndrome decoding against the stored syndrome → access granted if the hashes match, access denied otherwise.
15
Overview of Syndrome Encoding/Decoding
- Security = number of “missing” bits = original bits − syndrome bits; this translates into the number of guesses needed to identify the original biometric w.h.p.
- Robustness = false-rejection rate; robustness to variations in biometric readings is achieved by the syndrome decoding process (syndrome + noisy biometric ⇒ original biometric)
- Fewer syndrome bits = greater security, but less robustness
A minimal encoding sketch follows below.
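A minimal numeric sketch, assuming a random toy parity-check matrix (practical systems use graph codes such as LDPC):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 12, 4                                     # 12 biometric bits, 4 syndrome bits
H = rng.integers(0, 2, (m, n), dtype=np.uint8)   # toy parity-check matrix of code C
x = rng.integers(0, 2, n, dtype=np.uint8)        # enrollment biometric bits

S = (H @ x) % 2                                  # stored syndrome (plus hash(x))
# "Missing" bits = n - m = 8: roughly 2**8 bit strings share this syndrome,
# so an attacker who sees only S still has about 8 bits left to guess.
```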
16
Example: Distributed Coding with Joint Decoding
[Figure: two correlated binary sources X and Y; syndrome bits are computed from X.]
17
Example: Syndrome Decoding
Use the side information Y and the syndrome bits to reconstruct X. [Figure: source X with unknown bits “?” next to the side information Y.]
18
Example: Syndrome Decoding
Flipping the 1st bit from 0 to 1 satisfies the 1st syndrome bit but violates the 2nd.
19
Example: Syndrome Decoding
Flipping the 2nd bit from 1 to 0 satisfies syndrome bits 1 and 2 but violates the 3rd.
20
Example: Syndrome Decoding
Flipping the 3rd bit from 0 to 1 satisfies all syndrome bits and recovers X. A brute-force analogue of this walkthrough appears below.
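The same search can be written as a few lines of code. This is a didactic analogue of the walkthrough, not the belief-propagation decoder a practical system would use; names are illustrative.

```python
from itertools import combinations

import numpy as np

def syndrome_decode(y, S, H, max_flips=3):
    # Starting from the side information y, try error patterns of
    # increasing weight until every syndrome constraint is satisfied.
    for w in range(max_flips + 1):
        for flips in combinations(range(len(y)), w):
            guess = y.copy()
            guess[list(flips)] ^= 1
            if np.array_equal((H @ guess) % 2, S):
                return guess          # candidate X; confirm against stored hash
    return None

rng = np.random.default_rng(0)
H = rng.integers(0, 2, (4, 12), dtype=np.uint8)
x = rng.integers(0, 2, 12, dtype=np.uint8)
S = (H @ x) % 2
y = x.copy(); y[3] ^= 1               # probe differs from enrollment in one bit
x_hat = syndrome_decode(y, S, H)      # low-weight solution; the hash check catches ties
```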
21
Security of Syndrome Approach
The secure biometric S is the evaluation of a set of linear functions F(·), specified by the code C, on the enrollment biometric X: S = F(X) (the syndrome vector). What the attacker learns is only the preimage list F⁻¹(F(X)), the list of biometrics satisfying the linear constraints.
22
Previous Approach: Binary Grid Representation
Problems:
- The representation is sparse and difficult to model
- The statistics of the binary string are not well suited for existing codes
- Poor performance
Solution: pre-processing! Pre-process the fingerprint data to produce a binary string that is statistically compatible with existing codes, then apply syndrome coding to the resulting binary string.
23
Desired Properties of Feature Transformation
Bits extracted from biometrics should have statistical properties that make the biometric channel simple:
1. Zeros and ones equally likely (entropy = 1 bit per bit)
2. Individual bits independent (pairwise entropy = 2 bits)
3. Different users (User A, User B) yield independent bit strings
4. Two readings of the same user (Reading 1, Reading 2) related by a binary symmetric channel BSC-p
Given these properties, the recipe for a secure biometrics scheme follows.
24
Extracting Robust Bits from Biometrics
Place N random cuboids in the minutiae map and count the minutiae in each. Each cuboid contributes a 0 or 1 bit to the feature vector according to whether it contains fewer or more minutiae points than the median threshold for that cuboid. A sketch follows below.
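A minimal sketch, assuming minutiae are (x, y, angle) triples and that the per-cuboid medians come from training data; all names and numbers are illustrative.

```python
import numpy as np

def extract_bits(minutiae, cuboids, medians):
    # One bit per cuboid: 1 if the cuboid holds more minutiae than the
    # population median for that cuboid, else 0.
    bits = []
    for (lo, hi), med in zip(cuboids, medians):
        inside = np.all((minutiae >= lo) & (minutiae < hi), axis=1)
        bits.append(1 if inside.sum() > med else 0)
    return np.array(bits, dtype=np.uint8)

rng = np.random.default_rng(1)
minutiae = rng.uniform(0, [300, 300, 360], size=(40, 3))    # (x, y, angle) triples
corners = rng.uniform(0, [200, 200, 240], size=(150, 3))
cuboids = [(c, c + np.array([100, 100, 120])) for c in corners]  # 150 random cuboids
medians = [2] * 150        # in practice: per-cuboid medians from training data
bits = extract_bits(minutiae, cuboids, medians)
```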
25
Elimination of Overlapping Cuboids
Large overlap ⇒ similar bits ⇒ easy for an attacker to guess: large overlap results in low pairwise bit entropy, i.e., pairs of bits become easier to guess. To prevent this, reject cuboids that have low pairwise entropy with other cuboids. This training process is done only once for the complete database (400 candidate cuboids reduced to the best 150).
26
User-Specific Cuboids
Different users get different sets of cuboids. Requirements: for each user, choose the cuboids that
- are most reliable,
- result in an equal number of zeros and ones,
- have the smallest possible pairwise correlation.
What is a “reliable” cuboid? One whose minutiae count is far from the median, so its 0 or 1 bit is likely to remain unchanged over repeated noisy measurements of the fingerprint.
27
“Reliable” Cuboids
For robust bit extraction, we prefer cuboids that are “reliable”. Consider cuboids whose median number of minutiae is 4. Unreliable cuboid: measurement 1 gives 3 minutiae, measurement 2 gives 5; the count straddles the median, so the bit flips between readings. Reliable cuboid: measurement 1 gives 8 minutiae, measurement 2 gives 9; the count stays far above the median, so the bit is stable. Each user has a different set of reliable cuboids!
28
User-Specific Reliable Cuboids
How to construct a feature vector from the most reliable cuboids:
- The farther the number of minutiae points is from the median, the more reliable the cuboid. Sort the unordered list of cuboids by reliability.
- Split the sorted cuboids into a 0-list and a 1-list.
- Use N fair coin flips to choose cuboids from the top of each list (see the sketch below).
This construction has four desired properties:
- Fair coin ⇒ nearly equal numbers of 0s and 1s
- Independent coin tosses ⇒ independent bits
- Different users have different reliable cuboids ⇒ independent bit strings across users
- Reliable cuboids ⇒ intra-user bit flips have small probability
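A sketch of the selection step under the assumptions above; function and variable names are illustrative.

```python
import numpy as np

def select_user_cuboids(counts, medians, n_bits=150, seed=None):
    # Reliability = distance of the user's minutiae count from the median.
    rng = np.random.default_rng(seed)
    order = np.argsort(-np.abs(counts - medians))              # most reliable first
    zero_list = [i for i in order if counts[i] <= medians[i]]  # bit would be 0
    one_list  = [i for i in order if counts[i] >  medians[i]]  # bit would be 1
    chosen = []
    for _ in range(n_bits):
        pick = one_list if rng.integers(2) else zero_list      # fair coin flip
        if not pick:                                           # one list exhausted
            pick = one_list or zero_list
        chosen.append(pick.pop(0))
    return chosen              # ~equal 0s and 1s by construction

counts = np.random.default_rng(2).integers(0, 10, 400)         # one user's counts
chosen = select_user_cuboids(counts, medians=np.full(400, 4), n_bits=150, seed=7)
```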
29
Distribution of Zeros and Ones
Proprietary database of 1035 users, 15 pre-aligned samples per user. [Figure: histograms of the number of 1’s in the transformed feature vectors, for 150 common cuboids and for 150 user-specific cuboids.] Both histograms cluster around the desired 75 ones, i.e., half the bits are 1, because cuboids are chosen at random from the two ordered lists of 0-cuboids and 1-cuboids.
30
Intra-User and Inter-User Separation
The normalized Hamming distance (NHD) is used as the distance criterion (a one-line definition appears below). [Figure: distributions of the NHD for 150 common cuboids and for 150 user-specific cuboids, showing intra-user variation, inter-user variation, and the attacker case.] With common cuboids there is overlap between the intra-user and inter-user distributions. For the user-specific case, the overlap is dramatically reduced because every user has a different set of reliable cuboids. The third curve shows the worst-case scenario in which an attacker steals the user’s reliable cuboids: how much does that compromise security?
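For reference, the distance criterion itself is one line:

```python
import numpy as np

def nhd(a, b):
    # Normalized Hamming distance between two equal-length bit vectors.
    return np.count_nonzero(a != b) / len(a)
```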
31
Equal Error Rates
User-specific cuboids provide a lower equal error rate (EER) even if the attacker knows everybody’s cuboids. [Figure: ROC curves of intra-user NHD versus inter-user/attacker NHD; EER ≈ 0.027 for common cuboids and ≈ 0 for user-specific cuboids.] If the attacker knows the questions, the EER increases, but it is still less than the EER for common cuboids.
32
Syndrome Coding Results
33
Conclusions/Discussions
- Random cuboids enable robust bit extraction with the desired statistical properties.
- User-specific (reliable) features require more computation and storage, but give better separation between intra-user and inter-user feature vectors and provide higher security than common feature vectors.
However, open problems remain:
- A fast method for eliminating correlated bit-pairs from user-specific cuboids
- Extending the feature transformation to use the ridge data provided along with the minutiae map
- Studying the effect of alignment inconsistencies on overall performance
34
Protecting Biometric Templates with Secure Sketch: Theory and Practice
… About securing biometric templates using error correction codes. … Joint work
35
Secure Sketch
Enrollment: the ENCODER takes the biometric (plus randomness) and produces a public sketch P. Verification: the DECODER takes a noise-affected reading together with P, recovers the enrollment data, and generates a password/key. The sketch should not reveal too much information about the original biometric.
36
Secure sketch entropy loss: L = H∞(X) − H̃∞(X | P), where H∞(X) is the min-entropy of X and H̃∞(X | P) is the average min-entropy of X given P.
37
Security of secure sketch is defined in terms of entropy loss, L
- Suppose the original biometric data X has min-entropy H∞(X)
- The strength of the key we can extract from X given the sketch P is at least H∞(X) − L
- L is easily bounded by the size of the sketch P
- L is an upper bound on the information leakage for all distributions of X
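Spelled out in the standard notation of Dodis et al., the quantities on this slide are:

```latex
H_\infty(X) = -\log_2 \max_x \Pr[X = x], \qquad
\tilde{H}_\infty(X \mid P) = -\log_2 \, \mathbb{E}_{p \sim P}\!\left[ 2^{-H_\infty(X \mid P = p)} \right],
```
```latex
L = H_\infty(X) - \tilde{H}_\infty(X \mid P) \;\le\; |P| \quad \text{(entropy loss is at most the sketch length in bits).}
```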
38
However… Known secure sketch schemes have limitations
- They only deal with discrete data, but real-world biometric features are mostly continuous.
- One general solution: quantize/discretize the data and apply the known schemes in the quantized domain.
39
Quantization-based Secure Sketch
40
A Simple Example: X is the original data, 0 < X < 1, and under noise X can be shifted by at most 0.1. A code sketch of the quantize-and-offset idea follows below.
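A minimal QIM-style sketch of the quantization idea, assuming noise strictly smaller than half the quantization step; DELTA and the offset construction are illustrative, not the exact scheme analyzed on the following slides.

```python
import numpy as np

DELTA = 0.2   # quantization step, chosen > 2 * max noise (0.1 here)

def encode(x):
    # Sketch = offset of x from the nearest codebook point k * DELTA.
    q = np.round(x / DELTA) * DELTA
    return q - x                        # published helper data P

def decode(x_noisy, p):
    # If |x_noisy - x| < DELTA/2, re-quantizing lands in the same cell.
    return np.round((x_noisy + p) / DELTA) * DELTA   # recovers Q(x)

x = 0.437
p = encode(x)
print(decode(x + 0.09, p), decode(x, p))   # both equal Q(x) = 0.4
```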
41
Problems remain: for different quantization methods,
- the min-entropy of the quantized data could be different, and
- the entropy loss could also be different.
How, then, to define the security? Using the entropy loss in the quantized domain could be misleading.
42
Different Quantizers: Which One Is Better?
Using a scalar quantizer:
- Case 1: step size 0.1, entropy loss = log 3
- Case 2: step size 0.01, entropy loss = log 11
Which one is better? It depends on the distribution of X. If X is uniform:
- Case 1: H(X) = log 100, so H(X) − L = log(100/3) ≈ 5.06 bits
- Case 2: H(X) = log 1000, so H(X) − L = log(1000/11) ≈ 6.51 bits
Here Case 2 yields a stronger key. However, there exists a distribution of X such that H(X) is the same in both cases, L is the actual information leakage, and Case 1 yields the stronger key. (The arithmetic is checked below.)
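The arithmetic for the uniform case checks out (logs are base 2, as on the slide):

```python
from math import log2

# Uniform X on (0, 1) quantized at two resolutions:
case1 = log2(100) - log2(3)      # step 0.1:  ~5.06 bits of key left
case2 = log2(1000) - log2(11)    # step 0.01: ~6.51 bits of key left
print(round(case1, 2), round(case2, 2))   # 5.06 6.51
```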
43
How to Compare? Or, how to define the “optimal” quantization for all distributions of X? It is difficult, and might be impossible. Instead, we propose to look at the relative entropy loss in addition to the entropy loss. Essentially, we ask the question differently:
- Given a family of quantizers (say, scalar quantizers with different quantization steps), and any one of them (say, Q), how many more bits could we extract from X if another quantizer Q′ were used instead?
- How can we bound the number of additional bits that can be extracted by any Q′ (compared with Q)?
If we can bound it by B, then the “best” quantizer in the family cannot be better than Q by more than B bits.
44
Main Results
- For any well-formed quantizer family, we can always bound the relative entropy loss (well-formed: no quantizer in the family loses too much information, e.g., by having too large a quantization step).
- The safest way to quantize the data is to use a quantization step equal to the error rate (safest: the relative entropy loss is smallest).
This result is consistent with intuition and useful for guiding practical designs.
45
However… Known secure sketch schemes have limitations
- They only deal with discrete data, but real-world biometric features are mostly continuous.
- One general solution: quantize/discretize the data, apply the known schemes in the quantized domain, and measure security using the entropy loss in the quantized domain.
- For different quantization methods, the min-entropy/entropy could differ; how to define the security? Using min-entropy alone could be misleading.
- Can we improve performance? How about cancelability/reusability? Better feature selection?
46
Randomized Quantization-based Secure Sketch
Enrollment: biometric X → randomization → X_R → quantization → Q(X_R) → ENCODER → sketch P_X (stored template).
Verification: noisy biometric Y → randomization → Y_R → quantization → Q(Y_R) → DECODER, using the sketch P_X. A combined sketch follows below.
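A compact sketch of the randomize-then-quantize pipeline, assuming a scalar quantizer and a user-specific uniform random matrix; DELTA and all names are illustrative.

```python
import numpy as np

DELTA = 0.5   # illustrative quantization step

def randomize(v, user_seed):
    # User-specific uniform random matrix; a new seed cancels the template.
    rng = np.random.default_rng(user_seed)
    R = rng.uniform(-1, 1, (len(v), len(v)))
    return R @ v

def enroll(x, user_seed):
    xr = randomize(x, user_seed)
    q = np.round(xr / DELTA) * DELTA
    return q - xr                        # public sketch P_X (offset within cell)

def verify(y, p_x, user_seed):
    yr = randomize(y, user_seed)
    return np.round((yr + p_x) / DELTA) * DELTA   # equals Q(X_R) when Y ≈ X

x = np.random.randn(20)                  # e.g., 20 PCA coefficients
p_x = enroll(x, user_seed=7)
y = x + 0.01 * np.random.randn(20)       # noisy probe
same = np.allclose(verify(y, p_x, 7), verify(x, p_x, 7))   # True w.h.p.
```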
47
Results setup:
- ORL Face database: 40 different individuals, 10 samples per individual (7 for training, 3 for testing)
- PCA features (eigenfaces) considered
- Range-based similarity measure
- User-specific random (uniform) matrices
48
Results. [Figure: effect of the randomization and quantization steps on the features.]
49
Results (entropies in bits; left-over entropy = min-entropy − average sketch size):

          Selection              Min-entropy   Avg. sketch size   Left-over entropy
  n = 20  PCA-based selection        59.91           40.35              19.56
  n = 20  …-based selection          61.95           43.46              18.49
  n = 50  PCA-based selection       132.52           98.88              33.64
  n = 50  …-based selection         139.95          106.55              33.40
50
Conclusions/Discussions
Randomization brings:
- Improved performance
- Cancelability/reusability
- Feature selection
- Similar security with a better average sketch size
However, open questions remain:
- How to measure biometric information? (entropy estimation, matching algorithm)
- How to define/find the “optimal” quantization, given the input distribution and practical constraints (size of sketch and/or templates)? Different quantization strategies are possible.
51
Overview of Future Directions
52
Threats/Attack Vectors
Chris Roberts, “Biometric attack vectors and defences”, Computers & Security, 26(1), 2007.
53
Secure Biometric Systems: A Blend of Multiple Disciplines
- Pattern Recognition: feature extraction/matching with robustness to noise and other variations; finding new/better features
- Signal Processing: recovery of the original data from noisy or corrupt data; better transformations
- Cryptography: protecting/scrambling biometric data against malicious attacks; analysis of the security levels offered by various techniques
Secure biometric systems sit at the intersection of all three disciplines.
54
Open Issues/Research Opportunities
- Improving the robustness vs. security trade-off
- Reliable measurement of biometric information (inherent entropy of biometrics)
- Connection between template security and system security (remote vs. local; two-party vs. multi-party)
- Multi-biometrics and multi-factor systems
- User privacy
- Standardization (terminology, formats)
55
Thanks