BioSecure & COST 2101 – Smart Cards and Biometric – Lausanne, 2007 Sabah Jassim University of Buckingham, UK. SecurePhone A Multi-Modal Biometric Verifier for constrained devices
BioSecure + COST 2101 - March 2007
Outline
The SecurePhone project
Fusion approaches to biometric-based identification
SecurePhone multi-modal biometric verifier
PDA implementation constraints
Modalities
Fusion strategy
Performance: Match on Host (MoH) & Match on Card (MoC)
Challenges and potential solutions
Conclusion
The SecurePhone Project
Aims to produce a prototype of a new mobile communication system that enables biometrically authenticated users to conclude legally binding m-contracts during a mobile phone call in an easy yet highly dependable and secure way, using a biometric recogniser that fuses face, voice and handwritten signature.
The SP consortium
SecurePhone aim 1: secure exchange
Secure PKI (Public Key Infrastructure)
Deal secure m-contracts during a mobile phone call:
  secure: private key stored on SIM card
  user-friendly: intuitive, non-intrusive
  flexible: legally binding text/audio transactions
  dynamic: mobile e-signing “on the fly”
Implementation constraints
The PDA's main processor has much less processing power than a PC's, so even on the PDA verification must be very efficient.
The audio-visual sample rates available through the device applications were inadequate (only 8 kHz audio and 10 fps video). These were successfully improved: current SP sampling and real-time pre-processing run at 22 kHz audio and 20 fps video.
Only data on the SIM is secure, so the biometric models/templates must be stored and processed on the SIM. Yet the SIM has very limited computational resources and processing support.
SIM model storage is limited to 40 K, hence text-dependent prompts.
Note: text-independent prompts or varied text-dependent prompts are more secure, but would require 200-400 K.
Enrolment should be based on a short session (acceptability).
Voice verification (SU / GET ENST)
Fixed 5-digit prompt: conceptually neutral, easily extendable, requires few Gaussians.
22 kHz sampling.
Online energy-based non-speech frame removal.
MFCCs with online CMS (cepstral mean subtraction) and first time-difference features: slow to compute, but fixed point is faster than floating point.
Features modelled by a 100-Gaussian GMM pdf, with a UBM for model initialisation and score normalisation.
Training on data from 2 indoor and 2 outdoor recordings from one session; testing on similar data from another session.
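The front-end steps listed on this slide (energy-based frame removal, online CMS, first time-difference features) can be sketched roughly as follows. This is only an illustrative sketch, not the SecurePhone code: the function names, the energy threshold rule (`ratio`) and the running-mean constant (`alpha`) are assumptions, and the MFCC computation itself is taken as given.

```python
import numpy as np

def drop_low_energy_frames(frames, features, ratio=0.1):
    """Keep feature rows whose frame energy exceeds ratio * mean energy.

    The threshold rule is an assumption; the slide only says 'energy based'.
    """
    energy = np.sum(np.asarray(frames, dtype=float) ** 2, axis=1)
    keep = energy > ratio * energy.mean()
    return np.asarray(features)[keep]

def online_cms(features, alpha=0.995):
    """Subtract a running (exponentially weighted) cepstral mean per frame."""
    features = np.asarray(features, dtype=float)
    out = np.empty_like(features, dtype=float)
    mean = features[0].copy()          # initialise the mean from the first frame
    for t, x in enumerate(features):
        mean = alpha * mean + (1.0 - alpha) * x
        out[t] = x - mean
    return out

def add_deltas(features):
    """Append first time differences x[t] - x[t-1] (zero for the first frame)."""
    features = np.asarray(features, dtype=float)
    deltas = np.diff(features, axis=0, prepend=features[:1])
    return np.hstack([features, deltas])
```

Because each step only needs the current frame and a small running state, a chain like this can run frame by frame on the PDA while the user is still speaking.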
Signature verification (GET INT)
2D coordinates (100 Hz) augmented by time-difference features, curvature, etc.: 19 features in total.
Note: no pressure or angles are available, since the signature is obtained from the PDA’s touch screen, not from a writing pad.
Shift normalisation, but no rotation or scaling.
Features modelled by a 100-Gaussian GMM pdf, with a UBM for model initialisation and score normalisation.
Fast to compute.
Training and testing on data from one session.
Face Wavelet Feature Representation (BU)
The Discrete Wavelet Transform (DWT) decomposes an image into a set of frequency subbands with different resolutions.
At a resolution depth of k, the pyramidal scheme decomposes an image I into 3k + 1 subbands: (LL_k, HL_k, LH_k, HH_k, ..., HL_1, LH_1, HH_1).
The lowest-pass subband LL_k represents the k-level resolution approximation of the image I.
The subbands HL_1, LH_1 and HH_1 contain the finest-scale wavelet coefficients; the coefficients get coarser as the level increases, LL_k being the coarsest.
Each subband of a DWT-decomposed face image represents the person’s face at a different frequency range and scale, i.e. a distinct stream for face recognition. The streams have varying accuracy rates and can be fused for improved accuracy.
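To make the 3k + 1 subband count concrete, here is a minimal sketch of a pyramidal Haar decomposition in NumPy. The function names are hypothetical, the image is assumed to have even dimensions at every level, and the HL/LH naming follows one common convention (conventions differ between authors), so treat the labels as an assumption.

```python
import numpy as np

def haar_step(img):
    """One level of the orthonormal 2D Haar transform (even-sized input)."""
    a = img[0::2, 0::2]   # top-left pixel of each 2x2 block
    b = img[0::2, 1::2]   # top-right
    c = img[1::2, 0::2]   # bottom-left
    d = img[1::2, 1::2]   # bottom-right
    ll = (a + b + c + d) / 2.0   # low-pass in both directions
    hl = (a - b + c - d) / 2.0   # high-pass across columns
    lh = (a + b - c - d) / 2.0   # high-pass across rows
    hh = (a - b - c + d) / 2.0   # high-pass in both directions
    return ll, hl, lh, hh

def haar_pyramid(img, depth):
    """Return the 3*depth + 1 subbands as a dict, e.g. 'LL4', 'HL1', ..."""
    subbands = {}
    ll = np.asarray(img, dtype=float)
    for level in range(1, depth + 1):
        # Only the LL band is decomposed further at the next level.
        ll, hl, lh, hh = haar_step(ll)
        subbands[f"HL{level}"] = hl
        subbands[f"LH{level}"] = lh
        subbands[f"HH{level}"] = hh
    subbands[f"LL{depth}"] = ll
    return subbands
```

For a 192 x 160 array and depth 4 this yields 13 subbands, with `LL4` of shape 12 x 10; each level halves both dimensions, which is why deep LL subbands give compact feature vectors.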
Face verification (BU)
Static face recognition: 10 grey-scale images selected at random from a video, face area 160x192 pixels.
Histogram equalisation and z-score standardisation of features are applied as a simple, fast light normalisation.
Haar wavelet low-low-4 (LL_4) or low-high (LH) subband coefficients are used as feature vectors.
Other wavelet filters were tested, but Haar is the fastest to compute.
Features modelled by a GMM pdf with only 4 Gaussians, with a UBM for model initialisation and score normalisation.
Training on data from 2 indoor and 2 outdoor recordings from one session; testing on similar data from another session.
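The two normalisation steps named on this slide can be sketched as below. This is a hedged illustration only: the function names are hypothetical, the image is assumed to be 8-bit grey-scale, and applying the z-score per feature vector (rather than per dimension over a training set) is an assumption, since the slide does not say which was used.

```python
import numpy as np

def equalise_histogram(img):
    """Histogram-equalise an 8-bit grey-scale image via its cumulative histogram."""
    img = np.asarray(img, dtype=np.uint8)
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]              # CDF value at the darkest grey level present
    denom = max(img.size - cdf_min, 1)     # guard against constant images
    lut = np.clip(np.round((cdf - cdf_min) / denom * 255), 0, 255).astype(np.uint8)
    return lut[img]                        # map every pixel through the lookup table

def zscore(features):
    """Standardise one feature vector to zero mean and unit standard deviation."""
    f = np.asarray(features, dtype=float)
    centred = f - f.mean()
    std = f.std()
    return centred / std if std > 0 else centred
```

Both steps are single passes over the data, which fits the slide's emphasis on simple and fast processing.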
Fusion (GET INT)
For each modality i: S(i) = log p(Xi|C) - log p(Xi|I), where C is the client model and I the impostor model.
Score fusion was tested by:
Optimal linear weighted sum: Fused-score = Σ_i w(i) * S(i), where the sum is taken over the 3 modalities.
GMM score modelling, i.e. modelling both the client and the impostor joint score pdfs by diagonal-covariance GMMs: Fused-score = log p(S|C) - log p(S|I)
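As a rough sketch of the two formulas on this slide, the code below computes a log-likelihood ratio from two diagonal-covariance GMMs and a linear weighted sum of modality scores. The function names, the `(weights, means, variances)` tuple layout and the averaging of frame scores are assumptions made for illustration; the slide does not specify them.

```python
import numpy as np

def gmm_loglik(x, weights, means, variances):
    """Log-likelihood of one vector under a diagonal-covariance GMM.

    weights: (K,), means and variances: (K, D).
    """
    x = np.asarray(x, dtype=float)
    log_comp = (np.log(weights)
                - 0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)
                - 0.5 * np.sum((x - means) ** 2 / variances, axis=1))
    m = log_comp.max()                       # log-sum-exp for numerical stability
    return float(m + np.log(np.sum(np.exp(log_comp - m))))

def llr_score(frames, client, impostor):
    """S = log p(X|C) - log p(X|I), averaged over frames (averaging is assumed)."""
    return float(np.mean([gmm_loglik(x, *client) - gmm_loglik(x, *impostor)
                          for x in frames]))

def weighted_sum(scores, w):
    """Linear fusion: sum over modalities of w(i) * S(i)."""
    return float(np.dot(w, scores))
```

The GMM-based fusion on the slide has the same shape as `llr_score`: the three modality scores form one vector S, and the fused score is `gmm_loglik(S, *client_scores_gmm) - gmm_loglik(S, *impostor_scores_gmm)`, where both score GMMs are hypothetical names for models trained on development scores.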
User verification system
User requests PDA to verify their identity.
PDA requests user to read the prompt (face in box) and sign their signature.
Feature processing is applied to each modality (silence removal, histogram equalisation, MFCC or Haar wavelets, online CMS, delta features, etc.).
for each modality i: S(i) = log p(Xi|C) - log p(Xi|I)
if S(i) < θ(i) for any i: please repeat
else:
  fused-score = log p(S|C) - log p(S|I)
  if fused-score > φ: user accepted
  else: user rejected
Screenshot: PDA prompt screen showing the digit prompt 7 9 8 5 1 with a start/stop button ("Press to start/stop speaking").
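The decision logic on this slide can be written as a small function. This is a sketch of the control flow only: `verify`, its return strings and the `fuse` callable are hypothetical names, and the thresholds θ(i) and φ would come from development data.

```python
def verify(scores, thetas, fuse, phi):
    """Two-stage decision: per-modality sanity check, then fused threshold.

    scores: per-modality scores S(i); thetas: per-modality thresholds θ(i);
    fuse: callable mapping the score list to one fused score; phi: threshold φ.
    """
    # Any single modality scoring below its own threshold triggers a retry.
    if any(s < t for s, t in zip(scores, thetas)):
        return "repeat"
    # Otherwise the fused score alone decides acceptance.
    return "accept" if fuse(scores) > phi else "reject"
```

For example, `verify([1.0, 0.2, 1.0], [0.5, 0.5, 0.5], sum, 2.0)` asks the user to repeat, because the second modality falls below its threshold before fusion is even attempted.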
Speaking face & Forgery (GET ENST)
Investigated possible attacks and forgery scenarios:
Synthesised voice and face: difficult to create because of synchronisation problems.
Replay attacks: devised a successful attack that uses the client's voice and face images, but not taken from the same video.
Using a coupled HMM for voice and face greatly reduced the effect of this attack.
PDA Database (PDAtabase)
After initial development with many databases [TIMIT(V), CSLU(V), BANCA(V,F), ORL(F), BIOMET(V,F,S), NIST(V)], a CSLU/BANCA-like database was recorded on a Qtek2020 PDA for realistic conditions (sensors, environment).
60 English subjects: 24 for the UBM, 18 for g1, 18 for g2.
Accept/reject threshold optimised on g1 and evaluated on g2, and vice versa.
Video (voice + face): 18 prompts (5-digit, 10-digit and phrase); 3 sessions, with 2 inside and 2 outside recordings per session.
Signatures in one session, with 20 expert impostorisations for each.
Virtual couplings of audio-visual with signature data (independent).
An automatic test script allows many possible configurations to be tested.
The user just provides executables for feature modelling, score generation and score fusion.
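The g1/g2 protocol (threshold chosen on one group, error measured on the other, then swapped) can be sketched as follows. This is an assumption-laden illustration: the slide does not say how the threshold was optimised, so the equal-error criterion used here, the half-total-error summary and all function names are hypothetical.

```python
import numpy as np

def error_rates(client, impostor, thr):
    """FAR and FRR when a score is accepted only if it is strictly above thr."""
    far = float(np.mean(np.asarray(impostor) > thr))   # impostors accepted
    frr = float(np.mean(np.asarray(client) <= thr))    # clients rejected
    return far, frr

def eer_threshold(client, impostor):
    """Pick the observed score at which FAR and FRR are closest (assumed criterion)."""
    candidates = np.sort(np.concatenate([client, impostor]))
    gaps = [abs(far - frr)
            for far, frr in (error_rates(client, impostor, t) for t in candidates)]
    return float(candidates[int(np.argmin(gaps))])

def cross_group_error(g1, g2):
    """Each g is (client_scores, impostor_scores).

    The threshold is set on one group and evaluated on the other, in both
    directions; the two half-total error rates are averaged.
    """
    results = []
    for dev, test in ((g1, g2), (g2, g1)):
        thr = eer_threshold(*dev)
        far, frr = error_rates(*test, thr)
        results.append((far + frr) / 2.0)
    return float(np.mean(results))
```

Fixing the threshold on disjoint subjects keeps the reported error from being tuned on the same people it is measured on.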
Match on Host (MoH): complementarity of modalities
Modality        5 digits   10 digits
Voice (V)       6.1        3.4
Face (F)        28.6       29.9
Signature (S)   6.2
V + F           4.8        3.0
V + S           1.1        0.7
S + F           4.8        4.7
V + F + S       0.9        0.6
Result table with improved results for 5-digit and 10-digit prompts in PDAtabase (SPIE 2006).
For LL subband. Already have improved results for LH subband!
Match on Card (MoC)
Implementation of the MoH system on the SIM card (MoC):
No problem in terms of storage.
But not feasible because of verification time (matching plus host/SIM communication = one hour).
A reduction of the verification time can be attained by:
  reducing the vector size
  reducing the frame rate
  reducing the number of Gaussians of the client and background models
Matching time was still not acceptable.
MoC bottleneck
Not in preprocessing, since this is still all done on the PDA, as in the MoH system.
Not in face: although the feature vectors are large, only a few (10) of them are used in testing and only 4 Gaussians are needed (client model and UBM).
Bottleneck caused by voice and signature data: the vectors are relatively small, but there is a large number of frames and a large number of Gaussians.
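A back-of-the-envelope cost model shows why frame count and Gaussian count dominate. The sketch below simply multiplies vectors by dimensions by Gaussians by models; the function name is hypothetical, and the example figures for face dimension (120) and signature length (300 frames, i.e. 3 seconds at 100 Hz) are illustrative assumptions, not measurements from the project.

```python
def gmm_scoring_ops(n_vectors, dim, n_gauss, n_models=2):
    """Rough multiply-add count for scoring against diagonal-covariance GMMs.

    Every vector is compared with every Gaussian in every dimension, for
    both the client model and the UBM (n_models=2).
    """
    return n_vectors * dim * n_gauss * n_models

# Face: 10 test vectors and 4 Gaussians (from the slides); dim=120 is assumed.
face_ops = gmm_scoring_ops(n_vectors=10, dim=120, n_gauss=4)

# Signature: 19 features and 100 Gaussians (from the slides); 300 frames assumed.
signature_ops = gmm_scoring_ops(n_vectors=300, dim=19, n_gauss=100)
```

Under these assumed figures the signature costs roughly two orders of magnitude more than the face, even though its vectors are much smaller, which is consistent with the bottleneck described above.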
MoC solution
Only a drastic measure can solve the problem: globalised features.
Features represent the whole signature: a single vector of 41 parameters representing correlation and variation in x-y coordinates, velocity and acceleration parameters.
Idea generalised to voice: use of means (cf. Long-Term Average Spectrum) and standard deviations per vector parameter across all frames.
Works well for signature.
Improvement: use up to four equal subparts of the signature/voice signal.
Implementation: 2 equal subparts.
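The voice variant of the globalised features (per-parameter means and standard deviations over all frames, computed on equal subparts) can be sketched as below. The function name and argument names are hypothetical; the 41-parameter signature vector with correlation terms is a different construction and is not reproduced here.

```python
import numpy as np

def globalise(frames, n_parts=2, use_std=True):
    """Collapse a (T, D) frame sequence into one fixed-length vector.

    The sequence is split into n_parts nearly equal subparts; each subpart
    contributes its per-parameter mean and, optionally, standard deviation.
    """
    frames = np.asarray(frames, dtype=float)
    stats = []
    for part in np.array_split(frames, n_parts, axis=0):
        stats.append(part.mean(axis=0))
        if use_std:
            stats.append(part.std(axis=0))
    return np.concatenate(stats)
```

With two subparts and both statistics, a sequence of D-dimensional frames of any length becomes a single vector of 4 * D values, so the card scores one vector per modality instead of hundreds of frames.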
MoC-emulated results
EER (percent) for globalised means (columns 2-5) and means plus standard deviations (columns 6-9), for voice and signature divided into two equal subparts.
Global feat.   Means only                      Means + sd
#Gauss.        1       2       4       8       1       2       4       8
Voice          22.13   21.09   20.87   21.86   20.88   19.72   17.68   18.49
Face           32.26   31.78   29.06   29.19   32.26   31.78   29.06   29.19
Signature      38.29   27.58   22.58   17.86   28.14   22.16   17.59   16.45
Fused          12.89   12.48   10.49   9.32    12.56   10.48   8.28    9.15
Solving the capacity problem
Possible options for improving the performance of the SecurePhone:
Use match-on-server (MoS): security and privacy concerns.
Implement the biometric recogniser and encryption on a chip (more costly than the current solution).
Build a secure PDA with sufficient storage and processing power (a dedicated device that would be more costly and less ubiquitous).
Split matching (hybrid MoC/MoH): considered but not implemented. Initial work is being done and results are encouraging. Promising implications for the security and privacy of biometric data (templates/models) without cryptography.
Conclusion and Future Work
Natural, non-intrusive biometrics guarantee high user acceptance.
Biometric data never leave the SIM card: high security.
Fusion of multiple streams of a single trait can lead to improved performance (a pilot for face was tested but not implemented in SP).
MoH is efficient with high accuracy, but vulnerable.
MoC is secure, but efficiency and high accuracy cannot be achieved together!
Future work includes:
  Designing hybrid mixed client-server matching.
  Investigating the privacy and security of biometric data using cancellable biometrics, especially for “Match on Server”.
  Improving the performance of single modalities through multi-classifier and multi-stream strategies, e.g. face by mixing a larger number of subbands at different depths.
Acknowledgement
Thanks to the EU for funding this research through the SecurePhone (IST-2002-506883) project.