June 28th, 2004 BioSecure, SecurePhone 1 Automatic Speaker Verification : Technologies, Evaluations and Possible Future Gérard CHOLLET CNRS-LTCI, GET-ENST.

Slide 1: Automatic Speaker Verification: Technologies, Evaluations and Possible Future
Gérard CHOLLET, CNRS-LTCI, GET-ENST
Biometrics in Current Security Environments

Slide 2: Outline
- State of affairs (tasks, security, forensic, ...)
- Speaker characteristics in the speech signal
- Automatic Speaker Verification:
  - Decision theory
  - Text dependent / Text independent
  - Imposture (occasional, dedicated)
  - Voice transformations
- Audio-visual speaker verification
- Evaluations (algorithms, field tests, ergonomics, ...)
- Conclusions, Perspectives

Slide 3: Why should a computer recognize who is speaking?
- Protection of individual property (home, bank account, personal data, messages, mobile phone, PDA, ...)
- Limited access (secured areas, databases)
- Personalization (only respond to its master's voice)
- Locating a particular person in an audio-visual document (information retrieval)
- Who is speaking in a meeting?
- Is a suspect the criminal? (forensic applications)

Slide 4: Tasks in Automatic Speaker Recognition
- Speaker verification (voice biometric)
  - Are you really who you claim to be?
- Identification (speaker ID):
  - Is this speech segment coming from a known speaker?
  - How large is the set of speakers (population of the world)?
- Speaker detection, segmentation, indexing, retrieval, tracking:
  - Looking for recordings of a particular speaker
- Combining speech and speaker recognition
  - Adaptation to a new speaker, speaker typology
  - Personalization in dialogue systems

Slide 5: Applications
- Access control
  - Physical facilities, computer networks, websites
- Transaction authentication
  - Telephone banking, e-commerce
- Speech data management
  - Voice messaging, search engines
- Law enforcement
  - Forensics, home incarceration

Slide 6: Voice Biometric
- Advantages
  - Often the only modality available over the telephone
  - Low cost (microphone, A/D converter), ubiquity
  - Possible integration on a smart (SIM) card
  - Natural bimodal fusion: speaking face
- Disadvantages
  - Lack of discretion
  - Possibility of imitation and electronic imposture
  - Lack of robustness to noise, distortion, ...
  - Temporal drift

Slide 7: Speaker Identity in Speech
- Differences in
  - Vocal tract shapes and muscular control
  - Fundamental frequency (typical values): 100 Hz (male), 200 Hz (female), 300 Hz (child)
  - Glottal waveform
  - Phonotactics
  - Lexical usage
- The difference between the voices of twins is a limiting case
- Voices can also be imitated or disguised

Slide 8: Speaker Identity
[Figure: spectral envelope of /i:/, amplitude A versus frequency f, for Speaker A and Speaker B]
- Segmental factors (~30 ms)
  - Glottal excitation: fundamental frequency, amplitude, voice quality (e.g., breathiness)
  - Vocal tract: characterized by its transfer function and represented by MFCCs (Mel-Frequency Cepstral Coefficients)
- Suprasegmental factors
  - Speaking speed (timing and rhythm of speech units)
  - Intonation patterns
  - Dialect, accent, pronunciation habits

Slide 9: Acoustic features
- Short-term spectral analysis
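The slide names short-term spectral analysis without detail, so the following is only a minimal sketch of the general idea: cut the signal into short overlapping frames, apply a window, and take the magnitude of the discrete Fourier transform of each frame. The function names, the Hamming window, and the frame and hop lengths are assumptions chosen for illustration, not values taken from the presentation.

```python
import cmath
import math


def hamming(n):
    """Hamming window of length n (an assumed, commonly used window)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def frame_signal(signal, frame_len, hop):
    """Split a signal into overlapping frames; an incomplete tail is dropped."""
    return [signal[start:start + frame_len]
            for start in range(0, len(signal) - frame_len + 1, hop)]


def magnitude_spectrum(frame):
    """Magnitude of the DFT of one windowed frame, for bins 0..N/2."""
    n = len(frame)
    windowed = [v * w for v, w in zip(frame, hamming(n))]
    return [abs(sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2 + 1)]


def short_term_spectra(signal, frame_len, hop):
    """One magnitude spectrum per frame."""
    return [magnitude_spectrum(f) for f in frame_signal(signal, frame_len, hop)]


# Usage: a 1000 Hz tone sampled at 8 kHz, 30 ms frames with a 10 ms hop.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(800)]
spectra = short_term_spectra(tone, frame_len=240, hop=80)
```

With these settings each frequency bin is 8000 / 240 Hz wide, so the tone falls on bin 30. Cepstral features such as MFCCs would be computed from spectra like these in further steps that are not shown here.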

Slide 10: Intra- and inter-speaker variability

Slide 11: Speaker Verification
Typology of approaches (EAGLES Handbook)
- Text dependent
  - Public password
  - Private password
  - Customized password
- Text prompted
- Text independent
Incremental enrolment
Evaluation

Slide 12: History of Speaker Recognition

Slide 13: Current approaches

Slide 14: HMM structure depends on the application

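Slide 15: Gaussian Mixture Model
- Parametric representation of the probability distribution of observations: $p(x \mid \lambda) = \sum_{i=1}^{M} w_i \, \mathcal{N}(x; \mu_i, \Sigma_i)$, with mixture weights $w_i \ge 0$ and $\sum_{i=1}^{M} w_i = 1$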
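As a rough illustration of how a Gaussian mixture scores an observation, the sketch below evaluates the log of a weighted sum of Gaussian densities. It assumes diagonal covariances (a common simplification, not stated on the slide), and all function and parameter names are hypothetical.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def gmm_log_likelihood(x, weights, means, variances):
    """log p(x) for a mixture, computed with log-sum-exp for stability."""
    logs = [math.log(w) + log_gaussian_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)]
    top = max(logs)
    return top + math.log(sum(math.exp(value - top) for value in logs))


def average_log_likelihood(frames, weights, means, variances):
    """Mean per-frame log-likelihood of a sequence of feature vectors."""
    total = sum(gmm_log_likelihood(x, weights, means, variances) for x in frames)
    return total / len(frames)
```

Averaging over frames gives one score per utterance that does not grow with utterance length, which makes scores from recordings of different durations comparable.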

Slide 16: Gaussian Mixture Models (8 Gaussians per mixture)

Slide 17: Decision theory for identity verification
- Two types of errors:
  - False rejection (a client is rejected)
  - False acceptance (an impostor is accepted)
- Decision theory: given an observation O and a claimed identity
  - H0 hypothesis: it comes from an impostor
  - H1 hypothesis: it comes from our client
  - H1 is chosen if and only if P(H1|O) > P(H0|O), which can be rewritten (using Bayes' law) as $\frac{P(O \mid H_1)}{P(O \mid H_0)} > \frac{P(H_0)}{P(H_1)}$
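The decision rule above can be sketched in the log domain: the log-likelihood ratio between the client and impostor hypotheses is compared with a threshold derived from the priors. The function names and the way the two log-likelihoods are obtained (for example from a client model and an impostor or world model) are assumptions made for illustration.

```python
import math


def log_likelihood_ratio(loglik_client, loglik_impostor):
    """log P(O|H1) - log P(O|H0)."""
    return loglik_client - loglik_impostor


def bayes_threshold(prior_client):
    """log(P(H0) / P(H1)) when P(H1) = prior_client and P(H0) = 1 - prior_client."""
    return math.log((1.0 - prior_client) / prior_client)


def accept(loglik_client, loglik_impostor, prior_client=0.5):
    """Choose H1 (accept the claim) iff the log ratio exceeds the prior-based threshold."""
    ratio = log_likelihood_ratio(loglik_client, loglik_impostor)
    return ratio > bayes_threshold(prior_client)
```

With equal priors the threshold is zero, so the claim is accepted whenever the client model explains the observation better than the impostor model. A lower client prior raises the threshold, and the same evidence may then be rejected.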

Slide 18: Signal detection theory

Slide 19: Decision

Slide 20: Distribution of scores

Slide 21: Detection Error Tradeoff (DET) Curve
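A DET curve plots the false rejection rate against the false acceptance rate as the decision threshold varies. The sketch below only computes those operating points and an approximate equal error rate from two lists of scores. It does not draw the curve or apply the normal-deviate axis scaling that DET plots usually use, and all names and the sample scores are hypothetical.

```python
def error_rates(client_scores, impostor_scores, threshold):
    """Return (false_rejection_rate, false_acceptance_rate) at one threshold.

    A trial is accepted when its score is greater than or equal to the threshold.
    """
    fr = sum(s < threshold for s in client_scores) / len(client_scores)
    fa = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
    return fr, fa


def det_points(client_scores, impostor_scores):
    """Operating points (threshold, FR, FA) at every observed score value."""
    thresholds = sorted(set(client_scores) | set(impostor_scores))
    return [(t,) + error_rates(client_scores, impostor_scores, t)
            for t in thresholds]


def equal_error_rate(client_scores, impostor_scores):
    """Approximate EER: the operating point where FR and FA are closest."""
    _, fr, fa = min(det_points(client_scores, impostor_scores),
                    key=lambda p: abs(p[1] - p[2]))
    return (fr + fa) / 2


# Usage with made-up scores (higher means more client-like).
clients = [2.0, 3.0, 4.0, 5.0]
impostors = [0.0, 1.0, 2.5, 3.5]
eer = equal_error_rate(clients, impostors)
```

Raising the threshold trades false acceptances for false rejections. The equal error rate summarizes the curve with the single point where the two rates meet.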

Slide 22: Evaluation
- Decision cost (FA, FR, priors, costs, ...)
- Receiver Operating Characteristic (ROC) curve
- Reference systems (open software)
- Evaluations (algorithms, field trials, ergonomics, ...)
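The decision cost named in the first bullet combines the two error rates with their costs and with the prior probability of a genuine client. The sketch below shows one common way to write such a cost. The function name and the default cost and prior values are placeholders chosen for illustration and are not taken from the slides.

```python
def detection_cost(fr_rate, fa_rate, c_fr=10.0, c_fa=1.0, p_client=0.01):
    """Expected cost of a decision at one operating point.

    fr_rate:  false rejection rate (clients rejected)
    fa_rate:  false acceptance rate (impostors accepted)
    c_fr:     cost of one false rejection (placeholder value)
    c_fa:     cost of one false acceptance (placeholder value)
    p_client: prior probability that a claim is genuine (placeholder value)
    """
    return c_fr * p_client * fr_rate + c_fa * (1.0 - p_client) * fa_rate


# Usage: 10% false rejections and 5% false acceptances.
cost = detection_cost(0.10, 0.05)
```

Because the priors and costs weight the two error types differently, the threshold that minimizes this cost generally differs from the equal error rate threshold.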

Slide 23: National Institute of Standards & Technology (NIST) Speaker Verification Evaluations
- Annual evaluation since 1995
- Common paradigm for comparing technologies

Slide 24: NIST evaluations: Results

Slide 25: Combining Speech Recognition and Speaker Verification
- Speaker-independent phone HMMs
- Selection of segments or segment classes which are speaker specific
- Preliminary evaluations are performed on the NIST extended data set (one hour of training data per speaker)

Slide 26: ALISP data-driven speech segmentation

Slide 27: Searching in client and world speech dictionaries for speaker verification purposes

Slide 28: Fusion
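The slide does not say which fusion method was used. As one possible illustration only, the sketch below fuses scores from several verification systems at the score level: each score is first normalized with statistics assumed to come from development data, then combined by a weighted sum. All names and parameters are hypothetical.

```python
def z_normalize(score, mean, std):
    """Map a raw score to zero mean and unit variance using development-set statistics."""
    return (score - mean) / std


def fuse_scores(scores, weights):
    """Weighted sum of already-normalized scores, one per system."""
    if len(scores) != len(weights):
        raise ValueError("one weight is needed per score")
    return sum(w * s for w, s in zip(weights, scores))


# Usage: two systems with equal weights.
fused = fuse_scores([z_normalize(5.0, 3.0, 2.0), z_normalize(0.5, 0.0, 0.5)],
                    [0.5, 0.5])
```

Normalizing first keeps a system with a wide score range from dominating the sum. The fused score is then compared with a threshold in the same way as a single-system score.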

Slide 29: Fusion results

Slide 30: Speaking Faces: Motivations
- A person speaking in front of a camera offers two modalities for identity verification (speech and face).
- The sequence of face images and the synchronisation of speech and lip movements could be exploited.
- Imposture is much more difficult than with single modalities.
- Many PCs, PDAs and mobile phones are equipped with a camera.
- Audio-visual identity verification will offer non-intrusive security for e-commerce, e-banking, ...

Slide 31: Talking Face Recognition (hybrid verification)

Slide 32: Lip features
- Tracking lip movements

Slide 33: A talking face model
- Using Hidden Markov Models (HMMs)
[Figure labels: acoustic parameters, visual parameters]

Slide 34: Morphing, avatars

Slide 35: Conclusions, Perspectives
- Deliberate imposture is a challenge for speech-only systems
- Verification of identity based on features extracted from talking faces should be developed
- Common databases and evaluation protocols are necessary
- Free access to reference systems will facilitate future developments