Speaker Recognition UNIT -6

Introduction
• Speaker recognition is the process of automatically recognizing who is speaking on the basis of information obtained from the speech signal.
• This technique makes it possible to verify the identity of persons accessing systems, that is, to provide access control by voice in various services.
• These services include voice dialing, banking transactions over the telephone network, telephone shopping, database access services, information and reservation systems, voice mail, security control for confidential information areas, and remote access to computers.
• Speaker recognition is probably the only biometric that can easily be tested remotely through the telephone network. This makes it valuable in many real applications, and it is likely to become more popular in the future.

Types of speaker recognition
• Speaker recognition is divided into speaker verification and speaker identification.
• In speaker verification, the user claims an identity, and the decision required of the verification system is strictly binary: accept or reject the claimed identity.
• Speaker identification is the process of determining which speaker in a group of known speakers most closely matches the unknown speaker.
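The difference between the two tasks can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: each speaker is represented by a single fixed-length feature vector, similarity is measured with cosine similarity, and the threshold of 0.8 and the names `verify`, `identify`, and `enrolled` are hypothetical. The slides do not prescribe a particular feature or scoring method.

```python
import math

def cosine_score(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def verify(test_vec, claimed_model, threshold=0.8):
    """Verification: binary accept/reject of one claimed identity.

    The threshold value is an arbitrary illustrative choice.
    """
    return cosine_score(test_vec, claimed_model) >= threshold

def identify(test_vec, enrolled):
    """Identification: return the best-matching speaker among the known ones."""
    return max(enrolled, key=lambda name: cosine_score(test_vec, enrolled[name]))

# Toy enrolled models (hypothetical values, not real speech features).
enrolled = {"alice": [1.0, 0.1, 0.0], "bob": [0.0, 1.0, 0.2]}
test_vec = [0.9, 0.2, 0.0]

print(identify(test_vec, enrolled))          # closest known speaker
print(verify(test_vec, enrolled["alice"]))   # claim "alice" is checked
print(verify(test_vec, enrolled["bob"]))     # claim "bob" is checked
```

Verification compares the test utterance against one model and a threshold, while identification compares it against every enrolled model and needs no threshold when the speaker is assumed to be in the known group.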

Cont’d
• Recognition systems are also divided, according to the data they use, into text-dependent and text-independent systems.
• In text-dependent systems, the speaker is required to provide utterances with the same text for both training and recognition.
• Text-independent systems allow the user to utter any text.
• Modern research targets text-independent speaker recognition systems.

Block Diagram of speaker recognition

Cont’d

A block schematic diagram of pattern recognition is presented.
• Template approach
• Stochastic approach

Template approach

Cont’d
• Template-based approaches to speech recognition have provided a family of techniques that have advanced the field considerably during the last six decades.
• The underlying idea is simple.
• A collection of prototypical speech patterns is stored as reference patterns representing the dictionary of candidate words.
• Recognition is then carried out by matching an unknown spoken utterance with each of these reference templates and selecting the category of the best-matching pattern.
• Usually, templates for entire words are constructed.
• This has the advantage that errors due to segmentation or classification of smaller, acoustically more variable units such as phonemes can be avoided.

Cont’d
• In turn, each word must have its own full reference template; template preparation and matching become prohibitively expensive or impractical as vocabulary size increases beyond a few hundred words.
• One key idea in the template method is to derive a typical sequence of speech frames for a pattern (a word) via some averaging procedure, and to rely on local spectral distance measures to compare patterns.
• Another key idea is to use some form of dynamic programming to temporally align patterns, to account for differences in speaking rates across talkers as well as across repetitions of the word by the same talker.
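The dynamic-programming alignment used in template matching is commonly realized as dynamic time warping (DTW). The following is a minimal sketch, assuming one-dimensional toy frames, squared Euclidean distance as the local spectral distance, and the basic unconstrained step pattern; the function names and the `templates` dictionary are hypothetical, and practical systems typically add slope constraints and path-length normalization.

```python
def frame_distance(a, b):
    """Local spectral distance: squared Euclidean distance between two frames."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def dtw_distance(seq_a, seq_b):
    """Accumulated cost of the best temporal alignment of two frame sequences."""
    inf = float("inf")
    n, m = len(seq_a), len(seq_b)
    # cost[i][j] = best cost of aligning the first i frames of seq_a
    # with the first j frames of seq_b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq_a[i - 1], seq_b[j - 1])
            # Allowed moves: advance in seq_a only, seq_b only, or both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def recognize(utterance, templates):
    """Select the word whose reference template aligns with the lowest cost."""
    return min(templates, key=lambda word: dtw_distance(utterance, templates[word]))

# Toy reference templates: one-dimensional "frames" with made-up values.
templates = {
    "yes": [[1.0], [2.0], [3.0]],
    "no": [[3.0], [2.0], [1.0]],
}
# The same pattern as "yes", spoken at half the rate.
slow_yes = [[1.0], [1.0], [2.0], [2.0], [3.0], [3.0]]

print(recognize(slow_yes, templates))
```

Because the alignment may repeat a template frame, the slower utterance still matches its template exactly, which is the speaking-rate compensation the slide describes.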

Stochastic Approach
• Stochastic modelling entails the use of probabilistic models to deal with uncertain or incomplete information.
• In speech recognition, uncertainty and incompleteness arise from many sources, for example confusable sounds, speaker variability, contextual effects, and homophones.
• Thus, stochastic models are a particularly suitable approach to speech recognition.
• The most popular stochastic approach today is hidden Markov modelling.

Cont’d
• A hidden Markov model (HMM) is characterized by a finite-state Markov chain and a set of output distributions.
• The transition parameters of the Markov chain model temporal variability, while the parameters of the output distributions model spectral variability.
• These two types of variability are the essence of speech recognition.
• Compared to the template-based approach, hidden Markov modelling is more general and has a firmer mathematical foundation.
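How the two parameter sets combine into a score can be sketched with the forward algorithm, which computes the likelihood of an observation sequence given a model. This is a minimal sketch under stated assumptions: the outputs are discrete symbols (speech systems usually use continuous densities), the two-state left-to-right model and all probability values are made up for illustration, and the name `forward_likelihood` is hypothetical.

```python
def forward_likelihood(obs, start_p, trans_p, emit_p):
    """P(obs | model) for a discrete-output HMM, via the forward algorithm."""
    n_states = len(start_p)
    # alpha[s] = probability of the observations so far, ending in state s.
    alpha = [start_p[s] * emit_p[s][obs[0]] for s in range(n_states)]
    for o in obs[1:]:
        alpha = [
            # Transition parameters account for timing (which state follows which);
            # output distributions account for what is observed in each state.
            sum(alpha[r] * trans_p[r][s] for r in range(n_states)) * emit_p[s][o]
            for s in range(n_states)
        ]
    return sum(alpha)

# Hypothetical two-state left-to-right model with two output symbols (0 and 1).
start_p = [1.0, 0.0]
trans_p = [[0.5, 0.5],   # state 0: stay or move on to state 1
           [0.0, 1.0]]   # state 1: final state, stays there
emit_p = [[0.9, 0.1],    # state 0 mostly emits symbol 0
          [0.2, 0.8]]    # state 1 mostly emits symbol 1

matching = forward_likelihood([0, 0, 1, 1], start_p, trans_p, emit_p)
reversed_order = forward_likelihood([1, 1, 0, 0], start_p, trans_p, emit_p)
print(matching > reversed_order)
```

A sequence whose symbols arrive in the order the model expects receives a much higher likelihood than the same symbols in reverse order, showing that the model captures both what is observed and when.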

Cont’d
• A template-based model is simply a continuous-density HMM with identity covariance matrices and a slope-constrained topology. Although templates can be trained on fewer instances, they lack the probabilistic formulation of full HMMs and typically underperform them.
• Compared to knowledge-based approaches, HMMs enable easy integration of knowledge sources into a compiled architecture.
• A negative side effect of this is that HMMs do not provide much insight into the recognition process.
• As a result, it is often difficult to analyze the errors of an HMM system in an attempt to improve its performance.
• Nevertheless, prudent incorporation of knowledge has significantly improved HMM-based systems.
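The claim that a template is a special case of a continuous-density HMM can be made concrete with a short derivation. The symbols are assumptions introduced here for illustration: x_t is a d-dimensional observed frame and \mu_j is the template frame attached to state j. With an identity covariance matrix, the Gaussian output density gives

```latex
-\log \mathcal{N}(x_t;\, \mu_j,\, I)
  = \tfrac{1}{2}\,\lVert x_t - \mu_j \rVert^{2} + \tfrac{d}{2}\,\log(2\pi)
```

The second term does not depend on the state, so maximizing the output likelihood is the same as minimizing the squared Euclidean distance between the observed frame and the template frame. If, as a further simplifying assumption, all allowed transitions are treated as equally likely, the best state sequence through such an HMM is the minimum-distance alignment path that template matching finds by dynamic programming.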

Artificial Intelligence approach (Knowledge Based approach)
• The Artificial Intelligence approach is a hybrid of the acoustic-phonetic approach and the pattern recognition approach.
• It exploits the ideas and concepts of both acoustic-phonetic and pattern recognition methods.
• The knowledge-based approach uses information from linguistics, phonetics, and spectrograms.
• Some speech researchers developed recognition systems that used acoustic-phonetic knowledge to develop classification rules for speech sounds.
• While template-based approaches have been very effective in the design of a variety of speech recognition systems, they provide little insight into human speech processing, which makes error analysis and knowledge-based system enhancement difficult.

Cont’d
• On the other hand, a large body of linguistic and phonetic literature provides insight into human speech processing.
• In its pure form, knowledge engineering design involves the direct and explicit incorporation of experts' speech knowledge into a recognition system.
• This knowledge is usually derived from careful study of spectrograms and is incorporated using rules or procedures.
• Pure knowledge engineering was also motivated by the interest and research in expert systems.
• However, this approach had only limited success, largely due to the difficulty of quantifying expert knowledge.