Sharat.S.Chikkerur S.Anand Mantravadi Rajeev.K.Srinivasan

Presentation transcript:

Speaker Recognition
Sharat S. Chikkerur, S. Anand Mantravadi, Rajeev K. Srinivasan

Speaker Recognition
Definition: Speaker recognition is the method of recognizing a person based on their voice. It is one form of biometric identification and depends on speaker-dependent characteristics.
Categories: Speaker Identification, Speaker Verification, Speaker Detection.
Modes: Text Dependent, Text Independent.
EE 516 Term Project, Fall 2003

Speech production
Speech production mechanism (figure).
Speech production model (block diagram): Impulse Train Generator (controlled by Pitch), Glottal Pulse Model G(z) with gain Av, Noise source with gain AN, Vocal Tract V(z), Radiation R(z).

Generic Speaker Recognition System
Verification: Speech signal -> Preprocessing (Analysis Frames) -> Feature Extraction (Feature Vector) -> Pattern Matching -> Score.
Enrollment: Preprocessing -> Feature Extraction -> Speaker Model.
Preprocessing: A/D Conversion, End point detection, Pre-emphasis filter, Segmentation.
Features: LAR, Cepstrum, LPCC, MFCC.
Stochastic Models: GMM, HMM.
Template Models: DTW, Distance Measures.
Choice of features: Differentiating factors between speakers include vocal tract shape and behavioral traits. Features should have high inter-speaker and low intra-speaker variation.

Our Approach
Pipeline stages: Preprocessing, Feature Extraction, Speaker model, Matching.
Components shown: Silence Removal, Cepstrum Coefficients, Cepstral Normalization, Long time average, Polynomial Function Expansion, Dynamic Time Warping, Distance Computation, Reference Template.

Silence Removal

Pre-emphasis
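The slide names the pre-emphasis step without giving the filter. A minimal sketch, assuming the common first-order form y[n] = x[n] - alpha * x[n-1]; the function name `pre_emphasis` and the default `alpha=0.95` are illustrative assumptions, not values from the presentation.

```python
def pre_emphasis(signal, alpha=0.95):
    """Boost high frequencies with y[n] = x[n] - alpha * x[n-1].

    alpha is an assumed, typical coefficient; the slides do not state one.
    The first sample is passed through unchanged.
    """
    if not signal:
        return []
    out = [signal[0]]
    for n in range(1, len(signal)):
        out.append(signal[n] - alpha * signal[n - 1])
    return out
```

A constant input is strongly attenuated after the first sample, which is the expected behavior of a filter that suppresses low frequencies relative to high ones.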

Segmentation
Short time analysis: The speech signal is segmented into overlapping "Analysis Frames". The speech signal is assumed to be stationary within each frame.
Figure labels: Q31, Q32, Q33, Q34.
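The segmentation into overlapping analysis frames can be sketched as follows. The function name `split_into_frames` and the parameters `frame_len` and `hop` are hypothetical; the slides do not give frame sizes, so any values used with it are assumptions.

```python
def split_into_frames(signal, frame_len, hop):
    """Cut a signal into frames of frame_len samples, advancing by hop samples.

    Frames overlap whenever hop < frame_len. Trailing samples that do not
    fill a whole frame are dropped (a simplifying assumption).
    """
    if frame_len <= 0 or hop <= 0:
        raise ValueError("frame_len and hop must be positive")
    frames = []
    start = 0
    while start + frame_len <= len(signal):
        frames.append(signal[start:start + frame_len])
        start += hop
    return frames
```

For example, `split_into_frames(samples, 4, 2)` yields frames that each share half of their samples with the next frame.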

Feature Representation
Figure: speech signal and spectrum of two users uttering "ONE".

Vocal Tract modeling
Figure panels: Signal Spectrum; Smoothed Signal Spectrum.
The smoothed spectrum indicates the locations of the formants of each user. The smoothed spectrum is obtained from the cepstral coefficients.

Cepstral coefficients
Speech production model: P[n], G(z), V(z), R(z), u[n]; Pitch, Av, AN.
Homomorphic system (here * denotes convolution): x1[n]*x2[n] -> D[] -> x1'[n]+x2'[n] -> L[] -> y1'[n]+y2'[n] -> D-1[] -> y1[n]*y2[n]
Characteristic system D[]: x1[n]*x2[n] -> DFT[] -> X1(z)X2(z) -> LOG[] -> log(X1(z)) + log(X2(z)) -> IDFT[] -> x1'[n]+x2'[n]
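The DFT -> LOG -> IDFT chain can be illustrated with a small sketch. It computes the real cepstrum, which takes the log of the spectral magnitude only; this is a simplification of the complex logarithm shown in the diagram. The names `dft`, `idft`, `real_cepstrum` and the `floor` parameter are assumptions for illustration, and the direct O(N^2) transform is used only to keep the example self-contained.

```python
import cmath
import math


def dft(x):
    """Direct discrete Fourier transform (O(N^2), for illustration only)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Direct inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def real_cepstrum(frame, floor=1e-12):
    """Real cepstrum of one frame: IDFT(log|DFT(frame)|).

    floor guards against log(0); its value is an arbitrary assumption.
    """
    log_magnitude = [math.log(max(abs(v), floor)) for v in dft(frame)]
    return [c.real for c in idft(log_magnitude)]
```

Because the logarithm turns the product X1(z)X2(z) into a sum, the slowly varying vocal tract contribution ends up in the low-order coefficients, which is why a few leading coefficients are enough to describe the smoothed spectrum.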

Speaker Model
F1 = [a1…a10, b1…b10]
F2 = [a1…a10, b1…b10]
…
FN = [a1…a10, b1…b10]
…

Dynamic Time Warping
The DTW warping path in the n-by-m matrix is the path with the minimum average cumulative cost. The unmarked area is the constraint region within which the path is allowed to go.
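A minimal sketch of the DTW cost computation over two sequences of feature vectors. It assumes a Euclidean local distance, the three standard step directions, and normalization of the cumulative cost by n + m; none of these choices is stated on the slide. The path constraint region from the figure is omitted, so every cell of the n-by-m matrix is reachable. The names `euclidean` and `dtw_distance` are hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def dtw_distance(seq_a, seq_b):
    """Length-normalized cumulative cost of the best warping path.

    Assumptions: unconstrained path, steps (i-1, j), (i, j-1), (i-1, j-1),
    and normalization by len(seq_a) + len(seq_b).
    """
    n, m = len(seq_a), len(seq_b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")
    inf = float("inf")
    # cost[i][j] = minimum cumulative cost of aligning seq_a[:i] with seq_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = euclidean(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m] / (n + m)
```

With this symmetric step pattern, swapping the two sequences transposes the cost matrix without changing the final value, so the resulting distance is symmetric.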

Results
- Distances are normalized with respect to the length of the speech signal.
- Intra-speaker distance is less than inter-speaker distance.
- The distance matrix is symmetric.

Matlab Implementation

THANK YOU