Speaker Recognition Sharat.S.Chikkerur Center for Unified Biometrics and Sensors


2 Speaker Recognition Sharat.S.Chikkerur Center for Unified Biometrics and Sensors http://www.cubs.buffalo.edu

3 Speech Fundamentals
Characterizing speech:
- Content (speech recognition)
- Signal representation (vocoding): waveform, or parametric (excitation, vocal tract)
- Signal analysis (gender determination, speaker recognition)
Terminology:
- Phonemes: the basic discrete units of speech. They are language specific; English has around 42 phonemes.
- Types of speech: voiced speech, unvoiced speech (fricatives), plosives
- Formants

4 Speech Production
[Figure: speech production mechanism; length label: 17 cm]
Speech production model (block diagram):
- Impulse train generator (driven by the pitch) feeding the glottal pulse model G(z), with gain A_v
- Noise source, with gain A_N
- Vocal tract model V(z)
- Radiation model R(z)

5 Nature of speech Spectrogram

6 Vocal Tract Modeling
[Figure panels: signal spectrum; smoothed signal spectrum]
- The smoothed spectrum indicates the locations of each user's formants.
- The smoothed spectrum is obtained from the cepstral coefficients.

7 Parametric Representations: Formants
Formant frequencies:
- Characterize the frequency response of the vocal tract
- Are used to characterize vowels
- Can be used to determine gender

8 Parametric Representations: LPC
Linear predictive coefficients:
- Used in vocoding
- Used in spectral estimation
[Figure; surviving labels: 5, 2, 20, 40, 200]
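
The slide does not say how the linear predictive coefficients are computed. One common route is the autocorrelation method solved with the Levinson-Durbin recursion, sketched below in plain Python. The function names `autocorr` and `levinson_durbin` and the predictor convention x[n] ≈ sum_k a[k] x[n-k] are assumptions made for illustration, not the presenter's implementation.

```python
def autocorr(x, max_lag):
    """Autocorrelation r[0..max_lag] of one analysis frame."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelation values r.

    Returns (a, err): a[k-1] is the weight on x[n-k] in the predictor
    x[n] ~ sum_k a[k] * x[n-k], and err is the residual prediction energy.
    """
    a = [0.0] * (order + 1)          # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0:
            raise ValueError("autocorrelation sequence is not positive definite")
        # Reflection coefficient for model order i.
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)
    return a[1:], err
```

For example, `levinson_durbin(autocorr(frame, 10), 10)` would give a 10th-order model of a frame; the order 10 here is only an illustrative choice.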

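9 Parametric Representations: Cepstrum
Speech production model (block diagram): P[n] (pitch) into G(z), gain A_v; noise u[n], gain A_N; then V(z) and R(z)
Homomorphic system (* denotes convolution):
x1[n] * x2[n] -> D[] -> x1'[n] + x2'[n] -> L[] -> y1'[n] + y2'[n] -> D^-1[] -> y1[n] * y2[n]
Characteristic system D[]:
x1[n] * x2[n] -> DFT[] -> X1(z) X2(z) -> LOG[] -> log(X1(z)) + log(X2(z)) -> IDFT[] -> x1'[n] + x2'[n]
[Figure; surviving labels: 5, 10, 40]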
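
The DFT, LOG, IDFT chain on this slide can be sketched in a few lines. The version below computes the real cepstrum (log of the spectral magnitude, phase discarded), which is a simplifying assumption; the slide does not say whether the real or complex cepstrum is used. The naive O(N^2) transforms and the names `dft`, `idft` and `real_cepstrum` are illustrative only.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform of a sequence."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * m * k / n) for k in range(n))
            for m in range(n)]


def idft(X):
    """Naive inverse discrete Fourier transform."""
    n = len(X)
    return [sum(X[m] * cmath.exp(2j * cmath.pi * m * k / n) for m in range(n)) / n
            for k in range(n)]


def real_cepstrum(x, floor=1e-12):
    """Real cepstrum: IDFT of the log magnitude of the DFT.

    `floor` guards against log(0) when a spectral bin is exactly zero.
    """
    log_mag = [math.log(max(abs(v), floor)) for v in dft(x)]
    return [c.real for c in idft(log_mag)]
```

Because the logarithm turns a product of spectra into a sum, the cepstrum of a (circular) convolution of two sequences equals the sum of their cepstra, which is the property the block diagram expresses.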

10 Speaker Recognition
Definition:
- It is the method of recognizing a person based on their voice.
- It is one of the forms of biometric identification.
- It depends on speaker-dependent characteristics.
Taxonomy (from the diagram): speaker recognition divides into speaker identification, speaker verification, and speaker detection; the diagram further splits tasks into text dependent and text independent.

11 Generic Speaker Recognition System
Verification: Preprocessing -> Feature Extraction -> Pattern Matching
Enrollment: Preprocessing -> Feature Extraction -> Speaker Model
Data flow: speech signal -> analysis frames -> feature vector -> score
Stages:
- Preprocessing: A/D conversion, end point detection, pre-emphasis filter, segmentation
- Feature extraction: LAR, cepstrum, LPCC, MFCC
- Stochastic models: GMM, HMM
- Template models: DTW, distance measures
Choice of features:
- Differentiating factors between speakers include vocal tract shape and behavioral traits.
- Features should have high inter-speaker and low intra-speaker variation.

12 Our Approach
Pipeline stages: Preprocessing, Feature Extraction, Speaker Model, Matching
Blocks in the diagram:
- Silence removal
- Cepstrum coefficients
- Cepstral normalization
- Long time average
- Polynomial function expansion
- Dynamic time warping
- Distance computation
- Reference template
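
The slide names cepstral normalization and a long time average without giving formulas. A common reading, assumed here and not confirmed by the slides, is cepstral mean normalization: subtract the long-time average of each cepstral coefficient, taken over all frames of the utterance. The function name `cepstral_mean_normalize` is hypothetical.

```python
def cepstral_mean_normalize(frames):
    """Subtract the per-coefficient mean over all frames.

    `frames` is a list of equal-length feature vectors, one per analysis frame.
    """
    if not frames:
        return []
    dim = len(frames[0])
    # Long-time average of each coefficient across the utterance.
    mean = [sum(f[d] for f in frames) / len(frames) for d in range(dim)]
    return [[f[d] - mean[d] for d in range(dim)] for f in frames]
```

After this step every coefficient has zero mean over the utterance, which removes a constant offset shared by all frames of the recording.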

13 Silence Removal
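
The transcript keeps only the title of this slide, so the method actually used is unknown. A simple energy-based approach is sketched here as an assumption: cut the signal into short blocks and drop blocks whose mean energy falls below a fraction of the loudest block. The name `remove_silence` and the parameters `frame_len` and `rel_threshold` are illustrative.

```python
def remove_silence(signal, frame_len=160, rel_threshold=0.1):
    """Drop low-energy blocks from a sampled signal.

    A block is kept when its mean energy is at least `rel_threshold`
    times the mean energy of the loudest block.
    """
    frames = [signal[i:i + frame_len] for i in range(0, len(signal), frame_len)]
    energies = [sum(s * s for s in f) / len(f) for f in frames]
    peak = max(energies, default=0.0)
    kept = []
    for frame, energy in zip(frames, energies):
        if peak > 0 and energy >= rel_threshold * peak:
            kept.extend(frame)
    return kept
```

A relative threshold is used so the sketch does not depend on the recording level; practical systems often add zero-crossing rate or hangover logic, which is omitted here.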

14 Pre-emphasis
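
The filter itself is not preserved in the transcript. Pre-emphasis is usually a first-order high-pass filter y[n] = x[n] - alpha * x[n-1]; the sketch below assumes that form, and the default alpha = 0.95 is a typical value rather than one taken from the slides.

```python
def pre_emphasis(signal, alpha=0.95):
    """Apply y[n] = x[n] - alpha * x[n-1]; the first sample is passed through."""
    if not signal:
        return []
    out = [signal[0]]
    for n in range(1, len(signal)):
        out.append(signal[n] - alpha * signal[n - 1])
    return out
```

The filter boosts high frequencies relative to low ones, which flattens the spectral tilt of voiced speech before features are extracted.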

15 Segmentation
Short time analysis:
- The speech signal is segmented into overlapping ‘Analysis Frames’.
- The speech signal is assumed to be stationary within each frame.
[Figure; surviving labels: Q31, Q32, Q33, Q34]
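
Segmentation into overlapping analysis frames can be sketched as follows. The frame length and hop size are not given in the slides, so `frame_len` and `hop` are assumed parameters; trailing samples that do not fill a whole frame are dropped in this version.

```python
def split_into_frames(signal, frame_len, hop):
    """Return overlapping frames of `frame_len` samples, advancing by `hop`.

    Consecutive frames share frame_len - hop samples when hop < frame_len.
    """
    if frame_len <= 0 or hop <= 0:
        raise ValueError("frame_len and hop must be positive")
    frames = []
    start = 0
    while start + frame_len <= len(signal):
        frames.append(signal[start:start + frame_len])
        start += hop
    return frames
```

For instance, `split_into_frames(samples, 240, 80)` would give 240-sample frames that overlap by two thirds; these numbers are only illustrative.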

16 Feature Representation
[Figure: speech signal and spectrum of two users uttering ‘ONE’]

17 Speaker Model
F_1 = [a1…a10, b1…b10]
F_2 = [a1…a10, b1…b10]
…
F_N = [a1…a10, b1…b10]

18 Dynamic Time Warping
- The DTW warping path in the n-by-m matrix is the path that has the minimum average cumulative cost.
- The unmarked area is the constraint region within which the path is allowed to go.
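
A minimal dynamic-programming sketch of DTW is given below. Several details are assumptions because the slides do not specify them: the constraint region is modeled as a band of half-width `window` around the diagonal, the step pattern allows horizontal, vertical and diagonal moves, and the cumulative cost is divided by n + m to approximate an average cost. The default `dist` works on scalars; for feature vectors a Euclidean distance can be passed in.

```python
import math


def dtw_distance(a, b, dist=None, window=None):
    """Length-normalized DTW cost between sequences a and b."""
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("sequences must be non-empty")
    if dist is None:
        dist = lambda u, v: abs(u - v)
    if window is None:
        window = max(n, m)            # no effective constraint
    window = max(window, abs(n - m))  # keep the end point reachable
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        # Only cells inside the band around the diagonal are filled.
        for j in range(max(1, i - window), min(m, i + window) + 1):
            d = dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a advances
                                 cost[i][j - 1],      # b advances
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m] / (n + m)
```

With this symmetric step pattern, `dtw_distance(a, b)` equals `dtw_distance(b, a)`, and a sequence compared with a time-stretched copy of itself scores zero.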

19 Results
- Distances are normalized with respect to the length of the speech signal.
- The intra-speaker distance is less than the inter-speaker distance.
- The distance matrix is symmetric.

20 Matlab Implementation

21 THANK YOU

