Sharat.S.Chikkerur S.Anand Mantravadi Rajeev.K.Srinivasan


1 Speaker Recognition
Sharat S. Chikkerur, S. Anand Mantravadi, Rajeev K. Srinivasan

2 Speaker Recognition
Definition
Speaker recognition is the method of recognizing a person based on their voice.
It is one of the forms of biometric identification.
It depends on speaker-dependent characteristics.
Taxonomy
Speaker recognition: speaker identification, speaker verification, speaker detection
Modes: text dependent, text independent
EE 516 Term Project, Fall 2003

3 Speech production
[Figure: speech production mechanism]
[Figure: speech production model. Blocks: impulse train generator (controlled by pitch), glottal pulse model G(z), gain Av, noise source, gain AN, vocal tract V(z), radiation R(z)]

4 Generic Speaker Recognition System
Verification: speech signal -> Preprocessing -> analysis frames -> Feature Extraction -> feature vector -> Pattern Matching -> score
Enrollment: Preprocessing -> Feature Extraction -> Speaker Model
Preprocessing: A/D conversion, end point detection, pre-emphasis filter, segmentation
Feature extraction: LAR, cepstrum, LPCC, MFCC
Pattern matching and speaker models: stochastic models (GMM, HMM); template models (DTW); distance measures
Choice of features
Differentiating factors between speakers include vocal tract shape and behavioral traits.
Features should have high inter-speaker and low intra-speaker variation.

5 Our Approach
Stages: Preprocessing, Feature Extraction, Speaker model, Matching
Steps: silence removal; cepstrum coefficients; cepstral normalization; long time average; polynomial function expansion; dynamic time warping; distance computation; reference template

6 Silence Removal
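
The transcript does not say how silence is removed. One common approach is short-time energy thresholding, sketched below under stated assumptions: non-overlapping frames, a cutoff set as a fraction of the peak frame energy, and hypothetical names (`remove_silence`, `frame_len`, `threshold_ratio`) that do not come from the slides.

```python
def remove_silence(samples, frame_len=4, threshold_ratio=0.1):
    """Keep only frames whose mean energy exceeds a fraction of the peak frame energy.

    Assumption: energy thresholding stands in for the unspecified method on the slide.
    """
    # Split into non-overlapping frames (the last frame may be shorter).
    frames = [samples[i:i + frame_len] for i in range(0, len(samples), frame_len)]
    if not frames:
        return []
    # Mean squared amplitude of each frame.
    energies = [sum(s * s for s in frame) / len(frame) for frame in frames]
    cutoff = threshold_ratio * max(energies)
    kept = []
    for frame, energy in zip(frames, energies):
        # Strict comparison so an all-zero signal yields no frames.
        if energy > cutoff:
            kept.extend(frame)
    return kept
```

In practice the frame length would span some tens of milliseconds of samples, and the threshold would be tuned to the recording conditions.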

7 Pre-emphasis
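
Pre-emphasis is usually a first-order high-pass filter, y[n] = x[n] - alpha * x[n-1]. The sketch below assumes that form; the slide gives no coefficient, so the default `alpha=0.95` is only a typical value and the function name is hypothetical.

```python
def pre_emphasize(samples, alpha=0.95):
    """Apply y[n] = x[n] - alpha * x[n-1]; the first sample is passed through unchanged."""
    if not samples:
        return []
    out = [samples[0]]
    for n in range(1, len(samples)):
        # Subtracting a scaled copy of the previous sample boosts high frequencies.
        out.append(samples[n] - alpha * samples[n - 1])
    return out
```

A constant (DC) input is attenuated to (1 - alpha) of its value after the first sample, which is the expected behavior of a high-pass filter.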

8 Segmentation
Short time analysis
The speech signal is segmented into overlapping 'analysis frames'.
The speech signal is assumed to be stationary within each frame.
[Figure labels: Q31, Q32, Q33, Q34]
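
Segmentation into overlapping analysis frames can be sketched as follows. The frame length and hop size are not stated on the slide, so `frame_len` and `hop` are illustrative parameters, and `frame_signal` is a hypothetical name.

```python
def frame_signal(samples, frame_len, hop):
    """Split samples into frames of frame_len that start every hop samples.

    Frames overlap whenever hop < frame_len. Trailing samples that do not
    fill a complete frame are dropped.
    """
    if frame_len <= 0 or hop <= 0:
        raise ValueError("frame_len and hop must be positive")
    last_start = len(samples) - frame_len
    return [samples[start:start + frame_len]
            for start in range(0, last_start + 1, hop)]
```

For example, `frame_signal(list(range(10)), 4, 2)` yields four frames, each sharing two samples with its neighbor (50% overlap).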

9 Feature Representation
[Figure: speech signal and spectrum of two users uttering 'ONE']

10 Smoothed Signal Spectrum
Vocal tract modeling
[Figure panels: signal spectrum; smoothed signal spectrum]
The smoothed spectrum indicates the locations of the formants of each user.
The smoothed spectrum is obtained from the cepstral coefficients.

11 Cepstral coefficients
[Figure: speech production model with blocks P[n], G(z), V(z), R(z), u[n]; controls: pitch, Av, AN]
Homomorphic system (here * denotes convolution):
x1[n]*x2[n] -> D[] -> x1'[n]+x2'[n] -> L[] -> y1'[n]+y2'[n] -> D^-1[] -> y1[n]*y2[n]
Characteristic system D[]:
x1[n]*x2[n] -> DFT[] -> X1(z)X2(z) -> LOG[] -> log(X1(z)) + log(X2(z)) -> IDFT[] -> x1'[n]+x2'[n]
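
The DFT, LOG, IDFT chain above can be sketched in a few lines. This version is an assumption-laden illustration, not the authors' implementation: it takes the log of the magnitude only (the real cepstrum), uses a naive O(N^2) DFT from the standard library to stay self-contained, and all function names are hypothetical. It also checks the key property that convolution in time becomes addition in the cepstral domain.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform."""
    n_pts = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_pts) for n in range(n_pts))
            for k in range(n_pts)]

def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n_pts = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * n / n_pts) for k in range(n_pts)) / n_pts
            for n in range(n_pts)]

def real_cepstrum(x, n_fft, floor=1e-12):
    """Real cepstrum: IDFT of the log magnitude of the zero-padded DFT."""
    padded = list(x) + [0.0] * (n_fft - len(x))
    # The floor avoids log(0) when a spectral bin is exactly zero.
    log_mag = [math.log(max(abs(v), floor)) for v in dft(padded)]
    return [c.real for c in idft(log_mag)]

def convolve(a, b):
    """Linear convolution of two finite sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, av in enumerate(a):
        for j, bv in enumerate(b):
            out[i + j] += av * bv
    return out

# Convolution in time maps to addition of cepstra (same n_fft for all three).
x1 = [1.0, 0.5]
x2 = [1.0, -0.3, 0.2]
n_fft = 8
c_sum = [a + b for a, b in zip(real_cepstrum(x1, n_fft), real_cepstrum(x2, n_fft))]
c_conv = real_cepstrum(convolve(x1, x2), n_fft)
max_error = max(abs(a - b) for a, b in zip(c_sum, c_conv))
```

Keeping only the first few cepstral coefficients retains the slowly varying spectral envelope, which is why low-order coefficients are used as vocal tract features. A numerical library with an FFT would normally replace the naive transforms here.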

12 Speaker Model
F1 = [a1…a10, b1…b10]
F2 = [a1…a10, b1…b10]
…
FN = [a1…a10, b1…b10]

13 Dynamic Time Warping
The DTW warping path in the n-by-m matrix is the path that has the minimum average cumulative cost.
The unmarked area is the constraint region within which the path is allowed to go.
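
A minimal DTW sketch consistent with the description above. Several details are assumptions not stated in the transcript: Euclidean local distance, the three classic step directions, a simple band constraint on |i - j|, and normalization of the cumulative cost by n + m. The names `dtw_distance` and `band` are hypothetical.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def dtw_distance(seq_a, seq_b, band=None):
    """Length-normalized DTW cost between two sequences of feature vectors."""
    n, m = len(seq_a), len(seq_b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")
    if band is None:
        band = max(n, m)  # effectively unconstrained
    # Widen the band so the end point (n, m) stays reachable.
    band = max(band, abs(n - m))
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        # Only cells inside the band are filled; the rest stay at infinity.
        for j in range(max(1, i - band), min(m, i + band) + 1):
            local = euclidean(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # step in seq_a only
                                     cost[i][j - 1],      # step in seq_b only
                                     cost[i - 1][j - 1])  # step in both
    # Assumed normalization: divide by the combined sequence length.
    return cost[n][m] / (n + m)
```

Because the local distance and the band are both symmetric in the two sequences, `dtw_distance(a, b)` equals `dtw_distance(b, a)`, which matches a symmetric distance matrix.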

14 Results
Distances are normalized w.r.t. the length of the speech signal.
Intra-speaker distance is less than inter-speaker distance.
The distance matrix is symmetric.

15 Matlab Implementation

16 THANK YOU

