In-car Speech Recognition Using Distributed Microphones. Tetsuya Shinde, Kazuya Takeda, Fumitada Itakura. Center for Integrated Acoustic Information Research, Nagoya University.


2 In-car Speech Recognition Using Distributed Microphones Tetsuya Shinde Kazuya Takeda Fumitada Itakura Center for Integrated Acoustic Information Research Nagoya University

3 Background
In-car speech recognition using multiple microphones
–Since the positions of the speaker and the noise sources are not fixed, many sophisticated algorithms are difficult to apply.
–A robust criterion for parameter optimization is necessary.
Multiple Regression of Log Spectra (MRLS)
–Minimize the log spectral distance between the reference speech and the multiple regression of the signals captured by the distributed microphones.
Filter parameter optimization for microphone arrays (M. L. Seltzer, 2002)
–Maximize the likelihood of a reference utterance by adjusting the filter parameters of a microphone array system.
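The MRLS criterion above can be sketched numerically. The following is an illustrative toy example, not the paper's implementation: it fits one regression weight per microphone by minimizing the squared error between a reference log spectrum and a weighted sum of distant-microphone log spectra (all names and data are made up).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: T frames, M distant microphones, D log mel filterbank (MFB) bins.
T, M, D = 200, 6, 24
X = rng.normal(size=(T, M, D))            # log MFB outputs of the distant mics
true_w = np.array([0.4, 0.2, 0.15, 0.1, 0.1, 0.05])  # hypothetical weights
# y plays the role of the reference (close-talking) log spectrum.
y = np.tensordot(X, true_w, axes=([1], [0])) + 0.01 * rng.normal(size=(T, D))

# Least-squares fit of one weight per microphone over all frames and bins:
A = X.transpose(0, 2, 1).reshape(-1, M)   # (T*D, M) design matrix
b = y.reshape(-1)
w, *_ = np.linalg.lstsq(A, b, rcond=None)

# Regression estimate of the reference log spectrum:
approx = np.tensordot(X, w, axes=([1], [0]))
```

With low observation noise, the fitted weights recover the generating ones, which is the sense in which the regression "approximates the log MFB output" of the reference channel.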

4 Sample utterances: idling / city area / expressway

5 Block diagram of MRLS: each distant microphone's speech signal passes through spectrum analysis to produce a log MFB output; a multiple regression (MR) with learned regression weights combines these into an approximate log MFB output, which is passed to speech recognition.

6 Modified spectral subtraction
Signal model: each distant microphone i observes the speech source S through transfer function H_i and the noise source N through G_i, i.e. X_i = H_i S + G_i N.
Assume that the power spectrum at each microphone position obeys the power-sum rule: |X_i|^2 ≈ |H_i|^2 |S|^2 + |G_i|^2 |N|^2.
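The power-sum rule assumed on slide 6 can be checked with a toy simulation (illustrative only): when speech and noise components are uncorrelated, the average power spectrum of their mixture is close to the sum of the individual power spectra, because the cross term averages out.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy check of the power-sum rule: for uncorrelated s and n, the average
# power spectrum of x = s + n is close to |S|^2 + |N|^2 per frequency bin.
frames, frame_len = 500, 256
s = rng.normal(size=(frames, frame_len))             # "speech" at one mic
n = rng.normal(scale=0.5, size=(frames, frame_len))  # "noise" at the same mic
x = s + n

S2 = np.mean(np.abs(np.fft.rfft(s, axis=1)) ** 2, axis=0)
N2 = np.mean(np.abs(np.fft.rfft(n, axis=1)) ** 2, axis=0)
X2 = np.mean(np.abs(np.fft.rfft(x, axis=1)) ** 2, axis=0)

# Relative deviation from the power-sum prediction, per bin:
rel_err = np.abs(X2 - (S2 + N2)) / (S2 + N2)
```

The residual is exactly the averaged cross term 2·Re(S·N*), which shrinks as more frames are averaged.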

7 Taylor expansion of log spectrum
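The equation on this slide survives only as an image, so the following is a hedged reconstruction of what a first-order Taylor expansion of the log power spectrum under the power-sum model would look like; the operating-point notation is my assumption, not taken from the slide.

```latex
% Power-sum model in the log domain, writing
% a_i = \log(|H_i|^2 |S|^2), \quad b_i = \log(|G_i|^2 |N|^2):
\log |X_i|^2 = \log\!\left(e^{a_i} + e^{b_i}\right)

% First-order Taylor expansion around an operating point (\bar{a}_i, \bar{b}_i):
\log\!\left(e^{a_i}+e^{b_i}\right) \approx
  \log\!\left(e^{\bar{a}_i}+e^{\bar{b}_i}\right)
  + \frac{e^{\bar{a}_i}}{e^{\bar{a}_i}+e^{\bar{b}_i}}\,(a_i-\bar{a}_i)
  + \frac{e^{\bar{b}_i}}{e^{\bar{a}_i}+e^{\bar{b}_i}}\,(b_i-\bar{b}_i)
```

The point of such an expansion is that it makes each microphone's log spectrum approximately linear in the source log spectra, which motivates approximating the close-talking log spectrum as a weighted (multiple-regression) sum of the distant-microphone log spectra.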

8 Multiple regression of log spectrum Minimum error is given when
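The minimizing condition on this slide is present only as an image; presumably it is the standard least-squares normal equation. A hedged reconstruction in my own notation (λ_i for the regression weights, x_i for the distant-mic log spectra, y for the reference):

```latex
% Squared error between the reference log spectrum y(t) and the regression
% \hat{y}(t) = \sum_i \lambda_i x_i(t):
E(\boldsymbol{\lambda}) = \sum_t \Bigl( y(t) - \sum_i \lambda_i x_i(t) \Bigr)^{2}

% Setting \partial E / \partial \lambda_j = 0 gives the normal equations:
\sum_t x_j(t) \Bigl( y(t) - \sum_i \lambda_i x_i(t) \Bigr) = 0
\quad\Longleftrightarrow\quad
\boldsymbol{\lambda} = \left(X^{\top} X\right)^{-1} X^{\top} \mathbf{y}
```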

9 Optimal regression weights
Reduction of the degrees of freedom in the optimization.

10 Experimental Setup for Evaluation
Recorded with 6 distributed microphones (top and side views of the microphone positions shown).
Training data
–Phonetically balanced sentences
–6,000 sentences while idling
–2,000 sentences while driving
–200 speakers
Test data
–50 isolated-word utterances
–15 driving conditions: road (idling / city area / expressway) × in-car (normal / fan-low / fan-high / CD play / window open)
–18 speakers

11 Recognition experiments
HMMs trained on:
–Close-talking: close-talking microphone speech.
–Distant-mic.: nearest distant microphone (mic. #6) speech.
–MLLR: nearest distant-mic. speech after MLLR adaptation.
–MRLS: MRLS outputs obtained with the optimal regression weights for each training utterance.
Test utterances:
–Close-talking speech (CLS-TALK)
–Distant-microphone speech (DIST)
–Distant-microphone speech after MLLR adaptation (MLLR)
–MRLS outputs with the six regression weights optimized for:
each utterance (OPT), each speaker (SPKER), each driving condition (DR), the whole training corpus (ALL)

12 Performance comparison (average over the 15 driving conditions)
[Chart: recognition performance of the systems above, with the MRLS result highlighted.]

13 Clustering the in-car sound environment
Clustering the in-car sound environment using a spectral feature that concatenates the distributed-microphone signals.
[Table: clustering results — utterance counts assigned to Classes 1–4 under each in-car condition: normal, CD, fan-low, fan-high, window open.]

14 Adapting weights to the sound environment
Vary the regression weights according to the classification results.
This achieves the same performance as speaker- and condition-dependent weights.
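The adaptation scheme of slides 13–14 can be sketched as follows: a hypothetical nearest-centroid classifier over a concatenated-microphone feature picks one of several pre-trained regression weight sets per utterance. All names, centroids, and weights below are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical per-class regression weights (4 environment classes, 6 mics),
# imagined as trained in advance per cluster; each row sums to 1.
class_weights = rng.dirichlet(np.ones(6), size=4)

# Environment-class centroids in a concatenated-microphone feature space.
feat_dim = 12
centroids = rng.normal(size=(4, feat_dim)) * 5.0

def select_weights(feature):
    """Classify the sound environment by nearest centroid and
    return (class index, matching regression weight set)."""
    cls = int(np.argmin(np.linalg.norm(centroids - feature, axis=1)))
    return cls, class_weights[cls]

# A test utterance whose feature lies near centroid 2:
utt_feature = centroids[2] + 0.1 * rng.normal(size=feat_dim)
cls, w = select_weights(utt_feature)
```

Choosing weights per detected environment rather than per known speaker/condition is what lets the system match condition-dependent performance without condition labels at test time.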

15 Summary
Results
–Log-spectral multiple regression is effective for in-car speech recognition using distributed microphones.
–In particular, when the regression weights are trained for a specific driving condition, very high performance is obtained.
–Adapting the weights to the driving condition improves performance.
Future work
–Combining with microphone arrays.

