Published by Roland Dwayne Crawford. Modified over 8 years ago.
1
Adaptive Multi-Frame-Rate Scheme for Distributed Speech Recognition Based on a Half Frame-Rate Front-End
Zheng-Hua Tan, Paul Dalsgaard and Børge Lindberg
Aalborg University, Denmark
IEEE MMSP 2005, Shanghai, China, November 1, 2005
2
Outline
  Background and motivation
  Half frame-rate front-end
  Experimental evaluation
  Adaptive multi-frame-rate DSR scheme
  Experimental evaluation
  Conclusions
3
Background and motivation
  Distributed speech recognition (DSR): automatic speech recognition (ASR) over mobile networks
  Networking introduces challenges:
    Bandwidth limitations
    Transmission errors
  [Diagram: Speech, Feature extraction, Source & channel coding, Network constraints, Source & channel decoding, ASR decoding, Words]
4
Background and motivation
  Existing solutions:
    Source coding to compress speech features, e.g. split vector quantization, discrete cosine transform
    Channel coding and error concealment to protect and recover speech features
  Our alternative solutions: in the front-end feature extraction stage, based on the redundancies known to exist in full frame-rate (FFR) features
    Half frame-rate (HFR) front-end
    Adaptive multi-frame-rate scheme
5
Full frame-rate front-end
  Temporal correlation between speech features is caused by:
    Vocal tract inertia
    Overlapping in the feature extraction procedure
  [Figure: time axis in ms (ticks at 0, 10, 20, 25, 35, 45) showing 25 ms frame length, 10 ms frame shift, 15 ms overlap]
6
Half frame-rate front-end
  25 ms frame length & 20 ms frame shift, giving 5 ms overlap
  But why is the FFR front-end prevalent in ASR systems?
  And why is the HFR front-end promising in DSR?
  [Figure: time axis in ms (ticks at 0, 10, 20, 25, 35, 45) showing 25 ms frame length, 20 ms frame shift, 5 ms overlap]
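The framing arithmetic on the two front-end slides can be checked with a short sketch. It is an illustration only: the function names and the 1000 ms signal length are assumptions, not taken from the slides; only the 25 ms frame length and the 10 ms and 20 ms frame shifts come from the source.

```python
def frame_starts(signal_ms, frame_len_ms, frame_shift_ms):
    """Start times (ms) of every full frame that fits inside the signal."""
    last_start = signal_ms - frame_len_ms
    return list(range(0, last_start + 1, frame_shift_ms))


def overlap_ms(frame_len_ms, frame_shift_ms):
    """Overlap between two consecutive frames (0 if they do not overlap)."""
    return max(frame_len_ms - frame_shift_ms, 0)


FRAME_LEN_MS = 25   # from the slides
SIGNAL_MS = 1000    # hypothetical one-second signal

ffr_starts = frame_starts(SIGNAL_MS, FRAME_LEN_MS, 10)  # full frame rate
hfr_starts = frame_starts(SIGNAL_MS, FRAME_LEN_MS, 20)  # half frame rate

print(len(ffr_starts), overlap_ms(FRAME_LEN_MS, 10))
print(len(hfr_starts), overlap_ms(FRAME_LEN_MS, 20))
```

With these settings the HFR frame positions coincide with every second FFR frame position, which is why the HFR stream can be read as an FFR stream with every other frame dropped.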
7
HFR front-end in DSR
  Observation: the performance degradation of DSR is marginal when packet loss occurs in short bursts, on the condition that a proper error concealment technique is applied.
  So why not deliberately drop some packets (speech frames)?
  HFR + repetition 'error concealment': prior to server-side recognition, each HFR feature vector is repeated once to construct the FFR vector equivalent.
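The repetition step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation; the function name `repeat_hfr` and the toy two-dimensional vectors are assumptions.

```python
def repeat_hfr(hfr_vectors):
    """Repeat each HFR feature vector once to build an FFR-length sequence."""
    ffr_equivalent = []
    for vec in hfr_vectors:
        # Copy the vector twice so the output has the full frame rate.
        ffr_equivalent.append(list(vec))
        ffr_equivalent.append(list(vec))
    return ffr_equivalent


# Hypothetical toy features: two HFR vectors of dimension 2.
hfr = [[1.0, 2.0], [3.0, 4.0]]
ffr_like = repeat_hfr(hfr)
print(ffr_like)
```

The output sequence has twice as many vectors as the input, so it can be passed to a recogniser whose models were trained on FFR features.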
8
Experiments
  Recognition accuracy (%) across the front-ends for three databases using FFR models

  Front-end          Danish digits   City names   Aurora 2 (TI digits)
  FFR                99.79           79.29        99.05
  HFR-Repetition     99.59           79.29        98.98
  HFR-NoRepetition   96.68           61.25        71.12

  Repetition of each HFR feature vector is critical!
9
Derived DSR schemes
  The FFR-based ETSI-DSR standard
  The HFR front-end: half the bit rate
  FFR-based one-frame coding
  FFR-based interleaving24
    No delay when transmission errors as opposed to the regular interleaving!
  FFR-based multiple description coding (MDC): odd-numbered & even-numbered feature vectors
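The odd/even split behind the MDC scheme can be sketched as below. The slides state only that the feature vectors are divided into odd-numbered and even-numbered descriptions; recovering a lost description by repeating the surviving one is an assumption borrowed from the HFR repetition idea, and all function names are hypothetical.

```python
def split_descriptions(vectors):
    """Split a sequence into odd- and even-numbered vectors (counting from 1)."""
    odd = vectors[0::2]    # vectors 1, 3, 5, ...
    even = vectors[1::2]   # vectors 2, 4, 6, ...
    return odd, even


def merge_descriptions(odd, even, total):
    """Rebuild `total` vectors; pass None for a description that was lost.

    Assumption: a lost description is concealed by repeating the nearest
    vector of the surviving description.
    """
    if odd is None and even is None:
        raise ValueError("both descriptions lost")
    rebuilt = []
    for i in range(total):
        wanted, other = (odd, even) if i % 2 == 0 else (even, odd)
        source = wanted if wanted is not None else other
        # Clamp the index so a shorter surviving description still covers the end.
        rebuilt.append(source[min(i // 2, len(source) - 1)])
    return rebuilt


frames = ["v1", "v2", "v3", "v4", "v5"]
odd, even = split_descriptions(frames)
print(merge_descriptions(odd, even, len(frames)))   # both descriptions arrive
print(merge_descriptions(odd, None, len(frames)))   # even description lost
```

When both descriptions arrive the original sequence is restored exactly; when one is lost the result degrades to a half frame-rate stream with repetition.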
10
Comparison of DSR schemes
  Robustness against transmission errors (Word Error Rate %)
  Aurora 2 database corrupted by GSM error pattern 3 (4 dB C/I ratio)
  [Figure: WER for Error-free, MDC, Interleaving24, Half frame-rate - Repetition, ETSI-DSR Standard, No CRC]
  Which is the best?
11
Adaptive multi-frame-rate scheme
  [Diagram blocks]
  Client Front-End: Speech, FFR Front-End, HFR Front-End, Split VQ Coder, Channel Encoder
  Error-Prone Channel
  Server Back-End: Channel Decoder incl. EC, Split VQ Decoder, Recogniser, Words
  Network Context
12
Conclusions
  Half frame-rate front-end for DSR:
    Half frame-rate, half bit-rate, half client-side computation.
    Comparable performance, but repetition of HFR features is critical.
  Adaptive multi-frame-rate DSR scheme:
    HFR
    One-frame coding
    Interleaving: no transmission errors, no delay
    MDC: a performance close to error-free channel