Department of Communication Technology 30/08/
A Comparative Study of Feature-Domain Error Concealment Techniques for Distributed Speech Recognition
Robust2004 workshop, Norwich, UK
Zheng-Hua Tan, Børge Lindberg and Paul Dalsgaard
{zt, bli, Aalborg University, Denmark
Agenda
Feature-domain EC techniques
  – repetition
  – linear interpolation
  – subvector concealment
Speech recognition experiments
Comparative study
  – MFCC features
  – Euclidean and DP distances
  – HMM state durations
Motivation
Why do this work?
A variety of EC techniques for DSR exist
  – a survey
Repetition vs. interpolation
  – which is better?
What makes an EC technique good for recognition?
EC techniques
Two classes of EC techniques
Client-based EC
  – e.g. retransmission and forward error control (FEC)
Server-based EC (the redundancy in the transmitted signal is exploited)
  – in the model domain: weighted Viterbi, missing feature theory
  – in the feature domain:
      insertion-based techniques: splicing, substitution, repetition
      interpolation-based techniques: linear interpolation
      subvector concealment
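The two vector-level baselines named above, repetition and linear interpolation, can be sketched as follows. This is a minimal illustration, not the authors' implementation. The function names and the `lost` flag list are hypothetical, and the repetition rule (first half of a burst copies the last good frame, second half copies the next good frame) is an assumption in the style of the ETSI-DSR error mitigation.

```python
from typing import Iterator, List, Tuple

Frame = List[float]


def _bursts(lost: List[bool]) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) index pairs, end exclusive, for each run of lost frames."""
    i, n = 0, len(lost)
    while i < n:
        if lost[i]:
            j = i
            while j < n and lost[j]:
                j += 1
            yield i, j
            i = j
        else:
            i += 1


def conceal_repetition(frames: List[Frame], lost: List[bool]) -> List[Frame]:
    """Assumed rule: first half of a burst repeats the preceding good frame,
    second half repeats the following good frame."""
    out = [list(f) for f in frames]
    n = len(frames)
    for start, end in _bursts(lost):
        prev_ok = start - 1 if start > 0 else None
        next_ok = end if end < n else None
        if prev_ok is None and next_ok is None:
            continue  # every frame is lost, nothing to copy from
        half = start + (end - start + 1) // 2
        for i in range(start, end):
            use_prev = next_ok is None or (prev_ok is not None and i < half)
            out[i] = list(frames[prev_ok if use_prev else next_ok])
    return out


def conceal_interpolation(frames: List[Frame], lost: List[bool]) -> List[Frame]:
    """Linearly interpolate each burst between its two good neighbours.
    Bursts touching either end of the utterance fall back to repetition."""
    out = [list(f) for f in frames]
    n = len(frames)
    for start, end in _bursts(lost):
        if start == 0 and end == n:
            continue
        if start == 0 or end == n:
            src = end if start == 0 else start - 1
            for i in range(start, end):
                out[i] = list(frames[src])
            continue
        before, after = frames[start - 1], frames[end]
        span = end - (start - 1)
        for i in range(start, end):
            w = (i - (start - 1)) / span
            out[i] = [(1 - w) * x + w * y for x, y in zip(before, after)]
    return out
```

For a three-frame burst between a good frame with value 0 and a good frame with value 4, repetition produces a step (0, 0, 4) while interpolation produces a ramp (1, 2, 3).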
Subvector concealment
Observation 1: conventional EC schemes share a common characteristic: they conduct EC at the vector level
Observation 2: within erroneous vectors, a substantial number of subvectors are often error-free
Hence: subvector-based EC
Subvector concealment (cont.)
The ETSI-DSR standard
  – Feature-pair and SVQ: the n'th vector is [equation not preserved]
  – Frame-pair: [equation not preserved]
[Figure labels: feature-pair, subvector]
Subvector concealment (cont.)
Buffering matrix
Consistency test
Subvector concealment (cont.)
Consistency matrix and subvector concealment
C = [matrix not preserved], with entries 0 for inconsistent and 1 for consistent
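The idea of a consistency matrix driving concealment at the subvector level can be sketched as below. The slides do not preserve the actual test or the replacement rule, so everything here is an assumption: a feature pair is flagged inconsistent when either feature changes by more than a threshold between the two frames of a frame-pair, and each inconsistent subvector is replaced by the nearest consistent subvector in time. The function names, the `bad_pairs` argument and the threshold values are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Frame = List[float]


def consistency_flags(first: Frame, second: Frame,
                      thresholds: Sequence[float]) -> List[int]:
    """One flag per feature pair (subvector): 1 = consistent, 0 = inconsistent.
    Assumed test: both features of the pair change by at most their threshold."""
    flags = []
    for k in range(0, len(first), 2):
        ok = all(abs(second[k + d] - first[k + d]) <= thresholds[k + d]
                 for d in (0, 1))
        flags.append(1 if ok else 0)
    return flags


def _nearest_consistent(row: List[int], t: int) -> Optional[int]:
    """Index of the closest frame whose flag is 1, preferring the earlier one."""
    for d in range(1, len(row)):
        for cand in (t - d, t + d):
            if 0 <= cand < len(row) and row[cand] == 1:
                return cand
    return None


def conceal_subvectors(frames: List[Frame], bad_pairs: Sequence[int],
                       thresholds: Sequence[float]
                       ) -> Tuple[List[List[int]], List[Frame]]:
    """Build the consistency matrix C (rows: subvectors, columns: frames) for the
    frame-pairs listed in bad_pairs, then repair only the inconsistent subvectors."""
    n = len(frames)
    n_sub = len(frames[0]) // 2
    C = [[1] * n for _ in range(n_sub)]
    for p in bad_pairs:
        a, b = 2 * p, 2 * p + 1  # the two frames of frame-pair p
        for k, ok in enumerate(consistency_flags(frames[a], frames[b], thresholds)):
            C[k][a] = C[k][b] = ok
    out = [list(f) for f in frames]
    for k in range(n_sub):
        for t in range(n):
            if C[k][t] == 0:
                src = _nearest_consistent(C[k], t)
                if src is not None:
                    out[t][2 * k:2 * k + 2] = frames[src][2 * k:2 * k + 2]
    return C, out
```

The point of the design is that error-free subvectors inside an erroneous frame-pair are kept as received, whereas vector-level EC would discard the whole vector.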
Recognition experiments
Two tasks: Danish digits and city names
The HTK-based reference recogniser
Realistic GSM error patterns (EP), given as C/I ratios:
  – EP1, 10 dB
  – EP2, 7 dB
  – EP3, 4 dB
Recognition experiments (cont.)
The %WER for the three EC techniques: (a) Danish digits, (b) city names
Comparative study - MFCC features
Random transmission errors at a bit error rate (BER) of 2% are used.
The original error-free MFCC features are compared directly with features that were corrupted by errors and then concealed
  – by repetition
  – by interpolation
  – by subvector concealment
Comparative study - MFCC features (cont.)
[Figure: MFCC c0]
Two observations
Comparative study - MFCC features (cont.)
Interpolation: straight line
  – constant value segment
  – zero value segment
Comparative study - MFCC features (cont.)
Feature curves generated by repetition have shapes similar to the iMFCC feature, although they are somewhat displaced along the time axis. However, the dynamic programming (DP) embedded in the Viterbi algorithm makes this displacement relatively irrelevant.
Comparative study - DP distances
  – The Euclidean and DP distances between c0 of the MFCC and c0 of the MFCC generated by the different EC techniques, for the word "et"
  – General expectation: interpolation performs better
Signal reconstruction vs. speech recognition
Euclidean distance vs. DP distance
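The contrast between the two distances can be illustrated on a toy c0 trajectory. The slides do not define the DP distance precisely, so this sketch assumes classic dynamic time warping (DTW) with an absolute-difference local cost and no normalisation. The function names and the example values are hypothetical.

```python
import math
from typing import Sequence


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Frame-by-frame Euclidean distance between two equal-length trajectories."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dp_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """DTW distance: cheapest monotonic alignment of a and b, where each
    aligned pair of frames costs |a[i] - b[j]| (assumed local cost)."""
    inf = float("inf")
    n, m = len(a), len(b)
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]


# A curve and a copy of it delayed by one frame.
reference = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0]
delayed = [0.0, 0.0, 0.0, 1.0, 2.0, 1.0, 0.0]

print(euclidean_distance(reference, delayed))  # 2.0
print(dp_distance(reference, delayed))         # 0.0
```

A pure time displacement is penalised by the Euclidean distance but absorbed by the DP alignment, which is why a good signal reconstruction score and a good recognition score need not coincide.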
Comparative study - DP distances (cont.)
Over 328 test utterances
Number of smaller distances
Subvector EC always gives the smallest value for both distances.
Comparative study - HMM state durations
Viterbi decoding tracks the HMM state alignment
The average state durations
Two facts are observed:
  – repetition vs. interpolation
  – subvector vs. error-free
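Average state durations can be derived from a frame-level state alignment as sketched below. This assumes the alignment is already available as one state label per frame for each utterance. How it is exported from the recogniser is not covered here, and the function name and labels are hypothetical.

```python
from collections import defaultdict
from itertools import groupby
from typing import Dict, Hashable, Iterable, Sequence


def average_state_durations(
        alignments: Iterable[Sequence[Hashable]]) -> Dict[Hashable, float]:
    """Mean number of consecutive frames spent in each state per visit,
    pooled over all utterances."""
    total_frames = defaultdict(int)
    visits = defaultdict(int)
    for states in alignments:
        # groupby collapses each run of identical consecutive labels into one visit
        for state, run in groupby(states):
            total_frames[state] += sum(1 for _ in run)
            visits[state] += 1
    return {s: total_frames[s] / visits[s] for s in total_frames}


alignments = [
    ["s1", "s1", "s1", "s2", "s2", "s3"],
    ["s1", "s2", "s2", "s2", "s2", "s3", "s3"],
]
print(average_state_durations(alignments))
# {'s1': 2.0, 's2': 3.0, 's3': 1.5}
```

Comparing these averages between error-free features and concealed features shows how strongly each EC technique distorts the time the decoder spends in each state.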
Summary
Three different EC techniques compared
  – the simple repetition technique is as good as or even better than linear interpolation
  – subvector concealment performs best
Comparative study
  – MFCC features
  – Euclidean and DP distances
  – HMM state durations
Thanks!