Speech recognition in MUMIS Eric Sanders (KUN) March 2003.


2 Speech recognition in MUMIS Eric Sanders (KUN) March 2003

3 People involved at KUN: Helmer Strik, Judith Kessens, Mirjam Wester, Janienke Sturm, Eric Sanders, Febe de Wet, Paul Tielen

4 Overview
- Speech data
- Baseline recognition
- Adding data
- Noise robustness
- Word types
- Conclusions

5 Examples of Data
- Dutch: “op _t ogenblik wordt in dit stadion de opstelling voorgelezen”
- English: “and they wanna make the change before the corner”
- German: “und die beiden Tore die die Hollaender bekommen hat haben”
From Yugoslavia – The Netherlands

6 Speech Data: all data
Language     Dutch     English   German
# matches    6         3         21
# words      40,296    34,684    127,265

7 Speech Data: test data (# words)
Match                           Dutch    English   German
Yugoslavia – The Netherlands    5,922    10,188    3,998
England – Germany               5,798    13,488    7,280

8 Baseline recognition
PMs (phone models):
- trained on the other test match
Lex (lexicon):
- based on the other test match
- match specific words added
LM (language model):
- category LM
- based on the other test match
- match specific words added
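The slides do not say how the category LM was built. Purely as an illustration, a class-based bigram model might look like the sketch below: words such as player names are replaced by a category label, bigrams are counted over the labels, and each category's probability is shared among its members. The word list, the `PLAYER` label, the uniform within-category distribution, and the lack of smoothing are all assumptions made for this example, not details from the presentation.

```python
from collections import Counter

# Hypothetical word-to-category map; unlisted words act as their own category.
CATEGORY_OF = {"kluivert": "PLAYER", "davids": "PLAYER", "mihajlovic": "PLAYER"}


def to_categories(words):
    """Replace every listed word by its category label."""
    return [CATEGORY_OF.get(w, w) for w in words]


def train_category_bigrams(sentences):
    """Count history unigrams and bigrams over category sequences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        cats = ["<s>"] + to_categories(sentence)
        unigrams.update(cats[:-1])                 # counts of each history
        bigrams.update(zip(cats[:-1], cats[1:]))   # counts of each transition
    return unigrams, bigrams


def bigram_prob(prev, word, unigrams, bigrams):
    """P(word | prev) = P(cat(word) | cat(prev)) * P(word | cat(word))."""
    c_prev = CATEGORY_OF.get(prev, prev)
    c_word = CATEGORY_OF.get(word, word)
    if unigrams[c_prev] == 0:
        return 0.0  # unseen history; a real model would back off or smooth
    p_category = bigrams[(c_prev, c_word)] / unigrams[c_prev]
    members = [w for w, c in CATEGORY_OF.items() if c == c_word]
    p_member = 1.0 / len(members) if members else 1.0  # uniform assumption
    return p_category * p_member


# Toy training text, invented for this sketch.
training = [["pass", "to", "kluivert"], ["pass", "to", "davids"]]
unigrams, bigrams = train_category_bigrams(training)
```

With this setup a name that never occurs in the training text (here `mihajlovic`) still receives probability after `to`, because it shares the `PLAYER` category. That is one plausible reason a category LM fits well with adding match specific words.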

9 Baseline recognition

10 Adding Data
Extra training data:
- Dutch = 4 matches
- German = 19 matches
- English = 1 match
Adding training data to train the lexicon and the language models (phone models trained on 1 match)

11 Adding Data (German)

12 Noise Robustness Dutch English German

13 Noise Robustness

14 Possible solutions:
- Matching acoustic properties of train and test material
- Training SNR dependent phone models
- Applying noise robust feature extraction: Histogram Normalisation & FTNR
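Histogram normalisation is named here without further detail. One common form of the idea, shown below as a minimal sketch and not necessarily the variant used in MUMIS, maps each feature track onto a fixed reference distribution by rank, so that train and test features share the same histogram regardless of offset or scale. The function name, the standard normal reference, and the mid-point empirical CDF are assumptions made for this example.

```python
from statistics import NormalDist


def histogram_normalise(values, reference=NormalDist(0.0, 1.0)):
    """Map one feature track onto the reference distribution by rank."""
    n = len(values)
    # Indices of the values in ascending order of value.
    order = sorted(range(n), key=lambda i: values[i])
    out = [0.0] * n
    for rank, i in enumerate(order):
        # Mid-point empirical CDF keeps the argument strictly inside (0, 1).
        cdf = (rank + 0.5) / n
        out[i] = reference.inv_cdf(cdf)
    return out


# Toy feature track and a shifted, scaled copy standing in for a
# channel or noise mismatch (values invented for this sketch).
clean = [1.0, 2.0, 3.0, 4.0]
mismatched = [2.0 * x + 5.0 for x in clean]
```

Because only the rank of each value matters, `histogram_normalise(clean)` and `histogram_normalise(mismatched)` give the same result. Real additive noise distorts features non-linearly, so in practice the mapping only reduces the mismatch. Tied values also receive different outputs in this simple version.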

15 Noise Robustness YUG-NL, very noisy

16 Word Types
Not all words are equally important for an information retrieval task
Categories:
- function words (prepositions, pronouns)
- application specific words (player names)
- other content words
WERs for different categories
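How the per-category WERs were scored is not described in the slides. A minimal sketch of one way to do it: align reference and hypothesis with a Levenshtein alignment, compute the overall WER as (substitutions + deletions + insertions) / reference words, and for each category count the substituted or deleted reference words. Leaving insertions out of the per-category figures, the category names, and the toy sentences are all assumptions made for this example.

```python
from collections import Counter


def align(ref, hyp):
    """Levenshtein alignment; returns a list of (op, ref_word, hyp_word)."""
    rows, cols = len(ref) + 1, len(hyp) + 1
    cost = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        cost[i][0] = i
    for j in range(cols):
        cost[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            diag = cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            cost[i][j] = min(diag, cost[i - 1][j] + 1, cost[i][j - 1] + 1)
    # Trace back from the bottom-right corner, preferring the diagonal.
    ops, i, j = [], len(ref), len(hyp)
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and cost[i][j] == cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])):
            op = "ok" if ref[i - 1] == hyp[j - 1] else "sub"
            ops.append((op, ref[i - 1], hyp[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            ops.append(("del", ref[i - 1], None))
            i -= 1
        else:
            ops.append(("ins", None, hyp[j - 1]))
            j -= 1
    return ops[::-1]


def wer_by_category(ref, hyp, category_of):
    """Overall WER plus, per category, (subs + dels) / reference words."""
    if not ref:
        raise ValueError("reference must contain at least one word")
    ops = align(ref, hyp)
    overall = sum(op != "ok" for op, _, _ in ops) / len(ref)
    totals, wrong = Counter(), Counter()
    for op, ref_word, _ in ops:
        if ref_word is None:
            continue  # insertions have no reference word to categorise
        cat = category_of(ref_word)
        totals[cat] += 1
        wrong[cat] += op != "ok"
    return overall, {cat: wrong[cat] / totals[cat] for cat in totals}


def toy_category(word):
    """Hypothetical category lookup, for illustration only."""
    if word in {"and", "to", "the"}:
        return "function"
    if word in {"kluivert", "davids"}:
        return "name"
    return "content"


ref = "and kluivert passes to davids".split()
hyp = "and kluivert passed the davids".split()
overall, per_category = wer_by_category(ref, hyp, toy_category)
```

In this toy case both player names are recognised correctly while one function word and one content word are not, so the name category scores better than the overall figure.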

17 Word Types

18 Conclusions
- SNR values explain the WERs to a large extent
- More data is not necessarily better
- Applying noise robust features leads to best results
- Overall WERs are very high, but application specific words are recognised relatively well

19 The end
