Speech recognition in MUMIS Eric Sanders (KUN) March 2003.



People involved at KUN: Helmer Strik, Judith Kessens, Mirjam Wester, Janienke Sturm, Eric Sanders, Febe de Wet, Paul Tielen

Overview
- Speech data
- Baseline recognition
- Adding data
- Noise robustness
- Word types
- Conclusions

Examples of Data (from Yugoslavia – The Netherlands)
- Dutch: “op _t ogenblik wordt in dit stadion de opstelling voorgelezen”
- English: “and they wanna make the change before the corner”
- German: “und die beiden Tore die die Hollaender bekommen hat haben”

Speech Data: all data

Language    Dutch    English   German
# matches   6        3         21
# words     40,296   34,684    127,265

Speech Data: test data (# words)

Match                          Dutch   English   German
Yugoslavia – The Netherlands   5,922   10,188    3,998
England – Germany              5,798   13,488    7,280

Baseline recognition
Phone models (PMs):
- trained on the other test match
Lexicon (Lex):
- based on the other test set
- match specific words added
Language model (LM):
- category LM
- based on the other test match
- match specific words added
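The slides do not say how the category LM is built. One common reading is a class-based bigram, P(w | v) = P(c(w) | c(v)) · P(w | c(w)), in which a match specific word such as a player name inherits the contexts of its category. The following is a minimal Python sketch under that assumption; the toy sentences, the PLAYER class and every function name are invented for illustration, and no smoothing is applied.

```python
from collections import Counter


def train_category_bigram(sentences, category_of):
    """Estimate a class-based bigram by maximum likelihood (no smoothing)."""
    class_bigrams = Counter()   # (previous class, current class) counts
    class_history = Counter()   # how often a class occurs as history
    class_counts = Counter()    # how often a class occurs at all
    word_counts = Counter()     # how often a word occurs

    for sent in sentences:
        classes = ["<s>"] + [category_of(w) for w in sent]
        for word, cls in zip(sent, classes[1:]):
            word_counts[word] += 1
            class_counts[cls] += 1
        for prev, cur in zip(classes, classes[1:]):
            class_bigrams[(prev, cur)] += 1
            class_history[prev] += 1

    def prob(word, prev_word=None):
        """P(word | prev_word) = P(class | prev class) * P(word | class)."""
        cls = category_of(word)
        prev_cls = "<s>" if prev_word is None else category_of(prev_word)
        if class_history[prev_cls] == 0 or class_counts[cls] == 0:
            return 0.0
        p_class = class_bigrams[(prev_cls, cls)] / class_history[prev_cls]
        p_word = word_counts[word] / class_counts[cls]
        return p_class * p_word

    return prob


# Hypothetical category map: player names share one class,
# every other word is its own class.
PLAYERS = {"kluivert", "davids"}


def category_of(word):
    return "PLAYER" if word in PLAYERS else word


toy_sentences = [
    ["pass", "to", "kluivert"],
    ["pass", "to", "davids"],
    ["shot", "by", "kluivert"],
]
prob = train_category_bigram(toy_sentences, category_of)
print(prob("davids", "by"))  # 1/3: the word pair is unseen, the class pair is not
```

In the toy data "davids" never follows "by", yet it receives a non-zero probability because it shares the PLAYER class with "kluivert". A plain word bigram would assign it zero.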

Baseline recognition

Adding Data
Extra training data:
- Dutch = 4 matches
- German = 19 matches
- English = 1 match
Adding training data to train the lexicon and the language models (phone models trained on 1 match)

Adding Data (German)

Noise Robustness [figure panels: Dutch, English, German]

Noise Robustness

Possible solutions:
- Matching acoustic properties of train and test material
- Training SNR dependent phone models
- Applying noise robust feature extraction: Histogram Normalisation & FTNR
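The slides name Histogram Normalisation without giving the setup. The general idea is to map each feature dimension through its empirical cumulative distribution onto a fixed reference distribution, so that train and test features end up with the same histogram. The Python sketch below assumes a standard normal reference and per-utterance, per-dimension processing; both choices and the function name are illustrative assumptions, not details from the slides.

```python
from statistics import NormalDist


def histogram_normalise(values):
    """Map one feature dimension onto a standard normal via its empirical CDF.

    Each value is replaced by the reference quantile of its rank, so any
    monotonic distortion of the input (offset, gain, compression) is undone.
    Tied values receive neighbouring, not identical, outputs in this sketch.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    reference = NormalDist(0.0, 1.0)
    normalised = [0.0] * n
    for rank, index in enumerate(order):
        # Mid-rank estimate of the empirical CDF, strictly inside (0, 1).
        cdf = (rank + 0.5) / n
        normalised[index] = reference.inv_cdf(cdf)
    return normalised


# Invented example: the same trajectory before and after a
# monotonic distortion (offset plus gain).
clean = [1.0, 4.0, 2.0, 5.0, 3.0]
distorted = [2.0 * x + 10.0 for x in clean]
print(histogram_normalise(clean) == histogram_normalise(distorted))  # True
```

Because only the rank order of the values is used, the clean and distorted versions map to identical normalised features, which is the property that makes the technique attractive when train and test noise conditions differ.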

Noise Robustness YUG-NL, very noisy

Word Types
Not all words are equally important for an information retrieval task
Categories:
- function words (prepositions, pronouns)
- application specific words (player names)
- other content words
WERs for different categories
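The slides report WERs per category but do not describe how errors were assigned to categories. One plausible scheme is to align reference and hypothesis by minimum edit distance, charge substitutions and deletions to the category of the reference word and insertions to the category of the inserted word, then divide by the number of reference words in that category. The Python sketch below follows that assumed scheme; the word lists and the example sentence are invented.

```python
from collections import Counter


def align(ref, hyp):
    """Levenshtein alignment; returns a list of (op, ref_word, hyp_word)."""
    n, m = len(ref), len(hyp)
    # cost[i][j] = edit distance between ref[:i] and hyp[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            cost[i][j] = min(diag, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Trace back from the end to recover one optimal alignment.
    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and cost[i][j] == cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])):
            op = "ok" if ref[i - 1] == hyp[j - 1] else "sub"
            ops.append((op, ref[i - 1], hyp[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            ops.append(("del", ref[i - 1], None))
            i -= 1
        else:
            ops.append(("ins", None, hyp[j - 1]))
            j -= 1
    ops.reverse()
    return ops


def wer_by_category(ref, hyp, category_of):
    """WER per category; categories without reference words are omitted."""
    errors, totals = Counter(), Counter()
    for op, ref_word, hyp_word in align(ref, hyp):
        if ref_word is not None:
            totals[category_of(ref_word)] += 1
        if op != "ok":
            charged = ref_word if ref_word is not None else hyp_word
            errors[category_of(charged)] += 1
    return {cat: errors[cat] / totals[cat] for cat in totals}


# Hypothetical word lists for the three categories on the slide.
FUNCTION_WORDS = {"the", "to"}
APPLICATION_WORDS = {"kluivert"}


def category_of(word):
    if word in FUNCTION_WORDS:
        return "function"
    if word in APPLICATION_WORDS:
        return "application"
    return "other"


reference = "the ball goes to kluivert".split()
hypothesis = "the ball to kluivert".split()
print(wer_by_category(reference, hypothesis, category_of))
```

In this toy case the single deletion ("goes") falls in the "other" category, giving it a WER of 0.5 while the function and application specific words score 0.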

Word Types

Conclusions
- SNR values explain the WERs to a large extent
- More data is not necessarily better
- Applying noise robust features leads to best results
- Overall WERs are very high, but application specific words are recognised relatively well

The end