Methodologies for improving the g2p conversion of Dutch names
Henk van den Heuvel, Nanneke Konings (CLST, Radboud Universiteit Nijmegen)
Jean-Pierre Martens (ELIS-DSSP, Universiteit Gent)
NVFW, 9 June

AUTONOMATA project (STEVIN, call 1)
Deliverables
– Name transcription tools described in this presentation
– Spoken name corpus to support research in ANR
Partners
Phonemic transcription of names
A good name transcription is imperative for the success of many speech-based services: directory assistance, car navigation, etc.
Problem: general-purpose g2ps often perform poorly on names
Possible solutions
– train dedicated g2ps from name-type-specific pronunciation dictionaries
– train dedicated p2ps that aim to correct the mistakes of one common general-purpose g2p
Advantages of the p2p approach
– the p2p must only model phenomena that are specific to the envisaged name type
– compact p2p converter
– small training lexicon suffices to reach asymptotic performance
– the p2p has access to suprasegmental knowledge (syllabification, stress assignment), since its input is the output of the g2p
Two approaches
1. Inductive: data-driven, MBL
2. Deductive: knowledge-driven, rules
Our data

Training lexicon sizes:
Name type     | NL     | FL
First names   | 19,252 |  7,253
Second names  | 71,987 | 30,025
Toponyms      | 38,121 | 94,727

Sizes of the test sets:
First names: 3,000; Second names: 8,380; Toponyms: 12,480
System architecture
orthography → general-purpose g2p converter → initial phonemic transcription → p2p converter (automatically learned stochastic correction rules) → final phonemic transcription
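The two-stage architecture can be sketched as below. The toy lexicon lookup and the two correction rules are illustrative stand-ins, not the project's actual converters or rules; the transcription notation follows the slides' examples.

```python
# Sketch of the two-stage architecture: a general-purpose g2p produces an
# initial transcription, then a p2p converter applies learned correction
# rules to it. Both components here are hypothetical toys.

def general_purpose_g2p(orthography: str) -> str:
    """Stand-in for a general-purpose g2p converter (lookup-based toy)."""
    toy_lexicon = {"Rodenbachlaan": "'ro.d$.bA.xlan"}
    return toy_lexicon.get(orthography, "")

def p2p_correct(transcription: str, rules) -> str:
    """Apply (pattern, replacement) correction rules left to right."""
    for pattern, replacement in rules:
        transcription = transcription.replace(pattern, replacement)
    return transcription

# Two toy correction rules in the spirit of the later slides:
# /$./ -> /$n./ and /.x/ -> /x./ (a resyllabification).
rules = [("$.", "$n."), (".x", "x.")]

initial = general_purpose_g2p("Rodenbachlaan")
final = p2p_correct(initial, rules)
```

In a real system the p2p rules carry probabilities and context conditions, as described on the next slide; plain string replacement is only the simplest possible realisation.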
Stochastic correction rules
Rule format
– IF (input = F) & (context = C) THEN (output = F’) with probability P
– F, F’ = phonemic patterns (sequences of phonemes)
– C = features describing the phonemic + orthographic context
Rule types
– SS: stress substitution rules (no stress = stress level 0)
– PSD: phonemic pattern substitution & deletion rules
– PI: phonemic pattern insertion rules
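One way to represent such a rule in code is sketched below; the context test, the probability threshold, and the example rule are all illustrative assumptions, not the project's actual rule inventory.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class CorrectionRule:
    focus: str                            # input pattern F
    output: str                           # output pattern F'
    context: Callable[[str, int], bool]   # context test C at the match position
    prob: float                           # rule probability P

def apply_rule(transcription: str, rule: CorrectionRule,
               threshold: float = 0.5) -> str:
    """Apply the rule at the first matching position whose context test
    fires, but only if the rule is probable enough."""
    if rule.prob < threshold:
        return transcription
    i = transcription.find(rule.focus)
    if i >= 0 and rule.context(transcription, i):
        return (transcription[:i] + rule.output
                + transcription[i + len(rule.focus):])
    return transcription

# Toy PSD rule: substitute /$/ by /$n/ when a syllable boundary follows.
rule = CorrectionRule(
    focus="$",
    output="$n",
    context=lambda t, i: t[i + 1: i + 2] == ".",
    prob=0.9,
)
```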
Rule learning process (4 steps)
Inputs: orthography, initial transcription, correct transcription
(1) Aligner → (2) Transformation Generator → (3) Example Generator → (4) Rule Learner
(1) Align transcriptions
Two types of alignments
– p-to-p: align initial (I) to correct (C) transcription
– p-to-g: align initial (I) to orthographic (O) transcription
Unified approach
– work with correspondence models: per symbol of I, an associated set of 2nd-transcription units
– an I-phoneme can be lined up with any unit of the other transcription; the cost is low if the unit belongs to the associated set of this phoneme
– an I-prosodic mark can only be lined up with a unit of its associated set, e.g. for p-to-p: the associated set of prosodic marks
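A minimal dynamic-programming aligner in this spirit is sketched below. The cost values (0 for an associated pairing, 2 otherwise, 1 per gap) and the associated sets are illustrative assumptions, not the project's correspondence models.

```python
# Align two transcriptions symbol by symbol. Each symbol of the initial
# transcription I has an associated set of units it may be lined up with
# cheaply; any other pairing costs more, and gaps (empty cells, '~')
# cost 1. Standard edit-distance DP with backtracking.

def align(I, C, assoc, gap="~"):
    """Align sequence I to sequence C; return the two padded sequences."""
    n, m = len(I), len(C)
    INF = float("inf")
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == INF:
                continue
            if i < n and j < m:  # line up I[i] with C[j]
                c = 0 if C[j] in assoc.get(I[i], {I[i]}) else 2
                if cost[i][j] + c < cost[i + 1][j + 1]:
                    cost[i + 1][j + 1] = cost[i][j] + c
                    back[i + 1][j + 1] = "pair"
            if i < n and cost[i][j] + 1 < cost[i + 1][j]:  # gap in C
                cost[i + 1][j] = cost[i][j] + 1
                back[i + 1][j] = "del"
            if j < m and cost[i][j] + 1 < cost[i][j + 1]:  # gap in I
                cost[i][j + 1] = cost[i][j] + 1
                back[i][j + 1] = "ins"
    aligned_I, aligned_C, i, j = [], [], n, m
    while i or j:  # trace the cheapest path back to (0, 0)
        op = back[i][j]
        if op == "pair":
            aligned_I.append(I[i - 1]); aligned_C.append(C[j - 1]); i -= 1; j -= 1
        elif op == "del":
            aligned_I.append(I[i - 1]); aligned_C.append(gap); i -= 1
        else:
            aligned_I.append(gap); aligned_C.append(C[j - 1]); j -= 1
    return aligned_I[::-1], aligned_C[::-1]
```

By default a symbol is only associated with itself; passing e.g. `assoc={"$": {"$", "e"}}` makes the /$/-to-"e" pairing free, which is how a p-to-g alignment can be obtained with the same routine.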
(2) Generate transformations from I-to-C
Step 1: stress mark substitutions
I: ’ r o. d $ ~. b A ~. ’2 x l a n
C: ’ r o. d $ n. b A x. ~ ~ l a n
Step 2: remove stress marks
I: r o. d $ ~. b A ~. x l a n
C: r o. d $ n. b A x. ~ l a n
Step 3: phonemic pattern transformations
– left-to-right longest mismatch (prosodic marks ignored)
– omit empty cells (~)
/$/ → /$n/ and /.x/ → /x./
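Step 3 can be sketched as below: scan the aligned, stress-free sequences left to right, collect maximal mismatching runs (not broken by prosodic marks), and drop the empty cells. Attaching a pure insertion to its preceding phoneme, so that /~/→/n/ becomes /$/→/$n/, is my assumption to reproduce the slide's output; it is not spelled out on the slide.

```python
def extract_transformations(ai, ac, gap="~", prosodic=(".",)):
    """From aligned sequences ai (initial) and ac (correct), collect
    left-to-right longest-mismatch transformations, ignoring prosodic
    marks as run breakers and omitting empty cells."""
    def mismatch(k):
        return ai[k] != ac[k]

    transforms, i = [], 0
    while i < len(ai):
        if not mismatch(i):
            i += 1
            continue
        j = i  # extend the run over mismatches and prosodic marks
        while j < len(ai) and (mismatch(j) or ai[j] in prosodic):
            j += 1
        while j > i and not mismatch(j - 1):  # trim trailing matches
            j -= 1
        src = [s for s in ai[i:j] if s != gap]
        dst = [s for s in ac[i:j] if s != gap]
        if not src:  # pure insertion: attach to the preceding phoneme
            src = [ai[i - 1]] + src
            dst = [ai[i - 1]] + dst
        transforms.append(("".join(src), "".join(dst)))
        i = j
    return transforms
```

On the slide's example this yields exactly the two transformations /$/→/$n/ and /.x/→/x./.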
(3) Generate training examples (PSD only)
Segment I into rule inputs & phonemic units
– transformation list = /.x/ → /x./, /$/ → /$n/
– segmentation result (rule inputs highlighted on the slide):
I: r o. d $. b A. x l a n
– rule outputs follow from the I-to-C alignment
Determine the graphemes lined up with each rule input
I: r o. d $ ~. b A. x ~ l a n
O: r o ~ d e n ~ b a ~ c h l aa n
Extract linguistic features describing the context
– left/right phonemic symbol, stress level, 1st grapheme, …
(4) Learning rewrite rules
Goal
– train rules in a hierarchy (decision tree)
– one decision tree per focus
(4) Learning rewrite rules: example tree for focus /s/
R1 = /t/?
– Y: /s./ : 1.0
– N: L1 = SVWL? (SVWL: short vowel)
  – Y: /s./ : 0.8, /.s/ : 0.2
  – N: /s./ : 0.14, /.s/ : 0.86
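The slide's tree for the focus /s/ can be encoded and traversed as follows; the yes/no branch assignment is my reading of the (garbled) slide figure, and the question strings are only labels.

```python
# The decision tree from the slide, as nested dicts: each internal node
# asks a yes/no question about the context (R1 = right neighbour,
# L1 = left neighbour); each leaf holds output probabilities.
tree = {
    "question": "R1 == /t/",
    "yes": {"leaf": {"/s./": 1.0}},
    "no": {
        "question": "L1 is short vowel (SVWL)",
        "yes": {"leaf": {"/s./": 0.8, "/.s/": 0.2}},
        "no": {"leaf": {"/s./": 0.14, "/.s/": 0.86}},
    },
}

def classify(node, answers):
    """Walk the tree using `answers`, a dict mapping question -> bool,
    and return the output distribution at the reached leaf."""
    while "leaf" not in node:
        node = node["yes"] if answers[node["question"]] else node["no"]
    return node["leaf"]
```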
Deductive approach
Use human expert knowledge to qualify and correct transcription errors
– Knowledge sources: morphology, phonology, etymology, language origin
Method:
– Compare g2p and correct transcriptions (train set)
– Quantify the errors (relative rule application rate)
– Qualify the underlying causes of the errors
– Formulate corresponding generic rules (no probabilities)
– Implement these in FONPARS
– Evaluate the result (test set)
Implemented deductive rules
– Removal of superfluous stresses, e.g. ‘van’, ‘de’
– Expansion of contractions, e.g. Ciprianussteeg, Trigoniaerf
– Frisian names, e.g. ‘-dyk’, ‘-wyk’
– Syllabification, e.g. diminutives (k.j$ → .kj$)
– French names
– /n/-deletion, e.g. Van Brienenoordbrug
– Degemination, e.g. Holland, Wellekens
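As an illustration of what one such context-free rule might look like, here is a toy degemination rule; the regex, the consonant inventory, and the example transcription `hOl.lAnt` for ‘Holland’ are my assumptions, not the FONPARS implementation.

```python
import re

def degeminate(transcription: str) -> str:
    """Toy degemination rule: collapse a doubled consonant, possibly
    separated by a syllable boundary '.', into a single consonant,
    keeping the boundary (as in 'Holland')."""
    return re.sub(r"([bcdfghjklmnpqrstvwxz])(\.?)\1", r"\2\1", transcription)
```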
Results on Dutch toponyms (test set)
– Inductive approach somewhat better than deductive approach
– At the cost of more rules
– Room for further improvement

Method          | #Rules | WER (%) | PER (%)
g2p only        |        |         |
g2p + inductive |        |         |
g2p + deductive |        |         |
Future work
– More name types (first names, family names)
– Deductive methodology in synergy with the inductive approach
  – deductive approach AFTER inductive approach
  – define features to expose causes to the learning tools, e.g. syllabification errors caused by not respecting the morphological integrity of entities such as ‘kamp’, ‘dijk’, …
– Compare to plain data-driven approaches
  – TiMBL on geographical names: same WER, lower WIR, huge size compared to p2p
  – more tests needed