Automatic Grapheme-Phoneme Conversion for Spoken British English Corpora
C. AURAN, C. BOUZON & D.J. HIRST
Laboratoire Parole et Langage, CNRS UMR6057, Université de Provence

Summary
1. The Aix-MARSEC Project
   - Building Aix-MARSEC
   - Availability of the database
   - Methodology
2. Grapheme-Phoneme Conversion and Alignment
   - The Aix-MARSEC Methodology
   - Integration into PCE
3. Conclusion and Perspectives

The Aix-MARSEC Project

Building Aix-MARSEC
An evolution from the SEC and MARSEC corpora

SEC (Spoken English Corpus)
- 55,000 words, 339 min. and 18 sec.
- BBC 1980s recordings
- 11 speaking styles
- 53 speakers (17 female and 36 male)
- Orthographic transcription
- Syntactic tagging and parsing
- Prosodic annotation: 14 tonetic stress marks (TSM)

MARSEC (Machine Readable SEC)
- Alignment of words and tone groups with the signal
- Conversion of all the TSM to ASCII characters

Aix-MARSEC
- Automatic grapheme-to-phoneme conversion
- Automatic phoneme level alignment
- Automatic intonation annotation using the Momel-Intsint methodology
- 8 annotation levels aligned: phonemes, syllable constituents, syllables, words, feet and rhythmic units, tone groups, Intsint coding
- Tagging and parsing alignment under way

Availability of the database
- Online version:
  - Annotation files (TextGrids)
  - Phonemes data tables
  - Perl and Praat scripts
- CD-Rom version:
  - Annotation files (TextGrids)
  - Phonemes data tables
  - Perl and Praat scripts
  - Sound files (.wav format)

Methodology
1. Orthographic transcription
2. G2P conversion -> raw phonemic transcription
3. Elision prediction -> optimised phonemic transcription
4. Automatic alignment -> aligned phonemic transcription
Annotation levels shown in the diagram: SC (syllable constituent) annotation, syllable annotation, word annotation, TSM annotation, rhythmic annotation

Grapheme-Phoneme Conversion and Alignment

The Aix-MARSEC Methodology
Orthographic transcription -> (G2P conversion) -> Raw phonemic transcription -> (Elision prediction) -> Optimised phonemic transcription -> (Automatic alignment) -> Aligned phonemic transcription
Annotation levels shown in the diagram: SC annotation, syllable annotation, word annotation

Step 1: Orthographic transcription -> (G2P conversion) -> Raw phonemic transcription

G2P Conversion: General principles
- Dictionary-based method (4 dictionaries used)
- Specific processing for numbers, abbreviations, etc.
- Syntagmatic effects (linking r, definite article)
- Result: raw transcription

G2P Conversion: The 4 dictionaries
- Primary pronunciation dictionary (Advanced Learner's Dictionary, Oxford University Press; entries)
- Complementary dictionary (700 entries)
- Problematic forms dictionary (for hesitations, partial words, …; 26 entries)
- Reduced forms dictionary (75 entries)
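The dictionary-based lookup described on these slides might be sketched as follows. This is only an illustration, not the authors' Perl implementation: the lookup order, the function names and every toy entry (written in SAMPA) are assumptions made for the example, and the reduced forms dictionary is left out. The definite-article rule stands in for the syntagmatic effects mentioned in the general principles.

```python
# Toy stand-ins for three of the four dictionaries; all entries are illustrative.
PRIMARY = {"the": "D@", "cat": "k{t", "apple": "{pl"}
COMPLEMENTARY = {"aix": "eks"}   # hypothetical proper-noun entry
PROBLEMATIC = {"erm": "3:m"}     # hypothetical hesitation entry

# First characters of SAMPA vowel symbols (enough for this sketch).
VOWEL_INITIALS = set("{@AEIOUQV3aeiou")


def lookup(word):
    """Return the first transcription found, or None if the word is unknown."""
    for dictionary in (PRIMARY, COMPLEMENTARY, PROBLEMATIC):  # assumed order
        if word in dictionary:
            return dictionary[word]
    return None


def transcribe(words):
    """Look up each word, then apply a definite-article rule across words."""
    phones = [lookup(w.lower()) for w in words]
    for i in range(len(words) - 1):
        following = phones[i + 1]
        # "the" takes a different vowel before a vowel-initial word.
        if words[i].lower() == "the" and following and following[0] in VOWEL_INITIALS:
            phones[i] = "Di"
    return phones


print(transcribe(["the", "cat"]))
print(transcribe(["the", "apple"]))
```

Unknown words come back as `None` here; a real system would need a fallback for them, which the slides do not describe.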

G2P Conversion: Specific issues
- Abbreviations
- Numbers
- Sequences of numbers and capitals (Post Codes)
- Genitives and Contractions
- 3rd person and plural forms
- Preterite and past participle forms
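The slides list inflected forms as a specific issue without saying how they are handled. One plausible treatment, sketched below, derives them from the base form using the standard English allomorphy of the -s and -ed suffixes. The function names and the list-of-SAMPA-symbols representation are assumptions made for the example.

```python
# Final consonants that condition the suffix allomorphs (SAMPA symbols).
SIBILANTS = {"s", "z", "S", "Z", "tS", "dZ"}
VOICELESS = {"p", "t", "k", "f", "T"}


def add_s(phones):
    """Plural, genitive or 3rd person -s: /Iz/ after sibilants, /s/ after
    other voiceless consonants, /z/ elsewhere."""
    last = phones[-1]
    if last in SIBILANTS:
        return phones + ["I", "z"]
    if last in VOICELESS:
        return phones + ["s"]
    return phones + ["z"]


def add_ed(phones):
    """Preterite or past participle -ed: /Id/ after /t, d/, /t/ after
    other voiceless consonants, /d/ elsewhere."""
    last = phones[-1]
    if last in {"t", "d"}:
        return phones + ["I", "d"]
    if last in VOICELESS | {"s", "S", "tS"}:
        return phones + ["t"]
    return phones + ["d"]


print(add_s(["k", "{", "t"]))        # cats
print(add_ed(["w", "Q", "n", "t"]))  # wanted
```

Irregular forms would still have to come from a dictionary.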

Step 2: Orthographic transcription -> (G2P conversion) -> Raw phonemic transcription -> (Elision prediction) -> Optimised phonemic transcription

Elision Prediction: General principles
- Raw transcription: citation forms
- Continuous speech: specific phenomena (elisions, epenthesis, metathesis, etc.)

Elision prediction: Constraints
- Intonation constraints (TSM)
- Temporal constraints:
  - Minimal threshold: 5 ms
  - Thresholds for specific phonemes (Klatt, 1979): /t – d/ = 55 ms; 55 ms; /T/ = 110 ms
  - Lengthening « z » factor: z < 0: elision; z ≥ 0: no elision
- Phonotactic constraints (rules)

Elision prediction: Rules
[Rule table not preserved in the transcript. Legend: Th. = duration threshold]

Elision prediction: Evaluation
- 4077 elided phonemes out of 199,770 in the corpus (2 %)
- Half of all elisions are correctly predicted
- ¾ of predicted elisions are correct
- Global quality of the algorithm
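In standard terms, "half of all elisions are correctly predicted" is a recall of about 0.5 and "¾ of predicted elisions are correct" is a precision of about 0.75. The short calculation below works through what those figures imply. The F1 score and the estimated number of predictions are derived here for illustration and do not appear on the slides; both rest on reading "half" and "¾" as exact values.

```python
total_phonemes = 199_770
elided = 4_077

# Share of phonemes that are elided in the corpus.
elision_rate = elided / total_phonemes

recall = 0.5      # half of all elisions are correctly predicted
precision = 0.75  # three quarters of predicted elisions are correct

# Harmonic mean of precision and recall (derived, not reported on the slide).
f1 = 2 * precision * recall / (precision + recall)

# Implied counts: correct predictions, then total predictions made.
true_positives = recall * elided
predicted = true_positives / precision

print(f"elision rate: {elision_rate:.2%}")          # 2.04%
print(f"F1: {f1:.2f}")                              # 0.60
print(f"predicted elisions: {round(predicted)}")    # 2718
```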

Step 3: Orthographic transcription -> (G2P conversion) -> Raw phonemic transcription -> (Elision prediction) -> Optimised phonemic transcription -> (Automatic alignment) -> Aligned phonemic transcription

Alignment: General principles
HMM and Viterbi based alignment by Christophe Lévy (LIA, France)
- HMM trained on the TIMIT corpus of American English
- Gaussian Mixture Model (8 components and diagonal covariance matrices, estimated through the Expectation-Maximisation algorithm optimising the Maximum-Likelihood criterion)
- 12 MFCC (filter bank analysis) augmented with energy, delta and delta-delta coefficients: 39-coefficient vector per speech frame
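The 39-coefficient figure follows from 12 MFCC plus energy (13 static values), each with a delta and a delta-delta: 13 × 3 = 39. The sketch below shows that stacking on plain Python lists. The central-difference delta is a simple stand-in chosen for the example; the slides do not say which delta formula the aligner uses, and the function names are hypothetical.

```python
def deltas(frames):
    """Central differences along time; edge frames reuse their own value
    as the missing neighbour."""
    n = len(frames)
    out = []
    for t in range(n):
        previous = frames[max(t - 1, 0)]
        following = frames[min(t + 1, n - 1)]
        out.append([(b - a) / 2 for a, b in zip(previous, following)])
    return out


def stack_features(static):
    """Append delta and delta-delta coefficients to each static frame."""
    d1 = deltas(static)
    d2 = deltas(d1)
    return [s + a + b for s, a, b in zip(static, d1, d2)]


# Five dummy frames of 13 static values (12 MFCC + energy).
static = [[float(t)] * 13 for t in range(5)]
features = stack_features(static)
print(len(features), len(features[0]))  # 5 39
```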

Alignment: Evaluation
- Absolute mean error: 22 ms
- Mean error: -6.29 ms
- Kurtosis: 8.15 (narrow distribution)
- Skewness: -0.94 (left bias)

Alignment: Evaluation

Acceptance threshold    Optimised transcription
64 ms                   93.25 %
32 ms                   82.02 %
20 ms                   68.37 %
16 ms                   59.97 %
15 ms                   57.40 %
10 ms                   42.43 %
5 ms                    23.72 %
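Statistics of this kind can be computed from a list of signed boundary errors (automatic minus manual boundary time). The sketch below is an illustration on made-up numbers, not a reproduction of the reported results. Two choices are assumptions: kurtosis is computed as the raw fourth standardised moment (3 for a normal distribution), and a boundary counts as accepted when its absolute error is at most the threshold. The slides do not state either convention.

```python
def summarise(errors_ms):
    """Return mean, mean absolute error, skewness and raw kurtosis."""
    n = len(errors_ms)
    mean = sum(errors_ms) / n
    abs_mean = sum(abs(e) for e in errors_ms) / n
    m2 = sum((e - mean) ** 2 for e in errors_ms) / n
    m3 = sum((e - mean) ** 3 for e in errors_ms) / n
    m4 = sum((e - mean) ** 4 for e in errors_ms) / n
    return {
        "mean": mean,
        "abs_mean": abs_mean,
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
    }


def acceptance_rate(errors_ms, threshold_ms):
    """Percentage of boundaries whose absolute error is within the threshold."""
    accepted = sum(1 for e in errors_ms if abs(e) <= threshold_ms)
    return 100 * accepted / len(errors_ms)


errors = [-10, -5, 0, 5, 10]  # made-up boundary errors in ms
print(summarise(errors))
print(acceptance_rate(errors, 5))
```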

Integration into PCE
Integration: Motivations
- Double focus:
  - Segmental phenomena: formant charts
  - Prosodic phenomena: tonal alignment
- Phoneme level alignment
- For phoneticians and phonologists

Integration: 2 possible policies
- Direct integration:
  - Exact Aix-MARSEC methodology
  - Requires word level manual alignment
- Alternative integration:
  - Adaptation of the Aix-MARSEC methodology
  - Optional elisions predicted on the basis of phonotactic rules only + decision during the alignment phase
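Under the alternative policy, a phoneme flagged as optionally elided gives rise to competing pronunciations, and the aligner decides between them. A minimal sketch of the variant generation is shown below. The function name and the hand-picked optional position are assumptions for the example; in practice the positions would come from the phonotactic rules, and the acoustic scoring that picks a variant is not modelled here.

```python
from itertools import product


def variants(phones, optional):
    """List every pronunciation obtained by keeping or dropping the
    phonemes whose indices are in `optional`. The full form comes first."""
    choices = []
    for i, phone in enumerate(phones):
        if i in optional:
            choices.append(([phone], []))  # keep it, or elide it
        else:
            choices.append(([phone],))     # always kept
    return [[p for part in combo for p in part] for combo in product(*choices)]


# "last night" in SAMPA, with the /t/ of "last" marked as optional.
phones = ["l", "A:", "s", "t", "n", "aI", "t"]
for v in variants(phones, {3}):
    print(" ".join(v))
```

The number of variants doubles with each optional phoneme, so a real implementation would more likely encode the options in the alignment network than enumerate them.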

Conclusions and Perspectives

- An easily extensible, fully automatic methodology
- Diverse types of exploitation, phonological or phonetic, segmental or prosodic (formant charts; temporal, intonational and metrical studies, …)
- Full interactivity with other ProZEd modules (Momel-Intsint, …)
- Realistic integration into PCE (2 options)

Well… This time it's for good!!
Presentation available from

14 ASCII prosodic annotation symbols (Roach, 1994):
_ low level
~ high level
< step-down
> step-up
/ (high) rise-fall /high \high fall fall-rise /high rise,low rise low fall,\(low rise-fall – not used) \,low fall-rise
* stressed but unaccented
| minor intonation unit boundary
|| major intonation unit boundary

Reduced forms processing
- Creation of a reduced forms dictionary based on O'Connor (1967) and Faure (1975)
- Reduction constraint: TSM absence
- Aim: improving G2P conversion
Example:
- TSM: /and converted into /{nd/
- No TSM: and converted into
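The reduction constraint can be read as: use the reduced form only when the word carries no tonetic stress mark. The sketch below illustrates that reading. The token format (a TSM written as leading non-letter characters, as in the "/and" example), the function names and the weak form "@n" are assumptions for the example; the reduced form actually used in the corpus is not shown in this transcript.

```python
FULL = {"and": "{nd"}
REDUCED = {"and": "@n"}  # illustrative weak form, not taken from the source


def split_tsm(token):
    """Split a token into its leading TSM characters and the word itself."""
    i = 0
    while i < len(token) and not token[i].isalpha():
        i += 1
    return token[:i], token[i:]


def transcribe(token):
    """Pick the reduced form only when the word carries no TSM."""
    tsm, word = split_tsm(token)
    word = word.lower()
    if not tsm and word in REDUCED:
        return REDUCED[word]
    return FULL.get(word)


print(transcribe("/and"))  # word with a TSM: full form
print(transcribe("and"))   # word without a TSM: reduced form
```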