Data-driven approach to rapid prototyping Xhosa speech synthesis Albert Visagie Justus Roux Centre for Language and Speech Technology Stellenbosch University.

Presentation transcript:

Data-driven approach to rapid prototyping Xhosa speech synthesis Albert Visagie Justus Roux Centre for Language and Speech Technology Stellenbosch University South Africa

Introduction
Japan-South African Intergovernmental Science and Technology Cooperation Programme.
Goals:
– Understand what is needed from a linguistic and technology standpoint.
– Build a text-analysis front-end.
– Experimental platform.

Outline
Xhosa:
– orthography,
– phonetics,
– tone.
Approach:
– Text analysis,
– HTS.

Xhosa
– Spoken in South Africa by about 8 million people.
– One of the official languages of South Africa.
– The writing system is relatively young and based on English letters.
– Many dialects.
– Borrowed clicks from Khoisan.

Xhosa: Orthography
Agglutinative language.
Nouns:
– 15 classes (including plural & singular).
– Nouns affixed for diminutive.
Verbs:
– Verbs affixed according to subject, tense, negative etc.
Examples:
– teach: -fund-
– preacher (teacher): umfundisi → u + m(u) + fund + is + i
– small preacher: umfundisana → u + m(u) + fund + is + ana
– He/she will teach them: uzakubafundisa → u + za + ku + ba + fund + is + a

Xhosa: Phonetics
Consonants:
– Implosive /b/.
– Ejectives and aspirated versions of stops.
– 15 clicks.
Vowels:
– Five basic vowels, including long versions.

Xhosa: Tone
According to the literature, it’s a tone language.
High, Low, and Falling tones.
Recent dictionary: has tone marked for root morphemes; rules can be constructed to predict movement under morphological composition.
Recent work:
– Downing, Roux, argue for accent.
– Kuun: Statistical experiment suggests highly regular structure.
Observed regularity on pitch rises and duration increase gives a simple method to use in a first prototype.

Approach
Focus on language dependent components:
– Build the text analyser,
– use an existing synthesiser.
Choice: HTS 2.0
– Model driven, trainable synthesiser.
– Contains language independent F0 and duration models.
– Good use of synthesis database by predicting spectrum, F0 and segment duration separately.

HTS

HTS: Symbolic Features
Each segment of audio (HMM state) is labelled according to its linguistic context.
Examples:
– Phonetic context: labels of preceding and following phones.
– Parts-of-speech.
– Stress or canonical tone.
– Counting.
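The idea of labelling each segment with its context can be sketched roughly as follows. This is only an illustration: the function name `context_features`, the `sil` padding symbol and the `prev-phone+next` string are assumptions made for this sketch, and the string is a simplified triphone-style label, not the actual HTS full-context label format.

```python
def context_features(phones):
    """Attach simple symbolic context to every phone of one word.

    Hypothetical sketch: 'sil' marks the word edges, and the counting
    features are positions from the start and from the end of the word.
    """
    padded = ["sil"] + list(phones) + ["sil"]
    total = len(phones)
    features = []
    for i, phone in enumerate(phones):
        prev_phone = padded[i]        # phone before the current one
        next_phone = padded[i + 2]    # phone after the current one
        features.append({
            "phone": phone,
            "prev": prev_phone,
            "next": next_phone,
            "pos_from_start": i + 1,
            "pos_from_end": total - i,
            "label": f"{prev_phone}-{phone}+{next_phone}",
        })
    return features


if __name__ == "__main__":
    for feat in context_features(["l", "a", "l", "i"]):
        print(feat["label"], feat["pos_from_start"], feat["pos_from_end"])
```

Further features such as part-of-speech or tone would be added as extra keys in the same per-segment record.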

Text Analyser Components
– Orthographic to phonetic
– Morphological analysis
– Parts-of-speech
– Canonical tone marks

Orthographic to Phonetic
– The orthography is very young, and highly consistent with the pronunciation.
– Hand-written letter-to-sound rewrite rules.
– Lexicon for loan words.
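A minimal sketch of how hand-written letter-to-sound rules with a loan-word lexicon might look is given below. The rule table is a small illustrative subset (the three basic click letters, their aspirated digraphs, and the aspirated stops), not the authors' rule set; the names `RULES` and `to_phones`, the longest-match strategy and the pass-through of unlisted letters are all assumptions of this sketch.

```python
# Illustrative subset of grapheme-to-phone rules (assumed, not the authors').
RULES = {
    "ch": "ǀʰ", "qh": "ǃʰ", "xh": "ǁʰ",   # aspirated clicks
    "ph": "pʰ", "th": "tʰ", "kh": "kʰ",   # aspirated stops
    "c": "ǀ", "q": "ǃ", "x": "ǁ",         # plain clicks
}


def to_phones(word, loan_lexicon=None):
    """Convert one orthographic word to a list of phone symbols.

    Loan words are looked up first in an optional lexicon (word -> phones).
    Otherwise the longest matching rule is applied at each position, and
    letters without a rule are copied through unchanged.
    """
    word = word.lower()
    if loan_lexicon and word in loan_lexicon:
        return list(loan_lexicon[word])
    graphemes = sorted(RULES, key=len, reverse=True)  # try digraphs first
    phones = []
    i = 0
    while i < len(word):
        for g in graphemes:
            if word.startswith(g, i):
                phones.append(RULES[g])
                i += len(g)
                break
        else:
            phones.append(word[i])  # no rule: pass the letter through
            i += 1
    return phones


if __name__ == "__main__":
    print(to_phones("xhosa"))
    print(to_phones("waqalisa"))
```

Trying digraphs before single letters keeps "xh" from being read as a plain click followed by "h".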

Morphology
– Specially bootstrapped from a Zulu version for this project.
– Requires a lexicon of root morphemes.
– Works with isolated words.
– Ambiguous!
Ideal: root morpheme boundaries, affix types, POS tagger for disambiguation.
Implemented: None
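To make the "ideal" analysis concrete, here is a toy segmenter that enumerates every way to split an isolated word into prefixes, one root and suffixes. It is not the bootstrapped analyser mentioned above: the three inventories contain only the morphemes from the examples on the orthography slide, and the names `PREFIXES`, `ROOTS`, `SUFFIXES`, `split_all` and `analyse` are hypothetical.

```python
# Tiny illustrative inventories taken from the example words (assumed).
PREFIXES = ["u", "m", "za", "ku", "ba"]
ROOTS = ["fund"]
SUFFIXES = ["is", "i", "a", "ana"]


def split_all(text, inventory):
    """Return every way to write text as a sequence of inventory items."""
    if not text:
        return [[]]
    results = []
    for item in inventory:
        if text.startswith(item):
            for rest in split_all(text[len(item):], inventory):
                results.append([item] + rest)
    return results


def analyse(word):
    """Return all prefix* + root + suffix* analyses of an isolated word."""
    analyses = []
    for root in ROOTS:
        start = word.find(root)
        while start != -1:
            before = word[:start]
            after = word[start + len(root):]
            for pre in split_all(before, PREFIXES):
                for suf in split_all(after, SUFFIXES):
                    analyses.append(pre + [root] + suf)
            start = word.find(root, start + 1)
    return analyses


if __name__ == "__main__":
    for w in ["umfundisi", "umfundisana", "uzakubafundisa"]:
        print(w, [" + ".join(a) for a in analyse(w)])
```

With a realistic lexicon the returned list will often hold more than one analysis, which is the ambiguity that a POS tagger would be needed to resolve.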

Parts-of-Speech
Morphological analysis.
Ideal: POS tagger.
Implemented: Exhaustive lists of closed sets – pronouns, conjunctions, prepositions, etc.
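The closed-set approach amounts to a table lookup with a default for everything else. The sketch below assumes a few common Xhosa function words as illustrative entries; they are not the authors' lists, and the tag names and the `OPEN` fallback are hypothetical.

```python
# Illustrative closed-class entries (assumed, far from exhaustive).
CLOSED_CLASSES = {
    "pronoun": {"mna", "wena", "thina"},
    "conjunction": {"kodwa", "okanye"},
}

# Invert the lists once so that lookup is a single dictionary access.
WORD_TO_TAG = {
    word: tag for tag, words in CLOSED_CLASSES.items() for word in words
}


def tag_words(words):
    """Tag each word with its closed class, or 'OPEN' if it is in no list."""
    return [(w, WORD_TO_TAG.get(w.lower(), "OPEN")) for w in words]


if __name__ == "__main__":
    print(tag_words(["Kodwa", "thina", "amadoda"]))
```

Every noun and verb falls into the single `OPEN` class here, which is why a proper tagger is listed as the ideal.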

Tone
– A printed dictionary with canonical tone markings for root morphemes is available.
– Rules can be constructed to determine movement of at least High tones, under morphological composition.
– Highly regular structure: 3rd-from-last syllable starts high pitch excursion, 2nd-from-last syllable lengthened.
Ideal: Exhaustive specification of set tones
Implemented: Word-level syllable counts (3-1, 2-2, 1-3)
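One possible reading of the word-level counts is that each syllable is labelled with its position from the end and from the start of the word, so a three-syllable word gets 3-1, 2-2, 1-3. The sketch below follows that reading, which is an assumption. The syllabifier is also a simplification: it closes a syllable at every vowel letter, so it ignores syllabic nasals and would count a long vowel written with two letters as two syllables.

```python
VOWELS = set("aeiou")


def syllabify(word):
    """Split a word into open syllables, closing one at every vowel letter."""
    syllables = []
    current = ""
    for ch in word.lower():
        current += ch
        if ch in VOWELS:
            syllables.append(current)
            current = ""
    if current:  # trailing consonants: attach to the last syllable
        if syllables:
            syllables[-1] += current
        else:
            syllables.append(current)
    return syllables


def count_labels(word):
    """Label each syllable as '<position from end>-<position from start>'."""
    syls = syllabify(word)
    n = len(syls)
    return [(s, f"{n - i}-{i + 1}") for i, s in enumerate(syls)]


def prosody_marks(word):
    """Return the syllables where the observed regularity would apply."""
    syls = syllabify(word)
    n = len(syls)
    return {
        "pitch_rise_starts": syls[n - 3] if n >= 3 else None,
        "lengthened": syls[n - 2] if n >= 2 else None,
    }


if __name__ == "__main__":
    print(count_labels("ukuba"))
    print(prosody_marks("amadoda"))
```

Because the labels are purely positional, the model can learn the pitch rise and the lengthening from data without any tone dictionary.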

Tests
Basic intelligibility test: Listeners asked to transcribe what they hear.
– Incomplete phrases.
– Two versions of the question set, and natural utterances (recoded).
– Mother-tongue and second language speakers.
Impressions:
– “He’s from the townships.”
– “That’s perfect, there’s nothing wrong with that.”
– Also frowns and repeats.

Next Steps
– Comprehension test? Impressions.
– Baseline comparative/preference test.
– Improvements:
  – Question phrases.
  – Information from morphological analysis.
  – Canonical tone markings.
– Zulu

Conclusion
– The system worked very well, considering the bare minimum of knowledge currently incorporated.
– The data-driven approach with HTS is well suited to bootstrapping a new language.
– An experimental platform is now in place.

Demos
“Ubangele amadoda amaninzi kule lali,”
– Natural:
– Synthesised:
“waqalisa ukunqwenela ukuba nomzi.”
– Natural:
– Synthesised:
Click song: