Letter to Phoneme Alignment
Reihaneh Rabbany, Shahin Jabbari


OUTLINE
Motivation
Problem and its Challenges
Relevant Works
Our Work
  Formal Model
  EM
  Dynamic Bayesian Network
Evaluation
  Letter to Phoneme Generator
  AER
Result

TEXT TO SPEECH PROBLEM
Conversion of text to speech: TTS
Applications:
- Automated telecom services by phone
- Banking systems
- Assistance for handicapped people

PRONUNCIATION
Pronunciation of words:
- Dictionary words: dictionary look-up
- Non-dictionary words: phonetic analysis
Non-dictionary words arise because:
- Language is alive; new words keep being added
- Proper nouns
[Figure: Word -> Phonetic Analysis -> Pronunciation]

OUTLINE: Motivation; Problem and its Challenges; Relevant Works; Our Work (Formal Model, EM, Dynamic Bayesian Network); Evaluation (Letter to Phoneme Generator, AER); Result

PROBLEM
Letter to phoneme (L2P) alignment:
- Letters:  c a k e
- Phonemes: k ei k

CHALLENGES
No consistency (the same letter can map to different phonemes):
- City -> /s/
- Cake -> /k/
- Kid  -> /k/
No transparency (letter and phoneme counts can differ):
- K i d (3)     -> /k i d/ (3)
- S i x (3)     -> /s i k s/ (4)
- Q u e u e (5) -> /k j u:/ (3)
- A x e (3)     -> /a k s/ (3)

OUTLINE: Motivation; Problem and its Challenges; Relevant Works; Our Work (Formal Model, EM, Dynamic Bayesian Network); Evaluation (Letter to Phoneme Generator, AER); Result

ONE-TO-ONE EM (Daelemans et al., 1996)
- Assumes the length of the word equals the length of the pronunciation
- Produces all possible alignments by inserting null letters/phonemes
- Computes alignment probabilities
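The slide gives no code, so the following is only a minimal sketch of the enumeration step, assuming a null symbol "_" and that nulls are inserted into the shorter sequence until lengths match; all names are illustrative, not the authors' implementation. EM would then re-estimate letter-phoneme pair probabilities from counts over these candidates.

```python
from itertools import combinations

NULL = "_"  # hypothetical symbol for an inserted null letter/phoneme

def pad_with_nulls(seq, target_len):
    """All ways to insert NULLs into seq so that it reaches target_len."""
    gaps = target_len - len(seq)
    padded_seqs = []
    for null_positions in combinations(range(target_len), gaps):
        padded, rest = [], iter(seq)
        for i in range(target_len):
            padded.append(NULL if i in null_positions else next(rest))
        padded_seqs.append(padded)
    return padded_seqs

def one_to_one_alignments(letters, phonemes):
    """Candidate alignments: pad the shorter side, then pair position-wise."""
    n = max(len(letters), len(phonemes))
    return [list(zip(l, p))
            for l in pad_with_nulls(list(letters), n)
            for p in pad_with_nulls(list(phonemes), n)]

# one_to_one_alignments("cake", ["k", "ei", "k"]) includes, among others,
# [('c', 'k'), ('a', 'ei'), ('k', 'k'), ('e', '_')]
```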

DECISION TREE (Black et al., 1996)
- Train a CART using an aligned dictionary
- Why CART? A single tree for each letter

KONDRAK
Alignments are not always one-to-one:
- A x e   -> /a k s/
- B o o k -> /b ú k/
Uses only null phonemes
Similar to one-to-one EM:
- Produce all possible alignments
- Compute the probabilities

OUTLINE: Motivation; Problem and its Challenges; Relevant Works; Our Work (Formal Model, EM, Dynamic Bayesian Network); Evaluation (Letter to Phoneme Generator, AER); Result

FORMAL MODEL
- Word: a sequence of letters
- Pronunciation: a sequence of phonemes
- Alignment: a sequence of subalignments
- Problem: finding the most probable alignment
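The slide states the objective only in prose; written out (with assumed notation, since none survives in the transcript), it is the following maximization.

```latex
% Assumed notation: word l = l_1 \dots l_n, pronunciation p = p_1 \dots p_m,
% alignment a = (a_1, \dots, a_k), each a_j a (letter chunk, phoneme chunk) pair.
\hat{a} = \operatorname*{arg\,max}_{a}\; P(a \mid l, p)
```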

MANY-TO-MANY EM
1. Initialize prob(subalignments)
// Expectation step
2. For each word in the training set:
   2.1. Produce all possible alignments
   2.2. Choose the most probable alignment
// Maximization step
3. For all subalignments:
   3.1. Compute new_p(subalignments)
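Since step 2.2 keeps only the single best alignment per word, this is a hard (Viterbi-style) EM. Below is a minimal runnable sketch of that loop; the chunk-size caps MAX_L and MAX_P and all function names are illustrative assumptions, not the authors' implementation.

```python
from collections import defaultdict

MAX_L, MAX_P = 2, 2  # assumed caps on letter/phoneme chunk sizes

def all_alignments(letters, phonemes):
    """Enumerate segmentations into (letter chunk, phoneme chunk) pairs;
    a phoneme chunk of length 0 plays the role of a null phoneme."""
    if not letters and not phonemes:
        yield []
        return
    for i in range(1, min(MAX_L, len(letters)) + 1):
        for j in range(min(MAX_P, len(phonemes)) + 1):
            for rest in all_alignments(letters[i:], phonemes[j:]):
                yield [(letters[:i], phonemes[:j])] + rest

def score(alignment, prob):
    """Probability of an alignment as the product of its pair probabilities."""
    s = 1.0
    for pair in alignment:
        s *= prob[pair]
    return s

def hard_em(words, iterations=10):
    prob = defaultdict(lambda: 1.0)  # uniform start; first-pass ties broken arbitrarily
    for _ in range(iterations):
        counts = defaultdict(float)
        for letters, phonemes in words:                   # E-step (2.1, 2.2)
            best = max(all_alignments(letters, phonemes),
                       key=lambda a: score(a, prob))
            for pair in best:
                counts[pair] += 1.0
        total = sum(counts.values())                      # M-step (3.1)
        prob = defaultdict(float, {k: v / total for k, v in counts.items()})
    return prob

# e.g. hard_em([("cake", ("k", "ei", "k")), ("kid", ("k", "i", "d"))])
```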

DYNAMIC BAYESIAN NETWORK
Model:
- Subalignments are treated as hidden variables
- Learn the DBN by EM
[Figure: DBN with observed nodes l_i and p_i and hidden node a_i]
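Only the node labels l_i, p_i, a_i survive from the figure; one plausible reading, offered here as an assumption rather than the authors' exact model, is that each hidden subalignment a_i emits a letter and a phoneme independently.

```latex
% Assumed context-independent factorization (figure shows nodes l_i, p_i, a_i):
P(l, p, a) = \prod_{i} P(a_i)\, P(l_i \mid a_i)\, P(p_i \mid a_i)
```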

CONTEXT DEPENDENT DBN
- The context-independence assumption makes the model simpler
- But it is not always correct (example: "chat" vs. "hat")
Model:
[Figure: as before, with an additional edge from a_{i-1} to a_i]
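Under the same hedged reading, the extra a_{i-1} -> a_i edge replaces the prior on a_i with a transition term:

```latex
% Assumed context-dependent factorization (edge a_{i-1} -> a_i added):
P(l, p, a) = \prod_{i} P(a_i \mid a_{i-1})\, P(l_i \mid a_i)\, P(p_i \mid a_i)
```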

OUTLINE: Motivation; Problem and its Challenges; Relevant Works; Our Work (Formal Model, EM, Dynamic Bayesian Network); Evaluation (Letter to Phoneme Generator, AER); Result

EVALUATION DIFFICULTIES
Unsupervised evaluation: no aligned dictionary
Solutions:
- Measure how much the alignment boosts a supervised module: letter to phoneme generator
- Compare the result with a gold alignment: AER

LETTER TO PHONEME GENERATOR
Reports the percentage of correctly generated phonemes and words (metrics sketched below)
How it works:
- Finding chunks: binary classification using instance-based learning
- Phoneme prediction:
  - A phoneme is predicted independently for each letter
  - A phoneme is predicted for each chunk
  - Hidden Markov model
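As a hedged illustration of the two reported metrics (the slide does not define them precisely), word accuracy can be taken as the fraction of words whose full pronunciation is generated correctly, and phoneme accuracy as the fraction of phoneme positions matched; the position-wise comparison below is a simplification, since real evaluations usually align by edit distance.

```python
def word_accuracy(predicted, gold):
    """Fraction of words whose entire generated pronunciation is correct."""
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)

def phoneme_accuracy(predicted, gold):
    """Fraction of phoneme positions generated correctly (naive,
    position-wise; a real evaluation would align by edit distance)."""
    correct = total = 0
    for pred, ref in zip(predicted, gold):
        total += len(ref)
        correct += sum(a == b for a, b in zip(pred, ref))
    return correct / total

# word_accuracy([("k", "ei", "k")], [("k", "ei", "k")]) == 1.0
```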

ALIGNMENT ERROR RATIO (AER)
Evaluating by alignment error ratio:
- Count the pairs common to our aligned output and the gold alignment
- Calculate the AER
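The slide omits the formula; assuming the standard definition from word-alignment evaluation, specialized to a single gold set (no sure/possible split), with A the predicted pairs and G the gold pairs:

```latex
% Assumed standard AER with a single gold set G:
\mathrm{AER}(A, G) = 1 - \frac{2\,\lvert A \cap G \rvert}{\lvert A \rvert + \lvert G \rvert}
```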

OUTLINE: Motivation; Problem and its Challenges; Relevant Works; Our Work (Formal Model, EM, Dynamic Bayesian Network); Evaluation (Letter to Phoneme Generator, AER); Result

RESULTS (fold cross validation)

Model                        Word Accuracy   Phoneme Accuracy
Best previous results:
  One-to-one EM              53.87%          85.66%
  Many-to-many EM            76%             94.5%
DBN:
  Context independent        79.12%          95.23%
  Context dependent          81.54%          96.70%