LETTER TO PHONEME ALIGNMENT
Reihaneh Rabbany, Shahin Jabbari


OUTLINE
◦ Motivation
◦ Problem and its Challenges
◦ Relevant Works
◦ Our Work: Formal Model, EM, Dynamic Bayesian Network
◦ Evaluation: Letter to Phoneme Generator, AER
◦ Result

TEXT TO SPEECH PROBLEM
Conversion of Text to Speech (TTS)
◦ Automated Telecom Services
◦ E-mail by Phone
◦ Banking Systems
◦ Handicapped People

PRONUNCIATION
Pronunciation of the words
◦ Dictionary Words: Dictionary Look-up
◦ Non-Dictionary Words: Phonetic Analysis
  ◦ Language is alive, new words are added
  ◦ Proper Nouns
[Figure: Word → Phonetic Analysis → Pronunciation]

PROBLEM
Letter to Phoneme (L2P) Alignment
◦ Letter: c a k e
◦ Phoneme: k ei k

CHALLENGES
No Consistency
◦ City → /s/
◦ Cake → /k/
◦ Kid → /k/
No Transparency
◦ K i d (3) → /k i d/ (3)
◦ S i x (3) → /s i k s/ (4)
◦ Q u e u e (5) → /k j u:/ (3)
◦ A x e (3) → /a k s/ (3)

ONE-TO-ONE EM (Daelemans et al., 1996)
◦ Length of word = length of pronunciation
◦ Produce all possible alignments
◦ Inserting null letter/phoneme
◦ Alignment probability
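The "produce all possible alignments" step can be sketched as follows. This is a minimal illustration, not the procedure from Daelemans et al.: it assumes that nulls are only inserted into the shorter sequence until both lengths match, and the names `NULL`, `pad_with_nulls` and `one_to_one_alignments` are hypothetical.

```python
from itertools import combinations

NULL = "_"  # hypothetical symbol standing for a null letter or null phoneme


def pad_with_nulls(seq, target_len):
    """Yield every way to insert nulls into seq so it reaches target_len.

    The original order of the symbols in seq is preserved.
    """
    n_nulls = target_len - len(seq)
    for null_positions in combinations(range(target_len), n_nulls):
        symbols = iter(seq)
        yield [NULL if i in null_positions else next(symbols)
               for i in range(target_len)]


def one_to_one_alignments(letters, phonemes):
    """Yield all one-to-one alignments as lists of (letter, phoneme) pairs.

    Assumption: only the shorter sequence is padded with nulls.
    """
    n = max(len(letters), len(phonemes))
    for padded_letters in pad_with_nulls(letters, n):
        for padded_phonemes in pad_with_nulls(phonemes, n):
            yield list(zip(padded_letters, padded_phonemes))


if __name__ == "__main__":
    # "cake" has 4 letters and 3 phonemes, so one null phoneme is inserted.
    for alignment in one_to_one_alignments(list("cake"), ["k", "ei", "k"]):
        print(alignment)
```

Each candidate alignment would then be scored by its alignment probability, and the EM loop would re-estimate the letter/phoneme pair probabilities from those scores.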

DECISION TREE (Black et al., 1996)
◦ Train a CART Using Aligned Dictionary
◦ Why CART?
◦ A Single Tree for Each Letter

KONDRAK
◦ Alignments are not always one-to-one
  ◦ Ax e → /a k s/
  ◦ Boo k → /b ú k/
◦ Only Null Phoneme
◦ Similar to one-to-one EM
  ◦ Produce All Possible Alignments
  ◦ Compute the Probabilities

FORMAL MODEL
◦ Word: sequence of letters
◦ Pronunciation: sequence of phonemes
◦ Alignment: sequence of subalignments
◦ Problem: Finding the most probable alignment
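One way to write the formal model down is sketched here. The symbols ($w$, $p$, $a_k$, $\lambda_k$, $\pi_k$, $\mathcal{A}$) are notation introduced for illustration and do not come from the slides. The product form in the last line assumes that subalignments are independent of one another.

```latex
% Word and pronunciation as sequences
w = l_1 l_2 \dots l_n, \qquad p = p_1 p_2 \dots p_m

% An alignment is a sequence of subalignments, each pairing a
% letter chunk \lambda_k with a phoneme chunk \pi_k
a = a_1 a_2 \dots a_K, \qquad a_k = (\lambda_k, \pi_k)

% The chunks must concatenate back to the word and the pronunciation
\lambda_1 \lambda_2 \dots \lambda_K = w, \qquad \pi_1 \pi_2 \dots \pi_K = p

% Problem: find the most probable alignment among all valid ones
a^{*} = \arg\max_{a \in \mathcal{A}(w, p)} P(a),
\qquad P(a) = \prod_{k=1}^{K} P(a_k) \quad \text{(independence assumption)}
```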

MANY-TO-MANY EM
1. Initialize prob(SubAlignments)
// Expectation Step
2. For each word in training_set
   2.1. Produce all possible alignments
   2.2. Choose the most probable alignment
// Maximization Step
3. For all subalignments
   3.1. Compute new_p(SubAlignments)
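The steps above can be sketched in Python as a hard (Viterbi-style) EM loop. This is a minimal sketch under stated assumptions, not the authors' implementation: chunks are limited to `max_len` symbols on each side, unseen subalignments receive a small constant probability `default`, and the function names `segmentations` and `hard_em` are hypothetical.

```python
from collections import Counter


def segmentations(letters, phonemes, max_len=2):
    """Yield every many-to-many alignment of two tuples.

    An alignment is a list of (letter_chunk, phoneme_chunk) pairs whose
    chunks concatenate back to the inputs. Chunks hold 1..max_len symbols.
    """
    if not letters and not phonemes:
        yield []
        return
    for i in range(1, min(max_len, len(letters)) + 1):
        for j in range(1, min(max_len, len(phonemes)) + 1):
            head = (letters[:i], phonemes[:j])
            for rest in segmentations(letters[i:], phonemes[j:], max_len):
                yield [head] + rest


def hard_em(pairs, iterations=5, max_len=2, default=1e-3):
    """Estimate subalignment probabilities from (letters, phonemes) pairs."""
    prob = {}  # step 1: empty table, so every subalignment starts at `default`
    for _ in range(iterations):
        counts = Counter()
        # Expectation step (hard): keep the single best alignment per word.
        for letters, phonemes in pairs:
            best, best_score = None, -1.0
            for alignment in segmentations(letters, phonemes, max_len):
                score = 1.0
                for sub in alignment:
                    score *= prob.get(sub, default)
                if score > best_score:
                    best, best_score = alignment, score
            if best is not None:
                counts.update(best)
        # Maximization step: relative frequency of each subalignment.
        total = sum(counts.values())
        if total:
            prob = {sub: c / total for sub, c in counts.items()}
    return prob


if __name__ == "__main__":
    data = [
        (tuple("cake"), ("k", "ei", "k")),
        (tuple("kid"), ("k", "i", "d")),
    ]
    for sub, p in sorted(hard_em(data).items(), key=lambda kv: -kv[1]):
        print(sub, round(p, 3))
```

Enumerating all alignments is exponential in word length. A dynamic-programming search would be the usual replacement, but the brute-force version mirrors step 2.1 most directly.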

DYNAMIC BAYESIAN NETWORK
◦ Subalignments are considered as hidden variables
◦ Learn DBN by EM
[Figure: Model, with nodes l_i, P_i and a_i]

CONTEXT DEPENDENT DBN
◦ Context independency assumption
  ◦ Makes the model simpler
  ◦ It is not always a correct assumption
  ◦ Example: Chat and Hat
[Figure: Model, with nodes l_i, P_i, a_i and a_{i-1}]
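The difference between the two variants can be written as two factorizations of the alignment probability. This is a hedged reading of the figure, which adds the previous subalignment a_{i-1} to the model. The exact conditional structure of the authors' DBN is not recoverable from the slide, so the first-order form below is an assumption.

```latex
% Context-independent model: each subalignment is scored on its own
P_{\text{indep}}(a) = \prod_{i=1}^{K} P(a_i)

% Context-dependent model (assumed first-order form): each subalignment
% is conditioned on the one before it
P_{\text{dep}}(a) = P(a_1) \prod_{i=2}^{K} P(a_i \mid a_{i-1})
```

In the Chat and Hat example, the letter h is pronounced differently depending on whether a c precedes it, which is the kind of dependency the context-independent product cannot capture.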

EVALUATION DIFFICULTIES
◦ Unsupervised Evaluation
◦ No Aligned Dictionary
Solutions
◦ How much it boosts a supervised module: Letter to Phoneme Generator
◦ Comparing the result with a gold alignment: AER

LETTER TO PHONEME GENERATOR
◦ Percentage of correctly generated phonemes and words
◦ How does it work?
  ◦ Finding Chunks: Binary Classification Using Instance-Based Learning
  ◦ Phoneme Prediction
    ◦ Phoneme is predicted independently for each letter
    ◦ Phoneme is predicted for each chunk
  ◦ Hidden Markov Model

ALIGNMENT ERROR RATIO (AER)
◦ Evaluating by Alignment Error Ratio
◦ Counting common pairs between
  ◦ Our aligned output
  ◦ Gold alignment
◦ Calculating AER
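The slide does not give the AER formula. A common definition with a single gold alignment is AER = 1 - 2|A ∩ G| / (|A| + |G|), where A is the set of predicted pairs and G the set of gold pairs. The sketch below assumes that definition and represents each pair as a hypothetical (letter_index, phoneme_index) tuple.

```python
def alignment_error_ratio(predicted, gold):
    """Return the AER between a predicted and a gold alignment.

    Both arguments are iterables of (letter_index, phoneme_index) pairs.
    Assumed definition: 1 - 2 * |common pairs| / (|predicted| + |gold|).
    """
    predicted, gold = set(predicted), set(gold)
    if not predicted and not gold:
        return 0.0  # two empty alignments agree trivially
    common = len(predicted & gold)
    return 1.0 - 2.0 * common / (len(predicted) + len(gold))


if __name__ == "__main__":
    # "cake" -> /k ei k/: the gold alignment links both "k" and "e" to the last /k/.
    gold = {(0, 0), (1, 1), (2, 2), (3, 2)}
    predicted = {(0, 0), (1, 1), (2, 2)}
    print(alignment_error_ratio(predicted, gold))
```

A lower value means closer agreement with the gold alignment, and identical alignments score 0.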

RESULTS
10-fold cross validation

Model                      Word Accuracy   Phoneme Accuracy
Best previous results      66.82%          92.45%
One_To_One EM              53.87%          85.66%
Many_To_Many EM            76%             94.5%
DBN, Context Independent   79.12%          95.23%
DBN, Context Dependent     81.54%          96.70%
