Slide 1: Application of Noisy Channel, Channel Entropy. CS 621 Artificial Intelligence, Lecture 15, 06/09/2005. Prof. Pushpak Bhattacharyya, IIT Bombay.

Slide 2: Noisy Channel. A source set S = {s1, s2, ..., sq} is sent through a noisy channel and received as R = {t1, t2, ..., tq}. Speech recognition (ASR, Automatic Speech Recognition) involves signal processing (low level) and cognitive processing (higher-level categories).

Slide 3: Noisy Channel Metaphor. Due to Jelinek (IBM) in the 1970s; the main field of study was speech. Problem definition: S = {speech signals} = {s1, s2, ..., ss} and R = {w1, w2, ..., wr}; the task is to map the signals {s1, s2, ..., sp} to the words {w1, w2, ..., wq}.
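
A small sketch can make the metaphor concrete. The Python snippet below is not from the lecture; the words, signal labels, and probabilities are invented toy values. It models the channel as a conditional distribution P(s|w) and samples a received signal for a transmitted word:

    # Toy noisy channel: a word w is "sent", a signal s is "received".
    # CHANNEL is a hypothetical P(s | w) table, not real acoustic data.
    import random

    CHANNEL = {
        "need": {"n-iy-d": 0.8, "n-iy": 0.2},
        "knee": {"n-iy": 0.9, "n-iy-d": 0.1},
    }

    def transmit(word):
        """Sample a received signal s according to P(s | w)."""
        signals = list(CHANNEL[word])
        weights = [CHANNEL[word][s] for s in signals]
        return random.choices(signals, weights=weights, k=1)[0]

    print(transmit("need"))   # e.g. 'n-iy-d'

Speech recognition is the reverse problem: recover the sent word from the received signal.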

Slide 4: A special and easier case: Isolated Word Recognition (IWR), where the complexity due to word boundaries does not arise. Example: 'I got a plate' vs. 'I got up late'.

Slide 5: Homophones and Homographs. Homophones are words with the same pronunciation, e.g. 'bear', 'beer'. Homographs are words with the same spelling but different meanings, e.g. 'bank': river bank and finance bank.

Slide 6: World of Sounds. The world of sounds (speech signals) is studied by phonetics and phonology; the world of words is captured by orthography, i.e. letters: consonants and vowels.

Slide 7: Vowels. The alphabet-to-sound mapping is not one to one, particularly for vowels: 'tomato' may be pronounced 'tomaeto' or 'tomaato'.

Slide 8: Sound Variations. Lexical variation: 'because' vs. ''cause'. Allophonic variation: 'because' vs. 'becase'.

Slide 9: Allophonic variations, a more remarkable example: 'do' → [d][u], 'go' → [g][o].

Slide 10: Socio-cultural variation: 'something' (formal) vs. 'somethin' (informal). Dialectal variation: 'very' → 'bheri' in Bengal; 'apple' → 'ieple' in the south, 'eple' in the north, 'aapel' in Bengal.

Slide 11: Orthography-to-phonology mapping is a complex problem, very difficult to model with a rule-governed system.

Slide 12: Probabilistic Approach. Given the signal s emerging from the noisy channel, let W* be the best estimate of the spoken word: W* = ARGMAX [ P(w|s) ], where w ranges over the set of words.
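
A minimal sketch of this decision rule, assuming the posteriors P(w|s) for one received signal are already available as a table (the candidate words and numbers are hypothetical):

    def best_word(posteriors):
        """Return the word w maximising P(w|s), given a dict {word: P(w|s)}."""
        return max(posteriors, key=posteriors.get)

    # Hypothetical posteriors for one signal s:
    p_w_given_s = {"plate": 0.55, "late": 0.30, "plait": 0.15}
    print(best_word(p_w_given_s))   # -> 'plate'

The real question, taken up in the next slides, is how to obtain these probabilities in the first place.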

Slide 13: P(w|s) is called a 'parameter' of the system, and estimating the parameters is training. The probability values need to be estimated from speech corpora, i.e. recorded speech of many speakers.

Slide 14: Look of a Speech Corpus. Each signal is paired with an annotation giving a unique pronunciation, e.g. a signal annotated as 'Apple'.

Slide 15: Repositories of standard sound symbols: IPA (the International Phonetic Alphabet, maintained by the International Phonetic Association) and ARPABET (the American phonetic standard).

Slide 16: IPA augments the Roman alphabet with Greek symbols. Examples: [ɛ] as in 'ebb', [i] as in 'need', [t] as in 'top' and 'tool', and the Greek symbol [θ].

Slide 17: Speech corpora are annotated with IPA/ARPABET symbols. Indian scenario: Hindi – TIFR, Marathi – IITB, Tamil – IITM.

Slide 18: How to estimate P(w|s) from a speech corpus? The direct relative-frequency estimate count(w, s) / count(s) is not the way it is done.

Slide 19: Apply Bayes' theorem: P(w|s) = P(w) · P(s|w) / P(s), so W* = ARGMAX [ P(w) · P(s|w) / P(s) ].
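
A toy worked example of this rewrite, with invented numbers, showing that P(s) is the same for every candidate w and therefore does not affect which word wins the ARGMAX:

    # Hypothetical prior P(w) and likelihood P(s|w) for one received signal s.
    prior = {"need": 0.01, "knee": 0.002}
    likelihood = {"need": 0.7, "knee": 0.3}

    # P(s) by total probability, assuming these are the only candidates.
    p_s = sum(prior[w] * likelihood[w] for w in prior)

    posterior = {w: prior[w] * likelihood[w] / p_s for w in prior}
    unnormalised = {w: prior[w] * likelihood[w] for w in prior}

    print(posterior)   # need ~0.92, knee ~0.08
    print(max(posterior, key=posterior.get) == max(unnormalised, key=unnormalised.get))  # True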

Slide 20: Since P(s) is the same for every candidate w, W* = ARGMAX [ P(w) · P(s|w) ], where w ranges over the words. P(w) is the prior, the language model; P(s|w) is the likelihood of w being pronounced as s, the acoustic model.
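
Putting the two factors together, here is a hedged sketch of the decoder W* = ARGMAX [ P(w) · P(s|w) ]; the language-model and acoustic-model tables below are toy values standing in for real trained models:

    # P(w): language model (prior over words) -- hypothetical values.
    language_model = {"need": 0.010, "knee": 0.002, "kneed": 0.0001}

    def acoustic_model(signal, word):
        """Hypothetical P(signal | word) lookup; a real AM scores acoustic features."""
        table = {
            ("n-iy-d", "need"): 0.70,
            ("n-iy-d", "knee"): 0.05,
            ("n-iy-d", "kneed"): 0.70,
        }
        return table.get((signal, word), 1e-6)

    def decode(signal):
        return max(language_model,
                   key=lambda w: language_model[w] * acoustic_model(signal, w))

    print(decode("n-iy-d"))   # 'need': same acoustic score as 'kneed', much higher prior

Here the prior breaks the tie between acoustically identical candidates, which is exactly the role of the language model.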

Slide 21: Acoustic Model. A pronunciation dictionary encoded as a finite state automaton; manually built, hence a costly resource. Example for 'tomato': states 0 to 6 connected by the arcs t, o, m, aa/ae, t, o.
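
A minimal sketch of such a pronunciation entry as a finite state automaton, following the 'tomato' example; the state numbering and phone labels are a reconstruction for illustration, not the exact figure from the slide:

    # transitions[state][phone] -> next state; the aa/ae arcs encode the
    # two vowel variants ('tomaato' / 'tomaeto') from the earlier slide.
    TRANSITIONS = {
        0: {"t": 1},
        1: {"o": 2},
        2: {"m": 3},
        3: {"aa": 4, "ae": 4},
        4: {"t": 5},
        5: {"o": 6},
    }
    FINAL_STATE = 6

    def accepts(phones):
        """True if the phone sequence is a pronunciation accepted by the automaton."""
        state = 0
        for p in phones:
            if p not in TRANSITIONS.get(state, {}):
                return False
            state = TRANSITIONS[state][p]
        return state == FINAL_STATE

    print(accepts(["t", "o", "m", "aa", "t", "o"]))   # True
    print(accepts(["t", "o", "m", "ae", "t", "o"]))   # True
    print(accepts(["t", "o", "m", "o", "t", "o"]))    # False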

Slide 22: W* is obtained from P(w) and P(s|w). What is the language model? The relative frequency of w in the corpus; the relative-frequency model is the unigram model. Suppose P(knee) > P(need): then for the context 'I _ _ _ _ _', 'knee' gets high probability and 'need' low, even though the context favours 'need'; this limitation of unigrams motivates n-gram models.
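
A sketch of the unigram (relative-frequency) language model on a tiny invented corpus; note that it scores each word in isolation and ignores the surrounding context:

    from collections import Counter

    corpus = "i need a break i need some tea my knee hurts".split()
    counts = Counter(corpus)
    total = sum(counts.values())

    def unigram_prob(word):
        # P(w) = count(w) / total number of tokens
        return counts[word] / total

    print(unigram_prob("need"), unigram_prob("knee"))   # 2/11 vs 1/11

If the corpus happened to give 'knee' a higher count than 'need', the unigram model would prefer 'knee' even after 'I', which is the weakness illustrated above.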

Slide 23: Language Modelling by N-grams. N = 2: bigrams; N = 3: trigrams (empirically the best for English).
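
A minimal bigram (N = 2) sketch on a toy corpus, using the maximum-likelihood estimate P(w_i | w_{i-1}) = count(w_{i-1}, w_i) / count(w_{i-1}); no smoothing, so unseen bigrams would get zero probability:

    from collections import Counter

    corpus = "<s> i need a break </s> <s> i need some tea </s>".split()

    unigram = Counter(corpus)
    bigram = Counter(zip(corpus, corpus[1:]))

    def bigram_prob(prev, word):
        return bigram[(prev, word)] / unigram[prev]

    print(bigram_prob("i", "need"))      # 1.0 in this toy corpus
    print(bigram_prob("need", "some"))   # 0.5

Unlike the unigram model, this conditions on the previous word, so 'need' after 'i' can outscore 'knee' even if 'knee' is more frequent overall.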

