Slide 1: Application of Noisy Channel, Channel Entropy. CS 621 Artificial Intelligence, Lecture 15, 06/09/2005. Prof. Pushpak Bhattacharyya, IIT Bombay.

Slide 2: Noisy Channel. A source set S = {s1, s2, ..., sq} is sent through a noisy channel and received as R = {t1, t2, ..., tq}. Speech recognition (ASR, Automatic Speech Recognition) involves signal processing (low level) and cognitive processing (higher-level categories).

Slide 3: Noisy Channel Metaphor. Due to Jelinek (IBM) in the 1970s; the main field of study was speech. Problem definition: S = {speech signals} = {s1, s2, ..., ss} and R = {w1, w2, ..., wr}; the task is to map the signals {s1, s2, ..., sp} to the words {w1, w2, ..., wq}.
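
A small sketch can make the metaphor concrete. The Python snippet below is not from the lecture; the words, signal labels, and probabilities are invented toy values. It models the channel as a conditional distribution P(s|w) and samples a received signal for a transmitted word:

    # Toy noisy channel: a word w is "sent", a signal s is "received".
    # CHANNEL is a hypothetical P(s | w) table, not real acoustic data.
    import random

    CHANNEL = {
        "need": {"n-iy-d": 0.8, "n-iy": 0.2},
        "knee": {"n-iy": 0.9, "n-iy-d": 0.1},
    }

    def transmit(word):
        """Sample a received signal s according to P(s | w)."""
        signals = list(CHANNEL[word])
        weights = [CHANNEL[word][s] for s in signals]
        return random.choices(signals, weights=weights, k=1)[0]

    print(transmit("need"))   # e.g. 'n-iy-d'

Speech recognition is the reverse problem: recover the sent word from the received signal.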

Slide 4: A special and easier case: Isolated Word Recognition (IWR), where the complexity due to word boundaries does not arise. Example: 'I got a plate' vs. 'I got up late'.

Slide 5: Homophones and Homographs. Homophones are words with the same pronunciation, e.g. 'bear', 'beer'. Homographs are words with the same spelling but different meanings, e.g. 'bank': river bank and finance bank.

Slide 6: World of Sounds. The world of sounds (speech signals) is studied by phonetics and phonology; the world of words is captured by orthography, i.e. letters: consonants and vowels.

Slide 7: Vowels. The alphabet-to-sound mapping is not one to one, particularly for vowels: 'tomato' may be pronounced 'tomaeto' or 'tomaato'.

Slide 8: Sound Variations. Lexical variation: 'because' vs. ''cause'. Allophonic variation: 'because' vs. 'becase'.

Slide 9: Allophonic variations, a more remarkable example: 'do' → [d][u], 'go' → [g][o].

Slide 10: Socio-cultural variation: 'something' (formal) vs. 'somethin' (informal). Dialectal variation: 'very' → 'bheri' in Bengal; 'apple' → 'ieple' in the south, 'eple' in the north, 'aapel' in Bengal.

Slide 11: Orthography-to-phonology mapping is a complex problem, very difficult to model with a rule-governed system.

Slide 12: Probabilistic Approach. Given the signal s emerging from the noisy channel, let W* be the best estimate of the spoken word: W* = ARGMAX [ P(w|s) ], where w ranges over the set of words.
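
A minimal sketch of this decision rule, assuming the posteriors P(w|s) for one received signal are already available as a table (the candidate words and numbers are hypothetical):

    def best_word(posteriors):
        """Return the word w maximising P(w|s), given a dict {word: P(w|s)}."""
        return max(posteriors, key=posteriors.get)

    # Hypothetical posteriors for one signal s:
    p_w_given_s = {"plate": 0.55, "late": 0.30, "plait": 0.15}
    print(best_word(p_w_given_s))   # -> 'plate'

The real question, taken up in the next slides, is how to obtain these probabilities in the first place.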

Slide 13: P(w|s) is called a 'parameter' of the system, and estimating the parameters is training. The probability values need to be estimated from speech corpora, i.e. recorded speech of many speakers.

Slide 14: Look of a Speech Corpus. Each signal is paired with an annotation giving a unique pronunciation, e.g. a signal annotated as 'Apple'.

Slide 15: Repositories of standard sound symbols: IPA (the International Phonetic Alphabet, maintained by the International Phonetic Association) and ARPABET (the American phonetic standard).

Slide 16: IPA augments the Roman alphabet with Greek symbols. Examples: [ɛ] as in 'ebb', [i] as in 'need', [t] as in 'top' and 'tool', and the Greek symbol [θ].

Slide 17: Speech corpora are annotated with IPA/ARPABET symbols. Indian scenario: Hindi – TIFR, Marathi – IITB, Tamil – IITM.

Slide 18: How to estimate P(w|s) from a speech corpus? The direct relative-frequency estimate count(w, s) / count(s) is not the way it is done.

Slide 19: Apply Bayes' theorem: P(w|s) = P(w) · P(s|w) / P(s), so W* = ARGMAX [ P(w) · P(s|w) / P(s) ].
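
A toy worked example of this rewrite, with invented numbers, showing that P(s) is the same for every candidate w and therefore does not affect which word wins the ARGMAX:

    # Hypothetical prior P(w) and likelihood P(s|w) for one received signal s.
    prior = {"need": 0.01, "knee": 0.002}
    likelihood = {"need": 0.7, "knee": 0.3}

    # P(s) by total probability, assuming these are the only candidates.
    p_s = sum(prior[w] * likelihood[w] for w in prior)

    posterior = {w: prior[w] * likelihood[w] / p_s for w in prior}
    unnormalised = {w: prior[w] * likelihood[w] for w in prior}

    print(posterior)   # need ~0.92, knee ~0.08
    print(max(posterior, key=posterior.get) == max(unnormalised, key=unnormalised.get))  # True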

Slide 20: Since P(s) is the same for every candidate w, W* = ARGMAX [ P(w) · P(s|w) ], where w ranges over the words. P(w) is the prior, the language model; P(s|w) is the likelihood of w being pronounced as s, the acoustic model.
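
Putting the two factors together, here is a hedged sketch of the decoder W* = ARGMAX [ P(w) · P(s|w) ]; the language-model and acoustic-model tables below are toy values standing in for real trained models:

    # P(w): language model (prior over words) -- hypothetical values.
    language_model = {"need": 0.010, "knee": 0.002, "kneed": 0.0001}

    def acoustic_model(signal, word):
        """Hypothetical P(signal | word) lookup; a real AM scores acoustic features."""
        table = {
            ("n-iy-d", "need"): 0.70,
            ("n-iy-d", "knee"): 0.05,
            ("n-iy-d", "kneed"): 0.70,
        }
        return table.get((signal, word), 1e-6)

    def decode(signal):
        return max(language_model,
                   key=lambda w: language_model[w] * acoustic_model(signal, w))

    print(decode("n-iy-d"))   # 'need': same acoustic score as 'kneed', much higher prior

Here the prior breaks the tie between acoustically identical candidates, which is exactly the role of the language model.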

Slide 21: Acoustic Model. A pronunciation dictionary encoded as a finite state automaton; manually built, hence a costly resource. Example for 'tomato': states 0 to 6 connected by the arcs t, o, m, aa/ae, t, o.
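
A minimal sketch of such a pronunciation entry as a finite state automaton, following the 'tomato' example; the state numbering and phone labels are a reconstruction for illustration, not the exact figure from the slide:

    # transitions[state][phone] -> next state; the aa/ae arcs encode the
    # two vowel variants ('tomaato' / 'tomaeto') from the earlier slide.
    TRANSITIONS = {
        0: {"t": 1},
        1: {"o": 2},
        2: {"m": 3},
        3: {"aa": 4, "ae": 4},
        4: {"t": 5},
        5: {"o": 6},
    }
    FINAL_STATE = 6

    def accepts(phones):
        """True if the phone sequence is a pronunciation accepted by the automaton."""
        state = 0
        for p in phones:
            if p not in TRANSITIONS.get(state, {}):
                return False
            state = TRANSITIONS[state][p]
        return state == FINAL_STATE

    print(accepts(["t", "o", "m", "aa", "t", "o"]))   # True
    print(accepts(["t", "o", "m", "ae", "t", "o"]))   # True
    print(accepts(["t", "o", "m", "o", "t", "o"]))    # False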

Slide 22: W* is obtained from P(w) and P(s|w). What is the language model? The relative frequency of w in the corpus; the relative-frequency model is the unigram model. Suppose P(knee) > P(need): then for the context 'I _ _ _ _ _', 'knee' gets high probability and 'need' low, even though the context favours 'need'; this limitation of unigrams motivates n-gram models.
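
A sketch of the unigram (relative-frequency) language model on a tiny invented corpus; note that it scores each word in isolation and ignores the surrounding context:

    from collections import Counter

    corpus = "i need a break i need some tea my knee hurts".split()
    counts = Counter(corpus)
    total = sum(counts.values())

    def unigram_prob(word):
        # P(w) = count(w) / total number of tokens
        return counts[word] / total

    print(unigram_prob("need"), unigram_prob("knee"))   # 2/11 vs 1/11

If the corpus happened to give 'knee' a higher count than 'need', the unigram model would prefer 'knee' even after 'I', which is the weakness illustrated above.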

Slide 23: Language Modelling by N-grams. N = 2: bigrams; N = 3: trigrams (empirically the best for English).
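
A minimal bigram (N = 2) sketch on a toy corpus, using the maximum-likelihood estimate P(w_i | w_{i-1}) = count(w_{i-1}, w_i) / count(w_{i-1}); no smoothing, so unseen bigrams would get zero probability:

    from collections import Counter

    corpus = "<s> i need a break </s> <s> i need some tea </s>".split()

    unigram = Counter(corpus)
    bigram = Counter(zip(corpus, corpus[1:]))

    def bigram_prob(prev, word):
        return bigram[(prev, word)] / unigram[prev]

    print(bigram_prob("i", "need"))      # 1.0 in this toy corpus
    print(bigram_prob("need", "some"))   # 0.5

Unlike the unigram model, this conditions on the previous word, so 'need' after 'i' can outscore 'knee' even if 'knee' is more frequent overall.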

