Spotting Multilingual Consonant-Vowel Units of Speech using Neural Network Models Suryakanth V.Gangashetty, C. Chandra Sekhar, and B.Yegnanarayana Speech.


2 Spotting Multilingual Consonant-Vowel Units of Speech using Neural Network Models
Suryakanth V. Gangashetty, C. Chandra Sekhar, and B. Yegnanarayana
Speech and Vision Laboratory, Department of Computer Science and Engineering
Indian Institute of Technology Madras, Chennai – India
Email: {svg, chandra, yegna}@cs.iitm.ernet.in

3 Speech Signal-to-Symbol Transformation
Phonetic engine: capable of speech signal-to-symbol transformation independent of vocabulary and language
[Figure: example symbol sequences, as extracted: isbuleTinki mu khyasa mAchAr mu nnAL mu dalameiccarselvijeylali ta IrOjuvAr ta lolu mukhya m sa lu]

4 Approaches to Speech Signal-to-Symbol Transformation
Based on segmentation and labeling
  –Segmentation of the continuous speech signal into regions of subword units
  –Assignment of labels to the segmented regions using a subword unit classifier
Based on spotting subword units in continuous speech
  –Detection of anchor points in continuous speech
  –Assignment of labels to the segments around the anchor points using a subword unit classifier

5 Spotting CV Units in Continuous Speech
CV type units have the highest frequency of occurrence in speech in Indian languages
Subword units of CCV, CCCV and CVC types also contain CV segments
Vowel onset point (VOP) can be used as an anchor point for recognition of CV units
Detection of VOPs using distributions of feature vectors of C and V regions
Models for classification of CV segments
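A minimal sketch of the anchor-point idea, assuming the VOPs are already available as sample indices: a fixed window around each VOP is cut out so that it can be handed to a CV classifier. The function name `segments_around_vops` and the window lengths `before` and `after` are illustrative assumptions, not values from the slides.

```python
from typing import List, Sequence


def segments_around_vops(signal: Sequence[float], vops: Sequence[int],
                         before: int = 5, after: int = 10) -> List[List[float]]:
    """Cut one segment per VOP anchor.

    Each segment covers `before` samples to the left of the anchor and
    `after` samples from the anchor onwards, clipped at the signal edges.
    The window lengths are hypothetical defaults chosen for illustration.
    """
    segments = []
    for vop in vops:
        start = max(0, vop - before)
        end = min(len(signal), vop + after)
        segments.append(list(signal[start:end]))
    return segments
```

In a real system the same windowing would more likely be applied to a sequence of feature vectors than to raw samples; the indexing logic stays the same.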

6 Significant Events in a CV Unit

7 VOP Detection using AANN Models
Autoassociative neural network (AANN) models for capturing the distribution of data
One AANN for the consonant region of a CV unit
Another AANN for the vowel region of a CV unit

8 System for Detection of VOPs using AANNs

9 Illustration of Detection of VOPs
(a) Waveform, (b) hypothesised region labels for each frame, (c) hypothesised VOPs, and (d) manually marked (actual) VOPs for the Tamil language sentence /kArgil pahudiyilirundu UDuruvalkArarhaL/
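One plausible way to get from two AANN models to frame labels and VOP hypotheses is sketched below. It assumes that each frame already has a reconstruction error from the consonant AANN and from the vowel AANN, labels the frame with the model that reconstructs it better, and hypothesises a VOP at each consonant-to-vowel transition. The decision rule, the `min_run` smoothing parameter, and both function names are assumptions for illustration; the slides do not specify the actual logic.

```python
from typing import List, Sequence


def label_frames(err_c: Sequence[float], err_v: Sequence[float]) -> List[str]:
    """Label each frame 'C' or 'V' by the smaller reconstruction error.

    err_c / err_v are assumed per-frame errors of the consonant-region
    and vowel-region AANN models, respectively.
    """
    return ["C" if ec < ev else "V" for ec, ev in zip(err_c, err_v)]


def hypothesise_vops(labels: Sequence[str], min_run: int = 2) -> List[int]:
    """Return frame indices where a 'C' frame is followed by at least
    `min_run` consecutive 'V' frames (hypothetical smoothing rule)."""
    vops = []
    for t in range(1, len(labels)):
        is_transition = labels[t - 1] == "C" and labels[t] == "V"
        has_room = t + min_run <= len(labels)
        if is_transition and has_room and all(
            lab == "V" for lab in labels[t:t + min_run]
        ):
            vops.append(t)
    return vops
```

Requiring a short run of vowel frames after the transition is one simple guard against isolated mislabelled frames producing spurious hypotheses.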

10 Broadcast News Corpus of Indian Languages
Description (number of)    Tamil     Telugu    Hindi     Multilingual
Bulletins                  33        20        19        72
Training bulletins         27        16        16        59
Testing bulletins          6         4         3         13
CV classes considered      123       138       103       196
Training CV segments       43,541    41,725    20,236    1,05,502
Sentences for testing      1,416     1,348     630       3,094

11 Performance for Detection of VOPs
Matching hypothesis: a hypothesis with a deviation of up to 25 ms from an actual VOP
Missing hypothesis: there is no hypothesis with a deviation of up to 25 ms from an actual VOP
Spurious hypothesis:
  –Multiple hypotheses with a deviation of up to 25 ms
  –A hypothesis with a deviation greater than 25 ms
VOP hypotheses (in %)
Matching    Missing    Spurious
68.62       31.38      6.21
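The three categories can be counted with a short routine such as the one below. It is a sketch under stated assumptions: VOP locations are given in milliseconds, each actual VOP is greedily paired with its closest unused hypothesis within the tolerance, and every unpaired hypothesis counts as spurious. The function name `score_vops` and the greedy pairing are illustrative choices, not the authors' scoring script.

```python
from typing import Sequence, Tuple


def score_vops(actual: Sequence[float], hypothesised: Sequence[float],
               tol: float = 25.0) -> Tuple[int, int, int]:
    """Return (matching, missing, spurious) counts.

    matching: actual VOPs with a hypothesis within `tol` ms
    missing:  actual VOPs with no hypothesis within `tol` ms
    spurious: hypotheses left unpaired (extra ones near an actual VOP,
              or ones farther than `tol` ms from every actual VOP)
    """
    used = set()
    matching = 0
    for a in actual:
        candidates = [i for i, h in enumerate(hypothesised)
                      if i not in used and abs(h - a) <= tol]
        if candidates:
            # Pair the actual VOP with its closest unused hypothesis.
            best = min(candidates, key=lambda i: abs(hypothesised[i] - a))
            used.add(best)
            matching += 1
    missing = len(actual) - matching
    spurious = len(hypothesised) - matching
    return matching, missing, spurious
```

The matching and missing percentages on the slide sum to 100, which suggests that both are expressed relative to the number of actual VOPs; the denominator used for the spurious percentage is not stated.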

12 Classification of CV Segments using SVMs

13 System for Spotting CV Units
The system gives a 5-best performance of about 74.63% for spotting CV units in 300 test sentences containing 3,924 syllable-like units

14 Illustration of Spotting CV Units
VOP locations (sample numbers)    Lattice of 5-best          Actual
Actual     Hypothesised           hypothesised CVs (1-5)     syllable
280        320                    pA kA vA ha shu            kAr
-----      720                    kA pA hA nA pa             -----
2360       2440                   gi yE hi ya yai            gil
3800       3760                   hA pA pa sA sa             pa
4920       4800                   hu gu mu vu pu             hu
5480       5560                   bI vi Ti Ni dI             di
6320       6200                   yi lA li zi tI             yi
7400       7480                   li ni ru ja lai            li
8200       ------                 VOP missed                 run
9440       9480                   du Ru ja dE rA             du
11160      11120                  mumu kU vapO vA            U
12080                             Du da dA nA tu             Du
12520      ------                 VOP missed                 ru
13200      13240                  va da kai hi vA            val
14520      14560                  kA ka ga cha zA            kA
15840      ------                 VOP missed                 rar
16960                             ha kA ka ga sa             haL
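An N-best figure such as the 5-best performance quoted for the system can be computed from a lattice like this one along the following lines. The sketch assumes that a unit counts as correctly spotted when its reference CV label appears among the first N hypotheses, and that a missed VOP (represented here as `None`) counts as an error. How the slides map syllables such as CVC units to a reference CV label is not stated, so the example uses only rows whose actual syllable is given directly as a two-letter label. The function name `n_best_accuracy` is hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Row = Tuple[Optional[Sequence[str]], str]  # (N-best hypotheses or None, reference label)


def n_best_accuracy(rows: List[Row], n: int = 5) -> float:
    """Percentage of rows whose reference label is among the top-n hypotheses.

    Rows with hypotheses set to None (missed VOP) are counted as errors.
    """
    if not rows:
        raise ValueError("rows must not be empty")
    hits = sum(1 for hyps, ref in rows
               if hyps is not None and ref in list(hyps)[:n])
    return 100.0 * hits / len(rows)


# Four rows taken from the illustration (VOPs at 3800, 4920, 5480 and 12520).
rows = [
    (["hA", "pA", "pa", "sA", "sa"], "pa"),
    (["hu", "gu", "mu", "vu", "pu"], "hu"),
    (["bI", "vi", "Ti", "Ni", "dI"], "di"),
    (None, "ru"),  # VOP missed, so no hypotheses are available
]
print(n_best_accuracy(rows, n=5))
```

Lowering `n` to 1 gives the conventional top-choice accuracy for the same rows.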

15 Summary and Conclusions
Spotting multilingual CV units in continuous speech
AANN models for detecting VOPs
SVM classifier for recognition of CV units around the VOPs
Need to reduce the number of missing VOPs
Further processing of the hypothesised CV lattice

16 Thank You

