Outline: Objectives, Study of Thai tones, Construction of contextual factors, Design of decision-tree structures, Design of context clustering


3 Outline
- Objectives
- Study of Thai tones
  - Characteristics of Thai tones
  - Categorizations of Thai tones
- Tree-based context clustering
  - Construction of contextual factors
  - Design of decision-tree structures
  - Design of context clustering styles
- Experiments
  - Evaluation of overall tone correctness
  - Evaluation of tone correctness for each tone type
  - Evaluation of syllable duration distortion
- Conclusions

4 Objectives
- To implement an HMM-based speech synthesis system for the Thai language with the highest possible tone correctness.

5 Study of Thai tones: Characteristics of Thai tones
- Syllable structure [Nakasakul2002]
- Thai is a tonal language. Example syllables (Thai script, phonetic transcription ending in the tone number, gloss):
  - รัก r-a-k^-3 (love)
  - เรื่อย r-va-j^-2 (always)
  - เคร่ง khr-e-ng^-2 (strict)
  - เครียด khr-ia-t^-2 (stress)
  - และ l-x-3 (and)
  - เพลีย phl-iia-0 (exhausted)
  - เสีย s-iia-4 (spoil)
  - ปริ pr-i-1 (break)
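The transcriptions above appear to follow the pattern initial-vowel-final^-tone, with the final consonant (marked by `^`) omitted for open syllables. The following is a minimal sketch of a parser under that assumption; the field names and the `Syllable` type are illustrative and do not come from the slides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Syllable:
    initial: str          # initial consonant or cluster, e.g. "khr"
    vowel: str            # vowel symbol, e.g. "iia"
    final: Optional[str]  # final consonant without the "^" marker, or None
    tone: int             # tone number 0-4


def parse_syllable(text: str) -> Syllable:
    """Parse a transcription such as 'r-a-k^-3' or 'l-x-3'.

    Assumed layout (inferred from the slide's examples only):
    initial-vowel[-final^]-tone
    """
    parts = text.split("-")
    if len(parts) not in (3, 4):
        raise ValueError(f"unexpected number of fields in {text!r}")
    tone = int(parts[-1])
    if not 0 <= tone <= 4:
        raise ValueError(f"tone must be 0-4, got {tone}")
    final = None
    if len(parts) == 4:
        if not parts[2].endswith("^"):
            raise ValueError(f"final consonant must end with '^' in {text!r}")
        final = parts[2][:-1]
    return Syllable(initial=parts[0], vowel=parts[1], final=final, tone=tone)


if __name__ == "__main__":
    for example in ["r-a-k^-3", "khr-e-ng^-2", "phl-iia-0"]:
        print(example, "->", parse_syllable(example))
```

A closed syllable such as `r-a-k^-3` yields a final consonant `k`, while an open syllable such as `phl-iia-0` yields `final=None`.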

6 Study of Thai tones: Characteristics of Thai tones
- F0 contours of the standard Thai tones (normalized duration) [Luksaneeyanawin1992]
  - สามัญ: Middle (0)
  - เอก: Low (1)
  - โท: Falling (2)
  - ตรี: High (3)
  - จัตวา: Rising (4)

7 Study of Thai tones: Categorizations of Thai tones
- Abramson divided the tones into two groups:
  - static group
  - dynamic group
- According to the final trend of the contours:
  - upward trend group
  - downward trend group
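The slide does not list which tones fall into each group. As a hedged sketch, the static/dynamic split below follows the grouping usually attributed to Abramson (middle, low, and high are static; falling and rising are dynamic). The upward/downward assignment is an assumption based on the general shape of the contour endings and should be checked against the original figure.

```python
# Tone numbers as used on the slides: 0 middle, 1 low, 2 falling, 3 high, 4 rising.
TONE_NAMES = {0: "middle", 1: "low", 2: "falling", 3: "high", 4: "rising"}

# Static vs. dynamic grouping (commonly attributed to Abramson).
STATIC_TONES = {0, 1, 3}
DYNAMIC_TONES = {2, 4}

# ASSUMPTION: grouping by the final trend of the F0 contour.
# The slide names the groups but not their members.
UPWARD_TONES = {3, 4}
DOWNWARD_TONES = {0, 1, 2}


def constancy_group(tone: int) -> str:
    """Return 'static' or 'dynamic' for a tone number 0-4."""
    if tone in STATIC_TONES:
        return "static"
    if tone in DYNAMIC_TONES:
        return "dynamic"
    raise ValueError(f"unknown tone {tone}")


def trend_group(tone: int) -> str:
    """Return 'upward' or 'downward' for a tone number 0-4 (assumed membership)."""
    if tone in UPWARD_TONES:
        return "upward"
    if tone in DOWNWARD_TONES:
        return "downward"
    raise ValueError(f"unknown tone {tone}")


if __name__ == "__main__":
    for tone, name in TONE_NAMES.items():
        print(tone, name, constancy_group(tone), trend_group(tone))
```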

8 HMM-based speech synthesizer
- Phoneme-based speech unit modeling
- Provides flexible models and efficient adaptation:
  - Speaker adaptation
  - Speaking style conversion
- In 1994, K. Tokuda et al. proposed an HMM-based speech synthesizer for Japanese.

9 Tree-based context clustering: Construction of contextual factors
Context clustering addresses the problem of limited training data. The contextual factors are:
- Phoneme level
  - {preceding, current, succeeding} phonetic type
  - {preceding, current, succeeding} part of syllable structure
- Syllable level
  - {preceding, current, succeeding} tone type
  - the number of phones in the {preceding, current, succeeding} syllable
  - current phone position in the current syllable
- Word level
  - current syllable position in the current word
  - part of speech
  - the number of syllables in the {preceding, current, succeeding} word
- Phrase level
  - current word position in the current phrase
  - the number of syllables in the {preceding, current, succeeding} phrase
- Utterance level
  - current phrase position in the current sentence
  - the number of syllables in the current sentence
  - the number of words in the current sentence
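To make the contextual factors concrete, here is a minimal sketch that stores a few of them for one phone and serializes them into a single context label. The field selection, the `PhoneContext` name, and the label layout are hypothetical; the slides do not show the actual full-context label format used by the system.

```python
from dataclasses import dataclass
from typing import Optional


def _fmt(value: object) -> str:
    """Render a missing neighbour (utterance boundary) as 'x'."""
    return "x" if value is None else str(value)


@dataclass(frozen=True)
class PhoneContext:
    # Phoneme level
    prev_phone: Optional[str]
    cur_phone: str
    next_phone: Optional[str]
    # Syllable level
    prev_tone: Optional[int]
    cur_tone: int
    next_tone: Optional[int]
    phone_pos_in_syllable: int
    phones_in_syllable: int
    # Word level
    syllable_pos_in_word: int
    syllables_in_word: int

    def to_label(self) -> str:
        """Serialize to a hypothetical full-context label string."""
        return (
            f"{_fmt(self.prev_phone)}-{self.cur_phone}+{_fmt(self.next_phone)}"
            f"/T:{_fmt(self.prev_tone)}_{self.cur_tone}_{_fmt(self.next_tone)}"
            f"/S:{self.phone_pos_in_syllable}_{self.phones_in_syllable}"
            f"/W:{self.syllable_pos_in_word}_{self.syllables_in_word}"
        )


if __name__ == "__main__":
    # Vowel "a" of the syllable r-a-k^-3, first syllable of an utterance,
    # followed by a syllable with tone 2 (illustrative values).
    ctx = PhoneContext("r", "a", "k", None, 3, 2, 2, 3, 1, 1)
    print(ctx.to_label())
```

Phrase-level and utterance-level factors would extend the same record in the same way; they are left out here to keep the sketch short.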

10 Tree-based context clustering: Design of decision-tree structures
- Problem of misshapen F0 contours
- Figure: F0 contours of (a) synthesized speech from the clustering style of a single binary tree without tone type questions and (b) natural speech.

11 Tree-based context clustering: Design of decision-tree structures

12 Tree-based context clustering: Design of 8 context clustering styles, (a)-(h)
- Styles (e), (f), (g), and (h) each add tone type questions to a tree structure.
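The effect of adding tone type questions can be sketched as a question set that optionally includes yes/no questions about the preceding, current, and succeeding tone. This is an illustrative sketch only: the question names, the dictionary-based context, and the two base questions are hypothetical, and a real system would define its questions over the full-context label format.

```python
from typing import Callable, Dict

Context = Dict[str, object]
Question = Callable[[Context], bool]


def base_questions() -> Dict[str, Question]:
    """A tiny stand-in for the non-tone questions (hypothetical examples)."""
    return {
        "cur-phone==a": lambda c: c["cur_phone"] == "a",
        "syl-pos-in-word==1": lambda c: c["syllable_pos_in_word"] == 1,
    }


def tone_questions() -> Dict[str, Question]:
    """One question per (slot, tone) pair: 3 slots x 5 tones = 15 questions."""
    questions: Dict[str, Question] = {}
    for slot in ("prev", "cur", "next"):
        for tone in range(5):
            # Bind slot and tone as defaults so each lambda keeps its own values.
            questions[f"{slot}-tone=={tone}"] = (
                lambda c, s=slot, t=tone: c[f"{s}_tone"] == t
            )
    return questions


def build_question_set(with_tone_questions: bool) -> Dict[str, Question]:
    """Return the question set for a clustering style with or without tone questions."""
    questions = base_questions()
    if with_tone_questions:
        questions.update(tone_questions())
    return questions


if __name__ == "__main__":
    ctx = {"cur_phone": "a", "syllable_pos_in_word": 1,
           "prev_tone": 0, "cur_tone": 3, "next_tone": 2}
    for name, ask in build_question_set(with_tone_questions=True).items():
        if ask(ctx):
            print("yes:", name)
```

Without tone questions, a tree can only separate tones indirectly through other contextual factors, which is consistent with the misshapen F0 contours reported for the single binary tree.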

13 System preparations
1. Sentence structure analysis
2. Word structure analysis
3. Full context labeling
4. Construction of question set for context clustering
5. Feature extraction
Figure: system flow. Inputs: VAJA speech corpus and ORCHID text corpus (wav, label, and XML files). Full context labeling produces full context label files (.lab); feature extraction (mcep, f0) produces parameter files (.cmp). Both feed HMM training and synthesis, which outputs synthetic speech.

14 Experiments: Evaluation of overall tone correctness
Figure 5: F0 contours of synthesized speech from 8 different clustering styles, and F0 contour of natural speech.

15 Experiments: Evaluation of overall tone correctness
Figure 6: Tone error percentages of synthesized speech from 4 different clustering styles.

16 Experiments: Evaluation of overall tone correctness
Figure 7: Tone error percentages of synthesized speech from 8 different clustering styles.

17 Experiments: Evaluation of tone correctness for each tone type
Figure 8: Tone error percentages of synthesized speech from 8 different clustering styles, categorized by tone type.
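The slides do not state how a tone error was judged. As a sketch, assume each evaluated syllable comes with its intended tone and the tone that was actually perceived or measured; the overall and per-tone error percentages then follow directly. The function name and input format are illustrative.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def tone_error_percentages(
    pairs: Iterable[Tuple[int, int]],
) -> Tuple[float, Dict[int, float]]:
    """Compute overall and per-tone error percentages.

    pairs: (intended_tone, perceived_tone) for each evaluated syllable.
    Returns (overall_percent, {intended_tone: percent}).
    """
    totals: Counter = Counter()
    errors: Counter = Counter()
    for intended, perceived in pairs:
        totals[intended] += 1
        if perceived != intended:
            errors[intended] += 1
    per_tone = {
        tone: 100.0 * errors[tone] / totals[tone] for tone in sorted(totals)
    }
    count = sum(totals.values())
    overall = 100.0 * sum(errors.values()) / count if count else 0.0
    return overall, per_tone


if __name__ == "__main__":
    # Illustrative data only, not results from the presentation.
    judged = [(0, 0), (0, 1), (2, 2), (4, 4)]
    overall, per_tone = tone_error_percentages(judged)
    print(f"overall: {overall:.1f}%")
    for tone, pct in per_tone.items():
        print(f"tone {tone}: {pct:.1f}%")
```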

18 Experiments: Evaluation of syllable duration distortion
Figure 9: Scores of a paired-comparison test for natural duration among 4 different clustering styles.

19 Examples of synthesized speech
Female voice, by method and corpus size (number of training utterances); three examples each:
- HMM: 100, 500, 2500
- VAJA (Unit Selection)
- Analysis-synthesis speech
Female voice, HMM, by tree structure (without / with added tone question set):
- (a) / (e)
- (b) / (f)
- (c) / (g)
- (d) / (h)

20 Conclusions
- This paper analyzed tree-based context clustering for an HMM-based Thai speech synthesis system.
- Four decision-tree structures were designed according to tone groups and tone types to improve the tone correctness of the synthesized speech.
- The results show that the tone-separated tree structures significantly reduce the tone error percentage of the synthesized speech compared with the single binary tree structure.
- Using contextual tone information at the syllable level improves tone correctness for all decision-tree structures.
- Some syllable duration distortion appears when the simple tone-separated tree context clustering is used with a small amount of training data; it is relieved by the constancy-based-tone-separated or the trend-based-tone-separated tree context clustering.
- Future work is expected to cover tone correctness in the average-voice-based speech model and intonation analysis.

