
SSML Extension for Expressive Mandarin TTS Shuang Li Hongwu Yang Lianhong Cai Tsinghua University.


2 SSML Extension for Expressive Mandarin TTS Shuang Li Hongwu Yang Lianhong Cai Tsinghua University

3 Outline
– Motivation
– Expression of Speech
– Proposed SSML extension
– Conclusion

4 Motivation (1/3)
Sentences with the same text can be expressed with different styles, emotions and moods.
Current TTS systems lack this variability.

5 Motivation (2/3)
Current SSML cannot define speaking style, emotion and mood:
– Good news: 生日快乐 “Happy birthday”, expressed in happiness (emotion)
– Bad news: 张总去世了 “Director Zhang passed away”, expressed in sadness (emotion)
– Information provider: 飞往纽约的飞机将要起飞 “The flight to New York is about to take off”, expressed in a mild mood
– Dialog: 是中国队赢了吗? “Did the Chinese team win?”, emphasizing “Chinese”, with an interrogative mood
Current SSML can hardly express the differences between the expressions above.

6 Motivation (3/3)
Expressive speech combines three components:
– Emotion: positive, neutral, negative
– Style: news, sports commentary, dialog, information providing, ……
– Characteristic
Diagram labels: expressing pattern (no tag); physiological/social characteristics (voice tag); physiological reactions (no tag)
Emotion, style and characteristic are relatively independent but cannot be separated:
– Characteristic and style: relatively stable, global features
– Emotion: short-time, local feature
Expressive speech:
– Uses different speaking styles
– Represents the speaker’s attitude, purpose and emotion
– Is more harmonious with the circumstances

7 Outline
– Motivation
– Expression of Speech
– Proposed SSML extension
– Conclusion

8 Expression of Speech
– Style: speaking style (dialog, news, information providing, …)
– Mood: mood (request, acquisition, affirmation, apology, …)
– Emotion: emotional activity (neutral, negative, positive)

9 Hierarchical framework of prosody
Break levels:
– B0: no break
– B1: syllable
– B2: prosodic word
– B3: prosodic phrase
– B4: breath group
– B5: prosodic group
Chiu-yu Tseng et al. Fluent speech prosody: Framework and modeling. Speech Communication 46 (2005) 284–309.
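The break levels form an ordered scale, so they can be modeled as an ordered enumeration. The following is a minimal sketch, not part of the original slides; the names `BreakLevel` and `strongest_break` are hypothetical and chosen only for illustration.

```python
from enum import IntEnum


class BreakLevel(IntEnum):
    """Break levels B0-B5 as listed on the slide."""
    B0 = 0  # no break
    B1 = 1  # syllable
    B2 = 2  # prosodic word
    B3 = 3  # prosodic phrase
    B4 = 4  # breath group
    B5 = 5  # prosodic group


def strongest_break(levels):
    """Return the highest break level in a sequence (B0 if it is empty)."""
    return max(levels, default=BreakLevel.B0)


# Hypothetical boundary labels for a short stretch of speech.
boundaries = [BreakLevel.B1, BreakLevel.B2, BreakLevel.B1, BreakLevel.B3]
print(strongest_break(boundaries).name)
```

Using `IntEnum` keeps the levels comparable (`BreakLevel.B4 > BreakLevel.B2`), which is convenient when deciding which boundary dominates a span.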

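10 我永远忘不了 一张对日抗战时的新闻照片, 轰炸后的废墟焦土上, 一个衣不蔽体、 满身 尘土灰烟的幼儿 坐在地上 无助的大哭着。 那是 一再令我热泪盈眶的镜头。 新闻 摄影中的战争传真 已不能只称是 照片了。
(“I will never forget a news photograph from the War of Resistance against Japan: on the scorched ruins left by a bombing, a ragged toddler covered in dust and soot sat on the ground, crying helplessly. It is a shot that has moved me to tears again and again. War reportage in news photography can no longer be called mere photographs.”)
From Chiu-yu Tseng, report at Beijing University, Oct 11, 2005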

11 Outline
– Motivation
– Expression of Speech
– Proposed SSML extension
– Conclusion

12 Proposed tag (1/2)
Utterance: prosodic group, expressing a complete meaning
– Attributes:
  Style: speaking style. Values: News, Reading, Information provider, dialog, etc.
  Emotion: speaking emotion. Values: Happy, Sad, Angry, Calm, Despair, etc.; +1 for positive, 0 for neutral, -1 for negative
  Mood: speaking mood. Values: given, request, acquisition, affirmation, apology, etc.
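A processor consuming these attributes would likely want to check them against the known values. The sketch below is illustrative only: the slide ends each value list with "etc", so the sets are not exhaustive, and the mapping from emotion labels to the +1/0/-1 scale is an assumption, since the slide gives the scale but not the assignment. The function name `check_utterance_attrs` is hypothetical.

```python
# Values listed on the slide; each list there ends with "etc", so these
# sets are illustrative rather than exhaustive.
STYLES = {"news", "reading", "information provider", "dialog"}
MOODS = {"given", "request", "acquisition", "affirmation", "apology"}

# ASSUMPTION: the slide only gives the +1 / 0 / -1 scale; which label gets
# which polarity is guessed here for illustration.
EMOTION_POLARITY = {"happy": 1, "calm": 0, "sad": -1, "angry": -1, "despair": -1}


def check_utterance_attrs(style, emotion, mood):
    """Return a list of problems found in the three utterance attributes."""
    problems = []
    if style.lower() not in STYLES:
        problems.append(f"unknown style: {style}")
    if emotion.lower() not in EMOTION_POLARITY:
        problems.append(f"unknown emotion: {emotion}")
    if mood.lower() not in MOODS:
        problems.append(f"unknown mood: {mood}")
    return problems


print(check_utterance_attrs("News", "Happy", "affirmation"))
print(check_utterance_attrs("poetry", "Happy", "request"))
```

Comparison is case-insensitive because the slide mixes capitalized and lowercase values.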

13 Proposed tag (2/2)
BG: breath group
– Attributes:
  intonation. Values: indicative, interrogative, imperative
PPh: prosodic phrase
PW: prosodic word
Syl: syllable
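To make the proposed hierarchy concrete, here is a minimal sketch that builds the nested elements with Python's standard `xml.etree.ElementTree`. It is not taken from the slides: the lowercase element names follow the conclusion slide (utterance, bg, pph, pw, syl), while the helper `build_utterance`, the segmentation of the sentence, and the chosen attribute values are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def build_utterance(phrases, style, emotion, mood, intonation):
    """Build utterance > bg > pph > pw > syl from nested lists of syllables.

    `phrases` is a list of prosodic phrases; each phrase is a list of
    prosodic words; each word is a list of syllables (one character each).
    Only a single breath group is created in this sketch.
    """
    utt = ET.Element("utterance", style=style, emotion=emotion, mood=mood)
    bg = ET.SubElement(utt, "bg", intonation=intonation)
    for phrase in phrases:
        pph = ET.SubElement(bg, "pph")
        for word in phrase:
            pw = ET.SubElement(pph, "pw")
            for syllable in word:
                ET.SubElement(pw, "syl").text = syllable
    return utt


# ASSUMPTION: hypothetical segmentation of 是中国队赢了吗
# ("Did the Chinese team win?") and hypothetical attribute values.
phrases = [[["是"], ["中", "国", "队"]], [["赢", "了", "吗"]]]
utt = build_utterance(phrases, "dialog", "neutral", "acquisition", "interrogative")
print(ET.tostring(utt, encoding="unicode"))
```

In a real document this fragment would sit inside the SSML `<speak>` root; namespace handling is left out to keep the sketch short.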

14 Some examples (1/3)
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3.org/2001/10/synthesis http://www.w3.org/TR/speech-synthesis/synthesis.xsd" xml:lang="zh-CN">
1121次航班 (Flight 1121)
延误 (has been delayed)
1小时 (for an hour)
请旅客们到 (Please go to)
G6候机厅等候 (the waiting room)

15 Some examples (2/3)
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3.org/2001/10/synthesis http://www.w3.org/TR/speech-synthesis/synthesis.xsd" xml:lang="zh-CN">
张威 (Zhang Wei)
担心肖荫开车发晕 (is afraid of Xiao Yin being dizzy when driving)

16 Some examples (3/3)
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3.org/2001/10/synthesis http://www.w3.org/TR/speech-synthesis/synthesis.xsd" xml:lang="zh-CN">
难道不是你的错吗? (Isn’t it your fault?)
以后你小心一点 (Be careful next time)
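Markup that uses the five prosodic elements could be checked for correct nesting before synthesis. The sketch below assumes strict nesting, where each element contains only the next lower level (utterance, bg, pph, pw, syl); the slides do not say whether levels may be skipped, so this rule, the function name `nesting_errors`, and the sample fragments are all hypothetical. Namespaces are ignored for brevity.

```python
import xml.etree.ElementTree as ET

# ASSUMPTION: each element may contain only the next lower level.
ALLOWED_CHILD = {"utterance": "bg", "bg": "pph", "pph": "pw", "pw": "syl", "syl": None}


def nesting_errors(element):
    """Return messages for children that break the assumed nesting order."""
    errors = []
    expected = ALLOWED_CHILD.get(element.tag)
    for child in element:
        if child.tag != expected:
            errors.append(f"<{child.tag}> not allowed inside <{element.tag}>")
        else:
            errors.extend(nesting_errors(child))
    return errors


# Hypothetical fragments, not the markup from the original slides.
good = ET.fromstring(
    '<utterance><bg intonation="interrogative"><pph><pw>'
    "<syl>你</syl><syl>的</syl></pw></pph></bg></utterance>"
)
bad = ET.fromstring("<utterance><pw><syl>错</syl></pw></utterance>")
print(nesting_errors(good))
print(nesting_errors(bad))
```

The first fragment passes with no messages; the second is reported because a `pw` appears directly under `utterance`.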

17 Outline
– Motivation
– Expression of Speech
– Proposed SSML extension
– Conclusion

18 Conclusion & questions
5 elements for the hierarchical prosodic structure
– utterance, bg, pph, pw, syl
3 expressive attributes for utterance
– style
– emotion
– mood
1 intonation attribute for bg
– intonation

