Published by Donald Manning. Modified over 8 years ago.
2
SSML Extension for Expressive Mandarin TTS
Shuang Li, Hongwu Yang, Lianhong Cai
Tsinghua University
3
Outline
–Motivation
–Expression of Speech
–Proposed SSML extension
–Conclusion
4
Motivation (1/3)
–Sentences with the same text can be expressed with different styles, emotions and moods
–Current TTS systems lack this variability
5
Motivation (2/3)
Current SSML cannot define speaking style, emotion and mood
–Good news: 生日快乐 “Happy birthday”, expressed with happiness (emotion)
–Bad news: 张总去世了 “Director Zhang passed away”, expressed with sadness (emotion)
–Information provider: 飞往纽约的飞机将要起飞 “The flight to New York is about to take off”, expressed in a mild mood
–Dialog: 是中国队赢了吗? “Did the Chinese team win?”, emphasizing “Chinese”, with an interrogative mood
Current SSML can hardly express the differences between the utterances above
6
Motivation (3/3)
Expressive speech combines three components
–Emotion: positive, neutral, negative
–Style: news, sports comment, dialog, information providing, ...
–Characteristic
Emotion, style and characteristic are relatively independent but cannot be separated
–Characteristic and style: relatively stable, global features
–Emotion: short-time, local feature
Expressing pattern: No tag; Physiological/social characteristics; Voice tag; Physiological reactions; No tag; With different speaking styles; Representing speaker’s attitude, purpose and emotion; More harmonious with the circumstance
8
Expression of Speech
–Style: speaking style (dialog, news, information providing, ...)
–Mood: mood (request, acquisition, affirmation, apology, ...)
–Emotion: emotional activity (neutral, negative, positive)
9
Hierarchical framework of prosody
Break levels
–B0: no break
–B1: Syllable
–B2: Prosodic word
–B3: Prosodic phrase
–B4: Breath group
–B5: Prosodic group
Chiu-yu Tseng et al. Fluent speech prosody: Framework and modeling. Speech Communication, 46 (2005) 284-399
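The break levels above can be read as thresholds: cutting a syllable sequence at every break of level N or higher yields the units of level N. The following is a minimal Python sketch of that reading, not code from the presentation. The names `BreakLevel` and `split_at` are hypothetical, and the break labels attached to 生日快乐 are an illustrative assumption rather than annotated data.

```python
from enum import IntEnum


class BreakLevel(IntEnum):
    """Break levels B0-B5 as listed on the slide."""
    B0 = 0  # no break
    B1 = 1  # syllable
    B2 = 2  # prosodic word
    B3 = 3  # prosodic phrase
    B4 = 4  # breath group
    B5 = 5  # prosodic group


def split_at(units, level):
    """Group (syllable, break_after) pairs into chunks that end at breaks >= level."""
    chunks, current = [], []
    for text, brk in units:
        current.append(text)
        if brk >= level:
            chunks.append("".join(current))
            current = []
    if current:  # trailing syllables without a closing break
        chunks.append("".join(current))
    return chunks


# Assumed (not annotated) break labels for 生日快乐 "Happy birthday".
B = BreakLevel
sample = [("生", B.B1), ("日", B.B2), ("快", B.B1), ("乐", B.B4)]

print(split_at(sample, B.B2))  # prosodic words
print(split_at(sample, B.B4))  # breath groups
```

Because the levels are ordered, a single annotation pass is enough to recover every layer of the hierarchy by changing only the threshold.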
10
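我永远忘不了 一张对日抗战时的新 闻照片, 轰炸后的废墟焦土上, 一个衣不蔽体、 满身 尘土灰烟的幼儿 坐在地上 无助的大哭着。 那是 一再令我热泪盈眶的镜头。 新闻 摄影中的战争传真 已不能只称是 照片了。
(English: “I will never forget a news photograph from the War of Resistance against Japan: on the scorched ruins left by a bombing, a ragged toddler covered in dust and soot sat on the ground, crying helplessly. It is an image that has brought me to tears again and again. War reportage in news photography can no longer be called mere photographs.”)
From Chiu-yu Tseng, report at Beijing University, Oct 11, 2005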
12
Proposed tag (1/2)
Utterance: prosodic group, expressing a complete meaning
–Attributes:
style: speaking style. Values: News, Reading, Information provider, dialog, etc.
emotion: speaking emotion. Values: Happy, Sad, Angry, Calm, Despair, etc.; +1 for positive, 0 for neutral, -1 for negative
mood: speaking mood. Values: given, request, acquisition, affirmation, apology, etc.
13
Proposed tag (2/2)
BG: breath group
–Attributes:
intonation. Values: indicative, interrogative, imperative
PPh: prosodic phrase
PW: prosodic word
Syl: syllable
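As a rough illustration of how the five proposed elements might nest, the sketch below builds a small tree with Python's standard `xml.etree.ElementTree`. Several points are assumptions, not statements from the slides: the lowercase element names (taken from the conclusion slide), the rule that a child must sit at a strictly lower level than its parent, the helper name `add_unit`, and the attribute values chosen for the flight announcement.

```python
import xml.etree.ElementTree as ET

# Proposed elements, outermost first (assumed nesting order).
HIERARCHY = ["utterance", "bg", "pph", "pw", "syl"]

# Intonation values listed on the slide for the breath group.
INTONATIONS = {"indicative", "interrogative", "imperative"}


def add_unit(parent, tag, text=None, **attrs):
    """Append <tag> under parent; reject nesting that climbs the hierarchy."""
    if HIERARCHY.index(tag) <= HIERARCHY.index(parent.tag):
        raise ValueError(f"<{tag}> cannot be nested inside <{parent.tag}>")
    if "intonation" in attrs:
        if tag != "bg" or attrs["intonation"] not in INTONATIONS:
            raise ValueError("intonation belongs on <bg> with a listed value")
    child = ET.SubElement(parent, tag, attrs)
    child.text = text
    return child


# Hypothetical markup for part of the flight announcement.
utt = ET.Element("utterance", {"style": "Information provider", "emotion": "0"})
bg = add_unit(utt, "bg", intonation="indicative")
for phrase in ["1121 次航班", "延误", "1 小时"]:
    add_unit(bg, "pph", phrase)

print(ET.tostring(utt, encoding="unicode"))
```

Since the slides close their value lists for style, emotion and mood with "etc.", those attributes are left unchecked here; only intonation, which has a closed list, is validated.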
14
Some examples (1/3)
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3.org/2001/10/synthesis http://www.w3.org/TR/speech-synthesis/synthesis.xsd" xml:lang="zh-CN">
1121 次航班 (Flight 1121)
延误 (has been delayed)
1 小时 (for an hour)
请旅客们到 (Please go to)
G6 候机厅等候 (the waiting room)
15
Some examples (2/3)
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3.org/2001/10/synthesis http://www.w3.org/TR/speech-synthesis/synthesis.xsd" xml:lang="zh-CN">
张威 (Zhang Wei)
担心肖荫开车发晕 (is afraid of Xiao Yin being dizzy when driving)
16
Some examples (3/3)
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3.org/2001/10/synthesis http://www.w3.org/TR/speech-synthesis/synthesis.xsd" xml:lang="zh-CN">
难道不是你的错吗? (Isn’t it your fault?)
以后你小心一点 (Be careful next time)
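The extension tags inside these examples did not survive in this transcript, so the original markup cannot be quoted. The Python sketch below shows one hypothetical way example 3/3 could be tagged and read back. The nesting, the `style="dialog"` value, the pairing of each sentence with an intonation, and the helper name `breath_groups` are all assumptions. The `<speak>` wrapper and its namespace are omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: one utterance with two breath groups.
MARKUP = """
<utterance style="dialog">
  <bg intonation="interrogative">难道不是你的错吗?</bg>
  <bg intonation="imperative">以后你小心一点</bg>
</utterance>
"""


def breath_groups(xml_text):
    """Return (intonation, text) for every <bg> element, in document order."""
    root = ET.fromstring(xml_text.strip())
    return [(bg.get("intonation"), (bg.text or "").strip())
            for bg in root.iter("bg")]


for intonation, text in breath_groups(MARKUP):
    print(intonation, text)
```

A synthesizer front end could use such pairs to pick a different pitch contour for each breath group while the utterance-level style stays fixed.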
18
Conclusion & questions
5 elements for the hierarchical prosodic structure
–utterance, bg, pph, pw, syl
3 expressive attributes for utterance
–style
–emotion
–mood
1 intonation attribute for bg
–intonation