“Effect of Genre, Speaker, and Word Class on the Realization of Given and New Information” Agustín Gravano & Julia Hirschberg


2 “Effect of Genre, Speaker, and Word Class on the Realization of Given and New Information”
Agustín Gravano & Julia Hirschberg
{agus, julia}@cs.columbia.edu
Interspeech 2006 - Pittsburgh, PA
Spoken Language Processing Group, Columbia University

3 Motivation
Speakers of American English tend to:
  - accent references to “new” information, and
  - deaccent references to “old” (or “given”) information.
  (Chafe 1974, Prince 1981 & 1992, inter alia)
Variation of prominence in “given” entities is strongly affected by the persistence of:
  - grammatical function (subject, object, etc.) and
  - position in the sentence.
  (Terken & Hirschberg, 1994)

4 Motivation
Possible applications:
  - Improve naturalness of TTS systems.
  - Aid ASR.
Questions:
  - What are other sources of variation?
  - What is the effect of speaker, genre, and word class?

5 Main Results
  - Speakers vary the manner in which they realize differences in information status.
  - Speakers tend to produce “given” verbs with higher intensity than “new” verbs, both in read and spontaneous speech.

6 Overview
  - Materials and Methods
      - Corpus
      - Information status
      - Word classes
      - Features
  - Results
      - Nouns
      - Verbs
  - Discussion
  - Conclusions

7 Boston Directions Corpus
(Hirschberg & Nakatani 1996)
  - Spontaneous and read monologues.
  - 9 increasingly complex direction-giving tasks, e.g. “Describe how to get to MIT from Harvard.”
  - Method:
      1. Spontaneous speech recorded and transcribed.
      2. Speakers returned and read.
  - 4 speakers (3 male, 1 female).

8 Boston Directions Corpus
Mean length of tasks:
  - Spontaneous: 111 s
  - Read: 84 s
Excerpt from the spontaneous part of the corpus:
  first # enter the Harvard Square T stop # and buy a token # then proceed to get on the # Inbound um Red Line # uh subway [...]
Corpus size:
  - Spontaneous: ~66 min
  - Read: ~50 min
Prosody labeled using the ToBI convention.

9 Information Status
Prince 1981:
  - Entities are new when first introduced in the discourse.
  - Evoked entities are given: they are already in the discourse.
Simple definition:
  A word w is given if, in the current task, there is at least one previous occurrence of a word with the same stem. Otherwise, w is new.
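
The simple definition above can be sketched in a few lines of Python. This is only an illustration: the slides do not say which stemmer was used, so `toy_stem` is a hypothetical, deliberately crude suffix stripper, and `label_information_status` is a name invented here.

```python
def toy_stem(word: str) -> str:
    """Crude suffix stripper; a stand-in (assumption) for a real stemmer."""
    w = word.lower()
    for suffix in ("ing", "ed", "es", "s"):
        # Do not strip the final "s" of words such as "cross".
        if suffix == "s" and w.endswith("ss"):
            continue
        # Keep at least three characters of the word.
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            return w[: -len(suffix)]
    return w


def label_information_status(words, stem=toy_stem):
    """Label each word of ONE task as 'given' or 'new'.

    A word is 'given' if an earlier word in the same task has the same stem.
    """
    seen_stems = set()
    labels = []
    for word in words:
        s = stem(word)
        labels.append("given" if s in seen_stems else "new")
        seen_stems.add(s)
    return labels


task = "you cross the street then once you crossed it".split()
for word, label in zip(task, label_information_status(task)):
    print(word, label)
```

The set of seen stems is reset for every task, because the definition only looks at previous occurrences within the current task.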

10 Word Classes
  - Automatically labeled the part of speech of all the words in the corpus using the Brill Tagger.
  - Categorized words into: Nouns, Verbs, Adjectives, Adverbs, Others.
  - Significant results only for Nouns and Verbs.

11 Features
Word acoustic features, extracted using Praat:
  - Max, mean, min pitch
  - Max, mean, min intensity
Pitch and intensity features were also normalized with respect to the mean value of:
  - ± 1 second around the target word,
  - ± 5 words around the target word,
  - the target word’s Intermediate Phrase.
Pause before and after the word.
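
As a rough illustration of the ± 5 words normalization, the sketch below divides a word-level feature by the mean of a context feature over the surrounding words, in the spirit of the “Max Pitch / Context Mean Pitch” rows of the results. The function name `context_ratio`, the choice to include the target word in the window, and the use of word-level means for the context are all assumptions; the slides do not specify these details.

```python
def context_ratio(feature, context_feature, index, window=5):
    """Ratio of feature[index] to the mean of context_feature over
    the words within +/- `window` positions of the target word.

    Assumption: the window includes the target word itself and is
    clipped at the start and end of the word sequence.
    """
    lo = max(0, index - window)
    hi = min(len(context_feature), index + window + 1)
    context = context_feature[lo:hi]
    return feature[index] / (sum(context) / len(context))


# Hypothetical per-word values in Hz, one entry per word.
max_pitch = [110.0, 180.0, 150.0]
mean_pitch = [100.0, 120.0, 140.0]

# Max pitch of the second word relative to the mean pitch of its
# one-word neighbourhood on each side.
print(context_ratio(max_pitch, mean_pitch, index=1, window=1))
```

A ratio above 1 means the target word's value exceeds the mean of its local context. The ± 1 second and Intermediate Phrase variants would differ only in how the context span is chosen.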

12 Results: Nouns
Columns: READ (S1, S2, S3, S4), SPON (S1, S2, S3, S4). Markers are listed per row in left-to-right order; empty cells are not shown.
  Max Pitch: g
  Mean Pitch: n n g
  Min Pitch: n g
  Max Pitch / Context Mean Pitch: n n n n
  Mean Pitch / Context Mean Pitch: n n n n
  Min Pitch / Context Mean Pitch: n n
  Max Intensity: n n g g
  Mean Intensity: n n g g
  Min Intensity: g g g
  Max Int / Context Mean Intensity: n n n n
  Mean Int / Context Mean Intensity: n n n g g
  Min Int / Context Mean Intensity: g
  Pause Before: n n n n n
  Pause After: g g g
n = mean value for the new words is significantly larger than for the given words
g = mean value for the given words is significantly larger than for the new words

13 Results: Verbs
Columns: READ (S1, S2, S3, S4), SPON (S1, S2, S3, S4). Markers are listed per row in left-to-right order; empty cells are not shown.
  Max Pitch: n
  Mean Pitch: g g n
  Min Pitch: g n
  Max Pitch / Context Mean Pitch: g n
  Mean Pitch / Context Mean Pitch: g
  Min Pitch / Context Mean Pitch: g n
  Max Intensity: g g g g g g g
  Mean Intensity: g g g g g g g
  Min Intensity: g g g
  Max Int / Context Mean Intensity: g g g g g
  Mean Int / Context Mean Intensity: g g g g g g g g
  Min Int / Context Mean Intensity: g g
  Pause Before: g g
  Pause After: g
n = mean value for the new words is significantly larger than for the given words
g = mean value for the given words is significantly larger than for the new words

14 Discussion: Variation of intensity in verbs
Examples:
  you get out of the T stop # you cross Massachusetts Avenue [...] you wanna cross Mass Ave opposite that # there's usually a bunch of cabs and and people standing around there # so # then once you've crossed it you're you're in Harvard Yard proper
  then you're right at the entrance to what is called the Infinite Corridor # and it's called the Infinite Corridor because it's this really long # place you can walk entirely indoors
Direct objects of ‘cross’ and ‘call’ are either deaccented or pronominalized in the second and third mentions.
With no other salient accented items in their phrases, the given mentions of these verbs are more prominent.

15 Discussion: Variation of intensity in verbs
Example:
  so you're going to have to transfer # you transfer by going to Government Center which is inbound
The increased intensity of the second mention of ‘transfer’ might be due to the change in its verb form.
Similar to Terken & Hirschberg, 1994: Given nouns tend to be accented if they represent a different grammatical function from the first mention.

16 Conclusions and Future Work
Evidence of:
  - Variation across speakers in how they realize differences in information status.
  - Given verbs tend to be produced with greater intensity than new verbs.
  - Nouns and verbs behave very differently.
Only preliminary results: more work needed.
Future Work:
  - Repeat and deepen these analyses on larger corpora of read and spontaneous speech, and in conversation.
