The influence of tempo and speaking style on timing patterns in Polish
Agnieszka Wagner, Katarzyna Klessa
Institute of Linguistics, Adam Mickiewicz University in Poznan, Poland
ExLing th International Conference on Experimental Linguistics, June, Athens, Greece
Background and motivation
The effect of speech tempo on timing patterns:
▫well documented for various languages, but not for Polish
▫significant for the debate on the rhythmic classification of languages (stress/syllable/mora-timing) and on the correlates of perceived speech rhythm (metrics vs. other models/concepts)
State of the art:
▫tendency towards greater syllable-timing/stress-timing with decreasing/increasing tempo (Russo & Barry 2008, Malisz 2013)
▫higher/lower durational variability (of consonantal and vocalic intervals) at slower/faster rates (Barry et al. 2003, Dellwo 2010)
▫average speech rate and tempo differentiation are language-specific (Dellwo & Wagner 2003): FR GER EN
Background and motivation
Timing patterns & speaking styles, state of the art:
▫higher durational variability (rhythm metrics) and a shift towards stress-timing in spontaneous speech (Arvaniti 2012)
▫higher nPVI(syl), regression intercept and slope (TGA, Time Group Analysis) in conversational speech than in read text, and in more informal speaking styles (Yu, Gibbon & Klessa 2014, Gibbon, Klessa & Bachan 2014)
Goals:
▫the effects of speech rate and style variation on timing properties of Polish
▫the relationship between intended and laboratory-measured tempo
Methodology
Speech corpus accessed via the "timing & duration" database:
▫5 speakers
▫3 speaking styles: poems (different metres), text, 3 x 5 sentences of different phonotactic structure (stress-timed, syllable-timed and uncontrolled)
▫5 intended tempi: very slow to very fast (vslow, slow, norm, fast, vfast)
▫for future application: 15 non-native speakers (different L1); spontaneous speech, mini-dialogues
Automatic segmentation and labeling at phone, syllable & word level with AnnotationPro (Klessa, Karpinski, Wagner 2013):
▫verification and manual correction by trained annotators
Methodology
TGA, Time Group Analysis (Gibbon 2013):
▫implemented as AnnotationPro+TGA (Klessa & Gibbon 2014)
▫automatic segmentation into interpausal time groups
▫extraction of linear regression intercept and slope
▫intercept: correlates with segmental duration
▫slope: tendency towards deceleration/acceleration, i.e. average increase/decrease in duration over the pause-defined segment
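
The intercept and slope described on this slide can be illustrated with an ordinary least-squares line fitted to the syllable durations of one interpausal time group. This is a minimal sketch, not the AnnotationPro+TGA implementation: the function name `tga_regression`, the use of syllable position (0, 1, 2, ...) as the predictor, and the example durations are all assumptions made for illustration.

```python
def tga_regression(durations):
    """Fit duration = intercept + slope * position by least squares.

    `durations` holds the syllable durations (in seconds) of one
    interpausal time group, in order of occurrence. Using the syllable
    index as x is an assumption of this sketch.
    """
    n = len(durations)
    if n < 2:
        raise ValueError("need at least two syllables to fit a line")
    mean_x = (n - 1) / 2
    mean_y = sum(durations) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(durations))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical time group whose syllables get steadily longer.
intercept, slope = tga_regression([0.10, 0.12, 0.14, 0.16])
```

In this example the slope is positive (durations increase over the group, i.e. deceleration), and the intercept equals the fitted duration of the first syllable.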
Methodology
nPVI-V & rPVI-C (Grabe & Low 2002):
▫automatic segmentation into C and V intervals based on phonetic alignment
▫extraction of interval durations and calculation of PVIs for interpausal time groups using a special plugin for AnnotationPro
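
As a reminder of what the two indices measure, here is a minimal sketch of the pairwise variability indices following the definitions in Grabe & Low (2002): rPVI is the mean absolute difference between successive interval durations, and nPVI normalizes each difference by the mean of the pair and scales by 100. The function names and the example durations are hypothetical, and this is not the AnnotationPro plugin code.

```python
def _successive_pairs(durations):
    pairs = list(zip(durations, durations[1:]))
    if not pairs:
        raise ValueError("need at least two intervals")
    return pairs


def rpvi(durations):
    """Raw PVI: mean absolute difference of successive durations."""
    pairs = _successive_pairs(durations)
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


def npvi(durations):
    """Normalized PVI: each difference divided by the pair mean, times 100."""
    pairs = _successive_pairs(durations)
    return 100 * sum(abs(a - b) / ((a + b) / 2) for a, b in pairs) / len(pairs)


# Hypothetical interval durations (ms) within one interpausal group.
vocalic = [80, 120, 80]
consonantal = [60, 100, 70]
npvi_v = npvi(vocalic)      # normalized index over V intervals
rpvi_c = rpvi(consonantal)  # raw index over C intervals
```

The normalization in nPVI is what makes it less sensitive to overall tempo than rPVI, which stays in the units of the input durations.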
Results
▫1265 inter-pausal groups
▫min. average of 38 groups, max. average of 64 per speaker
▫more groups produced at slower than at faster rates
Intended vs. realized tempo
Mean speaking rates (syl/sec) realized by the speakers agree with the assumptions of the recording scenarios:
▫the fastest in the vfast condition and the slowest in the vslow condition
▫after normalization by speakers and files: slightly higher rates in the sent subcorpus than in poetic verses (poe) and read text (text)
Figure: Mean values of speech rates (syl/sec) realized by the speakers.
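
The comparison of intended and realized tempo rests on a simple rate measure, syllables per second, averaged per intended tempo condition. A minimal sketch under stated assumptions: the helper `mean_rate_by_tempo`, the input layout and the example numbers are hypothetical, and the normalization by speakers and files mentioned on the slide is not reproduced here.

```python
from collections import defaultdict


def mean_rate_by_tempo(groups):
    """Average speaking rate (syl/sec) per intended tempo label.

    `groups` is an iterable of (tempo_label, n_syllables, duration_s)
    tuples, one per interpausal group.
    """
    rates = defaultdict(list)
    for tempo, n_syllables, duration_s in groups:
        if duration_s <= 0:
            raise ValueError("group duration must be positive")
        rates[tempo].append(n_syllables / duration_s)
    return {tempo: sum(r) / len(r) for tempo, r in rates.items()}


# Hypothetical interpausal groups for two of the five tempo conditions.
example = [("vslow", 6, 3.0), ("vslow", 8, 2.0), ("vfast", 16, 2.0)]
means = mean_rate_by_tempo(example)
```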
Results: statistics (multivariate ANOVA)
Table (values not preserved): columns variable, F, df, p; rows speaker, style, tempo, speaker*style, speaker*tempo, style*tempo, speaker*style*tempo
TGA results
The mean values of regression intercepts:
▫positive correlations with segmental durations
▫negative correlations with speaking rates
▫confirmed for the present data, except for vfast, text
Figure: Mean values of intercept across speaking styles & tempi.
TGA results
The mean values of regression slopes:
▫slightly higher at normal or slower rates than at the very fast or fast rates, depending on the speaking style
▫more deceleration at slower rates: more dynamics in within-group timing
▫greater differences in slopes in poe and text than in the sent subcorpus
Figure: Mean values of slope across speaking styles & tempi.
Durational variability and tempo
ANOVA:
▫style: F=15.9; p<0.01
▫tempo: F=66.34; p<0.01
▫style*tempo: F=1.69; p=0.04
nPVI-V:
▫style: F=16.32; p<0.01
▫tempo: F=2.52; p=0.04
▫style*tempo: F=1.28; p=0.25
rPVI-C:
▫style: F=16.77; p<0.01
▫tempo: F=146.6; p<0.01
▫style*tempo: F=2.05; p=0.04
Durational variability and style
ANOVA: F=6.19; p<0.01
▫nPVI-V: F=6.62
▫rPVI-C: F=7.08
cf. nPVI-syl between 35 and 40 (Gibbon et al. 2007); nPVI-V=42.1 (Wagner 2014) and 46.6 (Grabe & Low 2002); rPVI-C=68.1 and 79.1 (Wagner 2014, Grabe & Low 2002)
Rhythmic classification of Polish
Table (values not preserved): columns SENT, TEXT, POOLED; rows English, German, Italian, Spanish
Euclidean distances between Polish and other languages (based on Arvaniti 2012).
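
A Euclidean distance of this kind can be sketched as the straight-line distance between two points in a space of rhythm metrics. It is an assumption of this sketch that the space is (nPVI-V, rPVI-C); the slide does not state which dimensions were used. To avoid inventing values for other languages, the example measures the distance between the two published Polish data points quoted earlier in these slides (Wagner 2014 and Grabe & Low 2002); values for English, German, Italian or Spanish would be passed in the same way.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equally long metric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# (nPVI-V, rPVI-C) for Polish as quoted in these slides.
polish_wagner_2014 = (42.1, 68.1)
polish_grabe_low_2002 = (46.6, 79.1)

d = euclidean_distance(polish_wagner_2014, polish_grabe_low_2002)
```

Because rPVI-C is not rate-normalized and spans a wider numeric range than nPVI-V, it tends to dominate an unscaled distance; whether and how the dimensions were scaled in the original analysis is not stated.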
Summary
First insights into a new "speech timing"/"rhythmic" corpus:
▫variability in speech rate measures in agreement with the intended five tempo conditions
▫higher intercepts at slower rates, weaker effect of style and speaker (speaker normalization)
▫interaction between speaking style & slope: differences between poe, text and sent subcorpora, less variability in sent
Summary
PVI analysis:
▫generally very low durational variability of V intervals (cf. data for Spanish or Italian, Arvaniti 2012)
▫differences in the nPVI-V due to speaking style related to a different timing pattern in the iambic poem
▫nPVI-V stable at different tempi
▫higher durational variability of C intervals at slower rates
▫rPVI-C sensitive also to speech style differences (lower in sent)
▫greatest similarity of Polish timing to Spanish (the lowest Euclidean distances)
▫compared with English, German (stress-timed), Spanish and Italian (syllable-timed), Polish displays the greatest durational variability across speaking styles as measured by PVIs
Future work
▫analyses based on the whole corpus
▫normalization of speaking rates among speakers
▫application of different measures of timing patterns (e.g. other TGA measures, Wagner P. 2008)
▫duration is not everything!
Acknowledgements
Project "Rhythmic structure of utterances in Polish: A corpus study" from National Science Center (NCN) for years , project number: 2013/11/D/HS2/
Special thanks to:
Jolanta Bachan (Institute of Linguistics, AMU)
annotators & speakers
References
1. Wiget, L., White, L., Schuppler, B., Grenon, I., Rauch, O. & Arvaniti, A. (2012). The usefulness of metrics in the quantification of speech rhythm. Journal of Phonetics, 40(3),
2. Barry, W. J., Andreeva, B., Russo, M., Dimitrova, S., & Kostadinova, T. (2003). Do rhythm measures tell us anything about language type? In D. Recasens, M. J. Solé & J. Romero (Eds.), Proceedings of the 15th International Congress of Phonetic Sciences, Barcelona, Spain: 2693–
3. Dauer, R. M. (1983). Stress-timing and syllable-timing reanalyzed. Journal of Phonetics, 11,
4. Dauer, R. M. (1987). Phonetic and phonological components of language rhythm. Proceedings of the XIIth ICPhS, Tallinn, Estonia (pp ).
5. Dellwo, V. (2006). Rhythm and speech rate: A variation coefficient for deltaC. In P. Karnowski & I. Szigeti (Eds.), Language and Language-Processing: Proceedings of the 38th Linguistics Colloquium, Piliscsaba 2003 (pp ). Frankfurt am Main, Germany: Peter Lang.
6. Dellwo, V. (2010). Influences of speech rate on the acoustic correlates of speech rhythm: An experimental phonetic study based on acoustic and perceptual evidence. Universität Bonn, Bonn University. Bonn, Germany.
7. Dellwo, V., & Wagner, P. (2003). Relations between language rhythm and speech rate. Proc. 15th ICPhS, Barcelona,
8. Klessa, K., & Gibbon, D. (2014). Annotation Pro + TGA: automation of speech timing analysis. Proc. of the 9th Language Resources and Evaluation Conference, Reykjavik, Iceland. ISBN
9. Gibbon, D. (2013). TGA: a web tool for Time Group Analysis. Proc. of Tools and Resources for the Analysis of Speech Prosody (TRASP). Aix-en-Provence, 30 August
10. Yu, J., Gibbon, D., & Klessa, K. (2014). Computational annotation-mining of syllable durations in speech varieties. In Proceedings of 7th Speech Prosody Conference (pp ).
11. Grabe, E., & Low, E. L. (2002). Durational variability in speech and the rhythm class hypothesis. Laboratory Phonology 7, Berlin: Mouton de Gruyter.
12. Malisz, Z. (2013). Speech rhythm variability in Polish and English: A study of interaction between rhythmic levels. Unpublished doctoral dissertation. Faculty of English, Adam Mickiewicz University in Poznań.
13. Ramus, F., Nespor, M., & Mehler, J. (1999). Correlates of linguistic rhythm in the speech signal. Cognition, 73,
14. Russo, M., & Barry, W. J. (2008). Isochrony reconsidered. Objectifying relations between rhythm measures and speech tempo. In Proceedings of Speech Prosody (Vol. 4, No. 08, pp ).