The influence of tempo and speaking style on timing patterns in Polish Agnieszka Wagner, Katarzyna Klessa Institute of Linguistics, Adam Mickiewicz University.


The influence of tempo and speaking style on timing patterns in Polish
Agnieszka Wagner, Katarzyna Klessa
Institute of Linguistics, Adam Mickiewicz University in Poznań, Poland
ExLing th International Conference on Experimental Linguistics, June, Athens, Greece

Background and motivation
The effect of speech tempo on timing patterns:
▫ well documented for various languages, but not for Polish
▫ relevant to the debate on the rhythmic classification of languages (→ stress/syllable/mora-timing) and on the correlates of perceived speech rhythm (metrics vs. other models/concepts)
State of the art:
▫ tendency towards greater syllable-timing/stress-timing with decreasing/increasing tempo (Russo & Barry 2008, Malisz 2013)
▫ higher/lower durational variability (of consonantal and vocalic intervals) at slower/faster rates (Barry et al. 2003, Dellwo 2010)
▫ average speech rate and tempo differentiation are language-specific (Dellwo & Wagner 2003), as shown for French, German and English

Background and motivation
Timing patterns & speaking styles, state of the art:
▫ higher durational variability (→ rhythm metrics) and a shift towards stress-timing in spontaneous speech (Arvaniti 2012)
▫ higher nPVI(syl), regression intercept and slope (→ TGA, Time Group Analysis) in conversational speech than in read text, and in more informal speaking styles (Yu, Gibbon & Klessa 2014, Gibbon, Klessa & Bachan 2014)
Goals:
▫ to determine the effects of speech rate and style variation on the timing properties of Polish
▫ to examine the relationship between intended and laboratory-measured tempo

Methodology
Speech corpus accessed via the "timing & duration" database:
▫ 5 speakers
▫ 3 speaking styles
   ▪ poems (different metres)
   ▪ text
   ▪ 3 x 5 sentences of different phonotactic structure (stress-timed, syllable-timed and uncontrolled)
▫ 5 intended tempi: very slow to very fast (vslow, slow, norm, fast, vfast)
▫ for future application: 15 non-native speakers (different L1); spontaneous speech, mini-dialogues
Automatic segmentation and labeling at phone, syllable & word level with AnnotationPro (Klessa, Karpinski, Wagner 2013):
▫ verification and manual correction by trained annotators

Methodology
TGA: Time Group Analysis (Gibbon 2013)
▫ implemented as AnnotationPro+TGA (Klessa & Gibbon 2014)
▫ automatic segmentation into interpausal time groups
▫ extraction of linear regression intercept and slope
   ▪ intercept: correlates with segmental duration
   ▪ slope: tendency towards deceleration/acceleration, i.e. the average increase/decrease in duration over the pause-defined segment
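
The intercept and slope described on this slide can be illustrated with an ordinary least-squares fit. The sketch below is not the AnnotationPro+TGA implementation: it assumes that the durations of the units in one time group (in milliseconds) are regressed on their ordinal position, and the function name `duration_trend` is hypothetical.

```python
from statistics import mean


def duration_trend(durations_ms):
    """Fit a least-squares line through (position, duration) pairs.

    Assumption: x is the ordinal position of each unit (0, 1, 2, ...)
    inside one interpausal time group; y is its duration in ms.
    Returns (intercept, slope).
    """
    n = len(durations_ms)
    if n < 2:
        raise ValueError("need at least two units to fit a line")
    xs = range(n)
    x_mean = mean(xs)
    y_mean = mean(durations_ms)
    # Sum of squared deviations of x, and cross-deviations of x and y.
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, durations_ms))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope


# Invented durations that lengthen steadily across the group.
intercept, slope = duration_trend([100.0, 120.0, 140.0, 160.0])
print(intercept, slope)
```

Under this setup a positive slope means that units get longer towards the end of the group (deceleration), a negative slope means they get shorter (acceleration), and the intercept is the fitted duration at the first position.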

Methodology
nPVI-V & rPVI-C (Grabe & Low 2002)
▫ automatic segmentation into C and V intervals based on phonetic alignment
▫ extraction of interval durations and calculation of PVIs for interpausal time groups using a dedicated plugin for AnnotationPro
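
The two indices can be sketched as follows, using the standard pairwise definitions from Grabe & Low (2002): the raw index averages the absolute differences between successive interval durations, and the normalized index divides each difference by the mean of the pair and scales by 100. The function names and the sample durations are illustrative and do not reproduce the AnnotationPro plugin.

```python
def rpvi(durations):
    """Raw PVI: mean absolute difference between successive intervals."""
    pairs = list(zip(durations, durations[1:]))
    if not pairs:
        raise ValueError("need at least two intervals")
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


def npvi(durations):
    """Normalized PVI: each difference is divided by the pair mean, times 100."""
    pairs = list(zip(durations, durations[1:]))
    if not pairs:
        raise ValueError("need at least two intervals")
    return 100.0 * sum(abs(a - b) / ((a + b) / 2.0) for a, b in pairs) / len(pairs)


# Invented interval durations (ms) for one interpausal time group.
vocalic = [80.0, 65.0, 90.0, 70.0]
consonantal = [110.0, 60.0, 140.0, 95.0]
print(npvi(vocalic), rpvi(consonantal))
```

The normalization is applied to vocalic intervals because their durations scale strongly with tempo, while consonantal intervals are conventionally left raw, which is one reason rPVI-C is expected to react to speech rate.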

Results
▫ 1265 inter-pausal groups
▫ min. average of 38 groups, max. average of 64 per speaker
▫ more groups produced at slower than at faster rates
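
Counts like these depend on how the annotation is cut at pauses. The following is a minimal sketch of such a segmentation, assuming the annotation is a list of `(label, start_s, end_s)` tuples; the pause label `"sil"` and the 0.2 s minimum pause duration are hypothetical choices, not values taken from the study.

```python
def split_into_time_groups(segments, pause_label="sil", min_pause_s=0.2):
    """Split labelled segments into inter-pausal groups.

    A pause of at least min_pause_s closes the current group.
    Shorter pauses are skipped: they neither split the group
    nor become part of it.
    """
    groups, current = [], []
    for label, start, end in segments:
        if label == pause_label:
            if (end - start) >= min_pause_s and current:
                groups.append(current)
                current = []
        else:
            current.append((label, start, end))
    if current:
        groups.append(current)
    return groups


# Invented syllable-level annotation with one long and one short pause.
annotation = [
    ("ta", 0.0, 0.2), ("ma", 0.2, 0.45), ("sil", 0.45, 0.9),
    ("ko", 0.9, 1.1), ("sil", 1.1, 1.15), ("ta", 1.15, 1.4),
]
groups = split_into_time_groups(annotation)
print(len(groups))
```

With a fixed threshold, slower speech tends to contain more pauses that qualify as group boundaries, which is consistent with more groups being found at slower rates.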

Intended vs. realized tempo
Mean speaking rates (syl/sec) realized by the speakers agree with the assumptions of the recording scenarios:
▫ the fastest rates in the vfast condition and the slowest in the vslow condition
▫ after normalization by speakers and files, slightly higher rates in the sentence subcorpus (sent) than in poetic verses (poe) and read text (text)
[Figure: Mean values of speech rates (syl/sec) realized by the speakers.]
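
A rate in syllables per second can be computed per inter-pausal group as the number of syllables divided by the group's duration. The sketch below assumes syllables are given as `(start_s, end_s)` pairs and that pauses are already excluded; whether the study's rates include pauses is not stated on the slide, and the function name is hypothetical.

```python
def syllable_rate(syllables):
    """Syllables per second over one inter-pausal group.

    Assumption: `syllables` is a time-ordered list of (start_s, end_s)
    pairs, and the group duration runs from the first start to the last end.
    """
    if not syllables:
        raise ValueError("empty group")
    duration = syllables[-1][1] - syllables[0][0]
    if duration <= 0:
        raise ValueError("group duration must be positive")
    return len(syllables) / duration


# Invented group: three syllables spanning exactly one second.
print(syllable_rate([(0.0, 0.2), (0.2, 0.5), (0.5, 1.0)]))
```

Averaging these per-group rates within each intended tempo condition gives the kind of mean values that can be compared against the five intended tempi.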

Results: statistics (multivariate ANOVA)
[Table: columns variable, F, df, p; rows speaker, style, tempo, speaker*style, speaker*tempo, style*tempo, speaker*style*tempo. The cell values are not preserved in this transcript.]

TGA results
The mean values of regression intercepts:
▫ positive correlations with segmental durations
▫ negative correlations with speaking rates
▫ confirmed for the present data, except for the vfast condition in the text subcorpus
[Figure: Mean values of intercept across speaking styles & tempi.]

TGA results
The mean values of regression slopes:
▫ slightly higher at normal or slower rates than at the very fast or fast rates, depending on the speaking style
▫ more deceleration at slower rates → more dynamics in within-group timing
▫ greater differences in slopes in poe and text than in the sent subcorpus
[Figure: Mean values of slope across speaking styles & tempi.]

Durational variability and tempo
ANOVA: style: F=15.9; p<0.01; tempo: F=66.34; p<0.01; style*tempo: F=1.69; p=0.04
nPVI-V: style: F=16.32; p<0.01; tempo: F=2.52; p=0.04; style*tempo: F=1.28; p=0.25
rPVI-C: style: F=16.77; p<0.01; tempo: F=146.6; p<0.01; style*tempo: F=2.05; p=0.04

Durational variability and style
ANOVA: F=6.19; p<0.01
nPVI-V: F=6.62
rPVI-C: F=7.08
cf. nPVI-syl between 35 and 40 (Gibbon et al. 2007); nPVI-V=42.1 (Wagner 2014) and 46.6 (Grabe & Low 2002); rPVI-C=68.1 and 79.1 (Wagner 2014, Grabe & Low 2002)

Rhythmic classification of Polish
[Table: Euclidean distances between Polish and other languages (based on Arvaniti 2012). Columns: SENT, TEXT, POOLED; rows: English, German, Italian, Spanish. The cell values are not preserved in this transcript.]
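
A Euclidean distance between two sets of rhythm scores can be sketched as below. The assumption here is that each data set is a point in the (nPVI-V, rPVI-C) plane; the slide does not say which metrics entered the distances. The two points are the previously published Polish values quoted on the "Durational variability and style" slide, used only to show the arithmetic.

```python
import math


def rhythm_distance(a, b):
    """Euclidean distance between two (nPVI-V, rPVI-C) points."""
    return math.sqrt((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2)


# Published Polish values quoted earlier in the deck.
polish_wagner_2014 = (42.1, 68.1)
polish_grabe_low_2002 = (46.6, 79.1)
print(round(rhythm_distance(polish_wagner_2014, polish_grabe_low_2002), 2))
```

Because rPVI-C is in raw milliseconds and nPVI-V is a normalized percentage, the unscaled distance is dominated by whichever dimension has the larger range; rescaling each dimension first would be a separate design decision.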

Summary
First insights into a new "speech timing"/"rhythmic" corpus:
▫ variability in speech rate measures agrees with the five intended tempo conditions
▫ higher intercepts at slower rates, weaker effect of style and speaker (→ speaker normalization)
▫ interaction between speaking style & slope: differences between the poe, text and sent subcorpora, less variability in sent

Summary
PVI analysis:
▫ generally very low durational variability of V intervals (cf. data for Spanish or Italian, Arvaniti 2012)
▫ differences in nPVI-V due to speaking style are related to a different timing pattern in the iambic poem
▫ nPVI-V stable at different tempi
▫ higher durational variability of C intervals at slower rates
▫ rPVI-C also sensitive to speech style differences (lower in sent)
▫ greatest similarity of Polish timing to Spanish (the lowest Euclidean distances)
▫ compared with English, German (stress-timed), Spanish and Italian (syllable-timed), Polish displays the greatest durational variability across speaking styles, as measured by PVIs

Future work
▫ analyses based on the whole corpus
▫ normalization of speaking rates among speakers
▫ application of different measures of timing patterns (e.g. other TGA measures, Wagner P. 2008)
▫ duration is not everything!

Acknowledgements
Project “Rhythmic structure of utterances in Polish: A corpus study” from the National Science Center (NCN) for years , project number: 2013/11/D/HS2/
Special thanks to:
▫ Jolanta Bachan (Institute of Linguistics, AMU)
▫ annotators & speakers

References
1. Wiget, L., White, L., Schuppler, B., Grenon, I., Rauch, O. & Arvaniti, A. (2012). The usefulness of metrics in the quantification of speech rhythm. Journal of Phonetics, 40(3).
2. Barry, W. J., Andreeva, B., Russo, M., Dimitrova, S., & Kostadinova, T. (2003). Do rhythm measures tell us anything about language type? In D. Recasens, M. J. Solé & J. Romero (Eds.), Proceedings of the 15th International Congress of Phonetic Sciences, Barcelona, Spain: 2693–
3. Dauer, R. M. (1983). Stress-timing and syllable-timing reanalyzed. Journal of Phonetics, 11.
4. Dauer, R. M. (1987). Phonetic and phonological components of language rhythm. Proceedings of the XIIth ICPhS, Tallinn, Estonia.
5. Dellwo, V. (2006). Rhythm and speech rate: A variation coefficient for deltaC. In P. Karnowski & I. Szigeti (Eds.), Language and Language-Processing: Proceedings of the 38th Linguistics Colloquium, Piliscsaba 2003. Frankfurt am Main, Germany: Peter Lang.
6. Dellwo, V. (2010). Influences of speech rate on the acoustic correlates of speech rhythm: An experimental phonetic study based on acoustic and perceptual evidence. Universität Bonn, Bonn, Germany.
7. Dellwo, V., & Wagner, P. Relations between language rhythm and speech rate. Proc. 15th ICPhS, Barcelona.
8. Klessa, K., & Gibbon, D. (2014). Annotation Pro + TGA: automation of speech timing analysis. Proc. of the 9th Language Resources and Evaluation Conference, Reykjavik, Iceland.
9. Gibbon, D. (2013). TGA: a web tool for Time Group Analysis. Proc. of Tools and Resources for the Analysis of Speech Prosody (TRASP). Aix-en-Provence, 30 August.
10. Yu, J., Gibbon, D., & Klessa, K. (2014). Computational annotation-mining of syllable durations in speech varieties. In Proceedings of 7th Speech Prosody Conference.
11. Grabe, E., & Low, E. L. (2002). Durational variability in speech and the rhythm class hypothesis. Laboratory Phonology 7. Berlin: Mouton de Gruyter.
12. Malisz, Z. (2013). Speech rhythm variability in Polish and English: A study of interaction between rhythmic levels. Unpublished doctoral dissertation. Faculty of English, Adam Mickiewicz University in Poznań.
13. Ramus, F., Nespor, M., & Mehler, J. (1999). Correlates of linguistic rhythm in the speech signal. Cognition, 73.
14. Russo, M., & Barry, W. J. (2008). Isochrony reconsidered. Objectifying relations between rhythm measures and speech tempo. In Proceedings of Speech Prosody (Vol. 4, No. 08).