DFG Project BA 737/10: "Cross-language and individual differences in the production and perception of syllabic prominence. Rhythm-typology revisited."

B. Andreeva & W. Barry, Jacques Koreman

Outline
- Research questions
- Recordings
- Measurements
- Statistical analysis
- Results
- Discussion
- Conclusions and Outlook

Research questions
- How do different languages exploit the universal, psycho-acoustically determined means of modifying the prominence of words in an utterance?
  - duration
  - fundamental frequency
  - energy
  - spectral properties
- Do the different word-phonological requirements of a language affect the degree to which the properties are exploited?
  - duration (length opposition; word stress)
  - fundamental frequency (tonal word-accent)
  - spectral properties (phonologized vowel reduction)
- Do speakers of a language vary in the strategies they adopt (for production and for perception)?

For further clarification
We have NOT investigated "word stress / word accent", but rather the change in a given word as a result of making it more or less informationally prominent in the utterance.
That is, the loss of length distinction in the [o] in German Philosophie vs. Philosoph, or the vowel quality alternation between [ɒ] and [ə] in English philosopher and philosophical, is not the focus of our investigation (though it may have a bearing on our interpretation of results).

Phrasal (de-)accentuation
Accentuation (phonological) can make prominent (phonetic)
- by lengthening,
- by increasing loudness,
- by changing the pitch,
- and by combinations thereof.
De-accentuation can reduce prominence
- by shortening (including segment elision),
- by decreasing loudness,
- by avoiding pitch changes,
- by reducing spectral distinctiveness.
These properties determine the "rhythm type".

The link to 'rhythm'?
- Speech rhythm (as a regular syllable-based or foot-based "beat") is an appealing myth.
- Though we do have a very fine sense of the appropriate temporal patterning of any particular utterance (in any particular situation), in fact we decode it in terms of information weight.
- Structural differences between languages are important because they determine the temporal patterns, and they may constrain how words are made prominent.
- 'Rhythm' = utterance-dependent prominence pattern (not only determined by duration)

Principle of our approach Comparable production task across languages (different degrees of accentuation on same words by eliciting different focus conditions for the same sentence)

Material and elicitation Short sentences were constructed containing two one- or two-syllable "critical words" (CWs), one early (but not initial) and one late (but not final) in the sentence. + iterative versions (dada) to support comparisons across languages

German example (comparable in BG, F, N, RUS), in text and dada versions
Question: Was sagst du? (broad)
Response: Der Mann fuhr den Wagen vor.
Question: Wer fuhr den Wagen vor? (narrow early)
Response: Der MANN fuhr den Wagen vor.
Question: Was fuhr der Mann vor? (narrow late)
Response: Der Mann fuhr den WAGEN vor.
Question: Die DAME fuhr den Wagen vor? (narrow contr. early)
Response: Der MANN fuhr den Wagen vor.
Question: Der Mann fuhr die KLAGEN vor? (narrow contr. late)
Response: Der Mann fuhr den WAGEN vor.
The questions were pre-recorded to accompany a PowerPoint presentation of the responses.

Levels of prominence
CW1: + stress, + acc., + nucl., + narrow | CW2: + stress, - acc., - nucl., + narrow
CW1: + stress, - acc., - nucl., + narrow | CW2: + stress, + acc., + nucl., + narrow
CW1: + stress, + acc., - nucl., - narrow | CW2: + stress, + acc., + nucl., - narrow

Breakdown of analysis
Material: 6 sentences, 6 repetitions, 3 focus conditions (broad, narrow, narrow contr.), 2 sentence positions (early, late), 2 realisational variants (lexical, delexicalised iterative)
Languages: Bulgarian, French, German, Norwegian, Russian
Speakers: 6 regionally homogeneous speakers (3 m, 3 f) per language (Sofia, northern standard French, Saarland, south-east Norway, Moscow area)
Analysis total per language: 2160 utterances
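The total of 2160 utterances per language can be checked with a short calculation. The sketch below assumes that the 3 focus types and 2 sentence positions collapse to 5 conditions, because broad focus has no early/late position (this matches the five prominence levels used in the statistical analysis); the variable names are illustrative only.

```python
# Design factors as listed on the slide.
sentences = 6
repetitions = 6
speakers = 6
variants = ["lexical", "dada"]

# Assumption: broad focus is not split by position, so 3 focus types x 2
# positions yield 5 conditions rather than 6.
conditions = [
    "broad",
    "narrow early",
    "narrow late",
    "contrastive early",
    "contrastive late",
]

total_per_language = (
    sentences * repetitions * speakers * len(variants) * len(conditions)
)
print(total_per_language)  # 2160
```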

Measurements
Duration: duration (ms) of stressed vowels, stressed syllables, CWs, feet
F0: mean F0 across stressed vowel of CW; F0 change (comparison of stressed vowel in CW with preceding/following vowels)
Energy: intensity (dB) of stressed vowel in CW; spectral balance = difference between Hz band and Hz band in stressed vowel of CW
Normalized relative to mean across corresp. units in sentence
Spectr. def.: F1–F3 at middle of stressed nucleus of CW
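The normalization step can be illustrated with a minimal sketch. The slide does not say whether the normalization is a ratio or a difference; the code assumes a ratio of the critical unit's value to the mean over the corresponding units in the sentence. The function name and the duration values are hypothetical.

```python
def normalize_to_sentence_mean(values, target_index):
    """Return the target unit's value divided by the mean over all units.

    Assumption: 'normalized relative to mean' is read as a ratio.
    """
    if not values:
        raise ValueError("values must not be empty")
    sentence_mean = sum(values) / len(values)
    return values[target_index] / sentence_mean


# Hypothetical vowel durations (ms) for one sentence; index 1 stands for
# the stressed vowel of the critical word.
vowel_durations_ms = [80.0, 120.0, 100.0, 100.0]
relative_duration = normalize_to_sentence_mean(vowel_durations_ms, 1)
print(relative_duration)
```

A value above 1 then means that the critical vowel is longer than the sentence average, which makes values comparable across speakers with different speech rates.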

Statistical analysis
One-way repeated-measures ANOVA per parameter, for CW1 and CW2 separately,
with dependent variables:
- duration: syll, onset, vowel
- F0 mean, F0 change
- intensity, spectral tilt
- F1, F2, F3
with within-subject variable:
- prominence (broad, early narrow, late narrow, contr. early narrow, contr. late narrow)
with between-subject variable:
- language (BG, D, F, N, RUS)
To see whether the prominence categories are realised differently across languages

Statistical analysis (cont.)
Multivariate ANOVAs per language, for CW1 and CW2 separately,
with dependent variables:
- duration: syll, onset, vowel
- F0 mean, F0 change
- intensity, spectral tilt
- F1, F2, F3
with independent variable:
- prominence (broad, early narrow, late narrow, contr. early narrow, contr. late narrow)
To evaluate which parameters are used to distinguish prominence categories in the five languages

Results: ANOVA with repeated measures
[Tables: main effects for language, and language x prominence interaction, each for CW1 and CW2, over the parameters syllable dur., onset dur., vowel dur., F0 mean, F0 change, intensity, spect. tilt, F1, F2, F3]

Languages use the acoustic carriers of prominence to different degrees
η² values are the ratio of the variance due to the conditions (prominence) to the total variance, and thus indicate the part of the total variance explained by the focus conditions.
[Table: η² values for prominence, by language (BG, D, F, N, RUS), for the parameters syllable, onset, vowel, F1, F2, F3, intensity, spec. tilt, F0 change, F0 mean]
* Results given here for CW1, but similar patterns for CW2
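As an illustration of what an η² value expresses, the sketch below computes the classical one-way η² (between-condition sum of squares divided by total sum of squares). This is an assumption for illustration: the slides do not state whether classical or partial η² was reported, and the repeated-measures design is ignored here. The data are invented.

```python
def eta_squared(groups):
    """Classical one-way eta squared: SS_between / SS_total."""
    all_values = [value for group in groups for value in group]
    grand_mean = sum(all_values) / len(all_values)
    ss_total = sum((value - grand_mean) ** 2 for value in all_values)
    ss_between = sum(
        len(group) * (sum(group) / len(group) - grand_mean) ** 2
        for group in groups
    )
    return ss_between / ss_total


# Invented measurements for two prominence conditions.
deaccented = [1.0, 2.0, 3.0]
accented = [4.0, 5.0, 6.0]
print(round(eta_squared([deaccented, accented]), 3))  # 0.771
```

Here the condition means (2 and 5) account for 13.5 of the 17.5 units of total squared deviation, so about 77% of the variance is explained by the condition.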

Results: Duration
Syllable duration range from accented to deaccented (from [dada] recordings):
CS1: N (49%) > F (30%) > D (25%) ~ RUS (24%) > BG (15%)
CS2: N (55%) > F (37%) > RUS (26%) > D (19%) ~ BG (16%)
Note: No apparent connection between vowel-length opposition and use of duration for accentuation (compare N and D vs. F, RUS and BG)

Results: Duration
Conditions: c. late, nc. late, broad, c. early, nc. early
CW1
BG: nc_late < c_early
D: c_late < c_early
F: late, br < br, early
N: late, br < early
RUS: c_late < nc_early
CW2
BG: early < c_late
D: early, br < late
F: early, br < late
N: early < br < late
RUS: early, br < late

Results: F0 range
F0 range in % from accented to deaccented (from [dada] recordings):
CS1: F (29%) > D (23%) > BG (18%) ~ RUS (14%) ~ N (13%)
CS2: F (28%) ~ D (27%) > BG (23%) > RUS (16%) > N (7%)
These values do not have any systematic link to pitch accent categories, but note Norwegian (lexical tones)

Results: F0 change
Conditions: c. late, nc. late, broad, c. early, nc. early
CW1
BG: -
D: late, br < early
F: late, br < early
N: late < br < early
RUS: late < c_early
CW2
BG: early, br < c_early, c_late < br, late
D: early < br < late
F: early, br < late
N: -
RUS: early, br < br, late

Results: Intensity
Intensity range in dB from deaccented to accented (from [dada] recordings):
BG > F > D = RUS > N CS
BG > F ~ D > RUS > N CS
Note: Larger intensity range for CS2 than CS1 due to greater post-nuclear than pre-nuclear de-accenting.

Results: Intensity
Conditions: c. late, nc. late, broad, c. early, nc. early
CW1
BG: late, br < early
D: late < br < early
F: late < br < early
N: late, br < early
RUS: late, br < br, nc_early < early
CW2
BG: early, br < late
D: early < br < late
F: early < br < late
N: early, br < br, late
RUS: early, br < late

Perception tests
- Different values in the production analysis imply differential perceptual judgements; therefore pairwise presentation of different conditions (broad, contrastive early, contrastive late, non-contrastive early, non-contrastive late).
- Continuous prominence values are preferable for statistical treatment; therefore non-categorical judgements (using a graphic interface).

Interface for 1st critical word / Interface for 2nd critical word
- A mouse click plays the two versions in sequence
- The sequence may be played as often as required
- Both sequences are offered during the course of the experiment
Interface text: "Der Mann fuhr den Wagen vor." Erster Satz / Zweiter Satz (first sentence / second sentence); response options: 1. stärker (1st stronger), 2. stärker (2nd stronger), beide gleich stark (both equally strong)
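The pairwise design can be sketched as follows. The code assumes that "both sequences" means both presentation orders (A then B, and B then A) and that every pair of the five conditions was presented; the slides do not confirm either point, so the resulting count is illustrative only.

```python
from itertools import permutations

conditions = [
    "broad",
    "contrastive early",
    "contrastive late",
    "non-contrastive early",
    "non-contrastive late",
]

# Ordered pairs: each unordered pair of conditions appears in both orders.
ordered_pairs = list(permutations(conditions, 2))

for first, second in ordered_pairs[:3]:
    print(f"1st: {first} | 2nd: {second}")
print(len(ordered_pairs))  # 20
```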

Perception tests (cont.)
Signal manipulation: change one parameter at a time to the value of the opposite prominence status (accented → unaccented and vice versa).
Problems: parameters are not totally independent; durational change affects F0 contours.
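The one-parameter-at-a-time logic can be sketched on parameter summaries rather than on real audio. Everything below is hypothetical: the parameter names, the values, and the helper function are invented for illustration, and the actual resynthesis (where, as noted, a duration change also affects the F0 contour) is not modelled.

```python
PARAMETERS = ("duration_ms", "f0_hz", "intensity_db")


def swap_one_parameter(base, donor, parameter):
    """Copy `base`, replacing a single parameter with the donor's value."""
    if parameter not in PARAMETERS:
        raise ValueError(f"unknown parameter: {parameter}")
    manipulated = dict(base)  # copy, so the natural stimulus stays intact
    manipulated[parameter] = donor[parameter]
    return manipulated


# Invented parameter summaries for one critical word.
accented = {"duration_ms": 180.0, "f0_hz": 220.0, "intensity_db": 72.0}
unaccented = {"duration_ms": 130.0, "f0_hz": 180.0, "intensity_db": 66.0}

# One manipulated stimulus per parameter (accented -> unaccented direction).
stimuli = {p: swap_one_parameter(accented, unaccented, p) for p in PARAMETERS}
print(stimuli["f0_hz"])
```

Swapping the arguments gives the opposite direction (unaccented → accented).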

Results
Pairwise comparison of natural stimuli: the subjects are well able to distinguish the different levels of prominence.
Perception with parameter-manipulated stimuli: F0 > Duration > Intensity (Russian subjects are slightly more sensitive to intensity)

Discussion
- Isačenko & Schädlich 1966 and Fry 1958 found the same hierarchy in their perception experiments,
- but Kochanski et al. 2005 and Tamburini & Wagner 2007 found loudness/intensity to be the main predictor of "prominence" in their production analyses.
- N.B. Fry and Isačenko & Schädlich worked exclusively with lexical stress; Kochanski et al. and T&W combined lexical stress and phrasal prominence and worked only on production.
- Our results (η² values) show a similar importance of intensity in production, but the perception work supports Fry's and Isačenko & Schädlich's conclusions!

Conclusions and Outlook
- The languages differ in the degree to which they exploit duration, F0 and intensity in production and, to some extent, in perception.
- The differences (in production and perception) are not directly linked to structural differences between the languages.
- None of the results support the "mythological" rhythm typology: stress-timed vs. syllable-timed.
- The complex picture of language differences in production contrasts with an apparent universal perceptual hierarchy (F0 > Duration > Intensity).
- All previous rhythm typology work has concentrated solely on duration.
- Natural communication combines intonation and segmental structure within an information-structural framework.
- Languages will therefore differ rhythmically as a product of duration AND F0, and rhythm measures need to reflect this.