Jennifer J. Venditti Postdoctoral Research Associate

Some Speech Basics Phonetic Transcription, Context-dependent variation, and Intonation Jennifer J. Venditti Postdoctoral Research Associate Columbia Computer Science 12 September 2002

1. Phonetic Transcription

Spelling vs. Sounds
same spelling = different sounds
  o: comb, tomb, bomb
  oo: blood, food, good
  c: court, center, cheese
  s: reason, surreal, shy
same sound = different spellings
  [i]: sea, see, scene, receive, thief
  [s]: cereal, same, miss
  [u]: true, few, choose, lieu, do
  [ay]: prime, buy, rhyme, lie
combination of letters = single sound
  ch: child, beach
  th: that, bathe
  oo: good, foot
  gh: laugh
single letter = combination of sounds
  x: exit, Texas
  u: use, music
‘silent’ letters
  k: knife, know
  p: psycho, pterodactyl
  e: moose, bone
  gh: through

Figures 4.1 and 4.2: Jurafsky & Martin (2000), pages 94-95.

On-line pronunciation dictionaries
dictionary    phoneset derived from   number of wordforms   English variety
LDC PRONLEX   ARPAbet                 90,694                American
CMUdict       ARPAbet                 100,000               American
CELEX         IPA                     160,595               British
Source: Jurafsky & Martin (2000), page 121.

Places of articulation alveolar post-alveolar/palatal dental velar uvular labial pharyngeal laryngeal/glottal http://www.chass.utoronto.ca/~danhall/phonetics/sammy.html

Vocal fold vibration [UCLA Phonetics Lab demo]

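Articulatory parameters for English consonants (in ARPAbet)
MANNER OF ARTICULATION by PLACE OF ARTICULATION (bilabial, labio-dental, inter-dental, alveolar, palatal, velar, glottal)
stop: bilabial p b; alveolar t d; velar k g; glottal q
fric.: labio-dental f v; inter-dental th dh; alveolar s z; palatal sh zh; glottal h
affric.: palatal ch jh
nasal: bilabial m; alveolar n; velar ng
approx: bilabial w; alveolar l/r; palatal y
flap: alveolar dx
VOICING: within each pair, the first symbol is voiceless and the second is voiced.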

American English vowel space FRONT BACK HIGH LOW iy ih eh ae aa ao uw uh ah ax ix ux ey ow aw oy ay

[iy] vs. [uw] (From a lecture given by Rochelle Newman)

[ae] vs. [aa] (From a lecture given by Rochelle Newman)

Acoustic landmarks [p] [t] [ix] [ih] [ax] [ae] [iy] [sh] [s] [n] [l] “Patricia and Patsy and Sally”

Articulators in action (Sample from the Queen’s University / ATR Labs X-ray Film Database) “Why did Ken set the soggy net on top of his deck?”

Exercise (1)
Write your name in: (a) IPA. (b) ARPAbet (if possible).
Choose one of the following triplets and transcribe each word in both IPA and ARPAbet.
  cone, tomb, bottom
  blood, fool, hook
  court, race, cheese
  reason, surreal, cash
  thing, these, other
  laugh, through, ghoul

IPA consonants (Distributed by the International Phonetic Association.)

IPA vowels (Distributed by the International Phonetic Association.)

2. Context-dependent phonetic variation

Context-dependent variation What we would consider a single ‘sound’ can be pronounced differently depending on the phonetic context. For example, the phoneme /t/: Figure 4.8: Jurafsky & Martin (2000), page 104.

Another regular alternation
I can ask [ay k ae n ae s k]
I can see [ay k ae n s iy]
I can bake [ay k ae m b ey k]
I can play [ay k ae m p l ey]
I can go [ay k ae ng g ow]
I can carry [ay k ae ng k ae r iy]
n → m / __ [+labial stop]
n → ng / __ [+velar stop]
(inopportune [n], insatiable [n], impervious [m], immortal [m], incoherent [ng], ingratitude [ng])
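The two assimilation rules above can be sketched as a small function over ARPAbet phone lists. This is only an illustration: the function name and the stop sets are assumptions made for the sketch, and "labial stop" is taken narrowly as [p b] and "velar stop" as [k g].

```python
# Hypothetical feature sets, assumed for this sketch only.
LABIAL_STOPS = {"p", "b"}
VELAR_STOPS = {"k", "g"}

def assimilate_nasal(phones):
    """Apply n -> m / __ [+labial stop] and n -> ng / __ [+velar stop]."""
    out = list(phones)
    for i in range(len(out) - 1):
        if out[i] != "n":
            continue
        following = out[i + 1]
        if following in LABIAL_STOPS:
            out[i] = "m"    # n takes on the labial place of the next stop
        elif following in VELAR_STOPS:
            out[i] = "ng"   # n takes on the velar place of the next stop
    return out

# Start from a careful pronunciation with [n] in "can".
bake = assimilate_nasal("ay k ae n b ey k".split())
go = assimilate_nasal("ay k ae n g ow".split())
see = assimilate_nasal("ay k ae n s iy".split())
```

Before [s] or a vowel neither rule applies, so the [n] in "I can see" and "I can ask" stays as it is.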

English plurals
hiccup [p] → hiccups
flood [d] → floods
sock [k] → socks
scab [b] → scabs
habit [t] → habits
frog [g] → frogs
spoof [f] → spoofs
comb [m] → combs
hearth [th] → hearths
grave [v] → graves
lathe [dh] → lathes
beach [ch] → beaches
fool [l] → fools
dish [sh] → dishes
sewer [r] → sewers
judge [jh] → judges
pie [ay] → pies
race [s] → races
curfew [uw] → curfews
axe [s] → axes
sofa [ax] → sofas
raise [z] → raises

Phonological rules for Engl. plurals
Assume that the lexical form of plural is /z/.
Insertion: ε → ix / [+sibilant] ^ __ z #
Devoicing: z → s / [-voice] ^ __ #

Insertion before devoicing:
              bus+PL           cape+PL         hen+PL
lexical:      /b ah s +z/      /k ey p +z/     /h eh n +z/
insertion:    b ah s +ix z     --              --
devoicing:    --               k ey p s        --
surface:      [b ah s ix z]    [k ey p s]      [h eh n z]

Devoicing before insertion (wrong order):
              bus+PL           cape+PL         hen+PL
lexical:      /b ah s +z/      /k ey p +z/     /h eh n +z/
devoicing:    b ah s s         k ey p s        --
insertion:    --               --              --
surface:      *[b ah s s]      [k ey p s]      [h eh n z]
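The effect of rule ordering can be checked with a minimal sketch that applies the two rules to the plural suffix in a chosen order. The function names and the phone sets standing in for [+sibilant] and [-voice] are assumptions for this illustration, not part of the original slides.

```python
# Hypothetical ARPAbet phone classes, assumed for this sketch only.
SIBILANTS = {"s", "z", "sh", "zh", "ch", "jh"}
VOICELESS = {"p", "t", "k", "f", "th", "s", "sh", "ch", "h"}

def insertion(stem, suffix):
    """Insert ix between a stem-final sibilant and the suffix z."""
    if suffix == ["z"] and stem[-1] in SIBILANTS:
        return ["ix", "z"]
    return suffix

def devoicing(stem, suffix):
    """Turn z into s when it directly follows a voiceless stem-final phone."""
    if suffix == ["z"] and stem[-1] in VOICELESS:
        return ["s"]
    return suffix

def pluralize(stem, rules=(insertion, devoicing)):
    """Start from lexical /z/ and apply the rules in the given order."""
    suffix = ["z"]
    for rule in rules:
        suffix = rule(stem, suffix)
    return stem + suffix

bus = "b ah s".split()
right_order = pluralize(bus)                          # insertion, then devoicing
wrong_order = pluralize(bus, (devoicing, insertion))  # devoicing, then insertion
```

With insertion first, the inserted [ix] separates the stem from /z/, so devoicing no longer applies. With devoicing first, /z/ has already become [s] by the time insertion looks for "z", which yields the ill-formed *[b ah s s].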

3. Intonation

Intonation makes the difference
A: I’d like to fly to Davenport, Iowa on TWA.
B: TWA doesn’t fly there ...
B1: They fly to Des Moines.
B2: They fly to Des Moines.

A: What types of foods are a good source of vitamins?
B1: Legumes are a good source of vitamins.
B2: Legumes are a good source of vitamins.

A1: I met Mary and Elena’s mother at the mall yesterday.
A2: I met Mary and Elena’s mother at the mall yesterday.

Intonation is about ...
Pitch
Melody, or “tune”
Alignment
Prominence and focus
Chunking, or “phrasing”
... and more ...

Vocal fold vibration
Physical: Fundamental frequency (F0) = rate of vibration of the vocal folds
Perceptual: Pitch
fundamental freq. → perceived pitch
[UCLA Phonetics Lab demo]
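As a rough illustration of F0 as a rate of vibration, the sketch below estimates the fundamental frequency of a synthetic tone by finding the lag at which the signal best matches a shifted copy of itself (plain autocorrelation). The function name, the 75 to 400 Hz search range, and the test tone are all assumptions for the sketch. Real speech needs windowing and voicing decisions that are left out here.

```python
import math

def estimate_f0(samples, sample_rate, f0_min=75.0, f0_max=400.0):
    """Return the F0 in Hz whose period maximizes the autocorrelation."""
    min_lag = int(sample_rate / f0_max)  # shortest period considered, in samples
    max_lag = int(sample_rate / f0_min)  # longest period considered, in samples
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Correlate the signal with itself shifted by `lag` samples.
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag

# Synthetic stand-in for a voiced sound: a pure 200 Hz tone, 0.1 s at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(800)]
f0 = estimate_f0(tone, sample_rate)
```

For this tone the estimate should land at about 200 Hz, since one period spans 40 samples at an 8 kHz sampling rate.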

Pitch range
Differences can be due to physical size, gender, social identity, excitement level, linguistic factors, etc. [from Prosody on the Web tutorial on pitch]

English Pitch Accents
* *
Lenora works for Lucent.
Certain words in the speech stream can be made structurally and perceptually prominent by the use of pitch accents.
* *
Lenora works for Lucent.
Pitch accents are local pitch movements (e.g. rising, falling) or pitch maxima/minima that accompany these metrically strong syllables.
The intonational “tune” is the melody that is created by sequences of pitch accents over an utterance.

Intonational tunes: What do they mean?
Lenora works for Lucent.
* * [Tell me something about the world ...]
* [... Really? I wasn’t aware of that.]
* [... I hope she doesn’t have stock options.]
* * [I’ve told you a million times ...]
[See works by Bolinger, Ladd, Hirschberg ...]

Alignment differences cue “assertion” vs. “suggestion”
A: I’d like to fly to Davenport, Iowa on TWA.
B: TWA doesn’t fly there ...
they fly to Des Moines
they fly to Des Moines

Alignment with different words
* *
Legumes are a good source of vitamins. → “broad focus”
A: What types of foods are a good source of vitamins?
*
B: LEGUMES are a good source of vitamins. → “narrow focus”
# Legumes are a good source of VITAMINS.

Placement of focal accent LEGUMES are a good source of vitamins The rise-fall tune (= “I assert this”) shifts locations.

Placement of focal accent Legumes are a GOOD source of vitamins The rise-fall tune (= “I assert this”) shifts locations.

Placement of focal accent legumes are a good source of VITAMINS The rise-fall tune (= “I assert this”) shifts locations.

Chunking, or “phrasing”
A1: I met Mary and Elena’s mother at the mall yesterday.
A2: I met Mary and Elena’s mother at the mall yesterday.

Phrasing can disambiguate
[Figure labels: Mary & Elena’s mother; mall]
I met Mary and Elena’s mother at the mall yesterday
One intonation phrase with relatively flat overall pitch range.

Phrasing can disambiguate
[Figure labels: Mary; Elena’s mother; mall]
I met Mary and Elena’s mother at the mall yesterday
Separate phrases, with expanded pitch movements.

Lists of numbers, nouns
twenty.eight.five
ninety.four.three
seventy.three.seven
forty.seven.seven
seventy.seven.seven
coffee cake and cream
chocolate ice cream and cake
fish fingers and bottles
cheese sandwiches and milk
cream buns and chocolate
[from Prosody on the Web tutorial on chunking]

Exercise (2)
1. Sketch out an F0 contour of "Does Manitowoc have a bowling alley?" as uttered in the following two contexts:
   (a) “I know Green Bay has a bowling alley, but ...”
   (b) “I know Manitowoc has a theater, but ...”
2. Complete the sentence: "When Madonna sings the song ..." Describe the prosodic phrasing of your utterance.
3. How can phrasing help disambiguate the utterance: "that’s right at the traffic light"