Structure of Spoken Language

Name: Structure of Spoken Language
Uploaded: 2017-10-10T18:51:03+00:00
Duration: PTM10S7
Channel: Suzan Lane
Description: Structure of Spoken Language

Structure of Spoken Language
CS 551/651: Structure of Spoken Language Lecture 6: Phonological Processes John-Paul Hosom Fall 2010

Phonological Processes
Phonemes undergo systematic variation depending on their context.
For example, forming the past tense:
  cause /k aa z/ → caused /k aa z d/
  talk /t aa k/ → talked /t aa k t/
The choice of /d/ vs. /t/ is predictable from the voicing of the word-final phoneme.
Allophones can be viewed as systematic variations of phonemes that result from cultural and/or physiological processes but do not distinguish the meaning of an utterance.
For example, the choice between /p/ and /ph/ in English is predictable: word- or syllable-initial voiceless stops are aspirated.
  pit → [ph ih t[h]]
  tip → [th ih p[h]]
  kin → [kh ih n]
  spit → [s p ih t[h]]
  stick → [s t ih k[h]]
  skin → [s k ih n]
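The past-tense pattern above can be sketched in a few lines of Python. This is only an illustration: the function name `past_tense` and the `VOICELESS` set are assumptions made for the example (a partial inventory in the slide's ASCII phone notation), and the sketch ignores stems that already end in /t/ or /d/, which take a syllabic suffix instead.

```python
# Assumed, partial set of voiceless phonemes in the slide's ASCII notation.
VOICELESS = {"p", "t", "k", "f", "s", "sh", "ch"}

def past_tense(phones):
    """Return the phone list with a past-tense suffix appended.

    The suffix is /t/ after a voiceless final phoneme and /d/ otherwise.
    Stems ending in /t/ or /d/ are not handled by this sketch.
    """
    suffix = "t" if phones[-1] in VOICELESS else "d"
    return list(phones) + [suffix]

caused = past_tense(["k", "aa", "z"])   # final /z/ is voiced
talked = past_tense(["t", "aa", "k"])   # final /k/ is voiceless
```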

/ph ih th th ih ph kh ih n/ /s p ih th s t ih kh s k ih n/
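The aspiration pattern in pit/spit, tip/stick, and kin/skin can be sketched as a rule over phone lists. The function name `aspirate_initial` is hypothetical, and the sketch makes two simplifying assumptions: it looks only at the word-initial position (no syllabification), and it writes aspiration by appending "h", as in the transcriptions above, so "th" here means aspirated /t/.

```python
VOICELESS_STOPS = {"p", "t", "k"}

def aspirate_initial(phones):
    """Aspirate a word-initial voiceless stop; leave all other phones alone.

    A stop after initial /s/ (spit, stick, skin) is not word-initial,
    so it stays unaspirated.
    """
    if phones and phones[0] in VOICELESS_STOPS:
        return [phones[0] + "h"] + list(phones[1:])
    return list(phones)

pit = aspirate_initial(["p", "ih", "t"])
spit = aspirate_initial(["s", "p", "ih", "t"])
```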

Other types of phonetic processes: Assimilation, Deletion, Reduction, Insertion, Substitution, Metathesis (switching the order of two phonemes)
Assimilation: “A feature of one segment is shared by a neighboring segment”
Examples of Assimilation:
  Nasalization of vowels before nasal consonants
  in- (negative prefix) becomes im- in words beginning with a bilabial consonant (imbalance, imperfect vs. indifferent, intolerance)

Assimilation may be due to coarticulation, or it may be language-specific, “arbitrary”:
“word-final alveolar obstruent may take on place of articulation of following word-initial segment if word-initial segment is palato-alveolar”
  this /dh ih s/ shop /sh aa ph/ → this shop /dh ih sh sh aa ph/
  this /dh ih s/ fish /f ih sh/ → this fish /dh ih s f ih sh/
  this /dh ih s/ thing /th ih ng/ → this thing /dh ih s th ih ng/
Also, depending on dialect, the rule does not apply within a word: misshapen /m ih s sh ei p en/

Example of assimilation of /s/ with /sh/ but not /f/: /dh ih sh sh aa pcl ph dh ih s f ih sh/
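The "this shop" vs. "this fish" contrast can be sketched as a cross-word rule. The names `assimilate`, `PALATO_ALVEOLAR`, and `ASSIMILATED` are assumptions for this example, and the sketch covers only the alveolar fricatives /s/ and /z/; how alveolar stops would assimilate is not specified in the slides and is left out.

```python
# Assumed palato-alveolar segments in the slide's ASCII notation.
PALATO_ALVEOLAR = {"sh", "zh", "ch", "jh"}
# Alveolar fricatives and their palato-alveolar counterparts.
ASSIMILATED = {"s": "sh", "z": "zh"}

def assimilate(first, second):
    """Join two words, assimilating a word-final /s/ or /z/ in the first
    word when the second word begins with a palato-alveolar segment."""
    first = list(first)
    if first and second and second[0] in PALATO_ALVEOLAR:
        first[-1] = ASSIMILATED.get(first[-1], first[-1])
    return first + list(second)

this_shop = assimilate(["dh", "ih", "s"], ["sh", "aa", "ph"])
this_fish = assimilate(["dh", "ih", "s"], ["f", "ih", "sh"])
```

Because the function is applied only at a word boundary, it also reflects the note that the rule does not apply within a word such as "misshapen".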

Substitution: common in foreign accents or speech impairments:
  welcome /v eh l k ah m/
  McDonald /m a k uw d ow n aa r uw d ow/
  Roger /w aa jh er/
Metathesis: changing the order of two phonemes within a word (dialect variation):
  pretty /p er dx iy/
  ask /ae k s/
For the history of ask/aks, Google “axe ask england”.

Deletion:
  Barbara /b aa r b ax r ah/ → /b aa r b r ah/
  Memory /m eh m ax r iy/ → /m eh m r iy/
Reduction: unstressed vowels become /ax/
  conduct (verb) /k ax n d ah k t/
  conduct (noun) /k aa n d ax k t/
Insertion: a voiceless stop is inserted between a nasal and a voiceless consonant; the voiceless stop always has the same place of articulation as the nasal
  fancy /f ae n t s iy/
  Chomsky /ch aa m p s k iy/
Insertion: a schwa is inserted after a word-final nasal
  nine /n ay n ax/
  dictionary pronunciation=

Deletion: /m eh m r iy/

Insertion: /f ae n t s iy ch aa m p s k iy/
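The stop insertion heard in "fancy" and "Chomsky" can be sketched as follows. The function name `insert_stops` and both lookup tables are assumptions made for this example. The nasal-to-stop mapping encodes the "same place of articulation" condition, and the sketch restricts the following consonant to a voiceless fricative. It does not model stress, so it over-applies compared with a rule that also checks the following vowel.

```python
# Each nasal paired with the voiceless stop at the same place of articulation.
NASAL_TO_STOP = {"m": "p", "n": "t", "ng": "k"}
# Assumed, partial set; "th" here is the fricative in "thing".
VOICELESS_FRICATIVES = {"f", "s", "sh", "th"}

def insert_stops(phones):
    """Insert a homorganic voiceless stop between a nasal and a
    following voiceless fricative."""
    out = []
    for i, phone in enumerate(phones):
        out.append(phone)
        nxt = phones[i + 1] if i + 1 < len(phones) else None
        if phone in NASAL_TO_STOP and nxt in VOICELESS_FRICATIVES:
            out.append(NASAL_TO_STOP[phone])
    return out

fancy = insert_stops(["f", "ae", "n", "s", "iy"])
chomsky = insert_stops(["ch", "aa", "m", "s", "k", "iy"])
```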

Phonological Processes: Ladefoged Rules
[–voiced, +stop] → [+aspirated] when syllable-initial: pit vs. spit
[ax] → [–voiced] after syllable-initial [–voiced, +stop] and before [–voiced, +stop]: potato
[+consonantal] → longer at end of phrase: bib, did, don, nod
[–voiced, +stop] → [–aspirated] after syllable-initial /s/: spew, stew, skew
[+vowel] → shorter before unvoiced phonemes in same syllable: cap vs. cab, back vs. bag

Devoicing, End-of-Phrase Length: /ph ax tcl th ey dx ow/ /d aa n n aa dcl d/

Length before Voiceless: /kh ae pc ph kh ae bc b b ae kc kh b ae gc g/

[–voiced] → longer when at end of syllable: sass, shook vs. push
[+stop] → unreleased before [+stop]: apt, act (often see some mark in spectrogram)
[–voiced, +alveolar, +stop] → [+glottal stop] when before an alveolar nasal in same word: beaten → /b iy q en/
[+nasal] → [+syllabic] at word end when following [+obstruent]: chasm → /k ae z em/, NOT film (obstruent = complete closure of airway; /l/ is not)
[+liquid] → [+syllabic] at word end and following [+consonant]: paddle, whistle, kennel, razor, hammer, tailor, NOT snarl; change to “following [+obstruent]”?

/ae pcl tcl th ae kcl tcl th/ /bcl b iy q tcl en ax_h/
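The glottal-stop rule illustrated by "beaten" can be sketched on a single word's phone list. The function name `glottalize` is hypothetical, and the sketch assumes that /t/ is the only voiceless alveolar stop and that the alveolar nasal appears either as "n" or as syllabic "en", following the notation used in these transcriptions.

```python
ALVEOLAR_NASALS = {"n", "en"}  # plain and syllabic alveolar nasal

def glottalize(phones):
    """Replace /t/ with the glottal stop /q/ when the next phone in the
    same word is an alveolar nasal."""
    out = list(phones)
    for i in range(len(out) - 1):
        if out[i] == "t" and out[i + 1] in ALVEOLAR_NASALS:
            out[i] = "q"
    return out

beaten = glottalize(["b", "iy", "t", "en"])
```

Passing one word at a time is what enforces the "in same word" condition; the function has no notion of word boundaries of its own.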

[+alveolar, +stop] → [+voiced, +flap] when between two vowels, the second of which is unstressed. This rule has speaker-dependent variations.
[+alveolar, +stop] → omitted between two consonants: most people, sandpaper, grand master
[+consonant] → shortened before identical [+consonant]
∅ → [–voice, +stop] between [+nasal] and [–voice, +fricative] when following vowel absent or unstressed: prince vs. prints (epenthesis)
∅ → [&] (schwa) following word-final [+nasal, +consonantal]: nine, come, sang (epenthesis)

“most people and grand masters use sandpaper” /m ow s pc ph iy pc ph el n gc g r ae n m ae s tc th er z yu z s ae n pc ph ey pc ph er/

“nine come sang” /n ay n ax kcl kh ah m ax s ae ng ax/
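The omission of an alveolar stop between two consonants ("most people", "sandpaper", "grand master") can be sketched over a phrase-level phone list. The function name `delete_cluster_stops` and the `VOWELS` set are assumptions for this example. The vowel set is a partial inventory that treats the syllabic segments "el" and "er" as vocalic, and every phone outside it is treated as a consonant.

```python
# Assumed, partial set of vocalic phones in the slide's ASCII notation.
VOWELS = {"aa", "ae", "ah", "ax", "ay", "eh", "el", "er",
          "ey", "ih", "iy", "ow", "uw"}
ALVEOLAR_STOPS = {"t", "d"}

def delete_cluster_stops(phones):
    """Drop /t/ or /d/ when both neighbouring phones are consonants."""
    out = []
    for i, phone in enumerate(phones):
        prev_is_cons = i > 0 and phones[i - 1] not in VOWELS
        next_is_cons = i + 1 < len(phones) and phones[i + 1] not in VOWELS
        if phone in ALVEOLAR_STOPS and prev_is_cons and next_is_cons:
            continue  # stop is omitted inside the consonant cluster
        out.append(phone)
    return out

most_people = delete_cluster_stops(["m", "ow", "s", "t", "p", "iy", "p", "el"])
grand_master = delete_cluster_stops(
    ["g", "r", "ae", "n", "d", "m", "ae", "s", "t", "er"])
```

In "grand master" only the /d/ is dropped; the /t/ survives because it is followed by a vocalic segment, which matches the transcription of the recorded phrase.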

[+vowel] → longer in open syllables: sea vs. seed vs. seat, sigh vs. side vs. sight (equalize length of syllables with differing numbers of segments)
[+vowel] → longer in stressed syllable: below vs. billow (stressed syllables are longer in duration than unstressed)
[+vowel] → [+nasal] before [+nasal] consonant
[+vowel, –stressed] → schwa (vowel reduction): able vs. ability, Canada vs. Canadian, photograph vs. photography

“sigh side sight” /s ay s ay dcl d s ay tcl th/

“below billow” /b ax l ow b ih l ow/

Why is this useful?
(a) Providing models of known phenomena is better than having a classifier learn the phenomena from data
(b) Provides humans with appropriate cues for understanding, naturalness
(c) Accurate phonetic modeling improves the ability of a classifier to discriminate between classes
Example for Text-to-Speech (case (b)):
  Create a TTS system
  Don’t shorten vowels before voiceless plosives
  Creates, by default, an acoustic cue for voiced plosives
  Decreases intelligibility or at least naturalness of the system

Example for Automatic Speech Recognition (case (c)):
  Train a speech recognizer using “dictionary” pronunciations
  Then, in all cases where a [–voice, +stop] occurs between [+nasal] and [–voice, +fricative], such as “fancy” (in the CMU dictionary as /f ae n s iy/), the acoustics show an alveolar stop, but the segment is trained as either nasal /n/ or fricative /s/
  Decreases ability of model to discriminate classes
  Decreases performance of system
Difficulty is in providing comprehensive, accurate rules that are not inappropriately “forced” on a system
