Presentation on theme: "The Human Voice. I. Speech production 1. The vocal organs"— Presentation transcript:
1 The Human Voice. I. Speech production 1. The vocal organs
The lungs serve as a reservoir of air and a source of energy.
In speaking, air is forced from the lungs through the larynx into the three main cavities: the pharynx, the nasal cavity, and the oral cavity.
Air exits through the nose and mouth.
Air can be inhaled and exhaled without much sound.
To produce speech sounds, the flow of air is interrupted by the vocal cords or by constrictions in the vocal tract (made by the tongue or lips).
2 Larynx and vocal folds (cords)
The larynx, with its vocal folds, is the major source of sound in the vocal system.
Sound is generated by the rhythmic opening and closing of the vocal folds.
The folds are open during inhalation, closed when holding one's breath, and vibrating for speech or singing (oscillating 440 times per second when singing A4). They are controlled via the vagus nerve.
A person's voice pitch (fundamental frequency) is determined by the resonant frequency of the vocal folds.
The fundamental frequency is influenced by the length, size, and tension of the vocal folds.
In adult males, this frequency averages about 125 Hz.
In adult females, it is around 210 Hz.
In children, it is over 300 Hz.
The male vocal folds are between 17.5 mm and 25 mm (about 0.75" to 1.0") in length.
The female vocal folds are between 12.5 mm and 17.5 mm (about 0.5" to 0.75") in length.
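As a rough illustration of the figures on this slide, the sketch below converts a fundamental frequency into the duration of one open-close cycle of the folds and into the nearest musical note. The function names are hypothetical, and the note naming assumes 12-tone equal temperament with A4 = 440 Hz, which goes beyond what the slide states.

```python
import math

A4_HZ = 440.0  # reference pitch used on the slide: A4 = 440 vibrations per second
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def period_ms(f0_hz):
    """Duration of one vibration cycle of the vocal folds, in milliseconds."""
    return 1000.0 / f0_hz


def nearest_note(f0_hz):
    """Nearest equal-tempered note name (assumption: A4 = 440 Hz, MIDI note 69)."""
    semitones_from_a4 = round(12 * math.log2(f0_hz / A4_HZ))
    midi = 69 + semitones_from_a4
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"


# Average fundamental frequencies quoted on the slide (300 Hz is a lower bound for children).
for label, f0 in [("adult male", 125.0), ("adult female", 210.0), ("child", 300.0)]:
    print(f"{label}: {f0:.0f} Hz, period {period_ms(f0):.2f} ms, near {nearest_note(f0)}")
```

The note names are only approximate labels for the averages quoted above; real speaking pitch moves continuously.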
3 Vocal folds (continued)
Vocal folds generate a sound rich in harmonics.
Harmonics are produced by collisions of the vocal folds with themselves, by recirculation of some of the air back through the trachea, or both.
Some singers can isolate some of those harmonics in a way that is perceived as singing in more than one pitch at the same time, a technique called overtone singing.
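To make "rich in harmonics" concrete, the following minimal sketch lists the harmonics of a fundamental, assuming an ideally periodic source whose harmonics sit at whole-number multiples of the fundamental. It also finds the harmonic closest to a target frequency, which is loosely what an overtone singer emphasizes. Both function names are hypothetical.

```python
def harmonic_series(f0_hz, max_hz):
    """Harmonic frequencies k * f0 up to max_hz (k = 1 is the fundamental)."""
    count = int(max_hz // f0_hz)
    return [k * f0_hz for k in range(1, count + 1)]


def closest_harmonic(f0_hz, target_hz):
    """Return (harmonic number, frequency) of the harmonic nearest to target_hz."""
    k = max(1, round(target_hz / f0_hz))
    return k, k * f0_hz


# Harmonics of a 125 Hz voice (the adult male average quoted earlier) up to 1 kHz.
print(harmonic_series(125.0, 1000.0))
# Which harmonic of that voice lies nearest to 1600 Hz?
print(closest_harmonic(125.0, 1600.0))
```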
5 Vocal tract
The vocal tract is the cavity in which the sound produced at the sound source (the larynx) is filtered.
It consists of:
the pharynx (pharyngeal cavity)
the oral cavity
the nasal cavity
The estimated average length of the vocal tract is 17 cm in adult males and 14 cm in adult females.
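One common textbook idealization, not stated on the slide, treats the vocal tract as a uniform tube closed at the larynx and open at the lips, so that its resonances fall at odd multiples of c / (4L). The sketch below applies that assumed model to the two lengths quoted above. The speed of sound (350 m/s) and the function name are also assumptions made for illustration.

```python
SPEED_OF_SOUND_M_S = 350.0  # assumed value for warm, humid air; not from the slides


def tube_resonances_hz(length_cm, count=3):
    """Resonances of an idealized tube closed at one end: (2n - 1) * c / (4 * L)."""
    length_m = length_cm / 100.0
    return [
        (2 * n - 1) * SPEED_OF_SOUND_M_S / (4.0 * length_m)
        for n in range(1, count + 1)
    ]


# Vocal tract lengths quoted on the slide.
for label, length_cm in [("adult male", 17.0), ("adult female", 14.0)]:
    print(label, [round(f) for f in tube_resonances_hz(length_cm)])
```

Under these assumptions the shorter tract gives higher resonances, which is the only point the sketch is meant to show; a real vocal tract is not a uniform tube.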
6 2. Articulation of Speech
Each syllable is made of one or more phonemes.
Phonemes are either vowels or consonants.
Vowels are always voiced (produced with vibrations of the vocal folds).
Consonants are either voiced or unvoiced.
There are 12 to 21 vowel sounds in English (depending on which speech scientist you talk to).
Opinions vary as to whether a given sound is a pure vowel or a diphthong (a combination of two vowel sounds into one phoneme).
7 Consonants are classified according to their manner of articulation:
Plosive or stop consonants (p, b, t, etc.) are produced by blocking the flow of air somewhere in the vocal tract (usually the mouth) and releasing the pressure rather suddenly.
Fricatives (f, s, sh, etc.) are made by constricting the airflow to produce turbulence.
Nasals (m, n, ng) are made by lowering the soft palate to connect the nasal cavity to the pharynx and then blocking the mouth cavity at some point along its length.
Liquids (r, l) are produced by raising the tip of the tongue while the oral cavity is somewhat constricted.
Semivowel or glide consonants (w, y) are produced by keeping the vocal tract briefly in a vowel position and then changing it rapidly to the vowel sound that follows.
Consonants are further classified according to their place of articulation, primarily the lips (labial), teeth (dental), gums (alveolar), palate (palatal), glottis (glottal), and lips and teeth (labiodental).
There are 24 consonant sounds in English.
8 3. Formants: Resonances of the Vocal Tract
(The peaks that are observed in the spectrum envelope and are independent of the pitch.)
They appear as envelopes that modify the amplitudes of the various harmonics of the source sound.
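The idea of an envelope that reshapes the source harmonics can be sketched as follows. The bell-shaped peak, the 100 Hz bandwidth, the 1/k roll-off of the source, and the single formant at 1000 Hz are all illustrative assumptions rather than measured values, and the function names are hypothetical.

```python
import math


def envelope_gain(freq_hz, formants_hz, bandwidth_hz=100.0):
    """Sum of bell-shaped peaks, one per formant (an illustrative shape only)."""
    return sum(
        math.exp(-0.5 * ((freq_hz - centre) / bandwidth_hz) ** 2)
        for centre in formants_hz
    )


def filtered_harmonics(f0_hz, formants_hz, max_hz=4000.0):
    """Return (frequency, amplitude) pairs for the harmonics after the envelope is applied."""
    result = []
    k = 1
    while k * f0_hz <= max_hz:
        freq = k * f0_hz
        source_amplitude = 1.0 / k  # assumed roll-off of the source harmonics
        result.append((freq, source_amplitude * envelope_gain(freq, formants_hz)))
        k += 1
    return result


# Same envelope, two different pitches: the strongest harmonic stays near the formant.
for f0 in (125.0, 200.0):
    strongest = max(filtered_harmonics(f0, [1000.0]), key=lambda pair: pair[1])
    print(f"f0 = {f0:.0f} Hz, strongest harmonic at {strongest[0]:.0f} Hz")
```

Changing the pitch moves the harmonics, but the envelope peak stays where it is, which is the sense in which formants are independent of pitch.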
10 Formants and speech
Formants are the distinguishing, meaningful frequency components of human speech and of singing.
The information that humans require to distinguish between vowels can be represented quantitatively by the frequency content of the vowel sounds.
In speech, these are the characteristic partials that identify vowels to the listener.
Most formants are produced by tube and chamber resonance, but a few whistle tones derive from periodic collapse of Venturi effect low-pressure zones.
The formant with the lowest frequency is called f1, the second f2, and the third f3.
Most often the first two formants, f1 and f2, are enough to disambiguate the vowel.
Vowel: main formant region
u: 200–400 Hz
o: 400–600 Hz
a: 800–1200 Hz
e: 400–600 and 2200–2600 Hz
i: 200–400 and 3000–3500 Hz
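The table above can be read as a simple lookup. The sketch below is a hypothetical helper, not an established vowel classifier: it takes a list of observed spectral peak frequencies and returns the vowels whose listed regions all contain a peak, preferring vowels that match two regions over those that match one. The tie-breaking rule is an assumption made for illustration.

```python
# Main formant regions in Hz, copied from the table on the slide.
FORMANT_REGIONS_HZ = {
    "u": [(200, 400)],
    "o": [(400, 600)],
    "a": [(800, 1200)],
    "e": [(400, 600), (2200, 2600)],
    "i": [(200, 400), (3000, 3500)],
}


def matching_vowels(peaks_hz):
    """Vowels whose every listed region contains at least one observed peak."""

    def region_hit(region):
        low, high = region
        return any(low <= peak <= high for peak in peaks_hz)

    candidates = [
        vowel
        for vowel, regions in FORMANT_REGIONS_HZ.items()
        if all(region_hit(region) for region in regions)
    ]
    if not candidates:
        return []
    # Assumed tie-break: a vowel explained by two regions beats one explained by a single region.
    most_regions = max(len(FORMANT_REGIONS_HZ[vowel]) for vowel in candidates)
    return [v for v in candidates if len(FORMANT_REGIONS_HZ[v]) == most_regions]


print(matching_vowels([300, 3200]))  # low peak plus a high peak
print(matching_vowels([300]))        # low peak only
print(matching_vowels([500, 2400]))
```

A low peak alone is ambiguous between u and i in the table; the second region is what separates them, in line with the remark that f1 and f2 together usually identify the vowel.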
11 Singers' formant
The frequency spectrum of trained singers, especially male singers, has a formant around 3000 Hz.
It tends to be independent of the vowel and the pitch.
This increase in energy at 3000 Hz allows singers to be heard and understood over an orchestra, which peaks at much lower frequencies of around 500 Hz.
This formant is actively developed through vocal training, for instance through so-called "voce di strega" or witch's voice exercises, and is caused by a part of the vocal tract acting as a resonator.
It lies somewhere between the third and the fourth formant.
It adds brilliance and carrying power to the male singing voice.
12 4. Prosodic Features of Speech
Prosodic features are characteristics that convey meaning, emphasis, and emotion without actually changing the phonemes.
They include pitch, rhythm, and accent.
In English, prosodic features play a secondary role to the phonemes.
However, in Chinese, prosodic features change the meaning of a phoneme.
Prosodic features tend to indicate the emotional state of the speaker.
There have been attempts to use them in "lie detection", analyzing recorded speech for evidence of stress.