


Describing Speech Sounds

Much of the summary that follows draws on [Fletcher 1953].

The basic linguistic unit is called a phoneme, denoted by a character enclosed in forward slashes or square brackets (e.g. the symbol /i/ represents the vowel sound heard in the word team). Phonemes may be classified and described according to a number of criteria. They may be divided, for example, into vowels and consonants. While vowels are produced primarily by the vibration of the vocal folds, consonants are the speech sounds produced by the turbulent or explosive flow of air through constricted or obstructed parts of the vocal tract, known as articulators. Note that consonants may also involve vibration of the vocal folds, as in the phoneme /v/, heard in the word voice. Such consonants are called voiced consonants, while consonants such as the /f/ of the word fish are referred to as unvoiced, since the vocal folds are simply held open during the production of these sounds. As a rough rule of thumb, the vowels are the sounds represented by the letters a, e, i, o, u, and sometimes y.

The vowels may be further divided into pure vowels, each consisting of a single voiced sound, such as the /ʊ/ of the word took, and diphthongs, created by chaining two pure vowels together, such as the /aɪ/ heard in the word time.

The consonants may also be divided along a number of lines. Fricative consonants, or spirants, such as the aforementioned /f/ of fish and the /s/ of sit, are marked by a steady, turbulent flow of air at a constriction created somewhere in the vocal tract other than at the vocal folds. Stop consonants, or plosives, such as the /p/ of push or the /g/ of goat, are instead produced by the build-up and sudden, explosive release of air pressure at some point in the vocal tract. Fricatives and stop consonants may be either voiced or unvoiced.
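The two distinctions drawn so far for consonants (fricative versus stop, voiced versus unvoiced) can be sketched as a small lookup table. The sketch below is illustrative only: the names Consonant, CONSONANTS, and describe are hypothetical, and the table holds just the five example consonants mentioned in the text. The voicing of /s/ is not stated above; it is entered as unvoiced, following standard phonetic usage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Consonant:
    """One example consonant with the two features discussed in the text."""
    symbol: str    # phoneme symbol, written without the enclosing slashes
    manner: str    # "fricative" (spirant) or "stop" (plosive)
    voiced: bool   # True if the vocal folds vibrate during the sound
    example: str   # example word taken from the text


# Hypothetical table covering only the consonants used as examples above.
CONSONANTS = {
    "f": Consonant("f", "fricative", False, "fish"),
    "v": Consonant("v", "fricative", True, "voice"),
    "s": Consonant("s", "fricative", False, "sit"),  # voicing: assumption, see lead-in
    "p": Consonant("p", "stop", False, "push"),
    "g": Consonant("g", "stop", True, "goat"),
}


def describe(symbol: str) -> str:
    """Return a one-line description of a consonant in the table."""
    c = CONSONANTS[symbol]
    voicing = "voiced" if c.voiced else "unvoiced"
    return f"/{c.symbol}/ ({c.example}): {voicing} {c.manner}"


if __name__ == "__main__":
    for sym in CONSONANTS:
        print(describe(sym))
```

Keeping manner and voicing as separate fields reflects the point that the two properties vary independently: a fricative or a stop may be either voiced or unvoiced.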

Certain terms used in the description of speech sounds may be applied to both vowels and consonants. Nasal sounds are those in which the nasal cavity plays a role in the transmission and broadcast of the vocal sound, whereas non-nasal sounds occur when the nasal cavity is cut off from the vocal tract by the velum during sound production. Continuants are those speech sounds that involve the continuous, steady flow of air from the lungs to the environment, while stops involve the complete closure or obstruction of the vocal cavities at some point in the production of the sound.

Finally, there are some classes of phonemes that do not fit neatly into the vowel-consonant classification scheme described above. The sounds /l/, /r/, /m/, /n/, and /ng/, for example, though often thought of as consonants, are referred to as liquids or semi-vowels. The sounds /w/, /y/, and /h/ are referred to as transitionals in [Fletcher 1953]. Speech sounds referred to as affricates consist of a plosive or stop consonant immediately followed by a fricative or spirant, such as the German /pf/.
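As a rough illustration of these remaining classes, the sketch below groups the symbols listed above into sets and treats an affricate as an ordered pair of a stop followed by a fricative. All names (LIQUIDS_OR_SEMIVOWELS, TRANSITIONALS, STOPS, FRICATIVES, special_class, is_affricate) are hypothetical, and the STOPS and FRICATIVES sets contain only the example consonants used earlier in this section, so the check is deliberately incomplete.

```python
from typing import Optional

# Symbol sets taken from the text; slashes are omitted from the symbols.
LIQUIDS_OR_SEMIVOWELS = {"l", "r", "m", "n", "ng"}
TRANSITIONALS = {"w", "y", "h"}

# Only the example consonants from this section (an assumption, not a full inventory).
STOPS = {"p", "g"}
FRICATIVES = {"f", "s", "v"}


def special_class(symbol: str) -> Optional[str]:
    """Name the class of a phoneme that falls outside the vowel-consonant scheme."""
    if symbol in LIQUIDS_OR_SEMIVOWELS:
        return "liquid or semi-vowel"
    if symbol in TRANSITIONALS:
        return "transitional"
    return None  # not one of the special classes listed in the text


def is_affricate(first: str, second: str) -> bool:
    """True if the pair is a stop immediately followed by a fricative."""
    return first in STOPS and second in FRICATIVES


if __name__ == "__main__":
    print(special_class("ng"))     # class of /ng/
    print(is_affricate("p", "f"))  # the German /pf/ example
```

The order of the pair matters in is_affricate: a fricative followed by a stop does not satisfy the definition given above.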



"Audio Speech Research Note", Ryan J. Cassidy, published electronically by author, July 2003.

Copyright © 2003-11-28 by Ryan J. Cassidy.
Please email errata, comments, and suggestions to Ryan J. Cassidy <ryanc@ieee.org>
Stanford University