Voice Synthesis

Hui-Ling Lu, in her 2002 thesis [104], developed a model for the singing voice in which the driving glottal pulse train is estimated jointly with filter parameters describing the shape of the vocal tract (the complete airway from the base of the throat to the lip opening). The model can be seen as an improvement over linear-predictive coding (LPC) of voice in the direction of a more accurate physical model of voice production, while maintaining a low computational cost relative to more complex articulatory models of voice production. In particular, the parameter estimation involves only convex optimization plus a one-dimensional (possibly non-convex) line search over a compact interval. The line search determines the so-called "open quotient", which is the fraction of each period during which there is glottal flow. The glottal pulse parameters are based on the derivative-glottal-wave models of Liljencrants, Fant, and Klatt [59,94]. Portions of this research have been published in the ICMC-00 [105] and WASPAA-01 [106] proceedings.
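
The nested structure of such an estimation (a convex fit inside, a one-dimensional search over the open quotient outside) can be illustrated with a toy sketch. This is not the method of [104]: it omits the vocal-tract filter entirely, reduces the inner convex problem to a closed-form scalar least-squares gain, and replaces the line search with a plain grid scan. The pulse shape is an assumed polynomial (flow proportional to x^2 - x^3 over the open phase, in the spirit of the Klatt-style models), and all function names are hypothetical.

```python
def glottal_derivative_pulse(period, open_quotient):
    """One period of a unit-scale derivative-glottal pulse (hypothetical helper).

    Assumed shape: flow x**2 - x**3 on the open phase (x in [0, 1)), so the
    derivative is proportional to 2*x - 3*x**2; the closed phase is zero.
    """
    # Number of samples with glottal flow, clamped to [2, period].
    n_open = min(period, max(2, round(open_quotient * period)))
    pulse = []
    for n in range(period):
        if n < n_open:
            x = n / n_open  # normalized time within the open phase
            pulse.append(2.0 * x - 3.0 * x * x)
        else:
            pulse.append(0.0)
    return pulse


def fit_amplitude(target, pulse):
    """Inner convex step: closed-form least-squares gain for a fixed shape."""
    energy = sum(p * p for p in pulse)
    gain = sum(t * p for t, p in zip(target, pulse)) / energy
    error = sum((t - gain * p) ** 2 for t, p in zip(target, pulse))
    return gain, error


def estimate_open_quotient(target, candidates):
    """Outer step: scan candidate open quotients, keep the smallest residual."""
    best = None
    for oq in candidates:
        pulse = glottal_derivative_pulse(len(target), oq)
        gain, error = fit_amplitude(target, pulse)
        if best is None or error < best[2]:
            best = (oq, gain, error)
    return best  # (open_quotient, gain, squared_error)


if __name__ == "__main__":
    # Synthetic one-period target with a known open quotient and gain.
    target = [2.5 * v for v in glottal_derivative_pulse(200, 0.6)]
    grid = [k / 100 for k in range(30, 91)]  # compact interval [0.30, 0.90]
    print(estimate_open_quotient(target, grid))
```

Because the residual need not be convex in the open quotient, the outer step scans the whole interval rather than relying on a local descent; the inner step stays cheap because, for each fixed open quotient, the remaining fit is a convex least-squares problem.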

Earlier work in voice synthesis includes [18,33,35,39,81,94,129,170]; see also the KTH "Research Topics" home page.

"Virtual Acoustic Musical Instruments: Review and Update", by Julius O. Smith III, DRAFT to be submitted to the Journal of New Music Research, special issue for the Stockholm Musical Acoustics Conference (SMAC-03).
Copyright © 2005-12-28 by Julius O. Smith III
Center for Computer Research in Music and Acoustics (CCRMA), Stanford University