Glottal source modeling for singing voice synthesis
Natural sound quality is essential for singing-voice synthesis. Because 95% of singing is voiced sound (Cook, 1990), this paper focuses on improving the naturalness of vowel tone quality through glottal excitation modeling. We propose using the LF-model (Fant et al., 1985) for the glottal wave shape, combined with pitch-synchronous, amplitude-modulated Gaussian noise that adds an aspiration component to the glottal excitation. The paper also presents the associated analysis and synthesis procedures. By analyzing baritone recordings, we found simple rules for changing voice quality from “laryngealized” (or “pressed”), through normal, to “breathy” phonation.
You can download my ICMC2000 paper for details.
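As a rough illustration of the excitation structure described in the abstract (a periodic glottal pulse plus pitch-synchronous, amplitude-modulated Gaussian noise), here is a minimal Python sketch. It is not the procedure from the paper. Several things are assumptions made only for illustration: a Rosenberg-style pulse stands in for the LF-model (whose parameters require solving implicit equations), the noise envelope is assumed to follow the glottal flow, and the names `rosenberg_pulse`, `glottal_excitation`, `noise_gain`, and `mod_depth` are hypothetical.

```python
import numpy as np

def rosenberg_pulse(n_samples, open_quotient=0.6, speed_quotient=2.0):
    """One period of a Rosenberg-style glottal flow pulse.

    ASSUMPTION: used here as a simple stand-in for the LF-model waveform.
    """
    n_open = int(round(open_quotient * n_samples))
    n_rise = int(round(n_open * speed_quotient / (1.0 + speed_quotient)))
    n_fall = n_open - n_rise
    pulse = np.zeros(n_samples)
    # Opening phase: raised-cosine rise from 0 toward 1.
    pulse[:n_rise] = 0.5 * (1.0 - np.cos(np.pi * np.arange(n_rise) / n_rise))
    # Closing phase: quarter-cosine fall from 1 toward 0.
    pulse[n_rise:n_open] = np.cos(0.5 * np.pi * np.arange(n_fall) / n_fall)
    # Remaining samples stay at zero (closed phase).
    return pulse

def glottal_excitation(f0=110.0, fs=44100, n_periods=20,
                       noise_gain=0.05, mod_depth=1.0, seed=0):
    """Return (voiced, aspiration, total) excitation signals.

    noise_gain and mod_depth are hypothetical knobs, not the paper's rules.
    """
    rng = np.random.default_rng(seed)
    period = int(round(fs / f0))
    flow = rosenberg_pulse(period)

    # ASSUMPTION: the noise envelope follows the glottal flow, so the
    # aspiration is strongest while the glottis is open.
    envelope = (1.0 - mod_depth) + mod_depth * flow / flow.max()

    flow_all = np.tile(flow, n_periods)
    env_all = np.tile(envelope, n_periods)  # repeats every pitch period

    # Voiced part: first difference of the flow, normalized to unit peak.
    voiced = np.diff(flow_all, prepend=0.0)
    voiced /= np.max(np.abs(voiced))

    # Aspiration part: Gaussian noise shaped by the periodic envelope.
    aspiration = noise_gain * env_all * rng.standard_normal(flow_all.size)

    return voiced, aspiration, voiced + aspiration

if __name__ == "__main__":
    voiced, aspiration, total = glottal_excitation()
    print(total.shape)
```

In this sketch, raising `noise_gain` moves the result in a noisier, more aspirated direction, while setting it to zero leaves only the periodic pulse train. The actual rules relating model parameters to pressed, normal, and breathy phonation are those reported in the paper and are not reproduced here.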