Glottal source modeling for singing voice synthesis
Natural sound quality is essential for singing voice synthesis. Because 95% of
singing is voiced sound (Cook, 1990), this paper focuses on improving the
naturalness of vowel tone quality through glottal excitation modeling. We
propose to use the LF-model (Fant et al., 1985) for the glottal wave shape in
conjunction with pitch-synchronous, amplitude-modulated Gaussian noise, which
adds an aspiration component to the glottal excitation. The associated analysis
and synthesis procedures are also provided in this paper. By analyzing baritone
recordings, we found simple rules for shifting voice quality from
“laryngealized” (or “pressed”) phonation, through normal, to “breathy” phonation.
You can download my ICMC 2000 paper for details.
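
To make the excitation described above concrete, here is a minimal sketch in Python of one way to build it: a single period of the LF-model flow derivative, plus Gaussian noise whose amplitude is modulated once per pitch period. It is an illustration under stated assumptions, not the analysis/synthesis procedure from the paper. The function names, the default timing parameters (tp, te, ta, given as fractions of the period), the raised-cosine noise envelope, and its centre, width, and gain are all hypothetical choices made for this example.

```python
import math
import random

def lf_pulse(n, tp=0.45, te=0.60, ta=0.02, ee=1.0):
    """One period (n samples) of the LF-model glottal flow derivative.

    Times are fractions of the period (tc = 1). Assumes tp < te < 2*tp
    and ta much smaller than 1 - te. Default values are illustrative only.
    """
    tc = 1.0
    wg = math.pi / tp
    d = tc - te
    # Return-phase constant: solve eps*ta = 1 - exp(-eps*(tc - te)).
    eps = 1.0 / ta
    for _ in range(50):
        eps = (1.0 - math.exp(-eps * d)) / ta
    # Closed-form area of the return phase (negative).
    ret_area = -(ee / (eps * ta)) * (
        (1.0 - math.exp(-eps * d)) / eps - d * math.exp(-eps * d))

    def e0_for(alpha):
        # Scale so that the open phase reaches -ee exactly at te.
        return -ee / (math.exp(alpha * te) * math.sin(wg * te))

    def net_area(alpha):
        # Closed-form integral of e0*exp(alpha*t)*sin(wg*t) over [0, te].
        integral = (math.exp(alpha * te)
                    * (alpha * math.sin(wg * te) - wg * math.cos(wg * te))
                    + wg) / (alpha ** 2 + wg ** 2)
        return e0_for(alpha) * integral + ret_area

    # Bisection for the alpha that gives zero net flow over the period.
    lo, hi = -50.0, 50.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if net_area(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    alpha = 0.5 * (lo + hi)
    e0 = e0_for(alpha)

    out = []
    for i in range(n):
        t = i / n
        if t <= te:   # open phase: growing sinusoid
            out.append(e0 * math.exp(alpha * t) * math.sin(wg * t))
        else:         # return phase: exponential recovery to zero
            out.append(-(ee / (eps * ta))
                       * (math.exp(-eps * (t - te)) - math.exp(-eps * d)))
    return out

def aspiration_noise(n, center=0.6, width=0.5, gain=0.05, rng=None):
    """Gaussian noise for one period, shaped by a raised-cosine envelope.

    The envelope shape and position are assumptions of this sketch.
    """
    if rng is None:
        rng = random.Random(0)
    out = []
    for i in range(n):
        t = i / n
        # Circular distance from the envelope centre, in periods.
        dist = abs((t - center + 0.5) % 1.0 - 0.5)
        if dist < width / 2:
            env = 0.5 * (1.0 + math.cos(2.0 * math.pi * dist / width))
        else:
            env = 0.0
        out.append(gain * env * rng.gauss(0.0, 1.0))
    return out

def excitation(periods, n, noise_gain=0.05, seed=0):
    """Concatenate periods of LF pulse plus pitch-synchronous noise."""
    rng = random.Random(seed)
    pulse = lf_pulse(n)
    out = []
    for _ in range(periods):
        noise = aspiration_noise(n, gain=noise_gain, rng=rng)
        out.extend(p + q for p, q in zip(pulse, noise))
    return out
```

For example, excitation(3, 200) returns 600 samples covering three pitch periods. In this sketch, raising noise_gain adds more aspiration and a larger ta softens the closure; these knobs are only a rough stand-in for a pressed-to-breathy control and are not the rules reported in the paper.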