This appendix is adapted from the original paper describing the PARSHL program [271] for sinusoidal modeling of audio. While many of the main points are summarized elsewhere in the text, the PARSHL paper is included here as a source of more detailed information on carrying out elementary sinusoidal modeling of sound based on the STFT.
As mentioned in §G.7.1, the phase vocoder was a widely used analysis tool for additive synthesis starting in the 1970s. A difficulty with the phase vocoder, as traditionally implemented, is that it uses a fixed uniform filter bank. While this works well for periodic signals, it is relatively inconvenient for inharmonic signals. An "inharmonic phase vocoder" called PARSHL was developed in 1985 to address this problem in the context of piano signal modeling [271]. PARSHL worked by tracking peaks in the short-time Fourier transform (STFT), thereby synthesizing an adaptive inharmonic FIR filter bank, replacing the fixed uniform filter bank of the vocoder. In other respects, PARSHL could be regarded as a phase-vocoder analysis program.
The PARSHL program converted an STFT to a set of amplitude and frequency envelopes for inharmonic, quasi-sinusoidal-sum signals. Only the most prominent peaks in the spectrum of the input signal were tracked. For quasi-harmonic sounds, such as the piano, the amplitudes and frequencies were sampled approximately once per period of the lowest frequency in the analysis band. For resynthesis, PARSHL supported additive synthesis [233] using an oscillator bank, overlap-add reconstruction from the STFT, or both.
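To make the idea of keeping "only the most prominent peaks" concrete, the following is a minimal sketch of peak picking in a single windowed spectrum. It is not PARSHL's code: the function names (`hann`, `dft_magnitudes`, `prominent_peaks`, `analyze_frame`), the choice of a Hann window, the simple local-maximum test, and the use of a naive DFT in place of an FFT are all assumptions made for illustration. Frequencies are reported only to the nearest bin, with no interpolation between bins.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n (an assumed window choice)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def dft_magnitudes(frame):
    """Magnitudes of the non-negative-frequency bins of a naive O(N^2) DFT."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        mags.append(abs(acc))
    return mags


def prominent_peaks(mags, max_peaks):
    """Bin indices of the max_peaks largest local maxima, in frequency order."""
    candidates = [k for k in range(1, len(mags) - 1)
                  if mags[k] > mags[k - 1] and mags[k] >= mags[k + 1]]
    candidates.sort(key=lambda k: mags[k], reverse=True)
    return sorted(candidates[:max_peaks])


def analyze_frame(samples, sample_rate, max_peaks):
    """Return (frequency in Hz, amplitude) pairs for the most prominent peaks."""
    n = len(samples)
    window = hann(n)
    mags = dft_magnitudes([s * w for s, w in zip(samples, window)])
    # A bin-centered sinusoid of amplitude A under a Hann window peaks at A*n/4,
    # so multiplying by 4/n recovers A in that idealized case.
    scale = 4.0 / n
    return [(k * sample_rate / n, mags[k] * scale)
            for k in prominent_peaks(mags, max_peaks)]
```

Calling `analyze_frame` on successive overlapping frames of a signal yields one list of (frequency, amplitude) pairs per frame; linking those pairs from frame to frame is what produces the amplitude and frequency envelopes described above.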
PARSHL followed the amplitude, frequency, and phase of the most prominent peaks over time in a series of spectra, computed using the Fast Fourier Transform (FFT). The synthesis part of the program used the analysis parameters, or modified versions of them, to generate a sinewave in the output for each peak track found.
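The synthesis stage, one sinewave per peak track, can be sketched as a small oscillator bank. This is an illustrative sketch under stated assumptions, not PARSHL's implementation: the names `synthesize_track` and `oscillator_bank`, the linear interpolation of amplitude and frequency between frames, and the choice to start each oscillator at zero phase (ignoring any measured phase) are all hypothetical.

```python
import math


def synthesize_track(freqs_hz, amps, hop, sample_rate):
    """Render one peak track as a sinewave with piecewise-linear envelopes.

    freqs_hz[i] and amps[i] are the track's values at frame i, and
    consecutive frames are hop samples apart (assumed layout).
    """
    out = []
    phase = 0.0  # assumption: start at zero phase, ignoring measured phase
    for i in range(len(freqs_hz) - 1):
        for n in range(hop):
            t = n / hop
            amp = (1 - t) * amps[i] + t * amps[i + 1]
            freq = (1 - t) * freqs_hz[i] + t * freqs_hz[i + 1]
            out.append(amp * math.sin(phase))
            # Advance the phase by the instantaneous frequency in radians/sample.
            phase += 2 * math.pi * freq / sample_rate
    return out


def oscillator_bank(tracks, hop, sample_rate):
    """Sum the sinewaves of all tracks.

    tracks is a list of (freqs_hz, amps) pairs, all with the same frame count.
    """
    rendered = [synthesize_track(f, a, hop, sample_rate) for f, a in tracks]
    return [sum(samples) for samples in zip(*rendered)]
```

Modifying the analysis parameters before resynthesis amounts to editing the `freqs_hz` and `amps` lists, for example by scaling every frequency to transpose the sound, before passing the tracks to `oscillator_bank`.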
The steps carried out by PARSHL were as follows:
The following sections provide further details: