Preprocessing

The task of the program can be simplified, and the analysis/synthesis results improved, if the sound input is appropriately manipulated before running the program.

The most important step is to equalize the input signal, because equalization determines what it means to find spectral peaks in order of decreasing magnitude. Equalization can be accomplished in many ways; here we present some alternatives.

(1) A good equalization strategy for audio applications is to weight the incoming spectrum by the inverse of the equal-loudness contour for hearing at some nominal listening level (e.g. dB). This makes spectral magnitude ordering closer to perceptual audibility ordering.

(2) For more analytical work, the spectrum can be equalized to provide all partials at nearly the same amplitude (e.g., the asymptotic roll-off of all natural spectra can be eliminated). In this case, the peak finder is most likely to find and track all of the partials.

(3) A good equalization for noise-reduction applications is to "flatten" the noise floor. This option is useful when it is desired to set a fixed (frequency-independent) track rejection threshold just above the noise level.

(4) A fourth option is to perform adaptive equalization of types (2) or (3) above. That is, equalize each spectrum independently, or compute the equalization as a function of a weighted average of the most recent power spectrum (FFT squared magnitude) estimates.
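As an illustration of option (4), the following is a minimal sketch, not PARSHL's own code. It assumes the power spectra (FFT squared magnitudes) have already been computed for each frame. The names `smooth` and `adaptive_equalize`, the forgetting factor `forget`, and the smoothing width `width` are all hypothetical. Smoothing the averaged power across frequency bins is also an assumption made here, so that the equalizer follows the broad spectral envelope rather than cancelling the individual peaks.

```python
import math


def smooth(values, width):
    """Moving average across frequency bins; the window shrinks at the edges."""
    half = width // 2
    out = []
    for k in range(len(values)):
        lo, hi = max(0, k - half), min(len(values), k + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def adaptive_equalize(power_frames, forget=0.9, width=9, eps=1e-12):
    """Return equalized magnitude spectra, one per input power spectrum.

    power_frames: sequence of lists of FFT squared magnitudes (assumed input).
    forget:       weight of the running average (hypothetical parameter);
                  forget=0 equalizes each spectrum independently.
    width:        smoothing width in bins (hypothetical parameter).
    """
    avg = None
    result = []
    for power in power_frames:
        if avg is None:
            avg = list(power)  # first frame initializes the running average
        else:
            # Exponentially weighted average of the most recent power spectra.
            avg = [forget * a + (1.0 - forget) * p for a, p in zip(avg, power)]
        envelope = smooth(avg, width)
        # Divide out the envelope; the square root converts power to magnitude.
        result.append([math.sqrt(p / (e + eps)) for p, e in zip(power, envelope)])
    return result
```

With this sketch, a spectrum whose power matches the estimated envelope comes out at a magnitude near 1 in every bin, so a single frequency-independent threshold can be applied afterwards. Setting `forget=0` corresponds to equalizing each spectrum independently.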

Apart from equalization, another preprocessing strategy that has proven very useful is to reverse the sound in time. The attack of most sounds is quite "noisy," and PARSHL has a hard time finding the relevant partials in such a complex spectrum. Once the sound is reversed, the program encounters the end of the sound first. Since this is a very stable part of most instrumental sounds, the program finds a very clear definition of the partials there. By the time the program reaches the attack, it is already tracking the main partials. Since PARSHL has a fixed number of oscillators which can be allocated to discovered tracks, and since each track which disappears removes its associated oscillator from the scene forever,⁴ analyzing the sound tail to head tends to allocate the oscillators to the most important frequency tracks first.
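The time-reversal strategy can be sketched as follows. This is only an illustration: `analyze_reversed` and the `analyze` callable are hypothetical names, and `analyze` stands in for any frame-based analyzer that returns one result per frame. Returning the frames to forward-time order afterwards is an assumption about how the results would be used, not something the text specifies.

```python
def analyze_reversed(samples, analyze):
    """Analyze a sound tail to head, then return the frames in forward order.

    samples: list of audio samples in normal (forward) time order.
    analyze: hypothetical callable taking a list of samples and returning
             a list of per-frame results, in the order the frames were seen.
    """
    reversed_samples = samples[::-1]   # the analyzer sees the stable tail first
    frames = analyze(reversed_samples)
    return frames[::-1]                # restore forward-time frame order


def toy_analyze(samples, hop=2):
    """Toy stand-in analyzer: the peak absolute value of each block of samples."""
    return [max(abs(s) for s in samples[i:i + hop])
            for i in range(0, len(samples), hop)]


frames = analyze_reversed([1, 2, 3, 4, 5, 6], toy_analyze)
```

Note that only the frame order is restored here. Anything measured within a frame (for example, a phase or a slope) still refers to the reversed signal and would need its own correction.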

"PARSHL: An Analysis/Synthesis Program for Non-Harmonic Sounds Based on a Sinusoidal Representation", by Julius O. Smith III and Xavier Serra, Proceedings of the International Computer Music Conference (ICMC-87, Tokyo), Computer Music Association, 1987.
Copyright © 2005-12-28 by Julius O. Smith III and Xavier Serra
Center for Computer Research in Music and Acoustics (CCRMA), Stanford University