
The Phase Vocoder is an algorithm for timescale modification of audio. One way to understand it is as stretching or compressing the time base of a spectrogram, which changes the temporal characteristics of a sound while retaining its short-time spectral characteristics. If the spectrogram is narrowband (analysis window longer than a pitch cycle, so that the individual harmonics are resolved), then preserving the spectral characteristics implies preserving the pitch, which avoids the pitch drop heard when 'slowing down the tape'. The only complication is that the phase associated with each bin in the modified spectrogram has to be 'fixed up' to maintain the rate of phase change (dphase/dtime) of the original, which ensures the correct alignment of successive windows in the overlap-add reconstruction.
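
One common way to write this phase fix-up is sketched here. The notation is an assumption for illustration, not taken from the original page: X[k, n] is the STFT value in bin k of analysis frame n, N is the FFT length, H is the analysis hop in samples, and t_m is the fractional frame position sampled for output frame m, with n = floor(t_m) and alpha = t_m - n. The output phase starts from the first analysis frame, psi_k[0] = angle of X[k, 0].

```latex
\begin{align*}
\Delta\phi_k[n] &= \angle X[k, n+1] - \angle X[k, n] \\
\delta_k[n] &= \operatorname{princarg}\!\left(\Delta\phi_k[n] - \frac{2\pi k H}{N}\right),
  \qquad \operatorname{princarg}(\theta) = \theta - 2\pi\,\operatorname{round}\!\left(\frac{\theta}{2\pi}\right) \\
A_k[m] &= (1-\alpha)\,\lvert X[k, n]\rvert + \alpha\,\lvert X[k, n+1]\rvert \\
Y[k, m] &= A_k[m]\, e^{j\psi_k[m]} \\
\psi_k[m+1] &= \psi_k[m] + \frac{2\pi k H}{N} + \delta_k[n]
\end{align*}
```

The term 2*pi*k*H/N is the phase advance a sinusoid at the centre frequency of bin k would make over one hop. The wrapped deviation delta_k captures how far the actual component sits from that centre frequency, so the accumulated output phase advances at the same rate as the original, whatever the spacing of the sampled positions t_m.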

This implementation first calculates the short-time Fourier transform of the signal using 'stft'. 'pvsample' then builds a modified spectrogram array by sampling the original array at a sequence of fractional time values, interpolating the magnitudes and fixing up the phases as it goes along. The resulting time-frequency array can be inverted back into a sound with 'istft'. The 'pvoc' script is a wrapper that performs all three steps for a fixed time-scaling factor (larger than one to speed up; smaller than one to slow down). The underlying 'pvsample' routine would also support arbitrary timebase variation (freezing, reversal, modulation) if one wished to write a suitable interface to specify the time path.
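
The resampling step described above can be sketched roughly as follows. This is a minimal illustration written for this page, not the original pvsample.m: the function name pv_resample_sketch, its argument layout, and the assumption that the STFT array holds only the N/2+1 non-negative-frequency bins are all hypothetical.

```matlab
function Y = pv_resample_sketch(X, t, hop)
% Hypothetical helper for illustration only (not the original pvsample.m).
%   X   : STFT array, (N/2+1) bins x frames, non-negative frequencies only
%   t   : vector of fractional frame positions, 0-based, 0 <= t <= frames-1
%   hop : analysis hop size in samples
[nbins, ~] = size(X);
N = 2 * (nbins - 1);                      % FFT length implied by the bin count

% Phase advance over one hop for a sinusoid at each bin's centre frequency
dphi = 2 * pi * hop * (0:nbins-1)' / N;

X  = [X, zeros(nbins, 1)];                % pad so frame floor(t)+2 always exists
ph = angle(X(:, 1));                      % accumulator starts at first frame's phase
Y  = zeros(nbins, numel(t));

for m = 1:numel(t)
    n0   = floor(t(m));                   % analysis frame just below position t(m)
    frac = t(m) - n0;                     % fractional distance to the next frame
    a = X(:, n0 + 1);
    b = X(:, n0 + 2);

    % Linearly interpolate the magnitudes of the two neighbouring frames
    mag = (1 - frac) * abs(a) + frac * abs(b);

    % Output frame: interpolated magnitude with the accumulated phase
    Y(:, m) = mag .* exp(1i * ph);

    % Measured phase advance minus the expected one, wrapped to [-pi, pi]
    dp = angle(b) - angle(a) - dphi;
    dp = dp - 2 * pi * round(dp / (2 * pi));

    % Advance the accumulator by the original per-hop phase increment
    ph = ph + dphi + dp;
end
end
```

Under these assumptions, a fixed time-scaling factor r would correspond to a time path such as t = 0:r:(size(X,2)-2), while a path that repeats one value holds the spectrum of that frame. Sampling at the integer positions 0, 1, 2, ... returns the input array unchanged up to rounding error, which is a convenient sanity check.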

I analyzed the Matlab Phase Vocoder by Dan Ellis to learn and understand one possible implementation of it, in addition to the theory from the Music 420 reader.

Code

pvoc.m - the top-level routine
stft.m - calculate the STFT time-frequency representation
pvsample.m - interpolate/reconstruct the new STFT on the modified timebase
istft.m - overlap-add the modified STFT back into a waveform

Here's an example of how to use pvoc to slow down a soundfile of voice (sampled at 16 kHz) to 3/4 speed:
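
A minimal sketch of such a call, under stated assumptions: the four .m files listed above are on the Matlab path, pvoc is called as pvoc(signal, factor, fft_length), and 'voice16k.wav' is a placeholder file name rather than a file from the original page.

```matlab
% 'voice16k.wav' is a placeholder name for a 16 kHz voice recording.
[d, sr] = audioread('voice16k.wav');   % d: samples x channels, sr: sample rate in Hz
d = d(:, 1);                           % keep a single channel

r = 0.75;                              % factor below one slows the sound down
n = 1024;                              % FFT length: 64 ms at 16 kHz

% Assumed call form: signal, time-scaling factor, FFT length
y = pvoc(d, r, n);

% Compare the original with the time-stretched version
soundsc(d, sr);
pause(length(d) / sr + 0.5);           % wait for the first sound to finish
soundsc(y, sr);
```

A 1024-point window spans 64 ms at 16 kHz, which covers several pitch cycles of typical speech, so the analysis is narrowband in the sense described in the introduction.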
