Next: Summary and Future Directions Up: BAYESIAN TWO SOURCE MODELING Previous: Bayesian Framework Application

A Pathological Musical Example

We have applied the current Bayesian Two Source Modeling (BTSM) technique to a pathological musical example that is otherwise particularly difficult to deal with. We consider a musical trio in which two sources are always active, and each plays the same note in the same octave. The samples are of clarinet, violin, and cello, and come from the Iowa samples database [7]. As can be seen from the spectrograms in figure 1, the clarinet and violin first play together, then the clarinet and cello, and finally the violin and cello. The mixing was done synthetically as specified by the DUET signal model, with mixing parameters:
source   $a_i$   $\delta_i$
1        1.05    -9.07e-5
2        1.01    -2.27e-5
3        0.9      6.80e-5
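As a rough illustration of how such a synthetic mixture could be produced, the sketch below applies the usual two-channel DUET model, $X_1 = \sum_i S_i$ and $X_2 = \sum_i a_i e^{-j\omega\delta_i} S_i$, in the frequency domain. It is not the code used for the experiment. The function name `duet_mix`, the interpretation of $\delta_i$ as a delay in seconds, the 44.1 kHz sample rate, and the sign convention of the delay are all assumptions that this section does not confirm. The delay is applied circularly over the whole signal, which is adequate only for a sketch.

```python
import numpy as np

def duet_mix(sources, a, delta, fs):
    """Hypothetical helper: mix sources into two channels, DUET style.

    sources : array of shape (n_sources, n_samples)
    a       : per-source attenuation of channel 2 relative to channel 1
    delta   : per-source delay of channel 2, ASSUMED to be in seconds
    fs      : sample rate in Hz (ASSUMED, not given in the text)
    """
    sources = np.asarray(sources, dtype=float)
    n = sources.shape[1]
    # Angular frequency (rad/s) of each rFFT bin.
    omega = 2.0 * np.pi * np.fft.rfftfreq(n, d=1.0 / fs)
    S = np.fft.rfft(sources, axis=1)
    a = np.asarray(a, dtype=float)[:, None]
    delta = np.asarray(delta, dtype=float)[:, None]
    X1 = S.sum(axis=0)                                   # channel 1: plain sum
    X2 = (a * np.exp(-1j * omega * delta) * S).sum(axis=0)  # scaled, delayed sum
    return np.fft.irfft(X1, n), np.fft.irfft(X2, n)

# Mixing parameters from the table above.
A = [1.05, 1.01, 0.9]
DELTA = [-9.07e-5, -2.27e-5, 6.80e-5]
```

Under the assumed 44.1 kHz rate, these delays would correspond to roughly -4, -1, and +3 samples, which is consistent with the values in the table but remains an inference.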
To prepare the system, we first processed excerpts (segmentation courtesy of Pamornpol 'Tak' Jinachitra) from all of the files for each instrument to estimate the variance of each source's STFT magnitude coefficients. We then processed all points in STFT space for the test file containing $X_1$ and $X_2$, calculating $p(u,v \mid D)$ using the Bayesian approach above and including null sources to allow a single-source output. We used a uniform prior $p(u,v)$, indicating no preference for the activity of any one or two of the three sources.
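The hypothesis space and uniform prior described here can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation. It assumes the null source is encoded as index 0, that each hypothesis is an unordered pair $(u,v)$ of distinct entries from {null, 1, 2, 3}, and that the per-hypothesis likelihoods $p(D \mid u,v)$ are supplied by the Bayesian framework of the previous section. The names `hypotheses` and `posterior` are hypothetical.

```python
from itertools import combinations
import numpy as np

NULL = 0             # ASSUMED encoding of the null source
SOURCES = [1, 2, 3]  # clarinet, violin, cello in some fixed order

def hypotheses():
    """All unordered pairs (u, v): one real source plus null, or two real sources."""
    return list(combinations([NULL] + SOURCES, 2))

def posterior(likelihoods):
    """Posterior p(u,v | D) at one STFT point, given p(D | u,v) per hypothesis.

    With a uniform prior the posterior is the normalized likelihood.
    """
    hyps = hypotheses()
    prior = np.full(len(hyps), 1.0 / len(hyps))   # uniform p(u,v)
    unnorm = np.asarray(likelihoods, dtype=float) * prior
    return dict(zip(hyps, unnorm / unnorm.sum()))
```

Under these assumptions there are six hypotheses: three with a single active source and three with two active sources.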
Figure 2: The separated original mixtures achieved by the previous DUET approach and the new BTSM approach.
The spectrograms in figure 2 and the output SNRs (in dB) in the table below show the results achieved by the DUET system and the current BTSM system. Though the DUET system often separates some of the frequency components correctly, its single-active-source constraint becomes a liability when most frequency components of the sources overlap. Indeed, the figure shows cases where components enter or exit sharply, a highly audible phenomenon. The BTSM approach achieves a much higher SNR, 7.1 to 12.0 dB above DUET for every source, and allows two sources to share a frequency component. It sometimes chooses the two active sources incorrectly, giving data to the violin, for example, when only the clarinet and cello are active. More often than not, however, the system guesses correctly which two sources are active, and its errors are less audible. Time-domain envelope plots (omitted for lack of space) confirm these observations.
source   Input SNR (dB)   DUET SNR (dB)   BTSM SNR (dB)
1         -0.4              7.1            15.7
2        -13.2             -6.0             1.1
3         -0.5              6.3            18.3
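The improvements implied by the table can be checked with a few lines of arithmetic. The sketch below also includes one common definition of SNR in dB, the ratio of reference energy to error energy. The text does not state which definition was used, so `snr_db` is an assumption, and both function names are hypothetical.

```python
import numpy as np

def snr_db(reference, estimate):
    """ASSUMED definition: 10*log10(reference energy / error energy)."""
    reference = np.asarray(reference, dtype=float)
    error = reference - np.asarray(estimate, dtype=float)
    return 10.0 * np.log10(np.sum(reference**2) / np.sum(error**2))

# (input SNR, DUET SNR, BTSM SNR) in dB, copied from the table above.
SNR_TABLE = {1: (-0.4, 7.1, 15.7), 2: (-13.2, -6.0, 1.1), 3: (-0.5, 6.3, 18.3)}

def snr_gains(table):
    """Per source: (DUET gain over input, BTSM gain over input, BTSM gain over DUET)."""
    return {
        src: (round(duet - inp, 1), round(btsm - inp, 1), round(btsm - duet, 1))
        for src, (inp, duet, btsm) in table.items()
    }

for src, (duet_gain, btsm_gain, btsm_vs_duet) in snr_gains(SNR_TABLE).items():
    print(f"source {src}: DUET +{duet_gain} dB, BTSM +{btsm_gain} dB, "
          f"BTSM over DUET +{btsm_vs_duet} dB")
```

From the tabulated values, BTSM exceeds DUET by 8.6, 7.1, and 12.0 dB for sources 1, 2, and 3 respectively.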

Aaron S. Master 2003-10-30