
Bayesian Framework Application

As mentioned above, knowledge of the distributions on $\vert S_u\vert$ and $\vert S_v\vert$ can be used to create distributions on the $\vert Y_i\vert$ values from equations 9 through 11. The distributions on $\vert S_i\vert$ need not be Gaussian for the technique described here to apply. Assuming that they are Gaussian simplifies the situation, however, and informal experiments have shown that we may fairly model the amplitude of the STFT coefficients this way by treating their distribution as the positive half of a zero-mean Gaussian. Specifically, it may be shown that doing so leads to
$\displaystyle \vert\hat{Y}_{i=u}\vert \sim N(0,\ \sigma_v^2\cdot \vert\alpha_{uv}\vert^2)$ (14)
$\displaystyle \vert\hat{Y}_{i=v}\vert \sim N(0,\ \sigma_u^2\cdot \vert\alpha_{vu}\vert^2)$ (15)
$\displaystyle \vert\hat{Y}_{i\neq(u\vert v)}\vert \sim N(0,\ \sigma_u^2\cdot \vert\alpha_{iu}\vert^2 + \sigma_v^2\cdot \vert\alpha_{iv}\vert^2)$ (16)

where $\sigma_i^2$ represents the variance for source $i$ at a given frequency. (We again recall that these distributions, and their related $\sigma^2$ and $\alpha$ values, are source dependent and different for each frequency. Clearly, $\sigma^2$ will be larger for frequencies corresponding to the active range of a given voice or instrument.) We may then calculate the probability that a set of data $D$ (in the form of $\vert Y_i\vert$ values given by equation 4) was generated by the presence of sources $S_u$ and $S_v$ via:
$\displaystyle p(D\vert u,v) = \prod_{i=1}^{N} p(\vert Y_i\vert \vert u,v)$ (17)

where

\begin{eqnarray*}
p(\vert Y_i\vert \vert u,v) &=& \frac{1}{\sqrt{2\pi\,\mathrm{var}(\hat{Y}_i\vert u,v)}} \exp\left[\frac{-\vert Y_i\vert^2}{2\,\mathrm{var}(\hat{Y}_i\vert u,v)}\right]
\end{eqnarray*}



and $\mathrm{var}(\hat{Y}_i\vert u,v)$ refers to the variance in the distributions of expressions 14 through 16. To achieve our goal of determining the two most likely sources at a given point in time-frequency space, we first compute $p(D\vert u,v)$ for the point's $\vert Y_i\vert$ values using equation 17, considering every possible $(u,v)$ combination. We then substitute the result into equation 13, which allows us to take prior probabilities into account. By allowing $u$ or $v$ to be ``NULL'' and assigning a value corresponding to the noise floor as the variance of $S_{\mathrm{NULL}}$, we can effectively include the one-source combinations used in the DUET system as well.
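The procedure just described may be illustrated with a minimal Python sketch for a single time-frequency point. It is not the implementation used for the experiments, and several details are assumptions made only for illustration: the function names (`predicted_variance`, `log_likelihood`, `most_likely_pair`), the dictionary layout for $\sigma^2$, $\alpha$ and $\vert Y_i\vert$, the use of unit $\alpha$ values for the ``NULL'' source, and the reading of equation 13 as the usual Bayes-rule form in which the posterior is proportional to likelihood times prior. The density is taken to be the zero-mean Gaussian with $\vert Y_i\vert^2$ in the exponent, and the product of equation 17 is accumulated as a sum of logarithms to avoid numerical underflow.

```python
import itertools
import math

NULL = "NULL"  # hypothetical label for the "no source" hypothesis


def predicted_variance(i, u, v, sigma2, alpha):
    """Variance of |Y_i| under hypothesis (u, v), following expressions 14-16.

    sigma2: dict mapping source label -> variance at this frequency
    alpha:  dict mapping (i, j) -> alpha_ij (real or complex); assumed layout
    """
    if i == u:
        return sigma2[v] * abs(alpha[(u, v)]) ** 2
    if i == v:
        return sigma2[u] * abs(alpha[(v, u)]) ** 2
    return (sigma2[u] * abs(alpha[(i, u)]) ** 2
            + sigma2[v] * abs(alpha[(i, v)]) ** 2)


def log_likelihood(y_abs, u, v, sigma2, alpha):
    """log p(D | u, v): the product of equation 17, summed in the log domain."""
    total = 0.0
    for i, y in y_abs.items():
        var = predicted_variance(i, u, v, sigma2, alpha)
        total += -0.5 * math.log(2.0 * math.pi * var) - y ** 2 / (2.0 * var)
    return total


def most_likely_pair(y_abs, sigma2, alpha, log_prior=None):
    """Return the (u, v) pair with the highest log posterior.

    log_prior: optional callable (u, v) -> log p(u, v); uniform if omitted.
    The likelihood is symmetric in u and v, so unordered pairs suffice.
    """
    def score(pair):
        u, v = pair
        prior = log_prior(u, v) if log_prior is not None else 0.0
        return log_likelihood(y_abs, u, v, sigma2, alpha) + prior

    return max(itertools.combinations(sigma2.keys(), 2), key=score)


# Illustrative values only (not taken from any experiment).
sigma2 = {1: 4.0, 2: 1.0, 3: 0.25, NULL: 1e-4}  # NULL variance = noise floor
alpha = {(i, j): 1.0 for i in (1, 2, 3) for j in (1, 2, 3, NULL) if i != j}
y_abs = {1: 1.0, 2: 2.0, 3: 2.2}

print(most_likely_pair(y_abs, sigma2, alpha))
```

Because `NULL` is simply one more key in `sigma2`, pairs such as `(1, NULL)` are scored alongside the genuine two-source pairs, which is how the one-source case enters the search. A non-uniform prior can be supplied through `log_prior` without changing the likelihood code.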
Figure 1: The original mixtures $X_1$ and $X_2$ and the source performances $S_1$ (clarinet), $S_2$ (violin), and $S_3$ (cello) used to make them.

Aaron S. Master 2003-10-30