When the mixture contains multiple sources, the system will ideally emit bases close enough to the original basis components of each source that the features described above can be used for identification in a physiologically intuitive way. A strikingly successful example is a mixture of Oboe and Bb-Clarinet playing the note C4 concurrently, with the Bb-Clarinet lasting about 0.5 second longer. Despite the identical pitch and very similar sound, the spectral bases are rather well separated and are readily identified by comparison with the first ISA spectral bases of the isolated tones. Using the classifier in section 3, 7 out of 8 components are classified correctly, with transient components matching similar components in the trained prototypes. Admittedly, however, the sources must still be sufficiently non-overlapping, either temporally or spectrally, for such a clean separation.
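The comparison of mixture bases against isolated-tone bases can be illustrated with a minimal sketch. This is not the classifier of section 3: it swaps in a plain cosine-similarity nearest-prototype match, and the function names (`match_bases`, `toy_spectrum`), the bin layout, and the harmonic amplitudes are all invented for illustration rather than measured from real instruments.

```python
import numpy as np

def cosine_similarity_matrix(a, b, eps=1e-12):
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a_unit = a / (np.linalg.norm(a, axis=1, keepdims=True) + eps)
    b_unit = b / (np.linalg.norm(b, axis=1, keepdims=True) + eps)
    return a_unit @ b_unit.T

def match_bases(mixture_bases, prototype_bases, prototype_labels):
    """Assign each mixture basis the label of its most similar prototype.

    ICA-style bases are defined only up to sign and scale, so the
    comparison uses absolute values and normalized vectors.
    """
    sims = cosine_similarity_matrix(np.abs(mixture_bases), np.abs(prototype_bases))
    best = np.argmax(sims, axis=1)
    return [(prototype_labels[j], float(sims[i, j])) for i, j in enumerate(best)]

def toy_spectrum(harmonics, n_bins=256, f0_bin=10, width=1.0):
    """Synthetic spectral basis: Gaussian peaks at multiples of f0_bin."""
    bins = np.arange(n_bins)
    spec = np.zeros(n_bins)
    for number, amplitude in harmonics:
        spec += amplitude * np.exp(-0.5 * ((bins - number * f0_bin) / width) ** 2)
    return spec

# Hypothetical prototypes with the same fundamental but different harmonic
# profiles. The amplitudes are made up, not taken from recordings.
oboe = toy_spectrum([(1, 1.0), (2, 0.8), (3, 0.6), (4, 0.5)])
clarinet = toy_spectrum([(1, 1.0), (3, 0.7), (5, 0.4)])
prototypes = np.stack([oboe, clarinet])
labels = ["oboe", "clarinet"]

# Stand-in for bases recovered from a mixture: noisy copies of the prototypes.
rng = np.random.default_rng(0)
mixture_bases = np.stack([clarinet, oboe]) + 0.02 * np.abs(rng.normal(size=(2, 256)))

matches = match_bases(mixture_bases, prototypes, labels)
for label, score in matches:
    print(f"{label}: cosine similarity {score:.3f}")
```

In this toy setting the two bases share a fundamental yet remain distinguishable through their harmonic profiles, which mirrors the same-pitch situation described above. With real recordings, the similarity measure and the prototype set would have to be chosen far more carefully.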
Although more than one basis may belong to the spanning set of a single source subspace, we stop short of grouping them, because clustering components that belong to the same source is problematic. This is not only because of the difficulty, for a complex mixture, of estimating a reliable similarity measure such as the one used successfully in [7], but also because such measures simply cannot group the transient and steady-state components of the same source together.
The advantage of such a data-driven algorithm lies in its ability to perform auditory grouping with no extra rules [7]. It does not rely on pitch estimation, which is hard to do in a polyphonic signal. Its drawbacks, however, include its reliance on an exposure long enough for meaningful components to be learned. The current linear model is also limiting and holds only approximately when magnitudes are used. There is also no guarantee that the bases derived from a mixture will be the same as those learned from a single instrument, or even that they will be separated as in the example above. In our experiments, such failures occur from time to time and bring down the identification performance. For example, beating between nearby harmonics can cause the algorithm to yield a basis that cannot be identified with any of the individual sources. In the next section, we examine how well the system performs in light of these potential obstacles.
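Two of the obstacles above, the merely approximate additivity of magnitudes and beating between nearby partials, can be made concrete with a small numerical sketch. It is a toy experiment under stated assumptions, not part of the system described here: the two frequencies (440 Hz and 444 Hz), the sample rate, and the frame parameters are arbitrary choices, and `magnitude_frames` is a hypothetical helper.

```python
import numpy as np

def magnitude_frames(x, frame_len=1024, hop=512):
    """Magnitude spectra of Hann-windowed frames (a bare-bones STFT)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1))

# Assumed setup: two partials 4 Hz apart, sampled at 8 kHz for one second.
sr = 8000
t = np.arange(sr) / sr
x1 = np.sin(2 * np.pi * 440 * t)
x2 = np.sin(2 * np.pi * 444 * t)

mag_mix = magnitude_frames(x1 + x2)                     # what the system observes
mag_sum = magnitude_frames(x1) + magnitude_frames(x2)   # what a linear model assumes

# By the triangle inequality the mixture magnitude never exceeds the sum of
# the source magnitudes; the gap depends on the relative phase of the partials.
gap = mag_sum - mag_mix

# Track the bin where the partials overlap most strongly across frames.
peak_bin = int(np.argmax(mag_sum.mean(axis=0)))
print("peak bin:", peak_bin)
print("mixture magnitude per frame:", np.round(mag_mix[:, peak_bin], 1))
print("summed magnitudes per frame:", np.round(mag_sum[:, peak_bin], 1))
```

The summed source magnitudes stay nearly constant from frame to frame, whereas the mixture magnitude in the same bin rises and falls at the beat rate. A decomposition that assumes additive magnitudes has to explain this fluctuation somehow, and one possible outcome is an extra basis that belongs to neither source.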