Machine recognition of musical features in an audio signal is an important and desirable tool, both for the identification and retrieval of a given musical recording and for the comparison of multiple recordings. A robust machine recognition system is critical for successful multimedia database management and useful for interactive music systems. It also suggests new research opportunities in music theory, analysis and musicology.
Music is structured at a variety of levels, ranging from the quasi-periodicity of frequency to the macro level of large-scale musical form. Intermediate levels include structures such as motives and phrases. These structures constitute the salient perceptual units that listeners use to assess similarity, both within a given piece and between pieces. While human listeners readily distinguish recurrence and contrast in music, the same task has proven elusive in machine listening paradigms.
In this talk a methodology is proposed that attempts to derive salient, hierarchical musical structures from a raw audio signal by assessing the degree of novelty and redundancy throughout the signal.
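The abstract does not specify the underlying computation; the following is a minimal sketch of one common way to quantify novelty and redundancy in an audio signal, using a frame-level self-similarity matrix and a checkerboard-kernel novelty curve. It assumes the numpy and librosa libraries, a hypothetical input file "recording.wav", and illustrative parameter values; it is not presented as the speaker's actual method.

import numpy as np
import librosa

# Hypothetical input file; any mono recording works.
y, sr = librosa.load("recording.wav", sr=22050)

# Frame-level spectral features (MFCCs here) as the signal representation.
features = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)   # shape: (13, n_frames)

# Self-similarity matrix: cosine similarity between every pair of frames.
# Highly similar (redundant) regions form bright blocks along the diagonal.
unit = features / (np.linalg.norm(features, axis=0, keepdims=True) + 1e-9)
ssm = unit.T @ unit                                       # shape: (n_frames, n_frames)

def checkerboard(size):
    """Kernel contrasting within-segment similarity against cross-boundary similarity."""
    sign = np.sign(np.arange(size) - size // 2 + 0.5)
    return np.outer(sign, sign)

kernel_size = 64            # illustrative context length, in frames
kernel = checkerboard(kernel_size)
half = kernel_size // 2

# Novelty curve: correlate the checkerboard kernel along the main diagonal.
# Peaks mark candidate boundaries; low-novelty spans indicate redundancy.
n_frames = ssm.shape[0]
novelty = np.zeros(n_frames)
for i in range(half, n_frames - half):
    window = ssm[i - half:i + half, i - half:i + half]
    novelty[i] = np.sum(window * kernel)

Varying the kernel size trades off resolution against scale: small kernels respond to local events such as note or phrase boundaries, while large kernels respond to section-level contrasts, which is one way a hierarchy of structural levels can be recovered.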