Modeling Spectral and Temporal Structure in Sound Mixtures

Date: Wed, 03/31/2010 - 4:30pm

Location: CCRMA Classroom [Knoll 217]

Event Type: DSP Seminar

Presenter: Gautham J. Mysore

Mathematical modeling of sounds has been an ongoing pursuit for decades. There is a great deal of structure in audio, and good models need to make use of this structure. In particular, audio has strong spectral and temporal structure. When dealing with sound mixtures, the structure of the individual sources becomes particularly important if we wish to deal with them separately. In recent years, dictionary learning methods such as non-negative matrix factorization (NMF) and probabilistic latent component analysis (PLCA) have become quite popular, as they provide a rich representation of audio spectra and are amenable to high-quality reconstruction of sounds. However, they fail to provide a statistical description of the temporal structure. On the other hand, Hidden Markov Models (HMMs) have been used for decades to model temporal structure. They can be very powerful for audio analysis, as shown by their application to speech recognition. However, they have several limitations when it comes to reconstruction, which is an issue if we desire high-quality audio output. We propose a new algorithm that combines the best of both worlds. The proposed method learns several small dictionaries that characterize the spectral structure of a given sound and, jointly with them, the temporal structure of that sound. As in NMF and PLCA, the dictionary elements are all non-negative, which gives them a semantic interpretation and allows non-destructive mixing of the dictionary elements. The method additionally imposes a hierarchical structure on the dictionaries. We use this algorithm to decompose sounds, process the individual parts, and reconstruct them. This is demonstrated on content-aware audio processing. For example, we change a major arpeggio to a minor arpeggio. We then propose a method of modeling sound mixtures by combining models of individual sources. This can be used for various applications, as sound mixtures are commonly encountered. We demonstrate it on the application of source separation.

FREE

Open to the Public

