Tom Walters - The Intervalgram: An Audio Feature for Large-scale Melody Recognition

Date:

Fri, 04/27/2012 - 1:15pm - 2:30pm

Location:

CCRMA Seminar Room

Event Type:

Hearing Seminar

Melody recognition is hard, not least because a song can be transposed without changing its basic melody. It is easy to consider all transpositions, but the extra complexity is a real issue for large-scale melody recognition. Tom Walters, a research scientist at Google, will talk about tests he has done to scale melody recognition to very large databases. I'm sure you can imagine why this might be important to Google. :-)

    Who:    Tom Walters (Google)
    Why:    Melodies are an interesting feature of audio
    What:    The Intervalgram: An Audio Feature for Large-scale Melody Recognition
    When:    Friday April 27th at 1:15PM
    Where:    CCRMA Seminar Room (Top Floor of the Knoll)

Think about your favorite melodies and bring them to CCRMA.

- Malcolm

The Intervalgram: An Audio Feature for Large-scale Melody Recognition
Tom Walters (Google)

The ‘intervalgram’ is a summary of the local pattern of musical intervals in a segment of music. It is based on a chroma representation derived from the temporal profile of the stabilized auditory image and is made locally pitch invariant by means of a ‘soft’ pitch transposition to a local reference. Sets of intervalgrams are used as the basis of a system for detection of identical melodies across a database of music. Using a dynamic-programming approach for comparisons between a reference and the song database, we evaluated performance on the ‘covers80’ dataset. A first test of an intervalgram-based system on this dataset yields a precision at top-1 of 53.8%, with an ROC curve that shows very high precision up to moderate recall, suggesting that the intervalgram is adept at identifying the easier-to-match cover songs in the dataset with high robustness. The intervalgram is designed to support locality-sensitive hashing, such that an index lookup from each single intervalgram feature has a moderate probability of retrieving a match, with few false matches. With this indexing approach, a large reference database can be quickly pruned before more detailed matching, as in previous content-identification systems.
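
For readers who want a feel for the ‘soft’ transposition idea before the talk, here is a minimal sketch. It is not taken from the talk or the paper. It assumes the chroma frames are already available as a NumPy array with one row per frame and 12 pitch-class bins, that the local reference is simply the mean chroma over the segment, and that the soft transposition is a circular cross-correlation of each frame with that reference. The function names soft_transpose and intervalgram_sketch are made up for illustration.

```python
import numpy as np


def soft_transpose(chroma, reference):
    """Circularly cross-correlate each chroma frame with a reference vector.

    chroma:    array of shape (n_frames, n_bins), one chroma vector per row
    reference: array of shape (n_bins,), a weighting over pitch classes

    Output bin k of a frame sums the frame's energy found k bins above each
    reference pitch class, weighted by the reference. This is an assumed
    reading of 'soft pitch transposition', not the speaker's exact method.
    """
    chroma = np.asarray(chroma, dtype=float)
    reference = np.asarray(reference, dtype=float)
    n_bins = chroma.shape[1]
    out = np.zeros_like(chroma)
    for k in range(n_bins):
        # np.roll(chroma, -k, axis=1)[:, j] equals chroma[:, (j + k) % n_bins]
        out[:, k] = np.roll(chroma, -k, axis=1) @ reference
    return out


def intervalgram_sketch(chroma):
    """Use the segment's mean chroma as the local reference (an assumption)."""
    chroma = np.asarray(chroma, dtype=float)
    reference = chroma.mean(axis=0)
    total = reference.sum()
    if total > 0:
        reference = reference / total
    return soft_transpose(chroma, reference)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    segment = rng.random((8, 12))             # 8 made-up chroma frames
    shifted = np.roll(segment, 3, axis=1)     # same segment, 3 semitones up
    same = np.allclose(intervalgram_sketch(segment),
                       intervalgram_sketch(shifted))
    print("invariant to transposition:", same)
```

Because the reference is computed from the segment itself, transposing the segment shifts the reference by the same amount, so the output is unchanged. That is the property that removes the need to try every transposition at matching time.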

Tom is a research scientist at Google in Mountain View, where he works on applications of machine hearing to large-scale audio analysis problems. Recent applications have included sound effects search, video content analysis and cover song recognition. Prior to Google, Tom was at the University of Cambridge, where he completed an MSci in experimental and theoretical physics and a PhD at the Centre for the Neural Basis of Hearing, under the supervision of Roy Patterson. His PhD research was into applications of the Auditory Image Model to various audio analysis tasks, including scale-invariant and noise-robust speech recognition.

FREE

Open to the Public
