Feedback Delay Networks (FDNs) are efficient structures for synthesizing room impulse responses with an echo density that increases over time. They consist of parallel delay lines with associated decay filters, coupled to each other through a feedback matrix. We study the modal behavior of FDNs and derive a parameterized orthonormal mixing matrix (feedback matrix) that can be varied continuously from identity (minimum mixing) to Hadamard (maximum mixing). We also relate the perceptual mixing time to our parameterized mixing matrix. Sound examples are available here.
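One way to obtain such a continuously variable orthonormal matrix is to take repeated Kronecker products of a 2x2 rotation matrix with angle theta. This is assumed here for illustration and is not necessarily the exact construction used in the paper. The function names `rotation`, `kron` and `mixing_matrix` are hypothetical. At theta = 0 the result is the identity, and at theta = pi/4 every entry has the same magnitude, as in a scaled Hadamard matrix.

```python
import math


def rotation(theta):
    """2x2 rotation matrix; orthonormal for every angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s, c]]


def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    rows = []
    for row_a in a:
        for row_b in b:
            rows.append([x * y for x in row_a for y in row_b])
    return rows


def mixing_matrix(theta, stages):
    """Build a (2**stages x 2**stages) orthonormal matrix.

    Assumption: the matrix is the Kronecker product of `stages`
    copies of the same 2x2 rotation, so one angle controls the
    amount of mixing between delay lines.
    """
    m = [[1.0]]
    for _ in range(stages):
        m = kron(m, rotation(theta))
    return m


if __name__ == "__main__":
    for theta in (0.0, math.pi / 8, math.pi / 4):
        print(f"theta = {theta:.3f}")
        for row in mixing_matrix(theta, 2):
            print("  ", [round(v, 3) for v in row])
```

Because the Kronecker product of orthonormal matrices is itself orthonormal, the feedback loop stays lossless for any theta in this sketch, and the decay is left entirely to the attenuation filters.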
In an extended work, we propose the Grouped Feedback Delay Network (GFDN), which has different attenuation filters in different delay line groups. We use the GFDN for modeling coupled rooms and single rooms constructed of different materials, and discuss methods for efficient room resizing. Sound examples are available here.
Sounds emanating from resonant objects such as rooms, plates and string instruments are composed of modes (standing waves) vibrating at different frequencies, each with its own decay rate. Modal synthesis aims to reconstruct sounds by estimating these mode parameters and efficiently synthesizing the resulting decaying complex exponentials using parallel biquad filters.
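A minimal sketch of the synthesis stage follows. It assumes each mode is described by a gain, a frequency in Hz and a T60 decay time in seconds, and is rendered by its own two-pole resonator excited with a unit impulse. The function names `mode_coeffs` and `synthesize` and the T60 parameterization are assumptions made for illustration, not the method used in the projects described here.

```python
import math


def mode_coeffs(freq_hz, t60_s, fs):
    """Biquad coefficients whose impulse response is r**n * sin(w*n)."""
    # The pole radius r is chosen so the amplitude falls by 60 dB
    # (a factor of 1000) after t60_s seconds.
    r = 10.0 ** (-3.0 / (t60_s * fs))
    w = 2.0 * math.pi * freq_hz / fs
    b1 = r * math.sin(w)
    a1 = -2.0 * r * math.cos(w)
    a2 = r * r
    return b1, a1, a2


def synthesize(modes, fs, n_samples):
    """Sum the impulse responses of parallel resonators.

    modes: iterable of (gain, freq_hz, t60_s) tuples.
    """
    out = [0.0] * n_samples
    for gain, freq_hz, t60_s in modes:
        b1, a1, a2 = mode_coeffs(freq_hz, t60_s, fs)
        x1 = 0.0       # previous input sample
        y1 = y2 = 0.0  # previous two output samples
        for n in range(n_samples):
            x = 1.0 if n == 0 else 0.0  # unit impulse excitation
            y = b1 * x1 - a1 * y1 - a2 * y2
            out[n] += gain * y
            x1, y2, y1 = x, y1, y
    return out
```

A call such as `synthesize([(1.0, 440.0, 1.5), (0.5, 1210.0, 0.8)], fs=44100, n_samples=44100)` would render one second of a two-mode sound. These mode values are illustrative, not measured.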
We have measured and modeled carillon bells at Stanford's Hoover Tower using modal synthesis. Our 'computer carillon' can ring at different dynamic levels using a parameterized clapper-bell interaction function. Sound examples are available here.
My current research on this topic focuses on direct estimation of modal parameters from the impulse response on a warped frequency axis to resolve beating partials. Frequency-warped direct modal estimation with iterative optimization has been used to successfully model coupled piano strings, which exhibit two-stage decay and beating modes. Sound examples are available here.
The mathematical details of some high-resolution frequency estimation algorithms, such as MUSIC (Multiple Signal Classification) and ESPRIT (Estimation of Signal Parameters via Rotational Invariance Techniques), are analyzed and extended in this project. In particular, a more efficient MUSIC algorithm, FAST MUSIC, is proposed, which is numerically more stable for detecting beating partials in approximately periodic signals. Possible applications of these techniques in music research include modeling instruments such as pianos and bells, where closely spaced beating frequencies are often observed.
The Extended Kalman Filter is used to track the fundamental frequency, amplitude and instantaneous phase of monophonic audio signals. This method adds to the extensive existing literature on pitch detection. It has certain advantages: it produces a unique pitch value for each sample of data, unlike most block-based methods such as the cepstrum or the YIN estimator, and it is robust to a large amount of observation noise. However, it also has drawbacks, such as poor transient performance and slow detection of rapid pitch changes. These drawbacks have been addressed in an extended journal paper published in JAES. Performance on vocal singing excerpts can be found here.
The Ranchlands' Hum is a low-frequency noise around 40 Hz that has been plaguing the residents of Calgary, Canada for years. As an intern in the Department of Electrical and Computer Engineering at the University of Calgary, I assisted Dr. Mike Smith in developing an Android application that could capture, store and analyze low-frequency noise. I added features that integrated the existing application with an SQLite database and that calculated and plotted signal metrics. The project received some media attention.
The Kalman Filter is an MMSE estimator that can be used to remove background noise from speech. The filter equations are formulated based on the linear autoregressive (AR) model of speech production. We implement a novel algorithm that tunes the Kalman Filter by accurately determining its parameters, the measurement and process noise covariances. We also study the effect of changing the AR model order on speech corrupted with various types of noise at various SNRs, and summarize the results in an undergraduate thesis.
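The predict and update steps can be illustrated with a deliberately simplified case: a scalar Kalman filter for a first-order AR signal observed in white noise. This is a sketch under stated assumptions, not the tuning algorithm from the thesis. The AR coefficient `a`, process noise variance `q` and measurement noise variance `r` are taken as known, the data are synthetic rather than speech, and the names `kalman_ar1` and `demo` are hypothetical.

```python
import random


def kalman_ar1(observations, a, q, r):
    """Scalar Kalman filter for x[k] = a*x[k-1] + w[k], y[k] = x[k] + v[k].

    q is the variance of w (process noise) and r the variance of v
    (measurement noise). Requires |a| < 1.
    """
    x_est = 0.0
    p_est = q / (1.0 - a * a)  # stationary variance of the AR(1) signal
    estimates = []
    for y in observations:
        # Predict the next state and its error variance.
        x_pred = a * x_est
        p_pred = a * a * p_est + q
        # Update with the new observation.
        gain = p_pred / (p_pred + r)
        x_est = x_pred + gain * (y - x_pred)
        p_est = (1.0 - gain) * p_pred
        estimates.append(x_est)
    return estimates


def demo(n=2000, a=0.95, q=1.0, r=4.0, seed=0):
    """Return (MSE of noisy signal, MSE of filtered signal) on synthetic data."""
    rng = random.Random(seed)
    clean, noisy, x = [], [], 0.0
    for _ in range(n):
        x = a * x + rng.gauss(0.0, q ** 0.5)
        clean.append(x)
        noisy.append(x + rng.gauss(0.0, r ** 0.5))
    filtered = kalman_ar1(noisy, a, q, r)

    def mse(u, v):
        return sum((i - j) ** 2 for i, j in zip(u, v)) / len(u)

    return mse(noisy, clean), mse(filtered, clean)
```

For a higher AR order, the scalar state would become a vector of past samples with a companion-form transition matrix. The noise variances then play the same role as `q` and `r` here.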
Tabla is a membranophone percussion instrument (similar to bongos) that is often used in Hindustani classical music. The instrument consists of a pair of hand drums of contrasting sizes and timbres. The rhythmic pattern of any composition in Indian music is described by the term tala, which is composed of cycles of matra-s. Tala roughly corresponds to metre in Western music. Our aim is to determine the number of beats that constitute the tala-s in different tabla solos. We develop a heuristic algorithm that extracts peaks from the tabla signal, corresponding to single or composite strokes, and devise statistical methods to ensure that spurious noisy peaks are removed and missed peaks are accounted for. We obtain excellent results for solo tabla recordings played by human artists.
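The peak-extraction step can be pictured with a generic peak picker over an amplitude envelope. This is an illustrative sketch only, not the heuristic or the statistical clean-up developed in the project. The function `pick_peaks` and its parameters `threshold` and `min_gap` are hypothetical.

```python
def pick_peaks(envelope, threshold, min_gap):
    """Return indices of local maxima that are likely stroke onsets.

    envelope:  amplitude envelope of the signal, one value per frame
    threshold: peaks below this value are treated as noise
    min_gap:   minimum number of frames between two accepted peaks
    """
    peaks = []
    for n in range(1, len(envelope) - 1):
        v = envelope[n]
        is_local_max = envelope[n - 1] < v >= envelope[n + 1]
        if not is_local_max or v < threshold:
            continue
        if peaks and n - peaks[-1] < min_gap:
            # Too close to the previous peak: keep only the larger one.
            if v > envelope[peaks[-1]]:
                peaks[-1] = n
            continue
        peaks.append(n)
    return peaks
```

The intervals between the returned indices could then be examined statistically, for instance to flag intervals that are much shorter or longer than the typical spacing, which is the kind of check needed to catch spurious and missed strokes.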