Publications
- P. Jinachitra & J. O. Smith, "Generative Model of Voice in Noise for
Structured Coding Applications", ICASSP 2007, Honolulu, USA.
- P. Jinachitra, "Glottal Closure and Opening Detection for Flexible
Parametric Voice Coding", Interspeech 2006, Pittsburgh, PA, USA. (Demo page)
- P. Jinachitra, "Noisy speech segmentation using non-linear observation
switching state space model and Unscented Kalman Filtering", ICASSP 2006,
Toulouse, France.
- P. Jinachitra & J. O. Smith, "Joint estimation of glottal source and
vocal tract for vocal synthesis using Kalman smoothing and EM algorithm",
WASPAA 2005, New Paltz, NY. (Demo page)
- P. Jinachitra & R. Prieto, "Towards speech recognition oriented
dereverberation", ICASSP 2005, Philadelphia.
- R. Prieto & P. Jinachitra, "Blind source separation for time-variant
mixing systems using piecewise linear approximations", ICASSP 2005,
Philadelphia.
- P. Jinachitra, "Polyphonic instrument identification using independent
subspace analysis", ICME 2004, Taipei.
- P. Jinachitra, "Constrained EM estimates for harmonic source
separation", ICASSP 2003, Hong Kong.
Copyrighted materials. All rights reserved.
Research
Dissertation advisor : Prof. Julius O. Smith III
Affiliation : Center for Computer Research in Music and Acoustics (CCRMA)
Music Applications
My interest is in general audio analysis, applied to speech and musical
signals. Particular applications include sound source separation,
transcription, extraction and removal, pitch detection, instrument
identification, indexing and retrieval, speech enhancement and
dereverberation, robust speech recognition, and speech production
modeling.
My past research has focused on separating musical sources from a song,
especially the singing voice. Applications include melody transcription for
indexing and retrieval, and extracting or removing the singing voice track
from a song for musical purposes. The tools I have been studying range from
sinusoidal models to statistical learning, and from top-down to bottom-up
processing. Polyphonic transcription is the ultimate problem I am interested
in. It requires segmentation, instrument identification, and pitch detection,
along with much other relevant information which, in light of MPEG-7 and
similar standards, will be very desirable in the future. Some examples of
work in this area are shown below.
Instrument Identification from Polyphonic Signals
About Audio Source Separation
Constrained EM estimate for harmonic source separation
MUS421 project: The sound of a plectrum, finger or fingernail plucked string
Multidisciplinary
As a side project, I also work on audio signal processing for interactive
applications in toys. The features implemented and tested are silence
detection, voice modification, source localization and separation, and
query-by-humming (or singing). The pause detection was implemented in C/C++
in the eventual prototype for real-time use. Query-by-humming is now also
implemented in C/C++, taking a few seconds to verify one song out of a
possible ten.
Media X : Interactive Toy project
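The pause (silence) detection mentioned above is not described in detail on this page. As a rough illustration only, the following Python sketch uses a generic short-time energy threshold, which is one common way to find pauses and is not necessarily the method used in the toy prototype. The function names, frame length, threshold, and minimum pause length are all hypothetical values chosen for the demo.

```python
import math

def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies

def detect_pauses(samples, frame_len=160, threshold=1e-3, min_frames=3):
    """Return (start_frame, end_frame) pairs, end exclusive, for every run of
    at least min_frames consecutive frames whose energy is below threshold.
    All default values are assumptions made for this sketch."""
    energies = frame_energies(samples, frame_len)
    pauses = []
    run_start = None
    for i, energy in enumerate(energies):
        if energy < threshold:
            if run_start is None:
                run_start = i          # a quiet run begins here
        else:
            if run_start is not None and i - run_start >= min_frames:
                pauses.append((run_start, i))
            run_start = None           # loud frame ends any quiet run
    # Close a quiet run that lasts until the end of the signal.
    if run_start is not None and len(energies) - run_start >= min_frames:
        pauses.append((run_start, len(energies)))
    return pauses

# Demo signal at 8 kHz: 0.1 s tone, 0.1 s silence, 0.1 s tone.
fs = 8000
tone = [0.5 * math.sin(2 * math.pi * 440 * n / fs) for n in range(800)]
signal = tone + [0.0] * 800 + tone
pauses = detect_pauses(signal)
print(pauses)
```

A fixed threshold is the simplest choice; a real-time system in a noisy room would more likely track the noise floor and adapt the threshold, which this sketch does not attempt.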
Speech Applications
Recently, I have been involved in research on audio analysis with
applications to speech, in particular speech enhancement from noise and
reverberation for listening and robust speech recognition purposes. My
research focus is on the speech production model and automatic identification
of its parameters under possibly noisy conditions. The model-based
approach allows for flexibility in reconstructing the speech source with
modifiable pitch and duration, among other characteristics. It also fits
into the theme of structured audio, where a sound object is described by a
compact set of parameters.
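To illustrate why a parametric description makes pitch and duration easy to modify, here is a minimal Python sketch of a textbook source-filter synthesizer: an impulse-train excitation passed through a cascade of two-pole resonators. It is an assumption-laden toy, not the glottal source and vocal tract model used in the publications listed on this page. The function names, formant values, and sampling rate are hypothetical.

```python
import math

def impulse_train(f0, duration, fs):
    """Excitation: one unit pulse per pitch period, zeros elsewhere."""
    n_samples = int(round(duration * fs))
    period = int(round(fs / f0))       # pitch period in samples
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]

def resonator(x, freq, bandwidth, fs):
    """Two-pole all-pole filter: y[n] = x[n] + a1*y[n-1] + a2*y[n-2]."""
    r = math.exp(-math.pi * bandwidth / fs)   # pole radius, below 1 so stable
    a1 = 2.0 * r * math.cos(2.0 * math.pi * freq / fs)
    a2 = -r * r
    y, y1, y2 = [], 0.0, 0.0
    for xn in x:
        yn = xn + a1 * y1 + a2 * y2
        y.append(yn)
        y2, y1 = y1, yn
    return y

def synthesize(f0, duration, formants, fs=8000):
    """Pitch (f0) and duration are free parameters, independent of the
    formant list that stands in for the vocal tract."""
    signal = impulse_train(f0, duration, fs)
    for freq, bandwidth in formants:
        signal = resonator(signal, freq, bandwidth, fs)
    return signal

# Illustrative (frequency, bandwidth) pairs in Hz, not measured data.
vowel = [(700.0, 100.0), (1200.0, 120.0)]
low = synthesize(100.0, 0.5, vowel)        # 100 Hz pitch, 0.5 s
high_long = synthesize(200.0, 1.0, vowel)  # same "tract", new pitch and length
```

The same `vowel` parameters produce both outputs; only the two excitation parameters change. That separation is what the model-based approach exploits, and the whole sound is described by a handful of numbers, in the spirit of structured audio.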
Review of speech synthesis (last update : Feb 2006)
Review of sound source separation (last update : June 2003)
Demo of parametric voice coding (submitted to ICSLP'06)
Demo of joint source-tract parameter identification in noise (WASPAA'05)
Classes
EE261 : Fourier Transform and Its Applications
EE278 : Intro to Statistical Signal Processing
EE263 : Linear Dynamical Systems
EE398a : Image Communication I
EE369a : Medical Imaging I
EE367a (Music420) : Applications of the Fast Fourier Transform
EE368 : Digital Image Processing
CS229 : Pattern Classification
EE262 : Two-Dimensional Imaging
EE292B : Electronic Documents - Paper to Digital
Math 266 : Wavelets
EE367B (Music421) : Signal Processing Methods in Musical Acoustics
Blind Source Separation of Convolutive Mixtures (M.Eng. Dissertation 2001)
Partial summary : BSS of convolutive mixtures
UROP report prior : BSS of delayed mixtures
Adaptive Signal Processing, Optimization, Digital Filters, Neural Networks, Discrete-time Control, Digital Image Processing,
Power Spectral Estimation, Operations Research, Philosophy I&II
(National Electronics and Computer Technology Center, Thailand, summer 2002)
Blind Separation of a Single Channel Audio Mixture