MUS423 Research Seminars
The CCRMA Music 423 Research Seminar brings graduate students and supervising faculty together for planning and discussion of original research. Students and faculty meet either in small groups or individually, as appropriate for the research topics and interests of the participants. Research carried out is typically presented at the weekly CCRMA Colloquium (if it is of general interest to the CCRMA community) or at a Special DSP Seminar scheduled for that purpose. In either case, announcements appear on the CCRMA Home Page as Upcoming Events.
Recent DSP Seminars
Concepts and Control: Understanding Creativity in Deep Music Generation
Date: Fri, 11/15/2024, 2:30pm - 3:30pm | Location: CCRMA Classroom [Knoll 217] (Zoom link below) | Event Type: DSP Seminar
Abstract: Recently, generative AI has achieved impressive results in music generation. Yet, the challenge remains: how can these models be meaningfully applied in real-world music creation, for both professional and amateur musicians? We argue that what’s missing is an interpretable generative architecture—one that captures music concepts and their relations, which can be so finely nuanced that they defy straightforward description. In this talk, I will explore various approaches to creating such an architecture, demonstrating how it enhances control and interaction in music generation.

Jin Woo Lee on "Differentiable Physical Modeling for Sound Synthesis: From Design to Inverse Problems"
Date: Fri, 10/04/2024, 3:30pm - 5:00pm | Location: CCRMA Classroom [Knoll 217] (Zoom link below) | Event Type: DSP Seminar

Jin Woo Lee is a PhD candidate at Seoul National University advised by Prof. Kyogu Lee (CCRMA PhD 2008). His research focuses on (1) physical modeling for musical instrument sound synthesis and (2) differentiable rendering for immersive and efficient sound simulation. His recent work broadly covers musical sound synthesis, spatial audio rendering, loudspeaker control, and speech quality analysis. He has interned at Meta Reality Labs Research and Supertone. Prior to his PhD, Jin conducted research in computational fluid dynamics during his undergraduate studies in Mechanical Engineering at POSTECH. For more information, please visit his personal website (http://jnwoo.com/).

AI-based Digital Synthesizer Preset Programming: Parameter Estimation for Sound Matching
Date: Fri, 05/31/2024, 3:30pm - 5:00pm | Location: CCRMA Classroom [Knoll 217] (Zoom link below) | Event Type: DSP Seminar
Presenter: Soohyun Kim

Generative AI for Music and Audio
Date: Fri, 11/10/2023, 3:30pm - 5:00pm | Location: CCRMA Classroom [Knoll 217] | Event Type: DSP Seminar

Abstract: Generative AI has been transforming the way we interact with technology and consume content. In this talk, I will briefly introduce the three main directions of my research centered around generative AI for music and audio: 1) multitrack music generation, 2) assistive music creation tools, and 3) multimodal learning for audio and music. I will then zoom into my recent work on learning text-queried sound separation and text-to-audio synthesis from videos using pretrained language-vision models. Finally, I will close this talk by discussing the challenges and future directions of generative AI for music and audio.

Adaptive and interactive machine listening with minimal supervision
Date: Fri, 02/10/2023, 4:30pm - 5:20pm | Location: CCRMA Classroom [Knoll 217] | Event Type: DSP Seminar

Abstract: Deep learning-based approaches have become popular tools and achieved promising results in machine listening. However, a deep model that generalizes well needs to be trained on a large amount of labeled data. Rare, fine-grained, or newly emerging classes (e.g., a rare musical instrument or a new sound effect), for which large-scale data collection is difficult or simply impossible, are often treated as out-of-vocabulary and left unsupported by machine listening systems. In this thesis work, we aim to provide new perspectives and approaches to machine listening tasks with limited labeled data. Specifically, we focus on algorithms designed to work with only a few labeled examples (e.g., few-shot learning) and on incorporating human input to guide the machine.
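One common recipe for the few-shot setting mentioned above is prototype-based classification, sketched below in NumPy. This is a generic prototypical-network-style classifier, not necessarily the method used in the thesis, and it assumes the embeddings come from some pretrained audio model.

```python
import numpy as np

def prototype_classify(support_emb, support_labels, query_emb):
    """Few-shot classification by nearest class prototype.

    support_emb    : (n_support, dim) embeddings of the few labeled examples
    support_labels : (n_support,) integer class labels
    query_emb      : (n_query, dim) embeddings of unlabeled sounds
    Returns predicted class labels for the queries.
    """
    classes = np.unique(support_labels)
    # Prototype = mean embedding of each class's few labeled examples.
    protos = np.stack([support_emb[support_labels == c].mean(axis=0) for c in classes])
    # Assign each query to the nearest prototype (Euclidean distance).
    dists = np.linalg.norm(query_emb[:, None, :] - protos[None, :, :], axis=-1)
    return classes[np.argmin(dists, axis=1)]
```

New classes can be supported simply by supplying a handful of labeled examples for them, with no retraining of the embedding model.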

Meta-AF: Meta-Learning for Adaptive Filters
Date: Fri, 11/18/2022, 3:30pm - 4:20pm | Location: CCRMA Classroom [Knoll 217] | Event Type: DSP Seminar

Abstract: Adaptive filtering algorithms are pervasive throughout modern society and have had a significant impact on a wide variety of domains including audio processing, telecommunications, biomedical sensing, astrophysics and cosmology, seismology, and many more. Adaptive filters typically operate via specialized online, iterative optimization methods such as least-mean squares or recursive least squares and aim to process signals in unknown or nonstationary environments. Such algorithms, however, can be slow and laborious to develop, require domain expertise to create, and necessitate mathematical insight for improvement.
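For readers unfamiliar with the least-mean-squares update mentioned above, here is a minimal LMS system-identification sketch in NumPy. The function name, filter length, and step size are illustrative choices, not taken from the Meta-AF work.

```python
import numpy as np

def lms_identify(x, d, num_taps=32, mu=0.01):
    """Identify an unknown FIR system with the least-mean-squares (LMS) update.

    x        : input signal fed to the unknown system
    d        : desired (observed) output of that system
    num_taps : length of the adaptive FIR filter
    mu       : step size (adaptation speed vs. stability trade-off)
    """
    w = np.zeros(num_taps)                       # adaptive filter coefficients
    e = np.zeros(len(x))                         # error signal over time
    for n in range(num_taps - 1, len(x)):
        u = x[n - num_taps + 1 : n + 1][::-1]    # newest-first input frame
        y = w @ u                                # adaptive filter output
        e[n] = d[n] - y                          # estimation error
        w += mu * e[n] * u                       # LMS coefficient update
    return w, e

# Example: recover a short decaying FIR system from noisy observations.
rng = np.random.default_rng(0)
h_true = rng.standard_normal(32) * np.exp(-np.arange(32) / 8.0)
x = rng.standard_normal(20000)
d = np.convolve(x, h_true)[: len(x)] + 0.01 * rng.standard_normal(len(x))
w_est, err = lms_identify(x, d)
```

Hand-tuning the step size and update rule for each application is exactly the kind of labor the meta-learned adaptive filters discussed in the talk aim to replace.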

Feedback Delay Networks for Artificial Reverberation
Date: Fri, 11/11/2022, 12:00pm - 12:50pm | Location: Zoom | Event Type: DSP Seminar

Abstract: Feedback delay networks (FDNs) are recursive filters widely used for artificial reverberation and decorrelation. While vast literature exists on a wide variety of reverb topologies, FDNs provide a unifying framework to design and analyze delay-based reverberators. This talk reviews recent advancements in FDN theory, such as losslessness, modal and echo representations, and MIMO allpass properties and decorrelation. Many extensions to the FDN have been proposed, including time-varying matrices, scattering matrices, high-order attenuation filters, directional reverberation, and coupled room reverberators.

Presentation recording available.
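To make the recursion concrete, here is a minimal feedback delay network sketch in NumPy. The delay lengths, broadband feedback gain, and Hadamard feedback matrix are illustrative choices, not parameters from the talk.

```python
import numpy as np
from scipy.linalg import hadamard

def fdn_reverb(x, delays=(1031, 1327, 1523, 1871), g=0.85):
    """Minimal 4-channel feedback delay network (FDN) reverberator sketch.

    x      : mono input signal (1-D array)
    delays : delay-line lengths in samples (roughly mutually prime)
    g      : broadband feedback gain (< 1 for a decaying response)
    """
    N = len(delays)
    A = g * hadamard(N) / np.sqrt(N)              # scaled orthogonal feedback matrix
    bufs = [np.zeros(d) for d in delays]          # circular delay-line buffers
    idx = [0] * N                                 # read/write positions
    y = np.zeros(len(x))
    for n in range(len(x)):
        outs = np.array([bufs[i][idx[i]] for i in range(N)])  # delay-line outputs
        y[n] = outs.sum() / N                                  # wet output tap
        back = A @ outs + x[n]                                 # feedback mix + input injection
        for i in range(N):
            bufs[i][idx[i]] = back[i]                          # write back into each line
            idx[i] = (idx[i] + 1) % delays[i]
    return y
```

With an orthogonal feedback matrix the prototype is lossless; the scalar gain g (or, more generally, per-line attenuation filters) then sets the decay.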

DeepAFx-ST: Style Transfer of Audio Effects with Differentiable Signal Processing
Date: Fri, 11/04/2022, 3:30pm - 4:20pm | Location: CCRMA Classroom [Knoll 217] | Event Type: DSP Seminar

Abstract: We present a framework that can impose the audio effects and production style from one recording to another by example with the goal of simplifying the audio production process. We train a deep neural network to analyze an input recording and a style reference recording and predict the control parameters of audio effects used to render the output. In contrast to past work, we integrate audio effects as differentiable operators in our framework, perform backpropagation through audio effects, and optimize end-to-end using an audio-domain loss. We use a self-supervised training strategy enabling automatic control of audio effects without the use of any labeled or paired training data.
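As a toy illustration of the key idea of backpropagating an audio-domain loss through a differentiable effect, here is a PyTorch sketch that recovers a single gain parameter by matching a reference recording. It is far simpler than the effects and the parameter-prediction network in DeepAFx-ST; all names and values are illustrative.

```python
import torch

# Toy example: recover the gain of a differentiable "effect" by matching a
# reference recording with an audio-domain (waveform MSE) loss.
torch.manual_seed(0)
x = torch.randn(1, 48000)                      # input recording (1 s at 48 kHz)
target_gain = 0.5
y_ref = target_gain * x                        # style reference: same audio, different gain

gain = torch.nn.Parameter(torch.tensor(1.0))   # effect control parameter to estimate
opt = torch.optim.Adam([gain], lr=0.01)

for step in range(500):
    y_hat = gain * x                           # differentiable effect applied to the input
    loss = torch.mean((y_hat - y_ref) ** 2)    # audio-domain loss
    opt.zero_grad()
    loss.backward()                            # gradients flow through the effect
    opt.step()

print(f"estimated gain: {gain.item():.3f}")    # approaches 0.5
```

In the full framework, a neural network predicts many such effect parameters from the input and reference recordings, and the same gradient path through the effects trains that network end-to-end.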

Tanguy Risset -- Compiling Audio DSP for FPGAs Using the Faust Programming Language and High Level Synthesis
Date: Fri, 10/28/2022, 3:30pm - 4:20pm | Location: CCRMA Classroom [Knoll 217] | Event Type: DSP Seminar

Abstract: In this talk, we give a detailed presentation of Syfala (https://github.com/inria-emeraude/syfala), a new "audio DSP to FPGA" compiler based on the Faust programming language (https://faust.grame.fr/) and Xilinx/AMD's High-Level Synthesis (HLS) technology. Our open-source system automatically compiles audio DSP programs to FPGA hardware, all the way to actual sound production on Zynq-based platforms. With this compiler, much lower audio latency (i.e., one sample at a high sampling rate) can be achieved than with conventional "software-based" digital audio systems. This presentation also introduces FPGA architecture in general as well as recent HLS technologies.

Audio Understanding and Room Acoustics in the Era of AI
Date: Fri, 10/14/2022, 3:30pm - 4:20pm | Location: CCRMA Classroom [Knoll 217] | Event Type: DSP Seminar

Abstract: This talk aims to bridge the gap between signal processing and the latest machine learning research by discussing several applications in music and audio. In the first part of the talk, we will discuss how classic signal-processing properties can be used to spoon-feed powerful neural architectures such as Transformers to tackle a difficult signal processing task: re-reverberation (system identification) at scale. This work enables hearing any music in any concert hall or virtual environment. We use arbitrary recorded audio as an approximate proxy for a balloon pop, removing the need to measure the room acoustics explicitly. This work has enormous applications in Virtual/Augmented Reality and the Metaverse, if it happens!

