Welcome to CCRMA World Update attendees!
This page collects some items around real-time GPU audio processing and modal filter bank processing for casual discussion in the breakout room. The demos are a bit rough, but I've uploaded older videos and audio demos where possible (I signed up for the open house late).
For questions or comments, please email either of:
- travissk@ccrma.stanford.edu
- travisskare@gmail.com
Thank you!
CymbalVerb/Modal-filter-bank-verb
Runs source audio through a modal filter bank, with coefficients obtained by dragging and dropping an audio file onto the plugin. This is an older plugin and it's in need of a GUI refresh.
This plugin runs on the CPU, so it can be processor-intensive for hundreds of modes. There are controls to use only the top N modes (discarding by amplitude only; the project would benefit from instead discarding modes in the more over-represented Bark bands).
Synthesis is via phasor filters (roughly, complex-multiplication-based oscillators). This is in the family tree of running source audio through a convolution reverb whose impulse response is a cymbal sample, but it allows "modal effects" to be applied: stretching, shifting, adding or removing complexity (by trimming the number of modes), etc. If such effects are not used, however, the convolution approach would be more efficient except at low numbers of modes.
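To make the phasor-filter idea concrete, here is a minimal sketch of a modal filter bank built from complex one-pole resonators. This is illustrative only: the function and parameter names (`phasor_filter_bank`, `decays_t60`, `amps`) are my own, not the plugin's API, and the mode values are made up.

```python
import numpy as np

def phasor_filter_bank(x, freqs, decays_t60, amps, fs=48000):
    """Run input x through a bank of phasor filters (complex one-pole
    resonators). freqs in Hz, decays_t60 in seconds (time to -60 dB),
    amps are linear input couplings. Names are illustrative."""
    # Per-sample pole for each mode: magnitude sets decay, angle sets pitch.
    r = np.exp(np.log(0.001) / (decays_t60 * fs))   # reach -60 dB in t60 seconds
    poles = r * np.exp(2j * np.pi * freqs / fs)
    state = np.zeros(len(freqs), dtype=complex)
    y = np.zeros(len(x))
    for i in range(len(x)):
        state = poles * state + amps * x[i]         # one complex multiply per mode
        y[i] = state.imag.sum()                     # sum the mode outputs
    return y

# Excite three made-up modes with an impulse (toy "cymbal-ish" example).
fs = 48000
x = np.zeros(fs // 4); x[0] = 1.0
y = phasor_filter_bank(x, np.array([220.0, 517.0, 1290.3]),
                       np.array([0.8, 0.5, 0.3]),
                       np.array([1.0, 0.7, 0.4]), fs)
```

In this framing, the "modal effects" are just parameter edits: scale `freqs` to shift or stretch, truncate the arrays to trim complexity, and a "freeze" would push `r` toward 1.0 while zeroing the input couplings so the filters keep ringing.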
A "freeze" effect sets the decay rates and input couplings so the filters ring indefinitely. This happens around 0:55 in the audio demo.
Audio Demo (warning: consider setting the volume low at first - it's louder than the other demos on this page)
A drum loop (sourced from Logic's Apple Loops) dry, and then sent through the plugin with various values of different parameters, effects, and wet/dry settings.
Modal Cymbal Synthesizer (LAC 2019)
A system that enables writing GPU-powered audio plugins: a JUCE-based plugin running in the DAW communicates with a helper process, which shuttles data to and from the GPU.
Technical notes: during this demo, communication was blocking (communication to/from the GPU could instead be made streaming) and synthesis was linear (nonlinear extensions would be added later).
Sets of modes at low/medium/high velocities were captured and are excited via MIDI input, e.g. with a percussion pad instrument. "Modal effects" include shaping the base frequency, frequency "width", equalization, and decay time of the filter bank, expressed as filter bank parameters (vs. postprocessing of the audio signal, which in some cases is equivalent).
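A sketch of what "modal effects as filter bank parameters" could look like: transforms applied to the mode tables before synthesis, rather than to the rendered audio. All names and the specific transforms here are assumptions for illustration, not the synthesizer's actual parameter set.

```python
import numpy as np

def shape_modes(freqs, decays, gains, shift=1.0, stretch=1.0,
                decay_scale=1.0, eq=None):
    """Apply 'modal effects' as transforms on filter-bank parameters.
    shift: multiply all mode frequencies (pitch shift)
    stretch: spread modes away from the lowest one (frequency 'width')
    decay_scale: scale ring times
    eq: optional callable gain(freq) applied per mode (equalization)
    Names and transforms are illustrative."""
    f0 = freqs.min()
    new_freqs = (f0 + (freqs - f0) * stretch) * shift
    new_decays = decays * decay_scale
    new_gains = gains * (eq(new_freqs) if eq else 1.0)
    return new_freqs, new_decays, new_gains

# Shift a small made-up mode set up an octave while widening its spacing.
freqs = np.array([200.0, 450.0, 990.0])
f2, d2, g2 = shape_modes(freqs, np.array([1.0, 0.6, 0.3]),
                         np.ones(3), shift=2.0, stretch=1.5)
```

Because these run on the parameter tables, they are cheap per note-on regardless of buffer size, which is part of the appeal over postprocessing the audio.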
This video is from an older version presented at LAC 2019 and at CCRMA's 2019 Open House. A newer version started to introduce pitch glide effects and started to be more opinionated about being a drum set. It's still based on the approach of capture/replay of sets of modes obtained at different velocities then postprocessed and affected.
I'll be attending the event from a macOS machine, which does not support CUDA, so I won't be able to demo this live unless I can get remote access to my Windows/Linux machine working. However, I'm happy to meet up afterward over Zoom or try switching machines.
I'll update with a better video here and/or on my dissertation page.
System Diagram
JACK-connected GPU audio synthesizer
Work-in-progress but using CCRMA World Update as a space to see if anyone is interested!
The filter bank process used for the demo in the previous section is expanded into a JACK-connected version, toward wave field synthesis. A basic kernel runs, sending back N=32 channels of sine waves, but a "real" application has not yet been tested.
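As a reference for what that basic kernel produces, here is a CPU mock of it: one sine wave per output channel, filled one JACK-sized block at a time. In the real system this runs as a CUDA kernel; the function name, the per-channel frequencies, and the block size here are all my own illustrative choices.

```python
import numpy as np

def test_kernel(frames=128, channels=32, fs=48000, t0=0):
    """CPU stand-in for the basic GPU test kernel: fill one output block
    with a sine per channel (channel k at (k+1)*110 Hz -- arbitrary).
    t0 is the running sample counter so consecutive blocks are
    phase-continuous. Returns shape (channels, frames)."""
    t = (t0 + np.arange(frames)) / fs
    freqs = 110.0 * (np.arange(channels) + 1)
    return np.sin(2 * np.pi * freqs[:, None] * t)

# One block, as the JACK process callback would request it.
block = test_kernel()
```

Passing `t0` forward each callback keeps the oscillators continuous across blocks, which is the main thing a per-buffer kernel launch has to get right.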
Proposed API
There is the question of what parameters to provide to authors of a GPU kernel such that there is minimal overhead for simple applications, and no need to recompile the plugin piece of the system. Ideally developers shouldn't even need to recompile the helper program, only the GPU kernel, which could then be loaded dynamically and reloaded at runtime (this seems possible via the CUDA Driver API, but is unexplored). Therefore a block of memory is provided to the kernel that may be treated as a structure with inputs and outputs:
- Input: 32-byte bitfield of on/off flags for MIDI notes 0-127
- Input: N channels of input audio (not yet implemented; the current experiment is synthesis-only)
- Input: N channels of CC data, either one value per CC number per buffer, or a time series of them*
- Output: buffers for 32 channels of sample data, sent back to the JACK client for the speaker array
*Limitation: this first revision sets parameters per buffer rather than as a time series. Someone may wish to expand on this in the future.
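A sketch of how that shared memory block might be laid out as a structure. The field names, the CC count, and the buffer size are all assumptions for illustration; only the 32-byte MIDI bitfield and the 32 output channels come from the list above.

```python
import ctypes

FRAMES = 128        # illustrative buffer size
CHANNELS_OUT = 32   # JACK output channels, per the list above
NUM_CCS = 8         # illustrative CC channel count ("N channels of CC data")

class KernelBlock(ctypes.Structure):
    """Hypothetical layout for the memory block handed to the GPU kernel.
    Field names and sizes are assumptions, not the actual API."""
    _fields_ = [
        ("midi_flags", ctypes.c_uint8 * 32),                 # note on/off bitfield
        ("cc_values",  ctypes.c_float * NUM_CCS),            # one value per CC per buffer
        ("audio_out",  ctypes.c_float * (CHANNELS_OUT * FRAMES)),
    ]

def note_on(block, note):
    """Set the on/off bit for a MIDI note number in the bitfield."""
    block.midi_flags[note // 8] |= (1 << (note % 8))

blk = KernelBlock()
note_on(blk, 60)   # middle C
```

A fixed flat layout like this is what lets the kernel be swapped or reloaded without recompiling the helper or the plugin: both sides only agree on offsets, not on code.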
System Diagram
Similar to the above, but we remove at least the custom IPC section and replace it with JACK.
For many applications we can remove the custom plugin piece as well, if we're just shuttling audio data around. (The plugins in the previous section contained logic to adjust modal responses based on velocity, etc.)
Practical GPU Programming considerations
This studies latency, and variation in latency, when running audio kernels on the GPU, asking the question: is this even feasible without dropouts on an end user's machine?
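For dropout analysis, the worst-case launch-plus-readback time matters more than the average: one late buffer is an audible glitch. A minimal sketch of the kind of measurement involved, with a dummy workload standing in for an actual kernel launch:

```python
import time
import statistics

def measure_jitter(work, n=200):
    """Time n invocations of `work` (a stand-in for a GPU kernel launch
    plus readback) and return (median, worst_case) in seconds. The worst
    case is what determines dropouts at a given buffer size."""
    samples = []
    for _ in range(n):
        t0 = time.perf_counter()
        work()
        samples.append(time.perf_counter() - t0)
    return statistics.median(samples), max(samples)

# Dummy CPU workload; a real test would launch a CUDA kernel here.
med, worst = measure_jitter(lambda: sum(range(1000)))
```

Comparing `worst` against the buffer period (e.g. about 2.7 ms for 128 frames at 48 kHz) gives a first feasibility check on a given machine.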
Less formally, considering how to share the GPU among a chain of plugins written by one author, or among plugins written by different hypothetical plugin manufacturers.
This was written up as a short paper:
GPGPU Patterns For Serial and Parallel Audio Effects, with Jonathan Abel.
Presented at eDAFx 2021: pdf
Further reading: around the same time, and independently, Renney et al. published "There and Back Again: The Practicality of GPU Accelerated Digital Audio" (link), which covers the same topic, but with more emphasis on different hardware configurations (and CUDA vs. OpenCL).
Plate VST with nonlinear experiments
2D physically-modeled plate based on the digital waveguide mesh (see: STK's Mesh2D). Extended with some experiments that try to introduce nonlinearities by altering the physical properties of the mesh or adding extra reflections inside it.
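For readers unfamiliar with the waveguide mesh, here is a bare-bones sketch of the linear rectilinear case in its equivalent finite-difference form: each interior junction becomes the average of its four neighbors minus its own value two steps earlier. The grid size, excitation, and pickup point are arbitrary, and the clamped edges stand in for the real model's boundary filters; none of the nonlinear extensions are shown.

```python
import numpy as np

def mesh_step(p, p_prev):
    """One update of a rectilinear 2D digital waveguide mesh
    (finite-difference form). Edges are clamped to zero here, a crude
    stand-in for lossy/lowpass boundary filters."""
    p_next = np.zeros_like(p)
    p_next[1:-1, 1:-1] = 0.5 * (p[:-2, 1:-1] + p[2:, 1:-1] +
                                p[1:-1, :-2] + p[1:-1, 2:]) - p_prev[1:-1, 1:-1]
    return p_next, p   # new state, and current state becomes previous

# Strike the plate at one junction and record a pickup point.
p = np.zeros((16, 16)); p_prev = np.zeros_like(p)
p[8, 8] = 1.0
out = []
for _ in range(100):
    p, p_prev = mesh_step(p, p_prev)
    out.append(p[4, 4])
```

The nonlinearity experiments mentioned above would, roughly speaking, perturb this update (mesh properties or extra internal reflections) rather than postprocess `out`.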
A video demo may be found here (mp4, 6 MB); both the local recording setup and the nonlinearity sections need debugging, so this will ideally be replaced, possibly even during the event.
Until then, here are some experimental results created during development with Jupyter notebooks. Please note these effects were added to bare-bones implementations of a plate, with only rough lowpass filters on the edges (no allpasses, input/output selection, etc.).
More exaggerated: