Real-Time Audio-Rate Inferencing for Neural Networks

Jatin Chowdhury (jatin@ccrma)

Neural Networks for Audio Effects

Neural Networks are a powerful tool for creating scientifically and sonically interesting audio effects. Some examples of existing audio effects that use real-time neural networks include:

Real-Time Inference

When a neural network is used in a real-time audio effect, it is usually trained beforehand, so that inside the effect itself the network only needs to take an audio input and produce an audio output. This process is known as running inference on the network, and it requires an inferencing engine.
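As a rough illustration of what "running inference" means at audio rate, the sketch below pushes each sample of a block through a tiny feed-forward network. The `TinyNetwork` type, its layer sizes, and its weights are hypothetical placeholders standing in for values that would come from offline training; this is not the API of any particular library.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical network: 1 input -> 4 hidden units (tanh) -> 1 output.
// The weights are made-up placeholders for values learned during training.
struct TinyNetwork
{
    static constexpr std::size_t hiddenSize = 4;

    std::array<float, hiddenSize> hiddenWeights { 0.5f, -0.25f, 1.0f, 0.75f };
    std::array<float, hiddenSize> hiddenBias { 0.0f, 0.1f, -0.1f, 0.0f };
    std::array<float, hiddenSize> outputWeights { 0.4f, 0.3f, -0.2f, 0.1f };
    float outputBias = 0.0f;

    // Inference for one audio sample: a fixed amount of arithmetic,
    // with no training step and no memory allocation.
    float forward (float x) const noexcept
    {
        float y = outputBias;
        for (std::size_t i = 0; i < hiddenSize; ++i)
            y += outputWeights[i] * std::tanh (hiddenWeights[i] * x + hiddenBias[i]);
        return y;
    }
};

// Runs inference over a block of audio in place, one sample at a time.
inline void processBlock (const TinyNetwork& net, float* buffer, std::size_t numSamples) noexcept
{
    assert (buffer != nullptr || numSamples == 0);
    for (std::size_t n = 0; n < numSamples; ++n)
        buffer[n] = net.forward (buffer[n]);
}
```

In an audio plugin, a function like `processBlock` would be called from the audio callback once per buffer, so the cost of `forward` multiplied by the sample rate has to fit comfortably inside real time.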

Why not use an existing library?

Major libraries like PyTorch and TensorFlow do have support for running real-time inference in C/C++. However, these libraries have two main issues.

  • Performance:
    Major neural network libraries are typically designed for tasks like object detection, or language processing, which operate on much slower time-scales than real-time audio. Additionally, the libraries are often tuned to perform well for larger neural networks, but may not perform optimally for smaller networks.
  • Real-Time Safety:
    In audio programming we never want to allocate memory on the audio thread, because it could take an unbounded length of time, causing audible glitches. Major neural network libraries don't follow this rule.
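The real-time safety rule can be sketched as a two-phase pattern: allocate everything in a setup call made off the audio thread, then touch only that pre-allocated memory in the audio callback. The `ScratchProcessor` class and its `prepare`/`process` method names below are hypothetical and chosen only for illustration; the processing itself is a simple two-sample average.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical processor illustrating the "allocate up front" pattern.
class ScratchProcessor
{
public:
    // Call from a non-real-time thread, e.g. when the block size is known.
    // This is the only method that allocates memory.
    void prepare (std::size_t maxBlockSize)
    {
        scratch.assign (maxBlockSize, 0.0f);
        previousInput = 0.0f;
    }

    // Call from the audio thread. It only reads and writes memory that was
    // reserved in prepare(), so it does not allocate or take locks.
    void process (float* buffer, std::size_t numSamples) noexcept
    {
        assert (numSamples <= scratch.size());

        // Copy the input into the pre-allocated scratch buffer.
        for (std::size_t n = 0; n < numSamples; ++n)
            scratch[n] = buffer[n];

        // Write the average of each sample and the one before it.
        for (std::size_t n = 0; n < numSamples; ++n)
        {
            buffer[n] = 0.5f * (scratch[n] + previousInput);
            previousInput = scratch[n];
        }
    }

private:
    std::vector<float> scratch;
    float previousInput = 0.0f;
};
```

The design choice is that `process` has a bounded, predictable cost: if a block larger than the prepared size ever arrived, the right response would be to call `prepare` again from a non-real-time thread rather than to resize inside the callback.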

Introducing RTNeural

RTNeural is a C++ library designed to perform real-time audio-rate inferencing for pre-trained neural networks. The main design goals for the library were:

  • Speed: RTNeural needs to be able to run small networks well above the real-time audio threshold.
  • Accuracy: RTNeural needs to maintain the accuracy of the trained network.
  • Flexibility: RTNeural should work on a variety of computing architectures and processors.
  • Ease of use: RTNeural should make it easy to go from training your network to using it in an audio effect.
  • Real-Time Safety: RTNeural should explicitly define code that is safe to call on the audio thread.

RTNeural: Speed

Even at its slowest settings, RTNeural performs faster than the PyTorch C++ API for most small layer sizes.

RTNeural: Flexibility

RTNeural supports four backends, to allow for maximum flexibility.

  • XSIMD: A fast SIMD library. Optimally fast for smaller networks.
  • Eigen: An optimised linear algebra library. Fast even for larger networks.
  • Accelerate: A fast math library for Apple devices.
  • STL: The C++ standard library, can compile on any device.

RTNeural: Ease of use

RTNeural contains a simple script for exporting the neural network weights from a trained model to a JSON file. The JSON file can then be loaded by RTNeural at runtime.

RTNeural in the wild

RTNeural is already being used in real-time audio plugins that are available now!

ChowCentaur is a guitar distortion effect plugin modelled after the famed Klon Centaur pedal. The plugin contains a "Neural" mode that uses a recurrent neural network.

CHOWTapeModel is a physical model of reel-to-reel analog tape. One of the processing modes uses a State Transition Network to help solve the Jiles-Atherton equations that describe magnetic hysteresis.