Digital Waveguides

In the mid 1980s, however, with the advent of digital waveguide methods [209] due to Julius Smith, all this changed. These algorithms, with their roots in digital filter design and scattering theory, and closely allied to wave digital filters [81], offered a convenient solution to the problem of computational expense for a certain class of musical instrument, in particular, those whose vibrating parts could be modelled as one-dimensional linear media described, to a first approximation, by the wave equation. Among these may be included many stringed instruments, as well as most woodwind and brass instruments. In essence, the idea is very simple: the motion of such a medium may be modelled as two travelling non-interacting waves, and in the digital simulation, this is dealt with elegantly by using two "directional" delay lines, which require no computer arithmetic at all! See Figure 1.9. Digital waveguide techniques also yielded a lucrative patent, formed the basis for at least one commercial synthesizer (the Yamaha VL1), and serve as modular components in many of the increasingly common software synthesis packages (such as MAX/MSP [248], STK [58], and csound [36]). Now, some twenty years on, they are considered the state of the art in physical modelling synthesis, and the basic design has been complemented by a great number of variations intended to deal with more realistic effects (discussed below), usually through more advanced digital filtering blocks. In general, digital waveguides will not be covered in this book, mainly because there already exists a large literature on this topic, as well as a comprehensive and constantly growing monograph by Smith himself [209]. The relationship between digital waveguides and more standard time domain numerical methods has been addressed by various authors [208,119,22], and will be revisited in some detail in Section 6.2.9. A succinct overview is given in [205] and [175].

The path to the invention of digital waveguides is an interesting one, and is worth elaborating here. In approximately 1983, Karplus and Strong [122] developed an efficient algorithm for generating musical tones strongly resembling string tones, which was almost immediately noticed and subsequently extended by Jaffe and Smith [109]. The Karplus-Strong structure is no more than a delay line, or wavetable, in a feedback configuration, in which data is recirculated; generally, the delay line is initialized with random numbers, and is terminated with a low order digital filter, usually with a low-pass characteristic; see Figure 1.8. Tones produced in this way are spectrally rich, and exhibit a decay which is indeed characteristic of plucked string tones, due to the terminating filter. The pitch is determined by the delay-line length and the sample rate: generally, for an $N$-sample delay line, as pictured in Figure 1.8, with an audio sample rate of $f_{s}$ Hz, the pitch of the tone produced will be at $f_{s}/N$, though this may be modified through interpolation, just as in the case of wavetable synthesis. In all, the only operations required in a computer implementation are the digital filter additions and multiplications, and the shifting of data in the delay line. The computational cost is on the order of that of a single oscillator, yet instead of producing a single frequency, Karplus-Strong yields an entire harmonic series. The Karplus-Strong plucked string synthesis algorithm is an abstract synthesis technique, in that in its original formulation, though the sound produced behaved perceptually like plucked string sounds, there was no immediate physical interpretation offered.
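As a minimal sketch of the recirculating structure just described (not the original authors' code), the loop below initializes an $N$-sample delay line with random values and feeds it back through a two-point averaging filter, one common choice of low-pass terminating filter. The function name `karplus_strong`, the random seed, and the values of `fs` and `N` are illustrative assumptions.

```python
import random


def karplus_strong(n_delay, n_samples, seed=0):
    """Return n_samples of a plucked-string-like tone from an n_delay-sample loop."""
    rng = random.Random(seed)
    # Initialize the delay line (wavetable) with random values in [-1, 1].
    line = [rng.uniform(-1.0, 1.0) for _ in range(n_delay)]
    out = []
    pos = 0      # read/write position in the circular delay line
    prev = 0.0   # previous delay-line output, i.e. y[n - N - 1]
    for _ in range(n_samples):
        delayed = line[pos]              # y[n - N]
        new = 0.5 * (delayed + prev)     # two-point average: low-pass loop filter
        prev = delayed
        line[pos] = new                  # recirculate the filtered value
        out.append(new)
        pos = (pos + 1) % n_delay
    return out


fs = 44100   # assumed audio sample rate in Hz
N = 100      # assumed delay-line length, giving a pitch near fs / N
tone = karplus_strong(N, 2 * fs)
```

Each output sample costs one addition, one multiplication, and one buffer write, in line with the cost estimate above. Because the averaging filter itself contributes about half a sample of delay, the fundamental of this particular sketch lies near $f_{s}/(N+1/2)$ rather than exactly $f_{s}/N$.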

**Figure 1.8:** *The Karplus-Strong plucked string synthesis algorithm. An $N$-sample delay line is initialized with random values, which are allowed to recirculate, while undergoing a filtering operation.*

There are two important conceptual steps leading from the Karplus-Strong algorithm to a digital waveguide structure. The first is to associate a spatial position with the values in the wavetable--in other words, a wavetable has a given physical length. The other is to show that the values propagated in the delay lines behave as individual traveling wave solutions to the 1D wave equation; only their sum is a physical variable (such as displacement, or pressure, etc.). The link between the Karplus-Strong algorithm and digital waveguide synthesis, especially in the "single-delay-loop" form, is elaborated by Karjalainen et al. [121]. Excitation elements, such as bows, hammer interactions, reeds, etc., are usually modelled as lumped, and are connected to waveguides via scattering junctions, which are, essentially, power-conserving matrix operations (more will be said about scattering methods in the next section).

**Figure 1.9:** The solution to the 1D wave equation, (a), may be decomposed into a pair of traveling wave solutions, (b), which move to the left and right at a constant speed determined by the system under consideration. This constant speed of propagation leads immediately to a discrete time implementation employing delay lines, as shown in (c).
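The travelling wave decomposition can be illustrated with a small sketch of an ideal, lossless string held fixed at both ends. It assumes zero initial velocity, so that the initial displacement splits equally between the two waves, and it models each termination as a sign-inverting reflection. The helper names `make_string`, `step`, and `displacement` are hypothetical and chosen only for this example.

```python
from collections import deque


def make_string(shape):
    """Split an initial displacement (zero initial velocity) into two travelling waves."""
    right = deque(0.5 * u for u in shape)  # right-going wave, indexed by position
    left = deque(0.5 * u for u in shape)   # left-going wave, indexed by position
    return right, left


def step(right, left):
    """Advance one sample: shift both delay lines, reflecting with inversion at the ends."""
    leaving_right_end = right.pop()     # right-going sample reaching the right end
    leaving_left_end = left.popleft()   # left-going sample reaching the left end
    right.appendleft(-leaving_left_end)  # inverted reflection re-enters at the left
    left.append(-leaving_right_end)      # inverted reflection re-enters at the right


def displacement(right, left):
    """Only the sum of the two wave variables is the physical displacement."""
    return [r + l for r, l in zip(right, left)]


M = 16  # assumed number of spatial samples along the string
pluck = [min(i, M - 1 - i) / (M // 2) for i in range(M)]  # symmetric triangular shape
right, left = make_string(pluck)
for _ in range(2 * M):
    step(right, left)
after_one_period = displacement(right, left)
```

Apart from the sign changes at the terminations, `step` only moves data. After `2 * M` steps each sample has travelled once around the loop, so the string returns to its initial shape, which corresponds to a period of `2 * M` samples. In this arrangement the fixed points effectively sit half a sample beyond the first and last grid positions.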

Waveguide models have been successfully applied to a multitude of systems; several representative configurations are shown in Figure 1.10.

String vibration has seen perhaps the most activity, likely owing to the relationship between waveguides and the Karplus-Strong algorithm. As shown in Figure 1.10(a), the basic picture is of a pair of waveguides separated by a scattering junction connecting to an excitation mechanism, such as a hammer or plectrum; at either end, the structure is terminated by digital filters which model boundary terminations, or potentially coupling to a resonator or other strings. The output is generally read from a point along the waveguide, through a sum of wave variables traveling in opposite directions. Early work was due to Smith [208] and others. In recent years, the Acoustics group at the Helsinki University of Technology has systematically tackled a large variety of stringed instruments using digital waveguides, yielding sound synthesis of extremely high quality. Some of the target instruments have been standard keyboard instruments such as the harpsichord [231] and clavichord [229], but others, such as the Finnish kantele [75,160], are more exotic. There has also been a good deal of work on the extension of digital waveguides to deal with the particular "tension-modulation," or pitch-glide, nonlinearity in string vibration [232,74,220], a topic which will be taken up in great detail in §8.1. More recent related areas of activity have included so-called banded waveguides [76,77], which are designed to deal efficiently with systems with a high degree of inharmonicity; commuted synthesis techniques [206,120], which allow for the interconnection of string models with harder-to-model resonators, through the introduction of sampled impulse responses; and the association of digital waveguide methods with underlying PDE models of strings [17].

Woodwind and brass instruments are also well-modelled by digital waveguides; a typical waveguide configuration is shown in Figure 1.10(b), where a digital waveguide is broken up by scattering junctions connected to models of (in the case of woodwind instruments) toneholes. At one end, the waveguide is connected to an excitation mechanism (such as a lip or reed model), and at the other end, output is taken after processing by a filter representing bell and radiation effects. Early work was carried out by Smith, for reed instruments [202], and for brass instruments by Cook [54]. Work on tone hole modeling has appeared [193,71], sometimes involving wave digital filter implementations [239], and efficient digital waveguide models for conical bores have also been developed [204,227].

Vocal tract modeling using digital waveguides was first approached by Cook [53,55]; see Figure 1.10(c). Here, due to the spatial variation of the cross-sectional area of the vocal tract, multiple waveguide segments, separated by scattering junctions, are necessary. The model is driven at one end by a glottal model, and output is taken from the other end after filtering to simulate radiation effects. Such a model is reminiscent of the Kelly-Lochbaum speech synthesis model [124], which in fact predates waveguides altogether and can be calibrated using linear predictive techniques [167], as well as of wave digital speech synthesis models [211].
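The scattering between adjacent tube segments can be sketched as follows, assuming pressure wave variables and the usual two-port relations for a junction between lossless cylindrical tubes of different cross-sectional area. The function names and the area values are hypothetical and serve only as an illustration.

```python
def reflection_coefficient(area_left, area_right):
    """Pressure-wave reflection coefficient seen by a wave arriving from the left tube."""
    return (area_left - area_right) / (area_left + area_right)


def scatter(p_in_left, p_in_right, k):
    """Scatter pressure waves at the junction between two tube segments.

    p_in_left:  wave arriving from the left tube (travelling right)
    p_in_right: wave arriving from the right tube (travelling left)
    Returns (p_out_left, p_out_right): the waves leaving the junction
    back into the left tube and onward into the right tube.
    """
    p_out_left = k * p_in_left + (1.0 - k) * p_in_right
    p_out_right = (1.0 + k) * p_in_left - k * p_in_right
    return p_out_left, p_out_right


# Hypothetical area function (arbitrary units), one value per tube segment.
areas = [1.0, 3.0, 2.0, 0.5]
coefficients = [reflection_coefficient(a, b) for a, b in zip(areas, areas[1:])]

# A unit wave arriving from the first segment, with nothing arriving from the second.
reflected, transmitted = scatter(1.0, 0.0, coefficients[0])
```

Since the characteristic admittance of each tube is proportional to its area, the area-weighted sum of squared incoming waves equals that of the outgoing waves, which is the power-conserving property of a scattering junction.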

Networks of digital waveguides have also been used in a quasi-physical manner in order to effect artificial reverberation; in fact, this was one of the original applications of the technique [201]. In this case, a collection of waveguides of varying impedances and delay lengths is used; such a network is shown in Figure 1.10(d). Such networks are passive, so that signal energy injected into the network from a dry source signal will produce an output whose amplitude will gradually attenuate, with frequency-dependent decay times dependent on the delays and immittances of the various waveguides; some of the delay lengths can be interpreted as implementing "early reflections" [201]. Such networks provide a cheap and stable way of generating rich impulse responses. Generalizations of waveguide networks to feedback delay networks (FDNs) [178] and circulant delay networks [180] have also been explored, also with an eye towards applications in digital reverberation. When a waveguide network is constructed in a regular arrangement, in two or three spatial dimensions, it is often referred to as a waveguide mesh [235,236,237,22]; see Figure 1.10(e). In 2D, such structures are useful in modelling the behaviour of membranes, and in 3D, potentially for full-scale room modelling (i.e., for artificial reverberation), though real-time implementations of such techniques are probably decades away. Some work on the use of waveguide meshes for the calculation of room impulse responses has appeared recently [14].
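To make the passivity argument concrete, here is a rough sketch of a very small waveguide network: several waveguides of different admittances and lengths meet at one lossless junction, and each is terminated at its far end by a slightly lossy reflection. This single-junction topology, the function name `waveguide_reverb`, and all parameter values are assumptions made for illustration; they are not taken from [201].

```python
from collections import deque


def waveguide_reverb(x, delays, admittances, g=0.97):
    """Filter the signal x through a single-junction waveguide network.

    delays:      one-way length, in samples, of each waveguide branch
    admittances: characteristic admittance of each branch
    g:           far-end reflection gain; |g| < 1 makes the network lossy
    """
    # Each branch stores one full round trip (out and back), initially silent.
    lines = [deque([0.0] * (2 * d)) for d in delays]
    y_total = sum(admittances)
    out = []
    for sample in x:
        # Waves returning to the junction, attenuated by the far-end reflection.
        incoming = [g * line.popleft() for line in lines]
        incoming[0] += sample  # inject the dry input into the first branch
        # Lossless junction: the junction pressure is common to all branches.
        p_j = 2.0 * sum(y * p for y, p in zip(admittances, incoming)) / y_total
        for line, p in zip(lines, incoming):
            line.append(p_j - p)  # outgoing wave sent back into each branch
        out.append(p_j)
    return out


impulse = [1.0] + [0.0] * 4999
ir = waveguide_reverb(impulse, delays=[7, 11, 13, 17], admittances=[1.0, 2.0, 3.0, 4.0])
```

The junction neither creates nor removes admittance-weighted signal power, so the only losses are the reflection gains `g`; any choice with `|g| < 1` therefore yields a decaying impulse response, whatever the delays and admittances. Replacing the scalar `g` by a low-pass filter would be one way to obtain frequency-dependent decay times.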

**Figure 1.10:** Typical digital waveguide configurations for musical sound synthesis. In all cases, boxes marked ${\bf S}$ represent scattering operations. (a) A simple waveguide string model, involving an excitation at a point along the string and terminating filters, and output read from a point along the string length, (b) a woodwind model, with scattering at tonehole junctions, input from a reed model at the left end, and output read from the right end, (c) a similar vocal tract configuration, involving scattering at junctions between adjacent tube segments of differing cross-sectional areas, (d) an unstructured digital waveguide network, suitable for quasi-physical artificial reverberation, and (e) a regular waveguide mesh, modelling wave propagation in a 2D structure such as a membrane.