Up to now, all the synthesis methods we have examined have been attempts to reproduce (or model) the waveform of a particular instrument or sound. The technique of physical modeling (also sometimes referred to as "Waveguide Synthesis") is however fundamentally different. With physical modeling it is the actual physics of the instrument and its playing technique which are modelled by the computer. Typically, a physical model instrument takes the form of a recursive algorithm in which the contents of a delay line (the length of which corresponds to the number of samples required to produce one period of waveform at some desired pitch) are subject to some kind of modification process on each cycle of the recursion. The initial contents of the delay line, and the details of the modification process play a key role in determining the character of the sound produced, and every effort is made to bring these into line with the way the acoustics of a given instrument really operates. This handout takes a look at the one of the earliest and simplest examples of a physical model - the Karplus-Strong "Pluck-String" algorithm - and also Perry Cook's "Slide Flute".
The results of physical modeling can be surprisingly natural and life-like, as it is possible to give an instrument some notion of its own current state. So, for example, like a real musician playing a real instrument, a good physical model will never produce exactly the same waveform twice. Finally, Perry Cook has written a number of graphical implementations of his various physical model instruments (including Slide-Flute). In /dist/PRCStuff, these are: Clarinot (a virtual clarinet!) HosePlayer (a generic brass instrument), and Slide-Flute. I recommend you play around with these a little to get a feel for the capabilities of physical modeling.
Looking for new and computationally efficient methods of sound synthesis, Alex Strong (a graduate of the CS dept. here at Stanford) was - I believe - the first to find this new synthesis method. Essentially, the algorithm could be described as a variation on wavetable synthesis, an important distinction, however, being that the wavetable data is not kept constant, but subject to modification on each period.
Here is a block diagram of the algorithm:
As can be seen, the modification process involves taking the average of the current and last output values. The initial contents of the delay line is `white noise' 1 ie. random numbers between -1 and +1. Clearly, since the length of the delay line is set to one period's worth of samples, some kind of periodic waveform (albeit one with a lot of high frequency energy) will result. That the recursive averaging process makes this into a plucked-string sound however, is non-obvious, and remarkable in its simplicity of design.
(definstrument plucked-string-k-s (start-time duration frequency amplitude) (let* ((beg (floor (* start-time sampling-rate))) (end (+ beg (floor (* duration sampling-rate)))) ;; find the number of samples needed for one period of waveform ;; given the current sampling-rate and desired frequency (period-samples (floor (/ sampling-rate frequency))) ;; the delay line is implemented here as an array. Since we ;; need to set its initial state to `white noise', it will be ;; more convenient to manage this way (string (make-array period-samples)) ;; some variables needed for the run loop (cnt 0) (last-output 0) (current-output 0)) ;; iterate through each element in the array (ie. for each sample ;; in the initial state of the delay line), and set its value ;; to a random number between -1.0 and +1.0 (loop for i from 0 to (1- period-samples) do (setf (aref string i) (- 1.0 (random 2.0)))) (Run (loop for i from beg to end do ;; get the current output from the delay line array (setf current-output (aref string cnt)) ;; put the average of current-output and last-output into ;; the start of the delay line (setf (aref string cnt) (/ (+ current-output last-output) 2)) (outa i (* amplitude current-output)) ;; a delay of 1 sample is best handled by the simple setting of ;; last-ouput to current-ouput, before the next value for ;; current-output is computed (setf last-output current-output) ;; cnt is a pointer to the current position we need to read from ;; along the delay line (array). On each iteration, it is incremented, ;; and when it gets to the end of the array it is set to zero, ;; and thus goes back to the beginning again (setf cnt (if (= (1- period-samples) cnt) 0 (+ 1 cnt)))))))
1. A 220 Hz plucked string:
(with-sound () (plucked-string-k-s 0 5 220 0.5))
2. Notice (in case you haven't already) that there is no amplitude envelope to this instrument. Thus, in this form, we have no direct control over the decay rate of the string. Setting the duration too short will result in an abrupt cut-off of the sound:
(with-sound () (plucked-string-k-s 0 20 220 0.5))
A simple solution to this is to include a dampening factor in the loop. This should be a number very slightly less than 1. Here is a re-write of the instrument:
(definstrument plucked-string-k-s-2 (start-time duration frequency amplitude &optional (damper 0.99)) (let* ((beg (floor (* start-time sampling-rate))) (end (+ beg (floor (* duration sampling-rate)))) (period-samples (floor (/ sampling-rate frequency))) (string (make-array period-samples)) (cnt 0) (last-output 0) (current-output 0)) (loop for i from 0 to (1- period-samples) do (setf (aref string i) (- 1.0 (random 2.0)))) (Run (loop for i from beg to end do (setf current-output (aref string cnt)) ;; the input to the delay line is further mutiplied by the ;; dampening factor. (setf (aref string cnt) (* damper (/ (+ current-output last-output) 2))) (outa i (* amplitude current-output)) (setf last-output current-output) (setf cnt (if (= (1- period-samples) cnt) 0 (+ 1 cnt)))))))
1. With default damper value:
(with-sound () (plucked-string-k-s-2 0 1 220 0.5))
2. With more extreme damper value:
(NB the value doesn't have to be reduced by that much to produce a marked change in decay time)
(with-sound () (plucked-string-k-s-2 0 1 220 0.5 0.9))
The Jaffe-Smith extensions solve a number of inherent problems with the algorithm. One of these involves a way of achieving greater control over the decay time of the string, another - and probably the most significant one - is concerned with certain tuning problems. For a full discussion of these see the relevant paper - cited at the end of this handout. The version of the pluck instrument which appears in "ins.lisp" is the Jaffe-Smith extension, however (in my opinion), it is not too well documented, particularly with respect to the `helper' functions that precede the instrument. Reference to the Jaffe-Smith paper should help to clarify things (perhaps this is something that would be make suitable project material!).
In the meantime, here is a brief overview before I refer you to the relevant section from "ins.lisp". (To help you find the appropriate section of this fairly long file more easily, put "PLUCK:" into the finder panel, and hit carriage return).
Basically, to improve the tuning capability of the model, an additional filter is introduced into the loop which contributes a small amount of extra delay.
The equation for this filter is:
yn = C xn + x(n-1) - Cy(n-1)
C is the filter coefficient;
xn is the current value;
x(n-1) is the previous value;
yn is the current output of the filter; and
y(n-1) is previous output of the filter.
The functions "getOptimumC" and "tuneIt" deal with this additional filter: the former (as its name suggests) finds an optimal value for the filters single coefficient, C; and the latter implements the filter itself.
This is one of numerous physical model instruments designed by Perry Cook. See the relevant paper - cited below - for a description of this plus several other instruments.
Here is a block diagram of the instrument:
The length of the flute bore delay is one period's worth of samples, and the length of the embouchure delay line is half one period's worth of samples (note that the delay lines are not drawn to scale). The noise source is the usual unit generator noise (ie. nothing "physical" about it). The cubic polynomial (x - x ³ ) models the interaction between energy bouncing back from the end of the flute bore with incoming breath pressure. The low-pass filter models the way that high frequencies are more easily attenuated in the system. Notice that the output from the main delay line is fed back into the system in two places.
;; firstly a helper function (only its really a macro because the RUN loop ;; can't handle functions, but doesn't mind macros) to implement the ;; cubic-polynomial (defmacro cubic-polynomial (x) `(+ ,x (* -1 ,x ,x ,x))) (definstrument slide-flute (start-time duration freq flow &key (flow-envelope '(0 0 5 1 95 1 100 0)) ;; scaler for amount of breath noise (noise 0.0356) ;; embouchure delay line size as a proportion of the ;; main flute bore delay line (embouchure-size 0.5) ;; scalers for the two feedbacks from the delay line (fbk-scl1 0.5) (fbk-scl2 0.55) (degree 0) (distance 0) (reverb 0) (out-scl 1.0) ;; low-pass filter coefficients (a0 0.7) (b1 -0.3)) (let* ((beg (floor (* start-time sampling-rate))) (end (+ beg (floor (* duration sampling-rate)))) ;; envelope function for incoming breath pressure -- flow (flow-function (make-env :envelope flow-envelope :scaler flow :start-time start-time :duration duration)) ;; noise source for breath noise (breath (make-randh :frequency (/ sampling-rate 2) :amplitude 1)) ;; find how many samples are needed for one period (as in plucked string) (period-samples (floor (/ sampling-rate freq))) ;; computation of the length of the embouchure delay line (embouchure-samples (floor (* embouchure-size period-samples))) ;; next two lines set up structures for the embouchure and ;; bore delay lines respectively (embouchure (make-delay embouchure-samples)) (bore (make-delay period-samples)) ;; structure for the low-pass filter (reflection-lowpass-filter (make-one-pole a0 b1)) (loc (make-locsig :degree degree :distance distance :revscale reverb)) ;; variables needed for the RUN loop (sum1 0) (sum2 0) (sum3 0) (out-sig 0) (delay-sig 0) (emb-sig 0) (current-flow 0) (cnt 1)) (Run (loop for i from beg to end do ;; DELAY-SIG is the output from the flute bore ;; ie. delayed OUT-SIG. This line at once ;; gets the next value from the delay line, and puts in ;; the to-be-delayed input. (setf delay-sig (delay bore out-sig)) ;; CURRENT-FLOW is the enveloped breath pressure (setf current-flow (env flow-function)) ;; SUM1 is the sum of the breath-induced noise component with ;; the CURRENT-FLOW (setf sum1 (+ current-flow (* noise (* current-flow (randh breath))))) ;; SUM2 is SUM1 plus feedback from the flute bore (scaled by fbk-scl1) (setf sum2 (+ sum1 (* fbk-scl1 delay-sig))) ;; EMB-SIG is the output from the embouchure delay line. ;; with SUM2 as its input (setf emb-sig (delay embouchure sum2)) ;; SUM3 is the application of the cubic-polynomial ;; to the output of the embouchure delay line, plus ;; input feedback from the bore (scaled by fbk-scl2) (setf sum3 (+ (cubic-polynomial emb-sig) (* fbk-scl2 delay-sig))) ;; out-sig is all the above passed through the low-pass filter ;; notice that the input to the filter is the second argument ;; to ONE-POLE (setf out-sig (one-pole reflection-lowpass-filter sum3)) ;; finally the OUT-SIG value is scaled by OUT-SCL, which provides ;; some control over the possibility of clipping (locsig loc i (* out-scl out-sig)) ;; this line is a simple utility to make lisp print out a count ;; after each second of sound has been computed (when (= i (* sampling-rate cnt)) (print cnt) (incf cnt))))))
1. With default values:
(with-sound (:channels 2) (slide-flute 0 3 440 0.55))
2. With more noise:
(with-sound (:channels 2) (slide-flute 0 3 440 0.55 :noise 0.02))
3. With embouchure delay line set so quarter the length of the main delay line
(with-sound (:channels 2) (slide-flute 0 3 440 0.55 :embouchure-size 0.145989))
[produces tone one octave higher. non-integer fractions of the main delay can produce multi-phonic effects]
4. Notice what happens when two successive calls are made
(with-sound (:channels 2) (slide-flute 0 3 440 0.55) (slide-flute 3.5 6 440 0.55))
ie. a glitch! -- this is caused by an unfortunate DC offset, a common problem with physical modeling.
These add some new features (not particularly "physical" in thier implementaion) such as vibrato and a `stereo' effect, as well as solving the problem of the DC offset.
;; macro to implement the DC-blocker: a one pole/one zero filter ;; with the equation: ym = (xm - xm-1) + (0.995 * ym-1) (defmacro dc-blocker (curr-excit prev-excit prev-val) `(+ (- ,curr-excit ,prev-excit) (* 0.995 ,prev-val))) (definstrument slide-flute2 (start-time duration freq flow &key (flow-envelope '(0 0 10 1 90 1 100 0)) ;; some additional decay time is added to ensure ;; the instrument has decayed fully by the duration ;; of the computed soundfile (decay 0.01) (noise 0.0356) (embouchure-size 0.5) (fbk-scl1 0.5) (fbk-scl2 0.55) ;; offset-pos is a secondary position ;; (between 0 and 1) from which to tap the delay ;; line for the other channel of a stereo signal (offset-pos 0.764264) (out-scl 1.0) (a0 0.7) (b1 -0.3) ;; traditional approach to vibrato (vib-rate 5) (vib-amount 0.03) (ran-rate 5) (ran-amount 0.03)) (let* ((beg (floor (* start-time sampling-rate))) (end (+ beg (floor (* duration sampling-rate)))) (flow-function (make-env :envelope flow-envelope :scaler flow :start-time start-time ;; specification that the flow envelope ends at DURATION ;; less the DECAY argument, see above :duration (- duration decay))) ;; setting up structures for random and periodic vibrato (periodic-vibrato (make-oscil :frequency vib-rate)) (random-vibrato (make-randi :frequency ran-rate)) ;; next 5 lines identical to slide-flute above (breath (make-randh :frequency (/ sampling-rate 2) :amplitude 1)) (period-samples (floor (/ sampling-rate freq))) (embouchure-samples (floor (* embouchure-size period-samples))) (embouchure (make-delay embouchure-samples)) (bore (make-delay period-samples :initial-element 0.0)) ;; given the value of offset-pos, this line computes the exact ;; index along the bore delay line for the secondary ;; tap -- providing a kind stereo microphone effect (offset (floor (* period-samples offset-pos))) ;; low-pass filter structure, as above (reflection-lowpass-filter (make-one-pole a0 b1)) ;; numerous variables need for the RUN loop (current-excitation 0) (current-difference 0) (current-flow 0) (out-sig 0) (tap-sig 0) (previous-out-sig 0) (previous-tap-sig 0) (dc-blocked-a 0) (dc-blocked-b 0) (previous-dc-blocked-a 0) (previous-dc-blocked-b 0) (delay-sig 0) (emb-sig 0) (cnt 1)) (Run (loop for i from beg to end do ;; first two lines as above (setf delay-sig (delay bore out-sig)) (setf emb-sig (delay embouchure current-difference)) ;; vibrato (random and periodic) is now applied to the output from ;; the enveloped input flow -- ie incoming breath pressure (setf current-flow (+ (* vib-amount (oscil periodic-vibrato)) (* ran-amount (randi random-vibrato)) (env flow-function))) ;; current-difference, current-excitation, and out-sig ;; are exactly as above (setf current-difference (+ (+ current-flow (* noise (* current-flow (randh breath)))) (* fbk-scl1 delay-sig))) (setf current-excitation (cubic-polynomial emb-sig)) (setf out-sig (one-pole reflection-lowpass-filter (+ current-excitation (* fbk-scl2 delay-sig)))) ;; tap-sig is the secondary output from the bore used ;; for the `stereo mike' effect (setf tap-sig (tap bore offset)) ;; DC blocking of the signal for both channels. ;; NB the DC blocker is not in the cicuit. It is applied to the out-sig ;; but the result is not fed back into the system. (setf dc-blocked-a (dc-blocker out-sig previous-out-sig previous-dc-blocked-a)) (setf dc-blocked-b (dc-blocker tap-sig previous-tap-sig previous-dc-blocked-b)) (outa i (* out-scl dc-blocked-a)) (outb i (* out-scl dc-blocked-b)) ;; updating of DC blocker variables for next iteration (setf previous-out-sig out-sig previous-dc-blocked-a dc-blocked-a) (setf previous-tap-sig tap-sig previous-dc-blocked-b dc-blocked-b) (when (= i (* sampling-rate cnt)) (print cnt) (incf cnt))))))
1. With default values:
(with-sound (:channels 2) (slide-flute2 0 3 440 0.55))
2. Two successive notes:
(with-sound (:channels 2) (slide-flute2 0 3 440 0.55) (slide-flute2 3 3 440 0.55))
3. Reading the secondary tap from the center of the flute bore can cause cancellation effects:
(with-sound (:channels 2) (slide-flute2 0 3 440 0.55 :offset-pos 0.5))
Karplus, Kevin, and Alex Strong, "Digital Synthesis of Plucked-String and Drum Timbres" in Computer Music Journal, Vol. 7, No. 2, Summer 1983. Reprinted in the Music Machine (MIT Press).
Jaffe, David, and Julius Smith, "Extensions of the Karplus-Strong Plucked-String Algorithm" in Computer Music Journal, Vol. 7, No. 2, Summer 1983. Reprinted in the Music Machine (MIT Press).
Cook, Perry, "A Meta-Wind-Instrument Physical Model, and a Meta-Controller for Real Time Performance Control", Proceedings of the ICMC, 1992.