Sonification of Hyperspectral Data
1. Various Vocal Models for Data Sonification
This document discusses a few popular synthesis methods for voice simulation and presents their implementations on different platforms. In addition, hyperspectral data sonification using these vocal models is exemplified.
2. Data Sonification Using a 2-Dimensional Digital Waveguide Mesh
Digital waveguide techniques have been used to develop efficient physical models of musical instruments since the early 1990s [Smith 1987, Smith III 2003, Van Duyne and Smith 1993a, Van Duyne and Smith 1993b]. By simulating the traveling waves with digital delay lines, the digital waveguide model can reduce the computational cost of physical models based on numerical integration of the wave equation by three orders of magnitude.
The one-dimensional wave equation is solved by the sum of two arbitrary traveling waves, and may be implemented in the digital domain with a pair of bi-directional delay lines. This structure is known as the digital waveguide. While each traveling wave propagates independently of the other, the physical wave amplitude at any point may be obtained by summing the two traveling waves.
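This traveling-wave decomposition can be sketched in a few lines of code. The following Python fragment is a minimal illustration, not the original implementation (which was in Matlab): the function name, the equal split of the initial displacement between the two rails, and the ideal phase-inverting (rigid) terminations are all our assumptions.

```python
import numpy as np

def waveguide_1d(initial_shape, n_steps):
    """Ideal 1-D digital waveguide sketch: two delay lines carry the
    right- and left-going traveling waves; the physical displacement at
    any point is their sum. Rigid, phase-inverting terminations are
    assumed at both ends."""
    # Split the initial displacement equally between the two rails.
    right = np.asarray(initial_shape, dtype=float) / 2.0
    left = right.copy()
    outputs = []
    for _ in range(n_steps):
        # Physical displacement = sum of the two traveling waves.
        outputs.append(right + left)
        new_right = np.empty_like(right)
        new_left = np.empty_like(left)
        # Propagate each rail by one sample in its travel direction.
        new_right[1:] = right[:-1]
        new_left[:-1] = left[1:]
        # Phase-inverting reflections at the rigid terminations.
        new_right[0] = -left[0]
        new_left[-1] = -right[-1]
        right, left = new_right, new_left
    return np.array(outputs)
```

Reading the summed rails at one observation point over time yields the output signal of the waveguide.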
The one-dimensional digital waveguide
shown in Figure 1 can be extended into a two-dimensional digital
waveguide mesh [Van Duyne and Smith 1993a, Van Duyne and
Smith 1993b]. The structure of the 2-D digital waveguide
mesh can be
viewed as a layer of parallel vertical waveguides superimposed on a
layer of parallel horizontal waveguides intersecting each other at
4-port scattering junctions as shown in Figure 2.
Figure 1: The 1-D digital waveguide. The upper rail contains right-going waves, and the lower rail contains left-going waves.
Figure 2:
The 2-D digital waveguide mesh.
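One update cycle of such a mesh can be sketched as follows. This Python fragment is an illustrative sketch under simplifying assumptions (equal impedances at every junction, ideal phase-inverting boundary reflections, and a made-up function name); it performs the 4-port scattering and the subsequent propagation between neighboring junctions.

```python
import numpy as np

def mesh_step(incoming):
    """One update of a rectilinear 2-D waveguide mesh with equal
    impedances. `incoming` has shape (4, H, W): the waves arriving at
    each junction on its up/down/left/right ports. Returns the junction
    velocities and the incoming waves for the next time step."""
    U, D, L, R = incoming
    # 4-port scattering junction: velocity = (2/4) * sum of inputs.
    v = 0.5 * (U + D + L + R)
    # Outgoing wave on each port = junction velocity - incoming wave.
    oU, oD, oL, oR = v - U, v - D, v - L, v - R
    nU, nD = np.empty_like(U), np.empty_like(D)
    nL, nR = np.empty_like(L), np.empty_like(R)
    # Interior propagation: a wave leaving a junction upward arrives at
    # the junction in the row above on its *down* port, and so on.
    nD[:-1, :] = oU[1:, :]
    nU[1:, :] = oD[:-1, :]
    nR[:, :-1] = oL[:, 1:]
    nL[:, 1:] = oR[:, :-1]
    # Ideal phase-inverting reflections at the four mesh boundaries.
    nU[0, :] = -oU[0, :]
    nD[-1, :] = -oD[-1, :]
    nL[:, 0] = -oL[:, 0]
    nR[:, -1] = -oR[:, -1]
    return v, np.stack([nU, nD, nL, nR])
```

With equal impedances the scattering matrix is orthogonal, so the interior update is lossless; all damping must come from the boundary filters.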
The 2-D digital waveguide mesh is a very useful model for hyperspectral data sonification because it offers a great deal of flexibility in mapping high-dimensional data to control parameters of the mesh. For instance, each data value can be mapped to the initial excitation condition of one junction in the mesh. That way, we can use N-dimensional data, however large N may be, without discarding any dimension, because there is no physical limitation on the size of the 2-D mesh. That is, all N dimensions of the data can contribute to making sounds.
We have used three different methods to map the data
to the mesh:
(1) Create an N-point mesh for N-dimensional data, and map each value to the initial excitation condition of one point in the mesh. The initial condition may be any type of wave variable: displacement, velocity, or force. Since the 2-D mesh we have used is rectilinear, we can have meshes of different shapes with the same N points. For example, when N = 128, we can have meshes of three different sizes: 2x64, 4x32, or 8x16.
(2) Create a mesh of size NxM, where N
corresponds to the dimension of the data to be sonified, and M can be
arbitrary. Then we use a plane wave as an excitation along the axis of
N points, and map the data to the initial condition of the plane wave.
(3) Create a mesh of size NxM, where N corresponds to the dimension of the data to be sonified, and M can be arbitrary. Instead of mapping the data to an initial excitation condition as before, we map them to the boundary conditions of the mesh. Since one-pole filters are used at the boundaries, we can map the data to control the gain or to move the pole location of the filters. In this case, the initial excitation can be anything: an impulse, a plane wave, or a set of impulses.
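The first mapping method above can be sketched as follows. This is a minimal illustration; the function name and the normalization of the data to [-1, 1] are our assumptions, not part of the method itself.

```python
import numpy as np

def data_to_mesh_excitation(data, shape):
    """Mapping method (1): reshape an N-dimensional data vector into the
    initial excitation (e.g., displacement) of an X-by-Y mesh, where
    X * Y must equal N. Peak normalization is an illustrative choice."""
    data = np.asarray(data, dtype=float)
    x, y = shape
    if x * y != data.size:
        raise ValueError("mesh size must match the data dimension")
    peak = np.max(np.abs(data))
    if peak > 0:
        data = data / peak  # keep the excitation within [-1, 1]
    return data.reshape(x, y)
```

For 128-D data, calling this with shape (2, 64), (4, 32), or (8, 16) yields the three rectilinear meshes mentioned above.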
Each mapping method yields sounds with a different sonority, but it is hard to tell apart sounds generated with the same method, since the quality of the sound produced by the mesh is largely determined by its size rather than by the excitation. We need to develop techniques for controlling the 2-D mesh that produce sounds as perceptually distinct as possible. The following table includes sound examples and Matlab-generated movies that show wave propagation on the mesh using a single data point in 28-D data with the three mapping methods described above.
3. Data Clustering
In the previous sonification method using a 2-D digital waveguide mesh, the data were mapped to control parameters such as the initial point-wise excitation, the initial plane-wave excitation, or the boundary condition of the mesh. While this approach enabled us to map very high-dimensional data without sacrificing any dimensions, the resulting sounds were barely distinguishable: with the pseudo-random initial conditions represented by the data, the size of the mesh dominated the resulting sounds. Furthermore, it was point-wise sonification; that is, a single point in the data set corresponds to one mesh, which not only makes the computational cost very high but also fails to provide a good data-clustering scheme.
This time we took a new approach, focusing on data clustering instead of sonifying every single data point. We create an N-point mesh (with proper width X and height Y, where XxY = N) from N data points, so one mesh can now represent a number of data points, or a data cluster. The major drawback of this method, however, is that we must now reduce the dimension of the data down to a few in the case of the 2-D rectilinear mesh, since a junction in a 2-D mesh has only a couple of control parameters. We chose the four most significant dimensions of the 128-D data and mapped them to the wave impedances of the 2-D mesh, where one junction at time n has four wave impedances, one per branch, i.e., Rx[n], Ry[n], Rx[n+1], and Ry[n+1].
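This impedance mapping can be sketched as follows. The fragment is illustrative only: the affine rescaling and the [0.5, 2.0] impedance range are our assumptions, since the method only requires that every impedance stay positive.

```python
import numpy as np

def impedances_from_data(features, r_min=0.5, r_max=2.0):
    """Map the four most significant dimensions of each data point to
    the four branch impedances of its junction (Rx[n], Ry[n], Rx[n+1],
    Ry[n+1]). Each column (dimension) is rescaled independently into
    the positive range [r_min, r_max]."""
    f = np.asarray(features, dtype=float)  # shape (n_points, 4)
    lo, hi = f.min(axis=0), f.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero
    return r_min + (f - lo) / span * (r_max - r_min)
```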
Using the test data set, for which we know which part is benign and which malignant, we created two meshes, one containing only the benign cell data and the other only the malignant cell data, and used them as references. The two resulting sounds were very easily distinguishable. The next step is to create a composite mesh containing both benign and malignant cell data points.
In our first approach, however, the data were arranged so that the left half of the mesh contains only benign cell data and the right half only malignant cell data (Figure 3).
Figure 3: Composite mesh with both benign and malignant cell data.
After creating the meshes, we used an impulse to excite them and produce sounds. The following are sound examples generated using the above approach with an 8x8 mesh (64 data points). test_B.wav and test_M.wav are sound examples generated using only one type of data (benign or malignant) and are used as references. The test_BM_Bn.wav examples are generated from the composite mesh when we excite a randomly selected point in the left (benign) half of the mesh, and the test_BM_Mn.wav examples likewise for the right (malignant) half.
4. The Game of Battleship: Object Identification
As an example of classification and identification, consider a version of the popular game Battleship in which a player tries to locate objects (ships) on the opponent's hidden grid (the sea) by guessing coordinates. In the variant described here, auditory cues provide more information than the standard response of 'hit' or 'miss'. In the following examples, the ocean surface is represented as a two-dimensional rectilinear waveguide mesh, and a second 2-D mesh of equal size represents the ocean at a particular depth. Timbral segregation between the surface and submerged regions is created by setting distinct boundary conditions for each mesh. At the surface level, the pole location of the boundary filters was set to 0.05, the default setting for a metallic-plate model, while the one-pole filters at the boundaries of the submerged mesh were set to 0.8, the default setting for a wood block.
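The boundary filtering can be sketched as follows. This is a minimal sketch: the exact one-pole recursion and the reflection gain of -0.95 are our assumptions standing in for the default settings quoted above; only the two pole locations come from the text.

```python
def one_pole_boundary(x, pole, gain=-0.95):
    """Reflect a boundary-incident signal through a one-pole lowpass:
        y[n] = gain * (1 - pole) * x[n] + pole * y[n-1]
    A pole near 0.05 gives a bright, 'metallic-plate' reflection; a
    pole near 0.8 gives a duller, 'wood-block' reflection."""
    y, prev = [], 0.0
    for xn in x:
        prev = gain * (1.0 - pole) * xn + pole * prev
        y.append(prev)
    return y
```

Comparing the impulse responses for pole = 0.05 and pole = 0.8 shows the high-frequency content that distinguishes the two sea levels by ear.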
To test our auditory version of the Battleship idea, we prepared the following setup. First, we created a 2-D mesh of size 32x32, which corresponds to the sea at surface level, and placed two ships of arbitrary sizes at arbitrary positions. Then we set the durability of each ship by assigning distinct values to its wave impedance. A ship with higher durability is supposed to have higher wave impedance, which makes sense because waves will not propagate very well where the impedance is high, and will thus have little effect on the ship. We then created a second mesh with different boundary conditions, as mentioned earlier, for the submerged level, and applied the same configuration except for the ship variables: size, location, and durability. Figure 4 shows these setups for each sea level.
(a)
(b)
Figure 4: Basic setup for (a) sea surface level and (b) submerged level. Dark areas indicate ships, and 'X' marks show attack positions. Ships are not scaled correctly.
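The setup of one sea level can be sketched as follows. The function and parameter names, ship positions, sizes, and impedance values below are arbitrary placeholders, not the actual configuration used in the experiments.

```python
import numpy as np

def place_ships(size=32,
                ships=(((5, 5), (3, 8), 1.8),
                       ((20, 12), (6, 2), 3.0))):
    """Build the wave-impedance map for one sea level: background
    impedance 1.0 (open sea), with rectangular ships of higher
    impedance representing more 'durable' targets. Each ship is given
    as ((row, col), (height, width), impedance)."""
    R = np.ones((size, size))
    for (row, col), (h, w), impedance in ships:
        R[row:row + h, col:col + w] = impedance
    return R
```

An analogous call with different ship variables produces the impedance map for the submerged level.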
After the configuration, we used an impulse as a weapon to attack the ships, and recorded impulse responses as we changed the attack position (i.e., where the impulse is excited) and the power of the attack (i.e., the amplitude of the impulse). Before listening to the resulting sounds, we could predict what they would sound like by simple inference. First, we should easily be able to tell the sound of hitting a ship from that of missing, because the ships are isolated from the rest of the sea by the higher impedance inside them. When a ship is hit, this decoupling of the ship from the sea has an effect similar to exciting a much smaller mesh, making its impulse response easily distinguishable from that of a miss, which excites the original, much larger mesh representing the sea. Secondly, the ship variables (size, location, and durability, i.e., wave impedance) will affect the resulting sounds, because wave propagation varies as they change. Lastly, the attack position and power will also have an effect on the overall wave propagation.
The following table includes Matlab generated movies
as well as sounds.
References

Smith, J. O. (1987). "Music applications of digital waveguides". Technical Report STAN-M-39, CCRMA, Music Department, Stanford University. A compendium containing four related papers and presentation overheads on digital waveguide reverberation, synthesis, and filtering. CCRMA technical reports can be ordered by calling (650) 723-4971 or by sending an email request to info@ccrma.stanford.edu.

Smith III, J. O. (2003). "Digital Waveguide Modeling of Musical Instruments". http://www-ccrma.stanford.edu/~jos/waveguide/.

Van Duyne, S. A. and J. O. Smith (1993a, Oct.). "The 2-D digital waveguide mesh". In Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, NY. IEEE Press.

Van Duyne, S. A. and J. O. Smith (1993b). "Physical modeling with the 2-D digital waveguide mesh". In Proceedings of the 1993 International Computer Music Conference, Tokyo, pp. 40-47. Computer Music Association. Available online at http://www-ccrma.stanford.edu/~jos/pdf/mesh.pdf.
=========================================
Kyogu Lee
Ph.D. Candidate
Center for Computer Research in Music and Acoustics
Music Department, Stanford University
kglee(at)ccrma.stanford.edu
=========================================