Sonification of Hyperspectral Data


    1. Various vocal models for data sonification

    This document discusses a few popular synthesis methods for voice simulation and presents their implementations on different platforms. In addition, hyperspectral data sonification using these vocal models is exemplified.


    2. Data Sonification Using a 2-dimensional Digital Waveguide Mesh

    Digital waveguide techniques have been used to develop efficient physical models of musical instruments since the late 1980s [Smith 1987, Smith III 2003, Van Duyne and Smith 1993a, Van Duyne and Smith 1993b]. By simulating the traveling waves with digital delay lines, the digital waveguide model can reduce the computational cost of physical models based on numerical integration of the wave equation by three orders of magnitude.
    The one-dimensional wave equation is solved by the sum of two arbitrary traveling waves, and it may be implemented in the digital domain with a bi-directional pair of delay lines. This structure is known as the digital waveguide. While each traveling wave propagates independently of the other, the physical wave amplitude at any point may be obtained by summing the two traveling waves.
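
    As a concrete illustration, the following is a minimal sketch of this structure in Python with NumPy (an assumed choice; the original examples were produced in Matlab). The function name, the pickup point, and the ideally reflecting rigid terminations are illustrative assumptions rather than the exact configuration used here.

    import numpy as np

    def waveguide_1d(excitation, n_samples, pickup=None):
        # Two delay lines carry the right- and left-going traveling waves;
        # the physical wave amplitude at any point is their sum.
        N = len(excitation)
        pickup = N // 2 if pickup is None else pickup
        right = np.asarray(excitation, dtype=float) / 2.0  # right-going rail
        left = np.asarray(excitation, dtype=float) / 2.0   # left-going rail
        out = np.zeros(n_samples)
        for n in range(n_samples):
            out[n] = right[pickup] + left[pickup]  # sum of the traveling waves
            r_end, l_end = right[-1], left[0]
            right[1:] = right[:-1]                 # propagate one sample rightward
            left[:-1] = left[1:]                   # propagate one sample leftward
            right[0] = -l_end                      # rigid terminations reflect
            left[-1] = -r_end                      # with a sign inversion
        return out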
    The one-dimensional digital waveguide shown in Figure 1 can be extended into a two-dimensional digital waveguide mesh [Van Duyne and Smith 1993a, Van Duyne and Smith 1993b]. The structure of the 2-D digital waveguide mesh can be viewed as a layer of parallel vertical waveguides superimposed on a layer of parallel horizontal waveguides intersecting each other at 4-port scattering junctions as shown in Figure 2.







Figure 1: The 1-D digital waveguide. The upper rail contains right-going waves, and the lower rail contains left-going waves.









Figure 2: The 2-D digital waveguide mesh.



    The 2-D digital waveguide mesh is a very useful model for hyperspectral data sonification because it offers a great deal of flexibility in mapping high-dimensional data to control parameters of the mesh. For instance, each component of the data vector can be mapped to the initial excitation condition of one junction in the mesh. That way, we can use N-dimensional data, however large N may be, without discarding any dimension, because there is no physical limitation on the size of the 2-D mesh. That is, all N dimensions of the data contribute to the resulting sound.

    We have used three different methods to map the data to the mesh; a sketch of method (1) follows the list:
       (1) Create an N-point mesh for N-dimensional data, and map each value to the initial excitation condition of one junction in the mesh. The initial condition may be any type of wave variable: displacement, velocity, or force. Since the 2-D mesh we have used is rectilinear, we can build meshes of different shapes from the same N points. For example, when N = 128, we can have meshes of three different shapes: 2x64, 4x32, or 8x16.
       (2) Create a mesh of size NxM, where N corresponds to the dimension of the data to be sonified and M is arbitrary. Then excite the mesh with a plane wave along the N-point axis, mapping the data to the initial condition of the plane wave.
       (3) Create a mesh of size NxM as in (2). Instead of mapping the data to an initial excitation condition as before, map it to the boundary condition of the mesh. Since one-pole filters are used at the boundaries, the data can control the gain or the pole location of the filters. In this case, the initial excitation can be anything: an impulse, a plane wave, or a set of impulses.
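
    Here is the minimal sketch of mapping method (1) referenced above, in Python with NumPy (an assumed choice; the original implementation was in Matlab). It simulates the rectilinear mesh in its equivalent finite-difference form: each junction output is half the sum of its four neighbors one step earlier, minus the junction's own value two steps earlier. The clamped-to-zero boundaries are a simplification of the one-pole boundary filters mentioned in method (3), and all names and sizes are illustrative.

    import numpy as np

    def mesh_sonify(data, shape, n_samples, pickup=None):
        # Map an N-dimensional data vector (N = rows * cols) to the initial
        # excitation of an N-point rectilinear mesh and read one junction out.
        X, Y = shape
        px, py = (X // 2, Y // 2) if pickup is None else pickup
        v1 = np.zeros((X + 2, Y + 2))             # state at step n-1, zero border
        v1[1:-1, 1:-1] = np.reshape(data, shape)  # data -> initial excitation
        v2 = np.zeros_like(v1)                    # state at step n-2
        out = np.zeros(n_samples)
        for n in range(n_samples):
            v0 = np.zeros_like(v1)
            # Rectilinear-mesh update: mean of the four neighbors one step
            # ago minus the junction's own value two steps ago.
            v0[1:-1, 1:-1] = 0.5 * (v1[:-2, 1:-1] + v1[2:, 1:-1]
                                    + v1[1:-1, :-2] + v1[1:-1, 2:]) - v2[1:-1, 1:-1]
            out[n] = v0[px + 1, py + 1]
            v2, v1 = v1, v0
        return out

    # The same 128-D vector can drive a 2x64, 4x32, or 8x16 mesh, e.g.:
    # y = mesh_sonify(data, (8, 16), 44100)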

    Each mapping method yields sounds of different sonority, but sounds generated with the same method are hard to tell apart, since the quality of the sound produced by the mesh is largely determined by its size rather than by the excitation. We need to develop techniques for controlling the 2-D mesh that produce sounds as perceptually distinct as possible. The following table includes sound examples and Matlab-generated movies showing wave propagation on the mesh for a single 28-dimensional data point under the three mapping methods described above.

     
Mapping method              Mesh size   Excitation    Sound               Movie
(1)                         4x7         28 impulses   Images_B01_01.wav   Images_B01_01.mov
(1)                         2x14        28 impulses   Images_B01_02.wav   Images_B01_02.mov
(2)                         28x28       plane wave    Images_B01_1.wav    Images_B01_1.mov
(3-1) filter gain control   28x28       impulse       Images_B01_2.wav    Images_B01_2.mov
(3-2) filter pole control   28x28       impulse       Images_B01_3.wav    Images_B01_3.mov



    3. Data Clustering
   
    In the previous sonification method using a 2-D digital waveguide mesh, the data were mapped to control parameters such as the initial pointwise excitation, the initial plane-wave excitation, or the boundary condition of the mesh. While this approach let us map very high-dimensional data without sacrificing any dimensions, the resulting sounds were barely distinguishable: with the pseudo-random initial condition represented by the data, the size of the mesh dominated the resulting sound. Furthermore, the sonification was pointwise; each point in the data set corresponded to one mesh, which not only made the computational cost very high but also failed to provide a good data clustering scheme.

    This time we took a new approach that focuses on data clustering instead of sonifying every single data point. We create an N-point mesh (with width X and height Y such that XxY = N) from N data points, so that one mesh now represents a number of data points, i.e., a data cluster. The major drawback of this method, however, is that we must now reduce the dimension of the data down to a few, since a junction in a 2-D rectilinear mesh has only a few control parameters. We chose the four most significant dimensions of the 128-D data and mapped them to the wave impedances of the 2-D mesh, where a junction at time n has four wave impedances, one per branch: Rx[n], Ry[n], Rx[n+1], and Ry[n+1].
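
    A minimal sketch of the impedance-controlled scattering at one such junction, assuming velocity waves and a lossless 4-port junction (Python/NumPy; names are illustrative): the junction velocity is twice the impedance-weighted average of the incoming waves, and each outgoing wave is the junction velocity minus the corresponding incoming wave.

    import numpy as np

    def scatter_4port(v_in, R):
        # v_in: incoming velocity waves from the 4 branches of one junction.
        # R:    per-branch wave impedances, e.g. (Rx[n], Ry[n], Rx[n+1], Ry[n+1]).
        v_in = np.asarray(v_in, dtype=float)
        R = np.asarray(R, dtype=float)
        v_j = 2.0 * np.dot(R, v_in) / np.sum(R)  # impedance-weighted junction velocity
        v_out = v_j - v_in                       # lossless scattering
        return v_j, v_out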

    Using the test data set, in which we know which cells are benign and which are malignant, we created two meshes, one containing only the benign cell data and the other only the malignant cell data, and used them as references. The two resulting sounds were very easy to distinguish. The next step is to create a composite mesh containing both benign and malignant cell data points. In our first approach, however, the data were arranged so that the left half of the mesh contains only benign cell data and the right half only malignant cell data (Figure 3).
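
    A sketch of how the composite impedance map of Figure 3 might be assembled, assuming each junction carries the four impedance values described above (Python/NumPy; the function and argument names are hypothetical):

    import numpy as np

    def composite_impedances(benign, malignant, shape=(8, 8)):
        # benign, malignant: one 4-D impedance vector per junction, i.e. arrays
        # of shape (rows * cols / 2, 4). Benign data fill the left half of the
        # mesh and malignant data the right half, as in Figure 3.
        X, Y = shape
        R = np.empty((X, Y, 4))
        R[:, :Y // 2, :] = np.reshape(benign, (X, Y // 2, 4))
        R[:, Y // 2:, :] = np.reshape(malignant, (X, Y - Y // 2, 4))
        return R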




  


Figure 3: Composite mesh with both benign and malignant cell data.



    After creating the meshes, we excited them with an impulse to produce sounds. The following sound examples were generated using the above approach with an 8x8 mesh (64 data points).

      

            Reference    Composite (benign + malignant)
Benign      test_B.wav   test_BM_B1.wav, test_BM_B2.wav
Malignant   test_M.wav   test_BM_M1.wav, test_BM_M2.wav


    test_B.wav and test_M.wav are sound examples generated using only one type of data (benign or malignant) and serve as references. The test_BM_Bn.wav examples were generated from the composite mesh by exciting a randomly selected point in the left (benign) half of the mesh, and the test_BM_Mn.wav examples were generated in the same way from the right (malignant) half. A sketch of this excitation scheme follows.
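
    For illustration only, choosing the random excitation point might look like the following sketch (Python/NumPy; run_mesh stands for the impedance-mapped mesh simulator and is hypothetical):

    import numpy as np

    rng = np.random.default_rng()

    def random_excitation_point(side, shape=(8, 8)):
        # Pick a random junction in the benign (left) or malignant (right)
        # half of the composite mesh.
        X, Y = shape
        lo, hi = (0, Y // 2) if side == "benign" else (Y // 2, Y)
        return int(rng.integers(0, X)), int(rng.integers(lo, hi))

    # y = run_mesh(R, excite_at=random_excitation_point("benign"))  # run_mesh is hypothetical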



    4. The Game of Battleship: Object Identification

    As an example of classification and identification, consider a version of the popular game Battleship, in which a player tries to locate objects (ships) on the opponent's hidden grid (the sea) by guessing coordinates. In the variant described here, auditory cues provide more information than the standard response of 'hit' or 'miss'. In the following examples the ocean surface is represented as a two-dimensional rectilinear waveguide mesh, and a second 2-D mesh of equal size represents the ocean at a particular depth. Timbral segregation between the surface and submerged regions is created by setting distinct boundary conditions for each mesh: for the surface mesh the pole location of the one-pole boundary filters was set to 0.05, the default setting for a metallic plate model, while for the submerged mesh it was set to 0.8, the default setting for a wood block.
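
    A minimal sketch of such a one-pole boundary reflection filter, normalized so that its DC gain is independent of the pole location (Python; the reflection gain value is an illustrative assumption):

    class OnePoleBoundary:
        # One-pole boundary reflection filter:
        #   y[n] = g * (1 - p) * x[n] + p * y[n-1]
        # The (1 - p) scaling keeps the DC gain equal to g for any pole p,
        # so p = 0.05 (surface) and p = 0.8 (submerged) differ in brightness
        # rather than in overall level.
        def __init__(self, pole, gain=-0.99):   # gain value is illustrative
            self.p = pole
            self.g = gain
            self.y1 = 0.0

        def reflect(self, x):
            self.y1 = self.g * (1.0 - self.p) * x + self.p * self.y1
            return self.y1

    surface_boundary = OnePoleBoundary(pole=0.05)   # metallic-plate default
    submerged_boundary = OnePoleBoundary(pole=0.8)  # wood-block default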

    To test our auditory version of the Battleship idea, we prepared the following configuration. First, we created a 32x32 2-D mesh, corresponding to the sea surface level, and placed two ships of arbitrary sizes at arbitrary positions. We then set the durability of each ship by assigning it a distinct wave impedance: a more durable ship has a higher wave impedance, which makes sense because waves do not propagate well into a high-impedance region and therefore have little effect on the ship. Finally, we created a second mesh for the submerged level, with the different boundary conditions mentioned earlier, and repeated the configuration with different ship variables (size, location, and durability). Figure 4 shows these setups for each sea level.
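
    An illustrative sketch of this setup for one sea level (Python/NumPy): ships are rectangles of elevated impedance in an otherwise uniform impedance map. The positions, sizes, and impedance values below are placeholders, not those used in the examples.

    import numpy as np

    def place_ships(shape=(32, 32), sea_impedance=1.0, ships=()):
        # ships: (row, col, height, width, impedance) per ship; a higher
        # impedance means a more durable ship.
        R = np.full(shape, sea_impedance)
        for r, c, h, w, z in ships:
            R[r:r + h, c:c + w] = z
        return R

    # e.g. two ships of different durability on the surface level:
    # R_surface = place_ships(ships=[(4, 6, 2, 5, 8.0), (20, 18, 3, 3, 15.0)])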




                                                                                      


Figure 4: Basic setup for (a) the sea surface level and (b) the submerged level. Dark areas indicate ships, and 'X' marks show attack positions. Ships are not drawn to scale.


    After the configuration, we used an impulse as a weapon to attack the ships and recorded impulse responses as we varied the attack position (where the impulse is injected) and the power of the attack (the amplitude of the impulse). Even before listening to the resulting sounds, we could anticipate what they should sound like by simple inference. First, we should easily be able to tell the sound of hitting a ship from that of missing, because the ships are isolated from the rest of the sea by the higher impedance inside them. When a ship is hit, this decoupling from the sea has an effect similar to exciting a much smaller mesh, making the impulse response easily distinguishable from that of a miss, which excites the full sea mesh, much larger in size than the ships. Second, the ship variables (size, location, and durability, i.e., wave impedance) will affect the resulting sounds because wave propagation varies with them. Lastly, the attack position and power will also affect the overall wave propagation.

    The following table includes Matlab-generated movies as well as sounds.

   

                              Sea Level
          Surface                                  Underwater
Hit 1     hit1_surface.wav, hit1_surface.mov       hit1_underwater.wav, hit1_underwater.mov
Hit 2     hit2_surface.wav, hit2_surface.mov       hit2_underwater.wav, hit2_underwater.mov
Miss      corner_surface.wav, corner_surface.mov   corner_underwater.wav, corner_underwater.mov
          center_surface.wav, center_surface.mov   between_underwater.wav, between_underwater.mov

   
   


    References

    Smith, J. O. (1987).
        "Music applications of digital waveguides".
        Technical Report STAN-M-39, CCRMA, Music Department, Stanford University.
        A compendium containing four related papers and presentation overheads on digital waveguide reverberation, synthesis, and filtering.
        CCRMA technical reports can be ordered by calling (650) 723-4971 or by sending an email request to info@ccrma.stanford.edu.

    Smith III, J. O. (2003).
        "Digital Waveguide Modeling of Musical Instruments".
        http://www-ccrma.stanford.edu/~jos/waveguide/.

    Van Duyne, S. A. and J. O. Smith (1993a, Oct.).
        "The 2-D digital waveguide mesh".
        In Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, NY. IEEE Press.

    Van Duyne, S. A. and J. O. Smith (1993b).
        "Physical modeling with the 2-D digital waveguide mesh".
        In Proceedings of the 1993 International Computer Music Conference, Tokyo, pp. 40-47. Computer Music Association.
        Available online at http://www-ccrma.stanford.edu/~jos/pdf/mesh.pdf.






=========================================
 Kyogu Lee
 Ph.D. Candidate
 Center for Computer Research in Music and Acoustics
 Music Department, Stanford University
 kglee(at)ccrma.stanford.edu
=========================================