Vowel Recognition With Neural Networks

The program is capable of

reading and recording .AU files
managing our own proprietary .SET files, which describe where .AU files used for training and testing are located
searching the .AU files for sections that look like vowels
performing an FFT of the selection
windowing the FFT in a logarithmic fashion for a representation that models the basilar membrane
displaying these audio preprocessing steps graphically
passing this data to a neural network
creating, saving and loading neural networks
testing and training the neural networks on the audio data
displaying the error curves for testing and training (picture below)
displaying the weights and thresholds of the neurons themselves (picture above)
recognizing the 5 short German vowels a, e, i, o, and u
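
The logarithmic windowing step can be illustrated with a minimal sketch. It assumes the FFT magnitude spectrum has already been computed; the function name logBands, the number of bands, and the simple averaging inside each band are illustrative assumptions, not the program's actual code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: collapse a linear magnitude spectrum into `bands`
// logarithmically spaced bands by averaging the bins that fall in each band.
// Bin 0 (DC) is skipped because the geometric spacing starts at bin 1.
std::vector<double> logBands(const std::vector<double>& magnitude, std::size_t bands)
{
    assert(magnitude.size() >= 2 && bands >= 1);
    const double top = static_cast<double>(magnitude.size()); // one past the last bin
    std::vector<double> out(bands, 0.0);
    for (std::size_t b = 0; b < bands; ++b) {
        // Band edges grow geometrically from bin 1 up to `top`.
        std::size_t lo = static_cast<std::size_t>(
            std::pow(top, static_cast<double>(b) / bands));
        std::size_t hi = static_cast<std::size_t>(
            std::pow(top, static_cast<double>(b + 1) / bands));
        if (hi <= lo) hi = lo + 1;                         // at least one bin per band
        if (hi > magnitude.size()) hi = magnitude.size();  // stay inside the spectrum
        double sum = 0.0;
        for (std::size_t k = lo; k < hi; ++k) sum += magnitude[k];
        out[b] = sum / static_cast<double>(hi - lo);
    }
    return out;
}
```

Low bands cover only a few bins while high bands cover many, which roughly mirrors the way frequency resolution falls off along the basilar membrane. When more bands are requested than the low end of the spectrum can supply, neighbouring low bands may reuse the same bin.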

The neural network object I created

is completely modular: it can be easily used with other programs
supports two transfer functions: the logistic function and the tanh() function
supports 1 to 7 hidden layers containing 1 to 40 neurons each
manages its own statistical data, which can be easily accessed
can be easily loaded and saved by overloading the << and >> operators
provides convenient "getter" functions for internal data structures
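
To make the list above more concrete, here is a minimal sketch of the two transfer functions and of saving and loading through overloaded << and >> operators. The struct LayerSketch, its text format, and the convention of subtracting the threshold from the weighted sum are assumptions made for illustration; the real network class is not reproduced here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <iostream>
#include <sstream>
#include <vector>

// The two transfer functions named in the list above.
inline double logistic(double x) { return 1.0 / (1.0 + std::exp(-x)); }
inline double tanhTransfer(double x) { return std::tanh(x); }

// Hypothetical stand-in for the parameters of a single neuron.
struct LayerSketch {
    std::vector<double> weights;
    double threshold;
};

// Output of one neuron: transfer(sum of w_i * x_i - threshold).
// Subtracting the threshold is an assumed convention.
double neuronOutput(const LayerSketch& n, const std::vector<double>& in,
                    double (*transfer)(double))
{
    assert(in.size() == n.weights.size());
    double sum = -n.threshold;
    for (std::size_t i = 0; i < in.size(); ++i) sum += n.weights[i] * in[i];
    return transfer(sum);
}

// Save: weight count, the weights, then the threshold, as plain text.
std::ostream& operator<<(std::ostream& os, const LayerSketch& n)
{
    os << n.weights.size();
    for (double w : n.weights) os << ' ' << w;
    return os << ' ' << n.threshold << '\n';
}

// Load: read the same fields back in the same order.
std::istream& operator>>(std::istream& is, LayerSketch& n)
{
    std::size_t count = 0;
    if (!(is >> count)) return is;
    n.weights.assign(count, 0.0);
    for (double& w : n.weights) is >> w;
    return is >> n.threshold;
}

// Round trip through a string stream, the same way a file stream would be used.
LayerSketch copyViaStream(const LayerSketch& n)
{
    std::stringstream buffer;
    buffer << n;
    LayerSketch out{};
    buffer >> out;
    return out;
}
```

Because the operators work on std::ostream and std::istream, the same code serves files, string buffers, and the console. Note that the default stream precision rounds values to six significant digits, so a real save format would need a higher precision or a binary representation.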

Recognition is then performed with a mixture of experts. The first network distinguishes A, E/I, and O/U, the second distinguishes E from I, and the third distinguishes O from U. The program is written in C++ under UNIX (g++) with the help of the GUI library Qt from the Norwegian company Trolltech. We have tested the program under both Linux and UNIX, although the record function does not work under Linux because the tool audiorecord is missing. In that case, the sample .SET files of audio recordings can be used instead.
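
The three-network scheme described above amounts to a two-stage decision, which the following minimal sketch illustrates. The type names, the Experts struct, and the use of std::function to stand in for trained networks are all assumptions; the sketch shows only how the three decisions are combined, not how each network reaches its decision.

```cpp
#include <cassert>
#include <functional>
#include <vector>

using Features = std::vector<double>;

// Hypothetical result types for the three networks.
enum class Group { A, EorI, OorU };
enum class Vowel { A, E, I, O, U };

// Each expert is modelled as a callable so that any trained network can be plugged in.
struct Experts {
    std::function<Group(const Features&)> coarse; // A vs. E/I vs. O/U
    std::function<bool(const Features&)>  isE;    // true: E, false: I
    std::function<bool(const Features&)>  isO;    // true: O, false: U
};

// The first network picks the group; a second network is consulted
// only when the group still contains two candidate vowels.
Vowel recognize(const Experts& ex, const Features& f)
{
    assert(ex.coarse && ex.isE && ex.isO);
    switch (ex.coarse(f)) {
    case Group::A:    return Vowel::A;
    case Group::EorI: return ex.isE(f) ? Vowel::E : Vowel::I;
    case Group::OorU: return ex.isO(f) ? Vowel::O : Vowel::U;
    }
    return Vowel::A; // not reached for valid Group values
}
```

One consequence of this arrangement is that each of the two pairwise networks only has to learn a single distinction, at the cost that a mistake by the first network cannot be corrected later.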

After we received our grade, I continued working on the program in order to:

eliminate an intermittent segmentation fault bug
add progress bars
make it possible to view the dynamically changing error curves, weights, and thresholds while the network is being trained
make the GUI displays more attractive

Download the program:

abgabe2.zip
source code -- can be made with the command make (Qt required!)
nett.zip
.SET of test vowel sounds

Documentation (in German):

An example of what the error curves might look like if the networks were first 1) successfully trained, 2) overtrained, and then 3) improperly trained (bad settings used).
