The program is capable of:
- reading and recording .AU files
- managing our own proprietary .SET files, which describe where .AU files used for training and testing are located
- searching the .AU files for sections that look like vowels
- performing an FFT of the selection
- windowing the FFT in a logarithmic fashion for a representation that models the basilar membrane
- displaying these audio preprocessing steps graphically
- passing this data to a neural network
- creating, saving and loading neural networks
- testing and training the neural networks on the audio data
- displaying the error curves for testing and training (picture below)
- displaying the weights and thresholds of the neurons themselves (picture above)
- recognizing the 5 short German vowels a, e, i, o, and u
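The logarithmic windowing step in the list above can be sketched as follows. This is only a minimal illustration, not the program's actual code: the function name `logBands`, the geometric spacing of the band edges, and the use of the mean magnitude per band are all assumptions, since the page does not say how the bands are computed. The idea is that low frequencies get narrow bands and high frequencies get wide ones, roughly as on the basilar membrane.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: collapse a linear magnitude spectrum into
// logarithmically spaced bands (narrow at low frequencies, wide at high).
// `spectrum` holds FFT magnitudes for bins 0..N/2; bin 0 (DC) is skipped.
std::vector<double> logBands(const std::vector<double>& spectrum,
                             std::size_t numBands)
{
    assert(numBands > 0 && spectrum.size() > 1);
    const std::size_t size = spectrum.size();
    const double top = static_cast<double>(size);  // exclusive upper edge
    std::vector<double> bands(numBands, 0.0);

    for (std::size_t b = 0; b < numBands; ++b) {
        // Band edges grow geometrically from bin 1 up to the spectrum size.
        const double lo = std::pow(top, static_cast<double>(b) / numBands);
        const double hi = std::pow(top, static_cast<double>(b + 1) / numBands);
        std::size_t first = static_cast<std::size_t>(std::lround(lo));
        std::size_t last = static_cast<std::size_t>(std::lround(hi));
        if (first >= size) first = size - 1;  // keep the band inside the spectrum
        if (last <= first) last = first + 1;  // every band covers at least one bin
        if (last > size) last = size;

        double sum = 0.0;
        for (std::size_t k = first; k < last; ++k) sum += spectrum[k];
        bands[b] = sum / static_cast<double>(last - first);  // mean magnitude
    }
    return bands;
}
```

The resulting band values would then form the input vector that is passed to the neural network.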
The neural network object I created:
- is completely modular: it can be easily used with other programs
- supports two transfer functions: the logistic function and the tanh() function
- has 1-7 hidden layers containing 1 to 40 neurons
- manages its own statistical data, which can be easily accessed
- can be easily loaded and saved by overloading the << and >> operators
- provides convenient "getter" functions for internal data structures
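To illustrate the two transfer functions and the stream-based loading and saving listed above, here is a minimal sketch for a single neuron. It is not the actual class from the source code: the names `Transfer`, `activate`, and `Neuron`, the text file layout, and the convention that the threshold is subtracted from the weighted sum are all assumptions made for the example. It assumes a C++11 compiler.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

enum class Transfer { Logistic, Tanh };

// Apply the selected transfer function to a neuron's net input.
double activate(Transfer t, double x)
{
    return t == Transfer::Logistic ? 1.0 / (1.0 + std::exp(-x)) : std::tanh(x);
}

// Hypothetical single neuron; the real object manages whole layers.
struct Neuron {
    Transfer transfer = Transfer::Logistic;
    double threshold = 0.0;
    std::vector<double> weights;

    double output(const std::vector<double>& in) const
    {
        assert(in.size() == weights.size());
        double net = -threshold;  // assumed convention: threshold is subtracted
        for (std::size_t i = 0; i < in.size(); ++i) net += weights[i] * in[i];
        return activate(transfer, net);
    }
};

// Save: transfer id, threshold, weight count, then the weights.
std::ostream& operator<<(std::ostream& os, const Neuron& n)
{
    os << static_cast<int>(n.transfer) << ' ' << n.threshold << ' '
       << n.weights.size();
    for (double w : n.weights) os << ' ' << w;
    return os;
}

// Load: read the fields back in the same order they were written.
std::istream& operator>>(std::istream& is, Neuron& n)
{
    int id = 0;
    std::size_t count = 0;
    if (!(is >> id >> n.threshold >> count)) return is;
    n.transfer = id == 0 ? Transfer::Logistic : Transfer::Tanh;
    n.weights.assign(count, 0.0);
    for (double& w : n.weights) is >> w;
    return is;
}
```

With these operators a neuron can be written with `file << neuron` and restored with `file >> neuron`. A full implementation would probably also raise the stream precision so that weights survive the round trip without rounding.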
Recognition is then performed with a mixture of experts: the first network distinguishes A, E/I, and O/U, the second E from I, and the third O from U.
The program is written in C++ under UNIX (g++) with the help of the GUI library Qt from the Norwegian company Trolltech. We have tested it under both Linux and UNIX, although the record function does not work under Linux because the tool audiorecord is missing. In that case, the sample .SET files of audio recordings can be used instead.
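The three-network recognition scheme described above can be sketched as a simple cascade. This is an illustration under assumptions, not the program's code: `Expert`, `argmax`, and `classifyVowel` are hypothetical names, each network is stood in for by a callable that returns one score per class, and the class with the highest score is assumed to win.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <iterator>
#include <vector>

// Stand-in for a trained network: maps a feature vector to one score per class.
using Expert = std::function<std::vector<double>(const std::vector<double>&)>;

// Index of the largest score.
std::size_t argmax(const std::vector<double>& scores)
{
    assert(!scores.empty());
    return static_cast<std::size_t>(std::distance(
        scores.begin(), std::max_element(scores.begin(), scores.end())));
}

// The first network picks the coarse group; a second network is consulted
// only when the group contains two vowels.
char classifyVowel(const std::vector<double>& features,
                   const Expert& coarse,  // outputs: A, E/I, O/U
                   const Expert& eOrI,    // outputs: E, I
                   const Expert& oOrU)    // outputs: O, U
{
    switch (argmax(coarse(features))) {
    case 0:  return 'A';
    case 1:  return argmax(eOrI(features)) == 0 ? 'E' : 'I';
    default: return argmax(oOrU(features)) == 0 ? 'O' : 'U';
    }
}
```

One consequence of this layout is that each network only has to solve a two-way or three-way problem, at the cost that a mistake by the first network cannot be corrected by the later ones.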
After we received the grade, I continued to work in order to:
- eliminate an intermittent segmentation fault bug
- add progress bars
- make it possible to view the dynamically changing error curves and weights and thresholds while the network is trained
- make the GUI displays more attractive
Download the program:
- abgabe2.zip: source code; can be built with the command make (Qt required!)
- nett.zip: .SET of test vowel sounds
Documentation (in German):
An example of what the error curves might look like if the networks were first 1) successfully trained, 2) overtrained, and then 3) improperly trained (bad settings used).