# Neural Networks

My final project for Music 270B was an exploration of neural networks and their use in the field of music, mostly for the purpose of giving me a better understanding of how they worked. Keep reading for a description of what I did, taken from my original project write-up.

1. Introduction
    1. Project Motivation
    2. Neural Network Background
        1. Neural Network Structure
        2. Training a Neural Network
2. Project Stages
    1. Perceptrons
    2. More Complex Networks
        1. Feed-Forward Networks
            1. Rhythm classification in a feed-forward network
        2. Recurrent Networks
            1. Rhythm classification in a recurrent network
3. Practical Challenges with Neural Networks
    1. Internal Representations
    2. Learning Time
4. Conclusions (?) and Possible Future Work
5. Works Cited/Referenced

## Introduction

### Project Motivation

For many years I have heard the term “neural networks” thrown around in discussions about machine learning, but its meaning remained a mystery to me. I would like to have a better understanding of a number of machine learning–related topics, including neural networks, support vector machines, and genetic programming. Because of that, I jumped at the opportunity to explore neural networks in 270B. The primary goal of my project was to explore neural networks beyond the theoretical and conceptual terms in which Andy Clark’s Mindware [1] discussed them and to gain a better understanding of how they are actually implemented and used. The implementation side appeals to my engineering nature, and the usage side appeals to my musical nature, making this an ideal approach for me to such a complex topic.

### Neural Network Background

#### Neural Network Structure

Neural networks are arranged in an attempt to simulate the way that signals (representing information) are transmitted from one neuron in the human brain to another based on the strength of the connections between the neurons. A neural network therefore consists of one or more neuron “units” and connections between those units. Each connection has an associated weight representing the strength of the connection, and each neuron-unit may have any number of connections leading to it and/or from it. Neuron-units receive information signals from other units in the form of numbers. Each neuron-unit sums all of the inputs it receives, passes that sum through a limiting function, and then sends its output down each of the connections leading away from it. The next neuron-unit in the chain receives that output value scaled by the weight assigned to the connection between them.

Neuron-units representing the input of the network are, not surprisingly, called “input units”. In a basic feed-forward network, input units do not receive input from other neuron-units, but in networks containing feedback such a scenario is quite possible. Similarly, there are “output units” representing the output of the network, which can act as the source of feedback in such a network. The somewhat mysterious part of a network, however (mysterious to me, at least), is the presence of “hidden units”, which lie between the input and output units and affect how the input signal is modified before it reaches the output. Another mysterious point for me is that hidden units and output units can also receive input from biases, which are essentially input units that always give the value “1” as an output. I have no idea what role these bias units play in changing the output of a network, but they are apparently useful!
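As a concrete sketch of the structure described above, here is roughly how a single neuron-unit might be implemented. The class and method names are my own inventions, and the choice of the logistic sigmoid as the limiting function is just one common option:

```java
// A minimal sketch of a single neuron-unit (all names here are my own).
// It sums its weighted inputs plus a bias term (the bias input is always 1),
// then passes the sum through a limiting function.
public class NeuronUnit {
    double[] weights;   // one weight per incoming connection
    double biasWeight;  // weight on the always-1 bias input

    NeuronUnit(double[] weights, double biasWeight) {
        this.weights = weights;
        this.biasWeight = biasWeight;
    }

    // Weighted sum of inputs, plus bias, squashed by the limiting function.
    double output(double[] inputs) {
        double sum = biasWeight * 1.0;
        for (int i = 0; i < weights.length; i++) {
            sum += weights[i] * inputs[i];
        }
        return limit(sum);
    }

    // One common limiting function: the logistic sigmoid, which maps
    // any sum into the range (0, 1).
    static double limit(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }
}
```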

#### Training a Neural Network

Training a neural network is a repetitive process that can be extremely time-consuming. During each training iteration, sample inputs (“training examples”) are fed to the network, and the weights connecting the various neurons/units are adjusted depending on the error between the network’s computed output and the expected or desired “correct” output for a given sample. Ideally, each training iteration results in a small improvement in the network’s overall ability to correctly process each training example, but the level of success often depends on a certain element of luck. In particular, the randomly selected initial connection weights can have a significant impact on how quickly the network converges on a set of weights providing minimum error (or whether it converges on the correct output values at all!).
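For a single unit, the weight adjustment described above can be sketched as a simple delta-rule update. The method name, variable names, and learning rate here are my own choices, not taken from any particular text:

```java
// A rough sketch of one training step for a single unit (delta rule).
// All names and the learning-rate parameter are my own choices.
public class DeltaRule {
    // Adjust each weight in proportion to the error between the desired
    // ("correct") output and the network's computed output.
    static void adjust(double[] weights, double[] inputs,
                       double desired, double computed, double learningRate) {
        double error = desired - computed;
        for (int i = 0; i < weights.length; i++) {
            weights[i] += learningRate * error * inputs[i];
        }
    }
}
```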

## Project Stages

I began my exploration of neural nets by looking through various engineering-oriented texts on machine learning from the library. Those books proved, for the most part, to be too advanced for my purposes – they contained proofs of why various learning techniques work, or why some problems can never be solved with certain kinds of networks, but they weren’t helping me understand the actual sequence of computations a network goes through as it learns, or even just as it processes an input to produce an output. I then turned my attention to the book Music and Connectionism, which is primarily a collection of articles from two special issues of the Computer Music Journal released in 1989. Having been written for an audience with little or no experience in neural networks, these articles were a very helpful introduction to the field. In fact, the article by Mark Dolson titled “Machine Tongues XII: Neural Networks” [2] actually reads very much like a high-level tutorial, so my main project focus shifted to following this tutorial and attempting to reproduce Dolson’s results.

### Perceptrons

The first network Dolson describes is not related to a musical topic at all, but it gives a general sense of how neural networks function. To make sure that I understood what he was saying, I attempted to implement this network in Java. The network is a perceptron – the most basic kind of neural network, containing only a single neuron – and this perceptron’s task is to determine whether one number (represented by one input) is at least two times as large as another number (represented by a second input). What Dolson had written proved not quite enough information for me to implement a real perceptron – I was having trouble understanding the learning process and exactly how the connection weights were handled. Fortunately, I found a neural network tutorial website [3] which filled in all of those gaps for me. I successfully trained my perceptron to solve the logical OR, AND, and NOT functions as described in the tutorial, and I was subsequently able to train it to complete Dolson’s perceptron task.

Interestingly enough, even this simple task of implementing Dolson’s “one number at least twice as large as another” perceptron pointed out an important aspect of neural networks: there isn’t always only one right answer! This perceptron had two inputs (x1 and x2) and therefore two weights (w1 and w2). The output of the network is f(w1*x1 + w2*x2), where f(x)=1 for x>=0 and f(x)=-1 for x<0. As long as w2=-2*w1 (with w1 positive), this network will give the correct output, and in my implementation the weights tended to converge to w1=0.11 and w2=-0.22 instead of the most obvious (to a human, at least) answer of w1=1 and w2=-2. I suspect that the ability to find these kinds of perfectly valid solutions – solutions which are not always the most obvious ones to a human – is part of the appeal of neural networks in the first place.
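For illustration, here is a sketch along the lines of what I implemented. This is a reconstruction, not my original code: the class name, training examples, learning rate, and stopping condition are all my own choices. It uses the standard perceptron learning rule, nudging the weights whenever an example is misclassified:

```java
// A sketch of the perceptron described above: two inputs, two weights,
// output f(w1*x1 + w2*x2) with f(x) = 1 for x >= 0 and -1 otherwise.
// The names, examples, and parameters here are my own reconstructions.
public class TwicePerceptron {
    double w1, w2;

    TwicePerceptron(double w1, double w2) {
        this.w1 = w1;
        this.w2 = w2;
    }

    int output(double x1, double x2) {
        return (w1 * x1 + w2 * x2) >= 0 ? 1 : -1;
    }

    // Standard perceptron learning rule: whenever the perceptron gets an
    // example wrong, nudge each weight toward the desired output.
    void train(double[][] examples, int[] desired, double rate, int maxEpochs) {
        for (int e = 0; e < maxEpochs; e++) {
            int mistakes = 0;
            for (int i = 0; i < examples.length; i++) {
                int err = desired[i] - output(examples[i][0], examples[i][1]);
                if (err != 0) {
                    mistakes++;
                    w1 += rate * err * examples[i][0];
                    w2 += rate * err * examples[i][1];
                }
            }
            if (mistakes == 0) return; // every training example classified correctly
        }
    }
}
```

Training on examples labeled 1 when x1 >= 2*x2 and -1 otherwise, the learned weights need not be the “obvious” (1, -2) pair; any weights on the correct side of each training example will do.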

### More Complex Networks

After I felt like I had a fairly good grasp of how perceptrons worked, I moved on in Dolson’s tutorial, hoping to use a pre-existing library of neural network code to replicate some of the music-related experiments he had performed. My intention was to be able to reproduce some of the more complicated network structures used without having to write all of the necessary code myself. I investigated the Java-based Joone (Java Object Oriented Neural Engine) library [4] and was initially excited to find that the creators had included a GUI tool for graphically building, training, and testing many kinds of networks. Following the Joone documentation, I successfully created a feed-forward network with the Joone GUI that solved the XOR problem (an important problem because, unlike OR, AND, and NOT, it cannot be solved by a single perceptron). I hoped that I could then modify that network to reproduce Dolson’s experiments, but I was unable to make it work. I don’t know if I had simply set some parameter incorrectly or if my network was fundamentally built incorrectly (Joone had many undocumented parameters which probably would have made sense to someone with neural network experience but which just served to confuse and complicate things for me). Either way, it didn’t work.

#### Feed-Forward Networks

At this point, I was somewhat inspired by the success I’d had with my perceptron code, so I left Joone behind and began implementing my own more advanced neural network code. Fortunately, at this point I had finally (after several weeks’ wait) been able to get a copy of a helpful and more introductory machine learning textbook [5] from the library, and I found that it contained a very nice description of the back-propagation learning algorithm that is commonly used in feed-forward neural networks. I also came across a website with a description of some object-oriented neural network code [6] which served as an inspiration for the OOP design of my own feed-forward network.
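As a rough illustration of the algorithm, here is a compact sketch of back-propagation for a network with one hidden layer and a single output unit. The class design, network size, and parameters are my own choices and only loosely follow the textbook presentation:

```java
import java.util.Random;

// A sketch of a two-layer feed-forward network trained with
// back-propagation. All names and design choices here are my own.
public class BackpropNet {
    final int nIn, nHid;
    final double[][] wHid; // hidden-layer weights: [hidden unit][input + bias]
    final double[] wOut;   // output weights: [hidden unit + bias]

    BackpropNet(int nIn, int nHid, Random rng) {
        this.nIn = nIn;
        this.nHid = nHid;
        wHid = new double[nHid][nIn + 1];
        wOut = new double[nHid + 1];
        for (double[] row : wHid) {
            for (int i = 0; i < row.length; i++) row[i] = rng.nextDouble() - 0.5;
        }
        for (int i = 0; i < wOut.length; i++) wOut[i] = rng.nextDouble() - 0.5;
    }

    static double sigmoid(double x) { return 1.0 / (1.0 + Math.exp(-x)); }

    // Forward pass; also records hidden activations for the backward pass.
    double forward(double[] x, double[] hid) {
        for (int h = 0; h < nHid; h++) {
            double s = wHid[h][nIn]; // bias weight (the bias input is always 1)
            for (int i = 0; i < nIn; i++) s += wHid[h][i] * x[i];
            hid[h] = sigmoid(s);
        }
        double s = wOut[nHid]; // output bias
        for (int h = 0; h < nHid; h++) s += wOut[h] * hid[h];
        return sigmoid(s);
    }

    // One back-propagation step for a single training example:
    // compute the output error term, propagate it back to the hidden
    // layer, then adjust all weights in proportion to their inputs.
    void trainExample(double[] x, double target, double rate) {
        double[] hid = new double[nHid];
        double out = forward(x, hid);
        double dOut = out * (1 - out) * (target - out);
        for (int h = 0; h < nHid; h++) {
            double dHid = hid[h] * (1 - hid[h]) * wOut[h] * dOut;
            wOut[h] += rate * dOut * hid[h];
            for (int i = 0; i < nIn; i++) wHid[h][i] += rate * dHid * x[i];
            wHid[h][nIn] += rate * dHid; // bias input is 1
        }
        wOut[nHid] += rate * dOut; // output bias
    }

    double meanSquaredError(double[][] xs, double[] ts) {
        double err = 0;
        double[] hid = new double[nHid];
        for (int i = 0; i < xs.length; i++) {
            double d = ts[i] - forward(xs[i], hid);
            err += d * d;
        }
        return err / xs.length;
    }
}
```

A network like this (with a handful of hidden units) can be trained on the XOR problem that a single perceptron cannot solve, though how quickly the error falls still depends on the random starting weights.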

## Practical Challenges with Neural Networks

### Internal Representations

One of the supposed benefits of working with neural networks is the ability to solve challenging problems without having to find a symbolic representation for all aspects of the problem ahead of time. This benefit also happens to be the source of one of the biggest frustrations I found while working with neural networks: the fact that they are completely impossible to debug in the “traditional” programming sense. When your network fails to give the desired output, or doesn’t seem to be learning the way it was intended to, you can’t just open up the network, look at the weights of all of the connections, and use that data to track down the source of your problem. The weights are too abstract a representation of the solution to allow me to “read” them the way I would read more symbolic forms of representation in a computer program. Perhaps those who spend significant time working with neural networks eventually develop an intuition for where the source of a problem might lie, but someone like me who is new to the field is pretty much out of luck. I experienced this challenge first-hand while implementing my primitive perceptron and my more powerful feed-forward network. When I didn’t get the results I expected, I had very little way to tell whether I had a bug in my program, or the random weights chosen to seed the network were just “unlucky”, or I had configured my network incorrectly (with the wrong number of hidden units or biases, etc.). It was very frustrating!

### Learning Time

Another challenge that comes up very quickly when playing with neural networks is simply that it can take a very long time to train a network well. The larger the network and the more connections it has, the longer it takes to learn a new problem. I wasted more time than I’d like to admit worrying that I had a bug in my program because my network’s output values were not converging to the desired values, when in fact I simply had not allowed the learning process to repeat enough times for the error to become small. Almost the inverse of that situation occurred as well: I told the network to stop learning when its mean-squared error went below a certain value, let it run all night, and it never converged to a reasonable level of error. I then tried to train the network again (with a new random set of starting weights), and it converged in a matter of minutes. In other words, sometimes you get lucky with your starting weights and sometimes you don’t, and when you don’t, it’s hard to know for sure that “bad” starting weights were the source of your problem. I suppose I’m the kind of person who prefers more reliable, consistent, and predictable tools to work with (although perhaps with more sophisticated implementations, neural networks can be all of those things).

## Conclusions (?) and Possible Future Work

Having now implemented several neural networks along with the well-known back-propagation learning algorithm, I have a much better understanding of how neural networks function and what their strengths and weaknesses are. I hadn’t originally intended to do so much of my own programming, but if I had not implemented these networks myself (and gone through the process of debugging them!) I probably would not have been able to understand them as well as I now do. Of course, all of my playing around has only scratched the surface of what neural networks are capable of doing, and I made no attempt to optimize my code in any significant way, so there is a lot of room for improvement if I decide to pursue this further. Right now, I’m not convinced that I like working with neural networks enough to continue playing with them in the near future, but in case the opportunity arises, here are some paths I would like to explore:

• Investigate the potential of subnets and supernets

In his article on algorithmic composition using neural networks, Peter Todd advocates for the benefits of using multiple interconnected networks to solve a single problem [7]. I would be very interested in learning how I could build such a multi-level network.

• Explore melodies

Marvin Minsky and Seymour Papert, in the prologue to the “Expanded Edition” of their book Perceptrons [8], emphasize that many of the problems of working with neural networks – and the reason for the field’s lack of significant progress over so many years – stem from the difficulty of finding good representations for the data that neural networks are intended to act upon. When it comes to working with melodies in neural networks, it would be interesting to learn how effective absolute representations of pitch are vs. relative representations of pitch. Relative representations would be useful when it’s the shape of the melody that matters, but absolute representations are useful if one cares about distinguishing between similarly-shaped melodies in different keys.
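As a tiny illustration of the difference between the two representations (the class and method names are my own): converting absolute pitches to the intervals between successive notes makes a melody and its transposition look identical to the network.

```java
// A small sketch of the two melody representations discussed above
// (names are my own). An absolute representation keeps the pitches
// themselves; a relative one keeps only the interval from each note
// to the next, so transposed melodies become indistinguishable.
public class MelodyRepresentation {
    // Convert absolute pitches (e.g. MIDI note numbers) to relative
    // intervals in semitones. Assumes at least one pitch is given.
    static int[] toIntervals(int[] pitches) {
        int[] intervals = new int[pitches.length - 1];
        for (int i = 1; i < pitches.length; i++) {
            intervals[i - 1] = pitches[i] - pitches[i - 1];
        }
        return intervals;
    }
}
```

For example, C-E-G (MIDI 60, 64, 67) and its transposition D-F#-A (62, 66, 69) share the interval representation {4, 3}, so a network fed intervals would treat them as the same melodic shape.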

• Figure out how to determine the ideal number of hidden units

This is something I had hoped to better understand by the time I completed this project, but I didn’t have time to investigate it.

• Add friendly GUI for playing with network parameters and running experiments

Joone’s GUI seemed like a good idea, but it didn’t work well for me. I don’t know that I need something quite as fancy, but it would be nice to be able to easily tweak experiment parameters and view results in some sort of user-friendly GUI.

The code I wrote for this project was intended almost entirely for my own experimental purposes and not for others to use or even necessarily read. However, it could easily be extended and cleaned up to allow easier experimentation. In the interest of not having “dirty” code lying around, I’ll probably do this housecleaning to some extent no matter what.

## Works Cited/Referenced

1. Clark, Andy. Mindware: An Introduction to the Philosophy of Cognitive Science. New York: Oxford University Press, 2001.
2. Dolson, Mark. “Machine Tongues XII: Neural Networks.” Music and Connectionism. Ed. Peter M. Todd and D. Gareth Loy. Cambridge, MA: The MIT Press, 1991. 3-19.
3. Pudi, Vikram. “Neural Networks Tutorial”. International Institute of Information Technology, Hyderabad, India. 14 Dec. 2007. http://www.iiit.net/~vikram/nn_intro.html
4. Marrone, Paolo et al. Joone – Java Object Oriented Neural Engine. 14 Dec. 2007. http://www.jooneworld.com/
5. Mitchell, Tom M. Machine Learning. New York: The McGraw-Hill Companies, Inc., 1997.
6. Shiffman, Daniel. Neural Networks. 14 Dec. 2007. http://www.shiffman.net/teaching/nature/nn
7. Todd, Peter M. “A Connectionist Approach to Algorithmic Composition.” Music and Connectionism. Ed. Peter M. Todd and D. Gareth Loy. Cambridge, MA: The MIT Press, 1991. 173-194.
8. Minsky, Marvin L., and Seymour A. Papert. Perceptrons (Expanded Edition). Cambridge, MA: The MIT Press, 1988.

Email me: danielsm (at) ccrma (dot) stanford (dot) edu
Last updated: January 2009