/usr/ccrma/media/databases/hiphop-gene/ contains the following files and directories:
A list of each artist in the dataset. Rather than being extracted from tags in the mp3 files, artist names are hand-entered via categorize.py to ensure consistent normalization.
Loosely organized directory of mp3/m4a/etc. files for the base data set.
List of each possible genre in the dataset. Handwritten and used by categorize.py for manual genre entry.
The main catalogue of metadata associated with each WAV file. Currently includes genre and artist(s) info, as well as the file paths of both the compressed and WAV versions of the audio data.
Directory of uncompressed audio data files. Automatically populated by decompress.py.
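To illustrate how the pieces above fit together, here is a minimal sketch of one catalogue record and of the mapping from a compressed file to its WAV counterpart. It is not the actual code of decompress.py or categorize.py. The names CatalogueEntry, wav_path_for and build_entries, the extension set, and the idea of mirroring the source directory layout under the WAV directory are all assumptions made for illustration; the real catalogue format is not specified here.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import List

# Assumed extension set; the listing only says "mp3/m4a/etc."
COMPRESSED_EXTENSIONS = {".mp3", ".m4a"}


@dataclass
class CatalogueEntry:
    """Hypothetical catalogue record: both file paths plus hand-entered labels."""
    compressed_path: Path
    wav_path: Path
    genre: str = ""                                    # filled in manually later
    artists: List[str] = field(default_factory=list)  # one or more artists


def wav_path_for(compressed_path: Path, compressed_root: Path, wav_root: Path) -> Path:
    """Mirror the file's relative location under wav_root, with a .wav suffix."""
    relative = compressed_path.relative_to(compressed_root)
    return (wav_root / relative).with_suffix(".wav")


def build_entries(compressed_root: Path, wav_root: Path) -> List[CatalogueEntry]:
    """Walk the compressed directory and create one unlabeled entry per audio file."""
    entries = []
    for path in sorted(compressed_root.rglob("*")):
        if path.is_file() and path.suffix.lower() in COMPRESSED_EXTENSIONS:
            entries.append(
                CatalogueEntry(path, wav_path_for(path, compressed_root, wav_root))
            )
    return entries
```

Note that under this assumed mapping, two source files that differ only in extension (for example x.mp3 and x.m4a in the same folder) would map to the same WAV path, so a real implementation would need to handle that case.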