The directory /usr/ccrma/media/databases/hiphop-gene/ contains the following files and directories:
A list of every artist in the dataset. Rather than being extracted from tags in the mp3 files, the names are hand-entered via categorize.py to ensure consistent normalization.
A loosely organized directory of audio files (mp3, m4a, etc.) that make up the base dataset.
A list of every genre that can be assigned in the dataset. It is written by hand and used by categorize.py for manual genre entry.
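
As an illustration of how a handwritten genre list can keep manual entry consistent, here is a minimal sketch. It is not the code in categorize.py: the function names load_genres and match_genre, the one-genre-per-line layout, and the sample genre names are all assumptions made for the example.

```python
def load_genres(lines):
    """Return the allowed genres from an iterable of text lines.

    Assumes one genre per line (hypothetical layout); blank lines are skipped.
    """
    return [line.strip() for line in lines if line.strip()]


def match_genre(entry, genres):
    """Return the canonical spelling for a hand-typed genre, or None.

    Matching ignores case and surrounding whitespace, so whatever gets
    stored always uses the spelling from the genre list.
    """
    lookup = {g.casefold(): g for g in genres}
    return lookup.get(entry.strip().casefold())


# Placeholder genre names, standing in for the real list's contents.
genres = load_genres(["East Coast\n", "\n", "West Coast\n", "Southern\n"])
```

With a real list, an open file object could be passed to load_genres in place of the sample lines. Rejecting any entry for which match_genre returns None is one way to stop typos from creating spurious genres.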