The features that gave me the best KNN accuracy were Centroid, Flux, MFCC, Kurtosis, and Chroma. I initially chose these on the intuition that we probably needed a balanced representation of the audio signal, including spectral features, rhythmic patterns, MFCC, etc. From there I tried excluding features (down to 1) and including more (up to 7). In my experiments, both approaches yielded worse KNN accuracy than my first experiment. The fold accuracy values for my final features were some variation of the following:
fold 0 accuracy: 0.4055
fold 1 accuracy: 0.3905
fold 2 accuracy: 0.3900
fold 3 accuracy: 0.4105
fold 4 accuracy: 0.4065
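To put those five numbers in one place, here is a minimal Python sketch that averages the fold accuracies listed above and measures how much they vary from fold to fold. The comparison against chance is only an illustration: `assumed_num_classes = 10` is my assumption, since the class count isn't stated here, so swap in the real number of genres.

```python
from statistics import mean, stdev

# Fold accuracies copied from the run reported above.
fold_accuracies = [0.4055, 0.3905, 0.3900, 0.4105, 0.4065]

mean_accuracy = mean(fold_accuracies)
fold_spread = stdev(fold_accuracies)  # sample standard deviation across folds

print(f"mean accuracy: {mean_accuracy:.4f}")
print(f"std across folds: {fold_spread:.4f}")

# ASSUMPTION: 10 genre classes; the write-up does not state the class count.
assumed_num_classes = 10
chance_accuracy = 1 / assumed_num_classes
print(f"ratio to assumed chance level: {mean_accuracy / chance_accuracy:.2f}x")
```

The folds sit close together around 0.40, so the differences between feature sets matter more than the fold-to-fold noise.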
As a human sanity check, I played back some disco to see whether the classifier did OK. I don't think it does: even when I played back disco clips taken from the training data, it gave disco a value of only 0.1, and quite infrequently at that.
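One possible reading of that low disco value, sketched in Python below, assumes the number is the fraction of the k nearest neighbors labeled disco. I haven't confirmed that this is how the classifier I used computes it. The function name `knn_class_probabilities` and the toy 2-D points are made up for illustration and are not my real features.

```python
import math
from collections import Counter

def knn_class_probabilities(query, training_points, k):
    """Return {label: fraction of the k nearest neighbors with that label}."""
    # Sort training points by Euclidean distance to the query vector.
    by_distance = sorted(training_points, key=lambda item: math.dist(query, item[0]))
    votes = Counter(label for _, label in by_distance[:k])
    return {label: count / k for label, count in votes.items()}

# Toy 2-D feature vectors, invented for illustration only.
training_points = [
    ((0.0, 0.0), "disco"),
    ((0.1, 0.0), "rock"),
    ((0.0, 0.1), "rock"),
    ((0.1, 0.1), "pop"),
    ((5.0, 5.0), "disco"),
]

# The query is itself a training point labeled "disco".
probabilities = knn_class_probabilities((0.0, 0.0), training_points, k=4)
print(probabilities)
```

Under this reading, a clip from the training set can still score low for its own genre: it counts as its own nearest neighbor, but the other neighbors can easily come from different genres if the features don't separate the classes well.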
I extracted features from a traditional Bulgarian folk choral song, "Bre Petrunko", which is about the desire to go dancing in a village. It has some really fun chords. It's hard and fun to scream and DJ at the same time. While working on this I often pondered the question: is this a good idea? The answer is no, it's not what I'm paid for. But I will keep tweaking things to maybe make an interesting new experimental DJ style. Right now using the mouse is limiting. I want to free up my hand so I can make better use of both hands, so I think I wanna try using my DJ controller.
Phase 3 sketch/features to add: