I had a blast finishing up this project. It was very nostalgic for me to gather some of my favorite anime openers and closers into a compilation of concatenative synthesis. It was particularly interesting exploring how an anime opener and closer relate to one another and seeing how creating a model from the pairing manifests as a mosaic. Tbh I’m trying to BS this response because I don’t want to spoil the fun of watching the video. I think it would have been lit to explore each pairing on its own and hear all the sounds that come from combining just one opener and closer.
Phase 1 was a bit of a challenge because I wasn't sure exactly what I was looking for in terms of sound and the final output. After reading Perry Cook's paper, it seemed like the best model would depend heavily on the input and the desired output. So I tried a couple of things, first varying the number of MFCCs and seeing how the classifier fared. I confirmed that 5 MFCCs seemed adequate to get decent accuracy for the model. In my 21-feature extraction, I found that I was able to get pretty good accuracy with just 20 MFCCs and the centroid:
fold 0 accuracy: 0.4015
fold 1 accuracy: 0.4054
fold 2 accuracy: 0.3985
fold 3 accuracy: 0.4211
fold 4 accuracy: 0.4049
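For context on where per-fold numbers like these come from, here is a minimal sketch of 5-fold cross-validation. It assumes a 1-nearest-neighbor classifier, and the data is made up for illustration; the actual classifier, features, and dataset used in the project are not shown here, so treat every name below as hypothetical.

```python
def k_fold_indices(n, k):
    """Split range(n) into k contiguous, nearly equal folds."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def nearest_label(x, train):
    """Predict the label of the closest training point (squared Euclidean)."""
    closest = min(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(x, item[0])),
    )
    return closest[1]


def cross_validate(data, k=5):
    """Return one accuracy per fold, holding each fold out in turn."""
    accuracies = []
    for fold in k_fold_indices(len(data), k):
        held_out = set(fold)
        train = [d for i, d in enumerate(data) if i not in held_out]
        test = [data[i] for i in fold]
        correct = sum(nearest_label(x, train) == y for x, y in test)
        accuracies.append(correct / len(test))
    return accuracies


# Hypothetical, easily separable one-feature data: class "A" sits near 0,
# class "B" near 1, interleaved so every fold contains both classes.
data = [([(i % 2) + 0.01 * i], "A" if i % 2 == 0 else "B") for i in range(20)]

for i, acc in enumerate(cross_validate(data)):
    print(f"fold {i} accuracy: {acc:.4f}")
```

Real audio features overlap far more than this toy data, which is why the accuracies reported here sit well below 1.0.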
I also found an example where I avoided MFCCs altogether and only used centroid, flux, and SFM. I'm really curious to see how the output would differ with this model:
fold 0 accuracy: 0.3877
fold 1 accuracy: 0.3838
fold 2 accuracy: 0.3775
fold 3 accuracy: 0.3412
fold 4 accuracy: 0.3760
For some reason, zero crossing really messed my models up. Whenever I included it as a feature, my accuracy dropped drastically, which I found surprising. Because of how it was described in the paper, I thought it might be able to distinguish between pieces that have quieter parts or silences, but I'm not sure how that worked out. I also experimented with changing the weights of the features to see whether that had any effect on the classifier. It did, so maybe it’ll have an effect on the audio itself.
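One thing worth checking about zero crossing: the rate only counts sign changes, so it does not depend on how loud the signal is. The sketch below uses a plain-Python definition of zero-crossing rate and synthetic sine waves (both are assumptions for illustration, not the project's actual extractor) to show that a quiet tone and a loud tone at the same pitch get the same value, while a higher-pitched tone gets a larger one. It does not by itself explain the accuracy drop.

```python
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def sine(freq_hz, amplitude, sample_rate=44100, n=1024):
    """Synthetic test tone; the small phase offset avoids exact zeros."""
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate + 0.1)
        for i in range(n)
    ]


loud = sine(440, 1.0)      # loud, low-pitched tone
quiet = sine(440, 0.01)    # same tone, 100x quieter
bright = sine(4400, 1.0)   # loud, higher-pitched tone

print("loud  :", zero_crossing_rate(loud))
print("quiet :", zero_crossing_rate(quiet))
print("bright:", zero_crossing_rate(bright))
```

So zero crossing behaves more like a rough brightness or noisiness cue than a loudness cue, which may be part of why it did not separate quieter passages the way I expected.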
Eventually I went with a 30-feature model, adding chroma, kurtosis, rolloff, and 13 MFCCs as mentioned in Perry Cook’s paper, and got the following accuracy:
fold 0 accuracy: 0.4711
fold 1 accuracy: 0.4392
fold 2 accuracy: 0.4559
fold 3 accuracy: 0.4912
fold 4 accuracy: 0.4706
This was the highest accuracy I was able to achieve, and I do like the sounds that came from it.
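To compare the three feature sets at a glance, the per-fold accuracies reported above can be averaged. This is just arithmetic on the numbers already listed; the dictionary keys are shorthand labels chosen for this sketch.

```python
from statistics import mean

# Per-fold accuracies copied from the results reported above.
folds = {
    "20 MFCCs + centroid": [0.4015, 0.4054, 0.3985, 0.4211, 0.4049],
    "centroid + flux + SFM": [0.3877, 0.3838, 0.3775, 0.3412, 0.3760],
    "30-feature model": [0.4711, 0.4392, 0.4559, 0.4912, 0.4706],
}

# Average the five folds for each feature set.
means = {name: mean(scores) for name, scores in folds.items()}
best = max(means, key=means.get)

for name, value in means.items():
    print(f"{name}: mean accuracy {value:.4f}")
print("best:", best)
```

The averages come out to roughly 0.41, 0.37, and 0.47, so the 30-feature model leads by about six points over the 20-MFCC version.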
For phase 2, I ended up compiling some of my favorite anime openers and closers to train the model. The idea was to have some sort of fusion between the opening and ending songs to see what outputs I could get. I eventually tried to play with the weights of the features as well, but I'm not really sure if it changes much. I was able to get some of the videos to play at the same time in Chunity, but I have some ideas about shuffling the videos around, playing some iconic anime lines to drive the data, and maybe focusing on one anime opener and closer at a time somehow.
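On the question of whether feature weights change much: in a mosaic, each incoming window is matched to the closest stored window, so weights matter whenever they flip which window is closest. Below is a toy sketch assuming a weighted Euclidean distance and two made-up feature vectors; the function names, labels, and values are all hypothetical and not taken from the project.

```python
import math


def weighted_distance(a, b, weights):
    """Euclidean distance with a weight on each squared feature difference."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def nearest_window(query, database, weights):
    """Return the label of the stored window closest to the query features."""
    label, _ = min(
        database,
        key=lambda item: weighted_distance(query, item[1], weights),
    )
    return label


# Hypothetical two-feature vectors (say, centroid and flux), made up here.
database = [
    ("opener window", [0.9, 0.3]),
    ("closer window", [0.4, 0.9]),
]
query = [0.6, 0.4]

equal = nearest_window(query, database, weights=[1.0, 1.0])
reweighted = nearest_window(query, database, weights=[1.0, 0.05])

print("equal weights     ->", equal)
print("second feature low ->", reweighted)
```

With equal weights the query matches the opener window; turning the second feature's weight down makes the closer window the better match. Whether the real model shows a shift this clear depends on how spread out its features are.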
Sources
Unravel - Tokyo Ghoul
https://www.youtube.com/watch?v=7aMOurgDB-o
Kaikai Kitan - Jujutsu Kaisen
https://www.youtube.com/watch?v=GwaRztMaoY0
Battlecry - Samurai Champloo
https://www.youtube.com/watch?v=Eq6EYcpWB_c
Tank - Cowboy Bebop
https://www.youtube.com/watch?v=EL-D9LrFJd4
Colors - Code Geass
https://www.youtube.com/watch?v=G8CFuZ9MseQ
My War - Attack on Titan
https://www.youtube.com/watch?v=rwCJvSKzQkc&t=48s
Nothings Carved in Stone Out of Control - Psycho-Pass
https://www.youtube.com/watch?v=9StfX1p9LuY
Seishun Kyousoukyoku - Naruto
https://www.youtube.com/watch?v=_ty-Nqm4Pdc
Crazy Noisy Bizarre Town - Jojo
https://www.youtube.com/watch?v=20m00ohYASw
Demons Butterfly - Devilman Crybaby Rap
https://www.youtube.com/watch?v=qg7VHv1UEWo
This Fffire - Cyberpunk: Edgerunners
https://www.youtube.com/watch?v=OifiVCnFKzM
Asterisk - Bleach
https://www.youtube.com/watch?v=_ty-Nqm4Pdc&t=13s
Wind - Naruto
https://www.youtube.com/watch?v=wzoIZO8WbI8
Lost in Paradise - Jujutsu Kaisen
https://www.youtube.com/watch?v=6riDJMI-Y8U
Monster Without a Name - Psycho-Pass
https://www.youtube.com/watch?v=sF0QLtk3YH0
I Really Want to Stay at Your House - Cyberpunk: Edgerunners
https://www.youtube.com/watch?v=4O746lUintc
Roundabout
https://www.youtube.com/watch?v=cUyTmUwEUy8
Devilman Rap - Devilman Crybaby; Soundcloud- ghost609
https://www.youtube.com/watch?v=Irik4Sghsts
Saihate - Bleach
https://www.youtube.com/watch?v=LYdCktLgvak
Shiki No Uta - Samurai Champloo
https://www.youtube.com/watch?v=Q7QdUFgfAUk
The Real Folk Blues - Cowboy Bebop
https://www.youtube.com/watch?v=Ru_H5PiyfSA
Great Escape - Attack on Titan
https://www.youtube.com/watch?v=sFdzNhJAdco
Kisetsu wa Tsugitsugi Shinde Iku - Tokyo Ghoul
https://www.youtube.com/watch?v=MAw59zSbGqs
Yuujyou Seishunnka - Code Geass
https://www.youtube.com/watch?v=ZTmQiCHKPHE
Thanks to Ge, Nick, Alex, Andrew, and Yikai for all their help in making this work.