Hit Predict

Using Spotify Data to Predict Billboard Hits

Stanford CS 229 Machine Learning

This quarter in Machine Learning, my group and I approached the Hit Song Science problem, aiming to predict which songs will become Billboard Hot 100 hits. We collated a dataset of approximately 4,000 hit and non-hit songs and extracted each song's audio features from Spotify. We were able to predict the Billboard success of a song with approximately 75% accuracy using several machine-learning algorithms. Enjoy our poster and writeup below!
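To make the setup concrete, here is a minimal sketch of hit vs. non-hit classification from audio features. It is not the project's actual pipeline: logistic regression is assumed here as one plausible choice of algorithm, the data is synthetic (generated by a made-up rule, not drawn from Spotify or Billboard), and the three feature names are only an illustrative subset of Spotify's audio features. All function names are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_synthetic_songs(n, seed=0):
    # SYNTHETIC placeholder data, not real songs. Each row is
    # [danceability, energy, valence] in [0, 1]; the label (1 = hit)
    # comes from an invented linear rule plus Gaussian noise.
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        danceability, energy, valence = rng.random(), rng.random(), rng.random()
        score = 2.0 * danceability + energy - 1.5 + rng.gauss(0, 0.3)
        X.append([danceability, energy, valence])
        y.append(1 if score > 0 else 0)
    return X, y


def train_logistic_regression(X, y, lr=1.0, epochs=500):
    # Batch gradient descent on the mean logistic loss.
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    # Predict 1 (hit) when the estimated probability is at least 0.5.
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0


def accuracy(w, b, X, y):
    # Fraction of examples whose predicted label matches the true label.
    return sum(predict(w, b, xi) == yi for xi, yi in zip(X, y)) / len(X)


if __name__ == "__main__":
    X, y = make_synthetic_songs(500, seed=0)
    # Hold out the last 100 synthetic songs for evaluation.
    w, b = train_logistic_regression(X[:400], y[:400])
    print("held-out accuracy:", accuracy(w, b, X[400:], y[400:]))
```

The accuracy this prints reflects only the invented labeling rule and says nothing about real songs. With real data, each row would instead hold a song's Spotify audio features and the label would record whether the song charted.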

Stanford CS 229 Machine Learning: Final Writeup (pdf).


I wrote a bit more about the Spotify audio features in one of my Medium articles: "Do Songs of the Summer Sound the Same?" Here you can see an illustration of the audio features of top songs in 2018 vs. 2008.