- TensorFlow
- Matplotlib
- Librosa
- NumPy
- Pandas
- fma_small
- 8,000 tracks of 30 s each, 8 balanced genres (7.2 GB)
- Mel spectrograms were created from the first 10 seconds of each track in the dataset
- Since these spectrograms are images, we can treat this as an image classification problem
- We train a CNN to classify these spectrograms into their genres
- The Adam optimizer was used for training
- Categorical cross-entropy loss was used, since this is a multi-class classification problem
- We noticed overfitting. To reduce it, we:
  - Removed instrumental tracks from our data, since they had the highest misclassification rate
  - Reduced the complexity of our CNN model
  - Added regularization and a dropout layer
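A small CNN with the training setup and overfitting countermeasures listed above might look like the sketch below. The layer sizes, the choice of L2 as the regularizer, the dropout rate, the input shape, and the function name `build_cnn` are all assumptions made for illustration. The default of 7 classes assumes the 8 genres minus the removed instrumental tracks.

```python
import tensorflow as tf


def build_cnn(input_shape=(128, 431, 1), num_classes=7, l2=1e-4, dropout=0.5):
    """Build a deliberately small CNN (hypothetical architecture, not the project's)."""
    # Assumed regularizer: L2 weight decay on the convolution kernels.
    reg = tf.keras.regularizers.l2(l2)
    model = tf.keras.Sequential([
        tf.keras.Input(shape=input_shape),
        tf.keras.layers.Conv2D(16, 3, activation="relu", kernel_regularizer=reg),
        tf.keras.layers.MaxPooling2D(2),
        tf.keras.layers.Conv2D(32, 3, activation="relu", kernel_regularizer=reg),
        tf.keras.layers.MaxPooling2D(2),
        tf.keras.layers.GlobalAveragePooling2D(),
        # Dropout randomly zeroes features during training to limit overfitting.
        tf.keras.layers.Dropout(dropout),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])
    # Adam and categorical cross-entropy, as described; labels must be one-hot.
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

Calling `model.fit` with a validation split returns a history object whose accuracy and loss values can be plotted with Matplotlib to compare training against validation.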
- We got the following accuracy and loss graphs (blue = validation, orange = training)