In this article we present our research on applying different methods for music genre classification to the popular GTZAN dataset. We show our pipeline for extracting features from the raw audio signal and scaling them into usable data. We then describe traditional methods that use these features, such as k-NN, SVM, and random forest. We also review the application of deep neural networks in this domain. Our results indicate that SVM is the most suitable traditional model, with a classification accuracy of 78.4%, while CNNs work best when the input is sub-sampled into shorter clips and majority voting over the clip predictions is introduced at the end. In that setup the classification accuracy reaches 93.6%.
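The sub-sampling and majority-voting idea can be illustrated with a minimal sketch. It assumes each track is cut into consecutive, non-overlapping clips and that some classifier has already produced one genre label per clip. The helper names `split_into_clips` and `majority_vote` are hypothetical and are not taken from the project code.

```python
from collections import Counter


def split_into_clips(samples, clip_len):
    """Split a sample sequence into consecutive, non-overlapping clips.

    Any incomplete clip at the end of the track is dropped.
    """
    if clip_len <= 0:
        raise ValueError("clip_len must be positive")
    n_clips = len(samples) // clip_len
    return [samples[i * clip_len:(i + 1) * clip_len] for i in range(n_clips)]


def majority_vote(clip_predictions):
    """Return the label predicted for the largest number of clips.

    Ties are resolved in favour of the label that appears first.
    """
    if not clip_predictions:
        raise ValueError("need at least one clip prediction")
    return Counter(clip_predictions).most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical per-clip predictions for a single track.
    predictions = ["rock", "rock", "metal", "rock", "blues"]
    print(majority_vote(predictions))
```

With an assumed sample rate of 22050 Hz, a 3 s clip corresponds to `clip_len = 66150` samples. The track-level label is then the vote over the labels of all clips from that track.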
- Data exploration
- Read music file
- Visualize waveform
- Extract time domain features
- Extract frequency domain features
- Extract features from data and save them in dataframe
- Visualize with PCA or t-SNE
- Run traditional machine learning algorithms (k-NN, Random Forest, Gradient Boosting, SVM, Logistic Regression)
- Preliminary run with default parameters
- Evaluate features with ReliefF or similar
- Find the best parameters for each method
- Save results
- Extract image features from data
- Run different CNNs on image data
- Run my CNN
- Transfer learning
- CNN + SVM
- CNN on 3s clips
- Evaluate and compare
- Music Genre Classification: A Review of Deep-Learning and Traditional Machine-Learning Approaches
- Music Genre Classification and Recommendation by Using Machine Learning Techniques
- Music Genre Classification: A Comparative Study Between Deep-Learning And Traditional Machine Learning Approaches
- Short Time Fourier Transform based music genre classification
- Musical Genre Classification of Audio Signals
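The time-domain feature extraction and scaling steps from the pipeline list above can be sketched as follows. The list does not name the specific features, so zero-crossing rate and RMS energy are used here as common, assumed examples, and z-score standardization is an assumed choice of scaling. All function names are hypothetical, and the sketch uses only the standard library so that it runs without audio dependencies.

```python
import math
from statistics import mean, pstdev


def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(samples) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(samples) - 1)


def rms_energy(samples):
    """Root-mean-square amplitude of the signal."""
    if not samples:
        raise ValueError("need at least one sample")
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def standardize(column):
    """Scale one feature column to zero mean and unit variance (z-score)."""
    mu = mean(column)
    sigma = pstdev(column)
    if sigma == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]


if __name__ == "__main__":
    # Two toy "tracks"; real input would be decoded audio samples.
    tracks = [[0.5, -0.5, 0.5, -0.5], [0.2, 0.3, 0.1, -0.1]]
    zcr_column = [zero_crossing_rate(t) for t in tracks]
    rms_column = [rms_energy(t) for t in tracks]
    print(standardize(zcr_column), standardize(rms_column))
```

Standardizing each feature column matters for distance-based and margin-based models such as k-NN and SVM, because features on larger numeric scales would otherwise dominate the result.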