Spotifeature is a project for the Data Wrangling course @ Vrije Universiteit Amsterdam aiming to investigate possible relations between audio features of a playlist and playlist metadata (e.g., popularity measured in followers). The project report outlines the findings of the research.
The code for this project is distributed over multiple notebooks as many parts can be seen as individual (isolated) steps.
- `visualization_initial.ipynb` is used for gaining a first understanding of the dataset, which informs further decisions (e.g., the minimum followers threshold). The notebook offers tools for generating
- `acquisition_playlists.ipynb` is used for processing the initial dataset (1,000,000 public Spotify playlists). The notebook can be used to create two different outputs:
  - The original dataset, stripped of some (unneeded) playlist attributes, serialized as a Python pickle file.
  - A list of all unique track IDs (used for audio feature acquisition), also as a Python pickle file.
- `acquisition_features.ipynb` is used for building a dataset of audio features for the track IDs identified in the previous notebook. The resulting dataset is saved as a `csv` file.
- `processing.ipynb` (with `process_playlist.py`) is used for generating playlist metrics based on the audio features. The results are serialized to a Python pickle file.
- `data_visualizatoin.ipynb` is used to generate all graphs used for feature trend discovery according to the research questions, and the individual graphs used for the report/presentation.
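
The acquisition and processing steps above can be pictured with a minimal sketch. It is not the code from the notebooks: the field names (`pid`, `num_followers`, `tracks`), the helper functions, the file name `track_ids.pkl`, and the use of a plain mean as the playlist metric are all assumptions made for illustration, and the example data is made up.

```python
import pickle
import tempfile
from pathlib import Path
from statistics import mean

# Hypothetical field names; the real dataset's schema may differ.
KEPT_ATTRIBUTES = ("pid", "num_followers", "tracks")


def strip_playlist(playlist):
    """Keep only the attributes assumed to be needed downstream."""
    return {key: playlist[key] for key in KEPT_ATTRIBUTES}


def unique_track_ids(playlists):
    """Collect every distinct track ID across all playlists (sorted for stable output)."""
    return sorted({track_id for p in playlists for track_id in p["tracks"]})


def playlist_metrics(playlist, features):
    """Average each audio feature over the playlist's tracks that have features.

    A plain mean is only a stand-in for whatever metrics the project computes.
    """
    rows = [features[t] for t in playlist["tracks"] if t in features]
    if not rows:
        return {}
    return {name: mean(row[name] for row in rows) for name in rows[0]}


def save_pickle(obj, path):
    """Serialize an object to a pickle file."""
    with open(path, "wb") as fh:
        pickle.dump(obj, fh)


def load_pickle(path):
    """Load an object back from a pickle file."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


# Tiny made-up example data, only to exercise the functions.
raw_playlists = [
    {"pid": 0, "name": "a", "num_followers": 12, "tracks": ["t1", "t2"]},
    {"pid": 1, "name": "b", "num_followers": 3, "tracks": ["t2", "t3"]},
]
audio_features = {
    "t1": {"energy": 0.2, "tempo": 100.0},
    "t2": {"energy": 0.6, "tempo": 120.0},
}

# Step 1: strip unneeded attributes and collect unique track IDs.
stripped = [strip_playlist(p) for p in raw_playlists]
track_ids = unique_track_ids(stripped)

# Step 2: round-trip the track IDs through a pickle file.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "track_ids.pkl"
    save_pickle(track_ids, path)
    restored_ids = load_pickle(path)

# Step 3: compute per-playlist metrics from the audio features.
metrics = {p["pid"]: playlist_metrics(p, audio_features) for p in stripped}
```

Tracks without audio features (here `t3`) are simply skipped when averaging; whether the actual notebooks handle missing features this way is not stated in this README.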
As most of the generated (intermediate) data is substantial in size, the files are stored separately in a Google Drive Folder.
Name | Profile |
---|---|
Lennart K.M. Schulz | GitHub, LinkedIn |
Laura I.M. Stampf | GitHub, LinkedIn |
Dovydas Vadišius | GitHub, LinkedIn |