Spotifeature

Spotifeature is a project for the Data Wrangling course at Vrije Universiteit Amsterdam. It investigates possible relations between the audio features of a playlist and the playlist's metadata (e.g., popularity measured in followers). The project report outlines the findings of the research.

Notebooks

The code for this project is split across multiple notebooks, since many parts can be treated as individual (isolated) steps.

Pre-Tasks

  • visualization_initial.ipynb is used for gaining a first understanding of the dataset, on which further decisions (e.g., the minimum followers threshold) are based. The notebook offers tools for generating

Acquisition & Cleaning

  • acquisition_playlists.ipynb is used for processing the initial dataset (1,000,000 public Spotify playlists). The notebook can be used to create two different outputs:

    1. The original dataset, stripped of some (unneeded) playlist attributes, serialized as a Python pickle file.
    2. A list of all unique track IDs (used for audio feature acquisition), also as a Python pickle file.
  • acquisition_features.ipynb is used for building a dataset of audio features for the track IDs identified in the previous notebook. The resulting dataset is saved as a CSV file.
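
The two outputs of acquisition_playlists.ipynb described above can be illustrated with a minimal sketch. It is not the notebook's actual code: the field names (`pid`, `name`, `num_followers`, `tracks`, `track_id`), the helper names, and the sample playlists are all assumptions made for illustration, and the real dataset may be structured differently.

```python
import pickle
from pathlib import Path

# Hypothetical attribute names; the real dataset's fields may differ.
KEEP_FIELDS = ("pid", "name", "num_followers", "tracks")


def strip_playlist(playlist, keep=KEEP_FIELDS):
    """Return a copy of the playlist containing only the attributes in `keep`."""
    return {key: playlist[key] for key in keep if key in playlist}


def unique_track_ids(playlists):
    """Collect unique track IDs across all playlists, in first-seen order."""
    seen = {}  # dicts preserve insertion order, so this also deduplicates
    for playlist in playlists:
        for track in playlist.get("tracks", []):
            seen.setdefault(track["track_id"], None)
    return list(seen)


def save_pickle(obj, path):
    """Serialize `obj` to `path` as a Python pickle file."""
    with Path(path).open("wb") as fh:
        pickle.dump(obj, fh, protocol=pickle.HIGHEST_PROTOCOL)


# Tiny made-up sample standing in for the playlist dataset.
sample = [
    {"pid": 0, "name": "a", "num_followers": 3, "description": "x",
     "tracks": [{"track_id": "t1"}, {"track_id": "t2"}]},
    {"pid": 1, "name": "b", "num_followers": 1,
     "tracks": [{"track_id": "t2"}, {"track_id": "t3"}]},
]

stripped = [strip_playlist(p) for p in sample]  # output 1: reduced dataset
track_ids = unique_track_ids(stripped)          # output 2: unique track IDs
```

Calling `save_pickle(stripped, "playlists.pkl")` and `save_pickle(track_ids, "track_ids.pkl")` would then write both outputs to disk; the file names here are placeholders. Deduplicating the track IDs up front means each track's audio features only need to be requested once in the following notebook.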

Processing

Visualization

  • data_visualizatoin.ipynb is used to generate all graphs used for feature trend discovery according to the research questions, and the individual graphs used for the report/presentation.

Data

As most of the generated (intermediate) data is substantial in size, the files are stored separately in a Google Drive Folder.

Authors

Name Profile
Lennart K.M. Schulz GitHub, LinkedIn
Laura I.M. Stampf GitHub, LinkedIn
Dovydas Vadišius GitHub, LinkedIn