Tom R edited this page Feb 8, 2020 · 15 revisions

Havelovewilltravel documentation

Havelovewilltravel is a project that automatically aggregates ConcertAnnouncements from gigfinder platforms such as Facebook, Songkick, Bandsintown and Setlist.fm. Artists, and their references to these gigfinder platforms, are maintained via MusicBrainz.

These platforms contain many ConcertAnnouncements, some of which are duplicated across platforms. Havelovewilltravel makes a best effort to automatically de-duplicate such ConcertAnnouncements into Concerts, each of which refers to one or more ConcertAnnouncements.
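The de-duplication described above can be illustrated with a minimal sketch. This is not the hlwtadmin implementation: the field names (`platform`, `artist_mbid`, `day`, `city`) and the matching rule (same artist, same date, same city after trimming and case-folding) are assumptions chosen only to show how several ConcertAnnouncements can collapse into one Concert that keeps references to all of them.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class ConcertAnnouncement:
    # All field names are hypothetical, chosen for illustration only.
    platform: str      # e.g. "songkick" or "bandsintown"
    artist_mbid: str   # MusicBrainz identifier of the artist
    day: date
    city: str


@dataclass
class Concert:
    artist_mbid: str
    day: date
    city: str
    # The announcements this concert was derived from.
    announcements: list = field(default_factory=list)


def deduplicate(announcements):
    """Group announcements that share artist, date and normalised city.

    Assumed matching rule; a real rule set would likely be fuzzier.
    """
    concerts = {}
    for ann in announcements:
        key = (ann.artist_mbid, ann.day, ann.city.strip().casefold())
        concert = concerts.get(key)
        if concert is None:
            # First announcement seen for this key creates the concert.
            concert = Concert(ann.artist_mbid, ann.day, ann.city.strip())
            concerts[key] = concert
        concert.announcements.append(ann)
    # Dicts preserve insertion order, so concerts come out in first-seen order.
    return list(concerts.values())
```

In this sketch, two announcements for the same artist on the same day in "Brussels" and " brussels " end up attached to a single Concert, while an announcement for a different date yields a separate Concert.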

At the same time, other information, namely venue/organisation information and location information, is also subject to automatic best-effort rules to maintain data quality.

Despite these automatic data quality rules, some manual quality assurance is still needed. The speed and volume at which the information is aggregated make it impractical to maintain the data in spreadsheets. Therefore, we are developing a Data Management Tool for Concert Announcements.

"hlwtadmin" is a Django (Python) web application that handles

  • the automatic aggregation of concert announcements (via APIs and screenscraping),
  • the rule engine that automatically improves data quality, and
  • a dashboard and tooling for manual Quality Assurance.

This tool is being developed by Tom and Quinten at Kunstenpunt, Belgium.

Data model

Relations

Semantics

Merge functionalities

Automation

QA Lists and procedures

Batch operations

Development
