Hop Suisse - public repo for our ADA project
In this project we scraped, analyzed and visualized data from the Datasport website, focusing on records of running races.
A more detailed description of the steps we followed can be found below and on the poster.
The project is run by a new team, obtained by merging two previous ADA teams:
Here is a brief list of the steps we went through in our project.
Describes the guidelines and goals of the project.
From the Datasport main page, we make requests to extract the name, date and place of every running competition, as well as the URLs where the results can be found.
Results :
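As an illustration of this step, here is a minimal sketch that pulls the date, name, place and results URL out of an HTML table and writes them as CSV rows. The sample markup, the `example.org` URLs, the race names and the column names are all made up for the example; the real Datasport pages and the actual columns of links2runs.csv may look different.

```python
import csv
import io
from html.parser import HTMLParser

# Made-up sample of an overview page; the real Datasport markup may differ.
SAMPLE_HTML = """
<table>
  <tr><td>2016-10-30</td><td>Example Marathon</td><td>Lausanne</td>
      <td><a href="https://example.org/results/marathon-2016">results</a></td></tr>
  <tr><td>2016-04-24</td><td>Example City Run</td><td>Bern</td>
      <td><a href="https://example.org/results/city-run-2016">results</a></td></tr>
</table>
"""


class RaceTableParser(HTMLParser):
    """Collect one (date, name, place, url) row per <tr> element."""

    def __init__(self):
        super().__init__()
        self.rows = []         # finished rows
        self._cells = []       # text of the cells in the current row
        self._url = None       # first link found in the current row
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._cells, self._url = [], None
        elif tag == "td":
            self._in_cell = True
            self._cells.append("")
        elif tag == "a" and self._url is None:
            self._url = dict(attrs).get("href")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._url and len(self._cells) >= 3:
            date, name, place = (c.strip() for c in self._cells[:3])
            self.rows.append(
                {"date": date, "name": name, "place": place, "url": self._url}
            )

    def handle_data(self, data):
        # Only keep text that sits inside a <td>.
        if self._in_cell and self._cells:
            self._cells[-1] += data


def extract_races(html):
    parser = RaceTableParser()
    parser.feed(html)
    return parser.rows


def races_to_csv(rows):
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["date", "name", "place", "url"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


races = extract_races(SAMPLE_HTML)
print(races_to_csv(races))
```

In a real run the HTML would come from an HTTP request instead of a string constant; the parsing logic stays the same.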
From every URL found in links2runs.csv, we get all information about every race, namely the information on each runner: name, age, category, ranking, pace, etc. Note that, given the way Datasport displays its records, the parsing step was quite non-trivial.
Results : pickle (temporarily hosted on Google Drive)
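To give an idea of what parsing a runner record can involve, here is a minimal sketch that reads text lines with a regular expression and derives the pace. The line layout (rank, name, birth year, category, finish time), the sample runner and the 42.195 km distance are assumptions made for this example; they are not taken from the actual Datasport format, which is more irregular.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed layout: "<rank>. <name> <birth year> <category> <h:mm:ss or mm:ss>"
LINE_RE = re.compile(
    r"^\s*(?P<rank>\d+)\.\s+"
    r"(?P<name>.+?)\s+"
    r"(?P<year>\d{4})\s+"
    r"(?P<category>\S+)\s+"
    r"(?P<time>(?:\d+:)?\d{1,2}:\d{2})\s*$"
)


@dataclass
class RunnerResult:
    rank: int
    name: str
    birth_year: int
    category: str
    seconds: int


def time_to_seconds(text):
    """Convert 'h:mm:ss' or 'mm:ss' to a number of seconds."""
    seconds = 0
    for part in text.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def parse_result_line(line) -> Optional[RunnerResult]:
    """Return a RunnerResult, or None for headers and separator lines."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    return RunnerResult(
        rank=int(match["rank"]),
        name=match["name"],
        birth_year=int(match["year"]),
        category=match["category"],
        seconds=time_to_seconds(match["time"]),
    )


def pace_seconds_per_km(seconds, distance_km):
    return seconds / distance_km


# Made-up sample lines, including a header and a separator to skip.
sample = [
    "Rank Name              Year  Cat   Time",
    "  12. Muster Hans       1980  M30   3:05:42",
    "-----",
]
runners = [r for r in map(parse_result_line, sample) if r is not None]
for runner in runners:
    pace = pace_seconds_per_km(runner.seconds, 42.195)
    print(runner.name, f"{int(pace // 60)}:{int(pace % 60):02d} min/km")
```

Returning `None` for lines that do not match keeps the caller simple: headers, separators and malformed rows are filtered out in one place.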
- weather :
From links2runs.csv we take every date and place and look up the corresponding weather and temperature, in order to investigate the correlation between the runners' performances and the weather/temperature. Due to the API used, weather information is not available for races before July 2008.
Results :
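The July 2008 limitation suggests filtering the races before querying any weather service. The sketch below builds the list of unique (date, place) pairs worth requesting. Treating 1 July 2008 as the exact cutoff, as well as the field names and sample races, are assumptions for the example; the call to the weather API itself is left out.

```python
from datetime import date

# The text only says "July 2008"; using the first day of the month is an assumption.
WEATHER_AVAILABLE_FROM = date(2008, 7, 1)


def weather_queries(races):
    """Return unique (date, place) pairs for which weather can be requested."""
    seen = set()
    queries = []
    for race in races:
        key = (race["date"], race["place"])
        if race["date"] >= WEATHER_AVAILABLE_FROM and key not in seen:
            seen.add(key)
            queries.append(key)
    return queries


# Made-up races: one too old, one on the cutoff day, one duplicated.
races = [
    {"date": date(2005, 10, 23), "place": "Lausanne"},
    {"date": date(2008, 7, 1), "place": "Bern"},
    {"date": date(2010, 5, 1), "place": "Lausanne"},
    {"date": date(2010, 5, 1), "place": "Lausanne"},
]
queries = weather_queries(races)
print(queries)
```

Deduplicating the pairs avoids requesting the same day and place twice when several races share a venue.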
Extra steps to build, on top of links2runs.csv, a more complete table containing the scraped information plus the weather information and GPS coordinates for each location when available.
Results :
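Building the complete table amounts to a left join: every race is kept, and weather or coordinates are filled in only when available. A minimal sketch follows, assuming the weather is keyed by (date, place) and the coordinates by place. All field names and sample values here are hypothetical, and the coordinates are rough approximations.

```python
def build_full_table(races, weather_by_key, coords_by_place):
    """Left-join races with weather and GPS data; missing values become None."""
    table = []
    for race in races:
        weather = weather_by_key.get((race["date"], race["place"]), {})
        lat, lon = coords_by_place.get(race["place"], (None, None))
        table.append(
            {
                **race,
                "weather": weather.get("weather"),
                "temperature_c": weather.get("temperature_c"),
                "lat": lat,
                "lon": lon,
            }
        )
    return table


# Made-up inputs for illustration only.
races = [
    {"date": "2010-05-01", "place": "Lausanne", "name": "Example Run"},
    {"date": "2005-10-23", "place": "Somewhere", "name": "Old Example Run"},
]
weather_by_key = {
    ("2010-05-01", "Lausanne"): {"weather": "cloudy", "temperature_c": 14.0},
}
coords_by_place = {"Lausanne": (46.52, 6.63)}

full_table = build_full_table(races, weather_by_key, coords_by_place)
for row in full_table:
    print(row)
```

Keeping rows with missing weather (for example races before July 2008) means later analysis can decide for itself whether to drop or keep them.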
Data analysis performed both on particular races, such as the Lausanne Marathon, and on the entire dataset. To run the analysis on the full dataset, one should first download the pickle file.
Our goal is to display the gathered data and the analysis on a website, in a more "user-friendly" way than Datasport. Our hopsuisse website is the result, for which we used GitHub Pages, Jekyll, D3.js, Leaflet, and other tools.
We created a short video to help visualize the large Datasport dataset in a concise way. This video was inspired by Hans Rosling's popular video.