This repository will contain the code and resources used for processing data from Tidepool's Big Data Donation Project.
Note: All code currently lives in the bigdata-processing-pipeline project within tidepool-org/data-analytics and will soon be migrated into this repository for further development.
The Donor Data Pipeline is the set of tools that transforms Tidepool's donated diabetes device data into clean, de-identified datasets ready for analysis.
This pipeline includes code which:
- Collects datasets through Tidepool's API
- Cleans and corrects values, timestamps, and data formats
- Creates estimates of local time from device data
- Anonymizes datasets
- Summarizes dataset content and quality
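As a rough illustration of how these stages could fit together, here is a minimal Python sketch. It is not the code from bigdata-processing-pipeline: the function names, the field names (`time`, `timezoneOffset`, `userId`, `deviceId`, `uploadId`), and the `hashed_id` and `est_local_time` outputs are all assumptions made for this example. The API collection step is left out, the local time estimate is a naive stand-in that only applies a per-record offset, and a salted hash is pseudonymization rather than full de-identification.

```python
import hashlib
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical field names, chosen for illustration only.
TIME, OFFSET, USER = "time", "timezoneOffset", "userId"
DROP_FIELDS = {"deviceId", "uploadId"}


def clean(records):
    """Parse UTC timestamps, skip unparseable rows, and drop exact duplicates."""
    seen, cleaned = set(), []
    for rec in records:
        try:
            ts = datetime.fromisoformat(str(rec[TIME]).replace("Z", "+00:00"))
        except (KeyError, ValueError):
            continue  # missing or malformed timestamp
        key = (rec.get("type"), ts, rec.get("value"))
        if key not in seen:
            seen.add(key)
            cleaned.append({**rec, TIME: ts})
    return cleaned


def estimate_local_time(records):
    """Naive estimate: shift UTC time by the record's offset in minutes (assumed)."""
    out = []
    for rec in records:
        offset = rec.get(OFFSET) or 0  # records without an offset stay at UTC
        local = (rec[TIME] + timedelta(minutes=offset)).replace(tzinfo=None)
        out.append({**rec, "est_local_time": local})
    return out


def anonymize(records, salt):
    """Drop device identifiers and replace the user id with a salted SHA-256 hash."""
    out = []
    for rec in records:
        kept = {k: v for k, v in rec.items() if k not in DROP_FIELDS and k != USER}
        if USER in rec:
            raw = (salt + str(rec[USER])).encode("utf-8")
            kept["hashed_id"] = hashlib.sha256(raw).hexdigest()
        out.append(kept)
    return out


def summarize(records):
    """Count records per data type and report the estimated local time span."""
    times = [rec["est_local_time"] for rec in records]
    return {
        "n_records": len(records),
        "by_type": dict(Counter(rec.get("type") for rec in records)),
        "first": min(times) if times else None,
        "last": max(times) if times else None,
    }


def run_pipeline(records, salt):
    """Chain the stages: clean, estimate local time, anonymize, summarize."""
    dataset = anonymize(estimate_local_time(clean(records)), salt)
    return dataset, summarize(dataset)
```

Calling `run_pipeline(records, salt="...")` on a list of record dictionaries returns the processed dataset together with its summary. Keeping each stage as a separate function means any one of them can be swapped for a more careful implementation without touching the others.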
To learn more about the donated data, see Tidepool's list of supported devices and the data model for diabetes device data types.
A dedicated section with setup instructions and examples will be added to this repository soon. For now, see the 'Getting Started' section of our data-analytics repo.
Tidepool has a public Slack, where you can reach the Tidepool Data Science Team in the #data-analytics channel. You may also send your questions to [email protected].
- Add README
- Migrate existing code from bigdata-processing-pipeline project in tidepool-org/data-analytics
- Add 'Getting Started' folder with a guided example