This is the repository for managing and cleaning data associated with the Team Process Mapping project.
The rough plan for the repository layout is as follows:
- data_cleaning/ → scripts for cleaning up the data; each should run in a single click
  - One script per dataset
- data_cleaning_dev/ → Jupyter notebooks (.ipynb) for cleaning data and exploring different ways of cleaning it
- raw_data/ → original dataset formats
- cleaned_data/ → dataset after being processed (the ‘output’ folder for this repo)
- vector_data/ → directory where vector versions of the data are cached after processing
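
As a rough illustration of the layout above, a per-dataset script in data_cleaning/ might look like the sketch below. This is not an existing script in the repo: the CSV format, the function names `clean_rows` and `clean_dataset`, and the cleaning rules (trim whitespace, drop empty rows) are all assumptions made for the example. It uses only the Python standard library.

```python
import csv
from pathlib import Path

# Folder names follow the layout described above.
RAW_DIR = Path("raw_data")          # original dataset formats
CLEANED_DIR = Path("cleaned_data")  # the 'output' folder for this repo


def clean_rows(rows):
    """Strip surrounding whitespace and drop rows that are entirely empty.

    Hypothetical cleaning rules; a real dataset will need its own.
    """
    cleaned = []
    for row in rows:
        # Skip the None key that csv.DictReader uses for surplus fields.
        stripped = {
            key: (value or "").strip()
            for key, value in row.items()
            if key is not None
        }
        if any(stripped.values()):
            cleaned.append(stripped)
    return cleaned


def clean_dataset(name, raw_dir=RAW_DIR, cleaned_dir=CLEANED_DIR):
    """Read <raw_dir>/<name>.csv, clean it, and write <cleaned_dir>/<name>.csv."""
    raw_path = Path(raw_dir) / f"{name}.csv"
    out_path = Path(cleaned_dir) / f"{name}.csv"

    with raw_path.open(newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames or []
        rows = clean_rows(reader)

    # Create the output folder on first run so the script works in one step.
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with out_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return out_path
```

A script for a given dataset could then end with a single call such as `clean_dataset("example_dataset")`, where `example_dataset` is a placeholder name, so that running the file once produces the cleaned output.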