Synthflow is a Python package for the end-to-end production of differentially private synthetic data. It was used in the public release of Israel's National Birth Registry. For more information about the release, see https://birth.dataset.pub.
- Python 3.8
- Poetry
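Before installing, it can help to confirm that both prerequisites are present. The following is a minimal sketch, not part of synthflow: the function name `check_prerequisites` is hypothetical, and it assumes that "Python 3.8" means 3.8 is the minimum supported version, which this README does not state explicitly.

```python
import shutil
import sys

# Assumption: the "Python 3.8" requirement is treated here as a minimum version.
MIN_PYTHON = (3, 8)


def check_prerequisites():
    """Return a dict describing whether each prerequisite appears to be met."""
    return {
        "python": sys.version_info[:2] >= MIN_PYTHON,
        # shutil.which returns None when no `poetry` executable is on PATH.
        "poetry": shutil.which("poetry") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing or too old'}")
```

If the project turns out to require exactly Python 3.8, replace the `>=` comparison with `==`.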
- Clone this repository:
git clone https://github.com/shlomihod/synthflow.git
- Navigate to the directory:
cd synthflow
- Install dependencies using Poetry:
poetry install
Run the synthflow tool with the --help option to see available commands:
poetry run python -m synthflow --help
- execute: Run the synthetic data generation and evaluation process
- evaluate: Evaluate a given synthetic dataset
- span: Span the space of generation configurations for given privacy parameters (epsilon, delta)
- parallel: Run the synthetic data generation and evaluation process (execute) in parallel
- report: Generate an evaluation report of an execution
For a complete list of options and flags, refer to the output of the --help command above.
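To browse the help text for every command in one pass, a small script along the following lines may be convenient. It is a sketch under one assumption: that each subcommand accepts its own `--help` flag, as is common for Python CLIs but is not confirmed here. The helper names `help_command` and `show_all_help` are hypothetical and not part of synthflow.

```python
import subprocess
import sys

# Subcommand names as listed in this README.
SUBCOMMANDS = ["execute", "evaluate", "span", "parallel", "report"]


def help_command(subcommand=None):
    """Build the argv that shows help, optionally for a single subcommand."""
    argv = [sys.executable, "-m", "synthflow"]
    if subcommand is not None:
        argv.append(subcommand)
    # Assumption: subcommands accept --help just like the top-level tool.
    argv.append("--help")
    return argv


def show_all_help():
    """Print the top-level help, then the help of each subcommand."""
    for name in [None, *SUBCOMMANDS]:
        # check=False so a non-zero exit from one command does not stop the loop.
        subprocess.run(help_command(name), check=False)


if __name__ == "__main__":
    show_all_help()
```

Save it as a file (for example `show_help.py`, a name chosen only for illustration) and run it with `poetry run python show_help.py` so that it uses the interpreter from the Poetry environment in which synthflow is installed.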
Run tests using pytest:
pytest
This project is licensed under the MIT License. See LICENSE for details.
For questions or problems, please open an issue in this repository.