Fair Tabular Diffusion

NOTE: The code for our method is in the src directory, and the Python script for running experiments with our method is fairtabddpm_opt.py.

Setup

We used PyTorch 2.3.0+cu121 in this project. To create the environment and install the required packages, run the following commands:

conda create -n ai python=3.10
conda activate ai
pip install -r requirements.txt
pip install dgl -f https://data.dgl.ai/wheels/torch-2.3/cu121/repo.html
pip install torch-scatter torch-sparse -f https://data.pyg.org/whl/torch-2.3.0+cu121.html
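After installation, a quick check can confirm that the installed PyTorch build matches the version stated above. The following is a minimal sketch and not part of the repository: the helper names `matches_expected` and `installed_torch_version` are hypothetical, and the check only compares version strings read through the standard-library `importlib.metadata` module.

```python
from importlib import metadata
from typing import Optional

EXPECTED_TORCH = "2.3.0+cu121"  # version stated in this readme


def matches_expected(installed: str, expected: str = EXPECTED_TORCH) -> bool:
    """Return True when the installed version string equals the expected one."""
    return installed.strip() == expected


def installed_torch_version() -> Optional[str]:
    """Return the installed torch version, or None if torch is not installed."""
    try:
        return metadata.version("torch")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_torch_version()
    if version is None:
        print("torch is not installed in this environment")
    elif matches_expected(version):
        print(f"torch {version} matches the expected build")
    else:
        print(f"torch {version} found, expected {EXPECTED_TORCH}")
```

A mismatch here usually means the dgl, torch-scatter, and torch-sparse wheels above, which are built for torch 2.3 with CUDA 12.1, may not load correctly.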

To download and preprocess the datasets, run the following command:

python build.py

Running Experiments

Under the root directory, run the following command to reproduce the results of our method:

# run experiments for our method
bash fairtabddpm.sh

To reproduce the results of baseline methods, run the following commands:

# go to baselines directory
cd baselines
# run experiments for baselines
bash codi.sh
bash fairsmote.sh
bash fairtabgan.sh
bash goggle.sh
bash great.sh
bash smote.sh
bash stasy.sh
bash tabddpm.sh
bash tabsyn.sh
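If one baseline fails, running the scripts by hand makes it easy to miss. The sketch below shows one possible way to run all of them in sequence and collect the failures. It is an illustration, not part of the repository: the function names `build_commands` and `run_all` are hypothetical, and it assumes it is launched from the root directory with `bash` available on the path.

```python
import subprocess
from pathlib import Path

# Script names taken from the list of baseline commands above.
BASELINE_SCRIPTS = [
    "codi.sh", "fairsmote.sh", "fairtabgan.sh", "goggle.sh", "great.sh",
    "smote.sh", "stasy.sh", "tabddpm.sh", "tabsyn.sh",
]


def build_commands(scripts=BASELINE_SCRIPTS):
    """Return one ["bash", script] command per baseline script."""
    return [["bash", name] for name in scripts]


def run_all(baselines_dir="baselines", scripts=BASELINE_SCRIPTS):
    """Run each script inside baselines_dir and return the names that failed."""
    failed = []
    for command in build_commands(scripts):
        # cwd mirrors the `cd baselines` step in the manual instructions.
        result = subprocess.run(command, cwd=Path(baselines_dir))
        if result.returncode != 0:
            failed.append(command[1])
    return failed
```

Calling `run_all()` from the root directory runs the nine scripts one after another and returns a list of the scripts that exited with a non-zero status.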

Benchmarks

Datasets

Adult
COMPAS
German Credit
Bank Marketing

Baselines

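The baseline methods we used in this project are as follows (sorted alphabetically; each corresponds to a script in the baselines directory):

CoDi
FairSMOTE
FairTabGAN
GOGGLE
GReaT
SMOTE
STaSy
TabDDPM
TabSyn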

To Do

Avoid repetition to improve the code quality:

Replace exp_config['home'] by importing EXPS_PATH from constant.py in all running scripts
Replace data_config['path'] by importing DB_PATH from constant.py in all running scripts
Delete the experiment home and dataset path entries from all config.toml files
Add a new argument --method to optimization scripts and merge all optimization scripts into one
Find commonly used functions in all running scripts and move them to utils.py
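The first four items could look roughly like the sketch below. It is an illustration, not the repository's code: only `EXPS_PATH`, `DB_PATH`, `constant.py`, and `--method` come from the list above. The directory values, the `--dataset` argument, the `METHODS` registry, and the `optimize_*` functions are hypothetical placeholders.

```python
import argparse
from pathlib import Path

# In the repository these two names would live in constant.py and be imported
# with `from constant import EXPS_PATH, DB_PATH`; the values are assumptions.
EXPS_PATH = Path("exps")
DB_PATH = Path("db")


def optimize_fairtabddpm(dataset: str) -> Path:
    """Placeholder: return the experiment directory for our method."""
    return EXPS_PATH / "fairtabddpm" / dataset


def optimize_tabddpm(dataset: str) -> Path:
    """Placeholder: return the experiment directory for one baseline."""
    return EXPS_PATH / "tabddpm" / dataset


# Hypothetical registry mapping each --method value to its routine.
METHODS = {
    "fairtabddpm": optimize_fairtabddpm,
    "tabddpm": optimize_tabddpm,
}


def parse_args(argv=None) -> argparse.Namespace:
    """Parse the arguments of the single merged optimization script."""
    parser = argparse.ArgumentParser(description="Merged optimization script (sketch)")
    parser.add_argument("--method", choices=sorted(METHODS), required=True)
    parser.add_argument("--dataset", default="adult")  # hypothetical argument
    return parser.parse_args(argv)


def main(argv=None) -> Path:
    """Dispatch to the routine selected by --method."""
    args = parse_args(argv)
    return METHODS[args.method](args.dataset)
```

With this layout, paths are defined once and no script reads `exp_config['home']` or `data_config['path']`. Adding a method means adding one entry to the registry rather than a new script.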

Organize the code:

Move fairtabddpm.sh, fairtabddpm_run.py, and fairtabddpm_opt.py to the baselines directory, rename that directory to methods, and edit readme.md accordingly
Move src/evaluate/metrics.py out to the root directory because it is specific to the project

Automate the experiments and evaluations:

Refactor and reorganize assess/present.ipynb with functional programming
Rewrite all the code in assess directory with functional programming

Correct the errors:

The implementation of TabSyn in baselines is incorrect

Repository: comp-well-org/fair-tab-diffusion