The implementation of the PrADA algorithm published in the paper "Privacy-preserving Federated Adversarial Domain Adaptation over Feature Groups for Interpretability"

PrADA: Privacy-preserving Federated Adversarial Domain Adaptation over Feature Groups for Interpretability

We walk through the steps of running experiments on the Census Income data. Experiments on the PPD loan default data follow the same procedure.

0. Download Data

Download the raw Census Income files (census-income.data and census-income.test) and place them in the directory you will use as "original_data_dir" in the next step.

1. Prepare Census Income Data

Update the census_data_creation_config.py file.

census_data_creation = {
    "original_data_dir": "YOUR ORIGINAL DATA DIR",
    "processed_data_dir": "YOUR PROCESSED DATA DIR",
    "train_data_file_name": "census-income.data",
    "test_data_file_name": "census-income.test",
    "positive_sample_ratio": 0.04,
    "number_target_samples": 4000,
    "data_tag": "all4000pos004"
}

Then, run:

python census_prepare_data.py

This produces the processed data at the location specified by "processed_data_dir".
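
For intuition, the sketch below shows how the positive_sample_ratio and number_target_samples fields could shape the processed data. It is illustrative only, not the code in census_prepare_data.py; the label column name ("income_label") and its 0/1 encoding are assumptions.

# Illustrative sketch only (not census_prepare_data.py): how positive_sample_ratio
# and number_target_samples might shape the processed data. The label column name
# ("income_label") and its 0/1 encoding are assumptions for illustration.
import pandas as pd

def downsample_positives(df, ratio, label_col="income_label", seed=42):
    # Keep all negatives and subsample positives so that positives make up
    # roughly `ratio` of the resulting training set (e.g. ratio = 0.04).
    neg = df[df[label_col] == 0]
    pos = df[df[label_col] == 1]
    n_pos = int(ratio * len(neg) / (1.0 - ratio))
    pos = pos.sample(n=min(n_pos, len(pos)), random_state=seed)
    return pd.concat([neg, pos]).sample(frac=1.0, random_state=seed)

def sample_target(df, n_target, seed=42):
    # Draw the small labeled target set (e.g. number_target_samples = 4000).
    return df.sample(n=min(n_target, len(df)), random_state=seed)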

2. Perform Experiments

In this section, we show the steps of running the following four variants of PrADA (summarized in the sketch after this list):

  1. PrADA: applies feature group (FG) based domain adversarial adaptation (DA) with feature group interaction (IR).
  2. PrADA w/o IR: applies feature group based domain adversarial adaptation without feature group interaction.
  3. PrADA w/o FG&IR: applies domain adversarial adaptation without feature grouping or interaction.
  4. PrADA w/o DA&FG&IR: applies no domain adversarial adaptation, no feature grouping, and no feature group interaction.
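
For quick reference, the summary below maps each variant to the scripts and key configuration flags used in Sections 2.1 through 2.4. It is a convenience sketch, not a file from the repository.

# Convenience summary (derived from Sections 2.1-2.4), not a repository file.
PRADA_VARIANTS = {
    "PrADA": {
        "scripts": ("train_census_fg_adapt_pretrain.py", "train_census_fg_target_finetune.py"),
        "using_interaction": True,
    },
    "PrADA w/o IR": {
        "scripts": ("train_census_fg_adapt_pretrain.py", "train_census_fg_target_finetune.py"),
        "using_interaction": False,
    },
    "PrADA w/o FG&IR": {
        "scripts": ("train_census_no_fg_adapt_pretrain.py", "train_census_no_fg_target_finetune.py"),
    },
    "PrADA w/o DA&FG&IR": {
        "scripts": ("train_census_no_adaptation.py",),
        "apply_feature_group": False,
    },
}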

2.1 Run PrADA

  1. Go to directory: prada/experiments/income_census/
  2. Configure hyperparameters in the train_config.py file.
    • using_interaction must be set to True in both pre_train_hyperparameters and fine_tune_hyperparameters:
pre_train_hyperparameters = {
    "using_interaction": True,
    "momentum": 0.99,
    "weight_decay": 0.00001,
    "lr": 6e-4,
    "batch_size": 128,
    "max_epochs": 600,
    "epoch_patience": 3,
    "valid_metric": ('ks', 'auc')
}

fine_tune_hyperparameters = {
    "using_interaction": True,
    "load_global_classifier": False,
    "momentum": 0.99,
    "weight_decay": 0.0,
    "lr": 8e-4,
    "batch_size": 128,
    "valid_metric": ('ks', 'auc')
}
  3. Run the pretrain task:
python train_census_fg_adapt_pretrain.py 

Once training is complete, a pretrain task ID is returned, e.g.:

20210730_census_fg_adapt_all4000pos004_intrTrue_lr0.0005_bs128_me600_ts1627606557
  4. Run the finetune task with the pretrain task ID as input:
python train_census_fg_target_finetune.py --pretrain_task_id 20210730_census_fg_adapt_all4000pos004_intrTrue_lr0.0005_bs128_me600_ts1627606557

This outputs the test AUC and test KS.
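
Since the finetune step consumes the task ID produced by the pretrain step, the two commands can be chained. The sketch below is a hypothetical convenience wrapper, not part of the repository, and it assumes the pretrain script prints the task ID as its last non-empty output line, which may not match the actual logging format.

# Hypothetical wrapper (not part of the repository). Assumes the pretrain script
# prints the task ID as its last non-empty output line.
import subprocess

pretrain = subprocess.run(
    ["python", "train_census_fg_adapt_pretrain.py"],
    capture_output=True, text=True, check=True,
)
lines = [ln.strip() for ln in pretrain.stdout.splitlines() if ln.strip()]
task_id = lines[-1]

subprocess.run(
    ["python", "train_census_fg_target_finetune.py", "--pretrain_task_id", task_id],
    check=True,
)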

2.2 Run PrADA w/o IR

  1. Go to directory: prada/experiments/income_census/
  2. Configure hyperparameters in the train_config.py file.
    • using_interaction must be set to False in both pre_train_hyperparameters and fine_tune_hyperparameters:
pre_train_hyperparameters = {
    "using_interaction": False,
    "momentum": 0.99,
    "weight_decay": 0.00001,
    "lr": 5e-4,
    "batch_size": 128,
    "max_epochs": 600,
    "epoch_patience": 3,
    "valid_metric": ('ks', 'auc')
}

fine_tune_hyperparameters = {
    "using_interaction": False,
    "load_global_classifier": False,
    "momentum": 0.99,
    "weight_decay": 0.0,
    "lr": 8e-4,
    "batch_size": 128,
    "valid_metric": ('ks', 'auc')
}
  3. Run the pretrain task:
python train_census_fg_adapt_pretrain.py 

Once training is complete, a pretrain task ID is returned, e.g.:

20210730_census_fg_adapt_all4000pos004_intrFalse_lr0.0005_bs128_me600_ts1627606557
  4. Run the finetune task with the pretrain task ID as input:
python train_census_fg_target_finetune.py --pretrain_task_id 20210730_census_fg_adapt_all4000pos004_intrFalse_lr0.0005_bs128_me600_ts1627606557

This outputs the test AUC and test KS.

2.3 Run PrADA w/o FG&IR

  1. Go to directory: prada/experiments/income_census/
  2. Configure hyperparameters in the train_config.py file.
    • NOTE: using_interaction is not used in this setting because no feature grouping is applied; leave it at its default value.
pre_train_hyperparameters = {
    "using_interaction": False,
    "momentum": 0.99,
    "weight_decay": 0.00001,
    "lr": 5e-4,
    "batch_size": 128,
    "max_epochs": 600,
    "epoch_patience": 3,
    "valid_metric": ('ks', 'auc')
}

fine_tune_hyperparameters = {
    "using_interaction": False,
    "load_global_classifier": False,
    "momentum": 0.99,
    "weight_decay": 0.0,
    "lr": 8e-4,
    "batch_size": 128,
    "valid_metric": ('ks', 'auc')
}
  3. Run the pretrain task:
python train_census_no_fg_adapt_pretrain.py 

Once training is complete, a pretrain task ID is returned, e.g.:

20210730_census_no_fg_adapt_all4000pos004_lr0.0005_bs128_me600_ts1627612696
  4. Run the finetune task with the pretrain task ID as input:
python train_census_no_fg_target_finetune.py --pretrain_task_id 20210730_census_no_fg_adapt_all4000pos004_lr0.0005_bs128_me600_ts1627612696

This outputs the test AUC and test KS.

2.4 Run PrADA w/o DA&FG&IR

  1. Go to directory: prada/experiments/income_census/
  2. Configure hyperparameters in the train_config.py file.
    • train_data_tag specifies whether all samples (source + target) or only target samples are used for training.
    • NOTE: apply_feature_group specifies whether feature grouping is applied. In this setting, it is always set to False.
no_adaptation_hyperparameters = {
    "apply_feature_group": False,
    "train_data_tag": 'all',  # can be either 'all' or 'tgt'
    "momentum": 0.99,
    "weight_decay": 0.00001,
    "lr": 5e-4,
    "batch_size": 128,
    "max_epochs": 600,
    "epoch_patience": 2,
    "valid_metric": ('ks', 'auc')
}
  3. Run the task:
python train_census_no_adaptation.py 

Once training is complete, a task ID is returned, e.g.:

20210730_all4000pos004v8_census_no_ad_wo_fg_all_lr0.0005_bs128_ts1627613954_ve0/

This outputs the test AUC and test KS.
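
To make the train_data_tag setting concrete, the sketch below shows how 'all' versus 'tgt' could select training rows. It is illustrative only; the "domain" column marking each sample as source or target is an assumption, not the repository's actual data layout.

import pandas as pd

def select_training_data(df, train_data_tag):
    # Illustrative only; assumes a hypothetical "domain" column with values
    # "source" / "target" rather than the repository's actual data layout.
    if train_data_tag == "all":   # source + target samples
        return df
    if train_data_tag == "tgt":   # target samples only
        return df[df["domain"] == "target"]
    raise ValueError("unknown train_data_tag: " + str(train_data_tag))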

2.5 Run Test

The test AUC and test KS on the target test data are reported once training is complete, as shown above. You can also evaluate a trained model on the target test data with a separate command, as follows:

Go to directory: prada/experiments/income_census/

  • task_id specifies the training task of the model that you want to test.
  • model_tag specifies the PrADA variant to test. It can be fg (with feature grouping), no_fg (without feature grouping), or no_ad (without adaptation and without feature grouping).
python test_census_target.py --task_id 20210731_census_fg_adapt_all4000pos004_intrFalse_lr0.0005_bs128_me600_ts1627682125@target_20210731_rt_glr_lr0.0008_bs128_ts1627683284 --model_tag fg
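
The reported metrics are AUC and KS on the target test set. For reference, the sketch below shows how these two metrics are commonly computed from labels and predicted scores; it uses the standard definitions and is not necessarily the repository's implementation.

import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

def auc_and_ks(y_true, y_score):
    # AUC: area under the ROC curve; KS: maximum gap between TPR and FPR.
    auc = roc_auc_score(y_true, y_score)
    fpr, tpr, _ = roc_curve(y_true, y_score)
    ks = float(np.max(tpr - fpr))
    return auc, ks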

3. Citation

The paper was accepted for publication in IEEE Transactions on Big Data, 2022. Please cite it if you find this code useful for your research.

@article{kang2022prada,
  author={Kang, Yan and He, Yuanqin and Luo, Jiahuan and Fan, Tao and Liu, Yang and Yang, Qiang},
  journal={IEEE Transactions on Big Data}, 
  title={Privacy-preserving Federated Adversarial Domain Adaptation over Feature Groups for Interpretability}, 
  year={2022},
  volume={},
  number={},
  pages={1-12},
  doi={10.1109/TBDATA.2022.3188292}}
