Efficient Multi-Task Auxiliary Learning: Selecting Auxiliary Data by Feature Similarity

This repository contains the code for reproducing the experiments in our paper.

Overview

Multi-task auxiliary learning uses auxiliary tasks to improve a primary task. However, (1) selecting auxiliary tasks that benefit a given primary task is nontrivial, and (2) when the auxiliary datasets are large, training on all of the data becomes time-consuming and impractical. We therefore propose a time-efficient sampling method that selects the auxiliary data most beneficial to the primary task. In our experiments on GLUE, the method achieves a 12x speedup compared to a fully trained MT-DNN. The following figure illustrates our framework.
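To make the idea of similarity-based selection concrete, here is a minimal sketch. It is not the method implemented in this repository, which scores auxiliary data with a trained TD-MTDNN model. As a stand-in it ranks auxiliary examples by cosine similarity to the centroid of the primary-task features. The function names and the toy feature vectors are assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_auxiliary(primary_feats, aux_feats, num_select):
    """Return indices of the num_select auxiliary examples most similar to the primary task.

    Assumption: similarity is measured against the mean primary-task feature
    vector. This is an illustrative stand-in, not the repository's scoring rule.
    """
    dim = len(primary_feats[0])
    centroid = [
        sum(row[i] for row in primary_feats) / len(primary_feats) for i in range(dim)
    ]
    scores = [cosine(row, centroid) for row in aux_feats]
    # Sort indices by descending score and keep the top num_select.
    ranked = sorted(range(len(aux_feats)), key=lambda i: scores[i], reverse=True)
    return ranked[:num_select]


# Toy 2-D features (made up): auxiliary examples 0 and 3 point in roughly
# the same direction as the primary-task examples.
primary = [[1.0, 0.0], [1.0, 0.1]]
auxiliary = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.9, 0.1]]
selected = select_auxiliary(primary, auxiliary, num_select=2)
print(selected)
```

Only the selected subset is then used for multi-task training, which is where the time saving over training on all auxiliary data comes from.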

To Reproduce Our Experiment Results in Figs. 3-5

Environment Settings

Make sure your Python version is >= 3.7.
pip3 install -r requirements.txt
The default CUDA device is 0. To use a different device, modify the CUDA variable in BCE_CE.sh, train_TD.sh, and RAND.sh.
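If you want to check the Python version requirement programmatically before installing, a small sketch such as the following can be used. The helper name python_is_supported is hypothetical and is not part of this repository.

```python
import sys

# Minimum version stated in this README.
MIN_PYTHON = (3, 7)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


if not python_is_supported():
    print("Python >= 3.7 is required, found", sys.version.split()[0])
```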

Running Experiment

The default experiment settings:

Train a BCE-TD-MTDNN model and a CE-TD-MTDNN model. Note that this script must be run before ALL_R.sh and ALL_A.sh.

./ALL_TD.sh

Train a TO-MTDNN with random sampling, using a total of 500 auxiliary examples. To change the amount of data, modify $NUM$ in ALL_R.sh.

./ALL_R.sh

Train two TO-MTDNN models using a total of 500 auxiliary examples sampled with BCE-TD-MTDNN and CE-TD-MTDNN. To change the amount of data, modify $NUM$ in ALL_A.sh.

./ALL_A.sh

Alternatively, execute all of the above scripts with a single command:

./Train_ALL.sh
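The ordering constraint described above (ALL_TD.sh before ALL_R.sh and ALL_A.sh) can be sketched as a small Python driver. This is an illustration only and makes no claim about what Train_ALL.sh does internally. The function run_pipeline and its runner parameter are hypothetical.

```python
import subprocess

# Order taken from this README: the TD models must be trained first.
PIPELINE = ["./ALL_TD.sh", "./ALL_R.sh", "./ALL_A.sh"]


def run_pipeline(scripts=PIPELINE, runner=subprocess.run):
    """Run each script in order and stop at the first failure.

    With the default runner, check=True makes subprocess.run raise
    CalledProcessError on a non-zero exit code, so a failed ALL_TD.sh
    prevents the later scripts from starting.
    """
    for script in scripts:
        runner([script], check=True)
```

Calling run_pipeline() from the repository root launches the three scripts one after another. Passing a custom runner allows a dry run that only records the commands.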

Output

When the experiment is done, the outputs are stored in the following folders:
    Models --> results/
    TD-MTDNN T-SNE visualization --> figures/
    The final submission file (for the GLUE Benchmark) --> final_submission/
    !! Note that in the final submission scores, only the RTE, MRPC, and STS-B scores are real. Because the submission requires files for all tasks, we copy the files for the other tasks from FAKE_TEST3/ to build the submission zip.
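The submission assembly described in the note can be sketched as follows. This is a minimal illustration, not the repository's own packaging script. The per-task .tsv file names, the directory layout, and the function build_submission are assumptions, and the demonstration uses throwaway directories with made-up contents.

```python
import tempfile
import zipfile
from pathlib import Path

# Tasks whose predictions are real, per the note above.
REAL_TASKS = {"RTE", "MRPC", "STS-B"}


def build_submission(pred_dir, placeholder_dir, out_zip):
    """Zip one .tsv per task, using real predictions only for REAL_TASKS."""
    pred_dir, placeholder_dir = Path(pred_dir), Path(placeholder_dir)
    used = {}
    with zipfile.ZipFile(out_zip, "w") as zf:
        # The placeholder directory defines which task files must be present.
        for placeholder in sorted(placeholder_dir.glob("*.tsv")):
            real = pred_dir / placeholder.name
            is_real = placeholder.stem in REAL_TASKS and real.exists()
            zf.write(real if is_real else placeholder, arcname=placeholder.name)
            used[placeholder.stem] = "real" if is_real else "placeholder"
    return used


# Tiny demonstration with temporary directories (hypothetical layout).
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    preds, fakes = tmp / "preds", tmp / "fakes"
    preds.mkdir()
    fakes.mkdir()
    for task in ["RTE", "MRPC", "STS-B", "CoLA"]:
        (fakes / f"{task}.tsv").write_text("placeholder\n")
    for task in REAL_TASKS:
        (preds / f"{task}.tsv").write_text("real\n")
    used = build_submission(preds, fakes, tmp / "submission.zip")
    with zipfile.ZipFile(tmp / "submission.zip") as zf:
        names = sorted(zf.namelist())
```

The returned mapping records, for each task, whether the zipped file came from real predictions or from the placeholder copies, which makes it easy to confirm that only RTE, MRPC, and STS-B carry real results.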

Citation

Please cite our paper if you use this code in your work:

@inproceedings{kung2021efficient,
   title={Efficient Multi-Task Auxiliary Learning: Selecting Auxiliary Data by Feature Similarity},
   author={Po-Nien Kung and Sheng-Siang Yin and Yi-Cheng Chen and Tse-Hsuan Yang and Yun-Nung Chen},
   booktitle={Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
   year={2021}
}
