Commit 1bf177c
update prompt-tuning codes for nlp tasks
1 parent e96acef

12 files changed: +106 -80 lines


.github/workflows/test_prompt.yml

Lines changed: 74 additions & 0 deletions
@@ -0,0 +1,74 @@
name: UnitTests for Prompt Tuning

on:
  schedule:
    - cron: '0 8 * * 0'

jobs:
  run:
    if: (false == contains(github.event.pull_request.title, 'WIP') && github.repository == 'alibaba/FederatedScope')
    runs-on: ${{ matrix.os }}
    timeout-minutes: 30
    strategy:
      matrix:
        os: [ubuntu-latest]
        python-version: ['3.9']
        torch-version: ['1.10.1']
        torchvision-version: ['0.11.2']
        torchaudio-version: ['0.10.1']
    env:
      OS: ${{ matrix.os }}
      PYTHON: '3.9'
    steps:
      - uses: actions/checkout@master
      - name: Setup Python ${{ matrix.python-version }}
        uses: actions/setup-python@master
        with:
          python-version: ${{ matrix.python-version }}
      - name: Install PyTorch ${{ matrix.torch-version }}+cpu
        run: |
          pip install numpy typing-extensions dataclasses
          pip install torch==${{ matrix.torch-version }}+cpu torchvision==${{ matrix.torchvision-version }}+cpu torchaudio==${{ matrix.torchaudio-version }}+cpu -f https://download.pytorch.org/whl/torch_stable.html
      - name: Install FS
        run: |
          pip install -e .[test]
      - name: Install Transformers
        run: |
          pip install transformers==4.21.0
      - name: Install Datasets
        run: |
          pip install datasets
      - name: Install lm-eval
        run: |
          pip install lm-eval
      - name: Test Prompt Tuning
        run: |
          python ../../main.py \
            --cfg federatedscope/nlp/prompt_tuning/baseline/config_alter_train.yaml \
            data.dataset_name arc_challenge \
            data.batch_size 1 \
            data.max_seq_len 32 \
            grad.grad_accum_count 1 \
            federate.client_num 2 \
            federate.total_round_num 2 \
            federate.make_global_train True \
            federate.pl_init_kd True \
            federate.pl_kd_cfg_file federatedscope/nlp/prompt_tuning/baseline/config_init_kd_test.yaml \
            federate.pl_global_cfg_file federatedscope/nlp/prompt_tuning/baseline/config_global.2.yaml \
            model.use_fp16 True \
            model.model_type facebook/opt-1.3b \
            model.use_prefix_prj False \
            model.server_prefix_len 4 \
            model.client_prefix_len 4 \
            model.num_server_layers 24 \
            model.num_client_layers 24 \
            model.share_client_layer_param True \
            model.client_start_layer_id 0 \
            model.num_client_layers_per_cell 1 \
            train.optimizer.lr 5e-4 \
            train.optimizer.eps 1e-4 \
            train.local_update_steps 2 \
            outdir exp/arc_challenge \
            data.is_debug True \

          [ $? -eq 1 ] && exit 1 || echo "Passed"

federatedscope/core/configs/cfg_data.py

Lines changed: 0 additions & 1 deletion
@@ -103,7 +103,6 @@ def extend_data_cfg(cfg):
     cfg.data.dataset_name = ''
     cfg.data.train_frac = 0.9
     cfg.data.num_train_per_client = -1
-    cfg.data.non_iid_split = False

     # --------------- outdated configs ---------------
     # TODO: delete this code block
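Dropping `cfg.data.non_iid_split` from the defaults means the key is no longer registered in the config tree, so a command-line override of `data.non_iid_split` would now be rejected; this is why the `run_*.sh` scripts below also drop their `NON_IID_SPLIT` overrides. A minimal sketch of that failure mode with a plain yacs `CfgNode` (assuming FederatedScope's config wrapper keeps yacs' behavior for unknown keys):

```python
from yacs.config import CfgNode as CN

# Rebuild the relevant defaults from extend_data_cfg, minus the removed key.
cfg = CN()
cfg.data = CN()
cfg.data.dataset_name = ''
cfg.data.train_frac = 0.9
cfg.data.num_train_per_client = -1

# Overriding a registered key still works.
cfg.merge_from_list(['data.train_frac', 0.8])

# Overriding the removed key is rejected, so callers must stop passing it.
try:
    cfg.merge_from_list(['data.non_iid_split', False])
except (AssertionError, KeyError) as err:
    print('rejected override:', err)
```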

federatedscope/nlp/prompt_tuning/README.md

Lines changed: 5 additions & 3 deletions
@@ -1,13 +1,15 @@
 ## Tunable Soft Prompts are Messengers in Federated Learning
 The implementation of *Tunable Soft Prompts are Messengers in Federated Learning*.

+In this study, we propose a novel FL training approach that accomplishes information exchange among participants via tunable soft prompts.
+These soft prompts are updated and transmitted between the server and clients, taking over the duty of the global model parameters and serving as messengers to deliver useful knowledge in local data and global models.

 ### Installation
 First of all, you need to install FederatedScope, please refer to [installation](https://github.com/alibaba/FederatedScope#step-1-installation).

 Besides, we need some additional requirements for NLP tasks, including:
-* transformers
-* datasets
+* Transformers
+* Datasets
 * lm-eval

 ```bash
@@ -17,7 +19,7 @@ pip install lm-eval
 ```

 ### Reproduction
-**Prefix-Tuning**
+**Prefix-tuning**
 ```bash
 bash run_gpt_prefix.sh $DEVICE # gpt2-xl
 bash run_opt_prefix.sh $DEVICE # opt-1.3b
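The paragraph added to the README above is the core idea of this code: participants exchange small tunable soft prompts (prefix vectors) rather than the full global model parameters. A minimal sketch of that exchange, using hypothetical helper names (`PrefixWrappedModel`, `extract_prefix_state`) that are illustrative only and not part of the FederatedScope API:

```python
import torch
import torch.nn as nn


class PrefixWrappedModel(nn.Module):
    """Hypothetical wrapper: a frozen backbone plus a trainable soft prompt."""

    def __init__(self, backbone: nn.Module, prefix_len: int, hidden_size: int):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():
            p.requires_grad = False  # the language model itself stays frozen
        # the only trainable (and transmitted) parameters
        self.prefix = nn.Parameter(torch.randn(prefix_len, hidden_size) * 0.02)

    def forward(self, input_embeds: torch.Tensor) -> torch.Tensor:
        # prepend the soft prompt to every sequence in the batch
        batch_prefix = self.prefix.unsqueeze(0).expand(input_embeds.size(0), -1, -1)
        return self.backbone(torch.cat([batch_prefix, input_embeds], dim=1))


def extract_prefix_state(model: PrefixWrappedModel) -> dict:
    """Only the soft prompt leaves the participant, never the backbone weights."""
    return {'prefix': model.prefix.detach().clone()}


def load_prefix_state(model: PrefixWrappedModel, state: dict) -> None:
    with torch.no_grad():
        model.prefix.copy_(state['prefix'])


# toy round trip: the "backbone" here is just an identity map over embeddings
model = PrefixWrappedModel(nn.Identity(), prefix_len=4, hidden_size=8)
out = model(torch.zeros(2, 5, 8))   # -> shape (2, 4 + 5, 8)
msg = extract_prefix_state(model)   # what a client would send to the server
load_prefix_state(model, msg)       # what a client would load from the server
```

Only the `prefix` tensor travels between server and clients in such a scheme, which is what lets the soft prompts act as messengers while the backbone stays local.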
Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
federate:
  total_round_num: 1
  pl_alter_train: False
data:
  batch_size: 1
model:
  server_prefix_len: 0
  client_prefix_len: 0
  server_freeze_param: ['model']
  client_freeze_param: []
  only_use_hidden_loss: True
train:
  batch_or_epoch: batch
  local_update_steps: 10
  optimizer:
    type: AdamW
    lr: 5e-4
    weight_decay: 0.01
  scheduler:
    type: warmup_step
    warmup_ratio: 0.1
grad:
  grad_accum_count: 1
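The new YAML above (its path is not shown in this capture, though the workflow earlier references `baseline/config_init_kd_test.yaml`) freezes the whole backbone on the server (`server_freeze_param: ['model']`), uses zero-length prefixes, and runs a single round of 10 local steps with `only_use_hidden_loss: True`, which points to an initialization stage driven purely by a hidden-state loss. A rough sketch of what a hidden-state-only distillation loss could look like; the plain MSE formulation below is an assumption, not the repository's implementation:

```python
import torch
import torch.nn.functional as F


def hidden_state_kd_loss(student_hidden: torch.Tensor,
                         teacher_hidden: torch.Tensor) -> torch.Tensor:
    """Match student hidden states to (detached) teacher hidden states."""
    return F.mse_loss(student_hidden, teacher_hidden.detach())


# toy usage with shapes (batch, seq_len, hidden)
student = torch.randn(2, 8, 16, requires_grad=True)
teacher = torch.randn(2, 8, 16)
loss = hidden_state_kd_loss(student, teacher)
loss.backward()
```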

federatedscope/nlp/prompt_tuning/run_gpt_fedprompt.sh

Lines changed: 0 additions & 9 deletions
@@ -21,7 +21,6 @@ CLIENT_START_LAYER_ID=0
 NUM_CLIENT_LAYERS_PER_CELL=48
 LR=5e-2
 EPS=1e-4
-NON_IID_SPLIT=False
 MAKE_GLOBAL_TRAIN=True
 SHARE_CLIENT_LAYER_PARAM=False
 PL_INIT_KD=False
@@ -32,7 +31,6 @@ DEBUG=False
 CUDA_VISIBLE_DEVICES=$DEVICE python ../../main.py \
 --cfg $CFG \
 data.dataset_name arc_challenge \
-data.non_iid_split $NON_IID_SPLIT \
 data.batch_size $BATCH_SIZE \
 data.max_seq_len $MAX_SEQ_LEN \
 grad.grad_accum_count $GRAD_ACCUM \
@@ -61,7 +59,6 @@ CUDA_VISIBLE_DEVICES=$DEVICE python ../../main.py \
 CUDA_VISIBLE_DEVICES=$DEVICE python ../../main.py \
 --cfg $CFG \
 data.dataset_name arc_easy \
-data.non_iid_split $NON_IID_SPLIT \
 data.batch_size $BATCH_SIZE \
 data.max_seq_len $MAX_SEQ_LEN \
 grad.grad_accum_count $GRAD_ACCUM \
@@ -90,7 +87,6 @@ CUDA_VISIBLE_DEVICES=$DEVICE python ../../main.py \
 CUDA_VISIBLE_DEVICES=$DEVICE python ../../main.py \
 --cfg $CFG \
 data.dataset_name openbookqa \
-data.non_iid_split $NON_IID_SPLIT \
 data.batch_size $BATCH_SIZE \
 data.max_seq_len $MAX_SEQ_LEN \
 grad.grad_accum_count $GRAD_ACCUM \
@@ -119,7 +115,6 @@ CUDA_VISIBLE_DEVICES=$DEVICE python ../../main.py \
 CUDA_VISIBLE_DEVICES=$DEVICE python ../../main.py \
 --cfg $CFG \
 data.dataset_name web_questions \
-data.non_iid_split $NON_IID_SPLIT \
 data.batch_size $BATCH_SIZE \
 data.max_seq_len $MAX_SEQ_LEN \
 grad.grad_accum_count $GRAD_ACCUM \
@@ -148,7 +143,6 @@ CUDA_VISIBLE_DEVICES=$DEVICE python ../../main.py \
 CUDA_VISIBLE_DEVICES=$DEVICE python ../../main.py \
 --cfg $CFG \
 data.dataset_name hellaswag \
-data.non_iid_split $NON_IID_SPLIT \
 data.batch_size $BATCH_SIZE \
 data.max_seq_len $MAX_SEQ_LEN \
 grad.grad_accum_count $GRAD_ACCUM \
@@ -177,7 +171,6 @@ CUDA_VISIBLE_DEVICES=$DEVICE python ../../main.py \
 CUDA_VISIBLE_DEVICES=$DEVICE python ../../main.py \
 --cfg $CFG \
 data.dataset_name piqa \
-data.non_iid_split $NON_IID_SPLIT \
 data.batch_size $BATCH_SIZE \
 data.max_seq_len $MAX_SEQ_LEN \
 grad.grad_accum_count $GRAD_ACCUM \
@@ -206,7 +199,6 @@ CUDA_VISIBLE_DEVICES=$DEVICE python ../../main.py \
 CUDA_VISIBLE_DEVICES=$DEVICE python ../../main.py \
 --cfg $CFG \
 data.dataset_name sciq \
-data.non_iid_split $NON_IID_SPLIT \
 data.batch_size $BATCH_SIZE \
 data.max_seq_len $MAX_SEQ_LEN \
 grad.grad_accum_count $GRAD_ACCUM \
@@ -235,7 +227,6 @@ CUDA_VISIBLE_DEVICES=$DEVICE python ../../main.py \
 CUDA_VISIBLE_DEVICES=$DEVICE python ../../main.py \
 --cfg $CFG \
 data.dataset_name race \
-data.non_iid_split $NON_IID_SPLIT \
 data.batch_size $BATCH_SIZE \
 data.max_seq_len $MAX_SEQ_LEN \
 grad.grad_accum_count $GRAD_ACCUM \

federatedscope/nlp/prompt_tuning/run_gpt_fedprompt_lsr.sh

Lines changed: 1 addition & 10 deletions
@@ -16,12 +16,11 @@ SERVER_PREFIX_LEN=40
 CLIENT_PREFIX_LEN=40
 NUM_CLIENT=10
 NUM_SERVER_LAYERS=48
-NUM_CLIENT_LAYERS=48
+NUM_CLIENT_LAYERS=1
 CLIENT_START_LAYER_ID=0
 NUM_CLIENT_LAYERS_PER_CELL=1
 LR=5e-3
 EPS=1e-4
-NON_IID_SPLIT=False
 MAKE_GLOBAL_TRAIN=True
 SHARE_CLIENT_LAYER_PARAM=False
 PL_INIT_KD=False
@@ -32,7 +31,6 @@ DEBUG=False
 CUDA_VISIBLE_DEVICES=$DEVICE python ../../main.py \
 --cfg $CFG \
 data.dataset_name arc_challenge \
-data.non_iid_split $NON_IID_SPLIT \
 data.batch_size $BATCH_SIZE \
 data.max_seq_len $MAX_SEQ_LEN \
 grad.grad_accum_count $GRAD_ACCUM \
@@ -61,7 +59,6 @@ CUDA_VISIBLE_DEVICES=$DEVICE python ../../main.py \
 CUDA_VISIBLE_DEVICES=$DEVICE python ../../main.py \
 --cfg $CFG \
 data.dataset_name arc_easy \
-data.non_iid_split $NON_IID_SPLIT \
 data.batch_size $BATCH_SIZE \
 data.max_seq_len $MAX_SEQ_LEN \
 grad.grad_accum_count $GRAD_ACCUM \
@@ -90,7 +87,6 @@ CUDA_VISIBLE_DEVICES=$DEVICE python ../../main.py \
 CUDA_VISIBLE_DEVICES=$DEVICE python ../../main.py \
 --cfg $CFG \
 data.dataset_name openbookqa \
-data.non_iid_split $NON_IID_SPLIT \
 data.batch_size $BATCH_SIZE \
 data.max_seq_len $MAX_SEQ_LEN \
 grad.grad_accum_count $GRAD_ACCUM \
@@ -119,7 +115,6 @@ CUDA_VISIBLE_DEVICES=$DEVICE python ../../main.py \
 CUDA_VISIBLE_DEVICES=$DEVICE python ../../main.py \
 --cfg $CFG \
 data.dataset_name web_questions \
-data.non_iid_split $NON_IID_SPLIT \
 data.batch_size $BATCH_SIZE \
 data.max_seq_len $MAX_SEQ_LEN \
 grad.grad_accum_count $GRAD_ACCUM \
@@ -148,7 +143,6 @@ CUDA_VISIBLE_DEVICES=$DEVICE python ../../main.py \
 CUDA_VISIBLE_DEVICES=$DEVICE python ../../main.py \
 --cfg $CFG \
 data.dataset_name hellaswag \
-data.non_iid_split $NON_IID_SPLIT \
 data.batch_size $BATCH_SIZE \
 data.max_seq_len $MAX_SEQ_LEN \
 grad.grad_accum_count $GRAD_ACCUM \
@@ -177,7 +171,6 @@ CUDA_VISIBLE_DEVICES=$DEVICE python ../../main.py \
 CUDA_VISIBLE_DEVICES=$DEVICE python ../../main.py \
 --cfg $CFG \
 data.dataset_name piqa \
-data.non_iid_split $NON_IID_SPLIT \
 data.batch_size $BATCH_SIZE \
 data.max_seq_len $MAX_SEQ_LEN \
 grad.grad_accum_count $GRAD_ACCUM \
@@ -206,7 +199,6 @@ CUDA_VISIBLE_DEVICES=$DEVICE python ../../main.py \
 CUDA_VISIBLE_DEVICES=$DEVICE python ../../main.py \
 --cfg $CFG \
 data.dataset_name sciq \
-data.non_iid_split $NON_IID_SPLIT \
 data.batch_size $BATCH_SIZE \
 data.max_seq_len $MAX_SEQ_LEN \
 grad.grad_accum_count $GRAD_ACCUM \
@@ -235,7 +227,6 @@ CUDA_VISIBLE_DEVICES=$DEVICE python ../../main.py \
 CUDA_VISIBLE_DEVICES=$DEVICE python ../../main.py \
 --cfg $CFG \
 data.dataset_name race \
-data.non_iid_split $NON_IID_SPLIT \
 data.batch_size $BATCH_SIZE \
 data.max_seq_len $MAX_SEQ_LEN \
 grad.grad_accum_count $GRAD_ACCUM \

federatedscope/nlp/prompt_tuning/run_gpt_ours.sh

Lines changed: 0 additions & 9 deletions
@@ -21,7 +21,6 @@ CLIENT_START_LAYER_ID=0
 NUM_CLIENT_LAYERS_PER_CELL=1
 LR=5e-4
 EPS=1e-4
-NON_IID_SPLIT=False
 MAKE_GLOBAL_TRAIN=True
 SHARE_CLIENT_LAYER_PARAM=True
 PL_INIT_KD=True
@@ -36,7 +35,6 @@ USE_PREFIX_PRJ=False
 CUDA_VISIBLE_DEVICES=$DEVICE python ../../main.py \
 --cfg $CFG \
 data.dataset_name arc_challenge \
-data.non_iid_split $NON_IID_SPLIT \
 data.batch_size $BATCH_SIZE \
 data.max_seq_len $MAX_SEQ_LEN \
 grad.grad_accum_count $GRAD_ACCUM \
@@ -69,7 +67,6 @@ USE_PREFIX_PRJ=False
 CUDA_VISIBLE_DEVICES=$DEVICE python ../../main.py \
 --cfg $CFG \
 data.dataset_name arc_easy \
-data.non_iid_split $NON_IID_SPLIT \
 data.batch_size $BATCH_SIZE \
 data.max_seq_len $MAX_SEQ_LEN \
 grad.grad_accum_count $GRAD_ACCUM \
@@ -102,7 +99,6 @@ USE_PREFIX_PRJ=False
 CUDA_VISIBLE_DEVICES=$DEVICE python ../../main.py \
 --cfg $CFG \
 data.dataset_name openbookqa \
-data.non_iid_split $NON_IID_SPLIT \
 data.batch_size $BATCH_SIZE \
 data.max_seq_len $MAX_SEQ_LEN \
 grad.grad_accum_count $GRAD_ACCUM \
@@ -135,7 +131,6 @@ USE_PREFIX_PRJ=False
 CUDA_VISIBLE_DEVICES=$DEVICE python ../../main.py \
 --cfg $CFG \
 data.dataset_name web_questions \
-data.non_iid_split $NON_IID_SPLIT \
 data.batch_size $BATCH_SIZE \
 data.max_seq_len $MAX_SEQ_LEN \
 grad.grad_accum_count $GRAD_ACCUM \
@@ -168,7 +163,6 @@ USE_PREFIX_PRJ=True
 CUDA_VISIBLE_DEVICES=$DEVICE python ../../main.py \
 --cfg $CFG \
 data.dataset_name hellaswag \
-data.non_iid_split $NON_IID_SPLIT \
 data.batch_size $BATCH_SIZE \
 data.max_seq_len $MAX_SEQ_LEN \
 grad.grad_accum_count $GRAD_ACCUM \
@@ -201,7 +195,6 @@ USE_PREFIX_PRJ=False
 CUDA_VISIBLE_DEVICES=$DEVICE python ../../main.py \
 --cfg $CFG \
 data.dataset_name piqa \
-data.non_iid_split $NON_IID_SPLIT \
 data.batch_size $BATCH_SIZE \
 data.max_seq_len $MAX_SEQ_LEN \
 grad.grad_accum_count $GRAD_ACCUM \
@@ -234,7 +227,6 @@ USE_PREFIX_PRJ=True
 CUDA_VISIBLE_DEVICES=$DEVICE python ../../main.py \
 --cfg $CFG \
 data.dataset_name sciq \
-data.non_iid_split $NON_IID_SPLIT \
 data.batch_size $BATCH_SIZE \
 data.max_seq_len $MAX_SEQ_LEN \
 grad.grad_accum_count $GRAD_ACCUM \
@@ -267,7 +259,6 @@ USE_PREFIX_PRJ=True
 CUDA_VISIBLE_DEVICES=$DEVICE python ../../main.py \
 --cfg $CFG \
 data.dataset_name race \
-data.non_iid_split $NON_IID_SPLIT \
 data.batch_size $BATCH_SIZE \
 data.max_seq_len $MAX_SEQ_LEN \
 grad.grad_accum_count $GRAD_ACCUM \
