Hello,
I am training the distilled BlenderBot model (blender_1Bdistill) and I want to enable person tokens for it. When I add --person-tokens True to the training parameters, training fails with this error:
size mismatch for decoder.embeddings.weight: copying a param with shape torch.Size([8008, 2560]) from checkpoint, the shape in current model is torch.Size([8012, 2560]).
It looks like the default dictionary for the model has a different size. I'm using "--dict-file zoo:blender/blender_1Bdistill/model.dict"...
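From the error, the checkpoint's embedding matrix has 8008 rows while the model built with --person-tokens expects 8012 (the person/special tokens grow the dictionary by 4 entries). As a rough sketch of one possible workaround, the checkpoint's embedding matrices could be padded to the new vocabulary size before fine-tuning. The key names, the nesting under "model", and the mean-initialization below are assumptions based on the error message, not a confirmed ParlAI procedure:

import torch

ckpt_path = "model"   # local copy of zoo:blender/blender_1Bdistill/model (assumed path)
extra_tokens = 4      # 8012 - 8008 from the size-mismatch error

state = torch.load(ckpt_path, map_location="cpu")
# ParlAI checkpoints typically nest the weights under a "model" key (assumption)
model_state = state.get("model", state)

for key, weight in model_state.items():
    if key.endswith("embeddings.weight") and weight.shape[0] == 8008:
        # Initialize the new rows from the mean of the existing embeddings
        pad = weight.mean(dim=0, keepdim=True).repeat(extra_tokens, 1)
        model_state[key] = torch.cat([weight, pad], dim=0)

torch.save(state, ckpt_path + ".resized")

The resized file would then be passed as --init-model instead of the original checkpoint, assuming no other weights are tied to the vocabulary size.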
Note that the command below works without the "--person-tokens" parameter.
The full training command is:
parlai train_model -t blended_skill_talk,wizard_of_wikipedia,convai2:normalized,empathetic_dialogues --multitask-weights 1,3,3,3 -veps 0.25 --attention-dropout 0.0 --batchsize 64 --model transformer/generator --embedding-size 2560 --ffn-size 10240 --variant prelayernorm --n-heads 32 --n-positions 128 --n-decoder-layers 12 --history-add-global-end-token end --delimiter ' ' --dict-tokenizer bytelevelbpe --dropout 0.1 --fp16 True --init-model zoo:blender/blender_1Bdistill/model --dict-file zoo:blender/blender_1Bdistill/model.dict --label-truncate 128 --log_every_n_secs 10 -lr 7e-06 --lr-scheduler reduceonplateau --lr-scheduler-patience 3 --optimizer adam --relu-dropout 0.0 --activation gelu --model-parallel true --save-after-valid True --text-truncate 128 --truncate 128 --warmup_updates 100 --fp16-impl mem_efficient --update-freq 2 --gradient-clip 0.1 --skip-generation True -vp 10 -vmt ppl -vmm min --model-file ~/work/nlp/chatbot/tmp/ptoken/test_train_14B --batchsize 32 --embedding-size 2560 --ffn-size 10240 --n-encoder-layers 2 --model-parallel True --dynamic-batching full --beam-size 10 --beam-min-length 20 --beam-context-block-ngram 3 --beam-block-ngram 3 --inference beam --optimizer mem_eff_adam --learningrate 5e-05 --bpe-vocab /checkpoint/parlai/zoo/meena/20200319_meenav0data_tall_2.7B_adamoptimizer/20200319_13.3ppl_200kupdates/model.dict-vocab.json --bpe-merge /checkpoint/parlai/zoo/meena/20200319_meenav0data_tall_2.7B_adamoptimizer/20200319_13.3ppl_200kupdates/model.dict-merges.txt --bpe-add-prefix-space True --person-tokens True