Hello,
I am training the distilled BlenderBot model (blender_1Bdistill) and I want to enable person tokens for it. When I add --person-tokens True to the training parameters, training fails with this error:
size mismatch for decoder.embeddings.weight: copying a param with shape torch.Size([8008, 2560]) from checkpoint, the shape in current model is torch.Size([8012, 2560]).
It looks like the default dictionary for the model has a different size. I'm using "--dict-file zoo:blender/blender_1Bdistill/model.dict"...
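From the error, the checkpoint's embedding matrix has 8008 rows while the model built with --person-tokens expects 8012 (the person/special tokens grow the dictionary by 4 entries). As a rough sketch of one possible workaround, the checkpoint's embedding matrices could be padded to the new vocabulary size before fine-tuning. The key names, the nesting under "model", and the mean-initialization below are assumptions based on the error message, not a confirmed ParlAI procedure:

import torch

ckpt_path = "model"   # local copy of zoo:blender/blender_1Bdistill/model (assumed path)
extra_tokens = 4      # 8012 - 8008 from the size-mismatch error

state = torch.load(ckpt_path, map_location="cpu")
# ParlAI checkpoints typically nest the weights under a "model" key (assumption)
model_state = state.get("model", state)

for key, weight in model_state.items():
    if key.endswith("embeddings.weight") and weight.shape[0] == 8008:
        # Initialize the new rows from the mean of the existing embeddings
        pad = weight.mean(dim=0, keepdim=True).repeat(extra_tokens, 1)
        model_state[key] = torch.cat([weight, pad], dim=0)

torch.save(state, ckpt_path + ".resized")

The resized file would then be passed as --init-model instead of the original checkpoint, assuming no other weights are tied to the vocabulary size.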
Note that the command below works without the "--person-tokens" parameter.
The full training command is:
parlai train_model -t blended_skill_talk,wizard_of_wikipedia,convai2:normalized,empathetic_dialogues --multitask-weights 1,3,3,3 -veps 0.25 --attention-dropout 0.0 --batchsize 64 --model transformer/generator --embedding-size 2560 --ffn-size 10240 --variant prelayernorm --n-heads 32 --n-positions 128 --n-decoder-layers 12 --history-add-global-end-token end --delimiter ' ' --dict-tokenizer bytelevelbpe --dropout 0.1 --fp16 True --init-model zoo:blender/blender_1Bdistill/model --dict-file zoo:blender/blender_1Bdistill/model.dict --label-truncate 128 --log_every_n_secs 10 -lr 7e-06 --lr-scheduler reduceonplateau --lr-scheduler-patience 3 --optimizer adam --relu-dropout 0.0 --activation gelu --model-parallel true --save-after-valid True --text-truncate 128 --truncate 128 --warmup_updates 100 --fp16-impl mem_efficient --update-freq 2 --gradient-clip 0.1 --skip-generation True -vp 10 -vmt ppl -vmm min --model-file ~/work/nlp/chatbot/tmp/ptoken/test_train_14B --batchsize 32 --embedding-size 2560 --ffn-size 10240 --n-encoder-layers 2 --model-parallel True --dynamic-batching full --beam-size 10 --beam-min-length 20 --beam-context-block-ngram 3 --beam-block-ngram 3 --inference beam --optimizer mem_eff_adam --learningrate 5e-05 --bpe-vocab /checkpoint/parlai/zoo/meena/20200319_meenav0data_tall_2.7B_adamoptimizer/20200319_13.3ppl_200kupdates/model.dict-vocab.json --bpe-merge /checkpoint/parlai/zoo/meena/20200319_meenav0data_tall_2.7B_adamoptimizer/20200319_13.3ppl_200kupdates/model.dict-merges.txt --bpe-add-prefix-space True --person-tokens True