
finetune.py optimization.update_freq #35

Open
TaridaGeorge opened this issue Mar 3, 2021 · 3 comments

Comments

@TaridaGeorge

I was wondering why in the finetune.py file you've set update_freq to be 24/NUM_GPU.

    cmd.append("+optimization.update_freq='[" + str(int(24/NUM_GPU)) + "]'")

In the wav2vec README https://github.com/pytorch/fairseq/blob/master/examples/wav2vec/README.md they say the base model was trained on 64 V100 GPUs, and as I understand it, if we want to do more training on the base model we should simulate the number of GPUs they used:

Note: you can simulate 64 GPUs by using k GPUs and adding command line parameters (before --config-dir) distributed_training.distributed_world_size=k +optimization.update_freq='[x]' where x = 64/k
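For concreteness, here is a minimal sketch of that rule written the same way as the finetune.py snippet above; the NUM_GPU value of 8 is just a hypothetical example:

    # Sketch of the README's rule: simulate 64 GPUs with k physical GPUs by
    # accumulating gradients over update_freq = 64 / k batches per optimizer step.
    SIMULATED_GPUS = 64                 # GPUs used for the original pre-training run
    NUM_GPU = 8                         # GPUs actually available (hypothetical value)

    update_freq = SIMULATED_GPUS // NUM_GPU   # -> 8

    cmd = []
    cmd.append("distributed_training.distributed_world_size=" + str(NUM_GPU))
    cmd.append("+optimization.update_freq='[" + str(update_freq) + "]'")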

Have you found that setting update_freq to 24/NUM_GPU works better for training, or is it a bug?

@mailong25 (Owner)

optimization.update_freq='[x]' where x = 64/k belongs to the pre-training step.

@TaridaGeorge (Author)

And 24 belongs to fine-tuning? Is it 24 or 8? I saw that for the base model they used 8 GPUs and for the large model 24.

@mailong25 (Owner)

Yup! The number should follow the wav2vec repo instructions.
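
Putting the thread together, a minimal sketch of how the fine-tuning override could be derived from the wav2vec README's setups mentioned above (8 GPUs for the base model, 24 for the large model); the MODEL_SIZE variable is hypothetical and only for illustration:

    # Hypothetical sketch: pick the simulated GPU count from the wav2vec README's
    # fine-tuning setups (base: 8 GPUs, large: 24 GPUs) and derive update_freq.
    MODEL_SIZE = "base"                 # hypothetical flag: "base" or "large"
    NUM_GPU = 2                         # GPUs actually available (hypothetical value)

    simulated_gpus = 8 if MODEL_SIZE == "base" else 24
    update_freq = max(1, simulated_gpus // NUM_GPU)

    cmd = []
    cmd.append("+optimization.update_freq='[" + str(update_freq) + "]'")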
