
[Question] How do I calculate max_tokens max value? #34

Open
TaridaGeorge opened this issue Mar 3, 2021 · 3 comments

Comments

@TaridaGeorge

Given that I'm training on 5 GeForce GTX 1080 Ti GPUs with 10.92 GB of memory each, how can I calculate max_tokens so that no out-of-memory error occurs?

@TaridaGeorge TaridaGeorge changed the title [Question] How do I calculate max_tokens? [Question] How do I calculate max_tokens max value? Mar 3, 2021
@mailong25 (Owner)

batch_duration (s) = max_tokens / 16000
For example, if max_tokens is set to 160,000, the total audio duration of a batch is limited to 10 seconds.
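To make the conversion above concrete, here is a minimal sketch assuming 16 kHz audio where one token corresponds to one raw waveform sample (as in fairseq's wav2vec-style audio tasks). The function names are illustrative, not part of the fairseq API:

```python
SAMPLE_RATE = 16_000  # samples (tokens) per second of audio

def batch_duration_seconds(max_tokens: int, sample_rate: int = SAMPLE_RATE) -> float:
    """Total audio duration (seconds) a batch may hold for a given max_tokens."""
    return max_tokens / sample_rate

def max_tokens_for_duration(seconds: float, sample_rate: int = SAMPLE_RATE) -> int:
    """max_tokens needed to fit `seconds` of audio in one batch."""
    return int(seconds * sample_rate)

print(batch_duration_seconds(160_000))  # 10.0, matching the example above
print(max_tokens_for_duration(10.0))    # 160000
```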

@TaridaGeorge (Author)

So my question is: how many seconds of audio can I fit inside a batch? If I set max_tokens to a large number (1,200,000), I get an error:

2021-03-03 11:40:50 | WARNING | fairseq.trainer | OOM: Ran out of memory with exception: CUDA out of memory. Tried to allocate 210.00 MiB (GPU 4; 10.92 GiB total capacity; 9.16 GiB already allocated; 147.50 MiB free; 10.18 GiB reserved in total by PyTorch)
Exception raised from malloc at ../c10/cuda/CUDACachingAllocator.cpp:272 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x4d (0x7f9c2042fbdd in /space/homes/user/workspace/lib/python3.6/site-packages/torch/lib/libc10.so)

Does the number of GPUs factor into this?

@mailong25 (Owner)

mailong25 commented Mar 3, 2021

how many seconds can I have inside a batch? --> I can't give you an exact number, but it should be as high as possible given your GPU memory. Try different settings and see how it goes.
Lower max_tokens if you encounter a memory error, but it must not be lower than the number of tokens in the longest audio clip.
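A quick sanity check for that lower bound: since one token is one 16 kHz sample here, max_tokens must at least cover the sample count of the longest clip. A minimal sketch using Python's standard `wave` module (the directory layout and helper names are assumptions for illustration):

```python
import wave
from pathlib import Path

def num_samples(wav_path: str) -> int:
    """Length of a WAV file in frames (== tokens at the file's sample rate)."""
    with wave.open(wav_path, "rb") as f:
        return f.getnframes()

def fits_max_tokens(wav_dir: str, max_tokens: int) -> bool:
    """True if every clip in wav_dir fits within the max_tokens budget."""
    longest = max(num_samples(str(p)) for p in Path(wav_dir).glob("*.wav"))
    return max_tokens >= longest
```

If `fits_max_tokens` returns False, either raise max_tokens or exclude/trim the offending long clips before training.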
