IO Bottleneck while loading data #402

Open
chnk58hoang opened this issue Jan 17, 2025 · 1 comment

Comments

chnk58hoang commented Jan 17, 2025

I'm trying to train DINO SSL on my own dataset (1.2M samples), but training is very slow even though the dataset is stored as shard files. This is my dataset configuration:
[Image: dataset configuration screenshot]
Are there any ideas for speeding up my training process? I'm using only one GPU, an NVIDIA 3090 with 24 GB.
Thank you very much!

@JiJiJiang
Collaborator

Resampling to 8 kHz can be CPU-intensive. Try increasing num_workers (e.g., 8 or 16).
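As a rough sketch of how one might pick values to try: for CPU-bound preprocessing such as resampling, worker counts above the number of available cores usually bring little benefit, so it can help to cap the suggested 8 or 16 at what the machine offers. The helper name `candidate_worker_counts` and the `reserve` parameter are hypothetical and not part of this project; only `num_workers` comes from the thread.

```python
import os


def candidate_worker_counts(requested=(8, 16), cpu_count=None, reserve=1):
    """Cap requested worker counts at the number of usable CPU cores.

    `reserve` cores are left free for the main training process
    (an assumption for illustration, not a project setting).
    """
    if cpu_count is None:
        # os.cpu_count() may return None on some platforms
        cpu_count = os.cpu_count() or 1
    usable = max(1, cpu_count - reserve)
    # Drop non-positive requests, cap the rest, and remove duplicates
    return sorted({min(n, usable) for n in requested if n > 0})


if __name__ == "__main__":
    # Values to try for num_workers on this machine
    print(candidate_worker_counts())
```

Each returned value can then be set as num_workers in the dataset configuration and compared by measuring training throughput; if the underlying loader is a PyTorch `DataLoader`, this corresponds to its `num_workers` argument.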
