DDP with in-CPU-memory dataset #19543
Unanswered
DucoG asked this question in DDP / multi-GPU / multi-node
Replies: 0 comments
Hi! I'm using Lightning to train a model on multiple GPUs with a large dataset. Loading the data from disk is very slow, and the dataset is too big for each GPU process to hold a full copy in CPU memory, so I've opted for the following approach, but I'm not sure whether it will work.
1. In `DataModule.prepare_data`, I divide the dataset's shards over the number of GPUs, making sure that every GPU gets the same number of rows. The assignment is saved to a configuration file on disk.
2. In `DataModule.setup`, each rank loads its own shards according to that configuration file.
3. I set `use_distributed_sampler=False` on the Trainer and `shuffle=True` on the DataLoader (see the sketch below).
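To make the question concrete, here is a minimal sketch of what I mean. It assumes equal-sized, torch-saved tensor shards on a filesystem visible to all ranks; `ShardedDataModule`, `shard_paths`, and the JSON assignment file are just illustrative names, not anything from Lightning itself:

```python
import glob
import json

import torch
from torch.utils.data import DataLoader, TensorDataset
import lightning as L


class ShardedDataModule(L.LightningDataModule):
    """Pre-assigns shards to ranks so each process only loads its own slice into CPU memory."""

    def __init__(self, shard_paths, config_path="shard_assignment.json", batch_size=32):
        super().__init__()
        self.shard_paths = shard_paths
        self.config_path = config_path
        self.batch_size = batch_size
        self.dataset = None

    def prepare_data(self):
        # Runs on one process only: assign shards round-robin over the world size
        # and persist the plan so every rank later reads the same assignment.
        # (Assumes equal-sized shards, so each rank ends up with the same row count.)
        world_size = self.trainer.world_size if self.trainer else 1
        assignment = {str(rank): self.shard_paths[rank::world_size] for rank in range(world_size)}
        with open(self.config_path, "w") as f:
            json.dump(assignment, f)

    def setup(self, stage=None):
        # Runs on every rank: load only the shards assigned to this rank.
        with open(self.config_path) as f:
            assignment = json.load(f)
        my_shards = assignment[str(self.trainer.global_rank)]
        tensors = [torch.load(p) for p in my_shards]  # placeholder loader for my real format
        self.dataset = TensorDataset(torch.cat(tensors))

    def train_dataloader(self):
        # shuffle=True only shuffles within this rank's shards; Lightning must not
        # wrap this in a DistributedSampler, hence use_distributed_sampler=False.
        return DataLoader(self.dataset, batch_size=self.batch_size, shuffle=True)


# Hypothetical usage (model would be my LightningModule):
trainer = L.Trainer(
    accelerator="gpu",
    devices=4,
    strategy="ddp",
    use_distributed_sampler=False,  # keep Lightning from injecting a DistributedSampler
)
trainer.fit(model, datamodule=ShardedDataModule(sorted(glob.glob("shards/*.pt"))))
```

One consequence of this design is that each rank only ever shuffles within its own fixed subset of the data, rather than seeing a global shuffle across epochs, and that's the part I'm least sure about.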
Will this approach work? I was also wondering whether there are better approaches, as I haven't seen any examples like this online and it feels like I'm trying to reinvent the wheel but failing 😦