Pipeline Parallelism (Supported? How to?) #827
Comments
We can look into this in more detail. Meanwhile, have you tried using mosaicml/composer for training? Are there specific features you are relying on in TorchTitan?
I would really appreciate it if you could look into it! TorchTitan uses There are many key features, like FSDP2, 4D parallelism, FP8, and torch.compile, that make LLaMa models scale well in pretraining. You also get full control over the training loop, which is desirable if you want to experiment.
@casper-hansen So StreamingDataset's We currently enable replication through the
Would be great to integrate the new DeviceMesh abstraction from PyTorch.
🚀 Feature Request
Supporting TP and SP seems quite easy to do with the `replication` parameter:
I have tried various ways to enable PP without success (unexpectedly high loss). I tried adding `pp` into the equation when computing `replication` and `num_canonical_nodes`, but I cannot get it to work correctly: the loss remains unexpectedly high.
Motivation
I want to use the mosaicml streaming library with 4D parallelism. Specifically, I rely on TorchTitan as my training tool and have simply swapped in the mosaicml streaming library by modifying the StreamingTextDataset implementation from LLM Foundry.
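To make the rank arithmetic behind the `replication` question concrete, here is a minimal sketch in plain Python (it does not import streaming or TorchTitan). It rests on two assumptions that are not confirmed in this thread: that `replication=r` means each run of `r` consecutive global ranks receives the same samples, and that the device mesh is laid out row-major as `(pp, dp, tp)` with `pp` outermost. The helper names `mesh_coords`, `replicated_group`, and `groups_match_dp` are hypothetical and exist only for this illustration.

```python
def mesh_coords(rank, pp, dp, tp):
    """Return (pp_idx, dp_idx, tp_idx) for an ASSUMED row-major (pp, dp, tp) layout."""
    pp_idx, rest = divmod(rank, dp * tp)
    dp_idx, tp_idx = divmod(rest, tp)
    return pp_idx, dp_idx, tp_idx


def replicated_group(rank, replication):
    """Sample group of a rank, ASSUMING `replication` consecutive ranks share samples."""
    return rank // replication


def groups_match_dp(pp, dp, tp, replication):
    """Check whether consecutive-rank grouping coincides with the dp index.

    Returns True only if there are exactly `dp` groups and every group
    contains ranks from a single dp index.
    """
    dp_indices_by_group = {}
    for rank in range(pp * dp * tp):
        group = replicated_group(rank, replication)
        dp_idx = mesh_coords(rank, pp, dp, tp)[1]
        dp_indices_by_group.setdefault(group, set()).add(dp_idx)
    one_dp_per_group = all(len(s) == 1 for s in dp_indices_by_group.values())
    return len(dp_indices_by_group) == dp and one_dp_per_group


if __name__ == "__main__":
    # TP only: consecutive grouping lines up with the dp index.
    print(groups_match_dp(pp=1, dp=2, tp=2, replication=2))
    # With PP outermost, neither tp nor tp * pp lines up under these assumptions.
    print(groups_match_dp(pp=2, dp=2, tp=2, replication=2))
    print(groups_match_dp(pp=2, dp=2, tp=2, replication=4))
```

Under these assumptions, ranks that share a dp index but sit on different pipeline stages are not adjacent in global rank order, so no single consecutive-rank replication factor groups them together. Whether this is what causes the high loss reported above is not established here; it is only one thing worth checking against the actual mesh layout.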