
How to use distributed-training in single client for llm #4925

Open
hejxiang opened this issue Feb 11, 2025 · 0 comments
What is your question?

I have successfully completed federated training of an LLM by following the tutorial below.
https://flower.ai/docs/framework/tutorial-quickstart-huggingface.html

Now I want to train a larger model. How can I fine-tune an LLM using distributed training on a single client?
I would like to use multiple GPUs, or multiple machines, for one client. Can I use something like `accelerate launch`, `torchrun`, or `deepspeed` inside a client?

https://huggingface.co/docs/trl/example_overview#distributed-training
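
For reference, here is a rough sketch of what I am imagining, based on the `NumPyClient` API from the quickstart. The script name `train.py`, the weight file paths, and the `num-examples` config key are all placeholders I made up, not anything from the tutorial. The idea is that the Flower client stays a single process, and `fit()` shells out to `accelerate launch` so the actual fine-tuning step can use every local GPU, with weights exchanged through files on disk:

```python
import subprocess

import torch
import flwr as fl


class DistributedTrainingClient(fl.client.NumPyClient):
    def __init__(self, weights_in="round_in.pt", weights_out="round_out.pt"):
        self.weights_in = weights_in    # global weights received from the server
        self.weights_out = weights_out  # locally fine-tuned weights to send back

    def fit(self, parameters, config):
        # Save the global parameters so every spawned worker can load them.
        torch.save([torch.tensor(p) for p in parameters], self.weights_in)

        # Launch multi-GPU fine-tuning as its own process group.
        # `train.py` is a hypothetical script that loads `weights_in`,
        # runs one round of local fine-tuning, and writes `weights_out`.
        subprocess.run(
            ["accelerate", "launch", "train.py",
             "--weights-in", self.weights_in,
             "--weights-out", self.weights_out],
            check=True,
        )

        # Read the updated weights back and return them to the server.
        new_params = [t.cpu().numpy() for t in torch.load(self.weights_out)]
        num_examples = int(config.get("num-examples", 1))  # placeholder count
        return new_params, num_examples, {}
```

Is this kind of subprocess-based approach the recommended way, or is there built-in support in Flower for multi-GPU / multi-node training within a single client?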

Thanks!
