PaddleOCR Multi-Machine Distributed Training #70357

Open
ntdgo opened this issue Dec 20, 2024 · 0 comments
ntdgo commented Dec 20, 2024

Please ask your question

Hello,

I am currently working on training an OCR classification model in parallel across two machines and would appreciate some guidance on my setup. Below are the details of my configuration:

I have two computers:

Machine 1:

  • Windows 11
  • GPU: RTX4090
  • Public IP: 212.109.144.125
  • Port open: 6004

Machine 2:

  • Windows 11
  • GPU: RTX3090
  • Public IP: 122.109.144.229

I installed PaddlePaddle using the following command on both machines:
python -m pip install paddlepaddle-gpu==3.0.0b2 -i https://www.paddlepaddle.org.cn/packages/stable/cu118/

The training commands I used, based on the paddle.distributed.launch documentation (https://www.paddlepaddle.org.cn/documentation/docs/en/api/paddle/distributed/launch_en.html), were:

On Machine 1:

python -m paddle.distributed.launch --gpus 0 --master=192.168.1.123:6004 ./PaddleOCR/tools/train.py -c ./configs/cls.yml

On Machine 2:

python -m paddle.distributed.launch --gpus 0 --master=212.109.144.125:6004 ./PaddleOCR/tools/train.py -c ./configs/cls.yml

However, when I start training, the two machines do not appear to establish a connection or train together. I suspect there is a problem with either my network setup or the training commands.
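
To separate a network problem from a launch-configuration problem, one option is to test plain TCP reachability of the master endpoint from Machine 2 while the launcher is running on Machine 1. The snippet below is a minimal sketch using only the Python standard library. `can_reach` is a hypothetical helper written for this post, not part of PaddlePaddle, and a successful connection only shows that the port is reachable, not that the distributed job is configured correctly.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Example call from Machine 2, using the address passed to --master above:
#   can_reach("212.109.144.125", 6004)
```

If this returns False while the launcher on Machine 1 is running, the likely causes are on the network side (firewall, router port forwarding, or the process listening on a different interface) rather than in the training script.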

Could anyone help me identify what might be wrong or suggest how to fix this?

Thank you in advance for your assistance!
