[QUESTION] How to Use ndtimeline in a Multi-Machine Multi-GPU Environment #55
Comments
Thank you for your interest in veScale!
“MQHandler” stands for “message queue handler”. We typically use a message queue (MQ) to send metric data.
Thanks! But I still don't know how to write the MQHandler code. Do I need to create a separate script as a producer to receive messages from consumers? That is, does each rank send its own record information to the consumer, while the corresponding producer receives the rank-record information from the different nodes?
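For readers with the same question, here is a minimal sketch of the general message-queue pattern, not of ndtimeline's actual handler interface. Everything in it is an assumption made for illustration: `ToyMQHandler`, `collect`, `run_demo`, the record fields, and the use of an in-process `queue.Queue` as a stand-in for a real broker. In this sketch each rank acts as a producer that pushes its own records, and a single collector acts as the consumer that groups them by rank.

```python
import json
import queue
import threading
from collections import defaultdict


class ToyMQHandler:
    """Hypothetical handler (not the ndtimeline API): sends one rank's records to a queue."""

    def __init__(self, rank, mq):
        self.rank = rank
        self.mq = mq

    def __call__(self, records):
        # Tag the batch with the sending rank so the collector can group it later.
        self.mq.put(json.dumps({"rank": self.rank, "records": records}))


def collect(mq, expected_batches):
    """Consumer side: receive a known number of batches and group records by rank."""
    by_rank = defaultdict(list)
    for _ in range(expected_batches):
        message = json.loads(mq.get())
        by_rank[message["rank"]].extend(message["records"])
    return dict(by_rank)


def run_demo(world_size=4):
    """Simulate `world_size` ranks with threads; each one sends a single batch."""
    mq = queue.Queue()  # stand-in for a networked message broker

    def worker(rank):
        handler = ToyMQHandler(rank, mq)
        # Made-up record layout, for illustration only.
        handler([{"metric": "forward-compute", "duration_ms": 1.0 + rank}])

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(world_size)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return collect(mq, expected_batches=world_size)


if __name__ == "__main__":
    for rank, records in sorted(run_demo().items()):
        print(rank, records)
```

On multiple machines, the in-process queue would presumably be replaced by a broker reachable from every node, with the collector running as a separate process; how that maps onto ndtimeline's real handler hooks is something the maintainers would need to confirm.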
Does ndtimeline support multi-machine, multi-GPU use?
I can currently use the ndtimeline tool on a single machine with multiple GPUs to analyze GPT,
but I wonder whether it also supports multiple machines. How do I flush ndtimeline? How should communication be handled? And how do I define a custom time event?
Thanks!
The following picture shows a run on a single machine with four GPUs:
@MingjiHan99 @pengyanghua @MackZackA @JsBlueCat @Meteorix