Is there any monitoring tool for cluster system resources? #7184
Replies: 6 comments
-
@ piiswrong |
-
I think |
-
@formath |
-
@salemmohammed YARN works fine for ML, for example TensorFlow on YARN or MXNet on YARN. I guess you mean that MapReduce (MR) or Spark on YARN is not suitable for ML. The only way MXNet on YARN does not serve your purpose is that the YARN part of |
-
@apache/mxnet-committers: This issue has been inactive for the past 90 days. It has no label and needs triage. For general "how-to" questions, our user forum (and Chinese version) is a good place to get help.
-
Proposed labels: "Question", "Discussion", "Feature request"
-
Hi all,
I am running MXNet on a cluster of machines. Is there any code or tool in MXNet that reports how resources are used? For example, how much CPU, memory, and network bandwidth are being used while I am training a model.
I know there are many external cluster monitoring tools such as Ganglia, but I am wondering if there is something built in.
Sincerely
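For a quick per-node view without an external tool, one option is to sample the operating system's own counters next to the training job. The sketch below is not an MXNet feature. It assumes a Linux worker and reads `/proc/stat`, `/proc/meminfo`, and `/proc/net/dev`. The function names (`parse_cpu_times`, `parse_mem_used_kb`, `parse_net_bytes`, `sample`) are hypothetical helpers made up for this illustration, and the numbers cover the whole machine, not only the MXNet process.

```python
import time


def parse_cpu_times(stat_text):
    """Return (busy, total) jiffies from the aggregate 'cpu' line of /proc/stat."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            values = [int(v) for v in fields[1:]]
            # Column order: user nice system idle iowait irq softirq steal ...
            idle = values[3] + (values[4] if len(values) > 4 else 0)
            # Guest time is already counted in user/nice, so sum only the first 8.
            total = sum(values[:8])
            return total - idle, total
    raise ValueError("no aggregate cpu line found")


def parse_mem_used_kb(meminfo_text):
    """Return MemTotal - MemAvailable in kB from /proc/meminfo text."""
    info = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            info[key.strip()] = int(parts[0])
    return info["MemTotal"] - info["MemAvailable"]


def parse_net_bytes(netdev_text):
    """Return (rx_bytes, tx_bytes) summed over non-loopback interfaces."""
    rx = tx = 0
    for line in netdev_text.splitlines()[2:]:  # skip the two header lines
        name, _, rest = line.partition(":")
        fields = rest.split()
        if name.strip() == "lo" or len(fields) < 16:
            continue
        rx += int(fields[0])  # first receive column: bytes
        tx += int(fields[8])  # first transmit column: bytes
    return rx, tx


def _read(path):
    with open(path) as f:
        return f.read()


def sample(interval=1.0):
    """Sample this node twice, `interval` seconds apart, and return usage figures."""
    busy0, total0 = parse_cpu_times(_read("/proc/stat"))
    rx0, tx0 = parse_net_bytes(_read("/proc/net/dev"))
    time.sleep(interval)
    busy1, total1 = parse_cpu_times(_read("/proc/stat"))
    rx1, tx1 = parse_net_bytes(_read("/proc/net/dev"))
    return {
        "cpu_percent": 100.0 * (busy1 - busy0) / max(total1 - total0, 1),
        "mem_used_kb": parse_mem_used_kb(_read("/proc/meminfo")),
        "rx_bytes_per_s": (rx1 - rx0) / interval,
        "tx_bytes_per_s": (tx1 - tx0) / interval,
    }
```

Calling `sample()` from a background thread on each worker, or from a small script started next to the training process, and logging the returned dictionary gives a rough picture of CPU, memory, and network use during training. Collecting the figures from all nodes in one place is still up to you, which is the part tools like Ganglia handle.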