Polyaxon Python API - How to get run logs and save to file #1523
Unanswered
QaisarRajput asked this question in Q&A
Replies: 1 comment
-
Most probably the issue is that you are using the default artifacts store, which is a temp dir, and you have a multi-node cluster, see this. You would need to deploy Polyaxon with an artifacts store that is accessible by all nodes, like S3. The artifacts store is the only required connection to be configured with a Polyaxon deployment.
-
Hi,
Context:
I have been running some experiments on EKS. It's working great, but my logs disappear after the run finishes. Also, during execution, the pod disconnects after an arbitrary amount of time and the previous logs are lost. EKS/Polyaxon/MPI recovers the job execution, and the launcher pod resumes training from where the disconnect happened.
Issue:
The issue is that I want to retain the logs of my runs. I am not able to use persistent volumes yet, which could be a solution. What I am trying to use is the Polyaxon Python API. More specifically, I am using RunClient and looking at `get_logs()` and `watch_logs()`. `get_logs()` is not returning anything, and I think it's not intended for this. `watch_logs()` is returning the logs, but it's not technically "returning" anything. It seems to be a streaming function that prints to stdout on the console (Jupyter, shell). In my code I am not able to get the logs with this, as it keeps on printing without stopping.
Question:
Is there another way to get the logs through the Python API? I intend to keep saving snapshots of the logs so that even if a disconnection happens I can join the log files later. Open to any suggestions. FYI, I have tried the CLI too: `polyaxon ops logs -f` is giving me encoding issues.
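One possible workaround for a function that only prints, sketched here under assumptions: redirect Python's stdout into a file while the streaming call runs. `stream_stdout_to_file` and `fake_watch_logs` are hypothetical names introduced for illustration, and the sketch assumes that `watch_logs()` writes through Python's `sys.stdout`, which has not been verified against the Polyaxon client.

```python
import contextlib
import os
import tempfile
from typing import Callable


def stream_stdout_to_file(stream_fn: Callable[[], None], path: str) -> None:
    """Run stream_fn and append everything it prints to the file at path."""
    # Append mode keeps earlier snapshots if the function is called again
    # after a reconnect; buffering=1 flushes after every line, so lines
    # already written survive an interrupted stream.
    with open(path, "a", encoding="utf-8", buffering=1) as fh:
        with contextlib.redirect_stdout(fh):
            stream_fn()


def fake_watch_logs() -> None:
    # Hypothetical stand-in for a streaming call such as
    # RunClient.watch_logs(); it only prints and returns nothing.
    for step in range(3):
        print(f"step {step}: loss=0.{9 - step}")


log_path = os.path.join(tempfile.mkdtemp(), "run.log")
stream_stdout_to_file(fake_watch_logs, log_path)

# Read the file back to confirm the printed lines were captured.
with open(log_path, encoding="utf-8") as fh:
    saved_lines = fh.read().splitlines()
```

With a real client, the stand-in would be replaced by the bound method, for example `stream_stdout_to_file(client.watch_logs, "run.log")`, where `client` is a `RunClient` for the run in question. Writing the file as UTF-8 explicitly may also sidestep console encoding problems, though that is a guess about their cause.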