Delays in pilot/JobAgent cycles #6106
-
Technically this is @sfayer's question...
but there are no more waiting jobs, so I would expect it to exit after 10 cycles. It's now more than 2 h later.
Doing a horrible strace hack we find:
which looks a bit like a red herring. Simon reckons something is stuck, but what?
-
Can I see the meaningful part of the result of Also, when you say that "it works in v7r3", do you mean py2 or py3?
-
Our v7r3 is all py3; at least, we have the Python3Pilots = True flag set. That should do it, no?
-
In a possibly related issue, Simon thinks that you are buffering the whole pilot output in here:
-
@martynia You are running v7r3, which should run python3 pilots automatically unless you specifically tell them otherwise. (Which you don't.)
-
To answer Federico's question from yesterday: when the node sits there, not running any jobs but also not shutting down, I see:
-
And here is the pilot log. Note that there hasn't been an update for over 60 min:
-
This actually opens another issue. In my current remote logging system:
This does not address the case where you want to flush the buffer after some period of time rather than after n lines. That would help with debugging even without setting n=1. JM
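To make the idea concrete, here is a minimal sketch of a buffer that flushes on either condition: n lines collected, or a maximum age reached. This is not the actual pilot logger code; `TimedLineBuffer`, `sink`, `max_lines`, `max_age` and `poll` are hypothetical names made up for illustration.

```python
import time


class TimedLineBuffer:
    """Hypothetical helper: flush after max_lines lines OR max_age seconds."""

    def __init__(self, sink, max_lines=10, max_age=30.0, clock=time.monotonic):
        self._sink = sink            # callable receiving a list of lines
        self._max_lines = max_lines  # size threshold (the "n lines" case)
        self._max_age = max_age      # age threshold in seconds
        self._clock = clock          # injectable clock, handy for testing
        self._lines = []
        self._first_ts = None        # time the oldest buffered line arrived

    def write(self, line):
        """Buffer one line and flush if a threshold has been reached."""
        if not self._lines:
            self._first_ts = self._clock()
        self._lines.append(line)
        self.poll()

    def poll(self):
        """Flush if either threshold is reached; returns True if flushed."""
        if not self._lines:
            return False
        too_many = len(self._lines) >= self._max_lines
        too_old = self._clock() - self._first_ts >= self._max_age
        if too_many or too_old:
            self.flush()
            return True
        return False

    def flush(self):
        """Send all buffered lines to the sink and reset the buffer."""
        if self._lines:
            self._sink(list(self._lines))
            self._lines.clear()
            self._first_ts = None
```

Note that the time-based flush only fires if something calls `poll()` regularly (a timer thread or the main loop, for example). If no new line arrives and nothing polls, the buffered lines stay where they are.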
-
This is probably more one for @sfayer:
(It's right now 16:09 UTC.) Any creative ideas, anyone?
-
@fstagni I've moved the delay in the pilot finishing to a separate issue: #6116
@martynia: Can you make a separate discussion for the logging stuff if needed?