Dear all,

We have some issues with payloads being matched even when the TimeLeft on the queue is too low. Looking at the pilot output, I found that the TimeLeft was computed correctly:
about 2.6 hours, considering a CPUNormalizationFactor = 28. However, a payload with:
was matched and then killed by the batch system when the CPU limit was reached. I thought that the CPUTime requirement was not in normalized units and thus corresponded in this case to 72 hours, but maybe it is normalized, which would explain why the payload was matched. Could you please clarify in which unit the CPUTime requirement is set?
According to my response here: #5912 (comment)
I should really document that somewhere, sorry.
Well, the original idea is actually not bad; it is just that it should be named CPUWork instead of CPUTime 😅

Users are supposed to run their tasks on a given computer and combine the time they measure with the CPU Power of their machine to get the CPUWork of the tasks. From there, they are supposed to submit their jobs through DIRAC.

For instance, taskA would take 7200s (CPUTime) at 28 DB12 units (CPUPower) to run on WN_1: taskA needs 7200 x 28 = 201600 DB12.s to run (CPUWork).

Therefore, if a pilot has an allocation of 15000s on WN_2, a remote computing resource, which has a CPUPower of 14 DB12 units, then the DIRAC Matcher will know that it can "safely" fetch taskA because: 14400 x 14 (2100…
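
The arithmetic in this reply can be sketched in a few lines of Python. This is only an illustration of the CPUWork idea using the numbers quoted above (7200s at 28 DB12 on WN_1, a 15000s allocation at 14 DB12 on WN_2). The function names `cpu_work` and `can_fetch` are hypothetical and are not part of the DIRAC API, and the comparison is an assumed simplification of what the Matcher does.

```python
def cpu_work(cpu_time_s: int, cpu_power_db12: int) -> int:
    """Return normalized work in DB12.s: wall CPU seconds times CPU power."""
    return cpu_time_s * cpu_power_db12


def can_fetch(required_work: int, allocation_s: int, cpu_power_db12: int) -> bool:
    """Assumed matching rule: the slot's normalized capacity must cover the task."""
    return cpu_work(allocation_s, cpu_power_db12) >= required_work


# taskA measured on WN_1: 7200 s at 28 DB12 units.
task_a_work = cpu_work(7200, 28)

# Pilot slot on WN_2: 15000 s allocation at 14 DB12 units.
pilot_capacity = cpu_work(15000, 14)

print(task_a_work, pilot_capacity, can_fetch(task_a_work, 15000, 14))
# prints: 201600 210000 True
```

Under these assumptions, the same task would not fit a 14000s allocation on WN_2, since 14000 x 14 = 196000 DB12.s is below the 201600 DB12.s that taskA needs.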