-
Notifications
You must be signed in to change notification settings - Fork 108
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Provide in the FWJR the CPU and wallclock time for CMSSW subprocesses #11665
Conversation
Jenkins results:
|
@amaltaro I made necessary changes to provide CPU/Mem and time info for each cmsRun job and it would be useful if you'll provide initial review. I still need to adjust unit tests though. |
Jenkins results:
|
Jenkins results:
|
Jenkins results:
|
Jenkins results:
|
Jenkins results:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vkuznet I left a brief review along the code.
Jenkins results:
|
@amaltaro , I spent more time with this PR and end-up with the following observation and changes:
Here I show how I address all aforementioned issues using pure python solution:
Running this script on darwin platform I get the following output:
While running the same command on Debian linux I get the following (I only changed directory for tree command, here I changed
Few observations:
The changes now are committed and I'll wait for unit tests to finish. It also allowed me to address your second issue with changing functionality as I returned to Please review again and ask your question or provide your feedback. |
Jenkins results:
|
Jenkins results:
|
Jenkins results:
|
Alan, I was able to successfully run unit tests with new PR implementation, but I step down from using |
hi @vkuznet So reading your comment above:
May I ask, if you and @amaltaro have already converged on using I am asking because in the issue I am trying to resolve, we will have to support/test a large range of linux kernel versions: 3.* to 6.* And since there have been a change in the |
Sorry, it seems, there is a detail worth mentioning here.
It seems in some cases |
Jenkins results:
|
I just want to note that in #11604 I requested the elapsed wall clock time to be accompanied with a timestamp of the moment the cmsRun was started. |
@makortel Matti, that was an overlook from my side. Valentin did start this implementation by providing start/end timestamps, while I thought we only wanted to know the wallclock time (elapsed time). I guess the wallclocktime-only has been requested for the whole job wrapper, while for each cmsRun you want to have:
Could you please confirm this? |
Actually we'd like to have the start timestamp also for the job wrapper (e.g. to see the time difference from job wrapper startup to starting up cmsRun).
Either 1+2 or 1+3 would suffice. I don't think all three are needed as (2 or 3) can be calculated from 1 and (3 or 2). As for the start timestamp format, I'd think the Unix time to be most convenient. |
userTime = getrusage(RUSAGE_CHILDREN).ru_utime | ||
sysTime = getrusage(RUSAGE_CHILDREN).ru_stime |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I've understood, the getrusage()
returns the resources the process (tree) has consumed up to the call point. Therefore, in order to record only the resources consumed by the subprocess launched by the Popen()
call, the user and system time would have to be recorded both before and after the call to Popen()
, and their difference would be the ones stored to the userTime
and sysTime
member variables.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Matti, good catch, I think you are right here. And I adjusted the code accordingly, see this commit c70cedd
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good by eye, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll try to add appropriate unit test soon to measure the difference of times of main code flaw and with subprocess call
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Matti, have a look at new unit test in this commit: 9216dae where I tested logic of getting subprocess time based on your observation.
Jenkins results:
|
Jenkins results:
|
Alan, this PR is mostly complete and I need your initial review to understand if all necessary pieces are in place. Please have a look at your convenience. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Valentin, this code is looking good in general. There are just some minor adjustments for you to consider.
@@ -273,7 +275,27 @@ def execute(self, emulator=None): | |||
|
|||
os.environ.update(envOverride) | |||
|
|||
returnCode = subprocess.call(args, stdout=stdoutHandle, stderr=stderrHandle) | |||
startTime = time.time() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be great to move all of this under a dedicated method, but given that we have to pass in many parameters and retrieve many out of the method, it might be uglier than keeping this code in this method. Anyhow, if you see I way to isolate all these juggling, please go ahead!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alan, I made code better and safer (by using with
instead of manual close which can never happen due to try/except block) but I think I didn't find way to make it as independent function though and probably better to keep this way.
Jenkins results:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vkuznet Valentin, these changes are looking good to me. Before merging it though, I wanted to ask the following:
a) should we convert all these timing metrics to integers of seconds instead of float? I don't think a fraction of a second hurts these metrics in any shape and form.
b) could you please run another test workflow with this patch in and point me to one of the job logs? I wanted to make sure that we don't messed up with the worker node logging.
c) where is your development for parsing the FJR and pushing it to WMArchive (through the ArchiveDataReporter component)? I assume you are adding it in the PR meant to close this issue #11657?
Alan, regarding your questions: |
@makortel Matti, for the timing measurements, do you have any preference of data type/precision? Right now it's float, but I think seconds in integer would be more than enough. Please let us know your opinion on this. Valentin, for the point c) above, you need to check the this is how the WMArchive documents gets constructed and it's not clear to me that a new section in the FJR will automatically make it to the final WMArchiver document. |
@amaltaro , I added appropriate PR (#11692) which provides metrics to WMArchive service but I was under impression that we separate creation of metrics and their propagation to WMArchive anyway because it would be much easier to test rather then having so many PRs at the same time (I start preparing doc about CPU/Memory, Timing, CMSSW metrics and it has quite a lot of PRs). |
We would prefer float, so that if a less-than-second measurement occurs, a value different from zero would be visible (for exact 0 one could wonder if the value was actually measured or if the 0 is just a default value). But technically integer would work too (time durations of interest are longer than a second). |
@amaltaro , I just checked the job from
and, it is transformed into JSON as following (only showing relevant section):
For completeness, here are the links to full report in different formats: |
For completeness here is a simple script to transform pkl report to dot (txt) and JSON representations
|
663b68b
to
429078f
Compare
Jenkins results:
|
psutil
moduleThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes are looking good to me, thanks!
To avoid another iteration, I updated the PR title and will have to manually fix the commit message when constructing the release notes. Just FYI
UPDATE: PR description was outdated, I am updating it now.
As we keep discussing this and try to have the whole chain functional, I have some updates that are worth sharing.
we can properly find it. Another alternative to see the whole structure of this object is with:
Having said that, the place to debug this further would be: which generates the relevant information stored in the wmagent_jobdump/fwjrs CouchDB database. FYI @vkuznet PS.: Note that the overall timing section defined at the wrapper and called |
@amaltaro , thanks for looking into this but now I doubt that ChangeState.py is a right place to dig in. Instead, as I expected I found that there is internal MariaDB database on WMAgent which has
I would appreciate if you'll review this logic and let me know your feedback. To implement it I'll need to know how to pass MariaDB URI to |
Changing the data source is definitely NOT the right way to proceed. If ArchiveDataReporter is fed with the FJRs stored in CouchDB, what we need to do is to ensure that the CouchDB documents have the proper content. Please investigate the code I referenced above in ChangeState to ensure that the |
@amaltaro , I do not understand what do you mean by saying "changing data source", which data source? In my view there two different FJR in a system. One is generated within job, this is the one where we put metrics. It is written to the pkl file when job is done. And, another which is create by some component in WMA which writes and update it in local CouchDB. If this is the case in order to get CMSSW, or any other job related metrics we need to read job FJR from pkl file because this is where metrics about job are stored. We can read it in ArchiveDataReporter or in ChangeState to complement the FJR which is circulating in WMA and not associated with a job. We verified already that |
Fixes #11660
Status
In development
Description
This PR provides a new FWJR section called
WMCMSSWSubprocess
, which contains the job start and end time (local time). In addition, it also provides wallclocktime, user and system time.Note that this information is not yet propagated through WMArchive, which will be provided with this issue: #11657
Is it backward compatible (if not, which system it affects?)
YES
Related PRs
#11656 , #11657, #11663
External dependencies / deployment changes
None