-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Tests] Add more logging for runtime test timeouts #109998
base: main
Are you sure you want to change the base?
Conversation
Tagging subscribers to this area: @hoyosjs |
@@ -642,6 +642,8 @@ public static bool TryPrintStackTraceFromDmp(string dmpFile, TextWriter outputWr | |||
// The children are sorted in the order they should be dumped | |||
static unsafe IEnumerable<Process> FindChildProcessesByName(Process process, string childName) | |||
{ | |||
Console.WriteLine($"Finding all child processes of '{process.ProcessName}' (ID: {process.Id}) with name '{childName}'"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everywhere else in this code I see an outputWriter being used but in a few spots I also see Console.WriteLine() being used. Can you tell what the distinction is? Its not clear to me when we want to use one and when we use the other.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the outputWriter
+ errorWriter
are used to write each test's output.txt
+ error.txt
.
I haven't yet pinpointed the logic for how each helix workitem's console.<id>.log
is created, but it looks to me that the console.<id>.log
will stitch together each individually generated xunit test wrapper's error.txt
and output.txt
in addition to the Console.WriteLine
s used.
I think the output.txt
would be used to validate the test result, error.txt
would be used to capture any test specific failures, and any other failures that show up in console.<id>.log
are probably non test-specific (i.e. infrastructure or timeout?) failures.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the outputWriter + errorWriter are used to write each test's output.txt + error.txt.
Sounds reasonable to me. As long as the output will be visible in the CI output I'm happy but hopefully there is some owner for this code who can confirm we are following the intended convention :)
@@ -788,6 +791,12 @@ public int RunTest(string executable, string outputFile, string errorFile, strin | |||
executable, timeout, (environmentVar != null) ? " from variable " : "", (environmentVar != null) ? TIMEOUT_ENVIRONMENT_VAR : "", | |||
startTime.ToString(), endTime.ToString()); | |||
|
|||
Console.WriteLine("Timed Out with Active Processes:"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The log above said we timed out so I don't think we need to repeat that again. Perhaps output like this:
cmdLine: blablabla Timed Out (timeout in milliseconds: XYZ, start: A, end: B)
Collecting diagnostic information...
Snapshot of processes currently running:
ID ParentId CmdLine
145 9001 Foo.exe arg1 arg2
1458 145 Blah.exe something something
10632 2056 corerun.exe SomeTest.dll --whatever arg more_args
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we already have a straightforward way to grab the ParentIds and the CommandLine args of the process?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was hoping that either the existing enumeration code that handles GetChildren() would have it or you could get it from Process.GetProcesses(). If neither of them provide an easy option feel free to simplify it as much as needed. Maybe it would just be ID and ProcessName as a simpler alternative.
e7a053c
to
9a775c2
Compare
This reverts commit 9a775c2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks fine to me, but hopefully at least one of the past editors/owners of this file will also be able to take a look. My knowledge of it is minimal.
The runtime testing infrastructure collects dumps of the test process and all of its
corerun
child processes whenever the runtime test times out.runtime/src/tests/Common/Coreclr.TestWrapper/CoreclrTestWrapperLib.cs
Lines 795 to 807 in e3b3aaa
However, there have been instances where a runtime test times out due to waiting for a subprocess to finish, yet there are logs indicating that the subprocess has finished. See #104670 (comment). Its not clear whether 1) the subprocess is indeed hanging (yet no crashdump is collected) or 2) the subprocess completed yet failed to signal back to the parent process (no crashdump to collect).
This PR aims to clarify, if the Id of the subprocess that the main test process is waiting for is known, whether that Id is part of the list of active processes at timeout.