-
-
Notifications
You must be signed in to change notification settings - Fork 379
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
check pid exists when waiting for child #388
check pid exists when waiting for child #388
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
❗ Your organization needs to install the Codecov GitHub app to enable full functionality. Additional details and impacted files@@ Coverage Diff @@
## master #388 +/- ##
=========================================
Coverage 100.00% 100.00%
=========================================
Files 21 21
Lines 2249 2276 +27
=========================================
+ Hits 2249 2276 +27 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this looks good, a couple of comments.
@@ -590,7 +590,16 @@ def wait_children_finish(self) -> None: | |||
try: | |||
if any((f.endswith(".viztmp") for f in os.listdir(self.multiprocess_output_dir))): | |||
same_line_print("Wait for child processes to finish, Ctrl+C to skip") | |||
while any((f.endswith(".viztmp") for f in os.listdir(self.multiprocess_output_dir))): | |||
while True: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Logic here can be simplified. You don't need to check if remain_viztmp
is empty if you need to iterate through it. Do the for loop directly and your remaining logic will take care of it.
int(f[7:-12])
is a horrible magic number. Yit would be better to use regular expression here because it's self-explainable. Also, if there happen to be a file that does not match the pattern, this implementation will raise an index error or a integer conversion error, which would be a huge confusion to users. For regular expression, if it does not match, you can issue a warning (might be something we change in other places but forgot to match here).
You probably do not need a remain_children
variable, you can iterate through remain_viztmp
directly.
src/viztracer/util.py
Outdated
elif err.errno == errno.EPERM: | ||
# EPERM clearly means there's a process to deny access to | ||
# UNIX | ||
if sys.platform != "win32": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's make the logic "positive" here. Do if sys.platform == "win32"
then windows logic, and Unix in else
with a comment stating this is Unix.
tests/test_regression.py
Outdated
|
||
class TestWaitForChild(CmdlineTmpl): | ||
def test_child_process_exits_normally(self): | ||
self.template(["viztracer", "-o", "result.json", "--dump_raw", "cmdline_test.py"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not think too much but is --dump_raw
necessary here?
src/viztracer/main.py
Outdated
if pid_exists(pid): | ||
break | ||
else: # pragma: no cover | ||
color_print("WARNING", f"Can't parse {viztmp_file}, skip") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Warning messge is a bit confusing. Let's do something like "Unknown viztmp file {viztmp_file}"
src/viztracer/util.py
Outdated
# Usually it's impossible to run here in viztracer. | ||
return True | ||
return False # pragma: no cover | ||
# UNIX |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment should be inside else
case, not above it.
tests/test_regression.py
Outdated
p.start() | ||
# The main process will join the child in multiprocessing.process._children. | ||
# This is a hack to make sure the main process won't join the child process, | ||
# so we can test the wait_children_finish function |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For both comments, let's mention the class (I think it's VizUI.wait_children_finish
?). Because we are discussing in the context of multiprocessing
here and it's easy for people to mistaken this function as a multiprocessing
function.
Fixed it! |
I believe this PR is making the nightly validation unstable - https://github.com/gaogaotiantian/viztracer/actions/runs/7539946132/job/20523513962 (not sure if you have access to it). The issue is the child process only lives for 1 second, and the parent process might need more to reach the trace collection part. github actions is normally much slower than our PC. There are multiple tests that use different timeout on github actions (search for |
I agree with that if using a much slower PC the test will fail. It takes about 0.4 seconds on my computer between |
Added child check when waiting for children to finish, in case that child failed to delete viztmp file. The current
pid_exists
has a issue that if a process return 259 (STILL_ACTIVE) as exit code (which is a really rare case that is nearly impossible I guess).