Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adjust popen timeouts for testing #8848

Merged
merged 5 commits into from
Oct 29, 2024

Conversation

fjetter
Copy link
Member

@fjetter fjetter commented Aug 30, 2024

This popen utility is often used in our CLI tests. The timeouts here are however not well adjusted to our overall test timeouts. Particularly if this is used inside of an async gen_test / gen_cluster test these timeouts do not match since the gen_cluster default test timeout is 30s.

Therefore, the async test timeout of 30s will always win over this popen timeout but timeouts should be hierarchical since they otherwise overshadow problems. The test test_nanny_worker_port_range_too_many_workers_raises for instance is failing intermittently on CI but there is no way to tell where or why it is stuck because the async timeout hits before anything else is happening.

Generally, 30s and 10s for a process termination / kill is way too generous so I suggest to dial both of them down a bit

@fjetter
Copy link
Member Author

fjetter commented Aug 30, 2024

Same goes for distributed.cli.tests.test_dask_spec.test_errors. Since the asyncio timeout hits we don't even get the subprocess output

Copy link
Contributor

github-actions bot commented Aug 30, 2024

Unit Test Results

See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.

    25 files  ±0      25 suites  ±0   10h 25m 5s ⏱️ + 8m 12s
 4 129 tests ±0   4 009 ✅  -  9    110 💤 ±0  10 ❌ + 9 
47 698 runs  ±0  45 589 ✅  - 13  2 095 💤 ±0  14 ❌ +13 

For more details on these failures, see this check.

Results for commit d1f5013. ± Comparison against base commit 36020d6.

♻️ This comment has been updated with latest results.

Comment on lines 1128 to 1129
except subprocess.TimeoutExpired:
print("Process did not terminate after SIGINT; sending SIGKILL")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'm OK if the process does not shut down with just SIGINT so I'm catching this and just print the information

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thought about this again. I think we should raise after all. Having this not work well is a little unhealthy

@fjetter fjetter merged commit 09ed8af into dask:main Oct 29, 2024
19 of 30 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant