Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KAFKA-17646: Fix flaky KafkaStreamsTest.testStateGlobalThreadClose #17310

Conversation

chenyulin0719
Copy link
Contributor

@chenyulin0719 chenyulin0719 commented Sep 28, 2024

It's regarding KAFKA-17646.

Two issues found in the flaky test:

  1. Racing between streams.close() and shutdownHelper() threads: Resolved it by moving the streams.close() call after confirming the KafkaStreams state is in ERROR state.
  2. Checking for short-lived PENDING_ERROR state: Resolved it by changing the assertion to check the log content State transition from RUNNING to PENDING_ERROR.

Put the verbose explanation under Jira comment.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@github-actions github-actions bot added streams tests Test fixes (including flaky tests) labels Sep 28, 2024
@chenyulin0719 chenyulin0719 changed the title KAFKA-17515: Fix flaky KafkaStreamsTest.testStateGlobalThreadClose KAFKA-17646: Fix flaky KafkaStreamsTest.testStateGlobalThreadClose Sep 28, 2024
Copy link
Contributor

@chia7712 chia7712 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chenyulin0719 thanks for your patch. overall +1

@chenyulin0719 chenyulin0719 force-pushed the KAFKA-17646-fix-flaky-testStateGlobalThreadClose branch from d6571d0 to 06f299f Compare September 30, 2024 04:56
@chenyulin0719
Copy link
Contributor Author

@chia7712 Thanks for the review. I updated the PR as suggested.

Copy link
Member

@mjsax mjsax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR. One question. Won't we leak the thread if we don't join it?

@@ -540,21 +540,18 @@ public void testStateGlobalThreadClose() throws Exception {
waitForCondition(
() -> globalStreamThread.state() == GlobalStreamThread.State.DEAD,
"Thread never stopped.");
globalStreamThread.join();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we remove this line?

Copy link
Contributor Author

@chenyulin0719 chenyulin0719 Oct 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @mjsax, there will be no GlobalStreamThread thread triggered since we mock the GlobalStreamThread.start() to change the state.

doAnswer(invocation -> {
globalThreadState.set(GlobalStreamThread.State.RUNNING);
threadStateListenerCapture.getValue().onChange(mock,
GlobalStreamThread.State.RUNNING,
GlobalStreamThread.State.CREATED);
return null;
}).when(mock).start();

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah. Thanks! Missed this part.

Copy link
Contributor

@chia7712 chia7712 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

loop 200 times, all pass

@chia7712 chia7712 merged commit 4c90d35 into apache:trunk Oct 1, 2024
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci-approved streams tests Test fixes (including flaky tests)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants