Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KAFKA-17459: Stablize reassign_partitions_test.py #17250

Merged
merged 4 commits into from
Sep 24, 2024

Conversation

frankvicky
Copy link
Collaborator

JIRA: KAFKA-17459

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@github-actions github-actions bot added the tests Test fixes (including flaky tests) label Sep 22, 2024
@frankvicky
Copy link
Collaborator Author

frankvicky commented Sep 22, 2024

Test result on my local machine (branch 3.9)

================================================================================
SESSION REPORT (ALL TESTS)
ducktape version: 0.11.4
session_id:       2024-09-22--013
run time:         36 minutes 49.519 seconds
tests run:        16
passed:           16
flaky:            0
failed:           0
ignored:          0
================================================================================
test_id:    kafkatest.tests.core.reassign_partitions_test.ReassignPartitionsTest.test_reassign_partitions.bounce_brokers=False.reassign_from_offset_zero=False.metadata_quorum=ISOLATED_KRAFT.use_new_coordinator=False
status:     PASS
run time:   2 minutes 19.969 seconds
--------------------------------------------------------------------------------
test_id:    kafkatest.tests.core.reassign_partitions_test.ReassignPartitionsTest.test_reassign_partitions.bounce_brokers=False.reassign_from_offset_zero=False.metadata_quorum=ISOLATED_KRAFT.use_new_coordinator=True.group_protocol=classic
status:     PASS
run time:   2 minutes 20.799 seconds
--------------------------------------------------------------------------------
test_id:    kafkatest.tests.core.reassign_partitions_test.ReassignPartitionsTest.test_reassign_partitions.bounce_brokers=False.reassign_from_offset_zero=False.metadata_quorum=ISOLATED_KRAFT.use_new_coordinator=True.group_protocol=consumer
status:     PASS
run time:   2 minutes 25.826 seconds
--------------------------------------------------------------------------------
test_id:    kafkatest.tests.core.reassign_partitions_test.ReassignPartitionsTest.test_reassign_partitions.bounce_brokers=False.reassign_from_offset_zero=False.metadata_quorum=ZK.use_new_coordinator=False
status:     PASS
run time:   2 minutes 35.548 seconds
--------------------------------------------------------------------------------
test_id:    kafkatest.tests.core.reassign_partitions_test.ReassignPartitionsTest.test_reassign_partitions.bounce_brokers=False.reassign_from_offset_zero=True.metadata_quorum=ISOLATED_KRAFT.use_new_coordinator=False
status:     PASS
run time:   1 minute 55.080 seconds
--------------------------------------------------------------------------------
test_id:    kafkatest.tests.core.reassign_partitions_test.ReassignPartitionsTest.test_reassign_partitions.bounce_brokers=False.reassign_from_offset_zero=True.metadata_quorum=ISOLATED_KRAFT.use_new_coordinator=True.group_protocol=classic
status:     PASS
run time:   1 minute 54.821 seconds
--------------------------------------------------------------------------------
test_id:    kafkatest.tests.core.reassign_partitions_test.ReassignPartitionsTest.test_reassign_partitions.bounce_brokers=False.reassign_from_offset_zero=True.metadata_quorum=ISOLATED_KRAFT.use_new_coordinator=True.group_protocol=consumer
status:     PASS
run time:   2 minutes 1.341 seconds
--------------------------------------------------------------------------------
test_id:    kafkatest.tests.core.reassign_partitions_test.ReassignPartitionsTest.test_reassign_partitions.bounce_brokers=False.reassign_from_offset_zero=True.metadata_quorum=ZK.use_new_coordinator=False
status:     PASS
run time:   2 minutes 5.771 seconds
--------------------------------------------------------------------------------
test_id:    kafkatest.tests.core.reassign_partitions_test.ReassignPartitionsTest.test_reassign_partitions.bounce_brokers=True.reassign_from_offset_zero=False.metadata_quorum=ISOLATED_KRAFT.use_new_coordinator=False
status:     PASS
run time:   2 minutes 34.628 seconds
--------------------------------------------------------------------------------
test_id:    kafkatest.tests.core.reassign_partitions_test.ReassignPartitionsTest.test_reassign_partitions.bounce_brokers=True.reassign_from_offset_zero=False.metadata_quorum=ISOLATED_KRAFT.use_new_coordinator=True.group_protocol=classic
status:     PASS
run time:   2 minutes 33.114 seconds
--------------------------------------------------------------------------------
test_id:    kafkatest.tests.core.reassign_partitions_test.ReassignPartitionsTest.test_reassign_partitions.bounce_brokers=True.reassign_from_offset_zero=False.metadata_quorum=ISOLATED_KRAFT.use_new_coordinator=True.group_protocol=consumer
status:     PASS
run time:   2 minutes 41.446 seconds
--------------------------------------------------------------------------------
test_id:    kafkatest.tests.core.reassign_partitions_test.ReassignPartitionsTest.test_reassign_partitions.bounce_brokers=True.reassign_from_offset_zero=False.metadata_quorum=ZK.use_new_coordinator=False
status:     PASS
run time:   2 minutes 40.422 seconds
--------------------------------------------------------------------------------
test_id:    kafkatest.tests.core.reassign_partitions_test.ReassignPartitionsTest.test_reassign_partitions.bounce_brokers=True.reassign_from_offset_zero=True.metadata_quorum=ISOLATED_KRAFT.use_new_coordinator=False
status:     PASS
run time:   2 minutes 6.443 seconds
--------------------------------------------------------------------------------
test_id:    kafkatest.tests.core.reassign_partitions_test.ReassignPartitionsTest.test_reassign_partitions.bounce_brokers=True.reassign_from_offset_zero=True.metadata_quorum=ISOLATED_KRAFT.use_new_coordinator=True.group_protocol=classic
status:     PASS
run time:   2 minutes 8.370 seconds
--------------------------------------------------------------------------------
test_id:    kafkatest.tests.core.reassign_partitions_test.ReassignPartitionsTest.test_reassign_partitions.bounce_brokers=True.reassign_from_offset_zero=True.metadata_quorum=ISOLATED_KRAFT.use_new_coordinator=True.group_protocol=consumer
status:     PASS
run time:   2 minutes 11.628 seconds
--------------------------------------------------------------------------------
test_id:    kafkatest.tests.core.reassign_partitions_test.ReassignPartitionsTest.test_reassign_partitions.bounce_brokers=True.reassign_from_offset_zero=True.metadata_quorum=ZK.use_new_coordinator=False
status:     PASS
run time:   2 minutes 13.226 seconds
--------------------------------------------------------------------------------

@frankvicky frankvicky marked this pull request as ready for review September 22, 2024 15:46
Copy link
Contributor

@chia7712 chia7712 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@frankvicky thanks for this fix. Could you please test this patch for 3.8 as we will have 3.8.1 release

@@ -176,7 +176,7 @@ def test_reassign_partitions(self, bounce_brokers, reassign_from_offset_zero, me
self.producer = VerifiableProducer(self.test_context, self.num_producers,
self.kafka, self.topic,
throughput=self.producer_throughput,
enable_idempotence=True)
enable_idempotence=True, repeating_keys=100)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please add comments?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. 😺

@frankvicky
Copy link
Collaborator Author

Hello @chia7712
Following is the test result on branch 3.8:

================================================================================
SESSION REPORT (ALL TESTS)
ducktape version: 0.11.4
session_id:       2024-09-22--001
run time:         41 minutes 13.783 seconds
tests run:        16
passed:           16
flaky:            0
failed:           0
ignored:          0
================================================================================
test_id:    kafkatest.tests.core.reassign_partitions_test.ReassignPartitionsTest.test_reassign_partitions.bounce_brokers=False.reassign_from_offset_zero=False.metadata_quorum=ISOLATED_KRAFT.use_new_coordinator=False
status:     PASS
run time:   2 minutes 35.159 seconds
--------------------------------------------------------------------------------
test_id:    kafkatest.tests.core.reassign_partitions_test.ReassignPartitionsTest.test_reassign_partitions.bounce_brokers=False.reassign_from_offset_zero=False.metadata_quorum=ISOLATED_KRAFT.use_new_coordinator=True.group_protocol=classic
status:     PASS
run time:   2 minutes 35.229 seconds
--------------------------------------------------------------------------------
test_id:    kafkatest.tests.core.reassign_partitions_test.ReassignPartitionsTest.test_reassign_partitions.bounce_brokers=False.reassign_from_offset_zero=False.metadata_quorum=ISOLATED_KRAFT.use_new_coordinator=True.group_protocol=consumer
status:     PASS
run time:   2 minutes 39.373 seconds
--------------------------------------------------------------------------------
test_id:    kafkatest.tests.core.reassign_partitions_test.ReassignPartitionsTest.test_reassign_partitions.bounce_brokers=False.reassign_from_offset_zero=False.metadata_quorum=ZK.use_new_coordinator=False
status:     PASS
run time:   2 minutes 51.605 seconds
--------------------------------------------------------------------------------
test_id:    kafkatest.tests.core.reassign_partitions_test.ReassignPartitionsTest.test_reassign_partitions.bounce_brokers=False.reassign_from_offset_zero=True.metadata_quorum=ISOLATED_KRAFT.use_new_coordinator=False
status:     PASS
run time:   2 minutes 10.545 seconds
--------------------------------------------------------------------------------
test_id:    kafkatest.tests.core.reassign_partitions_test.ReassignPartitionsTest.test_reassign_partitions.bounce_brokers=False.reassign_from_offset_zero=True.metadata_quorum=ISOLATED_KRAFT.use_new_coordinator=True.group_protocol=classic
status:     PASS
run time:   2 minutes 4.707 seconds
--------------------------------------------------------------------------------
test_id:    kafkatest.tests.core.reassign_partitions_test.ReassignPartitionsTest.test_reassign_partitions.bounce_brokers=False.reassign_from_offset_zero=True.metadata_quorum=ISOLATED_KRAFT.use_new_coordinator=True.group_protocol=consumer
status:     PASS
run time:   2 minutes 13.470 seconds
--------------------------------------------------------------------------------
test_id:    kafkatest.tests.core.reassign_partitions_test.ReassignPartitionsTest.test_reassign_partitions.bounce_brokers=False.reassign_from_offset_zero=True.metadata_quorum=ZK.use_new_coordinator=False
status:     PASS
run time:   2 minutes 22.108 seconds
--------------------------------------------------------------------------------
test_id:    kafkatest.tests.core.reassign_partitions_test.ReassignPartitionsTest.test_reassign_partitions.bounce_brokers=True.reassign_from_offset_zero=False.metadata_quorum=ISOLATED_KRAFT.use_new_coordinator=False
status:     PASS
run time:   2 minutes 44.410 seconds
--------------------------------------------------------------------------------
test_id:    kafkatest.tests.core.reassign_partitions_test.ReassignPartitionsTest.test_reassign_partitions.bounce_brokers=True.reassign_from_offset_zero=False.metadata_quorum=ISOLATED_KRAFT.use_new_coordinator=True.group_protocol=classic
status:     PASS
run time:   2 minutes 54.032 seconds
--------------------------------------------------------------------------------
test_id:    kafkatest.tests.core.reassign_partitions_test.ReassignPartitionsTest.test_reassign_partitions.bounce_brokers=True.reassign_from_offset_zero=False.metadata_quorum=ISOLATED_KRAFT.use_new_coordinator=True.group_protocol=consumer
status:     PASS
run time:   2 minutes 53.711 seconds
--------------------------------------------------------------------------------
test_id:    kafkatest.tests.core.reassign_partitions_test.ReassignPartitionsTest.test_reassign_partitions.bounce_brokers=True.reassign_from_offset_zero=False.metadata_quorum=ZK.use_new_coordinator=False
status:     PASS
run time:   3 minutes 2.122 seconds
--------------------------------------------------------------------------------
test_id:    kafkatest.tests.core.reassign_partitions_test.ReassignPartitionsTest.test_reassign_partitions.bounce_brokers=True.reassign_from_offset_zero=True.metadata_quorum=ISOLATED_KRAFT.use_new_coordinator=False
status:     PASS
run time:   2 minutes 29.158 seconds
--------------------------------------------------------------------------------
test_id:    kafkatest.tests.core.reassign_partitions_test.ReassignPartitionsTest.test_reassign_partitions.bounce_brokers=True.reassign_from_offset_zero=True.metadata_quorum=ISOLATED_KRAFT.use_new_coordinator=True.group_protocol=classic
status:     PASS
run time:   2 minutes 25.793 seconds
--------------------------------------------------------------------------------
test_id:    kafkatest.tests.core.reassign_partitions_test.ReassignPartitionsTest.test_reassign_partitions.bounce_brokers=True.reassign_from_offset_zero=True.metadata_quorum=ISOLATED_KRAFT.use_new_coordinator=True.group_protocol=consumer
status:     PASS
run time:   2 minutes 33.508 seconds
--------------------------------------------------------------------------------
test_id:    kafkatest.tests.core.reassign_partitions_test.ReassignPartitionsTest.test_reassign_partitions.bounce_brokers=True.reassign_from_offset_zero=True.metadata_quorum=ZK.use_new_coordinator=False
status:     PASS
run time:   2 minutes 37.434 seconds
--------------------------------------------------------------------------------

enable_idempotence=True)
enable_idempotence=True,
# To avoid the reassignment behavior being affected by `partitioner.ignore.keys=true`,
# we set a key for the message to ensure the message order is preserved within the partition.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need this change due to the test behavior, so it would be helpful to emphasize that point.

This test expects that each partition can receive the record, so using a non-null key helps distribute the records more randomly.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough, I will add some context.

enable_idempotence=True)
enable_idempotence=True,
# This test aims to verify the reassignment without failure, assuming that all partitions have data.
# To avoid the reassignment behavior being affected by `partitioner.ignore.keys=true`,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This description is not accurate as the default value of partitioner.ignore.keys is false, and the root cause is that the BuiltInPartitioner tries to send requests to the fast brokers.

Could you please keep the description "This test aims to verify the reassignment without failure, assuming that all partitions have data" and "we set a key for the message to ensure both even data distribution across all partitions"

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that I have some misunderstanding before.
I have updated the comment, please take a look 😺

@chia7712 chia7712 merged commit f51fc16 into apache:trunk Sep 24, 2024
9 of 10 checks passed
chia7712 pushed a commit that referenced this pull request Sep 24, 2024
This test expects that each partition can receive the record, so using a non-null key helps distribute the records more randomly.

Reviewers: Chia-Ping Tsai <[email protected]>
chia7712 pushed a commit that referenced this pull request Sep 24, 2024
This test expects that each partition can receive the record, so using a non-null key helps distribute the records more randomly.

Reviewers: Chia-Ping Tsai <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
tests Test fixes (including flaky tests)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants