-
Notifications
You must be signed in to change notification settings - Fork 14k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MINOR: Cache topic resolution in TopicIds set #17285
MINOR: Cache topic resolution in TopicIds set #17285
Conversation
Looking up topics in a TopicsImage is relatively slow. Cache the results in TopicIds to improve assignor performance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@squah-confluent Thanks for the patch. I left a few nits.
* | ||
* Provides an implementation of equals and hashCode based on the underlying TopicsImage. | ||
*/ | ||
public abstract static class BaseTopicResolver implements TopicResolver { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Should we keep it private?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it doesn't need to be public.
return result; | ||
} | ||
|
||
@Override | ||
public String toString() { | ||
return "TopicIds(topicNames=" + topicNames + | ||
", image=" + image + | ||
", resolver=" + resolver + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: If we toString it here, should we add toString methods to the resolvers too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll add a toString implementation that returns <class name>(image=<image>)
.
Reviewers: David Arthur <[email protected]>
…ly for new consumer with poll(0) (apache#16982) Reviewers: Lianet Magrans <[email protected]>, TaiJuWu <[email protected]>, Kirk True <[email protected]>, TengYao Chi <[email protected]>
…he#16719) Implement server side changes for epoch bump but keep EndTxn as an unstable API until the client side changes are implemented. EndTxnResponse will return the producer ID and epoch for the transaction. Introduces new tagged fields to the TransactionLogValue to persist the clientTransactionVersion, previousProducerId, and nextProducerId to the log so that the state can be reloaded. See KIP-890 for more details. Small updates to naming of lastProducerId -> PreviousProducerId. Also cleans up the many TransactionMetadata constructors. Reviewers: Artem Livshits <[email protected]>, David Jacot <[email protected]>
…active controller is removed (apache#17146) This change fixes a few issues. KAFKA-17608; KRaft controller crashes when active controller is removed When a control batch is committed, the quorum controller currently increases the last stable offset but fails to create a snapshot for that offset. This causes an issue if the quorum controller renounces and needs to revert to that offset (which has no snapshot present). Since the control batches are no-ops for the quorum controller, it does not need to update its offsets for control records. We skip handle commit logic for control batches. KAFKA-17604; Describe quorum output missing added voters endpoints Describe quorum output will miss endpoints of voters which were added via AddRaftVoter. This is due to a bug in LeaderState's updateVoterAndObserverStates which will pull replica state from observer states map (which does not include endpoints). The fix is to populate endpoints from the lastVoterSet passed into the method. Reviewers: José Armando García Sancio <[email protected]>, Colin P. McCabe <[email protected]>, Chia-Ping Tsai <[email protected]>
…pache#17288) While running large scale performance tests, we noticed that the logging on the ConsumerGroupHeartbeat path took a significant amount of CPU. It is mainly due to the very large data structures that we print out. I made a pass on those logs and I switched some of them to debug. Reviewers: Lianet Magrans <[email protected]>
Reviewers: Chia-Ping Tsai <[email protected]>
…consumer.poll (apache#17165) Reviewers: Lianet Magrans <[email protected]>, TaiJuWu <[email protected]>
…java class (apache#16912) Reviewers: Chia-Ping Tsai <[email protected]>
Reviewers: Chia-Ping Tsai <[email protected]>
…the feature command tool (apache#17093) This patch belongs to the ongoing efforts of implementing KIP-1022. Added feature-dependencies command to look up dependencies for a given feature version supplied by --feature flag. If the feature is not known or the version not yet defined, we throw an error. Examples : bin/kafka-feature feature-dependencies --feature transaction.version=2 transaction.version=2 requires: metadata.version=4 (3.3-IV0) (listing any other version dependencies) bin/kafka-feature feature-dependencies --feature metadata.version=17 metadata.version=17 (3.7-IV2) has no dependencies Reviewers: Justine Olshan <[email protected]>, Artem Livshits <[email protected]>
…pache#17182) This PR simply StreamsMetricsImpl to avoid passing in the unused "metric version" parameter. Reviewers: Matthias J. Sax <[email protected]>
…17259) With EOSv1 removed, we don't need to create a producer per task, and thus can simplify the code by removing KafkaClientSupplier from the deeply nested StreamsProducer, to simplify the code. Reviewers: Bill Bejeck <[email protected]>
This patch allows workflow runs to be controlled by the ci-approved label on Pull Requests. Rather than manually approving each workflow run explicitly, committers can now add the appropriate label and the new "CI Requested" and "PR Labeled" workflows will auto-approve the requested run. Reviewers: Chia-Ping Tsai <[email protected]>
…-common (apache#17289) Reviewers: Chia-Ping Tsai <[email protected]>, David Arthur <[email protected]>
…pache#17295) This is a small patch to change the default value of group.consumer.migration.policy to BIDIRECTIONAL. Reviewers: David Jacot <[email protected]>
…che#17244) Reviewers: Kirk True <[email protected]>, Lianet Magrans <[email protected]>
Reviewers: Colin P. McCabe <[email protected]>, Chia-Ping Tsai <[email protected]>, David Arthur <[email protected]>
… tests from EmbeddedZookeeper to KRaft (apache#17016) Migrate the EmbeddedKafkaCluster from the EmbeddedZookeeper to KRaft Reviewers Bill Bejeck <[email protected]>
Fixes a regression introduced by apache#16669 which inadvertently stopped processing SCRAM arguments from kafka-storage.sh Reviewers: Colin P. McCabe <[email protected]>, Federico Valeri <[email protected]>
Reviewers: Chia-Ping Tsai <[email protected]>
apache#17215) Reviewers: Matthias J. Sax <[email protected]>
This reverts commit 1f9c7ae. This reverts commit 0b97fa6. This reverts commit d9254d5. This reverts commit b905a9f. This reverts commit f438bb4. This reverts commit f0a2d43. This reverts commit 945023d. This reverts commit 2df3ff0. This reverts commit 3b555b8. This reverts commit 2059889. This reverts commit fe5056e. This reverts commit 0589e93. This reverts commit 99e9f81. This reverts commit 813d79c. This reverts commit 5aaf623. This reverts commit 0d8485a. This reverts commit 25b048e. This reverts commit 536dfbb. This reverts commit 655cddd. This reverts commit 48c022c. This reverts commit 03b4cdc. This reverts commit a59e8c5.
…-topic-resolution
…resolution Updates for apache#17285
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@squah-confluent, Seeing that the cache is not actually used outside of tests and benchmarks, I'm guessing this is still WIP.
Does this come down to performance differences between HashMap and PCollectionsImmutableMap?
If we decide we really need faster topic ID to name lookups, I would consider adding it to TopicsImage. Managing a cache outside of the image will be a bit difficult.
Another thing to consider is the lifetime of the cache. Do we really need the ID + name mappings kept in memory forever?
* Provides an implementation of equals, hashCode and toString based on the underlying | ||
* TopicsImage. | ||
*/ | ||
private abstract static class BaseTopicResolver implements TopicResolver { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need this abstract class? Can we just have the interface and concrete class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can remove it.
The lifetime of the cache is bound to the call. It is not kept forever.
Yes.
We could consider this separately. At the moment, we don't really have the time to do it. The current strategy seems to be a good tradeoff at the moment given that it is only bound to the call and not kept forever. |
@dajac thanks for the explanation, makes sense. Can we include a javadoc on the class describing the expected lifetime of this class? |
It's used in TargetAssignmentBuilder.build(), which is used by the new group coordinator. I've updated the javadoc to describe the lifetime of the cache. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks.
@mumrah I will merge it. If you have further comments, @squah-confluent can address separately. |
Looking up topics in a TopicsImage is relatively slow. Cache the results
in TopicIds to improve assignor performance. In benchmarks, we see a
noticeable improvement in performance in the heterogeneous case.
Before
After
Based heavily on #16527.
Committer Checklist (excluded from commit message)