Performance: attempt to optimize the ArrayHitCounter by maintaining some state while updating the counter #721

Draft: wants to merge 17 commits into base: main

alexklibisz
Owner

@alexklibisz alexklibisz commented Aug 28, 2024

Related Issue

#160, #611

Changes

I had two ideas to speed up the ArrayHitCounter here, but they don't seem to be panning out in the benchmarks:

  1. Maintain the histogram that's used to determine the kthGreatest value while updating the counter, i.e., each time we encounter a document.
  2. Maintain the minimum and maximum document ID for each count while updating the counter, so that we can limit the DocIdSetIterator to start and end at the min/max doc IDs of the documents whose counts qualify them as candidates. I removed this idea in 2624988.
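
For readers who want to see the two ideas concretely, here is a minimal sketch of both. It is not the code in this PR: the class name `HistogramHitCounter`, its fields, and its method names are all hypothetical, and it assumes the caller knows an upper bound on any single document's count (`maxPossibleCount`, at most `Short.MAX_VALUE`). It does not touch Lucene; the min/max doc IDs are only exposed as plain ints that an iterator could use as bounds.

```java
import java.util.Arrays;

// Hypothetical sketch of the two ideas above; not the actual ArrayHitCounter.
final class HistogramHitCounter {
    private final short[] counts;   // counts[docId] = number of hits for that doc so far
    private final int[] histogram;  // histogram[c] = number of docs whose count is exactly c
    private final int[] minDocId;   // smallest doc ID that has reached count c
    private final int[] maxDocId;   // largest doc ID that has reached count c
    private int maxCount = 0;       // largest count seen so far

    // Assumption: no doc is incremented more than maxPossibleCount times.
    HistogramHitCounter(int numDocs, int maxPossibleCount) {
        counts = new short[numDocs];
        histogram = new int[maxPossibleCount + 1];
        minDocId = new int[maxPossibleCount + 1];
        maxDocId = new int[maxPossibleCount + 1];
        Arrays.fill(minDocId, Integer.MAX_VALUE);
        Arrays.fill(maxDocId, -1);
    }

    // Idea 1 and idea 2 both happen here, once per encountered document.
    void increment(int docId) {
        int before = counts[docId];
        int after = before + 1;
        counts[docId] = (short) after;
        // Idea 1: move the doc from bucket `before` to bucket `after`.
        if (before > 0) histogram[before]--;
        histogram[after]++;
        // Idea 2: a doc whose final count is >= c must have passed through c,
        // so these arrays bound the doc IDs of all docs with count >= c.
        if (docId < minDocId[after]) minDocId[after] = docId;
        if (docId > maxDocId[after]) maxDocId[after] = docId;
        if (after > maxCount) maxCount = after;
    }

    // Returns the k-th greatest count, or 0 if fewer than k docs were hit.
    int kthGreatest(int k) {
        int seen = 0;
        for (int c = maxCount; c >= 1; c--) {
            seen += histogram[c];
            if (seen >= k) return c;
        }
        return 0;
    }

    int minDocIdWithCountAtLeast(int c) { return minDocId[c]; }

    int maxDocIdWithCountAtLeast(int c) { return maxDocId[c]; }
}
```

The trade-off this sketch makes visible is that `increment` does several extra array writes per hit in exchange for a cheaper `kthGreatest`, which may be part of why the benchmarks did not improve.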

Testing and Validation

How was it validated?

@alexklibisz alexklibisz changed the title Performance: optimize the hit counter by limiting the number of documents it needs to iterate over Performance: attempt to optimize the hit counter by limiting the number of documents it needs to iterate over Aug 29, 2024
@alexklibisz alexklibisz changed the title Performance: attempt to optimize the hit counter by limiting the number of documents it needs to iterate over Performance: attempt to optimize the ArrayHitCounter by maintaining some state while updating the counter Aug 29, 2024

1 participant